Suppress Reporting Hit Number


Question: Suppress Reporting Hit Number


7.6 years ago by

Hsin-l (Sam) Chiang • 10

Hsin-l (Sam) Chiang • 10 wrote:

Hi, I used the Megablast function (in NGS: Mapping\ROCHE-454\) to analyze my FASTA sequences against the nt database, and it worked fine. However, it generated 56,804 hits even though my query has only 1000 sequences. Is there any way to limit the reported alignments to just the one best hit per sequence? (In local BLAST there are parameters such as -K1 -v 1 -b 1 to do so, but I can't find similar options in Galaxy.) Many thanks! Sam



7.6 years ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hello Sam,

When running Megablast, filtering by identity or evalue can help reduce the number of hits. The default values are all fairly permissive for a query run against a target genome of the same species when the query has already been filtered for base-calling quality. Filtering out low-complexity sequence would also be a big help, as a guess, considering the number of hits generated from your initial data.

There is also the "Parse blast XML output" tool. Converting the data into interval format would allow the use of "Operate on Genomic Intervals -> Cluster the intervals of a dataset". This is based on coverage, if that is one of your criteria (it could be, if the identity threshold is a range you consider to hold candidate choices for "best"). Identity and coverage are commonly combined to identify "best", but this is just a suggestion. The same type of logic could be used with top-scoring evalue matches combined with coverage (the result would likely be similar to using evalue alone, if the identity is set high).

The idea to add a filter for "single best" is a good one, but it has some complexity associated with it. I will pass it along to the team as an enhancement request to consider.

Hopefully this helps!

Jen
Galaxy team

-- Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org
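As an illustration of the filtering idea above, here is a minimal sketch of how hits could be reduced to one per query after downloading the tabular output. It is not a Galaxy tool and not the team's method. It assumes the standard 12-column BLAST tabular layout (query ID in column 1, percent identity in column 3, evalue in column 11, bit score in column 12); the Galaxy Megablast output may order its columns differently, so the index constants would need adjusting. The function name `best_hits` and its parameters are hypothetical, and "best" is taken here to mean the highest bit score, which is only one possible definition.

```python
import csv

# ASSUMPTION: standard 12-column BLAST tabular layout
# (qseqid, sseqid, pident, length, mismatch, gapopen,
#  qstart, qend, sstart, send, evalue, bitscore).
# Change these zero-based indices if your file is laid out differently.
QUERY_COL = 0
IDENTITY_COL = 2
EVALUE_COL = 10
BITSCORE_COL = 11


def best_hits(lines, min_identity=0.0, max_evalue=float("inf")):
    """Return {query_id: row}, keeping the highest-bit-score row per query.

    Rows below min_identity or above max_evalue are dropped first.
    On a tied bit score, the row seen first is kept.
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        if float(row[IDENTITY_COL]) < min_identity:
            continue
        if float(row[EVALUE_COL]) > max_evalue:
            continue
        query = row[QUERY_COL]
        score = float(row[BITSCORE_COL])
        if query not in best or score > float(best[query][BITSCORE_COL]):
            best[query] = row
    return best


if __name__ == "__main__":
    # Made-up rows for demonstration only; replace with an open file handle.
    sample = [
        "q1\ts1\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-50\t180.0",
        "q1\ts2\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-40\t150.0",
    ]
    for row in best_hits(sample, min_identity=90.0).values():
        print("\t".join(row))
```

Any iterable of tab-separated lines works as input, including an open file handle, so the same function can be applied to a downloaded hit table. Coverage is not considered in this sketch; combining it with identity, as suggested above, would need the query lengths as an additional input.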
