Tool: How to count HiSeq sequences at Galaxy?
4.2 years ago by
United States
kp5091 wrote:

Is there any way to count transcripts? I am currently grooming and filtering my HiSeq data, and I would like to know how many sequences were excluded by these steps.

rna-seq tool • 1.2k views
modified 4.2 years ago by Jennifer Hillman Jackson • written 4.2 years ago by kp5091
4.2 years ago by
United States
Jennifer Hillman Jackson (25k) wrote:


For sequence data (.fastq types), tools in the groups "Text Manipulation", "Filter and Sort", and "Join, Subtract, Group" can be combined to perform many counting and summary tasks. First convert the Fastq data to tabular ("NGS: QC and manipulation -> Fastq to tabular") so that each sequence occupies one line and these tools can work on it; alternatively, use "Filter" to pull out only the sequence identifier lines. Counting the lines of the result before and after each grooming or filtering step then gives the number of sequences that step excluded.
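If you ever want to double-check the same count outside Galaxy, the idea can be sketched in a few lines of Python. This is only a rough illustration, not a Galaxy tool: it assumes standard four-line FASTQ records (as groomed data should be), and the function names and the tiny in-memory example reads are made up for the example.

```python
import io


def count_fastq_records(handle):
    """Count records in a FASTQ stream, assuming four lines per record."""
    # Ignore blank lines (for example a trailing newline at the end of the file).
    n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError("line count %d is not a multiple of 4" % n_lines)
    return n_lines // 4


def excluded_count(before_handle, after_handle):
    """Return how many records a step removed (count before minus count after)."""
    return count_fastq_records(before_handle) - count_fastq_records(after_handle)


# Hypothetical example data: two reads before filtering, one read after.
# With real files, pass open("your_file.fastq") instead (placeholder name).
raw = "@r1\nACGT\n+\nIIII\n@r2\nGGNN\n+\nII##\n"
filtered = "@r1\nACGT\n+\nIIII\n"
print(excluded_count(io.StringIO(raw), io.StringIO(filtered)))  # prints 1
```

The difference between the two counts is the number of sequences dropped by that step, which is the same number you would get by comparing line counts of the tabular datasets in Galaxy.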

You mention "transcripts": does this mean that you have proceeded with a pipeline such as the Tuxedo analysis in "NGS: RNA-seq"? If so, and you want statistics at that level as well, summary counts can be obtained from some of the output files (specifically the tracking files; more advanced counts are possible by comparing against the input GTF/GFF3 reference annotation). See the CuffDiff manual for how these datasets are formatted (the dataset name and the original file name from the tool will be similar). Our hub for RNA-seq analysis, with many link-outs to resources, can be found here:

Hopefully this helps, but if you need more details for a specific task not covered here, please let us know!

Best, Jen, Galaxy team
