Is there any way to count transcripts? I am now grooming and filtering my HiSeq data. I would like to know how many sequences were excluded by these steps.
Hello,
For sequence data (.fastq types), tools in the groups "Text Manipulation", "Filter and Sort", and "Join, Subtract, Group" can be used in combination to do many counting and summary tasks. Convert any Fastq data to tabular first ("NGS: QC and manipulation" -> "Fastq to tabular") to make the data available to these tools, or use "Filter" to pull out only the sequence identifier lines. Comparing the number of rows (or identifier lines) before and after each grooming or filtering step then gives the number of sequences that step excluded.
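If you download the datasets, the same before/after comparison can be sketched outside Galaxy. The following is only a minimal illustration, not a Galaxy tool: it assumes uncompressed FASTQ files with exactly four lines per record, and the function names and file names are hypothetical.

```python
def count_fastq_records(path):
    """Count records in a FASTQ file, assuming 4 lines per record."""
    with open(path, "r") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # Multi-line or truncated FASTQ would break the 4-line assumption.
        raise ValueError(
            f"{path}: {n_lines} lines is not a multiple of 4"
        )
    return n_lines // 4


def count_excluded(before_path, after_path):
    """Return (records before, records after, records excluded)."""
    before = count_fastq_records(before_path)
    after = count_fastq_records(after_path)
    return before, after, before - after
```

For example, `count_excluded("raw.fastq", "filtered.fastq")` (hypothetical file names) returns the read counts for both files and their difference.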
You mention "transcripts" - does this mean that you have already run a pipeline such as the Tuxedo analysis in "NGS: RNA-seq"? If so, and you want statistics at that level as well, summary counts can be obtained from some of the output files (specifically the tracking files; more advanced counts are possible by comparing against the input GTF/GFF3 reference annotation). See the CuffDiff manual for how these datasets are formatted (the dataset name in Galaxy and the original file name from the tool will be similar). Our hub for RNA-seq analysis, with many link-outs to resources, can be found here: https://wiki.galaxyproject.org/Support#Tools_on_the_Main_server:_RNA-seq
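As a rough sketch of counting at the transcript level from a downloaded tracking file: the code below assumes the file is tab-delimited, has a single header line, and carries the transcript identifier in the first column. Those layout details are assumptions here, so please check them against the CuffDiff manual for your file; the function name and the `id_column` parameter are hypothetical.

```python
import csv


def count_tracking_ids(path, id_column=0):
    """Count distinct identifiers in a tab-delimited file with one header line."""
    ids = set()
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header line (assumed to be present)
        for row in reader:
            if row:  # ignore completely empty lines
                ids.add(row[id_column])
    return len(ids)
```

Using a set means an identifier that appears on more than one row is counted once.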
Hopefully this helps, but if you need more details for a specific task not covered here, please let us know!
Best, Jen, Galaxy team