Question: Interpreting Stats output of SAM tools
21 months ago by
hpremathilake20 wrote:

I would greatly appreciate it if someone could explain the difference between two BAM summary numbers, "raw total sequences" and "sequences", that appear in the output of the Stats module in SAMtools.

I used Stats in SAMtools to generate summary statistics for a pooled BAM file. In my output, "raw total sequences" equals "sequences". Does this mean that all my raw sequences were usable and none were filtered out?

Best Rgds, Hasitha.

bam samtools chip-seq • 2.4k views
modified 21 months ago by Jennifer Hillman Jackson25k • written 21 months ago by hpremathilake20
21 months ago by
United States
Jennifer Hillman Jackson25k wrote:

Hello,

If none were filtered, then yes, these counts will be the same.

This is an example stats report from a BAM dataset that did have some minor filtering:

# Summary Numbers. Use `grep ^SN | cut -f 2-` to extract this part.
SN  raw total sequences:    48520
SN  filtered sequences: 2
SN  sequences:  48518
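
As a rough illustration, the relationship between these three counts can be checked with a short Python sketch. This is not part of SAMtools or the Galaxy tool: the function name `parse_summary_numbers` and the `report` string are hypothetical, and the sketch assumes that each summary line starts with `SN` and has the form `key: value`, as in the excerpt above.

```python
def parse_summary_numbers(stats_text):
    """Collect the SN (Summary Numbers) lines of a stats report into a dict.

    Assumption: every summary line starts with "SN" and looks like
    "SN  <key>:  <value>", optionally followed by a "#" comment.
    """
    summary = {}
    for line in stats_text.splitlines():
        if not line.startswith("SN"):
            continue  # skips comment lines and other report sections
        # Drop the "SN" tag and any trailing "#" comment.
        body = line[2:].split("#", 1)[0].strip()
        key, _, value = body.partition(":")
        value = value.strip()
        try:
            summary[key.strip()] = int(value)
        except ValueError:
            try:
                summary[key.strip()] = float(value)
            except ValueError:
                summary[key.strip()] = value  # keep non-numeric values as text
    return summary


# Hypothetical input: the three summary lines from the example report.
report = """\
# Summary Numbers. Use `grep ^SN | cut -f 2-` to extract this part.
SN  raw total sequences:    48520
SN  filtered sequences: 2
SN  sequences:  48518
"""

summary = parse_summary_numbers(report)

# In this example, "sequences" is what remains after filtering.
remaining = summary["raw total sequences"] - summary["filtered sequences"]
print(remaining == summary["sequences"])  # prints True
```

If "filtered sequences" is 0, the same check shows "raw total sequences" equal to "sequences", which is the situation described in the question.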

Samtools help - most command line options are implemented in the Galaxy wrapped tool versions: http://www.htslib.org/doc/samtools.html

Thanks! Jen, Galaxy team

written 21 months ago by Jennifer Hillman Jackson25k

Hi Jen, So "sequences" means usable sequences, is that right? That is, sequences that can be mapped against the reference genome? Best, Hasitha.

written 21 months ago by hpremathilake20

Hi Hasitha,

The sequences pass the filtering criteria you used, but that doesn't necessarily mean that they are usable/mappable. Run a tool like FastQC to check the quality and trim/filter if needed.

Assessing fastq data quality and many other common operations are explained here: https://new.galaxyproject.org/tutorials/ngs/#fastq-manipulation-and-quality-control

Take care, Jen

written 21 months ago by Jennifer Hillman Jackson25k

Thanks, Jen. You too, with all this adverse weather.

written 21 months ago by hpremathilake20