Question: Why don't gene counts from RNA STAR match the total uniquely mapped read count?
I used RNA STAR to map my reads for a stranded RNA-seq library. Within RNA STAR, I turned on the option to output gene counts and supplied an hg38 GTF file from the UCSC Table Browser. RNA STAR reported that 86% of my 20 million total reads map uniquely (about 17 million reads). However, when I sum all the gene counts that RNA STAR outputs, the total is only 5 million reads. Why doesn't the sum of the gene counts add up to 17 million reads, and why is it only 5 million? Since I'm using RNA STAR for both the alignment and the gene counting, I expected the two numbers to be concordant.


Many alignments in your dataset don't overlap annotated exons but instead fall in intronic or intergenic regions. Those reads can still align uniquely, but they won't be included in the gene counts.
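One way to check where the missing reads went is to look at the summary rows at the top of the gene counts file. The sketch below assumes the four-column layout described in the STAR manual for ReadsPerGene.out.tab: a gene ID, then counts for an unstranded library, for read 1 on the transcript strand, and for read 2 on the transcript strand, with the first rows named N_unmapped, N_multimapping, N_noFeature and N_ambiguous. The function name and the example numbers are made up for illustration; if your tool's output is laid out differently, adjust accordingly.

```python
import csv

def summarize_reads_per_gene(lines):
    """Summarize a ReadsPerGene.out.tab-style table (assumed 4-column layout).

    Returns (special, gene_totals):
      special     -- dict mapping each N_* summary row to its three counts
      gene_totals -- per-column sums over all gene rows
                     [unstranded, read1-strand, read2-strand]
    """
    special = {}
    gene_totals = [0, 0, 0]
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        name = row[0]
        counts = [int(x) for x in row[1:4]]
        if name.startswith("N_"):
            special[name] = counts
        else:
            gene_totals = [t + c for t, c in zip(gene_totals, counts)]
    return special, gene_totals

# Made-up numbers, for illustration only
example = [
    "N_unmapped\t100\t100\t100",
    "N_multimapping\t50\t50\t50",
    "N_noFeature\t200\t700\t250",
    "N_ambiguous\t30\t5\t20",
    "GENE_A\t400\t10\t390",
    "GENE_B\t220\t5\t215",
]
special, gene_totals = summarize_reads_per_gene(example)
print("gene totals per column:", gene_totals)
print("N_noFeature per column:", special["N_noFeature"])
```

On a real file you would pass an open file handle instead of the example list, e.g. `with open("ReadsPerGene.out.tab") as fh: summarize_reads_per_gene(fh)` (the path is a placeholder). Within each column, the gene total plus N_noFeature plus N_ambiguous should roughly account for the uniquely mapped reads, so a large N_noFeature value shows how many reads fell outside annotated exons. For a stranded library it is also worth comparing the three columns: if one strand column has far more gene-assigned reads than the column you summed, you may have summed the column for the wrong strandedness.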

Hi Devon,

This is RNA-seq data, not DNA-seq. I'm using Agilent Universal Human Reference RNA that has undergone additional DNase treatment, so the reads can't be coming from DNA.

Even so, some RNA-seq reads will map to regions of the genome that are not annotated as genes or transcripts.

