I mapped paired-end sequencing with RNA-STAR and got the BAM file. I then use featureCounts to count the reads with a gene annotation file (an GTF file generated from UCSC table browser). But I found that some genes have extreme low counts. I checked the BAM file with IGV and the BAM file shows lots of reads in these low count genes.
Two genes for example:
GAPDG
and T
Whereas their counts are:
Initially I though I input the wrong strand parameter so I tried run featureCounts with Unstranded, Forward and Reverse strand and check their results, but that doesn't solve the problem. (Btw my sequencing file should be "Reversed.")
I also checked the GTF file used as Gene annotation input file.
Their "exon" position is definitely correct in the mm10 genome! (I sure every of my data were processed with mm10)
Interestingly, if I run featureCounts with builtin Gene annotation file, the output file seems to have correct counts in these genes. But the builtin Gene annotation is not able to provide transcription isoform information due to that it only output the counts with the geneID.
Does anyone know how to fix it?
Hum..... I think I know what's wrong.... I should set the "Allow read to contribute to multiple features" parameter in "Advanced options" to "Yes." So I guess the problem is caused by that the reads belong to GAPDH and T (and other genes with same problem) were already counted on other RefSeq feature in the supported GTF file. Thus, if counts of each transcription isoform is desired, the parameter of "Allow read to contribute to multiple features" should be set to "Yes."