While doing RNA-seq analysis, I mapped the reads for each condition to the reference genome (of the same strain of Geobacillus sp.) with TopHat and got a quite low overall alignment rate (lower than 60% in each condition). In the alignment summary below, for example, I cannot understand why the alignment rate is low even though the genomic data and the RNA-seq data are from the same strain. Can anyone please help me interpret this alignment summary? Is something wrong with the RNA-seq data?
Also, the mapped BAM files are 6G (size on disk), whereas the unmapped BAM files are less than 100M.
Input : 13923415
  Mapped : 7248369 (52.1% of input)
  of these: 6893771 (95.1%) have multiple alignments (306448 have >20)
Input : 13923415
  Mapped : 7103432 (51.0% of input)
  of these: 6748616 (95.0%) have multiple alignments (306338 have >20)
51.5% overall read mapping rate.
Aligned pairs: 5267947
  of these: 4923439 (93.5%) have multiple alignments
            29026 ( 0.6%) are discordant alignments
37.6% concordant pair alignment rate
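For anyone reading the summary, the percentages can be re-derived from the raw counts, which helps to see what each rate is actually measuring. Below is a minimal Python sketch, not TopHat's own code. It assumes the two Input/Mapped blocks are the left and right mates of a paired-end run, and that the concordant rate is (aligned pairs minus discordant pairs) divided by the number of input pairs; the variable names are made up for illustration.

```python
# Counts copied from the alignment summary above.
# Assumption: the first Input/Mapped block is the left mates, the second the right mates.
left = {"input": 13923415, "mapped": 7248369, "multi": 6893771}
right = {"input": 13923415, "mapped": 7103432, "multi": 6748616}
aligned_pairs = 5267947
discordant_pairs = 29026


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Overall rate: all mapped mates over all input mates.
overall_rate = pct(left["mapped"] + right["mapped"],
                   left["input"] + right["input"])

# Concordant rate (assumed formula): non-discordant aligned pairs over input pairs.
concordant_rate = pct(aligned_pairs - discordant_pairs, left["input"])

# Share of mapped mates that align to more than one location.
left_multi_rate = pct(left["multi"], left["mapped"])
right_multi_rate = pct(right["multi"], right["mapped"])

# Mates that map to exactly one location.
left_unique = left["mapped"] - left["multi"]
right_unique = right["mapped"] - right["multi"]

print("overall mapping rate (%):", overall_rate)
print("concordant pair rate (%):", concordant_rate)
print("multi-mapped, left/right (%):", left_multi_rate, right_multi_rate)
print("uniquely mapped, left/right:", left_unique, right_unique)
```

With these counts the sketch reproduces the reported 51.5% and 37.6%, and it shows that only about 355 thousand mates on each side map to a single location. So there are two separate things to look at here: roughly half of the reads do not map at all, and about 95% of the reads that do map have multiple alignments.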
Hi Mayank, were you able to improve your mapping percentage? I am also stuck with the same problem. I have tried trimming, clipping, etc., but no luck yet.