I am new to RNA-seq analysis and want to look at the expression of a small number of genes (8) in some publicly available RNA-seq datasets.
I came up with a simple method that avoids aligning the RNA-seq reads to the whole genome and running a full TopHat/Cufflinks analysis (or similar). Briefly, what I did was: download and QC-filter the SRA dataset > map the reads with Bowtie2 to a multi-FASTA file containing the exonic sequences of my genes of interest > filter out hits with MAPQ < 30 > obtain FPKM values using eXpress.
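To make the MAPQ filtering step concrete, here is a minimal Python sketch of what it does to SAM-formatted alignments. The function name `filter_sam_by_mapq` and its parameters are hypothetical, chosen only for illustration; the sketch assumes standard tab-separated SAM records, where column 2 is the FLAG and column 5 is the MAPQ.

```python
def filter_sam_by_mapq(lines, min_mapq=30):
    """Yield SAM header lines and mapped alignments with MAPQ >= min_mapq.

    `lines` is any iterable of SAM text lines (hypothetical helper,
    not part of bowtie2, samtools or eXpress).
    """
    for line in lines:
        # Header lines start with '@' and are passed through unchanged.
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # SAM column 2: bitwise FLAG
        mapq = int(fields[4])  # SAM column 5: mapping quality
        # Bit 0x4 in the FLAG marks an unmapped read; skip those.
        if flag & 0x4:
            continue
        if mapq >= min_mapq:
            yield line
```

In practice the same filter is usually applied with `samtools view -q 30`; the sketch only shows the logic of the step.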
Does anyone have any thoughts on whether my method is valid? My main concern is that mapping reads to such a low-complexity reference may artificially inflate the number of reads that map to my genes of interest and bias the FPKM values.
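To illustrate why the FPKM values could be affected, here is a small sketch using the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments). The function `fpkm` and all the numbers are made up for illustration, not taken from any dataset. It assumes the quantification tool normalises by the number of fragments that aligned to whatever reference it was given, which I have not verified for eXpress.

```python
def fpkm(fragments_on_gene, gene_length_bp, total_mapped_fragments):
    """Standard FPKM: fragments per kilobase per million mapped fragments."""
    return fragments_on_gene * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical numbers: the same 500 fragments on a 2,000 bp gene.
# Denominator = all fragments mapped to a full genome/transcriptome.
whole_reference = fpkm(500, 2000, 20_000_000)
# Denominator = only the fragments mapped to the 8-gene reference.
reduced_reference = fpkm(500, 2000, 4_000)

print(whole_reference)    # 12.5
print(reduced_reference)  # 62500.0
```

Under that assumption, the count on the gene is unchanged but the value differs by orders of magnitude, so FPKMs from the reduced reference would not be comparable with values from a whole-transcriptome analysis. This is separate from the question of reads from paralogs or other loci being forced onto the 8 genes.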
Thanks in advance :)