When looking closely at my alignment data I found something interesting: some of my reads align to the Y chromosome, although the sample is from an ovarian cancer cell line, in short a female donor.
Indeed, all of these reads align to repeated regions, and for each gene on the Y chromosome with any reads aligned I can find a paralogue on the X chromosome.
Although these reads are few, I still fear that they may distort the calculated gene / transcript expression levels, since genes on autosomes are not affected by this duplication.
I wonder if there is any way to turn off the Y chromosome when using tophat2 (I'm aware of the simple method of temporarily removing the Y chromosome), or to merge the read counts before doing downstream analysis.
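
For the merging option, here is a minimal sketch of what I have in mind, assuming per-gene read counts are already available as a simple mapping and that a Y-to-X paralogue table has been built separately. The gene names, the `Y_TO_X` table and the `merge_y_into_x` helper are all hypothetical placeholders, not part of tophat2 or any other tool.

```python
from collections import defaultdict

# Hypothetical Y -> X paralogue table. The names are placeholders only;
# a real table would have to be derived from an annotation source.
Y_TO_X = {"gene_Y1": "gene_X1", "gene_Y2": "gene_X2"}


def merge_y_into_x(counts, y_to_x):
    """Return new counts with each Y gene's reads added to its X paralogue.

    Genes that are not keys of y_to_x are carried over unchanged.
    """
    merged = defaultdict(int)
    for gene, n in counts.items():
        # Redirect a Y gene to its X paralogue, otherwise keep the gene name.
        merged[y_to_x.get(gene, gene)] += n
    return dict(merged)


# Toy counts for illustration, not real data.
counts = {"gene_X1": 100, "gene_Y1": 3, "gene_Y2": 2, "gene_A": 50}
merged = merge_y_into_x(counts, Y_TO_X)
```

With the toy input, the reads of `gene_Y1` are added to `gene_X1`, `gene_Y2` is relabelled as `gene_X2`, and the autosomal `gene_A` is left unchanged. Whether summing raw counts like this is appropriate for the downstream expression estimate is exactly what I am unsure about.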