We have performed 100 bp paired-end (PE) RNA-seq with libraries prepared with the Illumina TruSeq stranded kit.
We are trying to separate sense and antisense transcription, but it is not working very well. What seems to happen is the following. The analysis can tell from the forward read which strand is being transcribed: if we analyse only the forward reads, we can generate a positive-strand file and a negative-strand file. For most genes, transcription is confined to one file or the other, as expected (genes running left to right in the genome browser are in the plus file, and those running right to left are in the minus file). However, if we include the paired-end reads, we lose all this and instead get a mirror image (transcription from both strands, with only minor differences between the two). To me, this indicates that the program ("Create a BedGraph of genome coverage" within BedTools) does not take into account that the second read of each pair, sequenced from the other end of the DNA molecule, comes from the opposite strand compared to the first read, and that its strand assignment would therefore have to be inverted.
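To make the inversion I have in mind concrete, here is a minimal Python sketch of how I think the strand of a whole fragment could be derived from the FLAG field of each alignment. The bit values come from the SAM format; the function name `fragment_strand` and the parameter `read1_is_sense` are hypothetical names I made up for illustration. Whether read 1 represents the sense or the antisense strand depends on the library protocol, so that parameter is an assumption that would need checking against the kit documentation.

```python
# SAM FLAG bits (as defined in the SAM format specification)
FLAG_REVERSE = 0x10  # read is aligned to the reverse strand
FLAG_READ2 = 0x80    # read is the second read of its pair


def fragment_strand(flag, read1_is_sense=True):
    """Return '+' or '-' for the strand assigned to the whole fragment.

    read1_is_sense is an ASSUMPTION about the library protocol: True means
    read 1 has the same orientation as the transcript, False means read 1
    is antisense to it. (Hypothetical parameter, for illustration only.)
    """
    aligned_reverse = bool(flag & FLAG_REVERSE)
    is_read2 = bool(flag & FLAG_READ2)

    # Read 2 is sequenced from the other end of the fragment, so it aligns
    # to the opposite strand from read 1; flip it so both mates agree.
    flip = is_read2
    # If read 1 is antisense to the transcript, flip everything once more.
    if not read1_is_sense:
        flip = not flip

    reverse = aligned_reverse != flip
    return "-" if reverse else "+"


if __name__ == "__main__":
    # 99/147 and 83/163 are the usual FLAG values for properly paired mates.
    for flag in (99, 147, 83, 163):
        print(flag, fragment_strand(flag))
```

With this logic both mates of a pair (FLAG 99 and 147, or 83 and 163) end up assigned to the same strand, which is what I would expect a strand-aware coverage step to do before splitting the reads into a plus file and a minus file.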
So my question is: am I correct, and if so, what would the solution be?