I just got back RNA-seq data produced on the HiSeq 4000. The sequencing facility prepared the libraries and performed paired-end sequencing. We have 4 biological replicates from each of 4 experimental groups. Each sample was also run in two separate lanes (we were told these technical replicates will give us more information on isoform expression). I therefore have 16 files per experimental group (forward and reverse reads for each of the 4 biological replicates, each run in two lanes).
I believe that the files are in fastqsanger format. My question is how should I combine the data and process it in such a way that it is ready to be mapped?
From what I've read, I should set the datatype to fastqsanger, concatenate the technical replicates (separately for the forward and the reverse reads), and then trim the files. Which tool should I use for the trimming? Once the reads are trimmed, can I go straight to mapping, and should I use TopHat or Bowtie2?
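To make the concatenation step concrete, here is a minimal Python sketch of what I have in mind. It is only a sketch under assumptions: I am assuming Illumina-style file names such as `SampleA_S1_L001_R1_001.fastq.gz` (the actual names from the facility may differ), and the helper names `group_by_sample_and_read` and `merge_lanes` are my own, not part of any tool. The point it illustrates is that the forward and reverse files must be concatenated in the same lane order so the read pairs stay in sync.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed (not confirmed) Illumina-style file name:
#   <sample>_L<lane>_R<1 or 2>_001.fastq  or  ..._001.fastq.gz
LANE_FILE = re.compile(
    r"^(?P<sample>.+)_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq(?P<gz>\.gz)?$"
)


def group_by_sample_and_read(paths):
    """Return {(sample, read): [lane files sorted by lane number]}."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        match = LANE_FILE.match(path.name)
        if match is None:
            continue  # ignore files that do not follow the assumed naming
        groups[(match["sample"], match["read"])].append((match["lane"], path))
    # Sorting by lane gives R1 and R2 the same lane order, keeping pairs in sync.
    return {key: [p for _, p in sorted(files)] for key, files in groups.items()}


def merge_lanes(paths, out_dir):
    """Concatenate lane files per sample and read direction.

    Returns {(sample, read): path of the merged file}.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for (sample, read), files in group_by_sample_and_read(paths).items():
        # Assumes all lanes of a sample are either gzipped or plain, not mixed.
        ext = ".fastq.gz" if files[0].name.endswith(".gz") else ".fastq"
        target = out_dir / f"{sample}_R{read}{ext}"
        with open(target, "wb") as out:
            for lane_file in files:
                with open(lane_file, "rb") as src:
                    # Byte-wise append; concatenated gzip files remain valid gzip.
                    shutil.copyfileobj(src, out)
        merged[(sample, read)] = target
    return merged
```

With the raw files in a hypothetical `raw/` folder, this would be called as `merge_lanes(Path("raw").glob("*.fastq.gz"), "merged")`, giving one forward and one reverse file per biological replicate.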
I would be able to speak on the phone if that makes it easier to assist me, and I appreciate any help and advice you can provide.
Patrick Desmond, PhD, Postdoctoral Fellow, UCSD School of Medicine