I am a final-year undergraduate at UoM, and my final-year project is on RNA-Seq differential expression. I am using RNA-Seq data from a breast cancer cell line treated with E2, collected over a time course: 0, 30, 60 and 90 mins.
I have used the standard workflow: groom, TopHat, Cufflinks, Cuffmerge, then Cuffdiff. When I run, for example, 0 vs 30 to look for differentially expressed genes without selecting yes for the generate SQLite option, the Cuffdiff outputs run to completion. However, when I select yes to generate the SQLite database for cummeRbund, the runs fail to complete.
Does anyone have any idea where I am going wrong?
Also, if you could help me with these questions, I would be grateful:
When using Cuffmerge to merge the transcripts assembled by Cufflinks, do I need to supply an annotated reference if one was already used for the Cufflinks input?
The paper used an Illumina HiSeq 2000, but I do not know the insert size or the read length. Does anyone know what value I should use for the mean inner distance between mate pairs at the TopHat mapping stage?
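For context, my understanding from the TopHat manual is that the mean inner distance is the fragment size minus the length of both reads, so I think the calculation would look roughly like the sketch below. The function name is my own, and the 300 bp and 50 bp figures are placeholders taken from the manual's example, not values from the paper:

```python
def mean_inner_distance(fragment_size: int, read_length: int) -> int:
    """Estimate the inner distance between mates of a paired-end fragment.

    Assumes both mates have the same read length and that fragment_size
    excludes adapters. The result can be negative if the mates overlap.
    """
    return fragment_size - 2 * read_length


# Placeholder values only (NOT from the paper): 300 bp fragments, 2 x 50 bp reads
print(mean_inner_distance(300, 50))  # prints 200
```

If that is right, I still need the actual fragment size and read length to plug in.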