Question: Cuffcompare runs successfully but generates empty files
19 months ago, roxy.zhang wrote:

Hello friends

I am currently doing RNA-seq analysis on 3 datasets (obtained publicly through NCBI SRA). Each dataset is from one individual, and the three individuals have different conditions (c9ALS, sALS, and control). I have successfully run TopHat and Cufflinks. I am now trying to use Cuffcompare/Cuffmerge to examine all the transcripts, but I am having some difficulties. Cuffmerge will not run and returns this message: "Fatal error: Matched on Error Error running cuffmerge. The output file is empty, there may be an error with your input file or settings." Cuffcompare runs successfully and turns green, but its output files are empty. This is really strange to me, because my TopHat and Cufflinks files are large and I am able to view splice junctions in IGB. (Though, from looking at the splice junctions, I am under the impression that there should be multiple assembled transcripts, but there is only 1. Again, this is from looking at the files in IGB, so maybe there is something wrong there as well?)
I am not very familiar with RNA-seq and am very confused. Thank you very much in advance for all your help!

Thank you, Roxy Zhang

18 months ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

The Cuffmerge error is likely because the inputs, including the BAM and GTF datasets, are not sorted. How to sort them is explained here: https://galaxyproject.org/support/sort-your-inputs/

Cuffcompare should also be given sorted inputs, and those inputs cannot be empty.
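
For anyone working outside Galaxy, the sorting step for a GTF file can be sketched in a few lines of Python. This is only an illustration, not the Galaxy Sort tool itself: it assumes "sorted" means ordered by chromosome name (column 1, as text) and then by start coordinate (column 4, numerically), and the function name `sort_gtf_lines` is made up for this example.

```python
def sort_gtf_lines(lines):
    """Return GTF lines with comment lines first, then records sorted
    by chromosome name (column 1) and start position (column 4).

    Assumption: tab-separated GTF records; blank lines are dropped.
    """
    comments = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    def sort_key(line):
        fields = line.rstrip("\n").split("\t")
        # Column 1 is the chromosome identifier, column 4 the start coordinate.
        return (fields[0], int(fields[3]))

    return comments + sorted(records, key=sort_key)


# Small in-memory demonstration with made-up records.
example = [
    'chr2\tdemo\texon\t500\t600\t.\t+\t.\tgene_id "g2";\n',
    'chr1\tdemo\texon\t300\t400\t.\t+\t.\tgene_id "g1";\n',
    "# made-up header line\n",
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "g1";\n',
]
for sorted_line in sort_gtf_lines(example):
    print(sorted_line, end="")
```

To apply it to a real file, read the lines with `open(path).readlines()` and write the returned list back out. BAM files are binary and need a dedicated tool for sorting, so this sketch covers GTF only.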

Any reference annotation used with any of these tools should have exactly the same chromosome identifiers as the reference genome used for the mapping step (where the annotation is an optional input for TopHat or HISAT). All inputs should be based on hg19 in your case, with iGenomes as one source.

  1. iGenomes reference annotation for hg19 (its chromosome identifiers match the hg19 genome): http://cole-trapnell-lab.github.io/cufflinks/getting_started/#using-pre-built-annotation-packages or http://cole-trapnell-lab.github.io/cufflinks/igenome_table/index.html
  2. Download the .tar file, uncompress it locally, and upload just the genes.gtf file to Galaxy. Compressed data in .tar format cannot be loaded directly and would be very large for this genome. The complete archive also includes data you won't need that would use up much of your quota.
  3. Checking for mismatched identifiers: https://galaxyproject.org/support/chrom-identifiers/
  4. Galaxy RNA-seq tutorials: https://galaxyproject.org/learn/
  5. Manual with sample protocols and a link to the Google forum for the tool suite: http://cole-trapnell-lab.github.io/cufflinks/manual/
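
As a rough illustration of the identifier check, the Python sketch below collects the chromosome names from column 1 of a GTF file and compares them with a list of reference genome names. The helper names `seqnames_in_gtf` and `compare_identifiers` are hypothetical, and the genome names are assumed to be supplied by you (for example, copied from the `@SQ` lines of a BAM header).

```python
def seqnames_in_gtf(lines):
    """Collect the distinct chromosome identifiers (column 1) from GTF lines."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def compare_identifiers(gtf_names, genome_names):
    """Report which identifiers are shared and which appear on one side only."""
    gtf_names, genome_names = set(gtf_names), set(genome_names)
    return {
        "shared": sorted(gtf_names & genome_names),
        "only_in_gtf": sorted(gtf_names - genome_names),
        "only_in_genome": sorted(genome_names - gtf_names),
    }


# Made-up example: annotation without the "chr" prefix vs. a genome that uses it.
gtf_example = [
    '1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "g1";\n',
    '2\tdemo\texon\t300\t400\t.\t+\t.\tgene_id "g2";\n',
]
report = compare_identifiers(seqnames_in_gtf(gtf_example), ["chr1", "chr2"])
print(report)
```

An empty `shared` list, as in this made-up case, is the kind of mismatch that can leave downstream outputs empty even though each job finishes without an error.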

Thanks! Jen, Galaxy team


Hi Jen

Thank you very much for your prompt replies. I have run into another problem, however. I have permanently deleted some datasets in my history, but they are still being counted toward my workspace quota. Would it be possible for you to take a look and perhaps recalculate it? Thanks a lot!

Best, Roxy

Reply written 18 months ago by roxy.zhang
