I am attempting to use tophat>cufflinks>cuffmerge>cuffdiff to compare
transcript expression in 3 samples (no replicates, illumina single-end
reads). Using the built in UCSC mm9 reference genome I can complete
the analysis just fine, with the caveat that there is no annotation.
When I repeat the analysis using the illumina igenome UCSC mm9 .gtf
annotation file I get the following error in Cufflinks:
An error occurred running this job: cufflinks v1.3.0
cufflinks -q --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8
-G /galaxy/main_pool/pool5/files/004/309/dataset_4309547.dat -N
Error running cufflinks.
return code = -11
cufflinks: /lib64/libz.so.1: no version information available
I have set the identifier/build as "Mouse July 2007 (NCBI37/mm9)
(mm9)" so that does not seem to be the problem. Suggestions as to how
to amend this problem OR add annotations to the already completed
analysis would be terrific.
This specific error code has been seen before from Cuffdiff when there
is a format problem with the GTF file.
A few things to double check:
1) That you downloaded the iGenomes .tar archive to your local computer
or server, unpacked it using a utility or on the command line (tar
-xvf), then uploaded *only* the .gtf annotation file to Galaxy. The
Galaxy datatype should be .gtf and there should be no metadata problems
(these usually show up as warnings within a yellow box, in the dataset
or on the "Edit Attributes" form; click on the pencil icon for the
dataset). This particular archive is known to give a few problems while
unpacking (various warnings due to the age of the software used to
create the archive, and not all data is usable), but the .gtf data is
intact and is small enough to load through a direct browser upload.
(FTP is not needed, but you could use FTP if you wanted.) If you needed
to use FTP because of the data size, or you loaded via a URL directly
from the source, then you very likely loaded the .tar archive itself
and will need to start over: download the .tar, unpack, and load just
the .gtf data.
2) When you uploaded the .gtf file to Galaxy, you *did not check* the
box next to "Convert spaces to tabs:". The original and uploaded .gtf
should have nine tab-delimited columns of data. If you have 12
columns, then this means the box was checked and the format is
incorrect. You will want to reload the .gtf dataset (without
converting spaces to tabs), confirm after loading that the GTF format
is correct and assigned, and re-run cuffdiff.
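If it helps, the column count from point 2 can also be checked locally before uploading. Below is a minimal Python sketch, not anything that Galaxy or Cufflinks runs itself. The function name gtf_column_counts and the sample annotation line are made up for illustration. It counts tab-separated columns per data line, so an intact file should report only 9, while a file loaded with "Convert spaces to tabs" would report a larger number such as 12.

```python
def gtf_column_counts(lines):
    """Return {number_of_tab_separated_columns: number_of_lines}.

    Blank lines and '#' comment lines are skipped.
    """
    counts = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        n = len(line.split("\t"))
        counts[n] = counts.get(n, 0) + 1
    return counts


def looks_like_gtf(lines):
    """True only if there is at least one data line and all have 9 columns."""
    counts = gtf_column_counts(lines)
    return set(counts) == {9}


# Hypothetical sample line (not taken from the iGenomes file).
good = 'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
# Simulate what "Convert spaces to tabs" does to the attributes column.
bad = good.replace(" ", "\t")

print(gtf_column_counts([good]))  # {9: 1}
print(gtf_column_counts([bad]))   # {12: 1}
```

To check a real file, pass an open file handle instead of the sample list, for example gtf_column_counts(open("genes.gtf")), where genes.gtf stands in for whatever the unpacked annotation file is called.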
If you have any problems or are unsure about how your data fits with
these checks, please submit a bug report directly from the failed
CuffDiff run from the job as executed on the public main Galaxy
instance. Be sure to leave all inputs and the error dataset undeleted
in your history (undelete if necessary).
Hopefully this helps!