Hi, as a disclaimer, I am very new to RNA-seq analysis. I received the following error while running Cuffdiff with a Cuffmerge file (containing assembled transcripts from Cufflinks) and TopHat files for different conditions. Everything seems to run smoothly until it reaches the cummeRbund commands, where it reports 50 or more warnings and then a column name mismatch. Can anyone look through this output and give me an idea of what I can do to troubleshoot this error? I ran this same job recently and it worked, but I forgot to select "yes" for the cummeRbund SQLite output, so I had to run the program again, and now I keep getting this error. Even though the tabular outputs are marked as failed, they still show expression data.
I have been using the same annotation file and sequence file throughout, so I don't understand why all of a sudden it would be the wrong format, specifically for cummeRbund.
Thanks in advance for all help!
Fatal error: Exit code 1 ()
[17:51:49] Loading reference annotation and sequence.
[17:52:01] Inspecting maps and determining fragment length distributions.
[18:09:52] Modeling fragment count overdispersion.
[18:12:21] Modeling fragment count overdispersion.

Map Properties:
Normalized Map Mass: 23440835.56
Raw Map Mass: 16153094.02
Number of Multi-Reads: 1965402 (with 3790180 total hits)
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80

Map Properties:
Normalized Map Mass: 23440835.56
Raw Map Mass: 49043103.93
Number of Multi-Reads: 6121226 (with 11643243 total hits)
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80

Map Properties:
Normalized Map Mass: 23440835.56
Raw Map Mass: 14471910.68
Number of Multi-Reads: 1636721 (with 3204837 total hits)
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80

Map Properties:
Normalized Map Mass: 23440835.56
Raw Map Mass: 23503527.33
Number of Multi-Reads: 2653745 (with 5182948 total hits)
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80

Map Properties:
Normalized Map Mass: 23440835.56
Raw Map Mass: 56203142.81
Number of Multi-Reads: 7295836 (with 13709161 total hits)
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80

Map Properties:
Normalized Map Mass: 23440835.56
Raw Map Mass: 21234100.82
Number of Multi-Reads: 2718019 (with 5013349 total hits)
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80

Map Properties:
Normalized Map Mass: 23440835.56
Raw Map Mass: 16959329.71
Number of Multi-Reads: 1983337 (with 3801036 total hits)
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80

Map Properties:
Normalized Map Mass: 23440835.56
Raw Map Mass: 17319539.62
Number of Multi-Reads: 2082599 (with 3995206 total hits)
Fragment Length Distribution: Truncated Gaussian (default)
Default Mean: 200
Default Std Dev: 80

[18:14:53] Calculating preliminary abundance estimates
Processed 21351 loci.
[18:58:19] Learning bias parameters.
[19:08:57] Testing for differential expression and regulation in locus.
Processed 21351 loci.
Performed 14392 isoform-level transcription difference tests
Performed 11870 tss-level transcription difference tests
Performed 10891 gene-level transcription difference tests
Performed 12781 CDS-level transcription difference tests
Performed 1878 splicing tests
Performed 866 promoter preference tests
Performing 1713 relative CDS output tests
Writing isoform-level FPKM tracking
Writing TSS group-level FPKM tracking
Writing gene-level FPKM tracking
Writing CDS-level FPKM tracking
Writing isoform-level count tracking
Writing TSS group-level count tracking
Writing gene-level count tracking
Writing CDS-level count tracking
Writing isoform-level read group tracking
Writing TSS group-level read group tracking
Writing gene-level read group tracking
Writing CDS-level read group tracking
Writing read group info
Writing run info
Loading required package: BiocGenerics
Loading required package: methods
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, cbind, colMeans, colnames,
colSums, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match,
mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which, which.max, which.min
Loading required package: RSQLite
Loading required package: ggplot2
Loading required package: reshape2
Loading required package: fastcluster
Attaching package: ‘fastcluster’
The following object is masked from ‘package:stats’:
hclust
Loading required package: rtracklayer
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:BiocGenerics’:
colMeans, colSums, rowMeans, rowSums
The following objects are masked from ‘package:base’:
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Gviz
Loading required package: grid
Attaching package: 'cummeRbund'
The following object is masked from 'package:GenomicRanges':
promoters
The following object is masked from 'package:IRanges':
promoters
The following object is masked from 'package:BiocGenerics':
conditions
There were 50 or more warnings (use warnings() to see the first 50)
Creating database ./cummeRbund.sqlite
Reading Run Info File ./run.info
Writing runInfo Table
Reading Read Group Info ./read_groups.info
Writing replicates Table
Reading Var Model Info ./var_model.info
Writing varModel Table
Reading ./genes.fpkm_tracking
Checking samples table...
Populating samples table...
Error: Column name mismatch.
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted
Where are you using Galaxy? At Galaxy Main https://usegalaxy.org or elsewhere?
I have been using Galaxy Main.
Would you please send in a bug report from the error dataset so we can troubleshoot the problem?
How-to: https://galaxyproject.org/issues/
In short, leave all inputs/outputs undeleted, click on the green bug icon, paste in the link to your post here in the comments, and submit. All data review is done privately by our admin team.
I submitted the error report, and this is the link to the error dataset: https://usegalaxy.org/u/lindsaywebb/h/gmrko
Thanks. The inputs all look OK. The problem is probably with one of these two options: "generate SQLite" or "use multi-read correct". Cuffdiff was updated just recently, so thanks for reporting this. I am running some tests and will post more feedback once I have results.
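
In case it helps narrow things down in the meantime, here is a rough Python sketch (not part of Galaxy, Cuffdiff, or cummeRbund) of one check that can be run on the downloaded outputs. The log stops while populating the samples table from genes.fpkm_tracking, so the sketch compares the sample names implied by the tracking file header with the condition names in read_groups.info. It assumes the usual Cuffdiff layout: tab-separated files, per-sample columns ending in `_FPKM`, and a column named `condition` in read_groups.info. The function names and the demo data are invented for illustration, and a clean result would not rule out the SQLite or multi-read options as the cause.

```python
import csv
import io


def samples_in_tracking(header_line):
    """Return sample names taken from the '<sample>_FPKM' header columns."""
    columns = header_line.rstrip("\n").split("\t")
    suffix = "_FPKM"
    return [c[: -len(suffix)] for c in columns if c.endswith(suffix)]


def conditions_in_read_groups(text):
    """Return condition names, in first-seen order, from read_groups.info text.

    Assumes a tab-separated file with a header row that has a 'condition' column.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    seen = []
    for row in reader:
        if row["condition"] not in seen:
            seen.append(row["condition"])
    return seen


def compare_samples(tracking_header, read_groups_text):
    """Report sample names that appear on only one side, plus duplicates."""
    tracked = samples_in_tracking(tracking_header)
    declared = conditions_in_read_groups(read_groups_text)
    return {
        "only_in_tracking": [s for s in tracked if s not in declared],
        "only_in_read_groups": [s for s in declared if s not in tracked],
        "duplicated_in_tracking": sorted(
            {s for s in tracked if tracked.count(s) > 1}
        ),
    }


# Made-up demo data, NOT taken from the run in this thread.
demo_header = "tracking_id\tgene_id\tctrl_FPKM\tctrl_status\ttreated_FPKM\ttreated_status\n"
demo_read_groups = (
    "file\tcondition\treplicate_num\n"
    "a.bam\tctrl\t0\n"
    "b.bam\ttreated\t0\n"
)
print(compare_samples(demo_header, demo_read_groups))
```

To use it on real outputs, read the first line of genes.fpkm_tracking and the full text of read_groups.info and pass them to `compare_samples`. Any name listed under `only_in_tracking`, `only_in_read_groups`, or `duplicated_in_tracking` would be worth mentioning in the bug report.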