Question: Issues with Using CuffDiff Tool in Galaxy
2.6 years ago by
United States
gkuffel22 wrote:



I am analyzing microRNA-Seq data. My pipeline is: QC the reads, align with Bowtie, assemble transcripts with Cufflinks, and get differential expression with Cuffdiff. Here is my problem. I know that Cuffdiff requires a GFF/GTF file in a certain format. I got my GFF3 from miRBase, and it corresponds to the FASTA file of mature miRNA sequences that I also downloaded from miRBase. When I run Cuffdiff, there is no info in the "gene" column, and although the job finishes in Galaxy, I get warnings stating: "Warning: couldn't find fasta record for 'chr1'! This contig will not be bias corrected. Warning: couldn't find fasta record for 'chr10'!"

Also, all of my values, such as log2(fold_change), value_1, and value_2, are zero! I realize this is some kind of formatting issue, but I don't understand how to fix it. If someone could help, I would be eternally grateful. I have had this same issue for months.

Here is my very sad looking output:

test_id	gene_id	gene	locus	sample_1	sample_2	status	value_1	value_2	log2(fold_change)	test_stat	p_value	q_value	significant
XLOC_000001	XLOC_000001	-	chr1:30365-30503	Control A549 Cells	Ad14 Infected Cells	NOTEST	0	0	0	0	1	1	no
XLOC_000002	XLOC_000002	-	chr1:1167103-1167198	Control A549 Cells	Ad14 Infected Cells	NOTEST	0	0	0	0	1	1	no
XLOC_000003	XLOC_000003	-	chr1:1167862-1167952	Control A549 Cells	Ad14 Infected Cells	NOTEST	0	0	0	0	1	1	no
XLOC_000004	XLOC_000004	-	chr1:1169004-1169087	Control A549 Cells	Ad14 Infected Cells	NOTEST	0	0	0	0	1	1	no
XLOC_000005	XLOC_000005	-	chr1:3127974-3128035	Control A549 Cells	Ad14 Infected Cells	NOTEST	0	0	0	0	1	1	no

Here is what my files look like:


>hsa-miR-576-3p MIMAT0004796
>hsa-miR-140-5p MIMAT0000431
>hsa-miR-522-5p MIMAT0005451
>hsa-miR-1298-5p MIMAT0005800


##gff-version 3
##date 2014-6-22
# Chromosomal coordinates of Homo sapiens microRNAs
# microRNAs: miRBase v21
# genome-build-id: GRCh38
# genome-build-accession: NCBI_Assembly:GCA_000001405.15
# Hairpin precursor sequences have type "miRNA_primary_transcript". 
# Note, these sequences do not represent the full primary transcript, 
# rather a predicted stem-loop portion that includes the precursor 
# miRNA. Mature sequences have type "miRNA".
chr1 . miRNA_primary_transcript 17369 17436 . - . ID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1
chr1 . miRNA 17409 17431 . - . ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705
chr1 . miRNA 17369 17391 . - . ID=MIMAT0027619;Alias=MIMAT0027619;Name=hsa-miR-6859-3p;Derives_from=MI0022705
chr1 . miRNA_primary_transcript 30366 30503 . + . ID=MI0006363;Alias=MI0006363;Name=hsa-mir-1302-2
chr1 . miRNA 30438 30458 . + . ID=MIMAT0005890;Alias=MIMAT0005890;Name=hsa-miR-1302;Derives_from=MI0006363
chr1 . miRNA_primary_transcript 187891 187958 . - .




2.6 years ago by
United States
Jennifer Hillman Jackson wrote:


For the reference genome used by the Bias Correction option, use the human genome build associated with the reference annotation (GRCh38, according to the GFF3 header). Cuffdiff does not need the specific base sequences for the miRNAs, just their locations, which the reference annotation GFF3 dataset already provides.

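In short, the chromosome identifiers in the GFF3 file (first column) must match the FASTA identifiers (>identifier) exactly, or the tool cannot use the content. Here the GFF3 uses identifiers such as "chr1", while the mature miRNA FASTA has identifiers such as "hsa-miR-576-3p", so nothing matches.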
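One way to check this before running Cuffdiff is to compare the two sets of identifiers directly. The following is only a rough sketch in Python, not part of Galaxy or Cuffdiff. The function names are made up for illustration, and the `>chr1` and `>chr10` genome headers are an assumption about what a chr-prefixed genome FASTA would contain. The GFF3 line and the miRBase headers are taken from the question.

```python
def fasta_ids(lines):
    """Collect FASTA identifiers: the text after '>' up to the first whitespace."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gff3_seqids(lines):
    """Collect the first column (seqid) of every non-comment GFF3 line."""
    ids = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and '##' directives
        ids.add(line.split(None, 1)[0])
    return ids


def missing_from_fasta(gff_lines, fasta_lines):
    """Return GFF3 seqids that have no matching FASTA record, sorted."""
    return sorted(gff3_seqids(gff_lines) - fasta_ids(fasta_lines))


# Lines copied from the question (GFF3 columns are tab-separated).
gff_example = [
    "##gff-version 3",
    "chr1\t.\tmiRNA\t17409\t17431\t.\t-\t.\tID=MIMAT0027618",
]
mature_fasta = [">hsa-miR-576-3p MIMAT0004796", ">hsa-miR-140-5p MIMAT0000431"]

# Assumed headers of a chr-prefixed genome FASTA (illustrative only).
genome_fasta = [">chr1", ">chr10"]

print(missing_from_fasta(gff_example, mature_fasta))  # ['chr1']
print(missing_from_fasta(gff_example, genome_fasta))  # []
```

To run it on real files, pass open file handles instead of the example lists, for instance `missing_from_fasta(open("annotation.gff3"), open("genome.fa"))`, where both file names are placeholders. An empty result means every chromosome in the annotation has a FASTA record to match against.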

Best, Jen, Galaxy team
