Question: Error in CuffDiff
gkuffel22 (United States) wrote, 3.2 years ago:

I would be forever grateful if someone could take the time to help me. I am attempting to analyze microRNA-Seq data. I downloaded the microRNA sequences from miRBase and mapped my samples (fastq files) to them using Bowtie. After this, I ran flagstat to check the alignments. The alignment rates looked great, 75% and above. I then ran Cufflinks to assemble transcripts and estimate abundances in each sample. This output also looked great; here is what it looked like:

tracking_id gene_id tss_id locus length coverage FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status
CUFF.1 CUFF.1 - hsa-let-7a-1:4-80 - - 1.04E+07 1.03E+07 1.05E+07 OK
CUFF.2 CUFF.2 - hsa-let-7a-2:3-70 - - 1.50E+07 1.48E+07 1.52E+07 OK
CUFF.3 CUFF.3 - hsa-let-7a-3:2-74 - - 1.22E+07 1.21E+07 1.24E+07 OK

Next, I ran Cuffdiff. This tool needs a gff or gtf file, so I used the gff file available for GRCh38 on miRBase. Seemingly this should have worked perfectly, but when I run the tool the values are all zero and I seem to lose the names of the miRNAs. I get the following output:

gene_id locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
XLOC_000001 chr1:30365-30503 Control Treatment NOTEST 0 0 0 0 1 1 no
XLOC_000002 chr1:1167103-1167198 Control Treatment NOTEST 0.00E+00 0.00E+00 0 0 1 1 no
XLOC_000003 chr1:1167862-1167952 Control Treatment NOTEST 0.00E+00 0.00E+00 0 0 1 1 no
XLOC_000004 chr1:1169004-1169087 Control Treatment NOTEST 0.00E+00 0.00E+00 0 0 1 1 no

I realize this is probably some sort of compatibility issue between the fasta file containing the miRNAs and the gff file. Does anyone know how I can solve this issue? Here is what the fasta file looks like:

>hsa-let-7a-1 MI0000060
TGGGATGAGGTAGTAGGTTGTATAGTTTTAGGGTCACACCCACCACTGGGAGATAACTATACAATCTACTGTCTTTCCTA
>hsa-let-7a-2 MI0000061
AGGTTGAGGTAGTAGGTTGTATAGTTTAGAATTACATCAAGGGAGATAACTGTACAGCCTCCTAGCTTTCCT
>hsa-let-7a-3 MI0000062
GGGTGAGGTAGTAGGTTGTATAGTTTGGGGCTCTGCCCTGCTATGGGATAACTATACAATCTACTGTCTTTCCT

Here is what the gff file looks like:

chr1 miRNA_primary_transcript 17369 17436 ID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1
chr1 miRNA 17409 17431 ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705
chr1 miRNA 17369 1.74E+04 ID=MIMAT0027619;Alias=MIMAT0027619;Name=hsa-miR-6859-3p;Derives_from=MI0022705
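
The suspected mismatch can be checked directly by comparing the sequence names in the two files. The following is only a minimal sketch, not part of the original post: the helper names `fasta_names` and `gff_seqnames` are hypothetical, and the sample lines are copied from the excerpts above. It assumes the reads were aligned to the miRBase fasta, so the alignments carry names such as hsa-let-7a-1, while the annotation in column 1 of the gff refers to chromosomes.

```python
def fasta_names(lines):
    """Return the sequence names from FASTA headers (text up to the first space)."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gff_seqnames(lines):
    """Return the values found in column 1 of a GFF/GTF, skipping comments and blanks."""
    names = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line.split()[0])
    return names


# Sample lines taken from the excerpts in this post.
fasta = [
    ">hsa-let-7a-1 MI0000060",
    "TGGGATGAGGTAGTAGGTTGTATAGTTTTAGGGTCACACCCACCACTGGGAGATAACTATACAATCTACTGTCTTTCCTA",
    ">hsa-let-7a-2 MI0000061",
]
gff = [
    "chr1 miRNA_primary_transcript 17369 17436 ID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1",
    "chr1 miRNA 17409 17431 ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705",
]

# Names present in both files; an empty set means no annotated feature
# can overlap any alignment.
shared = fasta_names(fasta) & gff_seqnames(gff)
print(shared)  # prints set()
```

With real files, the same functions can be fed open file handles instead of lists. An empty intersection would be consistent with every gene in the Cuffdiff output being reported as NOTEST with zero values.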

Jennifer Hillman Jackson (United States) wrote, 3.2 years ago:

Hello,

Cuffdiff requires a specific set of GTF/GFF3 attributes in order to generate the full complement of statistics. iGenomes is the best source, if available for your reference genome/build. Reference annotation data can be obtained here: http://cole-trapnell-lab.github.io/cufflinks/cuffdiff/index.html#cuffdiff-input-files

Best, Jen, Galaxy team

 


Jen,

This will not help me, as fasta files and gff files for microRNA are not available through iGenomes. Other people are using CuffDiff for miRNA analysis, so there must be another way. Thanks for your help.

Reply written 3.1 years ago by gkuffel22

Hi, I am new to Galaxy. I would like to analyze miRNA-Seq data, but I ran into the same problem: Cuffdiff didn't work. I would be interested to know whether you were able to solve this problem. Thanks for your reply.

Reply written 13 months ago by valasek

Hello, Cuffdiff utilizes the information provided in the given GTF to combine and annotate results.
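
One way to act on this, offered only as a rough sketch under stated assumptions: if the reads were mapped to the miRBase hairpin fasta (as the Cufflinks loci such as hsa-let-7a-1:4-80 in the question suggest), a GTF can be written in the same coordinate system, with one single-exon transcript spanning each fasta record. The function names `fasta_lengths` and `hairpin_gtf`, the "miRBase" source label, and the choice of the + strand are all assumptions made for illustration; `gene_id` and `transcript_id` are the attributes a GTF record is expected to carry. Whether Cuffdiff's statistics suit miRNA data is a separate question that this does not address.

```python
def fasta_lengths(lines):
    """Yield (name, length) for each record in a FASTA given as an iterable of lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            fields = line[1:].split()
            # The record name is the header text up to the first space.
            name, length = (fields[0] if fields else None), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def hairpin_gtf(lines, source="miRBase"):
    """Return GTF lines with one single-exon transcript covering each FASTA record.

    Assumption: each hairpin is treated as its own gene on the + strand,
    using the FASTA record name as both gene_id and transcript_id.
    """
    rows = []
    for name, length in fasta_lengths(lines):
        attrs = f'gene_id "{name}"; transcript_id "{name}"; gene_name "{name}";'
        rows.append("\t".join(
            [name, source, "exon", "1", str(length), ".", "+", ".", attrs]
        ))
    return rows


# First record from the fasta excerpt in the question.
example = [
    ">hsa-let-7a-1 MI0000060",
    "TGGGATGAGGTAGTAGGTTGTATAGTTTTAGGGTCACACCCACCACTGGGAGATAACTATACAATCTACTGTCTTTCCTA",
]
for row in hairpin_gtf(example):
    print(row)
```

Because column 1 of the resulting GTF uses the same names as the fasta headers, the annotation and the alignments share a coordinate system, and the miRNA names are carried through in `gene_id`.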

Hopefully this helps! Jen

Reply written 13 months ago by Jennifer Hillman Jackson (modified 13 months ago)