Hi,
I am analyzing microRNA-Seq data. My pipeline is: QC of reads, Bowtie for alignment, Cufflinks to assemble transcripts, and Cuffdiff to get differential expression. Here is my problem: I know that Cuffdiff requires a GFF/GTF file in a certain format. I got my GFF3 from miRBase, and it corresponds to the FASTA file of mature miRNA sequences that I also downloaded from miRBase. When I run Cuffdiff there is no info in the "gene" column, and although the job finishes in Galaxy, I get warnings stating: "Warning: couldn't find fasta record for 'chr1'! This contig will not be bias corrected. Warning: couldn't find fasta record for 'chr10'!"
Also, all of my values, such as log2(fold_change), value_1, and value_2, are zero!! I realize this is some kind of formatting issue, but I don't understand how to fix it. If someone could help, I would be eternally grateful. I have had this same issue for months.
Here is my very sad looking output:
test_id | gene_id | gene | locus | sample_1 | sample_2 | status | value_1 | value_2 | log2(fold_change) | test_stat | p_value | q_value | significant |
XLOC_000001 | XLOC_000001 | - | chr1:30365-30503 | Control A549 Cells | Ad14 Infected Cells | NOTEST | 0 | 0 | 0 | 0 | 1 | 1 | no |
XLOC_000002 | XLOC_000002 | - | chr1:1167103-1167198 | Control A549 Cells | Ad14 Infected Cells | NOTEST | 0 | 0 | 0 | 0 | 1 | 1 | no |
XLOC_000003 | XLOC_000003 | - | chr1:1167862-1167952 | Control A549 Cells | Ad14 Infected Cells | NOTEST | 0 | 0 | 0 | 0 | 1 | 1 | no |
XLOC_000004 | XLOC_000004 | - | chr1:1169004-1169087 | Control A549 Cells | Ad14 Infected Cells | NOTEST | 0 | 0 | 0 | 0 | 1 | 1 | no |
XLOC_000005 | XLOC_000005 | - | chr1:3127974-3128035 | Control A549 Cells | Ad14 Infected Cells | NOTEST | 0 | 0 | 0 | 0 | 1 | 1 | no |
Here is what my files look like:
Fasta:
>hsa-miR-576-3p MIMAT0004796
AAGATGTGGAAAAATTGGAATC
>hsa-miR-140-5p MIMAT0000431
CAGTGGTTTTACCCTATGGTAG
>hsa-miR-522-5p MIMAT0005451
CTCTAGAGGGAAGCGCTTTCTG
>hsa-miR-1298-5p MIMAT0005800
TTCATTCGGCTGTCCAGATGTA
GFF3:
##gff-version 3
##date 2014-6-22
#
# Chromosomal coordinates of Homo sapiens microRNAs
# microRNAs: miRBase v21
# genome-build-id: GRCh38
# genome-build-accession: NCBI_Assembly:GCA_000001405.15
#
# Hairpin precursor sequences have type "miRNA_primary_transcript".
# Note, these sequences do not represent the full primary transcript,
# rather a predicted stem-loop portion that includes the precursor
# miRNA. Mature sequences have type "miRNA".
#
chr1 | . | miRNA_primary_transcript | 17369 | 17436 | . | - | . | ID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1 |
chr1 | . | miRNA | 17409 | 17431 | . | - | . | ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705 |
chr1 | . | miRNA | 17369 | 17391 | . | - | . | ID=MIMAT0027619;Alias=MIMAT0027619;Name=hsa-miR-6859-3p;Derives_from=MI0022705 |
chr1 | . | miRNA_primary_transcript | 30366 | 30503 | . | + | . | ID=MI0006363;Alias=MI0006363;Name=hsa-mir-1302-2 |
chr1 | . | miRNA | 30438 | 30458 | . | + | . | ID=MIMAT0005890;Alias=MIMAT0005890;Name=hsa-miR-1302;Derives_from=MI0006363 |
chr1 | . | miRNA_primary_transcript | 187891 | 187958 | . | - | . | ID=MI0026420;Alias=MI0026420;Name=hsa-mir-6859-2 |
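
The warning suggests that the sequence names in the first GFF3 column (e.g. chr1) have no record with the same name in the FASTA, whose records are named after mature miRNAs (e.g. hsa-miR-576-3p). A minimal Python sketch to check for that kind of mismatch is below. It assumes the real GFF3 file is tab-separated, and it uses a few records copied from the excerpts above as inline sample data. The function names are made up for illustration and are not part of Cufflinks or Galaxy.

```python
def fasta_record_names(lines):
    """Return the first word of every FASTA header line (the text after '>')."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gff_seqids(lines):
    """Return the distinct values of column 1 (seqid) of a tab-separated GFF3."""
    seqids = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines, '##' directives and '#' comments.
        if not line or line.startswith("#"):
            continue
        seqids.add(line.split("\t")[0])
    return seqids


# Sample data copied from the excerpts in this post (assumed tab-separated on disk).
fasta_lines = [
    ">hsa-miR-576-3p MIMAT0004796",
    "AAGATGTGGAAAAATTGGAATC",
    ">hsa-miR-140-5p MIMAT0000431",
    "CAGTGGTTTTACCCTATGGTAG",
]
gff_lines = [
    "##gff-version 3",
    "chr1\t.\tmiRNA_primary_transcript\t17369\t17436\t.\t-\t.\t"
    "ID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1",
    "chr1\t.\tmiRNA\t17409\t17431\t.\t-\t.\t"
    "ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705",
]

# Sequence names used by the annotation that have no FASTA record of the same name.
missing = gff_seqids(gff_lines) - fasta_record_names(fasta_lines)
print(sorted(missing))
```

To run this on the real files, replace the inline lists with the lines read from each file, for example `open(path).read().splitlines()`. Any name printed is one that the annotation refers to but that has no FASTA record with that name.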