Cufflinks Annotation Problem

Question: Cufflinks Annotation Problem


Colleagues,

I am having trouble running Cufflinks with an annotation file on the public Galaxy. The assembled transcripts GTF file has all FPKM values at 0, although the gene expression and transcript expression tabular files do have FPKM values.

I have seen the SEQanswers threads about the compatibility of TopHat BAM files with chromosomes labeled 1, 2, 3 versus Chr1, Chr2, Chr3. I am using the iGenomes bovine UMD3.1 genome and annotation file (chromosomes are 1, 2, 3) from the history. I altered the GTF file to Chr1, Chr2, Chr3, but it did not help.

Another potential discrepancy or conflict is that the genome and GTF file have the bosTau6 database attribute from when I uploaded them. However, I am running them from the history (bosTau6 is not an option for TopHat). I do not seem to be able to remove the attribute. Am I missing something else?

Here is the command line:

Info: cufflinks v1.0.3
cufflinks -q --no-update-check -I 50000 -F 0.050000 -j 0.050000 -p 8 -G /galaxy/main_database/files/003/142/dataset_3142240.dat -N -b ref.fa

Here is the details page:

Tool: Cufflinks
Name: Cufflinks on data 69, data 4, and data 5: assembled transcripts
Created: Nov 09, 2011
Filesize: 44.6 Mb
Dbkey: bosTau6
Format: gtf
Tool Version:

Input Parameter: Value
SAM or BAM file of aligned RNA-Seq reads: 4: Tophat for Illumina on data 4 and data 69: accepted_hits
Max Intron Length: 50000
Min Isoform Fraction: 0.05
Pre MRNA Fraction: 0.05
Perform quartile normalization: Yes
Conditional (reference_annotation): 1
Reference Annotation: 5: iGen_UMD3_1_genes.gtf
Conditional (bias_correction): 0
Conditional (seq_source): 1
Using reference file: 69: UMD31_iGen_1-29X.fa
Conditional (singlePaired): 0

Cordially,
Chris

rna-seq cufflinks • 1.7k views


modified 7.1 years ago by Jennifer Hillman Jackson ♦ 25k • written 7.1 years ago by Bidwell, Christopher A. • 30


Jennifer Hillman Jackson ♦ 25k wrote:

Hi Chris,

Clearing the incorrect "database" assignment can be done by clicking on the pencil icon for the dataset and, in the database pull-down menu, selecting the list header.

As you mention, chromosome names need to be identical between the reference FASTA genome, any BAM files, and any input GTF files. Correct sorting and the GTF file's content are also important.

Since the iGenomes dataset is specifically designed to work with this software package, it seems worth contacting the tool authors if the results are still not as expected. They will know the dataset best. Should a problem with Galaxy be uncovered, we would be very glad to learn about it and make the necessary corrections.

Tool authors:
web site: http://cufflinks.cbcb.umd.edu/
mailing list: tophat.cufflinks@gmail.com

Thanks!

Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support
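One way to check the chromosome-name requirement before running Cufflinks is to compare the sequence names in the reference FASTA with those in the first column of the GTF. The following is only a minimal sketch in Python: the function names (fasta_names, gtf_names, compare_names) and the sample lines are hypothetical and are not part of Galaxy or Cufflinks. It checks names only, not sort order or BAM headers.

```python
def fasta_names(lines):
    """Collect sequence names (first word after '>') from FASTA header lines."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_names(lines):
    """Collect sequence names from column 1 of a GTF, skipping comments and blanks."""
    names = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_lines, gtf_lines):
    """Return (names only in the GTF, names only in the FASTA), each sorted."""
    fa = fasta_names(fasta_lines)
    gtf = gtf_names(gtf_lines)
    return sorted(gtf - fa), sorted(fa - gtf)


# Hypothetical in-memory data; with real files, pass open file handles instead.
example_fasta = [">1 chromosome 1", "ACGT", ">2", "GGCC"]
example_gtf = [
    'Chr1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    '2\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g2"; transcript_id "t2";',
]
only_in_gtf, only_in_fasta = compare_names(example_fasta, example_gtf)
print("Only in GTF:", only_in_gtf)
print("Only in FASTA:", only_in_fasta)
```

Any name reported on either side indicates a label mismatch (for example 1 versus Chr1) that should be resolved in one of the files before re-running the tool.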

written 7.1 years ago by Jennifer Hillman Jackson ♦ 25k

Hi everyone,

I have run into a problem with the GTF file from UCSC. I uploaded the GTF file from UCSC to Galaxy and used it to run Cuffcompare. The run completes fine, but in the output file I cannot find the gene_name (gene symbol), only gene_id and transcript_id. It is difficult for me to analyze the results without gene name information. Where can I get a GTF file that contains the gene_name information, so that downstream analysis is more convenient?

Xiangming
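To confirm whether a given GTF carries gene_name at all, the attribute column (column 9) can be inspected directly. This is a rough sketch in Python, assuming the usual GTF attribute layout of key "value"; pairs. The helper names parse_attributes and count_gene_name and the sample records are hypothetical.

```python
import re

# Matches GTF-style attribute pairs such as: gene_id "g1";
ATTR_PATTERN = re.compile(r'([A-Za-z_][\w.]*)\s+"([^"]*)"')


def parse_attributes(attr_field):
    """Parse the 9th GTF column into a dict of attribute name -> value."""
    return dict(ATTR_PATTERN.findall(attr_field))


def count_gene_name(lines):
    """Return (records with gene_name, records without) for GTF lines."""
    with_name = 0
    without_name = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        if "gene_name" in parse_attributes(fields[8]):
            with_name += 1
        else:
            without_name += 1
    return with_name, without_name


# Hypothetical records; with a real file, pass an open file handle instead.
example = [
    'chr1\tsrc\texon\t1\t10\t.\t+\t.\tgene_id "g1"; transcript_id "t1"; gene_name "ABC";',
    'chr1\tsrc\texon\t20\t30\t.\t+\t.\tgene_id "g2"; transcript_id "t2";',
]
print(count_gene_name(example))
```

If the second number equals the total record count, the file has no gene_name attributes, and tools that read this file will have nothing to report as a gene symbol.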

written 7.0 years ago by Xiangming Ding • 40
