Question: Question In Converting Gtf File For P_Id
4.9 years ago, Yanxiang Shi wrote:
Hi all,

I've been trying to use the cufflinks-cuffmerge-cuffdiff workflow to analyze my RNA-seq data. However, cuffmerge dropped my p_id attribute. I created the p_id myself by renaming protein_id to p_id in the GTF file, so it appears in the same attributes column as gene_id. Does anyone know how to make it visible to the Cufflinks tools?

I'm using the public Galaxy platform. Earlier posts suggest that people solved this problem by writing extra code, but I cannot find anywhere to enter code in Galaxy. Do I have to run everything on my own computer?

Thanks a lot!
Nancy
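The hand edit described in the question (renaming protein_id to p_id in the attributes column) could be scripted roughly as follows. This is only a sketch: the function name `rename_protein_id` is hypothetical, it assumes a tab-separated, nine-column GTF with quoted attribute values, and it would run outside Galaxy. It also does not address whether cuffmerge keeps the attribute afterwards, which is the open problem in this thread.

```python
import re

# Matches the key protein_id only where an attribute starts: at the beginning
# of the column or right after a ';' separator. Quoted values are assumed.
PROTEIN_ID_KEY = re.compile(r'(^|;\s*)protein_id(\s+")')


def rename_protein_id(gtf_line):
    """Return the GTF line with the protein_id key renamed to p_id."""
    newline = "\n" if gtf_line.endswith("\n") else ""
    fields = gtf_line.rstrip("\n").split("\t")
    if gtf_line.startswith("#") or len(fields) < 9:
        return gtf_line  # leave comments and malformed lines untouched
    # Column 9 (index 8) holds the attributes; the other columns are unchanged.
    fields[8] = PROTEIN_ID_KEY.sub(r"\1p_id\2", fields[8])
    return "\t".join(fields) + newline
```

Applied line by line to an annotation file, this leaves gene_id, transcript_id and all other attributes as they were.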
Tags: rna-seq, cuffmerge, cufflinks
4.9 years ago, Jennifer Hillman Jackson (United States) wrote:
Hello Nancy,

The attribute sounds as if it is in the correct place in the reference annotation file (the 9th field), but perhaps there are other format or content problems with the file. Do you have a tss_id? Do you have exons labeled? This is the area of the manual that covers the formatting and usage of these attributes:

I am not sure I understand what you mean by input/writing codes. But if you need more help after reviewing the manual, please let us know.

Best,
Jen
Galaxy team
-- Jennifer Hillman-Jackson
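The questions above (is p_id in the 9th field, is there a tss_id, are exons labeled) can be checked with a short script run on a downloaded copy of the file. The sketch below is illustrative only: the names `parse_attributes` and `count_missing` are hypothetical, and it assumes a tab-separated, nine-column GTF whose attributes follow the usual `key "value";` layout.

```python
import re
from collections import Counter

# Matches attribute pairs such as: gene_id "XLOC_000001";  or  exon_number 1;
ATTRIBUTE_PATTERN = re.compile(r'(\S+)\s+"?([^";]*)"?\s*;')


def parse_attributes(column9):
    """Return the key/value pairs of a GTF attributes column as a dict."""
    return dict(ATTRIBUTE_PATTERN.findall(column9))


def count_missing(gtf_lines, required=("p_id", "tss_id")):
    """Count, per (feature type, attribute), the records lacking that attribute."""
    missing = Counter()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed nine-column record
        attributes = parse_attributes(fields[8])
        for key in required:
            if key not in attributes:
                # fields[2] is the feature type, e.g. exon or CDS
                missing[(fields[2], key)] += 1
    return missing
```

Passing an open file handle, for example `count_missing(open("annotation.gtf"))` with a placeholder file name, returns a counter such as `{("exon", "p_id"): 1200}`, which shows at a glance which feature types lack which attribute.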
Hi Jennifer,

I did see tss_id in my results, and also exon labels. The tss_id was assigned during the calculation, with values such as tss1, tss2, etc.

By "writing codes" I mean something like this passage in the link you sent me:

"Note: If an arbitrary GTF/GFF3 file is used as input (instead of the .combined.gtf file produced by Cuffcompare), these attributes will not be present, but Cuffcompare can still be used to obtain these attributes with a command like this:

    cuffcompare -s /path/to/genome_seqs.fa -CG -r annotation.gtf annotation.gtf

The resulting cuffcmp.combined.gtf file created by this command will have the tss_id and p_id attributes added to each record and this file can be used as input for cuffdiff."

But where can I type in "cuffcompare -s /path/to/genome_seqs.fa -CG -r annotation.gtf annotation.gtf" in Galaxy? I don't know where I can find the -s option. Is there a command line anywhere?

Thanks for your help!
Nancy
Reply written 4.9 years ago by Yanxiang Shi
Hi Nancy,

It is not quite clear in which steps you used the reference annotation or exactly how these attributes were lost. Cuffcompare is a tool in Galaxy, but before we go any further I think that examining the history would be the speediest path to a solution. Would you share a history with me? You can email the link back to me directly to keep your data private. Please note the number of the problematic dataset, and leave all datasets undeleted (or at least one complete analysis path). Here is how to share:

Thanks!
Jen
Galaxy team
-- Jennifer Hillman-Jackson
Reply written 4.9 years ago by Jennifer Hillman Jackson
3.0 years ago, malcolm.cook (United States) wrote:

I have developed an Rscript, cuffdiff_gtf_attributes, which can provide the additional attributes p_id and tss_id that cuffdiff requires to perform all of its differential splicing/coding/expression contrasts. I have tested it with Ensembl GTF.


Great! Please consider wrapping it for Galaxy and adding it to the Tool Shed! Thanks, Jen, Galaxy team

Reply written 3.0 years ago by Jennifer Hillman Jackson