Question: Names for Genes in RNA-Seq Analysis
7.1 years ago by
GANDRILLON OLIVIER wrote:
Hello,

I am using Galaxy to analyse RNA-seq libraries made from chicken cells. I groomed my sequences, then ran them through TopHat and Cufflinks. This worked well, and I now have a list of genes with their respective FPKM values. My only problem is that the gene names do not appear in the listing; the genes are simply referred to as "CUFF.1", "CUFF.2", etc. Could you please tell me how to obtain gene names? (I went through the FAQ and could not find the answer.)

Sincerely,
Olivier

New mail address: olivier.gandrillon@univ-lyon1.fr
Dr Olivier Gandrillon
Centre de Génétique et de Physiologie Moléculaires et Cellulaires
UMR CNRS 5534, Université Claude Bernard Lyon I
Bat Gregor Mendel (ex 741), 16, rue Raphaël Dubois, 69622 Villeurbanne Cedex
Phone: 04-72-44-81-90
Fax: 04-72-43-26-85
Web address:
Lab: http://cgphimc.univ-lyon1.fr/spip.php?rubrique33&lang=en
Personal: http://www.cgmc.univ-lyon1.fr/Gandrillon/OG/OG1.html

"How did he win the people's support for the new lies he invented every day? Precisely because they were lies, and precisely because they were an insult to perception. The people were hypnotized by his nerve, this right he granted himself to contradict the evidence. People gazed, dumbfounded, at an unleashed Goebbels. They could see right through to his enormous wish to deny the lame dwarf." Tobie Nathan, in "Qui a tué Arlozoroff"
rna-seq cufflinks • 2.0k views
modified 7.1 years ago by Jennifer Hillman Jackson • written 7.1 years ago by GANDRILLON OLIVIER
7.1 years ago by
Carl Schmidt wrote:
I am also using Galaxy to analyze RNA-seq libraries from chicken. While the names of the genes appear in the Cufflinks output, the FPKM values are all zero. Carl Schmidt Associate Professor Animal & Food Sciences University of Delaware Newark, DE 19716 051 Townsend Hall schmidtc@udel.edu 302-831-1334
written 7.1 years ago by Carl Schmidt
Hello Carl, Your question is similar to Olivier's, even though the problem is presenting in a different way. There is most likely a data mismatch problem, as explained in my earlier reply to this thread and in this prior mailing list question: http://gmod.827538.n3.nabble.com/Cufflinks-reporting-FPKM-values-of-all-zeroes-0-tt3183517.html#a3183928 Best wishes for your project as well, Jen Galaxy team -- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/wiki/Support
written 7.1 years ago by Jennifer Hillman Jackson
There is one more required data point to consider in the GTF file, for full functionality. Specifically, it must also contain the gene_name attribute (in the 9th column), such as:

chr1 unknown exon 14362 14829 . - . gene_id "WASH7P"; gene_name "WASH7P"; transcript_id "NR_024540"; tss_id "TSS6960";

Thanks for using Galaxy! Jen -- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/wiki/Support
written 7.1 years ago by Jennifer Hillman Jackson
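For readers who want to check a reference GTF against the attribute requirements described in the reply above, here is a minimal Python sketch. It is not part of Galaxy or Cufflinks; the function names and the list of required attributes are assumptions made for illustration, and the record is rebuilt from the example line quoted above with tab separators.

```python
import re

# Matches GTF attribute pairs such as: gene_id "WASH7P";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')

# Assumed set of attributes to look for (taken from the thread, not from any tool).
REQUIRED = ("gene_id", "transcript_id", "gene_name")


def parse_attributes(column9):
    """Return the key/value pairs found in a GTF attribute column."""
    return dict(ATTR_RE.findall(column9))


def missing_attributes(gtf_line, required=REQUIRED):
    """List the required attributes that a single GTF record lacks."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    attrs = parse_attributes(fields[8])
    return [key for key in required if key not in attrs]


# The example record from the reply, with its columns joined by tabs.
example = "\t".join([
    "chr1", "unknown", "exon", "14362", "14829", ".", "-", ".",
    'gene_id "WASH7P"; gene_name "WASH7P"; transcript_id "NR_024540"; tss_id "TSS6960";',
])

# A hypothetical record that lacks gene_name.
no_name = "\t".join([
    "chr1", "unknown", "exon", "14362", "14829", ".", "-", ".",
    'gene_id "NR_024540"; transcript_id "NR_024540";',
])

print(missing_attributes(example))
print(missing_attributes(no_name))
```

Running the same check over every non-comment line of a real file would show quickly whether gene_name is absent everywhere or only for some records.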
Dear Carl, After solving a couple of other problems, I am now at the point where I can either: 1. get the FPKM values but no names, or 2. get the names but with FPKM values of zero (or "OK"?). So I was wondering whether you had solved the FPKM value problem you had with chicken? Best, Olivier On 26/10/11 02:21, "Jennifer Jackson" <jen@bx.psu.edu> wrote:
written 7.1 years ago by GANDRILLON OLIVIER
7.1 years ago by
Emilie Chautard wrote:
Hi Olivier, Did you try running Cuffcompare (part of Cufflinks) on your results? According to the Cufflinks manual (http://cufflinks.cbcb.umd.edu/manual.html): transfrags you assemble. The program cuffcompare helps you: In the Galaxy version of Cuffcompare, I think you can provide a reference annotation file using "Use Reference Annotation:", which will be compared with your Cufflinks results. It makes a "union" of the transcripts obtained with Cufflinks and those in the annotation file (both in GTF format). You can then obtain a transcript identifier for the transcripts that are already annotated. It also provides a class code for each transcript, which can indicate, for example, a potential isoform. Hope this helps. Emilie -- Emilie Chautard, PhD Postdoctoral Fellow Ontario Institute for Cancer Research MaRS Centre, South Tower 101 College Street, Suite 800 Toronto, Ontario, Canada M5G 0A3 Tel: 416-673-8518 Toll-free: 1-866-678-6427 www.oicr.on.ca
written 7.1 years ago by Emilie Chautard
Hello Olivier, Emilie and the Galaxy community - I have run into a similar problem with my RNA-seq analysis: I can run the analysis up to the point where Cufflinks produces a list of FPKM values for my genome of interest (in this case, Staphylococcus aureus strain Newman). However, I cannot find a place to download a compatible GTF file with the reference annotation. Would you or anyone else in the community know of a tool or database where GTF files could be created from another input file (such as GFF3) or, better yet, just downloaded? As for file conversion, most microbial genomes are available from NCBI in a variety of formats (but not GTF). For S. aureus Newman, these files can be found at the following link: ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/Staphylococcus_aureus_Newman_uid18801 Many thanks for your help! Joe -- Joe J. Harrison Senior Fellow Department of Microbiology University of Washington 1705 NE Pacific Street, HSB J181 Seattle, WA USA 98195
written 7.1 years ago by Joe Harrison
7.1 years ago by
United States
Jennifer Hillman Jackson wrote:
Hello Olivier, Are you using a reference gene annotation GTF file? This would be required to obtain gene symbols. If yes and this is still an issue, the two things to double check are:

1) the chromosome names in the GTF file (most commonly the problem), the reference genome, and the SAM alignment file are all the same (exactly the same), and
2) the 9th column of the GTF file contains valid gene_id (also commonly incorrect) and transcript_id attributes.

If this and everything else in the FAQ appear to be in the correct format, it might be time to contact the tool authors for advice at tophat.cufflinks@gmail.com. Best wishes for your project, Jen Galaxy team -- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/wiki/Support
written 7.1 years ago by Jennifer Hillman Jackson
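The first check in the answer above (identical chromosome names everywhere) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are invented here, the sequence names on the alignment side are read from the @SQ header lines of a SAM file, and the sample data is hypothetical (an Ensembl-style "1" in the GTF against a UCSC-style "chr1" in the alignments).

```python
def gtf_seqnames(lines):
    """Collect sequence names from column 1 of GTF records."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def sam_seqnames(lines):
    """Collect sequence names from the @SQ header lines of a SAM file."""
    names = set()
    for line in lines:
        if not line.startswith("@"):
            break  # the header section is over
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def unmatched(gtf_lines, sam_lines):
    """Names used by the GTF that do not occur in the SAM header."""
    return sorted(gtf_seqnames(gtf_lines) - sam_seqnames(sam_lines))


# Hypothetical data illustrating a naming mismatch.
gtf = ['1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n']
sam = ["@HD\tVN:1.0\tSO:coordinate\n", "@SQ\tSN:chr1\tLN:1000\n"]

print(unmatched(gtf, sam))
```

With real files, the two arguments could be open file handles. A non-empty result means at least some annotation records can never overlap an alignment, which fits the symptoms discussed in this thread (missing names or FPKM values of zero).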
7.1 years ago by
Michael Gooch wrote:
Regarding the GTF files for Cufflinks, how do I obtain one for all human mRNA that actually contains gene names rather than accession numbers? I went to the UCSC Table Browser, but their files contain accession numbers that I don't know how to decode en masse.
written 7.1 years ago by Michael Gooch
Hello Michael, The UCSC RefSeq Genes track has the data:

1) a transcript accession, in column "name"
2) a gene symbol, in column "name2"

but not in the Table Browser's GTF format output, as explained at: http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format

Ensembl is another data source choice for full functionality, as it contains: transcript_id, gene_id, and gene_name. This help from the tool authors is also worth reviewing: http://cufflinks.cbcb.umd.edu/gff.html Note that specific questions about these tools can also be directed to: tophat.cufflinks@gmail.com Hopefully this helps, Jen Galaxy team -- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/wiki/Support
written 7.1 years ago by Jennifer Hillman Jackson
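One way to "decode en masse", sketched here in Python, is to export the "name" and "name2" columns mentioned above as a two-column, tab-separated table and use it to add a gene_name attribute to each GTF record. This is an illustrative sketch, not the procedure from the genomewiki page: the table layout, the function names, and the sample record are assumptions, and the existing gene_id is left untouched.

```python
import re

TRANSCRIPT_RE = re.compile(r'transcript_id "([^"]+)";')


def load_symbols(table_lines):
    """Map transcript accession (name) to gene symbol (name2).

    Assumes a tab-separated table whose first two columns are name and name2.
    """
    symbols = {}
    for line in table_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip the header comment and blank lines
        name, name2 = line.rstrip("\n").split("\t")[:2]
        symbols[name] = name2
    return symbols


def add_gene_name(gtf_line, symbols):
    """Append a gene_name attribute when the transcript has a known symbol."""
    line = gtf_line.rstrip("\n")
    match = TRANSCRIPT_RE.search(line)
    if match is None or 'gene_name "' in line:
        return line  # nothing to look up, or already annotated
    symbol = symbols.get(match.group(1))
    if symbol is None:
        return line  # accession not present in the table
    return '%s gene_name "%s";' % (line, symbol)


# Hypothetical inputs for illustration.
table = ["#name\tname2\n", "NR_024540\tWASH7P\n"]
record = "\t".join([
    "chr1", "unknown", "exon", "14362", "14829", ".", "-", ".",
    'gene_id "NR_024540"; transcript_id "NR_024540";',
])

symbols = load_symbols(table)
print(add_gene_name(record, symbols))
```

Records whose accession is missing from the table are passed through unchanged, so they can be counted or inspected separately before the file is used as a reference annotation.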
7.1 years ago by
United States
Jennifer Hillman Jackson wrote:
Hello Olivier, When deleting data, it takes the server a short amount of time to refresh. It may take a bit longer right now since many people are performing this action at the same time. For the RNA-seq analysis question, reference annotation GTF files are used by the Cuff* programs. (These are different than the result GTF files produced by the programs). For reference annotation GTF files, there are many sources, including Ensembl and UCSC. Here are links to a tutorial and an FAQ that can help with your usage question. http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq But, there are many small details to running the tools to get the optimal results. These types of questions concerning functionality are best directed to the tool authors at tophat.cufflinks@gmail.com Take care, Jen Galaxy team -- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/wiki/Support
written 7.1 years ago by Jennifer Hillman Jackson
Dear Jennifer, Actually, the problem came from an unusable Ensembl GTF file. I used the workflow in http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq, reran the Cufflinks analysis using the new GTF file as a reference and, BINGO, it worked. Thanks for your help. Best, Olivier On 28/10/11 20:19, "Jennifer Jackson" <jen@bx.psu.edu> wrote:
written 7.1 years ago by GANDRILLON OLIVIER