Question: Should I Use iGenomes Version Of A Reference GTF For TopHat?
Du, Jianguang wrote (6.3 years ago):
Dear All, I am analysing RNA-seq datasets for differential splicing events between cell types. These are mouse cells. Jen suggested that I use the iGenomes version of the reference GTF to take full advantage of the options in CuffDiff. My question is: should I also use this iGenomes reference GTF when I run TopHat? Thanks. Jianguang
rna-seq cuffdiff • 4.1k views
Jennifer Hillman Jackson (United States) wrote (6.3 years ago):
Hello Jianguang,

When to start using the reference GTF file depends on whether you intend to do any discovery along with differential expression testing. At the TopHat and Cufflinks steps, a reference GTF file can influence how datasets map and assemble.

In general, if your intention is discovery (e.g. working with novel isoforms that are in your data but not in the reference), do not add the reference GTF until the CuffMerge step, where it is used to produce the input annotation GTF file for Cuffdiff. But if you want to guide the analysis toward known isoforms, then use the reference GTF. This is the process our RNA-seq example protocol follows: http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise

For reference, there are other variations of this on the Cufflinks web site. Some never lead to Cuffdiff, but they may still be useful to review. Please see the Cufflinks paper (linked from the right side bar as "Protocol") for many more options and discussion. http://cufflinks.cbcb.umd.edu/tutorial.html --> Common uses of the Cufflinks package

The final decision is up to you, and a few runs with different options may be a useful way to make the call, but hopefully this provides some resources to help you understand the option.

Jen
Galaxy team
--
Jennifer Jackson
http://galaxyproject.org
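As a rough command-line illustration of the two strategies described above (the thread itself uses the Galaxy tool forms, not the command line), the sketch below only builds argument lists and does not run anything. It assumes the standard TopHat and Cufflinks `-G/--GTF` options and the Cuffmerge `-g/--ref-gtf` and `-s/--ref-sequence` options. Every file and directory name (`mm9_index`, `reads.fastq`, `genes.gtf`, `assemblies.txt`, `mm9.fa`) and the helper `build_plan` are hypothetical placeholders.

```python
from typing import Dict, List, Optional


def tophat_cmd(index: str, reads: str, out_dir: str,
               ref_gtf: Optional[str] = None) -> List[str]:
    """Build a TopHat argument list; -G supplies known gene models if given."""
    cmd = ["tophat", "-o", out_dir]
    if ref_gtf:
        cmd += ["-G", ref_gtf]
    return cmd + [index, reads]


def cufflinks_cmd(bam: str, out_dir: str,
                  ref_gtf: Optional[str] = None) -> List[str]:
    """Build a Cufflinks argument list; -G restricts it to known transcripts."""
    cmd = ["cufflinks", "-o", out_dir]
    if ref_gtf:
        cmd += ["-G", ref_gtf]
    return cmd + [bam]


def cuffmerge_cmd(assembly_list: str, genome_fa: str, ref_gtf: str,
                  out_dir: str) -> List[str]:
    """Build a Cuffmerge argument list that merges assemblies with the reference."""
    return ["cuffmerge", "-o", out_dir, "-g", ref_gtf, "-s", genome_fa,
            assembly_list]


def build_plan(discovery: bool, ref_gtf: str = "genes.gtf") -> Dict[str, List[str]]:
    """Discovery: hold the reference GTF back until Cuffmerge.
    Guided: also pass it to TopHat and Cufflinks."""
    early_gtf = None if discovery else ref_gtf
    return {
        "tophat": tophat_cmd("mm9_index", "reads.fastq", "tophat_out", early_gtf),
        "cufflinks": cufflinks_cmd("tophat_out/accepted_hits.bam",
                                   "cufflinks_out", early_gtf),
        "cuffmerge": cuffmerge_cmd("assemblies.txt", "mm9.fa", ref_gtf,
                                   "merged_out"),
    }


if __name__ == "__main__":
    for mode in (True, False):
        print("discovery" if mode else "guided")
        for step, argv in build_plan(discovery=mode).items():
            print("  ", step, ":", " ".join(argv))
```

In both modes the reference GTF reaches Cuffmerge, so the merged annotation handed to Cuffdiff is built against the reference either way. The only difference is whether mapping and assembly see it earlier.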
Hi Jen,

Thanks for your help. Do you mean that if I want to find novel isoforms/splicing, I need to select "No" under "Use Reference Annotation" when I run Cufflinks, and then use the iGenomes version of the reference GTF when I run Cuffmerge?

Based on your information and some protocols found online, my understanding is:

1) If I use the iGenomes version of the reference GTF, I only need to run Cuffmerge with the Cufflinks outputs, because the iGenomes reference GTF already contains attributes such as p_id and tss_id. The Cuffmerge output can then be used for Cuffdiff.

2) However, if I use the reference GTF from Ensembl/UCSC (rather than from iGenomes), I need to run Cuffcompare to create p_id and tss_id, which are required for Cuffdiff.

Am I right?

Another question: should I use the iGenomes version of the reference GTF when I run TopHat if I want to see novel isoforms/splicing?

Thanks.
Jianguang
written 6.3 years ago by Du, Jianguang
Hello Jianguang,

Yes, according to the tool documentation, this is the method. Yes, this is the example protocol I shared.

The choice between #1 and #2 can be tricky; it depends on the order in which you run the tools and whether each is run with or without the GTF annotation. The protocol in #1 is recommended.

Yes, this is what I intended to answer in my original reply; I apologize if that was not clear. The reference GTF can influence both mapping and assembly, so both TopHat and Cufflinks. The TopHat web site has more information on this parameter (see the link on the TopHat tool form). The tool authors can also be contacted about details that are not covered in the primary documentation: tophat.cufflinks@gmail.com

Others are welcome to add to the thread with their experiences if they have used a reference annotation GTF with TopHat (or chosen not to for a particular reason that they would like to share).

Best,
Jen
Galaxy team
written 6.3 years ago by Jennifer Hillman Jackson
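One way to check the point about p_id and tss_id before deciding between Cuffmerge alone and a Cuffcompare step is to look at which attribute keys a GTF file actually carries. The following is a minimal sketch that assumes standard GTF layout (nine tab-separated columns, with attributes written as key "value"; pairs in the last column). The two sample records and the helper names are made up for illustration and are not taken from any real iGenomes or Ensembl file.

```python
import re
from typing import Iterable, Set

# Matches attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\w+)\s+"([^"]*)"')


def attribute_keys(gtf_lines: Iterable[str]) -> Set[str]:
    """Collect every attribute key seen in column 9 of the given GTF lines."""
    keys: Set[str] = set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        keys.update(match.group(1) for match in ATTR_RE.finditer(fields[8]))
    return keys


def has_cuffdiff_ids(gtf_lines: Iterable[str]) -> bool:
    """True if both p_id and tss_id appear somewhere in the annotation."""
    return {"p_id", "tss_id"} <= attribute_keys(gtf_lines)


# Illustrative records only (hypothetical identifiers).
WITH_IDS = [
    'chr1\tsrc\tCDS\t100\t200\t.\t+\t0\t'
    'gene_id "G1"; transcript_id "T1"; p_id "P1"; tss_id "TSS1";\n',
]
WITHOUT_IDS = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\t'
    'gene_id "G1"; transcript_id "T1";\n',
]

if __name__ == "__main__":
    print(has_cuffdiff_ids(WITH_IDS))
    print(has_cuffdiff_ids(WITHOUT_IDS))
```

For a real file, the same functions can be given an open file handle, for example `has_cuffdiff_ids(open("genes.gtf"))`, where the path is again a placeholder.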
Hi Jen, Thank you very much for your help. Jianguang
written 6.3 years ago by Du, Jianguang
Du, Jianguang wrote (6.3 years ago):
Hi Jen,

I had a problem when I tried to run TopHat with the iGenomes reference GTF. What I did was:

1) Uploaded the iGenomes version of the mm9 genes.gtf via: Shared Data -> Data Libraries -> iGenomes -> click "genes.gtf" under "mm9" -> click "Go" for "Import to current history". The genes.gtf appeared in the history and turned green.

2) Clicked "Tophat for Illumina Find splice junctions using RNA-seq data" to open the window of "Tophat for Illumina (version 1.5.0)".

3) Selected the dataset to be analysed under "RNA-Seq FASTQ file:".

4) Chose "Use one from the history" under "Will you select a reference genome from your history or use a built-in index?:".

Then the screen refreshed and the box (pull-down menu) under "Select the reference genome:" became smaller. Nothing showed up in the pull-down menu (in fact the menu cannot be pulled down), so I could not input the iGenomes reference GTF. It looks like TopHat can only "Use a built-in index". How can I solve this problem?

Thanks in advance.
Jianguang
Hello Jianguang,

Two different kinds of data are being mixed up: the reference genome (format: fasta) and the reference annotation (format: GTF).

To map your sequences against the mm9 reference genome, choose the locally cached (built-in) index and select mm9 from the pull-down menu. Then, optionally, if you want to guide the mapping with a reference annotation GTF file, that is what the genes.gtf file is for. The option is set on the TopHat form under:

TopHat settings to use: Full parameter list
Use Own Junctions: Yes
Use Gene Annotation Model: Yes
Gene Model Annotations:
written 6.3 years ago by Jennifer Hillman Jackson
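The genome versus annotation distinction above can be made concrete with a quick format check. This is only a heuristic sketch, not how Galaxy itself assigns datatypes: it assumes that a FASTA file starts with a ">" header line and that a GTF record has exactly nine tab-separated columns. The function name and the sample lines are hypothetical.

```python
from typing import Iterable


def sniff_reference_kind(first_lines: Iterable[str]) -> str:
    """Guess whether the lines come from a genome (FASTA) or an annotation (GTF).

    Returns "fasta", "gtf", or "unknown", based on the first line that is
    neither blank nor a comment.
    """
    for line in first_lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith(">"):
            return "fasta"  # sequence header: suitable as a reference genome
        if len(line.rstrip("\n").split("\t")) == 9:
            return "gtf"    # nine columns: suitable as a gene annotation
        return "unknown"
    return "unknown"


if __name__ == "__main__":
    genome = [">chr1\n", "ACGTACGTNNNN\n"]
    annotation = ['chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "G1";\n']
    print(sniff_reference_kind(genome))
    print(sniff_reference_kind(annotation))
```

A file that sniffs as "gtf" belongs in the gene annotation setting of the TopHat form, not in the reference genome selector, which is why the genome pull-down stayed empty in the steps described earlier.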