Should I "Use Raw Junction" And "Only Look For Supplied Junctions"


6.3 years ago by

Du, Jianguang • 380 wrote:

Dear All,

I have two more questions about settings for TopHat. My aim is to look for differential splicing events between cell types. After I checked "Use Own Junctions", three more options appeared:

1) "Use Gene Annotation Model"
2) "Use raw Junctions"
3) "Only look for supplied junctions"

As instructed by Jen, I checked "Use Gene Annotation Model" and supplied the iGenome mm9 genes.gtf as "Gene Model Annotations". However, I am not sure whether I should also choose "Use raw junctions" and "Only look for supplied junctions". Please help me set these two options.

Thanks,
Jianguang
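
For readers who run TopHat outside Galaxy, the three form options listed above appear to correspond to TopHat's command-line options `-G/--GTF`, `-j/--raw-juncs` and `--no-novel-juncs`. That mapping is an assumption about the Galaxy form, not something stated in this thread. The sketch below only assembles argument lists for three junction configurations; the helper name `build_tophat_command` and all file and index names are hypothetical.

```python
from typing import List, Optional


def build_tophat_command(
    bowtie_index: str,
    reads: str,
    output_dir: str,
    gtf: Optional[str] = None,
    raw_juncs: Optional[str] = None,
    only_supplied: bool = False,
) -> List[str]:
    """Assemble a TopHat argument list for one junction configuration.

    Assumed mapping to the Galaxy form (not confirmed in the thread):
      "Use Gene Annotation Model"        -> -G <genes.gtf>
      "Use raw Junctions"                -> -j <junctions file>
      "Only look for supplied junctions" -> --no-novel-juncs
    """
    cmd = ["tophat", "-o", output_dir]
    if gtf:
        cmd += ["-G", gtf]
    if raw_juncs:
        cmd += ["-j", raw_juncs]
    if only_supplied:
        # Restricting to supplied junctions only makes sense when
        # an annotation or a junction list is actually supplied.
        if not (gtf or raw_juncs):
            raise ValueError("--no-novel-juncs needs -G or -j")
        cmd.append("--no-novel-juncs")
    cmd += [bowtie_index, reads]
    return cmd


if __name__ == "__main__":
    # Hypothetical index, reads and annotation names, for illustration only.
    runs = {
        "no_annotation": build_tophat_command("mm9", "reads.fastq", "out_plain"),
        "annotation_plus_novel": build_tophat_command(
            "mm9", "reads.fastq", "out_gtf", gtf="genes.gtf"
        ),
        "annotation_only": build_tophat_command(
            "mm9", "reads.fastq", "out_gtf_only", gtf="genes.gtf", only_supplied=True
        ),
    }
    for name, cmd in runs.items():
        print(name, ":", " ".join(cmd))
```

The script prints the commands rather than running them, so the three configurations can be reviewed side by side before any mapping job is started.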

rna-seq tophat • 1.4k views


modified 6.3 years ago by Jennifer Hillman Jackson ♦ 25k • written 6.3 years ago by Du, Jianguang • 380

6.3 years ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hi Jianguang,

Using known reference annotation ("Use Own Junctions", etc.) is not part of the example RNA-seq tutorial our team has published for this type of analysis. That doesn't mean it cannot be used, but how it could or should be used probably needs to be tested to see whether the results from the various options meet your needs.

"Use raw Junctions" = will combine both the reference annotation and the novel junctions in the final junctions called.

"Only look for supplied junctions" = will limit the results to only those junctions in the reference annotation (no novel junctions from the input).

As I explained earlier, based on the TopHat documentation, using reference annotation at all gives those junctions some favorable bias during mapping.

When thinking about the options, a lot probably depends on how well the genome is annotated versus how novel your data is. This may not be known upfront. Also, if you are interested in discovery, it would probably be important to consider whether you want to bias towards known annotation early in the analysis; we didn't in our tutorial. However, it may be that you want to map primarily, or only, to characterized splicing events that are known (or suspected) to be linked to disease or other expression profiles of interest and that are already present in the reference annotation. In that case, using reference annotation could help to focus the results.

Ultimately this is a decision you will need to make according to your end goals, and some testing is recommended. Try a few runs with the different options, compare the differences between the TopHat mapped reads and the Cufflinks assembled transcripts at the gene level, and see which results make the most sense. The Trackster tool ("Visualization") would be good for this. Apologies for not being more specific, but there is no single answer to this question.
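
The run-to-run comparison suggested above can also be done numerically, alongside visual inspection in Trackster. The following is a minimal sketch, assuming each run produced a TopHat-style `junctions.bed` in which every junction is a BED12 record with two blocks (the flanking read overhangs), so that the intron lies between the end of the first block and the start of the second. The function names and the two example records are made up for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# (chromosome, intron start, intron end, strand)
Junction = Tuple[str, int, int, str]


def read_junctions(lines: Iterable[str]) -> Set[Junction]:
    """Collect intron coordinates from two-block BED12 junction records."""
    junctions: Set[Junction] = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skips track headers, blank lines, short records
        chrom, start, strand = fields[0], int(fields[1]), fields[5]
        block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
        block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
        if len(block_sizes) < 2 or len(block_starts) < 2:
            continue  # not a junction-style record
        intron_start = start + block_sizes[0]   # end of the left overhang
        intron_end = start + block_starts[1]    # start of the right overhang
        junctions.add((chrom, intron_start, intron_end, strand))
    return junctions


def compare_runs(a: Set[Junction], b: Set[Junction]) -> Dict[str, Set[Junction]]:
    """Split junctions into those shared by both runs and those unique to each."""
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


if __name__ == "__main__":
    # Made-up records standing in for two junctions.bed files.
    run_a = [
        "track name=junctions\n",
        "chr1\t100\t300\tJUNC1\t5\t+\t100\t300\t255,0,0\t2\t20,30\t0,170\n",
        "chr1\t500\t900\tJUNC2\t3\t-\t500\t900\t255,0,0\t2\t25,25\t0,375\n",
    ]
    run_b = [
        "chr1\t110\t295\tJUNC1\t8\t+\t110\t295\t255,0,0\t2\t10,25\t0,160\n",
    ]
    result = compare_runs(read_junctions(run_a), read_junctions(run_b))
    for label, juncs in result.items():
        print(label, len(juncs), sorted(juncs))
```

Comparing intron coordinates rather than whole BED lines matters here: the same junction can be reported with different overhang lengths and read counts in different runs, as in the two `JUNC1` records above. With real data, pass open file handles to `read_junctions` instead of the in-memory lists.
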
You might try asking at tophat.cufflinks@gmail.com for advice from that community or the tool authors, or searching at seqanswers.com to see what others have been doing. This may give you a feel for the general usage trends (but it probably won't replace your own testing).

Take care,
Jen
Galaxy team

--
Jennifer Jackson
http://galaxyproject.org

written 6.3 years ago by Jennifer Hillman Jackson ♦ 25k
