I have two more questions about settings for Tophat.
My aim is to look for the differential splicing events between cell
After I checked "Use Own Junctions", three more options came out:
1) "Use Gene Annotation Model"
2) "Use raw Junctions"
3) "Only look for supplied junctions"
As instructed by Jen, I checked "Use Gene Annotation Model", and input
iGenome mm9 genes.gtf as "Gene Model Annotations".
However, I am not sure if I should also select "Use raw junctions" and
"Only look for supplied junctions". Please help me set up these two
options.
Using known reference annotation ("Use Own Junctions", etc.) is not a
part of the example RNA-seq tutorial our team has published for this
type of analysis. That doesn't mean that it cannot be used, but whether
it could or should be used probably needs to be tested to see if the
results from the various options meet your needs.
"Use raw Junctions" = will combine both the reference annotation and
novel junctions in the final junctions called.
"Only look for supplied junctions" = will limit the results to only those
junctions in the reference annotation (no novel junctions from the input).
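For anyone running TopHat outside Galaxy, here is a rough sketch of how these
three options might translate to the command line. It assumes the Galaxy
settings correspond to TopHat's -G/--GTF, -j/--raw-juncs and --no-novel-juncs
flags, which is worth checking against the TopHat manual for your version. The
helper name tophat_args and the file names are made up for illustration.

```python
def tophat_args(index, reads, gtf=None, raw_juncs=None, only_supplied=False):
    """Build a TopHat argument list (hypothetical helper, not part of Galaxy).

    Assumed mapping of the Galaxy options to TopHat flags:
      "Use Gene Annotation Model"        -> -G <genes.gtf>
      "Use raw Junctions"                -> -j <junctions file>
      "Only look for supplied junctions" -> --no-novel-juncs
    """
    args = ["tophat"]
    if gtf:
        args += ["-G", gtf]
    if raw_juncs:
        args += ["-j", raw_juncs]
    if only_supplied:
        # Restricting to supplied junctions only makes sense when some
        # junction source (-G or -j) is given, so this sketch refuses otherwise.
        if not (gtf or raw_juncs):
            raise ValueError("only_supplied needs a GTF or a raw junctions file")
        args.append("--no-novel-juncs")
    args += [index, reads]
    return args


# Annotation-guided run that still allows novel junctions
guided = tophat_args("mm9", "reads.fastq", gtf="genes.gtf")

# Annotation-only run (no novel junction discovery)
known_only = tophat_args("mm9", "reads.fastq", gtf="genes.gtf", only_supplied=True)
```

The resulting lists can be printed for inspection or passed to subprocess.run
once the paths point at real files.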
As I explained earlier, based on the TopHat documentation, using
reference annotation at all causes those junctions to be given some
favorable bias during the mapping.
When thinking about the options, a lot probably depends on how well the
genome is annotated vs how novel your data is. This may not be known
upfront. Also, if you are interested in discovery, it would probably be
important to consider whether you want to bias towards known junctions
early in the analysis - we didn't in our tutorial. However, it may be
that you want to map primarily or only to characterized splicing
events that are known (or suspected) to be linked to disease or other
expression profiles of interest, and that are already present in the
reference annotation. In that case, using reference annotation may
help to focus the results. Ultimately this is a decision you will need
to make according to your end goals - and some testing would be
recommended. Try a few runs with the different options and compare the
TopHat mapped & Cufflinks assembled transcripts differences at the
level and see which make the most sense - the Trackster tool
("Visualization") would be good for this.
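Besides looking at the runs in a browser, one rough way to compare them is to
diff the junction sets that each run reports. The sketch below assumes the
junctions are in TopHat-style BED output (12 columns, two blocks flanking each
junction, so the intron runs from the end of the first block to the start of
the second). The function names and the example records are made up for
illustration.

```python
import io


def read_junctions(handle):
    """Return a set of (chrom, intron_start, intron_end, strand) tuples.

    Assumes BED12 lines with two blocks per junction; track/header lines
    and malformed lines are skipped.
    """
    juncs = set()
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue
        chrom, start, strand = fields[0], int(fields[1]), fields[5]
        sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
        starts = [int(x) for x in fields[11].rstrip(",").split(",")]
        if len(sizes) < 2 or len(starts) < 2:
            continue
        # Intron = end of first block .. start of second block (0-based)
        juncs.add((chrom, start + sizes[0], start + starts[1], strand))
    return juncs


def compare_runs(run_a, run_b):
    """Split two junction sets into shared and run-specific junctions."""
    return {
        "shared": run_a & run_b,
        "only_a": run_a - run_b,
        "only_b": run_b - run_a,
    }


# Made-up example records standing in for two junctions.bed files
bed_a = io.StringIO(
    "track name=junctions\n"
    "chr1\t100\t300\tJUNC1\t5\t+\t100\t300\t255,0,0\t2\t50,40\t0,160\n"
    "chr1\t500\t900\tJUNC2\t3\t-\t500\t900\t255,0,0\t2\t30,60\t0,340\n"
)
bed_b = io.StringIO(
    "chr1\t110\t290\tJUNC1\t8\t+\t110\t290\t255,0,0\t2\t40,30\t0,150\n"
)

result = compare_runs(read_junctions(bed_a), read_junctions(bed_b))
```

Here the first record of each run describes the same intron with different
read overhangs, so it ends up in "shared", while the second junction of run A
lands in "only_a". With real data, replace the StringIO objects with open()
on each run's junctions file.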
Apologies for not being more specific, but there is no single answer to
this question. You might try asking at firstname.lastname@example.org for
advice from that community or the tool authors or searching at
seqanswers.com to see what others have been doing. This may give you a
feel for the general usage trends (but it probably won't replace your