I have a very basic question. I have RNA-seq datasets of several cell
types and want to compare the alternative splicing events between cell
types. The reads are 36 nt in length. Are these reads long enough to
map accurately to splice junctions when I run TopHat with
stringent parameters (no mismatches)?
36 bp reads will map across splice junctions, but at a relatively low
rate; you can try changing the segment length to get better mapping, but
you'll want to evaluate the output carefully to make sure the junction
calls are reliable.
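To get a rough feel for why short reads span junctions at a low rate, here is a minimal sketch that counts how many junction positions inside a read leave enough bases on both sides. The function name and the minimum-anchor values are illustrative assumptions, not TopHat internals.

```python
def junction_offsets(read_len, min_anchor):
    """Return the cut positions inside a read that leave at least
    min_anchor bases on each side of the junction."""
    return [
        i for i in range(1, read_len)
        if i >= min_anchor and read_len - i >= min_anchor
    ]


if __name__ == "__main__":
    read_len = 36  # read length from the question
    for min_anchor in (5, 8):  # hypothetical anchor requirements
        usable = junction_offsets(read_len, min_anchor)
        print(f"anchor >= {min_anchor}: {len(usable)} of {read_len - 1} "
              "possible junction positions are usable")
```

With 36 nt reads, requiring 8 bases on each side leaves 21 of the 35 possible positions, and requiring 5 leaves 27. A longer anchor is more specific but throws away more junction-spanning reads.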
No, in general the probability of mapping 5 bases + (N-5) remaining
bases incorrectly is higher than that of mapping 8 bases + (N-8) bases
incorrectly, because (a) a given 5-mer matches at many more places in
a genome than a given 8-mer and (b) there can be mismatches when
mapping the remainder.
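Point (a) can be put in rough numbers. This is a back-of-the-envelope sketch that assumes a uniform random sequence (real genomes are not random, so treat the output as an order-of-magnitude guide only). The function name and the 10 kb search window are assumptions made for illustration.

```python
def expected_random_hits(k, region_len):
    """Expected number of chance matches of one specific k-mer in a
    region of region_len bases, assuming uniform random A/C/G/T."""
    positions = max(region_len - k + 1, 0)  # possible start positions
    return positions / 4 ** k


if __name__ == "__main__":
    window = 10_000  # hypothetical window searched for the short anchor
    for k in (5, 8):
        hits = expected_random_hits(k, window)
        print(f"{k}-mer: about {hits:.2f} chance matches in {window} bp")
```

Under this model each extra base cuts the chance matches by a factor of 4, so a 5-base anchor has roughly 64 times as many spurious placements as an 8-base anchor in the same window.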