Question: Cufflinks and inner mean distance
CF wrote (2.1 years ago):

Greetings,

I am attempting to use Cufflinks on alignments produced by running Tophat on some paired-end data. When I try to set advanced Cufflinks options and change the inner mean distance to -20 (estimated previously using Bowtie and Picard, etc.), Cufflinks returns the following error:

Fatal error: Exit code 1 () Error running cufflinks. return code = 1 -m/ --frag-len-mean arg must be at least 0

When I set the distance to 0, Cufflinks runs just fine. I know that Tophat is able to accommodate negative inner distances, but I don't know about Cufflinks. Does anyone know how I should incorporate this into my analysis?

Thanks for your time.

Tags: options, tuxedo, cufflinks
Jennifer Hillman Jackson (United States) wrote (2.1 years ago):

Hello,

The tool does not accept negative mean inner distance values (which describe overlapping paired reads). The good news is that setting these values manually is no longer necessary: recent versions of Cufflinks estimate them at run time from the properly paired reads in the input BAM alignments. See the manual: http://cole-trapnell-lab.github.io/cufflinks/cufflinks/index.html#advanced-abundance-estimation-options
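One detail that may explain the error: the message names -m/--frag-len-mean, which the Cufflinks manual describes as the expected mean fragment length, not the gap between the two mates. If that is the quantity being validated, a negative inner distance can still correspond to a positive fragment length. A minimal sketch of the arithmetic, assuming both mates have the same length and untrimmed reads (the read length of 100 is a made-up example value, not taken from the question):

```python
def fragment_length_from_inner_distance(read_length: int, inner_distance: int) -> int:
    """Fragment length implied by a mate inner distance.

    Assumes both mates are read_length bases long. A negative
    inner_distance means the two mates overlap by that many bases.
    """
    return 2 * read_length + inner_distance


def inner_distance_from_fragment_length(read_length: int, fragment_length: int) -> int:
    """Inverse conversion: gap (or overlap, if negative) between the mates."""
    return fragment_length - 2 * read_length


# Hypothetical example: 100 bp mates overlapping by 20 bp.
example_fragment = fragment_length_from_inner_distance(100, -20)
print(example_fragment)
```

With these example numbers the implied fragment length is 180, which satisfies the "at least 0" check even though the inner distance is negative.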

For all cases I can think of, using the actual alignment data to estimate the insert size/inner distance would be preferred. This is my opinion only, primarily because even carefully executed library construction protocols do not always produce the targeted/expected insert sizes/read lengths.
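For anyone who wants to look at the alignment-derived numbers directly, here is a rough sketch of estimating the fragment length from SAM text records (for example, a BAM converted to SAM). It is not how Cufflinks computes its estimate internally; it only reads the standard SAM FLAG and TLEN columns. The function names and the max_len cutoff are illustrative assumptions. The cutoff matters for spliced RNA-seq alignments, because TLEN is measured on the genome and pairs that span an intron will look much longer than the real fragment.

```python
import statistics

PROPER_PAIR = 0x2    # SAM flag bit: pair aligned as a proper pair
FIRST_IN_PAIR = 0x40  # SAM flag bit: first mate, used to count each pair once


def fragment_lengths(sam_lines, max_len=1000):
    """Collect |TLEN| for properly paired first mates.

    max_len is an assumed cutoff to drop pairs whose genomic span
    is inflated by introns; tune or remove it for your data.
    """
    lengths = []
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag = int(fields[1])
        tlen = abs(int(fields[8]))        # column 9 is TLEN
        if flag & PROPER_PAIR and flag & FIRST_IN_PAIR and 0 < tlen <= max_len:
            lengths.append(tlen)
    return lengths


def summarize(lengths):
    """Return (mean, standard deviation) of the collected lengths."""
    mean = statistics.mean(lengths)
    sd = statistics.stdev(lengths) if len(lengths) > 1 else 0.0
    return mean, sd


# Made-up records for illustration only (flag 99 = first mate, 147 = second mate).
demo = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t99\tchr1\t100\t50\t100M\t=\t180\t180\t*\t*",
    "r1\t147\tchr1\t180\t50\t100M\t=\t100\t-180\t*\t*",
    "r2\t99\tchr1\t500\t50\t100M\t=\t576\t176\t*\t*",
    "r2\t147\tchr1\t576\t50\t100M\t=\t500\t-176\t*\t*",
]
demo_mean, demo_sd = summarize(fragment_lengths(demo))
print(demo_mean, demo_sd)
```

Subtracting twice the read length from the mean gives the corresponding inner distance, which will be negative when the mates overlap.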

If you wish to test and compare, run one job with the value set manually (mean inner distance = 0) and another in which Cufflinks estimates it from the BAM alignments (this advanced setting left unused). Then visualize a few example gene regions to see which run produces results that better suit your analysis goals. I suggest reviewing at least one well-characterized region and at least one region that contains novel data from your samples (novel transcripts from the "discovery" protocol, e.g. an analysis that includes a Cuffmerge GTF as the reference annotation). Including the reference annotation GTFs in the visualization, both the baseline known transcripts (public GTF) and the known+novel transcripts identified by Cufflinks that include your reads (the output from Cuffmerge), adds context for the regions examined.

Others are welcome to offer their opinions and/or experimental advice!

Take care, Jen, Galaxy team
