Question: Cuffdiff Significance Issue
2.7 years ago by
United States
mikewalk2 wrote:

Hello, I was wondering which parameters Cuffdiff uses to calculate significance. I recently analyzed three replicates of RNA-seq data and successfully identified the differentially expressed transcripts between 2 conditions. However, with this most recent data set, I see very large fold changes in some transcripts, but none are deemed significant. What could be causing this?

* My more recent run had data sets with a much lower percentage of mapped reads (~5%).
* My previous, successful run used paired-end alignments, and this run uses single-end.

All other parameters were kept the same.

I would appreciate any advice! Thanks, Mike

rna-seq tophat cufflinks
2.7 years ago by
United States
Jennifer Hillman Jackson wrote:


The technical details of the algorithm are here. The command-line tool is wrapped in Galaxy, but the underlying tool is the same:

With only 5% of the reads mapped, my guess is that there is not enough coverage to compute significance. Maybe something went wrong with the mapping? Double-check that the fastq data is actually in "fastqsanger" format; a mismatch here is the most common reason for poor mapping rates. Also double-check the target reference genome. Tophat is not designed to map well across species, and it is easy to accidentally map against the default genome instead of the true target (another common reason for poor mapping results).
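As a rough illustration of the format check described above, here is a minimal Python sketch that guesses the quality-score offset from the quality lines of a fastq file. It is only a heuristic and not what Galaxy's own tools do internally: the function names, the 10,000-record sample size, and the ASCII thresholds (characters below 59 imply Phred+33, which is what fastqsanger uses; characters above 74 imply a +64 offset) are assumptions made for this example. It also assumes plain four-line fastq records.

```python
import itertools


def quality_lines(fastq_lines):
    """Yield the quality string (every 4th line) of 4-line FASTQ records."""
    return itertools.islice(fastq_lines, 3, None, 4)


def guess_quality_encoding(fastq_lines, max_records=10000):
    """Heuristically guess the quality offset from a sample of records.

    Returns "phred33" (fastqsanger-style), "phred64" (older Illumina or
    Solexa style, which would need grooming), "ambiguous", or "unknown".
    """
    sample = itertools.islice(quality_lines(fastq_lines), max_records)
    codes = [ord(ch) for qual in sample for ch in qual.rstrip("\r\n")]
    if not codes:
        return "unknown"      # no quality characters were found
    if min(codes) < 59:
        return "phred33"      # characters this low only occur with offset 33
    if max(codes) > 74:
        return "phred64"      # characters this high only occur with offset 64
    return "ambiguous"        # the sampled range fits either offset
```

To try it on a real file, pass an open file handle, for example `guess_quality_encoding(open("reads.fastq"))`, where `reads.fastq` is a placeholder name. An "ambiguous" result just means the sampled scores fall in the range shared by both encodings, so check the encoding that FastQC reports instead.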

Running FastQC on your fastq data may also give some clues about content that can lead to poor mapping rates. If you run FastQC first to verify fastqsanger format and then groom the data or perform other QC actions that alter the content, be sure to run it again afterward. Also note that the advanced Tophat option "Minimum length of read segments" must be at least half the length of the input fastq reads, or bias and poor mapping rates can result.
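The segment-length rule above can be checked with a few lines of Python. This is a sketch under stated assumptions, not part of Tophat or Galaxy: the helper names are made up for this example, the rule is applied to the longest read in a sample of the file, and the input is assumed to be plain four-line fastq records.

```python
import itertools
import math


def read_lengths(fastq_lines, max_records=10000):
    """Return the sequence lengths of a sample of 4-line FASTQ records."""
    sequences = itertools.islice(fastq_lines, 1, None, 4)
    return [len(seq.strip()) for seq in itertools.islice(sequences, max_records)]


def min_segment_length(read_length):
    """Smallest whole number that is at least half the read length."""
    return math.ceil(read_length / 2)


def segment_length_ok(fastq_lines, segment_length):
    """Check a segment length against the longest sampled read (an assumption)."""
    lengths = read_lengths(fastq_lines)
    if not lengths:
        raise ValueError("no reads found in input")
    return segment_length >= min_segment_length(max(lengths))
```

For example, with 50-base reads the sketch accepts a segment length of 25 and rejects 20; with 75-base reads the smallest accepted value is 38.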

Hopefully this leads to a solution, Jen, Galaxy team
