Question: DESeq2 gives fold changes for transcripts with zero counts in all samples
6 weeks ago, dominique.tuerkowsky wrote:

I ran DESeq2 on transcript counts generated by StringTie. For some transcripts, the StringTie output has zero counts in both replicates of both samples. Still, DESeq2 reports a fold change, a standard error, etc. for them; see below for an example. Is this a bug?

transcript_id   sample1
ENST00000604711.1   0

transcript_id   sample1
ENST00000604711.1   0

transcript_id   sample1
ENST00000604711.1   0

transcript_id   sample1
ENST00000604711.1   0

-

GeneID             Base mean    log2(FC)    StdErr       Wald-Stats   P-value      P-adj
ENST00000604711.1  772.7192794  8.96405484  2.764982398  3.241993456  0.001186967  0.068577747
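
A quick sanity check on the result row: DESeq2 documents the base mean as the mean of normalized counts across all samples, so a transcript whose raw counts are zero in every sample should come out with a base mean of 0 (and NA for the fold change and p-values), not roughly 772.7. The Python sketch below only illustrates that arithmetic; it is not DESeq2 itself, and the size factors and non-zero counts are invented for illustration rather than taken from the data above.

```python
def base_mean(counts, size_factors):
    """Mean of size-factor-normalized counts across samples."""
    if len(counts) != len(size_factors):
        raise ValueError("one size factor per sample is required")
    normalized = [c / s for c, s in zip(counts, size_factors)]
    return sum(normalized) / len(normalized)


# Hypothetical size factors for four samples (two conditions x two replicates).
size_factors = [0.8, 1.1, 0.9, 1.3]

# Zero counts everywhere: dividing by any positive size factor still gives zero.
print(base_mean([0, 0, 0, 0], size_factors))  # 0.0

# Invented counts: a large base mean requires non-zero input somewhere.
print(base_mean([0, 0, 1500, 1700], size_factors) > 0)  # True
```

If this reasoning holds, a base mean in the hundreds suggests that the counts DESeq2 actually received for this identifier were not all zero, which points at the input tables or ID matching rather than at the statistics.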
6 weeks ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

This looks odd but could be possible if some transcripts associated with a gene have counts and others do not.

The "GeneID" in your data is actually a transcript identifier. There might be a problem with the reference GTF. Are the transcript_id and gene_id values the same (in the 9th field of the GTF)? For the DESeq2 summaries to be "by gene", you will need a reference GTF with distinct values. If you do switch the GTF, be sure to rerun all jobs that use it as an input within the same analysis.
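
The GTF check suggested above can be scripted. This is a minimal Python sketch, assuming a standard tab-separated GTF whose 9th column holds `key "value";` pairs; the function names and the example records (`GENE_A`, `TX_A1`, `TX_B1`) are made up for illustration.

```python
import re

# Matches attribute pairs such as: gene_id "ENSG..."; transcript_id "ENST...";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(field):
    """Turn the 9th GTF column into a dict of attribute name -> value."""
    return dict(ATTR_RE.findall(field))


def rows_with_identical_ids(gtf_lines):
    """Yield (line_number, id) for records whose gene_id equals transcript_id."""
    for number, line in enumerate(gtf_lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a complete GTF record
        attrs = parse_attributes(columns[8])
        gene_id = attrs.get("gene_id")
        transcript_id = attrs.get("transcript_id")
        if gene_id is not None and gene_id == transcript_id:
            yield number, gene_id


# Made-up records: the second one reuses the transcript ID as the gene ID.
example = [
    'chr1\tdemo\ttranscript\t100\t900\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A1";',
    'chr1\tdemo\ttranscript\t100\t900\t.\t+\t.\tgene_id "TX_B1"; transcript_id "TX_B1";',
]
print(list(rows_with_identical_ids(example)))  # [(2, 'TX_B1')]
```

For a real annotation, an open file handle can be passed in place of `example`; an empty result would suggest the gene and transcript identifiers are distinct, as expected.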

Gencode and iGenomes are both good sources for human GTFs that are a match for hg19/hg38. For how to get these into Galaxy, please see this prior Q&A: https://biostar.usegalaxy.org/p/29343/

Thanks, Jen, Galaxy team

6 weeks ago, m.bernt wrote:

Hi Jen,

Thanks for your response.

We are actually trying to evaluate the differential expression of the transcripts. Therefore we want to use the counts that are determined by StringTie for the transcripts.

I guess we have only the counts of the transcripts in the count table.

We use the reference annotation from Gencode.

Best, Matthias


If all of the StringTie count tables input to DESeq2 have zero as the count for a particular transcript, and the gene/transcript values are distinct (one or more transcripts link to a single gene), I'm not sure how you ended up with this result. If you need more feedback, see if you can reproduce this at a public Galaxy where a link to the history could be shared.
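
Before reproducing the run, it may help to confirm that every count table really holds zero for the transcript in question. The following Python sketch assumes each table is a two-column, whitespace- or tab-separated file with a `transcript_id` header row and integer counts, as in the excerpts in the question; the function names are hypothetical helpers, not part of StringTie or DESeq2.

```python
def read_counts(lines):
    """Parse 'transcript_id <count>' rows (header skipped) into a dict."""
    counts = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or fields[0] == "transcript_id":
            continue  # skip the header and malformed rows
        counts[fields[0]] = int(fields[1])
    return counts


def counts_across_tables(tables, transcript_id):
    """Return the count for transcript_id in each table (None if absent)."""
    return [table.get(transcript_id) for table in tables]


def all_zero_transcripts(tables):
    """IDs present in every table whose count is zero in all of them."""
    shared = set(tables[0]).intersection(*tables[1:])
    return sorted(t for t in shared if all(table[t] == 0 for table in tables))


# The four single-row tables quoted in the question.
tables = [
    read_counts(["transcript_id\tsample1", "ENST00000604711.1\t0"])
    for _ in range(4)
]
print(counts_across_tables(tables, "ENST00000604711.1"))  # [0, 0, 0, 0]
print(all_zero_transcripts(tables))  # ['ENST00000604711.1']
```

A `None` in the first result would mean the identifier is missing from one of the tables, which is worth ruling out before comparing against the DESeq2 output.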

Reply written 6 weeks ago by Jennifer Hillman Jackson
6 weeks ago, m.bernt wrote:

That's a great idea. Dominique, could you upload the transcripts / count table and run DESeq2 on usegalaxy.eu? I guess you need to register an account there to have a permanent history to share.
