Hi there, this issue has been posted about before, but I couldn't find a definitive answer. I noticed that the DESeq2 normalized counts table has one replicate that is not normalized: the counts for that replicate are exactly the same as in its htseq-count file. I first reran DESeq2 with the same datasets, but that replicate was still not normalized. Then I went back to the beginning: I realigned that dataset, reran htseq-count, and used the new file in the DESeq2 run. That replicate is still not normalized. Throughout, I made sure to use the exact same settings as for the samples that normalized correctly. I'd really appreciate some more guidance on this! Thanks!
Hello,
I didn't find a Normalized Counts file with unnormalized values in your most recent active history at Galaxy Main https://usegalaxy.org. htseq-count reports whole numbers; the normalized counts are not whole numbers. So, I don't think this is a Galaxy tool wrapper problem/bug.
The input content, factor matrix structure, and settings all affect how normalization is applied (and how fold change is calculated).
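To make the normalization step concrete, here is a minimal sketch of the median-of-ratios size factor calculation that, as far as I understand from the DESeq2 documentation, is the default method. This is a plain NumPy reimplementation for illustration, not DESeq2's own code, and the function names `size_factors` and `normalize` are made up for this example. It assumes a genes x samples array of raw counts.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples array of raw counts.

    Illustrative sketch only; not the DESeq2 implementation.
    """
    counts = np.asarray(counts, dtype=float)
    with np.errstate(divide="ignore"):
        log_counts = np.log(counts)          # log(0) becomes -inf
    # Per-gene log geometric mean across samples.
    log_geo_means = log_counts.mean(axis=1)
    # Genes with a zero in any sample have a -inf mean and are skipped.
    usable = np.isfinite(log_geo_means)
    if not usable.any():
        raise ValueError("every gene has a zero count in at least one sample")
    # Per-sample median of the log ratios to the geometric mean.
    log_ratios = log_counts[usable] - log_geo_means[usable, None]
    return np.exp(np.median(log_ratios, axis=0))

def normalize(counts):
    """Divide each sample column by its size factor."""
    counts = np.asarray(counts, dtype=float)
    return counts / size_factors(counts)
```

Each sample column is divided by a single number, so the arithmetic only leaves a column identical to the raw counts when that sample's size factor is 1. Whether that applies to the replicate in question can't be told from the thread alone.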
Bioconductor help for DESeq2: http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html
If that does not clear things up, ask a question at their support site (or search first, then ask if the question is novel): http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#how-can-i-get-support-for-deseq2
Thanks, Jen, Galaxy team
Hi Jen, that is exactly what I'm seeing: 5 of the 6 replicates have values with several decimal places, but one has just the whole numbers, the same as in its htseq-count file. The normalized counts files in the Py Comparative data set should show this clearly: the SG-2 replicate is not normalizing. I tried re-uploading the problem data set and even re-ran all six replicates with a new reference genome and transcriptome. Even then, this one replicate remains unnormalized. I checked the htseq-count file; there seems to be nothing wrong with it, and it looks similar to the htseq-count files of the other replicates. I also re-uploaded the tabular htseq-count file for that replicate and tried DESeq2 again, to no avail. All of these replicates were run with the exact same settings, so I don't understand why 5 would normalize correctly but the other wouldn't. Could it be an issue with the raw fasta file from that run? Any additional guidance would be appreciated. Thanks
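One way to pin down which columns look unnormalized is to scan the normalized counts table for sample columns that contain only whole numbers. The sketch below is a rough helper for that, under some assumptions: the table is tab-separated, has a header row, and keeps gene IDs in the first column. The function name `whole_number_columns` and the sample name `SG-1` in the usage example are hypothetical; only `SG-2` comes from this thread.

```python
import csv
import io

def whole_number_columns(table_text, delimiter="\t"):
    """Return names of sample columns whose values are all whole numbers."""
    reader = csv.reader(io.StringIO(table_text), delimiter=delimiter)
    header = next(reader)
    samples = header[1:]                 # first column is assumed to hold gene IDs
    all_whole = {name: True for name in samples}
    for row in reader:
        if not row:                      # skip blank lines
            continue
        for name, value in zip(samples, row[1:]):
            if not float(value).is_integer():
                all_whole[name] = False
    return [name for name in samples if all_whole[name]]

# Tiny made-up table for illustration, not real data from the history.
example = "gene\tSG-1\tSG-2\ng1\t14.14\t20\ng2\t7.5\t3\n"
suspect = whole_number_columns(example)
```

Any column this flags can then be compared value by value against the matching htseq-count file to confirm that the two are identical.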