Using DESeq2 inside Galaxy

Question: Using DESeq2 inside Galaxy

2.7 years ago by

United States

marcelops • 20 wrote:

Hello,

I have been trying to use DESeq2 on Galaxy, and am having issues with this package.

To illustrate the problem, I have 6 datasets (T1, T2, T3, C1, C2, and C3), where T denotes the treatment samples and C the control samples.

Here are examples of the content of each sample (I am showing the first lines of T1 and C1 only, but the other datasets are all similar):

T1
gene1   331
gene2   74
gene3   50
gene4   1676.27
gene5   496.99
gene6   0
...

C1
gene1   361
gene2   59
gene3   30
gene4   1906
gene5   639
gene6   12
...

In the package DESeq2 1.8.2 on Galaxy, I am using the following arguments:

Factor: Treatment
1: Factor Level: Treated
- Count Files: T1, T2, T3
2: Factor Level: Control
- Count Files: C1, C2, C3

Then I got the following error:

DESeq2 run information

sample table:
               Treatment
dataset_1.dat     Treated
dataset_2.dat     Treated
dataset_3.dat     Treated
dataset_4.dat     Control
dataset_5.dat     Control
dataset_6.dat     Control

design formula:
~Treatment


primary factor: Treatment

-------------------

I couldn't find the documentation on how to use the Galaxy package DESeq2, and I am not sure about the format of the input files.

Has anyone successfully used DESeq2 inside Galaxy? Could you please let me know what your inputs look like, or whether you have any info on how to properly use this package?

Thanks,

Marcelo

rna-seq galaxy deseq2 • 4.5k views

modified 2.7 years ago by Bjoern Gruening ♦ 5.1k • written 2.7 years ago by marcelops • 20

Hello, some tests are running to determine whether htseq-count is producing the correct input. This tool form is new to me as well, so I am testing a few things to see which corner cases could trigger errors. Expect feedback from me early next week.

Thanks for the details here, very helpful. Your use case should work, but that is part of the test. If you can reproduce this at http://usegalaxy.org and want to submit a bug report (with a link to this post), that could be helpful in case there are other minor input issues. If you do not actually have error datasets, but rather "green" failure datasets, a shared history link sent to galaxy-bugs@lists.galaxyproject.org is another way to allow us to review. All datasets for review must be left undeleted. It is best not to submit a very large history, as these are difficult to import: include just the inputs for this test (SAM inputs, the reference GFF used, the htseq-count results, then this tool's datasets).

Thanks, Jen, Galaxy team

ADD REPLY • link written 2.7 years ago by Jennifer Hillman Jackson ♦ 25k

Sorry, can't I use a raw counts file coming from featureCounts?

ADD REPLY • link written 2.2 years ago by sa63_tanha • 0

By the way, why do you think this is an error? Is your dataset red? I cannot see any error message in it; to me it looks like normal stdout messages about the design matrix.

ADD REPLY • link written 2.7 years ago by Bjoern Gruening ♦ 5.1k

2.7 years ago by

Bjoern Gruening ♦ 5.1k

Germany

Bjoern Gruening ♦ 5.1k wrote:

Hi,

We are using DESeq2 very successfully, so it should work. However, your inputs look strange. They should be count data from htseq-count.
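One detail worth checking in the T1 excerpt above: gene4 and gene5 carry fractional values (1676.27 and 496.99), whereas htseq-count writes whole-number counts and DESeq2 expects raw integer counts. Whether this is what upsets the Galaxy tool in this particular case is an assumption, not something confirmed in this thread. A minimal Python sketch for spotting such lines in a two-column count file might look like this (the function name and the inline example data are illustrative, not part of Galaxy or DESeq2):

```python
import io


def find_noninteger_counts(handle):
    """Return (line_number, gene, value) for each count that is not a
    non-negative integer. Lines without exactly two whitespace-separated
    fields are reported with value None."""
    problems = []
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 2:
            problems.append((line_number, line, None))
            continue
        gene, value = fields
        # Accept only plain ASCII digits, e.g. "331" but not "1676.27".
        if not (value.isascii() and value.isdigit()):
            problems.append((line_number, gene, value))
    return problems


# Example using the first lines of T1 from the question.
example = io.StringIO(
    "gene1\t331\ngene2\t74\ngene3\t50\n"
    "gene4\t1676.27\ngene5\t496.99\ngene6\t0\n"
)
for problem in find_noninteger_counts(example):
    print(problem)
```

To check a real file, pass an open file handle instead of the `io.StringIO` example. Any line it reports is a candidate for closer inspection before running the tool.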

We have a small tutorial here: http://galaxyproject.github.io/training-material/topics/transcriptomics/ Maybe this gets you started.

Cheers, Bjoern

ADD COMMENT • link modified 14 months ago • written 2.7 years ago by Bjoern Gruening ♦ 5.1k

Hi, we are having trouble with DESeq2 on a local instance. Using featureCounts output as input, we get the following error:

Fatal error: An undefined error occurred, please check your input carefully and contact your administrator.
Warning messages:
1: multiple methods tables found for 'arbind'
2: multiple methods tables found for 'acbind'
3: replacing previous import 'IRanges::arbind' by 'SummarizedExperiment::arbind' when loading 'DESeq2'
4: replacing previous import 'IRanges::acbind' by 'SummarizedExperiment::acbind' when loading 'DESeq2'
Error in data.frame(sample = basename(filenamesIn), filename = filenamesIn, : duplicate row.names:

I've checked the input file(s), which are tables with geneID and the BAM file name as headers. The geneIDs are unique and the read counts are present.
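The truncated error message shows a data.frame being built with sample = basename(filenamesIn). If those base names end up as row names (an assumption, since the message is cut off), then two input files that share a file name in different directories would collide even when the gene IDs inside each file are unique. A small Python sketch of that check, using hypothetical paths:

```python
import os
from collections import Counter


def duplicated_basenames(paths):
    """Return the sorted file names that occur more than once
    once the directory part of each path is dropped."""
    counts = Counter(os.path.basename(path) for path in paths)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical input paths, for illustration only.
paths = [
    "/data/run1/counts.tabular",
    "/data/run2/counts.tabular",
    "/data/run3/sampleC.tabular",
]
print(duplicated_basenames(paths))
```

An empty result means the file names are distinct, in which case the cause of the duplicate row.names has to be looked for elsewhere.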

Any help is appreciated. Thanks, Carmelo

ADD REPLY • link modified 19 months ago • written 19 months ago by calvare3 • 0

Hi Bjoern, the link you provided above is 404. Could you please share it again?

ADD REPLY • link written 14 months ago by syrez • 0

Hi, sorry for the broken link. I moved this repository over to the main Galaxy organisation: https://galaxyproject.github.io/training-material/

ADD REPLY • link written 14 months ago by Bjoern Gruening ♦ 5.1k
