Question: salmon gene quant to DESeq2
4 months ago, matthew.johnson wrote:

Hi again - I successfully ran salmon on my fastq files, including gene-level summary via a simple two-column map file of transcript-to-gene-ID. This gives me an output like this for each fastq:

Name    Length  EffectiveLength TPM NumReads
ENSMUSG00000114165  2016    1815.57 0.0732938   3.29409

I tried simply passing these outputs on as input to DESeq2 for differential expression, selecting under input "TPM values (e.g. from sailfish or salmon)", then for Gene mapping format selecting "Transcript-ID and Gene-ID mapping file" and specifying the same two-column table used for the salmon runs (haha).

I got this vague error:

Fatal error: An undefined error occurred, please check your input carefully and contact your administrator.
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 2 did not have 6 elements
Calls: read.table -> scan
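
For anyone debugging a similar `read.table` failure, a quick first check is to count the fields on each line of the files being passed in. The snippet below is only a sketch: the helper name `field_counts` is made up for illustration, and the sample text is just the two lines quoted above. Those two lines have five columns each, while the error mentions six elements; this thread does not establish which input file the tool was reading at that point or what layout it expected.

```python
import io
from collections import Counter


def field_counts(handle, sep=None):
    """Count how many non-blank lines have each number of fields.

    sep=None splits on any run of whitespace, which is roughly how
    R's read.table behaves with its default sep="". Pass sep="\t"
    to split strictly on tabs instead.
    """
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        counts[len(line.split(sep))] += 1
    return counts


# The two lines quoted in the post (header plus one gene).
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "ENSMUSG00000114165\t2016\t1815.57\t0.0732938\t3.29409\n"
)
print(field_counts(example))
```

To run it on a real file, open the file and pass the handle, e.g. `field_counts(open(path))`, where `path` is whichever dataset was given to the tool. More than one key in the result means the lines do not all have the same number of fields.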

So, not sure where the problem's arising, but for starters, the salmon output contains both TPM and NumReads (i.e., I presume, a read count estimate). Do I need to extract one or the other of these columns to pass on to DESeq2? And also, is the transcript-to-gene map even necessary for DESeq2 since the gene-level summary has already been done by salmon?

Thanks so much for your help!


Hi Matthew,

Have you resolved this issue? I've run into similar problems with my use of DESeq2 on Salmon files and TPM input coupled with a transcript-ID gene-ID mapping file.

written 27 days ago by Dennis
4 months ago, Jennifer Hillman Jackson (United States) wrote:

Hello - I haven't seen this before, but I have mostly worked with count input (not TPM). The tutorials for DESeq2 also use count inputs: https://galaxyproject.org/tutorials/nt_rnaseq/#analysis-of-the-differential-gene-expression and http://galaxyproject.github.io/training-material/topics/transcriptomics/

Are you working at Galaxy Main (http://usegalaxy.org)? We would like to look at the exact usage and parameters to test if this option has a bug - or to help clarify usage - and sharing a history link or sending in a bug report is the most direct way to do that. How to: https://galaxyproject.org/issues/#usage-problem-reporting Please be sure to leave the datasets undeleted and include a link to this post so we can associate the two.

Others are still welcome to reply if they understand your use case from the given information.

Jen, Galaxy team


I didn't see a bug report sent in yet, but you can still do that if usage problems are still present.

written 4 months ago by Jennifer Hillman Jackson
3 months ago, devbt15 wrote:

Dear all, I used the same pipeline and passed the Salmon output to DESeq2 (using the TPM input option) along with a .gtf annotation file. It eventually gave me this error:

Fatal error: An undefined error occurred, please check your input carefully and contact your administrator.
Warning messages:
1: multiple methods tables found for 'arbind'
2: multiple methods tables found for 'acbind'
3: replacing previous import 'IRanges::arbind' by 'SummarizedExperiment::arbind' when loading 'DESeq2'
4: replacing previous import 'IRanges::acbind' by 'SummarizedExperiment::acbind' when loading 'DESeq2'
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... Error in .normarg_makeTxDb_genes(genes, transcripts_tx_id) :
  'genes$gene_id' must be a character vector (or factor) with no NAs
Calls: makeTxDbFromGFF ... makeTxDbFromGRanges -> makeTxDb -> .normarg_makeTxDb_genes
Warning messages:
1: replacing previous import 'IRanges::arbind' by 'SummarizedExperiment::arbind' when loading 'GenomicAlignments'
2: replacing previous import 'IRanges::acbind' by 'SummarizedExperiment::acbind' when loading 'GenomicAlignments'
3: In makeTxDbFromGRanges(gr, metadata = metadata) :
  The following transcripts were dropped because their exon ranks could not be inferred (either because the exons are not on the same chromosome/strand or because they are not separated by introns): Lj0g3v0075389.1, Lj0g3v0124429.1, Lj1g3v4579120.1, Lj3g3v1775190.1

Could you please clarify where the error is occurring? Is it due to an incompatibility between the Salmon output and the .gtf file? Thank you in advance. Regards, Das.


This was solved by email, but I am posting the solution here for others who may run into problems with this tool.

Most of the above error message consists of warnings (warnings do not cause a tool failure). The initial portion of the error message is where the tool failed ("Fatal error: An undefined error occurred..."). This generally indicates a format or content problem with one or more inputs.

This problem, and other related problems, came from the input GTF file not containing lines with type=transcript (the feature type in the 3rd column of the file, which the tool requires). Switching to a transcript-to-gene text input instead allowed the tool to process the data correctly.
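
As a rough illustration of both points, the sketch below counts the feature types in the 3rd column of a GTF file (to see whether any "transcript" lines are present) and collects transcript/gene pairs from the 9th column so they can be written out as a two-column text file. The function names `feature_types` and `tx2gene`, the sample GTF lines, and the IDs in them are all invented for this example. It also assumes the attributes are written in the usual `key "value";` style and that the mapping file should list the transcript ID first and the gene ID second; check both against your own files and the tool's help text.

```python
import io
import re
from collections import Counter

# Matches attribute pairs such as: gene_id "gene1";
_ATTR = re.compile(r'(\S+) "([^"]*)"')


def _gtf_rows(handle):
    """Yield the tab-separated fields of each 9-column GTF data line."""
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 9:
            yield fields


def feature_types(handle):
    """Count the values found in column 3 (feature type)."""
    return Counter(fields[2] for fields in _gtf_rows(handle))


def tx2gene(handle):
    """Return unique (transcript_id, gene_id) pairs in file order."""
    pairs, seen = [], set()
    for fields in _gtf_rows(handle):
        attrs = dict(_ATTR.findall(fields[8]))
        pair = (attrs.get("transcript_id"), attrs.get("gene_id"))
        if all(pair) and pair not in seen:
            seen.add(pair)
            pairs.append(pair)
    return pairs


# Made-up GTF content with exon lines only (no "transcript" lines).
GTF_TEXT = (
    'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "gene1"; transcript_id "tx1";\n'
    'chr1\tsrc\texon\t200\t300\t.\t+\t.\tgene_id "gene1"; transcript_id "tx1";\n'
    'chr1\tsrc\texon\t1\t50\t.\t-\t.\tgene_id "gene2"; transcript_id "tx2";\n'
)

types = feature_types(io.StringIO(GTF_TEXT))
print("has transcript lines:", "transcript" in types, dict(types))

# Two-column, tab-separated transcript-to-gene table.
for tx, gene in tx2gene(io.StringIO(GTF_TEXT)):
    print(f"{tx}\t{gene}")
```

If `feature_types` shows no "transcript" entry for your annotation, that matches the situation described above, and the printed two-column table is one way to prepare the alternative mapping input.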

written 3 months ago by Jennifer Hillman Jackson