Question: Fatal error with DESeq2 when processing TPM-based Salmon files and using a tabular transcript-gene mapping file
Dennis wrote (8 months ago):

Hi there,

I used Salmon to map RNA-seq reads to a transcriptome. I then analyzed the Salmon output with DESeq2, using these settings:
- choice of input data: TPM values (e.g., from salmon)
- transcript-ID and gene-ID mapping file (tabular file with transcript-gene mapping)

I used a tabular text file that contains two columns - one with SeqName and one with Description. Sample below:

SeqName Description
TNI017526-RC PREDICTED: uncharacterized protein LOC106135801
TNI017526-RD PREDICTED: uncharacterized protein LOC106135801
TNI017526-RE PREDICTED: uncharacterized protein LOC106135801
TNI017526-RB PREDICTED: uncharacterized protein LOC106135801
TNI017526-RA PREDICTED: uncharacterized protein LOC106135801

However, I keep getting a fatal error message:

Fatal error: An undefined error occurred, please check your input carefully and contact your administrator.
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1 did not have 5 elements
Calls: read.table -> scan

What does that mean? Is there a specific input format requirement for the transcript-gene mapping file?

I found two similar issues reported here: https://biostar.usegalaxy.org/p/23985/. However, I'm already using a tabular text file that has transcript and gene names only.

Thank you for your help!

Best, Dennis

Jennifer Hillman Jackson (United States) wrote (8 months ago):

Hello,

I would try this first: Remove the header line from the tabular file. That is the reason for the current failure.

If that doesn't work, or some other error comes up afterward, recreate the file so that neither ID contains any spaces. Use underscores instead or, even better, get rid of the extra content (including the colon :).

A format like this would be ideal (with a tab between the two values):

TNI017526-RC LOC106135801
TNI017526-RD LOC106135801
TNI017526-RE LOC106135801
TNI017526-RB LOC106135801
TNI017526-RA LOC106135801
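
As a rough sketch of that cleanup: assuming the gene ID is always the last whitespace-separated token of the description (true for the sample rows in the question, but not guaranteed for the whole file), a short Python script could drop the header and write two tab-separated columns. The function name `clean_mapping` and the inline sample are made up for illustration; a real run would read the mapping file instead.

```python
import io


def clean_mapping(lines, header="SeqName"):
    """Yield 'transcript<TAB>gene' rows from a raw mapping file.

    Assumption: the first token is the transcript ID and the last token
    is the gene ID; everything in between is free-text description.
    """
    for line in lines:
        tokens = line.split()
        # Skip blank lines and the header row.
        if len(tokens) < 2 or tokens[0] == header:
            continue
        yield f"{tokens[0]}\t{tokens[-1]}"


# Rows copied from the question, used here in place of a real file.
raw = io.StringIO(
    "SeqName Description\n"
    "TNI017526-RC PREDICTED: uncharacterized protein LOC106135801\n"
    "TNI017526-RD PREDICTED: uncharacterized protein LOC106135801\n"
)
cleaned = list(clean_mapping(raw))
print("\n".join(cleaned))
```

If some descriptions do not end in a LOC identifier, the last-token rule would pick up the wrong value, so it is worth spot-checking the output before uploading it.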

If that still doesn't work, where are you working? If you are at Galaxy Main (https://usegalaxy.org), or can reproduce the problem there, a bug report can be sent in. Leave all inputs/outputs undeleted (including the test with this format of input) and include a link to this Biostars post. There can be other problems with inputs, and you can double-check yours against the FAQs here first if you want (it is quicker): https://galaxyproject.org/support/#troubleshooting

If not working at Galaxy Main, let us know where and we can follow up from there.

Thanks! Jen, Galaxy team

Dennis wrote (8 months ago):

Hi Jen,

That was my first suspicion, so I removed all the spaces and merged all the columns into one, producing one long name for the gene ID with no spaces or other non-alphanumeric characters.

Now the tool runs, but comes back empty :-/

All three files: normalized counts, plots and files on data are empty...

I'm working on Galaxy Main. I can submit a bug report or something like that if it'll help resolve the issue.
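
When the outputs come back empty, two things may be worth ruling out before filing a report: mapping rows that no longer have exactly two tab-separated fields, and transcript IDs that do not match the `Name` column of the Salmon `quant.sf` files. The sketch below is only a diagnostic under those assumptions, not a confirmed cause of this failure. The helper names are hypothetical, and the inline `quant.sf` numbers are placeholders, not real results.

```python
import io


def split_rows(handle):
    """Split each non-empty line of a tab-separated file into fields."""
    return [line.rstrip("\n").split("\t") for line in handle if line.strip()]


def rows_with_wrong_width(rows, expected=2):
    """Return 1-based row numbers that do not have `expected` fields."""
    return [i for i, row in enumerate(rows, start=1) if len(row) != expected]


# Placeholder inputs; a real check would open the actual files instead.
quant_sf = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "TNI017526-RC\t1000\t800.0\t1.5\t20.0\n"
    "TNI017526-RD\t1200\t1000.0\t0.0\t0.0\n"
)
mapping = io.StringIO(
    "TNI017526-RC\tLOC106135801\n"
    "TNI017526-RDLOC106135801\n"  # tab lost: both columns merged into one
)

quant_rows = split_rows(quant_sf)[1:]  # drop the quant.sf header line
map_rows = split_rows(mapping)

bad = rows_with_wrong_width(map_rows)
quant_ids = {row[0] for row in quant_rows}
mapped_ids = {row[0] for row in map_rows if len(row) == 2}
unmapped = sorted(quant_ids - mapped_ids)

print("mapping rows without exactly two fields:", bad)
print("quant.sf transcripts missing from the mapping:", unmapped)
```

Empty lists from both checks would suggest the mapping file itself is well-formed and the problem lies elsewhere.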
