Question: Unicycler genome assembly failure
7 weeks ago, malini.kotak wrote:

Hi,

Over the weekend I tried to use Unicycler to create a hybrid assembly of a microbial genome from PacBio and Illumina reads. However, I got the following error: "remote job server indicated a problem running or monitoring this job". From previous posts it looks like a Galaxy problem. Please help me fix this issue.


Hello,

The job may be running out of memory during execution (due to the number of long reads), or there may be a format problem. I am rerunning your original job as a test, and I am also testing whether wrapping the long-read fasta helps the tool interpret the data correctly (using the tool NormalizeFasta and wrapping at 80 bases).
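
For readers who want to see what "wrapping at 80 bases" means outside of Galaxy, here is a minimal Python sketch. It is not the NormalizeFasta tool itself; the function name `wrap_fasta` and its `width` parameter are made up for illustration, and it assumes a plain-text fasta where header lines start with ">".

```python
def wrap_fasta(lines, width=80):
    """Return fasta lines with every sequence re-wrapped to `width` bases.

    `lines` is any iterable of strings (for example an open file).
    Hypothetical helper for illustration, not part of any Galaxy tool.
    """
    out = []
    seq_parts = []

    def flush():
        # Join the pieces of the current sequence and re-split at `width`.
        seq = "".join(seq_parts)
        seq_parts.clear()
        for i in range(0, len(seq), width):
            out.append(seq[i:i + width])

    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            flush()           # finish the previous record, if any
            out.append(line)  # header lines are kept unchanged
        else:
            seq_parts.append(line)
    flush()  # finish the last record
    return out


if __name__ == "__main__":
    example = [">read1", "ACGT" * 50, ">read2", "AC", "GT"]
    for wrapped_line in wrap_fasta(example, width=80):
        print(wrapped_line)
```

Wrapping only changes the line layout, not the sequences, so it can fix a parsing problem but does nothing for memory use.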

FastQC on your short reads did not reveal any significant issues, so those datasets do not seem to be a problem.

You should also try two reruns now: one using the original job's inputs and one with wrapped fasta input for the long-read sequences. If both fail, then the data is too large to run this tool at the public Galaxy server at https://usegalaxy.org. The options would then be to either downsample the data (submit fewer long reads) or move to your own Galaxy, where more memory can be allocated.
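
As a rough illustration of the downsampling option, the sketch below picks a fixed number of reads at random from a fastq stream using reservoir sampling, so the whole file never has to sit in memory. The names `read_fastq` and `downsample` are hypothetical, and the code assumes standard four-line fastq records (no wrapped sequence lines); it is one way to subsample, not a recommendation from the Galaxy team.

```python
import random


def read_fastq(lines):
    """Group an iterable of lines into 4-line fastq records (tuples).

    Assumes each record is exactly four lines: header, sequence, '+', quality.
    """
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield tuple(record)
            record = []


def downsample(records, n, seed=0):
    """Return up to `n` records chosen uniformly at random (reservoir sampling)."""
    rng = random.Random(seed)  # fixed seed so a rerun picks the same reads
    reservoir = []
    for i, rec in enumerate(records):
        if i < n:
            reservoir.append(rec)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = rec  # replace an earlier pick
    return reservoir


if __name__ == "__main__":
    fake_lines = []
    for k in range(1000):
        fake_lines += [f"@read{k}", "ACGT", "+", "IIII"]
    subset = downsample(read_fastq(fake_lines), n=100)
    print(len(subset), "reads kept")
```

How many reads to keep depends on the genome size and the coverage you need, which this sketch does not try to estimate.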

Other test Unicycler jobs went through OK this morning, but those used fastq-formatted long reads and much smaller inputs. As a third test, I am converting those fastq reads to fasta to see whether the job will go through (both wrapped and unwrapped). More feedback once the tests complete. You don't need to wait for the feedback to start your own reruns. My tests are primarily to check whether fasta wrapping is required and, if so, to add that formatting requirement to the tool form's help.
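
For anyone curious what the fastq-to-fasta conversion involves, here is a minimal sketch. The function name `fastq_to_fasta` is hypothetical and this is not the converter used on the Galaxy server; it assumes four-line fastq records and simply drops the quality lines.

```python
def fastq_to_fasta(lines):
    """Convert four-line fastq records to unwrapped fasta lines.

    Keeps the header (with '@' swapped for '>') and the sequence,
    and discards the '+' separator and the quality string.
    """
    out = []
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        position = i % 4  # 0: header, 1: sequence, 2: '+', 3: quality
        if position == 0:
            out.append(">" + line[1:])
        elif position == 1:
            out.append(line)
    return out


if __name__ == "__main__":
    example = ["@read1 sample", "ACGTACGT", "+", "IIIIIIII"]
    print("\n".join(fastq_to_fasta(example)))
```

The conversion throws away base qualities, so it only makes sense for testing how the tool handles the format.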

Thanks! Jen, Galaxy team

Comment written 7 weeks ago by Jennifer Hillman Jackson
7 weeks ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

It looks like you have discovered this too: normalizing the fasta dataset is not enough to get the job processed. This means that formatting was not the issue; the data is simply too large, at over 1 M reads.

I see a rerun with downsampled data (now in fastq format) that is still executing. That is the way to move forward.

Thanks! Jen, Galaxy team
