Question: Trinity "OutOfMemoryError: GC overhead limit exceeded"
nancydong200 wrote, 13 months ago:


I am trying to use Galaxy Trinity to assemble a set of stranded paired-end reads (two 1.4GB FASTQ files). I keep getting this error:

succeeded(38798) 62.3471% completed.
succeeded(38799) 62.3487% completed.
succeeded(38800) 62.3503% completed.
succeeded(38801) 62.352% completed.
succeeded(38802) 62.3536% completed.

We are sorry, commands in file: [failed_butterfly_commands.15715.txt] failed. :-(

Picked up _JAVA_OPTIONS:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.HashMap.newNode(
    at java.util.HashMap.putVal(
    at java.util.HashMap.put(
    at java.util.HashSet.add(
    at PairPath._cache_path_nodes(
    at PairPath.<init>(
    at TransAssembly_allProbPaths.update_PairPaths_using_overlapDAG_refined_paths(
    at TransAssembly_allProbPaths.create_DAG_from_OverlapLayout(
    at TransAssembly_allProbPaths.main(
[the "Picked up _JAVA_OPTIONS:" line repeats several more times]
Trinity run failed. Must investigate error above.

I understand that this is because the job ran out of RAM? Is there any way to circumvent this?
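For context, "GC overhead limit exceeded" means the JVM spent nearly all of its time in garbage collection while reclaiming almost no memory, which is effectively an out-of-memory condition. On Galaxy Main the job resources are fixed, but when running Trinity on your own machine you can raise the memory limits directly. The sketch below assumes a recent Trinity release; the file names and sizes are placeholders, and the exact flags should be checked against `Trinity --show_full_usage_info` for your installed version:

```shell
# Raise the memory available to Trinity; Butterfly (the stage that failed
# here) runs as Java processes whose heap is capped by --bflyHeapSpaceMax.
# Values below are examples only -- size them to your host.
Trinity --seqType fq \
        --left reads_1.fastq --right reads_2.fastq \
        --SS_lib_type RF \
        --max_memory 50G \
        --bflyHeapSpaceMax 10G \
        --CPU 8 \
        --output trinity_out
```

`--SS_lib_type RF` is shown because the question mentions a stranded paired-end library; swap in `FR` (or drop the flag) to match how the library was actually prepared.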

Thank you very much!

trinity
Jennifer Hillman Jackson (United States) wrote, 13 months ago:


This looks like an input sequence quality problem (that said, the resources, settings, and tool version available at PSC Bridges, which executes Trinity jobs for Galaxy, may simply not be enough to handle your particular data, as discussed here).

Did you run FastQC on the data yet? That can give some clues about what QA/QC could be done before running the data through an assembly tool. Sometimes that helps, if input quality is a factor (hard to tell; the errors a tool reports at failure are not always specific to the actual root cause of the problem).
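As a concrete starting point, a minimal FastQC run from the command line (outside Galaxy, where it is also available as a tool) looks like the following; the file names are placeholders for the poster's actual reads:

```shell
# Generate per-file quality reports (HTML + zip) in fastqc_out/.
# Look especially at per-base quality, adapter content, and
# overrepresented sequences before deciding on trimming.
mkdir -p fastqc_out
fastqc reads_1.fastq reads_2.fastq --outdir fastqc_out
```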

Note that QC will not solve all problems. An inherently low-quality sequence dataset cannot be fixed after the fact. Some tools are more sensitive to quality than others (assembly tools certainly are). Tools can get "stuck" at certain steps and then fail (by exceeding memory, or with other errors). The lab that created the data could be contacted; if it is public data, that won't be possible, and you might want to use different, higher-quality data instead. You might also consider setting up your own Galaxy server with more resources and modified tool run parameters, to see whether these solve your problem as they did for some of those involved in the linked ticket above.
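If quality trimming does turn out to help, recent Trinity releases can run trimming and in silico read normalization as part of the same invocation; normalization in particular often reduces Butterfly memory pressure on deep datasets. This is a sketch under the assumption of a current Trinity version, not the poster's exact Galaxy setup, so verify the flags against your installed release:

```shell
# --trimmomatic runs adapter/quality trimming before assembly;
# --normalize_max_read_cov caps per-kmer read coverage, shrinking
# the dataset Butterfly has to traverse.
Trinity --seqType fq \
        --left reads_1.fastq --right reads_2.fastq \
        --SS_lib_type RF \
        --trimmomatic \
        --normalize_max_read_cov 50 \
        --max_memory 50G \
        --output trinity_out
```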


Thanks! Jen, Galaxy team


I see, thank you very much!

(reply by nancydong200, 13 months ago)
Powered by Biostar version 16.09