Question: Problem running CollectRnaSeqMetrics
Asked 2.8 years ago by brian.hermann (United States):

Hello,

I am trying to run CollectRnaSeqMetrics in usegalaxy (toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_CollectRnaSeqMetrics/1.126.0) and am encountering the error shown below. The parameters I used are listed after it. Can you help me troubleshoot?

Thanks!

Brian Hermann

Fatal error: Exit code 1 ()
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/galaxy-repl/main/scratch
Exception in thread "main" picard.PicardException: Sequence dictionaries differ in /galaxy-repl/main/files/014/232/dataset_14232457.dat and /galaxy-repl/main/files/014/232/dataset_14232298.dat
	at picard.analysis.directed.RnaSeqMetricsCollector.makeOverlapDetector(RnaSeqMetricsCollector.java:76)
	at picard.analysis.CollectRnaSeqMetrics.setup(CollectRnaSeqMetrics.java:109)
	at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:100)
	at picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:53)
	at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:187)
	at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:89)
	at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:99)
Caused by: htsjdk.samtools.util.SequenceUtil$SequenceListsDifferException: Sequences at index 20 don't match: 20/227966/chr4_GL456350_random 20/14945/chr4_JH584292_random/UR=file:/space/jen/new_genome/mm10/picard_index/mm10.fa/M5=c2ff41899e0f684fd93b28c58756e02f
	at htsjdk.samtools.util.SequenceUtil.assertSequenceListsEqual(SequenceUtil.java:121)
	at htsjdk.samtools.util.SequenceUtil.assertSequenceDictionariesEqual(SequenceUtil.java:169)
	at picard.analysis.directed.RnaSeqMetricsCollector.makeOverlapDetector(RnaSeqMetricsCollector.java:74)
	... 6 more

 

Input Parameter | Value
Select SAM/BAM dataset or dataset collection | 16: TopHat on data 2 and data 1: accepted_hits
Load reference genome from | cached
Using reference genome | mm10
Gene annotations in refFlat form | 197: mm10 refFlat
Location of rRNA sequences in genome, in interval_list format | 199: mm10 rRNA interval list (UCSC)
What is the RNA-seq library strand specificity | None
When calculating coverage based values use only use transcripts of this length or greater | 500
This percentage of the length of a fragment must overlap one of the ribosomal intervals for a read or read pair to be considered rRNA. | 0.8
The level(s) at which to accumulate metrics | All reads
Assume the input file is already sorted | True
Select validation stringency | Lenient

Answered 2.8 years ago by Jennifer Hillman Jackson (United States):

Hello,

These tools had previous issues that have not been confirmed to be completely resolved. Alternatively, the input may just need to be coordinate-sorted (use SortSam).

https://github.com/jennaj/support-known-issues/wiki
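For anyone who wants to check the sort order outside Galaxy before re-sorting, here is a minimal sketch. It assumes you have already dumped the BAM header to text (for example with `samtools view -H`, which is not part of the original post) and simply reads the `SO:` tag from the `@HD` line. The function name `sam_sort_order` and the example header are made up for illustration.

```python
def sam_sort_order(header_text):
    """Return the SO value from the @HD line of a SAM header, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs after the record type.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


# Illustrative header only; the sequence name and length are taken from the error message.
example_header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:chr4_GL456350_random\tLN:227966\n"
)

if __name__ == "__main__":
    print(sam_sort_order(example_header))
```

A result other than `coordinate` (for example `unsorted` or `queryname`, or no `SO:` tag at all) would suggest sorting the input first.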

The two likely causes of the immediate problem are a sort-order conflict or some other data mismatch. Once that is corrected, if an error still results, it would be the original (dependency-related) error from the bug report. See the link above for those details, and follow it for confirmation of a successful fix at usegalaxy.org.
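To narrow down a data mismatch like the "Sequence dictionaries differ" error above, it can help to compare the `@SQ` lines of the two files side by side. The sketch below is only an approximation of that check: it compares sequence name and length in order, which is an assumption about what matters here and not a copy of what htsjdk does internally. It assumes both headers are available as text (a BAM header dump and the header portion of the interval_list file); the function names and the `seqA` entry are hypothetical.

```python
def parse_sq_lines(header_text):
    """Return [(name, length), ...] from the @SQ lines of a SAM-style header, in file order."""
    sequences = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Split each TAG:VALUE field on the first colon only, so values such as
        # UR:file:/path keep their embedded colons.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        sequences.append((fields["SN"], int(fields["LN"])))
    return sequences


def first_mismatch(dict_a, dict_b):
    """Return the index of the first differing entry, or None if the lists are identical."""
    for i, (a, b) in enumerate(zip(dict_a, dict_b)):
        if a != b:
            return i
    if len(dict_a) != len(dict_b):
        # One list is a strict prefix of the other.
        return min(len(dict_a), len(dict_b))
    return None


# Illustrative headers: "seqA" is hypothetical; the other names and lengths come from the error.
bam_header = "@SQ\tSN:seqA\tLN:100\n@SQ\tSN:chr4_GL456350_random\tLN:227966\n"
interval_header = "@SQ\tSN:seqA\tLN:100\n@SQ\tSN:chr4_JH584292_random\tLN:14945\n"

if __name__ == "__main__":
    print(first_mismatch(parse_sq_lines(bam_header), parse_sq_lines(interval_header)))  # prints 1
```

If the two lists contain the same sequences in a different order, the files were probably built against differently ordered copies of the genome, and the interval list or annotation needs to be regenerated against the same reference as the BAM.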

Would you be able to share a bug report for testing? Replicate at http://usegalaxy.org (if that is not where you are working right now) and click on the green bug icon to send in the report. Please include a link to this Biostars post to help us associate the two.

If something else is going on, we will write back and add those notes to the tickets tracking the issue with this tool (and another related tool).

Thanks, Jen, Galaxy team


Thanks Jen.

Bug report submitted. I have had this issue with multiple datasets that were previously analyzed successfully with this tool (using mm9, mm10, and hg19 genome annotations).

Best,

Brian

 

Reply written 2.8 years ago by brian.hermann

Hi Brian,

There is some sort of dependency issue with the tool. As I mentioned in the direct email, the path to the reference genome is incorrect (I missed that when reviewing the first time). An issue has been created to track the problem; you, and anyone else who runs into it, can follow it for the resolution. It is linked from here: https://github.com/jennaj/support-known-issues/wiki

Thanks again for sending in the bug report details, Jen, Galaxy team

Reply written 2.8 years ago by Jennifer Hillman Jackson