Question: SamToFastq: SAM validation error
lim.michelle.25 wrote, 3 months ago:

Hi all,

I am new here, and also very new to Galaxy. I'm trying to convert BAM files to FASTQ using Picard's SamToFastq on Galaxy. I've never had a problem doing this with my own whole exome sequencing BAM files. But today I am using BAM files downloaded from this link: http://sns.ias.edu/~cschan/TLymphoma/

The error I am getting is:

Fatal error: Exit code 1 ()
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/galaxy-repl/main/jobdir/020/427/20427382/_job_tmp -Xmx7g -Xms256m
Ignoring SAM validation error: ERROR: Record 1, Read name FCD22W0ACXX:3:2308:15839:84890#0, RG ID on SAMRecord not found in header: p53-ko-tumor-male2
Ignoring SAM validation error: ERROR: Record 2, Read name FCD22W0ACXX:3:2308:15839:84890#0, RG ID on SAMRecord not found in header: p53-ko-tumor-male2
Ignoring SAM validation error: ERROR: Record 3, Read name FCD22W0ACXX:3:2115:11587:62625#0, RG ID on SAMRecord not found in header: p53-ko-tumor-male2
Exception in thread "main" java.lang.NullPointerException
    at picard.sam.SamToFastq.doWork(SamToFastq.java:195)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:208)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

What can I do to resolve this? I'm also not too familiar with bioinformatics jargon so simplified explanations/directions would be appreciated!

Thanks.

Tags: galaxy, samtools
Jennifer Hillman Jackson (United States) wrote, 3 months ago:

Hello,

It sounds like the BAM files are malformed. Specifically, there is a problem with the header. Reads are assigned to read groups that are missing from @RG lines, leading to this error.
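
To confirm that diagnosis outside of Galaxy, a minimal sketch like the one below could compare the read groups declared in the header with the ones the reads actually use. It assumes the BAM has already been converted to SAM text with the header included (for example with `samtools view -h`), and the function name `missing_read_groups` is made up for this illustration.

```python
def missing_read_groups(sam_lines):
    """Return RG IDs used by reads but not declared in any @RG header line."""
    declared = set()  # IDs found in @RG header lines
    used = set()      # IDs found in RG:Z: tags on alignment records
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: only @RG lines declare read groups.
            if fields[0] == "@RG":
                for field in fields[1:]:
                    if field.startswith("ID:"):
                        declared.add(field[3:])
        else:
            # Alignment record: optional tags start at the 12th column.
            for field in fields[11:]:
                if field.startswith("RG:Z:"):
                    used.add(field[5:])
    return sorted(used - declared)


# Hypothetical input mimicking the reported problem: the read carries a
# read group that the header never declares.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "@RG\tID:other-sample\tSM:other-sample",
    "read1\t0\tchr1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:p53-ko-tumor-male2",
]
print(missing_read_groups(example))
```

An empty list would mean every read group used by the reads is declared in the header, so the cause of the failure would have to be something else.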

I'm not sure if this will work, but you could try using the tool NGS: Picard > AddOrReplaceReadGroups to correct the problem. Prior assigned read groups likely wouldn't matter if you are only trying to extract the fastq sequence. This tool might also fail but it is worth a test.
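
A different route, if you can work on the file outside of Galaxy, is to leave the reads alone and declare the missing read group in the header instead. This is not what AddOrReplaceReadGroups does (that tool rewrites the read group on the reads); it is only a rough sketch under some assumptions: the input is SAM text with its header (for example from `samtools view -h`), the helper name `add_missing_rg_headers` is invented here, and the sample name (`SM`) simply reuses the read group ID as a placeholder because the real sample metadata is unknown.

```python
def add_missing_rg_headers(sam_lines):
    """Return SAM lines with an @RG header line added for each undeclared RG ID."""
    header, records = [], []
    declared = set()  # IDs already present in @RG header lines
    used = []         # IDs seen on reads, in first-seen order
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            header.append(line)
            if fields[0] == "@RG":
                declared.update(f[3:] for f in fields[1:] if f.startswith("ID:"))
        else:
            records.append(line)
            # Optional tags start at the 12th column of an alignment record.
            for field in fields[11:]:
                if field.startswith("RG:Z:") and field[5:] not in used:
                    used.append(field[5:])
    # Placeholder assumption: SM is set to the same value as ID.
    patched = [f"@RG\tID:{rg}\tSM:{rg}" for rg in used if rg not in declared]
    return header + patched + records


# Hypothetical input: a header with no @RG line and one read that uses one.
broken = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:p53-ko-tumor-male2",
]
fixed = add_missing_rg_headers(broken)
print("\n".join(fixed))
```

The patched text would still need to be converted back to BAM (for example with `samtools view -b`) before it could be given to SamToFastq, and it only addresses this one header problem.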

If that doesn't work out, I would suggest contacting the data author/source and reporting the problem. A mismatch between the BAM header and the aligned read content is one type of content problem, and there could be others present in the data. Many tools will fail when given such data as input, and others might produce unexpected results, including some that are not obvious and do not raise errors.

Thanks! Jen, Galaxy team

lim.michelle.25 wrote, 3 months ago:

Hi Jen,

Thanks for your reply. The AddOrReplaceReadGroups didn't work. I will contact the authors.

Best,

Michelle
