Question: issue with realigner target creator
3.9 years ago by
BC35770
United States
BC35770 wrote:

I read several other posts about Realigner Target Creator, but the approaches suggested there did not resolve my issue, hence this post.

I am not sure why I got the error message below, since I assumed the uploaded files were BAM files, not SAM.

Details
Issue: Realigner Target Creator failed to execute (database build attribute: Human Feb 2009 (GRCh37/hg19))
Source for the reference list: “History”
Selected reference file: ucsc.hg19.fasta (I imported the whole GATK bundle from Shared Data; ucsc.hg19.fasta is the only file from it that appears in the dropdown)

Input file: a BAM file produced by SAM-to-BAM conversion (database build attribute: Human Feb 2009 (GRCh37/hg19))
Run with basic GATK options and basic analysis options.

Error report
##### ERROR MESSAGE: SAM/BAM file /galaxy-repl/main/scratch/tmp-gatk-brCFJ1/gatk_input.bam is malformed: SAM file doesn't have any read groups defined in the header.  The GATK no longer supports SAM files without read groups

Any suggestions? Many thanks.

Modified 3.9 years ago by Daniel Blankenberg • written 3.9 years ago by BC35770

3.9 years ago by
Daniel Blankenberg ♦♦ 1.7k
United States
Daniel Blankenberg ♦♦ 1.7k wrote:

You'll want to add read group information to the BAM. The error means the BAM header contains no @RG (read group) lines, which GATK requires. You can add them using the Picard AddOrReplaceReadGroups tool: https://usegalaxy.org/tool_runner?tool_id=toolshed.g2.bx.psu.edu%2Frepos%2Fdevteam%2Fpicard%2Fpicard_ARRG%2F1.56.0
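
For anyone running this outside Galaxy, here is a rough sketch of the same idea, not the Galaxy tool's own code. It assumes you already have the header text (for example from `samtools view -H`), checks it for `@RG` lines, and builds a Picard AddOrReplaceReadGroups command if there are none. The function names, file names, `picard.jar` location, and all read-group values are placeholders I made up for illustration, and the exact Picard invocation may differ between versions.

```python
import shlex


def has_read_groups(sam_header_text):
    """Return True if the SAM/BAM header text contains at least one @RG line."""
    return any(line.startswith("@RG\t") for line in sam_header_text.splitlines())


def picard_arrg_command(in_bam, out_bam, rgid="1", rglb="lib1",
                        rgpl="illumina", rgpu="unit1", rgsm="sample1"):
    """Build an AddOrReplaceReadGroups command line as a list of arguments.

    All read-group values are placeholders; replace them with values that
    describe your own library, platform, platform unit, and sample.
    """
    return [
        "java", "-jar", "picard.jar", "AddOrReplaceReadGroups",
        f"I={in_bam}", f"O={out_bam}",
        f"RGID={rgid}", f"RGLB={rglb}", f"RGPL={rgpl}",
        f"RGPU={rgpu}", f"RGSM={rgsm}",
    ]


# A made-up header resembling the failing case: @HD and @SQ lines, but no @RG.
header_without_rg = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:249250621\n"
# The same header after a read group has been added.
header_with_rg = (
    header_without_rg
    + "@RG\tID:1\tLB:lib1\tPL:illumina\tPU:unit1\tSM:sample1\n"
)

if not has_read_groups(header_without_rg):
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(picard_arrg_command("gatk_input.bam", "gatk_input.rg.bam")))
```

Once the read groups are in place, the header check should pass and the new BAM can be used as the input to Realigner Target Creator.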

Written 3.9 years ago by Daniel Blankenberg

Thank you very much for clearing up the confusion. A few possibly naive follow-up questions:

(1) Does the need to add or replace read groups mean that the default BAM output has no read groups?

(2) According to the Broad's documentation for Realigner Target Creator, read groups are "really needed". If so, could Galaxy include read group information in its BAM/SAM output by default?

(3) Is it possible to include programs like "ValidateSamFile" to verify whether a BAM/SAM file is "ready" for certain downstream analyses? (I could not find such a program, although I did not search Galaxy extensively.)

(4) What is the downside of having read groups in BAM/SAM output by default?

I apologize if some of these questions have been answered before (for me, there is a very steep learning curve in finding my way around bioinformatics tools).

Thank you again for the clarification.

Reply written 3.9 years ago by BC35770