Question: Realigner Target Creator will not run?
24 months ago, frankie.north wrote:

I have been running the pipeline below to call SNPs from RNA-Seq data, but the Realigner Target Creator tool in Galaxy fails. Can anyone see any obvious problems in the pipeline?

Import ucsc.hg19.fasta, ucsc.hg19.dict, ucsc.hg19.fasta.fai, ucsc hg19 snps, 1000G indels and RNA-Seq data.

Convert RNA-Seq data into BED

Convert RNA-Seq data into FASTQ

FastQC on RNA-Seq data

FASTQ Groomer on RNA-Seq data

FASTQ Splitter into forward and reverse reads (RNA-Seq data originally paired end)

Map with BWA for Illumina on forward and reverse reads

IdxStats on BWA output

Sort by chromosomal coordinate

RmDup on RNA-Seq data

Filter on RNA-Seq data for mapped reads and reads in proper pairs

ValidateSamFile to check for errors (no read groups assigned)

AddOrReplaceReadGroups on RNA-Seq data

ReorderSam to remove lexicographical sort

Filter for chromosome 1 to narrow down data size

ValidateSamFile to check for further errors (reported: nucleotide differences (NM tag) in file do not match reality, and mate not found for paired reads)

I have tried to run Realigner Target Creator as a prerequisite for Indel Realigner, but it fails with the error "Lexicographically sorted human genome sequence detected in reads". I would have thought the ReorderSam step had already solved this. Could my reference genome be the problem? When running Realigner Target Creator I can only use the imported hg19 fasta file, because the tool does not offer any locally cached references.
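One way to test whether the reference is the culprit is to look at the contig order in the ucsc.hg19.fasta.fai index. The sketch below is a minimal check, not GATK's own code: the function names are hypothetical, and it assumes that a lexicographic sort shows up as chr10 being listed before chr2 (or 10 before 2 for unprefixed names).

```python
def contig_names_from_fai(fai_text):
    """Return contig names, in file order, from the text of a .fai index.

    A .fai file is tab-separated; the first column is the contig name.
    """
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def looks_lexicographic(names):
    """Return True if chr10 (or 10) is listed before chr2 (or 2).

    Assumption: this ordering is the symptom of a lexicographic sort.
    Returns False when the order is fine or when it cannot be judged
    because one of the two contigs is absent.
    """
    index = {name: i for i, name in enumerate(names)}
    for two, ten in (("chr2", "chr10"), ("2", "10")):
        if two in index and ten in index:
            return index[ten] < index[two]
    return False


# Illustrative input only: offsets and line widths here are placeholders.
example_fai = (
    "chr1\t249250621\t6\t50\t51\n"
    "chr10\t135534747\t0\t50\t51\n"
    "chr2\t243199373\t0\t50\t51\n"
)
print(looks_lexicographic(contig_names_from_fai(example_fai)))  # True
```

The same test can be applied to the contig names in the @SQ lines of the BAM header, since the reads and the reference need to agree on contig order.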

Thanks, Frankie.

24 months ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

The GATK tools at http://usegalaxy.org are indexed for human using hg_g1k_v37.

My guess is that your hg19 fasta file is not sorted in the order GATK expects (it is very specific). Help with re-sorting it, including extra help for custom genomes, is here: https://biostar.usegalaxy.org/p/14777/
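To illustrate what re-sorting a fasta file involves, here is a minimal sketch that reorders FASTA records so that numbered chromosomes come in numeric order rather than lexicographic order. The function names are hypothetical, and the placement of X, Y, M and any extra contigs is an assumption made for the example; the linked help page is the authority on the exact order GATK expects.

```python
def karyotypic_key(name):
    """Sort key: numbered chromosomes numerically, then X, Y, M/MT, then the rest.

    Assumption: the position of X, Y, M and other contigs is illustrative only.
    """
    core = name[3:] if name.startswith("chr") else name
    if core.isdigit():
        return (0, int(core), "")
    special = {"X": 1, "Y": 2, "M": 3, "MT": 3}
    if core in special:
        return (special[core], 0, "")
    return (4, 0, name)


def read_fasta(text):
    """Parse FASTA text into a list of (contig_name, lines), header line included."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            # The contig name is the first word after '>'.
            records.append((line[1:].split()[0], [line]))
        elif records:
            records[-1][1].append(line)
    return records


def reorder_fasta(text):
    """Return FASTA text with records sorted by karyotypic_key."""
    records = sorted(read_fasta(text), key=lambda record: karyotypic_key(record[0]))
    return "\n".join(line for _, lines in records for line in lines) + "\n"


# Tiny made-up sequences, for illustration only.
example = ">chr1\nAAAA\n>chr10\nCCCC\n>chr2\nGGGG\n"
print(reorder_fasta(example))
```

This sketch holds the whole file in memory, so it is meant to show the idea rather than to process a full human genome. After any reordering, the .fai and .dict files would need to be regenerated from the new fasta (for example with samtools faidx and Picard CreateSequenceDictionary), and ReorderSam rerun against that same reference.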

Best, Jen, Galaxy team
