How to deal with repeated genomic regions in BWA? (How to generate a BED file from the XA tags)



ed.dreuzy • 10 wrote:

Hi, I am using "BWA for Illumina" on the Galaxy main server. I am looking for the frequency of reads matching a specific region that is present in multiple copies in hg19, on a few chromosomes. When I run BWA with in silico reads made from the original hg19 sequence, I get a MAPQ of 0 along with the XT:A:R tag, and generally two other locations indicated by the XA: tag. Therefore, when I run my actual reads, all the reads from that region are removed from the results, as they are tagged unmapped and fall below the quality filter used.

Is there a way to:

1. Extract all the XA: locations obtained by running a synthetic library generated from the original target region sequence, and create a BED file covering all these regions? (It is a large region; I have a synthetic library with 100k reads, mimicking all the 30 bp or 80 bp reads that could be derived from it.)

That way, I guess that when running my actual library, I can keep all the reads from the SAM output that fall within that BED file and pool them back with all the other reads that matched other locations in hg19 with a correct MAPQ. (By the way, is there a general agreement on an acceptable MAPQ score?)

Do you think this is a good strategy, or is there maybe a better way to deal with that issue?

Thank you for your help,

Edouard DD

alignment bed bwa galaxy samtools • 859 views


modified 3.0 years ago • written 3.0 years ago by ed.dreuzy • 10
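For readers who want to try the pooling step described in the question outside Galaxy, here is a minimal Python sketch. It is not a Galaxy tool and not part of BWA. The function names (`ref_span`, `load_bed`, `keep_read`) are made up for illustration, and the `MIN_MAPQ` value of 20 is only a placeholder, not an agreed threshold. It assumes a plain-text SAM file and a tab-separated BED file with 0-based, half-open coordinates.

```python
import re
from collections import defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
MIN_MAPQ = 20  # placeholder threshold for illustration, not a recommendation


def ref_span(cigar):
    """Reference bases consumed by a CIGAR string (operations M, D, N, =, X)."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MDN=X")


def load_bed(bed_lines):
    """Map chromosome -> list of (start, end) intervals, 0-based half-open."""
    regions = defaultdict(list)
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue
        regions[fields[0]].append((int(fields[1]), int(fields[2])))
    return regions


def keep_read(sam_line, regions, min_mapq=MIN_MAPQ):
    """Keep a mapped read if its MAPQ passes, or if it overlaps a BED region."""
    f = sam_line.rstrip("\n").split("\t")
    flag, chrom, pos, mapq, cigar = int(f[1]), f[2], int(f[3]), int(f[4]), f[5]
    if flag & 4:  # 0x4 = read is unmapped
        return False
    if mapq >= min_mapq:
        return True
    start = pos - 1  # SAM POS is 1-based; BED starts are 0-based
    end = start + ref_span(cigar)
    return any(s < end and start < e for s, e in regions.get(chrom, []))
```

The linear scan over the intervals is fine for a handful of repeat copies. For a large BED file, a sorted list or an interval tree would be a better fit.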

Hello, I don't know the answer for this one. Perhaps someone else on this forum will answer, although most Q&A here is with respect to Galaxy usage (not the details of 3rd party algorithms). Because of this, I would also suggest asking this question at the BWA help forum since their sole focus is this tool and use cases. Thanks! Jen

written 3.0 years ago by Jennifer Hillman Jackson ♦ 25k


ed.dreuzy • 10 wrote:

Thanks for your answer,

I will try the BWA help forum.

Regarding Galaxy usage, are you aware of a way to extract XA tag coordinates to a BED file?

Thank you for your help,

Edouard DD

modified 3.0 years ago • written 3.0 years ago by ed.dreuzy • 10

Hello, The Select tool could be used to filter lines, then tools in Text Manipulation could be used to parse and format the data. Thanks!

written 3.0 years ago by Jennifer Hillman Jackson ♦ 25k
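Outside Galaxy, the same filter-then-parse idea can be sketched in a few lines of Python. This is only an illustration, with made-up function names (`ref_span`, `xa_to_bed`). It assumes the XA tag has the form `XA:Z:chr,±pos,CIGAR,NM;...`, where the position is 1-based and carries a strand sign, and it writes 0-based, half-open BED intervals. Only the alternative hits in XA are emitted; the primary alignment of each read would have to be added separately if it is needed too.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def ref_span(cigar):
    """Reference bases consumed by a CIGAR string (operations M, D, N, =, X)."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MDN=X")


def xa_to_bed(sam_lines):
    """Yield (chrom, start, end, read_name, score, strand) for each XA hit."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name = fields[0]
        for tag in fields[11:]:  # optional tags start at column 12
            if not tag.startswith("XA:Z:"):
                continue
            for hit in tag[5:].split(";"):
                if not hit:  # the XA value ends with a trailing ';'
                    continue
                chrom, pos, cigar, _nm = hit.rsplit(",", 3)
                strand = "-" if pos.startswith("-") else "+"
                start = abs(int(pos)) - 1  # 1-based signed pos -> 0-based start
                yield chrom, start, start + ref_span(cigar), name, 0, strand


def write_bed(sam_path, bed_path):
    """Read a plain-text SAM file and write one BED line per XA hit."""
    with open(sam_path) as sam, open(bed_path, "w") as bed:
        for record in xa_to_bed(sam):
            bed.write("\t".join(map(str, record)) + "\n")
```

With 100k synthetic reads the output will contain many overlapping intervals, so it would normally be sorted and merged afterwards, for example with a merge tool in Galaxy or `bedtools merge`.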
