Question: Mapping using multiple reference sequences
Juan Ledesma (United Kingdom) wrote, 13 months ago:

Hi

I am using BWA on Galaxy to map my FASTQ files against two different reference sequences (V3-1 and V3-2) simultaneously, to see which reads belong to each sequence.

Once I have the SAM file, I would like to filter the mapped reads into two BAM files: one with the reads mapping to V3-1 and another with those mapping to V3-2.

Does anyone know how I could filter these reads on Galaxy?

Many thanks

Mo Heydarian (United States) wrote, 13 months ago:

Hello,

Can you provide more information on how you are simultaneously mapping to two reference sequences?

If you map your reads to your two reference sequences individually and have two SAM files, you could use the "Compare two Datasets to find common or distinct rows" tool to filter your alignments based on the read ID (column 1 in SAM format).
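For anyone who wants to try the same comparison outside Galaxy, here is a minimal Python sketch of the idea. It is not how the Galaxy tool is implemented; the function names `read_ids` and `compare_read_ids` are made up for illustration, and it assumes plain-text, tab-separated SAM lines with the read ID in column 1 and the FLAG in column 2.

```python
def read_ids(sam_lines):
    """Collect read IDs (column 1, QNAME) from mapped alignment lines."""
    ids = set()
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 4:  # FLAG bit 0x4 means the read is unmapped
            continue
        ids.add(fields[0])
    return ids


def compare_read_ids(sam_a_lines, sam_b_lines):
    """Return read IDs that are distinct to each SAM input or common to both."""
    a = read_ids(sam_a_lines)
    b = read_ids(sam_b_lines)
    return {"only_a": a - b, "only_b": b - a, "both": a & b}


# Tiny made-up inputs, one per reference, to show the call.
sam_v3_1 = [
    "@SQ\tSN:V3-1\tLN:100",
    "read1\t0\tV3-1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tV3-1\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
sam_v3_2 = [
    "@SQ\tSN:V3-2\tLN:100",
    "read2\t0\tV3-2\t7\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read3\t0\tV3-2\t9\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
result = compare_read_ids(sam_v3_1, sam_v3_2)
```

Reads in `both` aligned to both references, so they may need a closer look before being assigned to either one.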

Thanks for using Galaxy!

Cheers, Mo Heydarian, Galaxy Team

Juan Ledesma (United Kingdom) wrote, 13 months ago:

Hi

Basically, I use a single file containing the two FASTA sequences I want as references, and I map the FASTQ files to it using BWA. Afterwards I filter out the unmapped reads, which gives me a SAM file containing the reads that map to either reference sequence. What I would like is two BAM files, one per reference, so that I can generate a consensus FASTA for each reference sequence.

Thanks for your help. I will try what you suggest and let you know.

Cheers

Juan

Mo Heydarian replied, 13 months ago:

Hi Juan, Have a look at column three of your output SAM file. You may be able to use the identifiers of your two references to parse out reads aligning to each one.
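As a rough illustration of parsing on column three, the Python sketch below splits SAM alignment lines into one group per reference name. The function name `split_by_reference` is hypothetical, and the sketch assumes plain-text, tab-separated SAM input; it keeps all header lines together rather than tailoring them to each output.

```python
from collections import defaultdict


def split_by_reference(sam_lines):
    """Group mapped alignment lines by reference name (column 3, RNAME)."""
    headers = []
    by_ref = defaultdict(list)
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):  # SAM header lines start with "@"
            headers.append(line)
            continue
        fields = line.split("\t")
        rname = fields[2]
        # Skip unmapped reads: FLAG bit 0x4 set, or RNAME given as "*".
        if int(fields[1]) & 4 or rname == "*":
            continue
        by_ref[rname].append(line)
    return headers, dict(by_ref)


# Made-up example with both references in one SAM file.
sam = [
    "@SQ\tSN:V3-1\tLN:100",
    "@SQ\tSN:V3-2\tLN:100",
    "read1\t0\tV3-1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tV3-2\t7\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read3\t16\tV3-1\t9\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
headers, groups = split_by_reference(sam)
for ref, lines in groups.items():
    print(ref, len(lines))
```

Each group, written out together with the header lines, gives a per-reference SAM file that can then be converted to BAM.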

A quick way to find out whether there are unique identifiers is to run the "Group" tool on column 3 of your SAM file; this will output a list of the unique entries in column 3.
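The same check can be sketched in a few lines of Python. This only mimics the idea of grouping on column 3 and is not the Galaxy "Group" tool itself; `count_references` is a hypothetical name, and the input is assumed to be tab-separated SAM text.

```python
from collections import Counter


def count_references(sam_lines):
    """Count alignment lines per unique value of column 3 (RNAME)."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and "@" header lines.
        if not line.strip() or line.startswith("@"):
            continue
        counts[line.rstrip("\n").split("\t")[2]] += 1
    return counts


# Made-up alignment lines for illustration.
sam = [
    "@SQ\tSN:V3-1\tLN:100",
    "read1\t0\tV3-1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tV3-2\t7\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read3\t16\tV3-1\t9\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
for ref, n in sorted(count_references(sam).items()):
    print(ref, n)
```

If unmapped reads are still in the file, they show up under `*` in column 3.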

Hope this helps!

Cheers, Mo Heydarian, Galaxy Team

Juan Ledesma (United Kingdom) wrote, 13 months ago:

Hi Mo, many thanks for your advice. I managed to filter the reads aligning to each sequence using the Filter tool and the specific identifier for each one.

Thanks again

Juan
