Separating (F and R) reads uploaded from NCBI SRA & mapping them using BWA-MEM

Question: Separating (F and R) reads uploaded from NCBI SRA & mapping them using BWA-MEM

21 months ago by

tendai • 20

United States

tendai • 20 wrote:

Can someone explain to me the best way to prepare NCBI SRA data for subsequent mapping to a custom genome? I uploaded WGS paired-end sequence reads (8 files) from NCBI SRA directly into Galaxy Main by choosing the FastQ format/option.

Since each file contains both forward (F) and reverse (R) reads, as shown below: 1) Is it necessary to separate the reads into separate F and R files before mapping them to my custom genome with BWA-MEM in Galaxy? 2) If so, how do I do this? After reading this 4.5-year-old thread on a similar topic, https://biostar.usegalaxy.org/p/4988/, I tried to split the reads within Galaxy using that information, but I have just cancelled the job because it had been running for more than 16 hours on a single file.

@1/1
GATTCCAGCAAAGCACTCCCAAGGGGGCCTGACAGTGGTCAAGAGAA
+SRR5110008.1 1 length=151
AAAFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
@1/2
AATCAGTCCTGGCTGGTGTTAAGCCCTCAGGGGCAGGAGGGTGAAGT
+SRR5110008.1 1 length=151
AAFFFJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
@2/1
AATAAAATTTTTAAAAAGTTATAAAGGAATACCTTTTCCAAAAGACC
+SRR5110008.2 2 length=151
AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
@2/2
TGTACGGAAAAGGGTCAGGACCTTCTCTAGACTGGGAGTTGCAAGCT
+SRR5110008.2 2 length=150
AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
@3/1
TGAAGTTGAGAGGGATCCATGGAAAGAGCTGGCATTCTCACTGTGAA
+SRR5110008.3 3 length=151
AAAFAFAFJF<FJA7-FJ7FAA-F-F-FFFAFF-FJJJJJJFJJJJJ
@3/2
AAAGAAGGAAACACATATACCTGGCTTCTGTCAACTTAGCTAAGCTG
+SRR5110008.3 3 length=151
(The reads are ~150 bp, but I truncated them on the right.) Thanks in advance.
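For anyone who needs to do the same split outside Galaxy, here is a minimal Python sketch of de-interleaving a file laid out like the excerpt above. It is not what any Galaxy tool does internally. It assumes four lines per record, no blank lines, and read names ending in /1 or /2 as in the excerpt. The function names read_fastq and split_interleaved are made up for this illustration.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples.

    Assumes a strict four-lines-per-record FASTQ with no blank lines.
    """
    while True:
        lines = [line.rstrip("\n") for line in islice(handle, 4)]
        if not lines:
            return  # clean end of file
        if (len(lines) < 4
                or not lines[0].startswith("@")
                or not lines[2].startswith("+")):
            raise ValueError(f"malformed FASTQ record near {lines[0]!r}")
        yield tuple(lines)


def split_interleaved(src, fwd, rev):
    """Write /1 reads to fwd and /2 reads to rev.

    src, fwd and rev are open text handles. Returns the number of
    records written to each output as (n_forward, n_reverse).
    """
    counts = [0, 0]
    for record in read_fastq(src):
        # Read name is the first whitespace-delimited token of the header.
        name = record[0].split()[0]
        if name.endswith("/1"):
            out, idx = fwd, 0
        elif name.endswith("/2"):
            out, idx = rev, 1
        else:
            raise ValueError(f"read name without /1 or /2 suffix: {name!r}")
        out.write("\n".join(record) + "\n")
        counts[idx] += 1
    return tuple(counts)


# Example call (file names are hypothetical):
# with open("SRR5110008.fastq") as src, \
#         open("SRR5110008_1.fastq", "w") as fwd, \
#         open("SRR5110008_2.fastq", "w") as rev:
#     print(split_interleaved(src, fwd, rev))
```

The two counts should be equal for properly paired data. If they differ, some mates are missing and the files should not be passed to a paired-end mapper as they are.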

bwa-mem separating ncbi sra reads galaxy • 748 views

modified 21 months ago by scholtz • 60 • written 21 months ago by tendai • 20

21 months ago by

scholtz • 60

Hungary

scholtz • 60 wrote:

This is what I've been using, following the suggestion of the Galaxy team - you will end up with separate forward/reverse data files:

1. Upload the SRR IDs as a "list" (tabular file) using NCBI SRA Tools - Extract reads in FASTQ/A format, with Input type: List of SRA accession, one per line
2. Once the data are uploaded, click on "hide hidden" in the History panel
3. Unhide the forward/reverse data files

Proceed with the analysis as usual.
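The accession list mentioned above is just a plain text file with one run accession per line. A minimal Python sketch for producing one follows. The function name write_accession_list and the output file name are hypothetical, and the check only covers run accessions of the form SRR, ERR or DRR followed by digits.

```python
import re
from pathlib import Path

# Run accessions look like SRR5110008 (also ERR... and DRR...).
RUN_ACCESSION = re.compile(r"[SED]RR\d+")


def write_accession_list(accessions, path):
    """Write unique run accessions to path, one per line.

    Returns the list of accessions written, in input order.
    """
    cleaned = []
    for acc in accessions:
        acc = acc.strip()
        if not RUN_ACCESSION.fullmatch(acc):
            raise ValueError(f"not a run accession: {acc!r}")
        if acc not in cleaned:  # drop duplicates, keep first occurrence
            cleaned.append(acc)
    Path(path).write_text("\n".join(cleaned) + "\n")
    return cleaned


# Example call (output file name is hypothetical):
# write_accession_list(["SRR5110008"], "accessions.txt")
```

The resulting file can then be uploaded to the History and selected as the input list.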

Hope this helps,

Beata

written 21 months ago by scholtz • 60

Thanks! This solved my problem.

written 21 months ago by tendai • 20
