Low data conversion rate for BAM-to-SAM. Fix Database, Datatype, Sorting

Question: Low data conversion rate for BAM-to-SAM. Fix Database, Datatype, Sorting

cwalker912 wrote (2.2 years ago):

I have WXS FASTQ files from an Illumina HiSeq 4000 paired-end run. I uploaded them through FTP as fastqillumina; each is about 24 GB. The reads look fine in FastQ Summary Statistics. I aligned to hg19 using BWA for Illumina and got a SAM file that is 62 GB. I then ran SAMTOOLS SAM-to-BAM on that SAM file. It ran for a few hours, and the output BAM file is 1.8 KB (kilobytes, as in tiny). Please let me know where I went wrong with this workflow. Any help would be greatly appreciated. Thank you very much.

fastq datatype database samtools bam • 730 views


Jennifer Hillman Jackson (United States) wrote (2.2 years ago):

Hello,

Two areas to correct/adjust:

1. Database and Sorting

Does the input dataset have the correct reference genome assigned as the "database"? Samtools requires this as well as sorted input.

Fix: Assign the correct "database". If you used a Custom Reference genome for alignment, create a Custom Build from it and assign that instead. Then sort the input dataset.

How to change datatype: https://wiki.galaxyproject.org/Support#Tool_doesn.27t_recognize_dataset

How to create a Custom Build (other Custom Genome formatting rules are on the same wiki page): https://wiki.galaxyproject.org/Learn/CustomGenomes

Sorting tips: https://github.com/jennaj/support-prior-qa/wiki/Sort-your-inputs
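If you have the SAM file on your own machine, a quick look at its header shows both things mentioned above: the sort order it declares and the reference sequences it was aligned against. This is only a rough sketch, not a Galaxy tool. The function name summarize_sam_header and the example records are made up for illustration, and a missing SO tag is simply reported as None.

```python
import io

def summarize_sam_header(handle):
    """Read only the header lines (those starting with '@') of a SAM stream.

    Returns the declared sort order from the @HD line (or None if absent)
    and a list of (name, length) pairs taken from the @SQ lines.
    """
    sort_order = None
    references = []
    for line in handle:
        if not line.startswith("@"):
            break  # first alignment record reached; the header is over
        fields = line.rstrip("\n").split("\t")
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if fields[0] == "@HD":
            sort_order = tags.get("SO")
        elif fields[0] == "@SQ":
            references.append((tags.get("SN"), int(tags.get("LN", 0))))
    return sort_order, references

# Hypothetical header; a real file would be opened with open(path) instead.
example = io.StringIO(
    "@HD\tVN:1.4\tSO:unsorted\n"
    "@SQ\tSN:chr1\tLN:249250621\n"
    "@SQ\tSN:chr2\tLN:243199373\n"
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n"
)
sort_order, references = summarize_sam_header(example)
print("sort order:", sort_order)
print("references:", references)
```

If the reference names do not match the genome assigned as the "database" (for example "1" versus "chr1"), or the sort order is not "coordinate", those are the first things to fix.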

2. Datatype: Fastqsanger

Tools require input with .fastqsanger formatted sequence/quality scores. I suspect your data is already in this format and the assignment of .fastqillumina is causing problems. Prior Q&A and bug reports with this type of result (low hits) are often due to the wrong sequence datatype as input, either in content or by datatype assignment.

Fix: Double-check the actual format first, then either run Fastq Groomer or assign the correct datatype. Don't just change the assigned datatype without checking, or more unexpected results can occur. This is how: https://wiki.galaxyproject.org/Support#FASTQ_Datatype_QA
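For a rough idea of what "double check format" can mean in practice, the quality lines of a FASTQ file can be scanned for their lowest and highest characters. The sketch below is a heuristic only, not what Fastq Groomer does internally. The function name guess_fastq_offset and the cut-off values are assumptions chosen for illustration, and it assumes plain four-line FASTQ records with no wrapped sequence lines.

```python
import io

def guess_fastq_offset(handle, max_reads=10000):
    """Guess the quality-score offset from the first max_reads records.

    Assumes four lines per record. Characters below ASCII 59 can only occur
    with a +33 offset (fastqsanger); a range that starts at 59 or higher and
    extends past 'J' (74) looks like one of the older +64 encodings.
    """
    lowest, highest = 255, 0
    for i, line in enumerate(handle):
        if i // 4 >= max_reads:
            break
        if i % 4 == 3:  # the fourth line of each record holds the qualities
            codes = [ord(c) for c in line.rstrip("\n")]
            if codes:
                lowest = min(lowest, min(codes))
                highest = max(highest, max(codes))
    if highest == 0:
        return "unknown"
    if lowest < 59:
        return "phred33"
    if highest > 74:
        return "phred64-like"
    return "ambiguous"

# Two tiny made-up records; a real file would be opened with open(path).
sanger_like = io.StringIO("@r1\nACGT\n+\nII#5\n")
old_illumina_like = io.StringIO("@r2\nACGT\n+\nhhgf\n")
print(guess_fastq_offset(sanger_like))
print(guess_fastq_offset(old_illumina_like))
```

A "phred33" result would be consistent with data that is already in fastqsanger format, which is what Illumina pipelines from version 1.8 onward write. An "ambiguous" result means the sampled reads do not settle the question and more reads should be checked.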

Thanks, Jen, Galaxy team
