Question: GATK pipeline for WGS samples
2.2 years ago, bsmith030465 wrote:

Hi,

I have a command-line script that runs a GATK analysis, and I am now trying to set up a similar workflow on Galaxy. This is my first time using Galaxy, and loading and organizing the data and the analysis seems a bit daunting, so any help would be highly appreciated!

I have my Illumina WGS FASTQ data organized in sample folders, e.g.:

Sample_1/C45GOXX_s1_1_xx.fastq
Sample_1/C45GOXX_s1_2_xx.fastq
Sample_1/C45GOXX_s2_1_xx.fastq
Sample_1/C45GOXX_s2_2_xx.fastq
Sample_1/C45GOXX_s3_1_xx.fastq
Sample_1/C45GOXX_s3_2_xx.fastq
.
.

In my script, I loop over all the files in the sample so that I can do the following for each pair (steps derived from the Broad GATK 'Best Practices'):

i) Align each sequence (command line equivalent: bwa aln)
ii) Combine paired-end reads (bwa sampe)
iii) Convert SAM to BAM (samtools view)
iv) Fix mate pair information (samtools fixmate)
v) Set MAPQ to 0 for unmapped reads (samtools view)
vi) Sort BAM file (samtools sort)
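
For readers following along, the per-pair loop described above might look roughly like the sketch below. It is not the poster's actual script. It assumes file names follow the `<flowcell>_s<lane>_<read>_xx.fastq` pattern from the listing, uses a hypothetical BWA-indexed reference called `ref.fa`, assumes samtools 1.x option syntax, and only builds the command lines instead of running them. Step v is left out because the post does not say which `samtools view` options are used for it.

```python
import re
from collections import defaultdict

# Assumed naming pattern, taken from the listing: C45GOXX_s1_1_xx.fastq
FASTQ_RE = re.compile(r"^(?P<prefix>.+_s\d+)_(?P<read>[12])_(?P<suffix>.+)\.fastq$")


def pair_fastqs(filenames):
    """Group FASTQ file names into (read1, read2) pairs keyed by lane prefix."""
    lanes = defaultdict(dict)
    for name in filenames:
        match = FASTQ_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        lanes[match.group("prefix")][match.group("read")] = name
    # Keep only lanes where both mates are present.
    return {
        prefix: (reads["1"], reads["2"])
        for prefix, reads in sorted(lanes.items())
        if "1" in reads and "2" in reads
    }


def commands_for_pair(prefix, read1, read2, ref="ref.fa"):
    """Return command lines (as argument lists) for one read pair.

    `ref` is a hypothetical reference path; output names are illustrative.
    """
    sai1, sai2 = f"{prefix}_1.sai", f"{prefix}_2.sai"
    sam, bam = f"{prefix}.sam", f"{prefix}.bam"
    fixed, sorted_bam = f"{prefix}.fixmate.bam", f"{prefix}.sorted.bam"
    return [
        ["bwa", "aln", "-f", sai1, ref, read1],                         # i) align read 1
        ["bwa", "aln", "-f", sai2, ref, read2],                         # i) align read 2
        ["bwa", "sampe", "-f", sam, ref, sai1, sai2, read1, read2],     # ii) combine pairs
        ["samtools", "view", "-b", "-o", bam, sam],                     # iii) SAM to BAM
        ["samtools", "fixmate", bam, fixed],                            # iv) fix mate info
        ["samtools", "sort", "-o", sorted_bam, fixed],                  # vi) sort BAM
    ]
```

Each argument list could be passed to `subprocess.run(cmd, check=True)` once the file names and reference path have been adjusted to the real data.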

My questions:

  1. How can I set up a loop (or some other workflow/structure) so that, for each sample, I get a set of bam files (output of step vi)?

  2. Here are the commands that I think I can use for each of the steps above

i) NGS: Mapping > Map with BWA for Illumina
ii) ??
iii) NGS: SAMtools > SAM-to-BAM
iv) ??
v) ??
vi) NGS: SAMtools > Sort

Which tools/commands should I be using for ii, iv and v?

Many thanks!

2.2 years ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

To analyse multiple paired-end samples, use the composite datatype Dataset Collection.
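
As a rough illustration of how one sample's lanes map onto a list of pairs, here is a plain-Python sketch. It is not a Galaxy API call: the function name and dictionary layout are hypothetical and only mirror the idea that each element of a paired collection carries a `forward` and a `reverse` dataset. In Galaxy itself the collection is built from the history panel.

```python
def as_paired_list(sample_name, pairs):
    """Arrange (read1, read2) pairs as a list of forward/reverse elements.

    Plain-Python illustration only; the layout is hypothetical.
    """
    return {
        "name": sample_name,
        "collection_type": "list:paired",
        "elements": [
            {"identifier": lane, "forward": read1, "reverse": read2}
            for lane, (read1, read2) in sorted(pairs.items())
        ],
    }


# Example with two lanes from the Sample_1 folder in the question.
sample_1 = as_paired_list(
    "Sample_1",
    {
        "C45GOXX_s1": ("C45GOXX_s1_1_xx.fastq", "C45GOXX_s1_2_xx.fastq"),
        "C45GOXX_s2": ("C45GOXX_s2_1_xx.fastq", "C45GOXX_s2_2_xx.fastq"),
    },
)
```

A tool run on such a collection is applied once per pair, which takes the place of the explicit loop in a command-line script.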

For the tools, try the ones below. You can find each with the tool search at the top of the far-left tool panel at http://usegalaxy.org. You can also review the tools in the SAMtools and Picard tool sections, or review these tool packages in the Tool Shed.

ii) Combine paired end reads (bwa sampe) = FASTQ joiner

iv) Fix mate pair information (samtools fixmate) = FixMateInformation

v) Set MAPQ to 0 for unmapped reads (samtools view) = CleanSam

Thanks! Jen, Galaxy team
