Question: Rat SNP/Indel calling using Galaxy
Christian.Wood (United Kingdom) wrote, 4.2 years ago:

Dear all,

I'm trying to figure out the 'best' method for SNP/variant/indel calling in a set of paired-end Illumina rat RNA-seq data that has been mapped to rn5 using Bowtie and converted to BAM files. I have 2 groups, WT and mutant, and have 6 biological replicates. I'm relatively new to RNA-seq analysis, so I only have access to the main Galaxy server at usegalaxy.org and may be slightly naive in my understanding of the different tools available to me.

I believe the most common method for running SNP analysis is the GATK tools, but as far as I can tell only the GATK beta tools are currently available, and they use the human b37 genome only. What are your thoughts on the best method for calling SNPs/indels in this dataset using the current tools? Would MPileup/FreeBayes be the way to go?

In addition, what would be the best method for comparing the two groups, control vs mutant?

Any help on this would be greatly appreciated.

Many thanks,

Christian

Tags: rna-seq, snp, usegalaxy, indel, bowtie
Jennifer Hillman Jackson (United States) wrote, 4.2 years ago:

Hello Christian,

You can use GATK with the rat genome as a Custom Reference genome, but there is a good chance that the jobs will become too large to run on the public instance when run this way (although that isn't certain until you test). There are other methods: two are in this simplified protocol, if you want to have a look, compare results, and see if one of them meets your needs:
http://usegalaxy.org/u/galaxyproject/p/galaxy-101-ngs-variant
http://wiki.galaxyproject.org/Support#Custom_reference_genome

Other example tutorials/protocols can be found in "Shared Data -> Published Pages" on Main (http://usegalaxy.org) and in our wiki under the Learn and Teach sections (http://galaxyproject.org).

Depending on where a protocol is sourced, the tools it uses may or may not be on the public Main Galaxy server. Also, you won't really know whether your data is too large to run the more compute-intensive tools until you test. If you do run into scale issues, a cloud Galaxy is often a good alternative, especially if you are an academic: the AWS Educational Grants program is worth checking out.
http://usegalaxy.org/cloud

As you work through this, make sure that you know whether your data is DNA or RNA. The way your question is phrased indicates that the data could be either. If you are not sure whether a tool is appropriate for one, the other, or both, the tool's form itself will have a link to usage documentation.
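
If you download the VCF outputs and want to do the "compare results" step outside Galaxy, here is a minimal sketch in plain Python. It is not part of the linked protocol. The function names `read_variants` and `compare_calls` are made up for illustration, and the VCF lines are invented example data, not real calls. It assumes standard tab-separated VCF records and treats a variant as the combination of chromosome, position, reference allele, and alternate allele.

```python
def read_variants(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF-formatted text lines.

    Header lines (starting with '#') and blank lines are skipped.
    Multi-allelic records (ALT like 'T,G') are split into one key per allele.
    """
    variants = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            variants.add((chrom, pos, ref, alt))
    return variants


def compare_calls(calls_a, calls_b):
    """Split two variant sets into shared calls and calls unique to each."""
    return {
        "shared": calls_a & calls_b,
        "only_a": calls_a - calls_b,
        "only_b": calls_b - calls_a,
    }


# Invented example records standing in for the output of two callers.
caller_a = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t250\t.\tC\tT,G\t40\tPASS\t.",
]
caller_b = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t60\tPASS\t.",
    "chr2\t75\t.\tGA\tG\t30\tPASS\t.",
]

result = compare_calls(read_variants(caller_a), read_variants(caller_b))
for label, calls in result.items():
    print(label, sorted(calls))
```

The same set comparison could be pointed at WT versus mutant call sets, but that only tells you which sites were called in each group. It ignores coverage, genotype, and quality, so treat it as a first look and not as a statistical comparison.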

Hopefully this provides some options that help you to decide how to proceed.

Jen, Galaxy team

 
