Hi Ratish,
Glad that you were able to upload your data. For data prep, starting with FastQC is a great choice. From there, just make sure that the quality scores are scaled correctly and that the datatype labels assigned to the datasets are correct. Help for that is in the Galaxy wiki here:
http://wiki.galaxyproject.org/Support section 2.10.1
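If you want a quick check of the quality score scaling outside of Galaxy, here is a minimal Python sketch, not an official Galaxy tool. It guesses the Phred offset from the range of ASCII characters in the quality lines, using the common rule of thumb that characters below ';' (ASCII 59) only occur with the +33 (Sanger) offset. The function names, the 1000-read sample size, and the cutoff of 74 for the +64 case are my own assumptions for illustration, and the heuristic can be wrong on small or unusual files.

```python
def quality_strings(fastq_lines, max_reads=1000):
    """Yield the quality string (4th line) of each 4-line FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i // 4 >= max_reads:
            break
        if i % 4 == 3:
            yield line.rstrip("\n")


def guess_quality_encoding(fastq_lines, max_reads=1000):
    """Return 'phred33', 'phred64', or 'ambiguous' (heuristic only)."""
    codes = [ord(ch)
             for qual in quality_strings(fastq_lines, max_reads)
             for ch in qual]
    if not codes:
        raise ValueError("no quality lines found")
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters this low are only valid with the +33 offset.
        return "phred33"
    if highest > 74:
        # Assumed cutoff: values this high usually mean the +64 offset.
        return "phred64"
    return "ambiguous"
```

To use it on a file, pass an open handle, for example `guess_quality_encoding(open("reads.fastq"))`, where `reads.fastq` is a placeholder name. A result of `phred33` is consistent with the fastqsanger datatype; anything else is worth a closer look before you map.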
For the analysis, if the data are human and you plan to use GATK, make sure that you align against the 1000 Genomes version of the human genome (hg_g1k_b37). This will allow you to use the indexes already in place. If you are using other tools, then hg19 and hg38 are also options.
In short, decide which target genome to use (human or other) based on what is available for the other inputs you plan to use (reference annotation datasets, such as dbSNP and others). The availability of these can vary by genome and genome build. All inputs must be based on the exact same genome build. Once you know the inputs, then map. If you wait until after mapping to look for downstream inputs, you may find that what is available (or the best choice) does not match the build you selected for mapping, which means starting over, and that is never fun.
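One way to spot a build mismatch before mapping is to compare the chromosome names in an annotation file against those in the reference. The Python sketch below is only an illustration under my own assumptions: it expects the annotation to be a VCF with `##contig` header lines and the reference to have a FASTA index (`.fai`), and all function names are hypothetical. A typical symptom it catches is `chr1` (hg19 style) versus `1` (b37 style).

```python
import re

# Matches VCF header lines such as: ##contig=<ID=1,length=249250621>
CONTIG_RE = re.compile(r"^##contig=<ID=([^,>]+)")


def contigs_from_vcf_header(lines):
    """Collect contig IDs declared in the VCF header lines."""
    names = []
    for line in lines:
        match = CONTIG_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def contigs_from_fai(lines):
    """Collect sequence names (first column) from a .fai index."""
    return [line.split("\t")[0] for line in lines if line.strip()]


def missing_contigs(reference_contigs, annotation_contigs):
    """Return annotation contigs that are absent from the reference."""
    reference = set(reference_contigs)
    return sorted(set(annotation_contigs) - reference)
```

An empty result from `missing_contigs` only means the names agree. Matching names do not prove the builds are identical, so also confirm the build stated in each file's documentation or header.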
Good luck with your project!
Jen, Galaxy team