Question: pipeline for DNA-seq analysis
3.6 years ago by
rgambhir10
United States
rgambhir10 wrote:

Thanks for all your help. I finally got the data uploaded to Galaxy. As suggested, there was a problem with uploading my fastq.gz files; now everything looks fine, and I would like to start analyzing my data. I looked at the FastQC reports and everything looks good. I have DNA sequences derived from plasma samples of cancer patients (cell-free DNA). I am interested in aligning these sequences against a recent human genome build to identify any mutations, insertions/deletions, or copy number variations. Please advise what the next step is before I go into my data analysis.

Regards,

Ratish

fastq alignment data-prep • 1.5k views
modified 3.6 years ago by Jennifer Hillman Jackson25k • written 3.6 years ago by rgambhir10
3.6 years ago by
United States
Jennifer Hillman Jackson25k wrote:

Hi Ratish,

Glad that you were able to upload your data. For data prep, starting with FastQC is a great choice. From there, just make sure that the data has the quality scores scaled correctly and that the datatype labels are correct. Help for that is in the Galaxy wiki here:
http://wiki.galaxyproject.org/Support section 2.10.1
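
As a rough illustration of what "quality scores scaled correctly" means, the sketch below guesses whether a FASTQ file uses a Phred+33 or Phred+64 offset by looking at the range of quality characters. The function names, the 10,000-record sample size, and the cutoffs (59 and 74) are assumptions based on the commonly cited ASCII ranges for these encodings, not something taken from the Galaxy wiki. It also assumes plain 4-line FASTQ records with no line wrapping.

```python
import gzip
from itertools import islice


def quality_strings(lines):
    """Yield the quality line (4th line) of each 4-line FASTQ record."""
    for index, line in enumerate(lines):
        if index % 4 == 3:
            yield line.rstrip("\r\n")


def guess_phred_offset(qualities):
    """Return 33, 64, or None (ambiguous or empty) for the quality strings.

    Heuristic (an assumption, not a guarantee): characters below ASCII 59
    only occur with a +33 offset; characters above ASCII 74 point to a
    +64 offset.
    """
    codes = [ord(ch) for quality in qualities for ch in quality]
    if not codes:
        return None
    if min(codes) < 59:
        return 33
    if max(codes) > 74:
        return 64
    return None


def guess_offset_for_file(path, max_records=10000):
    """Sample the first records of a (possibly gzipped) FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        sampled = islice(handle, 4 * max_records)
        return guess_phred_offset(quality_strings(sampled))
```

A result of 33 would correspond to the fastqsanger datatype in Galaxy, and 64 to the older Illumina-style encoding. A result of None means the sample was inconclusive, so check how the data were produced rather than guess.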

For the analysis, if using GATK and the data are human, make sure that you align against the 1000 Genomes version of the human genome (hg_g1k_b37). This will allow you to use the indexes already in place. If using other tools, then hg19 and hg38 are also choices.

In short, decide which target genome to use (human or other) based on what is available for the other inputs you plan to use (reference annotation datasets, such as dbSNP and others). The availability of these can vary by genome and genome build. All inputs must be based on exactly the same genome build. Once you know the inputs, then map. If you wait to look for downstream inputs until after mapping, you may find that what is available (or the best choice) does not match the build you selected for mapping, which means starting over - that is never fun.
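
One quick sanity check before mapping is to compare sequence names between the reference and an annotation file. The sketch below is a minimal example under stated assumptions: it reads names from a FASTA index (.fai) and from a VCF, and the function names are hypothetical. It catches the common mismatch between b37-style names ("1", "2", ...) and hg19-style names ("chr1", "chr2", ...). Matching names are necessary but not sufficient, since hg19 and hg38 both use "chr1", so the build of each input should still be confirmed from its source.

```python
def contigs_from_fai(lines):
    """Sequence names from a FASTA index (.fai): the first tab-separated column."""
    return {line.split("\t", 1)[0].strip() for line in lines if line.strip()}


def contigs_from_vcf(lines):
    """Sequence names used by the records of a VCF (header lines are skipped)."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0].strip())
    return names


def unmatched_contigs(reference_names, annotation_names):
    """Names used by the annotation that the reference does not contain."""
    return annotation_names - reference_names


# Illustrative in-memory data: a b37-style index and an hg19-style VCF record.
fai_lines = ["1\t249250621\t52\t60\t61\n"]
vcf_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t.\t.\t.\n",
]
missing = unmatched_contigs(contigs_from_fai(fai_lines), contigs_from_vcf(vcf_lines))
```

A non-empty result means the two files do not share a naming scheme and should not be combined as they are.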

Good luck with your project,

Jen, Galaxy team

written 3.6 years ago by Jennifer Hillman Jackson25k