pipeline for DNA-seq analysis

Question: pipeline for DNA-seq analysis

3.6 years ago by

rgambhir • 10

United States

rgambhir • 10 wrote:

Thanks for all your help. I finally got my data uploaded to Galaxy. As suggested, there was a problem with uploading my fastq.gz files; now everything looks fine, and I would like to start analyzing my data. I looked at the FastQC reports and everything looks good. I have DNA sequences derived from plasma samples of cancer patients (cell-free DNA). I am interested in aligning these sequences against a recent human genome build to identify any mutations, insertions/deletions, or copy number variations. Please advise on the next step before I go into my data analysis.

Regards

Ratish

fastq alignment data-prep • 1.5k views

modified 3.6 years ago by Jennifer Hillman Jackson ♦ 25k • written 3.6 years ago by rgambhir • 10

3.6 years ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hi Ratish,

Glad that you were able to upload your data. For data prep, starting with FastQC is a great choice. From there, just make sure that the quality scores are scaled correctly and that the datatype labels are correct. Help for that is in the Galaxy wiki, section 2.10.1:
http://wiki.galaxyproject.org/Support
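If you want a quick sanity check of the quality score scaling outside of Galaxy, something like the rough Python sketch below can help. It is not taken from the wiki; the function names (`quality_range`, `guess_offset`, `guess_file`) are made up for illustration, and it assumes plain 4-line FASTQ records (no wrapped sequence lines) and typical Illumina score ranges. Treat the result as a hint only and confirm it against the FastQC "Encoding" field.

```python
import gzip
from itertools import islice


def quality_range(lines, max_records=10000):
    """Return (min, max) ASCII codes seen in the quality lines of FASTQ text.

    Assumes 4-line records, so every 4th line (index 3, 7, ...) is a quality string.
    """
    lo, hi = 255, 0
    quals = islice(lines, 3, None, 4)
    for qual in islice(quals, max_records):
        codes = [ord(c) for c in qual.rstrip("\r\n")]
        if codes:
            lo = min(lo, min(codes))
            hi = max(hi, max(codes))
    return lo, hi


def guess_offset(lo, hi):
    """Heuristic guess at the quality encoding from the observed ASCII range."""
    if lo < 59:
        # Characters below ';' (59) cannot occur in the +64 encodings.
        return "phred33"
    if hi > 74:
        # Characters above 'J' (74) are unusual for Illumina Phred+33 data.
        return "phred64"
    return "ambiguous"


def guess_file(path):
    """Guess the encoding of a FASTQ file; handles .gz by extension (assumption)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return guess_offset(*quality_range(handle))
```

A result of `phred33` corresponds to the Sanger-scaled scores that the `fastqsanger` datatype expects; anything else suggests the scores need rescaling (for example with a FASTQ grooming step) before mapping.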

For the analysis, if using GATK and the data are human, make sure that you align against the 1000 Genomes version of the human genome (hg_g1k_b37). This will allow you to use the indexes already in place. If using other tools, then hg19 and hg38 are also choices.

In short, decide which target genome to use (human or other) based on what is available for the other inputs you plan to use (reference annotation datasets, such as dbSNP and others). The availability of these can vary by genome and genome build. All inputs must be based on the exact same genome build. Once you know the inputs, then map. If you wait to look for downstream inputs until after mapping, you may find that what is available (or the best choice) is not a match for the build you selected for mapping, which means starting over - that is never fun.
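One cheap way to catch a build mismatch early is to compare chromosome names between the reference and an annotation file before mapping. The Python sketch below is only an illustration under some assumptions: the helper names are hypothetical, the reference names come from a samtools-style `.fai` index (name in the first tab-separated column), and the annotation is an uncompressed VCF. Note that matching names are necessary but not sufficient: hg_g1k_b37 uses names like `1` while hg19 uses `chr1`, but hg19 and hg38 both use `chr1`, so still confirm the build from the file's documentation or header.

```python
def fai_contigs(lines):
    """Contig names from a .fai index: the first tab-separated column."""
    return {line.split("\t", 1)[0].strip() for line in lines if line.strip()}


def vcf_contigs(lines):
    """Contig names used by VCF data rows (lines starting with '#' are header)."""
    return {
        line.split("\t", 1)[0].strip()
        for line in lines
        if line.strip() and not line.startswith("#")
    }


def missing_from_reference(ref_names, annot_names):
    """Annotation contigs that the reference does not contain, sorted."""
    return sorted(annot_names - ref_names)


if __name__ == "__main__":
    # Tiny inline example; real use would pass open file handles instead.
    fai = ["1\t249250621\t52\t60\t61\n", "2\t243199373\t253404903\t60\t61\n"]
    vcf = [
        "##fileformat=VCFv4.1\n",
        "#CHROM\tPOS\tID\tREF\tALT\n",
        "chr1\t100\t.\tA\tG\n",
        "2\t200\t.\tC\tT\n",
    ]
    print(missing_from_reference(fai_contigs(fai), vcf_contigs(vcf)))
```

Any names reported here mean the annotation and the reference disagree on naming, which is a strong sign that they come from different builds or sources.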

Good luck with your project, Jen, Galaxy team

written 3.6 years ago by Jennifer Hillman Jackson ♦ 25k