EBI SRA Data Import


Question: EBI SRA Data Import

3.8 years ago, prahkingsworth wrote:

Hi guys

I have just imported these datasets into Galaxy in fastq format

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE4620

which focuses on several chromosomes.

I want to convert these imported fastq files into BAM files and focus on just one chromosome, for example chromosome 5, using Galaxy for RNA-Seq analysis. The main reason for focusing on a single chromosome is that it will reduce the size of the imported fastq files, making it easier to focus on a single study.

How can this be done using Galaxy?

Thanks in advance

Rex


3.8 years ago, Jennifer Hillman Jackson wrote:

Hello,

You can convert a fastq file to BAM format, but the result will not contain chromosome mapping information.

To obtain that, map the full dataset, then filter the results for hits to the target chromosome. Then go back and extract just the fastq sequences for those hits, creating the final .fastq dataset to use as the input to your analysis. This creates slightly skewed input: only sequences that map will be retained, so the unmapped sequences that would normally make up some fraction of the input will be missing. This may or may not matter to you, and you could always seed back in some unmapped sequences at the same fraction found in the original dataset.
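For anyone who wants to see the logic of the map, filter, and extract steps outside of Galaxy, here is a minimal sketch in plain Python. It is only an illustration, not how any Galaxy tool is implemented. It assumes the mapping results are available as SAM text lines, that the fastq file uses the standard 4 lines per record, and that read names match between the two files. The function names (`names_mapped_to`, `filter_fastq`, `unmapped_to_seed`) are made up for this example.

```python
def names_mapped_to(sam_lines, chrom):
    """Return the set of read names that have a mapped alignment on `chrom`.

    Assumes tab-separated SAM text: column 1 is the read name, column 2 the
    bitwise FLAG, column 3 the reference (chromosome) name.
    """
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line, no alignment here
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & 0x4:  # FLAG bit 0x4 means the read is unmapped
            continue
        if rname == chrom:
            names.add(qname)
    return names


def filter_fastq(fastq_lines, keep_names):
    """Yield 4-line fastq records whose read name is in `keep_names`.

    Assumes every record is exactly 4 lines and the read name is the first
    whitespace-delimited token after the leading '@'.
    """
    record = []
    for line in fastq_lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            tokens = record[0][1:].split()
            if tokens and tokens[0] in keep_names:
                yield record
            record = []


def unmapped_to_seed(n_kept, unmapped_fraction):
    """How many unmapped reads to add back so that they make up
    `unmapped_fraction` of the final dataset (hypothetical helper).

    Solves U / (n_kept + U) = unmapped_fraction for U.
    """
    return round(n_kept * unmapped_fraction / (1.0 - unmapped_fraction))
```

Usage would look something like `names = names_mapped_to(open("mapped.sam"), "chr5")` followed by writing out the records from `filter_fastq(open("reads.fastq"), names)`. The file names and the chromosome label `chr5` are placeholders; the label has to match whatever the reference genome actually calls that chromosome.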

Alternatively, you could create a custom reference genome containing just the target chromosome and map against that. The job will run faster. However, you will almost certainly get slightly different results, since reads that originate from other chromosomes no longer have their true location to map to and some may align to the target chromosome instead. Perhaps try both methods on one dataset, see which works best for you, then use that method with the others.
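As a rough sketch of what "a custom reference genome with just the target chromosome" means in practice, the snippet below pulls one sequence out of a multi-sequence FASTA file. It is an illustration only and assumes the chromosome name is the first word on the `>` title line; the function name `extract_chromosome` and the label `chr5` are made up for this example.

```python
def extract_chromosome(fasta_lines, chrom):
    """Yield the title line and sequence lines of the record named `chrom`.

    Assumes the record name is the first whitespace-delimited token after
    the leading '>' on each FASTA title line.
    """
    keep = False
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            keep = (name == chrom)  # exact match, so chr5 does not match chr50
        if keep:
            yield line
```

Writing the yielded lines to a new file gives a single-chromosome FASTA that could then be uploaded and used as the custom reference when mapping.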

Best, Jen, Galaxy team
