Question: Problem with TopHat in Galaxy using data from Nature Protocols
skkim0217 (Korea, Republic Of) wrote, 4.5 years ago:

Hi, I am a graduate student who began RNA-seq data analysis recently.

I am having trouble testing an RNA-seq analysis with the data set provided in Nature Protocols (Trapnell et al., "Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks").

In the TopHat step, I chose to use my own junctions and supplied the annotation from the Ensembl fruit fly gene set (in GTF format; ftp://ftp.ensembl.org/pub/release-75/gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP5.75.gtf.gz). At first it seemed to run fine, but after a couple of minutes this error message showed up:

 

Fatal error: Tool execution failed
[2014-06-17 06:29:59] Beginning TopHat run (v2.0.9)
-----------------------------------------------
[2014-06-17 06:29:59] Checking for Bowtie
    Bowtie version: 2.1.0.0
[2014-06-17 06:29:59] Checking for Samtools
    Samtools version: 0.1.18.0
[2014-06-17 06:29:59] Checking for Bowtie index files (genome)..
[2014-06-17 06:29:59] Checking for reference FASTA file
[2014-06-17 06:29:59] Generating SAM header for /galaxy/data/dm3/bowtie2_index/dm3
    format: fastq
    quality scale: phred33 (default)
[2014-06-17 06:30:01] Reading known junctions from GTF file
[2014-06-17 06:30:05] Preparing reads
    left reads: min. length=75, max. length=75, 11607353 kept reads (0 discarded)
    right reads: min. length=75, max. length=75, 11607353 kept reads (0 discarded)
[2014-06-17 06:32:35] Building transcriptome data files..
[2014-06-17 06:32:39] Building Bowtie index from dataset_8400771.fa
    [FAILED]
Error: Couldn't build bowtie index with err = 1

 

While looking for a solution to this problem, I came across this comment:

The gtf and the reference fasta files identifiers must be the same. Consider to update the chromosome/contig names in all your annotation files
(gtf, gff, dbsnp vcf, etc)

I do not fully understand it, but it seems there is a mismatch between the gene annotation I uploaded and the reference genome in Galaxy. Could you help me with this? Was this error caused by uploading the wrong gene data set?

Jennifer Hillman Jackson (United States) wrote, 4.5 years ago:

Hello,

Are you also using a custom reference genome? Either way, there is almost certainly a mismatch between the identifiers in the GTF file and those in the reference genome. Here is an explanation and some troubleshooting help:
https://wiki.galaxyproject.org/Support#Detecting_Genome_Mismatch_Problems

The UCSC "dm3" genome is available at http://usegalaxy.org and the matching GTF file at the Cufflinks web site under iGenomes (also in link above) if you want to try that instead.

Best, Jen, Galaxy team
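
For readers who want to check for such a mismatch themselves, here is a minimal sketch in Python that compares the sequence names in a FASTA file with the names in the first column of a GTF file. It is not part of Galaxy or TopHat. The function names (fasta_names, gtf_names, compare_names) and the tiny demo inputs are made up for illustration. The demo assumes the usual naming difference between the two sources: Ensembl annotation typically names chromosomes without a "chr" prefix (for example 2L), while UCSC builds such as dm3 use chr2L.

```python
import io


def fasta_names(handle):
    """Collect sequence identifiers (first word after '>') from a FASTA stream."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_names(handle):
    """Collect the values of column 1 (the sequence name) from a GTF stream."""
    names = set()
    for line in handle:
        # Skip blank lines and comment/header lines
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_handle, gtf_handle):
    """Return (shared, only_in_gtf, only_in_fasta) as sorted lists."""
    fa = fasta_names(fasta_handle)
    gtf = gtf_names(gtf_handle)
    return sorted(fa & gtf), sorted(gtf - fa), sorted(fa - gtf)


# Made-up inputs for illustration only: UCSC-style FASTA, Ensembl-style GTF
demo_fasta = io.StringIO(">chr2L\nACGT\n>chrX\nACGT\n")
demo_gtf = io.StringIO(
    '2L\tprotein_coding\texon\t1\t4\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
    'X\tprotein_coding\texon\t1\t4\t.\t-\t.\tgene_id "g2"; transcript_id "t2";\n'
)

shared, only_gtf, only_fasta = compare_names(demo_fasta, demo_gtf)
print("shared:", shared)
print("only in GTF:", only_gtf)
print("only in FASTA:", only_fasta)
```

With real data, pass open file handles instead of the StringIO objects (for a .gz file, gzip.open(path, "rt") from the standard library should work). An empty "shared" list means none of the annotated features can be placed on the reference, which is consistent with the kind of mismatch described above.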
