Question: Running TopHat2 with a GTF file
3.6 years ago, gkuffel22 (United States) wrote:

Hi everyone,

I am having issues running TopHat 2. In Galaxy, I added the mouse (mm10) reference genome FASTA file and the corresponding GTF file from UCSC to my history. Looking closer at the two files, both are from Ensembl and use the same notation (chr1). However, when I run TopHat 2 I get this error: "Couldn't build bowtie index with err = 1".

The first line of each file looks like this:

Fasta:

>mm10_ensGene_ENSMUST00000086465 range=chr1:134199223-134235431 5'pad=0 3'pad=0 strand=- repeatMasking=none

GTF:

chr1 mm10_ensGene stop_codon 134202951 134202953 0.000000 - . gene_id "ENSMUST00000086465"; transcript_id "ENSMUST00000086465";

tophat rnaseq
3.6 years ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

The reference genome (Custom?) is in Ensembl format, but the reference annotation has UCSC chromosome identifiers: it is based on mm10, although the track contents are from Ensembl. The identifiers in the two files must match exactly.
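One quick way to see a mismatch like this before running TopHat 2 is to compare the sequence identifiers in the FASTA headers with the first column of the GTF. The following is only a minimal sketch: the function names are made up for illustration, and the two example lines are abbreviated from the question above. It assumes the FASTA identifier is the first word after ">" and that the GTF sequence name is the first whitespace-delimited field.

```python
def fasta_ids(lines):
    """Collect sequence identifiers: the first word after '>' on header lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gtf_seqnames(lines):
    """Collect sequence names from the first column of GTF lines."""
    names = set()
    for line in lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split(None, 1)[0])
    return names


def compare_identifiers(fasta_lines, gtf_lines):
    """Report which identifiers are shared and which appear in only one file."""
    f = fasta_ids(fasta_lines)
    g = gtf_seqnames(gtf_lines)
    return {"shared": f & g, "fasta_only": f - g, "gtf_only": g - f}


# Example lines abbreviated from the question (illustrative only).
example_fasta = [
    ">mm10_ensGene_ENSMUST00000086465 range=chr1:134199223-134235431",
    "ACGTACGT",
]
example_gtf = [
    'chr1\tmm10_ensGene\tstop_codon\t134202951\t134202953\t0.000000\t-\t.\t'
    'gene_id "ENSMUST00000086465"; transcript_id "ENSMUST00000086465";',
]

report = compare_identifiers(example_fasta, example_gtf)
print(report)
```

For real files, the same functions can be given open file handles, since they only iterate over lines. An empty "shared" set means none of the annotation's sequence names exist in the FASTA file.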

The error indicates a format error in the fasta file. Here is more about custom reference genomes:
http://wiki.galaxyproject.org/Support Section 2.14
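As a rough illustration of what a format check on a fasta file could look like, here is a small sketch that flags a few common problems: blank lines, duplicate or missing identifiers, records with no sequence, and unexpected characters. The function name and the particular set of checks are assumptions made for this example, not the list from the wiki page above, and passing them does not guarantee that the index will build.

```python
import re

# IUPAC nucleotide codes, upper or lower case.
VALID_SEQ = re.compile(r"^[ACGTUNRYKMSWBDHVacgtunrykmswbdhv]+$")


def check_fasta(lines):
    """Return a list of human-readable problems found in FASTA lines."""
    problems = []
    seen = set()
    current = None   # identifier of the record being read
    length = 0       # sequence length accumulated for that record
    for n, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            problems.append(f"line {n}: blank line")
            continue
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if current is not None and length == 0:
                problems.append(f"record {current!r}: no sequence")
            fields = line[1:].split()
            current = fields[0] if fields else ""
            length = 0
            if not current:
                problems.append(f"line {n}: header has no identifier")
            elif current in seen:
                problems.append(f"line {n}: duplicate identifier {current!r}")
            seen.add(current)
        elif current is None:
            problems.append(f"line {n}: sequence data before first '>' header")
        else:
            if not VALID_SEQ.match(line):
                problems.append(f"line {n}: unexpected characters in sequence")
            length += len(line)
    if current is not None and length == 0:
        problems.append(f"record {current!r}: no sequence")
    return problems


# A deliberately broken toy example (not taken from the thread).
toy = [">chr1", "ACGT", "", ">chr1", ">chr2", "ACXT"]
for problem in check_fasta(toy):
    print(problem)
```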

I would suggest getting mm10 from the UCSC downloads area and indexing it on your server (local?). There is a data manager in the Tool Shed that can be used for both genome retrieval and index creation.

Best, Jen, Galaxy team