Question: BAM file not working with Cufflinks or IGV
4.1 years ago
United States
dluesse10 wrote:

Hello Galaxy Team,

I am a complete novice at this, and have been unable to find the answer to my question on the message boards, so I'm hoping you can help me.  I have RNAseq data from an Illumina HiSeq.  Part of the results I obtained were BAM files generated by the sequencing facility, aligned to the Arabidopsis genome (TAIR 10).  I have uploaded these to Galaxy. 

I am trying to use Cufflinks to analyze differential expression in my samples.  I uploaded the Arabidopsis genome GTF file from Illumina and attempted to run Cufflinks on the BAM file.  However, I get this error:

Error running cufflinks.
return code = 1
Command line:
cufflinks -q --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8 -G /galaxy-repl/main/files/009/370/dataset_9370819.dat -u -N -b /galaxy/data/Arabidopsis_thaliana_TAIR10/sam_index/Arabidopsis_thaliana_TAIR10.fa /galaxy-repl/main/files/009/319/dataset_9319592.dat 
[14:48:46] Loading reference annotation and sequence.
Error parsing strand (2) from GFF line:
2	AT1G71695.1	1	+	26964248	26966688	26964358	26966557	3	26964248,26965590,26965942,	26964625,26965785,26966688,	0	AT1G71695	unk	unk	0,

However, if I run cufflinks without the reference genome, it works just fine. 
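
A quick way to see whether an uploaded annotation is really in GTF format is to check its first records. GTF records have nine tab-separated fields, with the strand (`+`, `-`, or `.`) in the seventh; the line quoted in the error above has sixteen fields with `+` in the fourth, so it would fail this check. The sketch below is only an illustration: the function name `looks_like_gtf` and the `max_records` limit are made up here and are not part of Cufflinks or Galaxy.

```python
def looks_like_gtf(lines, max_records=100):
    """Return True if the first records look like GTF.

    A record passes when it has exactly 9 tab-separated fields and the
    7th field (strand) is '+', '-', or '.'. Blank lines and '#' comment
    lines are skipped. `max_records` is an arbitrary illustrative limit.
    """
    checked = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9 or fields[6] not in {"+", "-", "."}:
            return False
        checked += 1
        if checked >= max_records:
            break
    # An input with no records at all is not treated as GTF.
    return checked > 0
```

The function accepts any iterable of lines, so an open file handle on the downloaded annotation can be passed in directly.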

I have tried downloading different versions of the reference genome.  I have also tried converting the BAM to a SAM file, sorting it for Cufflinks using the published workflow, and using that file.  I always get the same error. 

I have also tried running the published workflow with the GTF file from Ensembl.  The result was also an error, but the error output contained only random characters. 

In what may be a related problem, I cannot see any assembled reads when I try to view the file in IGV through Galaxy.  However, when I use IGV on my local file, using the .bai file generated with the sequence, the transcripts show up just fine.  When I download the BAM file I previously uploaded to Galaxy and the corresponding index generated by Galaxy, I once again see no reads. 

If someone can point out the rookie mistake I'm making, I would be very grateful!



rna-seq cufflinks galaxy bam
4.1 years ago
United States
Jennifer Hillman Jackson wrote:


I would start by confirming that every input is based on the identical reference genome: the one the BAM file was aligned against, the one the annotation describes, and the one selected when the tool is run. Help to do that is available here:

Thanks, Jen, Galaxy team
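
One part of that check, comparing chromosome names between the alignment and the annotation, can be sketched in a few lines. This assumes the BAM header has been dumped to text first (for example with `samtools view -H`) and that the annotation is a GTF file; the function names below are hypothetical, and matching names alone do not prove the two files come from the same genome build.

```python
def sam_header_names(header_lines):
    """Collect reference sequence names from the @SQ lines of a SAM header."""
    names = set()
    for line in header_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_seqnames(gtf_lines):
    """Collect the first column (sequence name) of every GTF record."""
    names = set()
    for line in gtf_lines:
        line = line.rstrip("\n")
        if line and not line.startswith("#"):
            names.add(line.split("\t", 1)[0])
    return names


def compare_names(header_lines, gtf_lines):
    """Return (names only in the BAM header, names only in the GTF)."""
    bam = sam_header_names(header_lines)
    gtf = gtf_seqnames(gtf_lines)
    return sorted(bam - gtf), sorted(gtf - bam)
```

If both returned lists are empty, the two files use the same sequence names. Anything left over (such as `Chr1` in one file and `1` in the other, to give a purely illustrative case) means the tool cannot match annotation records to alignments.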

4.1 years ago
United States
dluesse10 wrote:

OK, I discovered the problem.  I was not unpacking the .tar file: I uploaded the genome archive straight from iGenomes to Galaxy, so the dataset I gave Cufflinks was not a GTF file.  

For anyone who comes across this problem in the future, here is what you should do:

1) Download the .tar file from iGenomes to your computer.  

2) Download 7-Zip.  It is free.  

3) Right click on the genome file, select "7-Zip" and then "Extract files."  I'd pick a new folder to put them in.  

4) Navigate through the maze of files to find the most recent .gtf.  Upload that to Galaxy.  
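
For anyone who would rather script steps 3 and 4 than click through folders, here is a minimal sketch using Python's standard `tarfile` module. The function names and any archive or folder paths you pass in are placeholders chosen for illustration; the layout inside the archive is not assumed, so the first function simply lists every `.gtf` it finds and leaves the choice to you.

```python
import os
import shutil
import tarfile


def list_gtf_members(archive_path):
    """Return the names of all regular files ending in .gtf inside the archive."""
    with tarfile.open(archive_path) as archive:  # compression is auto-detected
        return [
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.lower().endswith(".gtf")
        ]


def extract_member(archive_path, member_name, dest_dir):
    """Copy one regular file out of the archive into dest_dir; return its new path.

    Only the file's base name is kept, so nothing is written outside dest_dir.
    """
    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, os.path.basename(member_name))
    with tarfile.open(archive_path) as archive:
        source = archive.extractfile(member_name)
        if source is None:
            raise ValueError(f"{member_name} is not a regular file in the archive")
        with source, open(dest_path, "wb") as out:
            shutil.copyfileobj(source, out)
    return dest_path
```

Call `list_gtf_members` on the downloaded archive, pick the entry you want, and pass its name to `extract_member`. The file it writes is the one to upload to Galaxy.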



