Question: Configuring indices for Bismark
Asked 3.7 years ago by Mark Crowe (QFAB, Brisbane):

Hi, I'm trying to get Bismark running on my Galaxy instance, but am having problems which I think are related to the location of the indices. For various permissions reasons, I can't put the Bismark indices in the normal Galaxy Bowtie directory, so instead I've created the tool-data/bowtie_indices.loc file within the Bismark tool directory specifying a different location for these files. But when I run Bismark (from within Galaxy) I get an empty output file with a message in the history entry listing the reference genome directory as the default Galaxy one, not the one specified in the Bismark bowtie_indices.loc (full output given below - the apparent truncation at the end is as it is shown in the history).

The history entry is green, suggesting that whatever has gone wrong, the tool wrapper has returned a success value back to Galaxy. Also the program works fine from the command line. Any suggestions please? Thanks very much.

Error in bismark: 
Path to Bowtie specified as: bowtie 
Alignments will be written out in BAM format. Samtools found here: '/usr/local/bin/samtools' 
Reference genome folder provided is /mnt/galaxyIndices/genomes/hg19/bowtie_index/ (absolute path is '/mnt/g
bowtie bismark indexes • 1.4k views
Answer written 3.7 years ago by Bjoern Gruening (Germany):

Hi Mark,

My fault! The wrapper does indeed assume that the Bismark indices are located next to the Bowtie indices, in a separate folder. Could you please send me your *.loc file and the ls/pwd output for your Bismark index directory?

Thanks,

Bjoern


Hi Bjoern

Thanks for that. My hg19 default bowtie directory is /mnt/galaxyIndices/genomes/hg19/bowtie_index/, containing files hg19.*.ebwt, and the relevant line in galaxy/galaxy-app/tool-data/bowtie_indices.loc is 

hg19    hg19    Human (hg19)    /mnt/galaxyIndices/genomes/hg19/bowtie_index/hg19

The location of the "Bisulfite_Genome" folder, containing my Bismark indices, is /mnt/galaxy/bismarkIndices/seq, and the entry in galaxy/shed_tools/toolshed.g2.bx.psu.edu/repos/bgruening/bismark/0f8646f22b8d/bismark/tool-data/bowtie_indices.loc is

hg19    hg19    Human (hg19)    /mnt/galaxy/bismarkIndices/hg19/seq

Looking at those two .loc files in more detail now, though, have I formatted the Bismark one correctly? The conventional .loc file includes the index prefix (hg19, referring to the hg19.*.ebwt files), but the Bismark .loc file only points to the directory. Then again, the Bisulfite_Genome folder seems to be structured differently, with the CT_ and GA_ conversion subdirectories and the BS_CT prefix, so I'm not sure what the appropriate prefix would be. Anyway, hopefully you can make sense of my ramblings here and point me in the right direction.

Cheers

Mark

Reply written 3.7 years ago by Mark Crowe (modified 3.7 years ago)

Hi Mark,

Please try adding the hg19 prefix to the path in your .loc file. When you submit the job, the command line should then contain something like this:

--indexes-path /data/db/reference_genomes/hg19/bowtie_index/hg19

And the data storage should look like this:

ls -l /data/db/reference_genomes/hg19/bowtie2_index/
Bisulfite_Genome/ hg19.1.bt2        hg19.2.bt2        hg19.3.bt2        hg19.4.bt2        hg19chrM/         hg19.fa           hg19.rev.1.bt2    hg19.rev.2.bt2
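
The layout above can be checked with a short script before submitting a job. This is only a sketch under assumptions: it takes a .loc entry to be tab-separated with the index path in the fourth column (as in the entries quoted earlier in the thread), and it assumes the wrapper looks for a Bisulfite_Genome folder in the directory that holds the prefixed index files. The function names are hypothetical and are not part of Galaxy or Bismark.

```python
import os


def index_path_from_loc_line(line):
    """Return the 4th tab-separated column of a .loc entry (the index path)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        raise ValueError("expected at least 4 tab-separated columns: %r" % line)
    return fields[3]


def check_bismark_layout(index_path):
    """Describe the directory implied by an index path that includes the prefix.

    Assumption: Bisulfite_Genome should sit next to the prefixed index files,
    e.g. for ".../bowtie_index/hg19" we look for ".../bowtie_index/Bisulfite_Genome".
    """
    index_dir = os.path.dirname(index_path)
    prefix = os.path.basename(index_path)
    bisulfite_dir = os.path.join(index_dir, "Bisulfite_Genome")
    return {
        "index_dir": index_dir,
        "prefix": prefix,
        "has_bisulfite_genome": os.path.isdir(bisulfite_dir),
    }
```

Calling `check_bismark_layout(index_path_from_loc_line(line))` on each non-comment line of the .loc file should show at a glance whether an entry is missing the prefix or points at a directory without the converted genome.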

Ciao,

Bjoern

Reply written 3.7 years ago by Bjoern Gruening

Thanks Bjoern, that's solved that problem - Bismark is now finding the indices correctly. Sadly, I still can't get it working properly - I'm now getting a "tool error":

Traceback (most recent call last):
  File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/runners/__init__.py", line 564, in finish_job
    job_state.job_wrapper.finish( stdout, stderr, exit_code )
  File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/__init__.py", line 1107, in finish
    dataset.datatype.set_meta( dataset, overwrite=False )  # call datatype.set_meta directly for the initial set_meta call during dataset creation
  File "/mnt/galaxy/galaxy-app/lib/galaxy/datatypes/binary.py", line 250, in set_meta
    raise Exception, "Error Setting BAM Metadata: %s" % stderr
Exception: Error Setting BAM Metadata: [bam_index_core] the alignment is not sorted (SRR020138.15024362_SALK_2029:7:100:1673:461_length=86): 131526433 > 89867996 in 19-th chr

I don't know whether that is related to Bismark or is another configuration problem in my Galaxy instance. Samtools is installed and in $PATH for the Galaxy user, and the output .bam file is generated (the "Full path" field in the history information panel points to /mnt/galaxy/files/000/dataset_111.dat, which is a valid, if unsorted, .bam file). I'm happy to close this thread and open a new one for that separate problem if that would be more appropriate.

 

Reply written 3.6 years ago by Mark Crowe

Which version of samtools is in your PATH? Can you test a different tool that produces a BAM file? Maybe SAM to BAM?
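
One way to answer the version question from a script is to parse the text that samtools prints. The sketch below only does the parsing. As far as I know, legacy 0.1.x releases print a usage banner containing a `Version:` line when run without arguments, while 1.x releases answer `samtools --version` with a line starting `samtools 1.x`; treat both formats as assumptions and check them against your own installation. The function name is hypothetical.

```python
import re


def parse_samtools_version(text):
    """Extract a version tuple such as (0, 1, 19) from samtools output text.

    Returns None if no version-like string is found.
    """
    match = re.search(r"(?:Version:|samtools)\s+(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))
```

Version tuples compare element by element, so `parse_samtools_version(banner) < (0, 1, 19)` flags an installation older than the one that resolved the metadata error in this thread.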

Reply written 3.6 years ago by Bjoern Gruening

That was it, sorry about that - I had samtools 0.1.14, not 0.1.19. Updated now, and I get a green history entry, although it still reports that the BAM file is unsorted and therefore cannot be indexed. It still works as input for the Bismark Meth. Extractor; it just needs a bit of manipulation before it will open in IGV.
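
The manipulation needed for IGV is typically a coordinate sort followed by indexing. Below is a minimal sketch that only builds the two samtools command lines, writing a sorted copy and leaving the original Bismark output untouched for the methylation extractor. The `legacy` flag is an assumption about which samtools generation is installed: 0.1.x releases take an output prefix (`samtools sort in.bam out_prefix`), while 1.x releases take `-o out.bam`. The function name and file names are hypothetical.

```python
def build_sort_and_index_commands(unsorted_bam, sorted_bam, legacy=False):
    """Return [sort_cmd, index_cmd] as argument lists for subprocess.

    legacy=True builds the samtools 0.1.x form, which takes an output
    prefix and appends ".bam" itself; otherwise the 1.x "-o" form is built.
    """
    if legacy:
        # Strip a trailing ".bam" so samtools 0.1.x does not produce "x.bam.bam".
        prefix = sorted_bam[:-4] if sorted_bam.endswith(".bam") else sorted_bam
        sort_cmd = ["samtools", "sort", unsorted_bam, prefix]
        sorted_bam = prefix + ".bam"
    else:
        sort_cmd = ["samtools", "sort", "-o", sorted_bam, unsorted_bam]
    index_cmd = ["samtools", "index", sorted_bam]
    return [sort_cmd, index_cmd]
```

Each returned list can be passed to `subprocess.check_call` in turn; the resulting sorted BAM and its index are what IGV needs side by side.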

Reply written 3.6 years ago by Mark Crowe