Question: SAMtools Merge BAM files hangs
3.3 years ago
kr81 wrote:

Hi all,

I am using TopHat to map RNA-seq reads. My study species is a non-model organism, so I created a custom genome build for the mapping. The first time I tried to map, I got an error message (reference sequence has more than 2^32-1 characters), so I split the genome in two, created two builds, and mapped the reads to each build separately. I now have two accepted_hits BAM files that I want to join before performing further analyses. I tried SAMtools > Merge BAM Files (Galaxy GVL-QLD mirror), but the job ran for a week without completing. I think it might be related to this issue, as the reference genome contains over 1 million contigs, but I'm not sure. Can anyone suggest a workaround? I would prefer to keep using the Galaxy GUI rather than the command-line version of SAMtools if possible.
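For anyone wanting to check a reference against the limit quoted in the error message before building an index, here is a minimal sketch. It assumes a plain FASTA input; the function names and the tiny in-memory example are hypothetical and are not part of TopHat, Galaxy, or SAMtools.

```python
import io

# Limit quoted in the error message: 2^32 - 1 characters.
MAX_REFERENCE_CHARS = 2**32 - 1


def fasta_total_length(handle):
    """Sum the sequence characters in a FASTA stream, skipping header lines."""
    total = 0
    for line in handle:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


def exceeds_limit(total_length, limit=MAX_REFERENCE_CHARS):
    """Return True if the reference is longer than the quoted limit."""
    return total_length > limit


# Hypothetical two-contig reference held in memory for illustration.
example = io.StringIO(">contig1\nACGT\nAC\n>contig2\nGGG\n")
total = fasta_total_length(example)
print(total, exceeds_limit(total))
```

For a real genome, the same function could be given an open file handle instead of the in-memory example.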



galaxy samtools bam
3.3 years ago
Jennifer Hillman Jackson wrote:


The two mapped datasets are based on different reference genomes (the two halves), so they cannot be used together in the same analysis. Most current bioinformatics computation relies on data positions relative to a common reference genome (or transcriptome, exome, etc.). Without a common reference, little can be done across datasets (only calculations that do not involve positional information).
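To illustrate what "a common reference" means in practice, here is a rough sketch that compares the reference sequences declared in two SAM/BAM headers. It assumes the header text has already been exported as plain text (the `@SQ` lines with their `SN` name and `LN` length fields); the function names and the example headers are hypothetical.

```python
def reference_lengths(sam_header_text):
    """Collect {sequence name: length} from the @SQ lines of a SAM header."""
    refs = {}
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields after the record type are tab-separated TAG:VALUE pairs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        refs[fields["SN"]] = int(fields["LN"])
    return refs


def same_reference(header_a, header_b):
    """True if both headers declare the same sequence names and lengths.

    Note: this ignores the order of the @SQ lines.
    """
    return reference_lengths(header_a) == reference_lengths(header_b)


# Hypothetical headers standing in for the two genome halves.
half_one = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:contig1\tLN:1000\n"
half_two = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:contig2\tLN:2500\n"
print(same_reference(half_one, half_two))
```

Two files mapped against different halves would declare different sequence sets, as in the example headers here.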

Section 2.8 of the Galaxy support wiki explains alternatives for working with data/jobs that exceed the compute resources at

Best, Jen, Galaxy team

