Hi
I'm trying to run TopHat on a CloudMan Galaxy instance and I keep getting the error below. I've run TopHat on this FASTQ file several times before with great results. The only thing that has changed is that I'm now using a mouse mm10 FASTA file (one that has been used elsewhere with success) for the genome alignment instead of the built-in Galaxy mouse mm10 genome.
Here's the error report:
Fatal error: Tool execution failed
Building a SMALL index
[2015-07-24 17:25:11] Beginning TopHat run (v2.0.14)
-----------------------------------------------
[2015-07-24 17:25:11] Checking for Bowtie
Bowtie version: 2.2.5.0
[2015-07-24 17:25:12] Checking for Bowtie index files (genome)..
[2015-07-24 17:25:12] Checking for reference FASTA file
[2015-07-24 17:25:12] Generating SAM header for genome
[2015-07-24 17:26:36] Reading known junctions from GTF file
[2015-07-24 17:26:56] Preparing reads
left reads: min. length=50, max. length=50, 40744020 kept reads (126243 discarded)
[2015-07-24 17:36:05] Building transcriptome data files ./tophat_out/tmp/dataset_5000
[2015-07-24 17:36:27] Building Bowtie index from dataset_5000.fa
[FAILED]
Error: Couldn't build bowtie index with err = 1
[bam_header_read] bgzf_check_EOF: Invalid argument

The tool produced the following additional output:
Settings: Output files: "genome.*.bt2" Line rate: 6 (line is 64 bytes) Lines per side: 1 (side is 64 bytes) Offset rate: 4 (one in 16) FTable chars: 10 Strings: unpacked Max bucket size: default Max bucket size, sqrt multiplier: default Max bucket size, len divisor: 4 Difference-cover sample period: 1024 Endianness: little Actual local endianness: little Sanity checking: disabled Assertions: disabled Random seed: 0 Sizeofs: void*:8, int:4, long:8, size_t:8 Input files DNA, FASTA: /mnt/galaxy/files/005/dataset_5100.dat Reading reference sizes Time reading reference sizes: 00:01:22 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:29 bmax according to bmaxDivN setting: 663195875 Using parameters --bmax 497396907 --dcv 1024 Doing ahead-of-time memory usage test Passed! Constructing with these parameters: --bmax 497396907 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:01:54 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:29 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:54 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:02:00 Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 6; iterating... 
Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:01:43 Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 3.78969e+08 (target: 497396906) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 7 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:37 Sorting block of length 470172614 (Using difference cover) Sorting block time: 00:09:28 Returning block of 470172615 Getting block 2 of 7 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:44 Sorting block of length 392875285 (Using difference cover) Sorting block time: 00:07:56 Returning block of 392875286 Getting block 3 of 7 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:43 Sorting block of length 287831635 (Using difference cover) Sorting block time: 00:05:43 Returning block of 287831636 Getting block 4 of 7 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:52 Sorting block of length 267502683 (Using difference cover) Sorting block time: 00:05:19 Returning block of 267502684 Getting block 5 of 7 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:48 Sorting block of length 429791782 (Using difference cover) Sorting block 
time: 00:08:39 Returning block of 429791783 Getting block 6 of 7 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:51 Sorting block of length 482842074 (Using difference cover) Sorting block time: 00:09:57 Returning block of 482842075 Getting block 7 of 7 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:36 Sorting block of length 321767421 (Using difference cover) Sorting block time: 00:06:37 Returning block of 321767422 Exited Ebwt loop fchr[A]: 0 fchr[C]: 773280124 fchr[G]: 1325927941 fchr[T]: 1878618059 fchr[$]: 2652783500 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 888467894 bytes to primary EBWT file: genome.1.bt2 Wrote 663195880 bytes to secondary EBWT file: genome.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 2652783500 bwtLen: 2652783501 sz: 663195875 bwtSz: 663195876 lineRate: 6 offRate: 4 offMask: 0xfffffff0 ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 165798969 offsSz: 663195876 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 13816581 numLines: 13816581 ebwtTotLen: 884261184 ebwtTotSz: 884261184 color: 0 reverse: 0 Total time for call to driver() for forward index: 01:18:04 Reading reference sizes Time reading reference sizes: 00:00:27 Calculating joined length Writing header Reserving space for joined string Joining reference sequences Time to join reference sequences: 00:00:29 Time to reverse reference sequence: 00:00:04 bmax according to bmaxDivN setting: 663195875 Using parameters --bmax 497396907 --dcv 1024 Doing ahead-of-time memory usage test Passed! 
Constructing with these parameters: --bmax 497396907 --dcv 1024 Constructing suffix-array element generator Building DifferenceCoverSample Building sPrime Building sPrimeOrder V-Sorting samples V-Sorting samples time: 00:01:55 Allocating rank array Ranking v-sort output Ranking v-sort output time: 00:00:29 Invoking Larsson-Sadakane on ranks Invoking Larsson-Sadakane on ranks time: 00:00:53 Sanity-checking and returning Building samples Reserving space for 12 sample suffixes Generating random suffixes QSorting 12 sample offsets, eliminating duplicates QSorting sample offsets, eliminating duplicates time: 00:00:00 Multikey QSorting 12 samples (Using difference cover) Multikey QSorting samples time: 00:00:00 Calculating bucket sizes Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:01:55 Splitting and merging Splitting and merging time: 00:00:00 Split 1, merged 6; iterating... Binary sorting into buckets 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Binary sorting into buckets time: 00:01:43 Splitting and merging Splitting and merging time: 00:00:00 Avg bucket size: 3.31598e+08 (target: 497396906) Converting suffix-array elements to index image Allocating ftab, absorbFtab Entering Ebwt loop Getting block 1 of 8 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:35 Sorting block of length 481244811 (Using difference cover) Sorting block time: 00:09:55 Returning block of 481244812 Getting block 2 of 8 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:43 Sorting block of length 401432159 (Using difference cover) Sorting block time: 00:08:10 Returning block of 401432160 Getting block 3 of 8 Reserving size (497396907) for 
bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:45 Sorting block of length 299524672 (Using difference cover) Sorting block time: 00:06:00 Returning block of 299524673 Getting block 4 of 8 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:50 Sorting block of length 371622119 (Using difference cover) Sorting block time: 00:07:31 Returning block of 371622120 Getting block 5 of 8 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:46 Sorting block of length 192869528 (Using difference cover) Sorting block time: 00:03:46 Returning block of 192869529 Getting block 6 of 8 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:51 Sorting block of length 363313281 (Using difference cover) Sorting block time: 00:07:26 Returning block of 363313282 Getting block 7 of 8 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:47 Sorting block of length 286950766 (Using difference cover) Sorting block time: 00:05:43 Returning block of 286950767 Getting block 8 of 8 Reserving size (497396907) for bucket Calculating Z arrays Calculating Z arrays time: 00:00:00 Entering block accumulator loop: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Block accumulator loop time: 00:00:33 Sorting block of length 255826157 (Using difference cover) Sorting block time: 00:05:11 Returning 
block of 255826158 Exited Ebwt loop fchr[A]: 0 fchr[C]: 773280124 fchr[G]: 1325927941 fchr[T]: 1878618059 fchr[$]: 2652783500 Exiting Ebwt::buildToDisk() Returning from initFromVector Wrote 888467894 bytes to primary EBWT file: genome.rev.1.bt2 Wrote 663195880 bytes to secondary EBWT file: genome.rev.2.bt2 Re-opening _in1 and _in2 as input streams Returning from Ebwt constructor Headers: len: 2652783500 bwtLen: 2652783501 sz: 663195875 bwtSz: 663195876 lineRate: 6 offRate: 4 offMask: 0xfffffff0 ftabChars: 10 eftabLen: 20 eftabSz: 80 ftabLen: 1048577 ftabSz: 4194308 offsLen: 165798969 offsSz: 663195876 lineSz: 64 sideSz: 64 sideBwtSz: 48 sideBwtLen: 192 numSides: 13816581 numLines: 13816581 ebwtTotLen: 884261184 ebwtTotSz: 884261184 color: 0 reverse: 1 Total time for backward call to driver() for mirror index: 01:17:50
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.
[bam_index_build2] fail to index the BAM file.
Thank you