Question: Bowtie Mapping Problem
6.5 years ago by
Kelkar, Hemant30 wrote:
It is possible that this question has been asked/answered before. I searched the galaxy-user list archives on Nabble but could not find an applicable answer.

Bowtie alignments done using a de-multiplexed Illumina sequence data set (CASAVA v1.8.2) appear to be leading to alignment problems in our local Galaxy install. At first glance this appears to be because read names of the form "@SOMETHING<space>READINFO" are not being handled correctly by Bowtie. This is not a Galaxy issue per se, but what are other users doing to avoid this problem? We are using Bowtie v0.12.7 (going to upgrade soon) at the moment.

Posting a snippet from the alignment file below:

MACHINE_NAME:2:1101:1533:1944 1:N:0:CGATGT 4 * 0 0 * * 0 0 NACGAAACGGGTCGGTCCGTCGGCATAGCGCGCCACGGCCTGCGGATCGG #4=DDFFFHHHFHIJHIJIHIIJJJIIIIIJJIHHFFDDDDDDDDDDDDD XM:i:0
MACHINE_NAME:2:1101:2523:1962 16 chr5 80936209 255 50M * 0 0 CTAAAAGGAAAAATTCCAGGGATTAAGGAACTTGAAGTTAGAAAAACTAN IIJJJIHHJJJIHFEIHIJIIJIJJJJIIJIIGJJIJFGHHHFEDAD=4# XA:i:1 MD:Z:49C0 NM:i:1
MACHINE_NAME:2:1101:1596:1971 16 chr13 41692461 255 50M * 0 0 GCTGAATAATAGTCCATTGTGAACATATACCATGTTTTCTTTATTTTTAN JJJJJJJJJIJJIJJJJJJJJJJJJJJHFJJJJJJJJHHHHHFFFFD=4# XA:i:1 MD:Z:49T0 NM:i:1
MACHINE_NAME:2:1101:2670:1962 1:N:0:CGATGT 4 * 0 0 * * 0 0 NTGCACTCGCCTGGATACCGTCGCCGGTGAGGTGGCATTCGAACACACCC

--Hemant
alignment bowtie • 841 views
6.5 years ago by
United States
Jennifer Hillman Jackson25k wrote:
Hello Hemant,

IDs such as these (resulting from the CASAVA 1.8+ pipeline) work with most tools. For a few others, there is an open ticket at Bitbucket to track the progress: tool-to-work-with-casava-18

However, a problem with Bowtie is not a known issue, unless you count the fact that it produces a SAM file in the format you show. But this is not really a Bowtie problem; it is just passing along the FASTQ IDs given as input.

Perhaps it will help if I explain the data. When Bowtie doesn't have a hit to report, it writes the entire sequence name into the SAM file. When there is a hit to report, the sequence name is written in short format, but if you go back to your original FASTQ input and look for that shorter sequence ID, you will find the full name there, including the space and the additional content (it just wasn't passed into the SAM file).

For downstream third-party tools that have problems with the space in the identifier column (MACS is a good example), converting the SAM file to BAM and then using the BAM as input has been a successful workaround.

Hopefully this helps!

Jen
Galaxy team

-- Jennifer Jackson
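For readers who would rather remove the space than convert to BAM, here is a minimal Python sketch of two alternatives that are not part of the answer above: truncating the FASTQ header at the first whitespace before running Bowtie, or truncating the read name column in an existing SAM file. The function names are hypothetical. The sketch assumes strict four-line FASTQ records and tab-separated SAM columns. Note that dropping the text after the space also drops the read number and index (e.g. "1:N:0:CGATGT"), which may matter for paired-end data.

```python
def strip_fastq_comments(lines):
    """Yield FASTQ lines with each header cut at the first whitespace.

    Assumption: strict 4-line records (header, sequence, '+', quality),
    so only every fourth line, starting at the first, is a header.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0:
            parts = line.split(None, 1)
            if parts:  # leave an empty line untouched
                line = parts[0]
        yield line


def strip_sam_qname(line):
    """Cut the read name (first tab-separated column) at the first whitespace.

    SAM header lines (starting with '@') are returned unchanged.
    """
    if line.startswith("@"):
        return line
    qname, sep, rest = line.partition("\t")
    parts = qname.split(None, 1)
    if parts:
        qname = parts[0]
    return qname + sep + rest


if __name__ == "__main__":
    record = [
        "@MACHINE_NAME:2:1101:1533:1944 1:N:0:CGATGT\n",
        "NACGAAACGG\n",
        "+\n",
        "#4=DDFFFHH\n",
    ]
    for out in strip_fastq_comments(record):
        print(out)
```

Either function can be applied line by line while streaming a file, so the whole input never has to be held in memory.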