Question: Filter BAM dataset by CIGAR string "D" (deletion)
4 weeks ago by
matteoaccetturo0 wrote:

Hello everyone, I would like to select all reads from a BAM dataset that have one deletion, two deletions, and so on. I thought I could filter on the CIGAR string, entering 1D, 2D, and so on depending on the case, but when I use the "Filter BAM datasets on a variety of attributes" tool, I get only reads with 0 bases and no CIGAR string (apparently all reads). As an example, given the two reads below, I get the second one no matter which filter I use. Can anyone explain this behavior? Thanks a lot!

M02230:28:000000000-D2Y2R:1:1101:15213:1961 73 chr1 6257782 40 27S87M36S = 6257782 0 CAATCAAGAGTGAACTTCAGAACTT...C AAAAAFFFBFBDEGFFGGGGDGHHHB5B#AAFFGHGGGEGFBFEFGFHHHHBEFA1FGHFHHHHHEF353BFFGED1FCFHBGG4B3@F3F3343FFFHHHGGFFBGGE?///20E/CBHHBFFH322?@G0/02@@CGGCDF?@CCFHF RG:Z:0179 BC:Z:13 SM:i:40 NM:i:3 XN:Z:Coordinate_UserDefined (58174392)_193531064.1

M02230:28:000000000-D2Y2R:1:1101:15213:1961 133 chr1 6257782 0 * = 6257782 0 TGACGTTTCTGGTTCTGTTAANCTTGTTTTNCCGACTGA....

deletions sam galaxy filter bam • 75 views
4 weeks ago by
United States
Jennifer Hillman Jackson23k wrote:

Hello,

If you don't want to have unmapped reads in the output, also filter for "isMapped" as a second condition.
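For context, whether a read counts as mapped is recorded in the SAM FLAG field (the second column): bit 0x4 is set when the read is unmapped. A minimal sketch, assuming "isMapped" corresponds to that bit being clear; the helper name is_mapped is only illustrative and is not part of the Galaxy tool:

```python
# SAM FLAG bit 0x4 means "segment unmapped" (SAM specification).
UNMAPPED = 0x4

def is_mapped(flag: int) -> bool:
    """Return True when the unmapped bit is not set in the FLAG value."""
    return (flag & UNMAPPED) == 0

# FLAG values of the two example reads from the question.
for flag in (73, 133):
    print(flag, "mapped" if is_mapped(flag) else "unmapped")
```

Under that assumption, the first example read (FLAG 73) is mapped and the second (FLAG 133, CIGAR "*") is unmapped.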

I don't see any deletions marked in the CIGAR string: 27S87M36S, so this read would not be output if filtering by "nD" (where n == some number of deletion bases).
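To check a CIGAR string by hand, it can be split into (length, operation) pairs. This is a small standalone sketch in plain Python, not what the Galaxy tool runs internally; parse_cigar and deletion_lengths are hypothetical helper names:

```python
import re

# One CIGAR operation is a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs; '*' means no CIGAR."""
    if cigar == "*":
        return []
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]

def deletion_lengths(cigar):
    """Return the length of every D (deletion) operation, in order."""
    return [length for length, op in parse_cigar(cigar) if op == "D"]

print(parse_cigar("27S87M36S"))       # soft clip, match, soft clip
print(deletion_lengths("27S87M36S"))  # empty list: no deletions
```

For the read in the question, the operations are only S and M, so there is nothing for a "D" filter to match.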

Help for SAM format: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723002/

Hope that helps! Jen, Galaxy team


Sorry, I posted an answer instead of using the reply button. Here is my comment: Thanks Jen, now I get mapped reads, but the CIGAR filter does not seem to work: I also entered 1D as the CIGAR string to filter on, and I still got everything. These are the parameters I used:

  • BAM dataset(s) to filter: 1: 0179_S13.bam
  • Select BAM property to filter on: cigar
  • Filter on this CIGAR string: 1D
  • Select BAM property to filter on: isMapped
  • Selected mapped reads: True
  • Would you like to set rules?: false

Reads with no deletions, or with insertions, are also present in the output. What am I doing wrong?

written 4 weeks ago by matteoaccetturo

Try this instead, using the * wildcard on both sides to avoid an exact match. The wildcard pattern also captures other hits you don't want, so filter on MapQ as well. There might be a better way, but this works:

Under one Condition, use two Filters. Don't use two Conditions, or the Filters they contain are applied independently.

  • cigar set as *nD*
  • mapQuality set as >0

where *nD* could be *1D*, *2D*, ... *6D*, etc.
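To see why a bare 1D matches nothing while *1D* does, here is a rough sketch that imitates the wildcard with Python's fnmatch. It assumes the tool's * behaves like a shell-style glob, which is an assumption and not something confirmed in this thread. It also shows one kind of unwanted hit: *1D* matches an 11-base deletion too. The helper has_deletion_of_length is hypothetical and only illustrates a stricter check:

```python
import re
from fnmatch import fnmatchcase

def has_deletion_of_length(cigar, n):
    """True if the CIGAR contains a D operation of exactly n bases."""
    # The lookbehind rejects longer numbers ending in n, e.g. 11D for n=1.
    return re.search(rf"(?<![0-9]){n}D", cigar) is not None

examples = ["27S87M36S", "40M1D60M", "40M11D60M", "*"]
for cigar in examples:
    exact = fnmatchcase(cigar, "1D")    # no wildcard: whole string must be "1D"
    glob = fnmatchcase(cigar, "*1D*")   # wildcard on both sides
    strict = has_deletion_of_length(cigar, 1)
    print(f"{cigar:12} exact={exact} glob={glob} strict={strict}")
```

In this sketch the exact pattern never matches a real alignment, the glob matches both 40M1D60M and 40M11D60M, and the stricter check keeps only 40M1D60M.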

written 4 weeks ago by Jennifer Hillman Jackson