I am mapping paired-end reads using Bowtie2 with "maximum insert size for valid paired-end alignments" set to 500 bases. However, when I calculate the insert sizes of the resulting BAM files using Picard Insert Size Metrics, I frequently see one or two inserts per file that exceed 20000 bases. Is the Bowtie2 mapping incorrectly returning very large inserts, or is the Picard software mis-analyzing the BAM file? If these large inserts really come from incorrect mapping, how can I remove them from the BAM files so that my downstream analyses are not affected?
Question: Paired-end insert size upper limit
lukacdm • 0 wrote:
Jennifer Hillman Jackson ♦ 25k wrote:
Hello,
This is simply how the data mapped on the first pass. Downstream analysis and summary tools will not treat these pairs as valid, so you can leave them in the input.
Best, Jen, Galaxy team
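To check whether the large inserts are in fact marked as not valid, one option is to look at the SAM FLAG field. In the SAM format, FLAG bit 0x2 marks pairs the aligner considers properly aligned, and column 9 (TLEN) holds the observed template length. The sketch below is only an illustration, not part of any Galaxy tool: the function name `tally_large_inserts` is hypothetical, and it assumes the BAM has been converted to SAM text first (for example with `samtools view -h`). Pairs longer than the Bowtie2 maximum would be expected to land in the `not_proper_pair` bucket.

```python
from collections import Counter

PROPER_PAIR = 0x2  # SAM FLAG bit set when the aligner considers the pair properly aligned


def tally_large_inserts(sam_lines, max_insert=500):
    """Count records whose |TLEN| exceeds max_insert, split by proper-pair status.

    sam_lines is any iterable of SAM text lines (hypothetical helper; assumes
    well-formed records with at least 9 tab-separated fields).
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2: FLAG
        tlen = int(fields[8])  # column 9: TLEN (signed template length)
        if abs(tlen) > max_insert:
            key = "proper_pair" if flag & PROPER_PAIR else "not_proper_pair"
            counts[key] += 1
    return counts
```

Each mate of a pair is a separate record, so one oversized pair normally contributes two counts.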
lukacdm • 0 wrote:
Thanks, Jennifer. Will peak calling programs also ignore the long inserts?
The two peak calling tools on the public Main Galaxy instance at http://usegalaxy.org only accept single-end input, so you will be running a tool on a local or cloud instance for this analysis.
Most tools should have an option for this, but that is more an educated guess than fact. If you are in doubt, the tool's execution/help form usually links to the third-party tool documentation, which will help you determine the proper usage for the tool you are using.
Best, Jen, Galaxy team
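If a downstream tool turns out not to ignore the long inserts, they can be removed before analysis. The following is a minimal sketch under stated assumptions, not a Galaxy or Bowtie2 feature: the function name `drop_large_inserts` is hypothetical, the input is SAM text (header included), and the 500-base cutoff simply mirrors the setting in the question. Records with TLEN of 0 (for example unpaired reads or mates on different references) pass through unchanged, which may or may not be what you want.

```python
def drop_large_inserts(sam_lines, max_insert=500):
    """Yield header lines and records whose |TLEN| is at most max_insert.

    Hypothetical helper: assumes well-formed SAM text with TLEN in column 9.
    Both mates of an oversized pair carry the same |TLEN|, so both are dropped.
    """
    for line in sam_lines:
        if line.startswith("@"):  # keep the header so the output is still valid SAM
            yield line
            continue
        tlen = int(line.split("\t")[8])  # column 9: signed template length
        if abs(tlen) <= max_insert:
            yield line
```

Wrapped in a small script that reads standard input and writes standard output, this could sit between `samtools view -h` (BAM to SAM) and `samtools view -b` (SAM back to BAM).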