Question: BAM to FASTQ from Galaxy gives a different FASTQ size than bamtofastq from bedtools
ib7 (United Kingdom) wrote, 7 months ago:

Hi, does anyone know why BAM to FASTQ on the Galaxy platform gives a different FASTQ size than bamtofastq from bedtools?

Thanks, ib

Jennifer Hillman Jackson (United States) wrote, 7 months ago:

Hello,

The two tools have different reporting options on the tool forms. Perhaps these were not set in a way that would produce the same output?

For reference, the most current tool versions in Galaxy:

  • BEDTools >> Convert from BAM to FastQ (Galaxy Version 2.27.0.0)

  • Picard >> SamToFastq extract reads and qualities from SAM/BAM dataset and convert to fastq (Galaxy Version 2.7.1.0)

Thanks! Jen, Galaxy team

ib7 (United Kingdom) wrote, 7 months ago:

Hi Jen, thanks for the answer.

I ran bedtools bamtofastq on a BAM generated by Galaxy, then ran the Galaxy tool on the same BAM.

So the input is the same, but these are the sizes:

bedtools: read1 28679667, read2 28767491
Galaxy: read1 72646227, read2 83312211

I ran bedtools bamtofastq from the command line and also got this warning message: "*WARNING: Query HS38_18980:1:1116:9578:21925 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping."

Could the difference in size be explained by bedtools excluding unmated reads?

Thanks, ib


Hi - Yes. The BEDTools version of the tool can work on paired-end reads and has a few different reporting options. With the default settings it produces output only for matched alignment pairs, and the input should be a queryname-sorted SAM dataset (when used in Galaxy; the command-line tool will accept a queryname-sorted BAM). No tools in Galaxy currently create a valid queryname-sorted BAM file (although they will soon).
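To make the "mate does not occur next to it" skipping concrete, here is a minimal Python sketch of adjacency-based pairing. It is only a toy model, not the actual BEDTools code: the function name `pair_adjacent` and the `(query_name, is_paired)` record layout are made up for illustration, and it assumes a read is written only when the very next record carries the same query name.

```python
def pair_adjacent(records):
    """Toy model of adjacency-based pairing (not the real BEDTools logic).

    records: list of (query_name, is_paired) tuples in file order.
    Returns (pairs, skipped): names emitted as pairs, names skipped.
    """
    pairs, skipped = [], []
    i = 0
    while i < len(records):
        name, is_paired = records[i]
        has_next = i + 1 < len(records)
        if is_paired and has_next and records[i + 1][0] == name:
            # The mate sits on the next line: emit the pair, consume both.
            pairs.append(name)
            i += 2
        else:
            # Mate is not adjacent: this record is skipped.
            skipped.append(name)
            i += 1
    return pairs, skipped


# Hypothetical file order after coordinate sorting: mates are scattered.
coordinate_sorted = [
    ("r1", True), ("r2", True), ("r1", True),
    ("r3", True), ("r3", True), ("r2", True),
]
# Sorting by name puts each read next to its mate.
queryname_sorted = sorted(coordinate_sorted, key=lambda rec: rec[0])

print(pair_adjacent(coordinate_sorted))
print(pair_adjacent(queryname_sorted))
```

In this toy input the coordinate-ordered list yields a single pair, while the name-ordered list yields all three, which is the kind of gap that would show up as a smaller FASTQ.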

See Highlights > New BAM datatypes in the 18.01 release notes here for details: https://docs.galaxyproject.org/en/master/releases/18.01_announce.html. The new datatypes have been created but tools are still pending an update to use/produce some of them. Development details: https://github.com/galaxyproject/tools-iuc/issues/1774 && https://github.com/galaxyproject/usegalaxy-playbook/issues/102

If you want all reads, use the Picard version with the option "If true, include non-primary alignments in the output" set to "Yes". Be aware that you may end up with duplicate reads in the output - the BAM is directly parsed. The BAM can be filtered before using this tool if needed.
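As a rough illustration of why duplicates can appear, the SAM FLAG field marks secondary alignments with bit 0x100 and supplementary alignments with bit 0x800, so the same read name can occur on several lines. The Python sketch below filters a list of hypothetical `(name, flag)` records down to primary lines; the helper names are made up, and treating both bits as "non-primary" is an assumption here, so check which ones the Picard option actually covers.

```python
SECONDARY = 0x100      # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM FLAG bit: supplementary alignment


def is_primary(flag):
    """True when neither the secondary nor the supplementary bit is set."""
    return (flag & (SECONDARY | SUPPLEMENTARY)) == 0


def primary_only(records):
    """Keep only primary lines from a list of (name, flag) tuples."""
    return [(name, flag) for name, flag in records if is_primary(flag)]


# Hypothetical records: r1 has two primary mates plus a secondary line,
# r2 appears only as a supplementary line.
records = [("r1", 99), ("r1", 147), ("r1", 355), ("r2", 2113)]
print(primary_only(records))
```

On a real BAM the same idea is usually applied with a flag filter in whatever BAM filtering tool you already use, before the conversion step.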

The input BAM must be coordinate sorted before using tools from most tool groups within Galaxy. Galaxy now coordinate sorts BAMs loaded with the Upload tool by default. If this is a pre-existing BAM within Galaxy, either SortSam or Sort can be used to adjust sorting. Please do not attempt to produce a queryname-sorted BAM with current tools as this will result in an error - output SAM instead. https://galaxyproject.org/support/sort-your-inputs/
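If you are unsure how a file is sorted, the declared order is recorded in the SO: tag of the @HD header line (coordinate, queryname, unsorted or unknown). A small Python sketch, assuming you have already dumped the header to text; `sort_order` and the example header are hypothetical, and the tag only reports what the producing tool declared, not a verified ordering.

```python
def sort_order(header_text):
    """Return the SO: value from the @HD line of a SAM header, or None."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None  # @HD present but no SO: tag
    return None  # no @HD line at all


# Hypothetical header text for illustration.
example_header = "@HD\tVN:1.5\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000"
print(sort_order(example_header))
```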

All Support FAQs: https://galaxyproject.org/support/

Reply written 7 months ago by Jennifer Hillman Jackson