Hello,
The two tools have different reporting options on the tool forms. Perhaps these were not set in a way that would produce the same output?
For reference, the most current tool versions in Galaxy:
BEDTools >> Convert from BAM to FastQ (Galaxy Version 2.27.0.0)
Picard >> SamToFastq extract reads and qualities from SAM/BAM dataset and convert to fastq (Galaxy Version 2.7.1.0)
Thanks! Jen, Galaxy team
Hi Jen, thanks for the answer.
I ran bedtools bamtofastq on a BAM generated by Galaxy, and then ran the Galaxy tool on the same BAM.
So the input is the same, but these are the output sizes:
bedtools: read1 28679667, read2 28767491
Galaxy: read1 72646227, read2 83312211
I ran bedtools bamtofastq from the command line and also got this warning message: "*WARNING: Query HS38_18980:1:1116:9578:21925 is marked as paired, but it's mate does not occur next to it in your BAM file. Skipping."
Could the difference in size be explained by bedtools excluding unmated reads?
thanks ib
Hi - Yes. The BEDTools version of the tool can work on paired-end reads and has a few different reporting options. With the default settings it produces output only for matched alignment pairs, so a read whose mate is not on the next line of the input is skipped, which is what the warning you saw reports. For that reason the input should be a queryname sorted SAM dataset (when used in Galaxy; the command-line tool will accept a queryname sorted BAM). No tools in Galaxy currently create a valid queryname sorted BAM file (although they will soon).
See Highlights > New BAM datatypes in the 18.01 release notes for details: https://docs.galaxyproject.org/en/master/releases/18.01_announce.html. The new datatypes have been created, but tools are still pending an update to use/produce some of them. Development details: https://github.com/galaxyproject/tools-iuc/issues/1774 and https://github.com/galaxyproject/usegalaxy-playbook/issues/102
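To illustrate why sort order matters here, below is a minimal Python sketch of the "mate must be adjacent" rule. It is a simplified model written for this thread, not the BEDTools source: the function name `pair_adjacent_mates` and the toy read names are made up, and each record is reduced to its query name only.

```python
from typing import List, Tuple


def pair_adjacent_mates(read_names: List[str]) -> Tuple[List[str], List[str]]:
    """Pair records only when both mates sit next to each other.

    `read_names` holds one query name per paired-end record, in file order.
    Returns (paired_names, skipped_names). Hypothetical helper, simplified model.
    """
    paired: List[str] = []
    skipped: List[str] = []
    i = 0
    while i < len(read_names):
        has_next = i + 1 < len(read_names)
        if has_next and read_names[i] == read_names[i + 1]:
            # Mate is on the next line: emit the pair and move past both records.
            paired.append(read_names[i])
            i += 2
        else:
            # Mate is not adjacent: skip this record and move on.
            skipped.append(read_names[i])
            i += 1
    return paired, skipped


# Coordinate order: mates are often separated by reads from other fragments.
coordinate_sorted = ["r1", "r2", "r1", "r3", "r2", "r3"]
# Queryname order: mates are always adjacent.
queryname_sorted = sorted(coordinate_sorted)

print(pair_adjacent_mates(coordinate_sorted))
print(pair_adjacent_mates(queryname_sorted))
```

In this toy input, the coordinate-ordered list yields no pairs at all, while the queryname-ordered list yields all three. Real data will fall somewhere in between, but the effect is the same: fewer reads in the output when mates are not adjacent.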
If you want all reads, use the Picard version with the option "If true, include non-primary alignments in the output" set to "Yes". Be aware that you may end up with duplicate reads in the output, because the BAM is parsed directly. The BAM can be filtered before using this tool if needed.
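As a rough illustration of where the duplicates come from, here is a small Python sketch that uses the flag bits defined in the SAM specification. The helper names and the toy records are hypothetical, and treating both secondary (0x100) and supplementary (0x800) records as "non-primary" is an assumption of this sketch; Picard's exact rule may differ between versions.

```python
from collections import Counter
from typing import Iterable, Tuple

# Flag bits as defined in the SAM specification.
FLAG_FIRST_IN_PAIR = 0x40
FLAG_SECONDARY = 0x100      # "not primary alignment"
FLAG_SUPPLEMENTARY = 0x800


def is_primary(flag: int) -> bool:
    """True when the record is neither secondary nor supplementary."""
    return not (flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY))


def emitted_reads(records: Iterable[Tuple[str, int]],
                  include_non_primary: bool) -> Counter:
    """Count how often each (name, mate) would be written to FASTQ.

    Hypothetical helper: each record is (query name, SAM flag).
    """
    counts: Counter = Counter()
    for name, flag in records:
        if not include_non_primary and not is_primary(flag):
            continue  # drop secondary/supplementary records
        mate = 1 if flag & FLAG_FIRST_IN_PAIR else 2
        counts[(name, mate)] += 1
    return counts


# Toy data: read r1 has a primary pair plus one secondary alignment of mate 1.
records = [("r1", 0x43), ("r1", 0x83), ("r1", 0x143)]

print(emitted_reads(records, include_non_primary=True))
print(emitted_reads(records, include_non_primary=False))
```

With non-primary records included, mate 1 of r1 is counted twice. With them filtered out, each mate appears once.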
The input BAM must be coordinate sorted before using tools from most tool groups within Galaxy. Galaxy now coordinate sorts BAMs loaded with the Upload tool by default. If this is a pre-existing BAM within Galaxy, either SortSam or Sort can be used to adjust sorting. Please do not attempt to produce a queryname sorted BAM with current tools, as this will result in an error. Output SAM instead. https://galaxyproject.org/support/sort-your-inputs/
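If you are unsure how a dataset is sorted, the SAM header records it in the `SO:` field of the `@HD` line (values such as `coordinate` or `queryname`). A minimal Python sketch for reading that field from header text follows. The function name is made up for illustration, and it assumes you already have the header as plain text, for example from a SAM file.

```python
from typing import Optional


def sort_order_from_header(header_text: str) -> Optional[str]:
    """Return the SO: value from the @HD line, or None if it is absent.

    Hypothetical helper; expects tab-separated SAM header lines.
    """
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


# Toy header for illustration only.
header = "@HD\tVN:1.5\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422"
print(sort_order_from_header(header))  # coordinate
```

Note that the header only states the declared sort order. A file whose header is missing the `SO:` field, or whose records were reordered without updating it, has to be checked or re-sorted another way.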
All Support FAQs: https://galaxyproject.org/support/