Question: How can I merge two single-end mapping datasets?
3.5 years ago
zhhxu90 wrote:


I am analysing my ChIP-Seq data, which was generated by paired-end sequencing. Due to some experimental issues, it is better to map the R1 and R2 reads separately in single-end mode.

So I would like to know how I can combine the two SAM or BAM files into one file, and how I can get mapping statistics (e.g. mapped and unmapped percentage, uniquely and multiply mapped percentage) from the combined file.

Thanks a lot.


3.5 years ago
Bjoern Gruening (4.9k) wrote:

I may be completely wrong, but I would say that you can simply put both files together. To do this you can use the tool "Concatenate datasets tail-to-head".




This is a great tool for many datatypes! BED files can be combined directly, as can many tabular dataset types. Others require that you remove the headers first and add them back after merging (GTF, BAM, SAM, etc.). To remove the headers of tabular datasets, use the various "remove/select lines from a dataset" tools in the group "Text Manipulation".

For BAM, there are other methods: convert BAM-to-SAM without headers, concatenate the resulting mapping lines, coordinate sort (tool = "Sort"), add the header back in (tool = "Replace SAM/BAM header"), and you are ready to go. An alternative is to use the tool "Merge BAM Files" to do this in one step, perhaps followed by the tool "Reorder SAM/BAM" if that sort type is required.

Not sure which sorting to use (if any)? Each tool's form has links to documentation that will help you determine whether sorting is required and which type (by order of the reference genome or by coordinates).
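Outside of Galaxy, the header-strip, concatenate, sort, and re-header steps described above can be sketched in a few lines of Python. This is only an illustration, not what the Galaxy tools do internally. It assumes plain-text SAM input (a BAM file would have to be converted first), that both files were mapped against the same reference so the header of the first file can be reused, and that the function names `split_sam`, `reference_order`, and `merge_sam` are hypothetical helpers invented for this example.

```python
def split_sam(lines):
    """Split SAM lines into (header_lines, alignment_lines)."""
    header = [ln for ln in lines if ln.startswith("@")]
    records = [ln for ln in lines if ln and not ln.startswith("@")]
    return header, records


def reference_order(header):
    """Map each reference name to its position among the @SQ header lines."""
    order = {}
    for ln in header:
        if ln.startswith("@SQ"):
            for field in ln.split("\t")[1:]:
                if field.startswith("SN:"):
                    order[field[3:]] = len(order)
    return order


def merge_sam(lines_r1, lines_r2):
    """Concatenate two SAM datasets and coordinate-sort the alignments.

    Assumption: both inputs share the same reference, so the header of
    the first input is reused unchanged (its @HD SO: tag is not updated).
    """
    header, rec1 = split_sam(lines_r1)
    _, rec2 = split_sam(lines_r2)
    order = reference_order(header)

    def sort_key(record):
        fields = record.split("\t")
        rname, pos = fields[2], int(fields[3])
        # References not in the header (including "*" for unmapped reads)
        # sort after all known references.
        return (order.get(rname, len(order)), pos)

    return header + sorted(rec1 + rec2, key=sort_key)


# Tiny made-up example (illustrative records, not real data).
sam_header = [
    "@HD\tVN:1.6\tSO:unsorted",
    "@SQ\tSN:chr1\tLN:1000",
    "@SQ\tSN:chr2\tLN:1000",
]
r1 = sam_header + [
    "readA/1\t0\tchr2\t50\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "readB/1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
r2 = sam_header + [
    "readA/2\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "readC/2\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
merged = merge_sam(r1, r2)
```

For real files, the line lists could be read with something like `[ln.rstrip("\n") for ln in open(path)]`. Note that R1 and R2 reads from the same fragment may share a read name, so the merged file can contain duplicate names unless the mates are distinguished as in the example.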

Best! Jen, Galaxy team

Jennifer Hillman Jackson
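The question also asks for mapping statistics, which the replies above do not cover. A rough sketch of how they could be computed from the merged SAM lines follows. It relies on the SAM FLAG field (bit 0x4 marks an unmapped read, bits 0x100 and 0x800 mark secondary and supplementary alignments). Treating a read as "unique" when its MAPQ is at or above a threshold is an assumption: how multi-mapped reads are reported differs between aligners, so the `min_unique_mapq` parameter and the `mapping_stats` function itself are hypothetical and should be adapted to the mapper that was used.

```python
def mapping_stats(sam_lines, min_unique_mapq=1):
    """Count mapped, unmapped, unique and multi-mapped reads in SAM lines.

    Assumption: a mapped read with MAPQ >= min_unique_mapq is counted as
    uniquely mapped; this convention depends on the aligner.
    """
    total = mapped = unique = 0
    for line in sam_lines:
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x900:
            continue  # skip secondary/supplementary so each read counts once
        total += 1
        if flag & 0x4:
            continue  # unmapped read
        mapped += 1
        if mapq >= min_unique_mapq:
            unique += 1

    def pct(count):
        return 100.0 * count / total if total else 0.0

    return {
        "total": total,
        "mapped": mapped,
        "unmapped": total - mapped,
        "unique": unique,
        "multi": mapped - unique,
        "mapped_pct": pct(mapped),
        "unmapped_pct": pct(total - mapped),
        "unique_pct": pct(unique),
        "multi_pct": pct(mapped - unique),
    }


# Made-up records for illustration only.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t20\t30\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tchr1\t30\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
stats = mapping_stats(example)
```

Percentages here are relative to all primary records in the file. Because the merged file holds R1 and R2 as independent single-end reads, the totals count reads, not fragments.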

Thanks Bjoern, I will give it a shot.

3.5 years ago by zhhxu9
