Hello,
An alternative (all steps can be performed on the public Main Galaxy instance, or on a local/cloud instance with minimal tool installation):
- Convert using tool "NGS: SAM Tools -> BAM-to-SAM", preserving headers.
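- Use tool "Filter and Sort -> Select" to separate the header lines (those that start with the character "@") from the mapping lines.
> Use the pattern "^@" (without quotes). Run the tool twice, once keeping the lines that match and once keeping the lines that do not match. This will create two new datasets.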
- On the dataset containing the mapping lines, use tool "Text Manipulation -> Select random lines from a file". Change to "tabular" datatype first if needed (pencil icon -> Datatype tab -> modify and save).
- Combine the dataset containing the headers with the result using tool "Text Manipulation -> Concatenate datasets tail-to-head", with the headers dataset first. Change to "sam" datatype if this produces tabular.
- Convert using tool "NGS: SAM Tools -> SAM-to-BAM".
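For anyone who wants to see the logic of the middle steps outside of Galaxy, here is a minimal Python sketch of the split / random select / concatenate idea. It is only an illustration, not what the Galaxy tools run internally. The function name `subsample_sam` and its `seed` parameter are made up for this example, and it assumes the SAM content is already available as a list of text lines.

```python
import random


def subsample_sam(lines, n, seed=None):
    """Keep every header line plus up to n randomly chosen mapping lines.

    Hypothetical helper for illustration only; it mirrors the
    Select / Select random lines / Concatenate steps described above.
    """
    # Same test as the "^@" pattern: header lines start with "@".
    headers = [ln for ln in lines if ln.startswith("@")]
    records = [ln for ln in lines if not ln.startswith("@")]

    rng = random.Random(seed)
    k = min(n, len(records))
    # Sample positions rather than lines so the original order is kept.
    chosen = sorted(rng.sample(range(len(records)), k))

    # Headers go first, as they must in a valid SAM file.
    return headers + [records[i] for i in chosen]


if __name__ == "__main__":
    # Toy data, not real alignments.
    example = [
        "@HD\tVN:1.0",
        "@SQ\tSN:chr1\tLN:100",
        "r1\t0\tchr1\t1",
        "r2\t0\tchr1\t5",
        "r3\t4\t*\t0",
        "r4\t16\tchr1\t9",
    ]
    for line in subsample_sam(example, 2, seed=1):
        print(line)
```

The conversion steps (BAM-to-SAM before, SAM-to-BAM after) are not covered by this sketch.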
If wanted/needed, remove unmapped reads as a first step with one of the "NGS: SAM Tools -> Filter SAM (BAM)" tools. There are variations on the above, but all have about the same number of steps. If you think you might do this again, create a workflow from your history first. Then permanently delete the intermediate files once done to regain disk space.
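As a rough illustration of what filtering out unmapped reads means at the SAM level: a read is unmapped when bit 0x4 is set in the FLAG field (the second column). The sketch below assumes the SAM content is a list of tab-separated text lines, and `drop_unmapped` is a made-up name for this example, not a Galaxy or SAMtools function.

```python
UNMAPPED = 0x4  # SAM FLAG bit meaning "segment unmapped"


def drop_unmapped(lines):
    """Return header lines plus mapping lines whose FLAG lacks bit 0x4.

    Hypothetical helper for illustration only.
    """
    kept = []
    for ln in lines:
        if ln.startswith("@"):
            # Header lines are always kept.
            kept.append(ln)
            continue
        flag = int(ln.split("\t")[1])  # FLAG is the second SAM column
        if not flag & UNMAPPED:
            kept.append(ln)
    return kept
```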
Best, Jen, Galaxy team