Question: How to convert Ion Torrent reads into Wiggle with Galaxy?
4.2 years ago
United States
I just obtained a set of BAM files generated from Ion Torrent reads (our collaborator already did the mapping).

I used to use Galaxy to process Illumina reads: MACS on the Galaxy Main instance converts my SAM/BAM file into wiggle while doing the peak calling.

Now I find that I cannot do the same for the Ion Torrent data (I figure I cannot use MACS for peak calling because, unlike Illumina reads, Ion Torrent reads do not have a fixed length).

There is also another very odd thing: I get a lot of lines like 'Chr NT' in the BAM file. What do those mean? I suspect this is what caused the error. Since my computer has only 2 GB of RAM, I cannot use a text editor to remove those lines (the BAM file is 3 GB). Is there a similar manipulation I could do in Galaxy?

Thank you in advance for answering the question.


4.1 years ago
United States
The basic steps to convert BAM to wiggle (wig) are:

  1. BEDTools: Create a BedGraph of genome coverage
  2. Convert Formats: Wig/BedGraph-to-bigWig converter

Note that this is NOT the same thing as calling peaks. Those algorithms do more than count positional read depth from a mapping job.
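To make "positional read depth" concrete, here is a minimal Python sketch of what a BedGraph of genome coverage contains. It is not how BEDTools is implemented, and the function name `coverage_bedgraph` and the example intervals are made up for illustration. It assumes each mapped read has already been reduced to a (chromosome, start, end) interval with 0-based, half-open coordinates, as in BED. Read lengths may differ from read to read.

```python
from collections import defaultdict

def coverage_bedgraph(reads):
    """Return (chrom, start, end, depth) runs for positions with depth > 0.

    reads: iterable of (chrom, start, end), 0-based half-open intervals.
    """
    # Record +1 where a read starts and -1 where it ends, per chromosome.
    deltas = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in reads:
        deltas[chrom][start] += 1
        deltas[chrom][end] -= 1

    runs = []
    for chrom in sorted(deltas):
        depth = 0
        prev = None
        for pos in sorted(deltas[chrom]):
            if prev is not None and depth > 0 and pos > prev:
                last = runs[-1] if runs else None
                if last and last[0] == chrom and last[2] == prev and last[3] == depth:
                    # Same depth continues: extend the previous run.
                    runs[-1] = (chrom, last[1], pos, depth)
                else:
                    runs.append((chrom, prev, pos, depth))
            depth += deltas[chrom][pos]
            prev = pos
    return runs

# Hypothetical reads of different lengths on one chromosome.
example_reads = [("chr1", 0, 10), ("chr1", 5, 15), ("chr1", 15, 40)]
for chrom, start, end, depth in coverage_bedgraph(example_reads):
    print(f"{chrom}\t{start}\t{end}\t{depth}")
```

Each printed line is one BedGraph record: a stretch of the chromosome and the number of reads covering it. Peak callers start from this kind of signal but add their own modeling on top.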

But I think you will need to resolve the reference genome issue first, if that is the immediate problem (it is not exactly clear, but you can check).

"Chr NT" looks like a chromosome identifier to me. The reference genome used for mapping and the one used by downstream tools must be identical. Check for a conflict with what you are using now, then consider using a Custom reference genome (obtain the .fasta from the collaborator who did the mapping).

I would normally suggest adding the genome natively to your instance, instead of using a Custom genome, but any of these operations will have problems running with so little memory. A good rule to follow is that if a tool will run on the command line on a system, it will almost certainly run within Galaxy on that system (the same core resources are needed for the tool's operation).
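One way to check for such a conflict is to compare the sequence names in the BAM header (the `SN:` field of each `@SQ` line) with the names in the reference .fasta. The sketch below is a plain-Python illustration that works on text lines. The helper names and the example names are hypothetical, and it assumes you have already extracted the header as SAM text (for example with a BAM-to-SAM conversion that includes headers).

```python
def sam_header_names(header_lines):
    """Collect reference names from the SN: field of @SQ header lines."""
    names = []
    for line in header_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names

def fasta_names(fasta_lines):
    """Collect sequence names: the first word after '>' on each title line."""
    return [
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and len(line.strip()) > 1
    ]

def compare_names(bam_names, ref_names):
    """Report names present on only one side."""
    bam, ref = set(bam_names), set(ref_names)
    return {
        "only_in_bam": sorted(bam - ref),
        "only_in_reference": sorted(ref - bam),
    }

# Hypothetical header and FASTA content, for illustration only.
header = ["@HD\tVN:1.4\n", "@SQ\tSN:chr1\tLN:100\n", "@SQ\tSN:NT\tLN:50\n"]
fasta = [">chr1 some description\n", "ACGT\n", ">chrM\n", "AC\n"]
print(compare_names(sam_header_names(header), fasta_names(fasta)))
```

If either list in the result is non-empty, the two genomes do not use the same set of identifiers, and downstream tools are likely to complain.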

Because of that, consider a larger server for processing:

  1. Main
  2. Cloud
  3. All options

Best, Jen, Galaxy team

thank you very much!!!!

Hi, Jen:

If I were to ignore those Chr MT/NT lines, could you suggest how to do that in Galaxy? If it were a smaller file, I could just delete those lines in Notepad++ with a regular expression, but this file is too big for my computer to handle. Can I do any regular expression manipulation in Galaxy?

Thank you again!

The Select tool permits regular expression use. But the Filter tool may be a more direct method if just working with a single column and known filter criteria.

With either tool, you'll need to work with tabular data. BAM is compressed, so a BAM-to-SAM run with Samtools would be needed first. I am not sure how much disk space you have, or whether this will cause a problem with memory. It is worth a try locally; move to a larger server if problems come up. This particular tool does not require a reference genome, so that is good. Going back the other way (SAM-to-BAM) will require one, and if the genomes are not an exact match, expect to encounter issues.
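For reference, the filtering step amounts to the following, shown here as a small Python sketch rather than as the Galaxy tools themselves. In SAM, the reference name (RNAME) is the third tab-separated column, so the filter tests that column against a regular expression and processes one line at a time, which keeps memory use low regardless of file size. The function name `filter_sam_lines` and the pattern `MT|NT` are assumptions for illustration; adjust the pattern to the identifiers actually present in your file.

```python
import re

def filter_sam_lines(lines, drop_pattern):
    """Yield SAM lines whose reference name does NOT fully match drop_pattern."""
    drop = re.compile(drop_pattern)
    for line in lines:
        if line.startswith("@"):
            # Header line: drop @SQ entries for removed references, keep the rest.
            if line.startswith("@SQ"):
                fields = line.rstrip("\n").split("\t")
                name = next((f[3:] for f in fields if f.startswith("SN:")), None)
                if name is not None and drop.fullmatch(name):
                    continue
            yield line
            continue
        fields = line.split("\t")
        rname = fields[2] if len(fields) > 2 else ""  # column 3 = RNAME
        if drop.fullmatch(rname):
            continue
        yield line

# Hypothetical SAM content, for illustration only.
sam = [
    "@SQ\tSN:chr1\tLN:100\n",
    "@SQ\tSN:NT\tLN:50\n",
    "r1\t0\tchr1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tNT\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
for kept in filter_sam_lines(sam, r"MT|NT"):
    print(kept, end="")
```

To run this over a real file, pass an open file handle as `lines` and write the yielded lines to an output file. Note that this sketch only looks at column 3; for paired reads, the mate reference in column 7 could still name a removed sequence.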

Thank you! I am trying this on Galaxy Main, following your instructions.

