Question: Stitch Gene Blocks From MAF for non-coding region
2.8 years ago by
lena.ho0 wrote:

Dear Galaxy,

I am attempting to stitch alignments from the 46-way MultiZ MAF for a non-coding transcript. I am using a BED12 file:

chr9 97317326 97330411 NR_121569 0 + 97330411 97330411 0 3 266,184,740,


Using Stitch MAF blocks with the BED12 file, I get a stitched block from 9:97317326-97330411, but my goal is to remove the introns and get a FASTA alignment of all (concatenated) exons, representing the spliced mRNA sequence.

However, using Stitch Gene Blocks, I get a FASTA file that contains no sequences, and no errors are reported.

How does one go about stitching MAFs for exons of non-coding transcripts?

Thank you very much!


2.8 years ago by
United States
Jennifer Hillman Jackson (25k) wrote:


On the Galaxy Main public server, I just ran a quick test with Stitch Gene Blocks, leaving the option "Split into Gapless MAF blocks" at its default (No). The results were as expected: the output is spliced and stitched FASTA sequence that matches the regions in the BED12.

If Extract MAF Blocks produced output, then Stitch Gene Blocks should as well, given the same input. The tool does not consider whether the BED12 input represents a coding or non-coding transcript. If spliced-out regions that are not part of the BED12 blocks appear to be present in the FASTA output from Stitch, comparing those results to the entire region footprint, extracted directly as genomic sequence, is a good way to confirm.
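As a rough sanity check on the stitched output, the exon coordinates and the expected spliced length can be worked out from the BED12 line itself. The sketch below is only an illustration: the function names and the example line are made up for this post, and it assumes a complete 12-column line with standard BED12 block columns (blockSizes in column 11, blockStarts in column 12, relative to chromStart).

```python
def bed12_exons(line):
    """Return exon intervals (0-based, half-open) for one BED12 line."""
    fields = line.split()
    chrom_start = int(fields[1])
    # Trailing commas are allowed in the block columns, so skip empty items.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    return [(chrom_start + s, chrom_start + s + n) for s, n in zip(starts, sizes)]


def spliced_length(exons):
    """Total number of reference bases covered by the exons."""
    return sum(end - start for start, end in exons)


def ungapped_length(aligned_seq):
    """Length of an aligned sequence once alignment gaps ('-') are removed."""
    return len(aligned_seq.replace("-", ""))


# Hypothetical two-exon transcript, used only for illustration.
example = "chr1 1000 2000 tx1 0 + 2000 2000 0 2 100,200, 0,800,"
exons = bed12_exons(example)
print(exons, spliced_length(exons))
```

If the reference genome's row in the stitched FASTA is fully covered by the MAF, its ungapped length should equal the spliced length. For the transcript in the question, the three block sizes (266, 184, 740) sum to 1190 bases.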

All that said, the results you describe do not seem to be based on exactly the same BED12 input (?).

If that is the case, and even if it is not, I suggest double-checking that the region is covered by the MAF dataset (if the source is UCSC, review your region against the Conservation track). If the region is covered there (for the same primary and selected genomes as used with the Stitch tool), then the MAF will have data for it. From there, confirm the formatting of the input BED12 dataset (and of the input MAF, if you are not using the pre-cached indexes), and confirm that the MAF data are based on exactly the same reference genome as the input BED12. This includes exact chromosome identifiers.
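The formatting and identifier checks above can be sketched in a few lines of Python. This is not part of Galaxy or of the Stitch tools; the function names are hypothetical, the rules follow the standard BED12 and MAF layouts, and the "hg19.chr9" style source names in the example are an assumption about how the MAF labels its sequences.

```python
def check_bed12_line(line):
    """Return a list of problems found in one BED12 line (empty if none)."""
    fields = line.split()
    if len(fields) != 12:
        return [f"expected 12 columns, found {len(fields)}"]
    problems = []
    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    block_count = int(fields[9])
    # Trailing commas are allowed in blockSizes and blockStarts.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    if len(sizes) != block_count:
        problems.append("blockSizes does not match blockCount")
    if len(starts) != block_count:
        problems.append("blockStarts does not match blockCount")
    if starts and starts[0] != 0:
        problems.append("first blockStart should be 0")
    if sizes and len(sizes) == len(starts):
        if chrom_start + starts[-1] + sizes[-1] != chrom_end:
            problems.append("last block does not end at chromEnd")
    return problems


def maf_sources(maf_text):
    """Collect the source names (second column) from MAF 's' lines."""
    sources = set()
    for line in maf_text.splitlines():
        parts = line.split()
        if len(parts) >= 7 and parts[0] == "s":
            sources.add(parts[1])
    return sources


# Made-up inputs for illustration only.
good_line = "chr1 1000 2000 tx1 0 + 2000 2000 0 2 100,200, 0,800,"
short_line = "chr1 1000 2000 tx1 0 + 2000 2000 0 2 100,200,"
maf_text = "a score=1\ns hg19.chr9 10 5 + 1000 ACGTA\ns mm9.chr13 3 5 + 900 ACGTT\n"
print(check_bed12_line(good_line), check_bed12_line(short_line))
print(maf_sources(maf_text))
```

A line that stops after blockSizes (11 columns, no blockStarts) is reported as a problem, and the set returned by maf_sources shows whether the chromosome name used in the BED12 actually appears in the MAF under the expected genome build.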

If you are working on another server, the problem could be with the tool or index, even though I would expect an error for these cases, not empty results. Contacting the administrator of that instance is one option for troubleshooting, but rule out the input formatting/coverage causes for empty output first.

Hopefully this helps, Jen, Galaxy team

