Uncollapse sequences in Galaxy


2.8 years ago by

elisabetta.cilli • 0

Italy

elisabetta.cilli • 0 wrote:

Hello

I am looking for a tool to uncollapse a previously collapsed FASTA file.

Thank you

Elisabetta

fastx uncollapse collapse fasta galaxy • 789 views


modified 13 months ago by Jennifer Hillman Jackson ♦ 25k • written 2.8 years ago by elisabetta.cilli • 0

Can you add an example?

written 2.8 years ago by Bjoern Gruening ♦ 5.1k

Sorry for replying to an old post, but I was searching for the same thing. I have FASTA files collapsed with FASTX, and I want to uncollapse headers in this format:

>name_x99999

>name_9999

The copy number comes after the _x or the _.

modified 13 months ago • written 13 months ago by vebaev • 130

13 months ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hello,

So you want to create a dataset that has each of the collapsed sequences put back into individual sequences, where the frequency of each is based on the count. There isn't a wrapped Galaxy tool that I know of to do this. And if you used the tool Collapse sequences (Galaxy Version 1.0.0) - FASTX-toolkit based, then the original sequence identifiers are no longer available.

A command-line script could be written to do this (and wrapped as a Galaxy tool). If you are interested in creating one, start here (use Planemo): https://galaxyproject.org/tools/
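As a rough illustration of what such a script might look like, here is a minimal Python sketch. It assumes the header layouts mentioned earlier in this thread (`>name_x99999` or `>name_9999`, with the copy number after `_x` or `_`). Since the original read identifiers cannot be recovered, it invents new ones by appending a running index to the name; that naming scheme, and the function names `read_fasta` and `uncollapse`, are hypothetical choices for this example and not part of FASTX or Galaxy.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed header layouts: ">name_x99999" or ">name_9999"
# (copy number after "_x" or "_"). Other layouts are rejected.
HEADER_RE = re.compile(r"^>(?P<name>.+)_x?(?P<count>\d+)$")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def uncollapse(lines: Iterable[str]) -> Iterator[str]:
    """Repeat each collapsed record as many times as its copy number."""
    for header, seq in read_fasta(lines):
        match = HEADER_RE.match(header)
        if match is None:
            raise ValueError(f"no copy number found in header: {header}")
        name = match.group("name")
        count = int(match.group("count"))
        for i in range(1, count + 1):
            # Original read IDs are gone, so a synthetic ID is made up here.
            yield f">{name}_{i}"
            yield seq


if __name__ == "__main__":
    demo = [">seqA_x3", "ACGT", ">seqB_2", "TTGA", "CC"]
    for out_line in uncollapse(demo):
        print(out_line)
```

To run it on a real file, the `demo` list could be swapped for an open file handle, since `uncollapse` accepts any iterable of lines. Headers that do not end in a count raise an error rather than being silently passed through, which seemed the safer default for a sketch.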

Next time, keep a copy of the original uncollapsed FASTA dataset. It can be downloaded locally. After you confirm the download was successful, the dataset can be permanently deleted from the history. This way, you can always upload it again if needed.

Thanks, Jen, Galaxy team

modified 13 months ago • written 13 months ago by Jennifer Hillman Jackson ♦ 25k

Thanks, I have an AWK one-liner that can do it; maybe I can make a tool from it.

PS: Strange, but these data came from BGI 5-6 years ago, and they gave us only collapsed FASTA and FASTQ files :(

written 13 months ago by vebaev • 130

Update: I just double-checked, and the FASTX authors released an uncollapse tool in 2009. It isn't covered in the online documentation except in the release notes, and I didn't download the latest version to see if it is there, but you could. It is also not wrapped in the Galaxy FASTX repo in the Tool Shed.

http://hannonlab.cshl.edu/fastx_toolkit/ 24-Nov-2009 - Version 0.0.11 New tools: fastx_uncollapser

However, if you know awk and have a script already, you could potentially use the Galaxy tool Text Manipulation > Text reformatting with awk.

modified 13 months ago • written 13 months ago by Jennifer Hillman Jackson ♦ 25k
