Question: Uncollapse sequences in Galaxy
2.2 years ago, elisabetta.cilli wrote:


I am looking for a tool to uncollapse a previously collapsed FASTA file.

Thank you


modified 4 months ago by Jennifer Hillman Jackson • written 2.2 years ago by elisabetta.cilli

Can you add an example?

written 2.1 years ago by Bjoern Gruening

Sorry for posting in an old thread, but I was looking for the same thing. I have fasta files that were collapsed with FASTX, and I want to uncollapse this format:




The copy number comes after the _x or _.

modified 4 months ago • written 4 months ago by vebaev
4 months ago, Jennifer Hillman Jackson (United States) wrote:


So you want to create a dataset in which each collapsed sequence is expanded back into individual sequences, with the number of copies taken from the count. There isn't a wrapped Galaxy tool that I know of to do this. And if you used the tool Collapse sequences (Galaxy Version 1.0.0), which is based on the FASTX-toolkit, then the original sequence identifiers are no longer available.

A command-line script could be written to do this (and wrapped as a Galaxy tool). If you are interested in creating this, start here (use Planemo):

Next time, keep a copy of the original uncollapsed fasta dataset. It can be downloaded locally. Then, after you confirm the download was successful, the dataset can be permanently deleted from the history. This way, you can always upload it again if needed.

Thanks, Jen, Galaxy team

modified 4 months ago • written 4 months ago by Jennifer Hillman Jackson

Thanks, I have an AWK one-liner that can do it. Maybe I can make a tool from it.

PS: Strange, but these data came from BGI 5-6 years ago, and they gave us only collapsed fasta and fastq :(

written 4 months ago by vebaev

Update: I just double checked, and the FASTX authors released an uncollapse tool in 2009. It isn't covered in the online documentation except in the release notes, and I didn't download the latest version to see if it is there, but you could. It is also not wrapped in the Galaxy FASTX repo in the Tool Shed. From the release notes:

24-Nov-2009 - Version 0.0.11 New tools: fastx_uncollapser

However, if you know awk and have a script already, you could potentially use the Galaxy tool Text Manipulation > Text reformatting with awk.
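If no awk script is at hand, the uncollapse step can also be sketched in Python. This is only a rough sketch under assumptions, not the FASTX implementation: it assumes the copy number is the trailing integer of the header, after a `-` (the FASTX collapser style, e.g. `>1-250`), a `_x`, or a `_`, and that headers carry no description after the identifier. The names `COUNT_RE`, `read_fasta` and `uncollapse` are made up for this example. Because the original read identifiers are gone, the output identifiers are synthetic (`name_1`, `name_2`, ...).

```python
import re

# Assumed header conventions (hypothetical, adjust to your data):
#   >1-250      serial number, dash, copy number (FASTX collapser style)
#   >seq1_x250  copy number after "_x"
#   >seq1_250   copy number after "_"
COUNT_RE = re.compile(r"^(?P<name>.+?)(?:-|_x|_)(?P<count>\d+)$")


def read_fasta(lines):
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def uncollapse(lines):
    """Yield FASTA lines with every record repeated by its copy number."""
    for header, sequence in read_fasta(lines):
        match = COUNT_RE.match(header)
        if match:
            name, count = match.group("name"), int(match.group("count"))
        else:
            # No recognizable copy number: keep the record once.
            name, count = header, 1
        for copy in range(1, count + 1):
            yield f">{name}_{copy}"  # synthetic identifier
            yield sequence


# Small in-memory demonstration.
for out_line in uncollapse([">1-2", "ACGT", ">seq7_x3", "GGTTA"]):
    print(out_line)
```

To run it on a real dataset, pass an open file instead of the list, for example `uncollapse(open("collapsed.fasta"))`, where the file name is a placeholder.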

modified 4 months ago • written 4 months ago by Jennifer Hillman Jackson