Question: Uncollapse sequences in Galaxy
24 months ago, elisabetta.cilli (Italy) wrote:

Hello

I am looking for a tool to uncollapse a previously collapsed FASTA file.

Thank you

Elisabetta


Can you add an example?

written 23 months ago by Bjoern Gruening

Sorry for reviving an old post, but I was searching for the same thing. I have FASTA files that were collapsed with FASTX, and I want to uncollapse headers in this format:

>name_x99999

or

>name_9999

The copy number comes after the _x or the _.

written 11 weeks ago by vebaev
11 weeks ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

So you want to create a dataset in which each collapsed sequence is expanded back into individual sequences, with the number of copies taken from the count in the header. I don't know of a wrapped Galaxy tool that does this. Also, if you used the tool Collapse sequences (Galaxy Version 1.0.0), which is based on the FASTX-toolkit, the original sequence identifiers are no longer available.

A command-line script could be written to do this (and wrapped as a Galaxy tool). If you are interested in creating one, start here (use Planemo): https://galaxyproject.org/tools/
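
As a rough illustration of what such a script might look like, here is a minimal Python sketch. It assumes the header layouts quoted earlier in this thread (>name_x99999 or >name_9999, with the copy number as the trailing digits); other collapsers may use a different layout. Because the original identifiers are gone, the output names (name_1, name_2, ...) are invented here purely for illustration, and the function names are hypothetical, not part of FASTX or Galaxy.

```python
import re

# Assumption: headers look like "name_x99999" or "name_9999", where the
# trailing digits are the copy number (layout taken from this thread).
COUNT_RE = re.compile(r"^(?P<name>.*)_x?(?P<count>\d+)$")


def parse_header(header):
    """Split a collapsed header (without '>') into (name, copy_number)."""
    match = COUNT_RE.match(header.strip())
    if match is None:
        raise ValueError(f"no copy number found in header: {header!r}")
    return match.group("name"), int(match.group("count"))


def read_fasta(lines):
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def uncollapse(lines):
    """Yield FASTA lines with every record repeated copy_number times."""
    for header, seq in read_fasta(lines):
        name, count = parse_header(header)
        for i in range(1, count + 1):
            # Invented identifier: the original read names cannot be recovered.
            yield f">{name}_{i}"
            yield seq


def uncollapse_file(in_path, out_path):
    """Stream a collapsed FASTA file into an uncollapsed one."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for out_line in uncollapse(src):
            dst.write(out_line + "\n")
```

Calling uncollapse_file("collapsed.fasta", "uncollapsed.fasta") (both file names are placeholders) would write one record per original read. The records are streamed rather than held in memory, since the expanded file can be much larger than the collapsed input.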

Next time, keep a copy of the original uncollapsed FASTA dataset. It can be downloaded locally. Once you have confirmed that the download was successful, the dataset can be permanently deleted from the history. This way, you can always upload it again if needed.

Thanks, Jen, Galaxy team


Thanks, I have an AWK one-liner that can do it, so maybe I can make a tool.

PS: Strange, but these data came from BGI 5-6 years ago, and they gave us only collapsed FASTA and FASTQ files :(

written 11 weeks ago by vebaev

Update: I just double-checked, and the FASTX authors released an uncollapse tool in 2009. It isn't covered in the online documentation except in the release notes. I didn't download the latest version to see if it is there, but you could. It is also not wrapped in the Galaxy FASTX repo in the Tool Shed.

http://hannonlab.cshl.edu/fastx_toolkit/
24-Nov-2009 - Version 0.0.11
New tools: fastx_uncollapser

However, if you know awk and have a script already, you could potentially use the Galaxy tool Text Manipulation > Text reformatting with awk.

written 11 weeks ago by Jennifer Hillman Jackson