Question: Subtract one data set from another
9 months ago, mcrabtree wrote:

Is there a text manipulation tool (or any other tool) that subtracts one data set from another, matching on a column that the two data sets share?

For example, if a given column in the two data sets contained:

File #1: A,B,C,D,E,F,G,H,I,J,K,L,M

File #2: B,D,K,L

[File #1] - [File #2] = A,C,E,F,G,H,I,J,M

This shouldn't be too hard to do. Is there a tool that does this?

9 months ago, Jennifer Hillman Jackson (United States) wrote:


Would keeping just the specified columns from File #1 be enough? If so, the Cut tool could be used.

If you also need to filter, removing the lines from File #1 that also exist in File #2 (matched on all of the common keys), try this:

  1. Restructure each file so that columns B, D, K, L are combined into a single column.
    • Use the tool Merge Columns together.
    • You'll do this twice, once for each file.
    • This creates a new common "combined" field in the last column of each file. The original columns are left intact.
  2. Filter File #1 using File #2 on the common field created in step 1.
    • Use the tool Join two files with the option "Output lines appearing in" set to: 1st but not in 2nd file.
  3. The result will have all of the original columns from File #1 plus the combined key used for filtering.
    • Use the Cut tool to just keep the columns you want.
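
For anyone who wants to check the logic outside Galaxy, the steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name subtract_rows, the key_cols parameter, and the in-memory lists standing in for the two files are all hypothetical, and this is not how the Galaxy tools are implemented. A tuple of the key columns plays the role of the "combined" column from step 1.

```python
def subtract_rows(rows1, rows2, key_cols):
    """Return the rows of rows1 whose key is absent from rows2.

    rows1, rows2: lists of rows, each row a list of column values.
    key_cols: zero-based column indexes that make up the key
              (hypothetical parameter, chosen for this sketch).
    """
    def key(row):
        # Analogue of step 1: combine the key columns into one value.
        return tuple(row[i] for i in key_cols)

    # Collect every combined key that occurs in the second data set.
    keys2 = {key(row) for row in rows2}

    # Analogue of step 2: keep rows that appear in the 1st but not in the 2nd.
    return [row for row in rows1 if key(row) not in keys2]


# The single-column example from the question.
file1 = [[v] for v in "ABCDEFGHIJKLM"]
file2 = [[v] for v in "BDKL"]
result = subtract_rows(file1, file2, key_cols=[0])
print(",".join(row[0] for row in result))  # A,C,E,F,G,H,I,J,M
```

Because the function returns the original rows unchanged, no extra key column has to be cut away afterwards, so step 3 has no counterpart in the sketch.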

If this is something that you'll be doing more than once, placing these steps into a workflow would let you rerun them all in one step. The Galaxy 101 tutorial shows how to create workflows.

All tutorials:

Thanks! Jen, Galaxy team
