Question: how to filter the "missense_variant"
silarissa (Belarus, Minsk) wrote 2.7 years ago:

Please tell me how to filter for "missense_variant" in a case like the one shown below:

...BaseQRankSum=1.901;CSQT=TTN|NM_001267550.1|missense_variant,TTN-AS1|NR_038272.1|intron_variant:....

I'm trying to do this with VCFfilter, but without success. I do not want to convert the VCF format to tabular. I would be grateful for any help.

Jennifer Hillman Jackson (United States) wrote 2.7 years ago:

Hello,

One simple method to achieve this is to use the tool "Select" with the regular expression:

.*missense_variant.*

This will select the rows that contain the term, or the rows that do not contain it, depending on how the tool form option "Matching/Not Matching" is set.
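For readers who want to see the matching logic outside Galaxy, here is a minimal Python sketch of the "Matching/Not Matching" behaviour. It is only an illustration, not what the Select tool runs internally; the function name `select_lines` and the example records are made up for this sketch.

```python
import re

def select_lines(lines, pattern, matching=True):
    """Return the lines that match (matching=True) or do not match
    (matching=False) the given regular expression."""
    regex = re.compile(pattern)
    return [line for line in lines if bool(regex.search(line)) == matching]

# Made-up example records, for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "chr2\t100\t.\tA\tG\t50\tPASS\tCSQT=TTN|NM_001267550.1|missense_variant",
    "chr2\t200\t.\tC\tT\t50\tPASS\tCSQT=TTN|NM_001267550.1|synonymous_variant",
]

# "Matching": keep only rows containing the term (the header line is dropped).
kept = select_lines(example, r".*missense_variant.*", matching=True)

# "Not Matching": drop rows containing the term (the header line survives).
removed = select_lines(example, r".*missense_variant.*", matching=False)
```

Note that in the "Matching" case the header line is lost, which is why the header has to be added back afterwards.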

If the goal is to select and remove the rows with the term (Not Matching), the remaining VCF file will be intact with headers.

If the goal is to select and keep the rows with the term (Matching), the output will contain only these rows, but will be a VCF file without a header. To add the header back, use "Select" again to retrieve the header rows with this regular expression:

^#

The result will be a file containing just the header. Combine this header dataset with the dataset containing rows of "missense_variant" content using the tool "Concatenate".
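The select-header, select-rows, concatenate sequence above can also be sketched in a few lines of Python, for anyone working on a local copy of the file. This is a rough equivalent under stated assumptions, not part of Galaxy: the function name `keep_header_and_term` and the sample lines are hypothetical, and the term is matched as a plain substring anywhere in the record.

```python
def keep_header_and_term(vcf_lines, term="missense_variant"):
    """Keep all header lines (starting with '#') plus the data rows
    that contain the given term, preserving the original order."""
    # Step 1: the equivalent of Select with the expression ^#
    header = [line for line in vcf_lines if line.startswith("#")]
    # Step 2: the equivalent of Select (Matching) on the data rows
    records = [
        line for line in vcf_lines
        if not line.startswith("#") and term in line
    ]
    # Step 3: the equivalent of Concatenate, header first
    return header + records

# Made-up sample lines, for illustration only.
sample = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr2\t100\t.\tA\tG\t50\tPASS\tCSQT=TTN|NM_001267550.1|missense_variant",
    "chr2\t200\t.\tC\tT\t50\tPASS\tCSQT=TTN|NM_001267550.1|synonymous_variant",
]

filtered = keep_header_and_term(sample)
```

The result keeps both header lines and only the missense row, so it remains a valid VCF layout without any conversion to a different tabular structure.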

For any of this, change the datatype back to VCF as needed using the final result dataset's pencil icon -> Edit Attributes -> Datatype (many text manipulation tools output datasets with the datatype "tabular" by default).

All of this can be placed into a Workflow for re-use. Try "Extract Workflow", then edit/annotate as desired. A datatype reassignment is an output option that can be applied to any tool in the workflow. In this specific case, it will be applied to the last tool used in this series of manipulations. Other tools can precede these or be included after them to create a complete analysis pipeline, or you can keep it distinct (sometimes modular data manipulation workflows have greater long-term utility).

Use the search function at the top of the tool panel to locate tools by name.

Hopefully this helps, Jen, Galaxy team
