Question: Help removing part of a sequence
Briony wrote (3 months ago):

Hi, I am trying to trim out part of a sequence so I can align it better to a reference sequence. I can remove the unwanted sequence at the end of my read but not the sequence at the beginning, which starts from the second base in. If anyone has any ideas about how to remove parts of a sequence, that would be great!

Jennifer Hillman Jackson (United States) wrote (3 months ago):

Hello,

There are a few tools under the group NGS: QC and manipulation that can filter/clip reads.

The tool Trim sequences (Galaxy Version 1.0.0) can be used to trim specific regions by base positions.

Thanks! Jen, Galaxy team

Briony wrote (3 months ago):

Hi, thanks for your suggestion, Jen. However, that only trims the ends off sequences. I want to remove a small section in the middle of the whole sequence. Thanks


Hi - That is an unusual function. Is there a compelling reason to keep that first base? If you remove sequence in the middle and join the two ends, that discontinuous first base will just be ignored by the aligner unless included in the alignment by chance.

Reply written 3 months ago by Jennifer Hillman Jackson

It's part of a primer that the other sequences I'm comparing to don't have, so for an easier alignment I would like to remove it. However, the sequences I'm aligning to have the first 2 bases, so I need to remove bases 3-22.

Reply written 3 months ago by Briony

Ok - you could do this:

  • On the original file, trim the 5' end (bases 1-22). This keeps the 3' end after the primer.
  • Again on the original file, trim the 3' end (the length of the 3' sequence plus the length of the primer, i.e. everything after base 2). This keeps the two 5' bases.
  • Use FASTQ joiner to merge the two ends together.

There are some find/replace tools in Galaxy, but these won't work well on FASTQ-formatted data.
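
Outside Galaxy, the same three steps (keep the 5' bases, keep the 3' end, join them) can be sketched in plain Python. This is only a rough illustration, not a Galaxy tool: the function names `remove_region` and `process_fastq` are made up for this example, and it assumes simple 4-line FASTQ records where every read carries the primer at the same positions (bases 3-22, 1-based). The quality string is cut at the same positions so it stays the same length as the sequence.

```python
def remove_region(seq, qual, start, end):
    """Remove 1-based inclusive positions start..end from a read.

    The quality string is cut at the same positions so that it
    stays aligned with the sequence.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("region falls outside the read")
    left = slice(0, start - 1)   # bases before the region (the 5' bases)
    right = slice(end, None)     # bases after the region (the 3' end)
    return seq[left] + seq[right], qual[left] + qual[right]


def process_fastq(lines, start=3, end=22):
    """Yield FASTQ lines with the region removed from every read.

    Assumes each record is exactly four lines:
    header, sequence, '+' line, quality.
    """
    records = [line.rstrip("\n") for line in lines]
    for i in range(0, len(records) - 3, 4):
        header, seq, plus, qual = records[i:i + 4]
        new_seq, new_qual = remove_region(seq, qual, start, end)
        yield from (header, new_seq, plus, new_qual)
```

To use it on a file, something like `open("reads.fastq")` could be passed to `process_fastq` and the yielded lines written out one per line; the file name here is just a placeholder.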

Reply written 3 months ago by Jennifer Hillman Jackson