Hi, I am trying to trim out part of a sequence so I can align it better to a reference sequence. I can remove the unwanted sequence at the end of my read but not the sequence at the beginning. The unwanted part starts from the second base in. If anyone has any ideas about how to remove parts of a sequence, that would be great!
Hello,
There are a few tools under the group NGS: QC and manipulation that can filter/clip reads.
The tool Trim sequences (Galaxy Version 1.0.0) can be used to trim specific regions by base positions.
Thanks! Jen, Galaxy team
Hi, thanks for your suggestion, Jen. However, that only trims the ends off sequences. I want to remove a small section in the middle of the sequence. Thanks
Hi - That is an unusual function. Is there a compelling reason to keep that first base? If you remove sequence in the middle and join the two ends, that discontinuous first base will just be ignored by the aligner unless included in the alignment by chance.
It's part of a primer that the other sequences I'm comparing to don't have. So for an easier alignment, I would like to remove it. However, the sequences I'm aligning to have the first 2 bases. So I need to remove bases 3-22.
Ok - you could do this:
- On the original file, trim the 5' end (bases 1-22). This keeps the 3' end after the primer.
- Again on the original file, trim the 3' end from base 3 onward (the primer plus the rest of the 3' sequence). This keeps the two 5' bases.
- Use FASTQ joiner to merge the two ends together
There are some find/replace tools in Galaxy but these won't work well on fastq formatted data.
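For anyone doing this outside Galaxy, the same result (keep bases 1-2, drop bases 3-22, keep the rest) can be sketched in a few lines of Python. This is only a minimal sketch, assuming plain four-line FASTQ records with no wrapped sequence lines and that the same positions should be removed from every read. The function names `excise` and `excise_fastq` are made up for illustration and are not part of Galaxy or any library. The quality string is cut at the same positions so that it stays the same length as the sequence.

```python
from typing import Iterable, Iterator


def excise(text: str, start: int, end: int) -> str:
    """Remove 1-based, inclusive positions start..end from text."""
    return text[:start - 1] + text[end:]


def excise_fastq(lines: Iterable[str], start: int = 3, end: int = 22) -> Iterator[str]:
    """Yield FASTQ lines with positions start..end removed from each read.

    Assumes each record is exactly four lines: header, sequence, '+', quality.
    """
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, seq, plus, qual = record
            yield header
            yield excise(seq, start, end)   # sequence without the primer part
            yield plus
            yield excise(qual, start, end)  # matching quality scores
            record = []
```

To use it, pass an open file handle (or any iterable of lines) to `excise_fastq` and write each returned line, followed by a newline, to the output file.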