Question: FASTQ trimmer trimmed 1 nt too much of sequence but right amount of quality
james.lloyd (United States) wrote, 4.4 years ago:

Hi,

 

I ran two FASTQ files (I had just changed their datatype to fastqcssanger) through FASTQ Trimmer. When I checked a few reads in the output by hand, I noticed the sequence length had decreased to 35, not the 36 I intended (reads 100 nt long, trimmed by 64 from the 3' end).

 

Strangely, the quality string was 36 characters long for the reads that I checked. Is there a simple reason why this might have occurred?

 

Any help or suggestions would be greatly appreciated! 

James

Modified 4.4 years ago by Jennifer Hillman Jackson • written 4.4 years ago by james.lloyd

Here is an example to make it clear what I mean.

Pre-trim

@HS2:90:B09PCABXX:4:1101:1216:2115 1:Y:0: 
NTAGAATTCCAGGTGTAGCGGNGAAATNNNNNGNGATCTGNNNNNNNACCGATGGCGAAGGCAGCCATCTGGCCTAATACTGACACTGAGGTGCGAAAGC 
+ 
#################################################################################################### 

Post-trim

@HS2:90:B09PCABXX:4:1101:1216:2115 1:Y:0: 
TAGAATTCCAGGTGTAGCGGNGAAATNNNNNGNGA 
+ 
#################################### 
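
A mismatch like this can be checked across a whole file rather than by eye. The following is a minimal Python sketch, assuming a plain four-lines-per-record FASTQ with no wrapped sequence lines; the function name `find_length_mismatches` is illustrative and not part of any Galaxy tool.

```python
from typing import Iterable, Iterator, Tuple


def find_length_mismatches(lines: Iterable[str]) -> Iterator[Tuple[str, int, int]]:
    """Yield (read_id, seq_len, qual_len) for records whose lengths differ.

    Assumes exactly four non-empty lines per record.
    """
    stripped = [line.strip() for line in lines if line.strip()]
    for i in range(0, len(stripped) - 3, 4):
        header, seq, _plus, qual = stripped[i:i + 4]
        if len(seq) != len(qual):
            # Report only the read ID, not the rest of the header line.
            yield header.split()[0], len(seq), len(qual)


# The post-trim record shown above.
record = [
    "@HS2:90:B09PCABXX:4:1101:1216:2115 1:Y:0:",
    "TAGAATTCCAGGTGTAGCGGNGAAATNNNNNGNGA",
    "+",
    "#" * 36,
]

for read_id, seq_len, qual_len in find_length_mismatches(record):
    print(read_id, seq_len, qual_len)
```

For a real file, the same function can be fed an open file handle, since it only iterates over lines.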

 

Reply written 4.4 years ago by james.lloyd

Jennifer Hillman Jackson (United States) wrote, 4.4 years ago:

Hello,

Both the "FASTQ Trimmer" and "FASTQ Quality Trimmer" will remove the adaptor base (or what the tool thinks is the adaptor base). 

This is noted on the tool form, in the usage notes under the execution button and primary explanation, next to the yellow "!" icon:
"Trimming a color space read will cause any adapter base to be lost."

You can try the tool "Trim sequences" instead. Or, maybe go back and find out where the adaptor was lost before upload (other pre-processing may have altered the sequence, and maybe that can be modified). 
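
To illustrate how losing an adapter base gives 35 bases against 36 quality values, here is a simplified Python model. It is a sketch under an assumption, not the Galaxy tool's actual code: in color space data the first sequence character is an adapter base with no quality value of its own, so a color-space-aware trimmer drops it from the sequence line only. Both function names are hypothetical.

```python
from typing import Tuple


def trim_3prime(seq: str, qual: str, n: int) -> Tuple[str, str]:
    """Base-space model: drop n characters from the 3' end of both lines."""
    return seq[:max(0, len(seq) - n)], qual[:max(0, len(qual) - n)]


def trim_3prime_color_space(seq: str, qual: str, n: int) -> Tuple[str, str]:
    """Color-space model (assumed): the leading character is treated as an
    adapter base and is lost, then both lines are trimmed by n."""
    colors = seq[1:]  # adapter base removed from the sequence only
    return colors[:max(0, len(colors) - n)], qual[:max(0, len(qual) - n)]


# The 100 nt pre-trim read from the question, with a quality line of equal length.
seq = ("NTAGAATTCCAGGTGTAGCGGNGAAATNNNNNGNGATCTGNNNNNNNACCGATGGCGAAGGCAGCC"
       "ATCTGGCCTAATACTGACACTGAGGTGCGAAAGC")
qual = "#" * len(seq)

cs_seq, cs_qual = trim_3prime_color_space(seq, qual, 64)
bs_seq, bs_qual = trim_3prime(seq, qual, 64)

print(len(cs_seq), len(cs_qual))  # 35 36: the mismatch reported in the question
print(len(bs_seq), len(bs_qual))  # 36 36: the intended result
```

Under this model, base-space reads labeled as color space lose their first real base, which matches the post-trim read in the question starting at the second nucleotide.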

Hopefully one of these options works out for you. I realize this may be frustrating; by default, tools are configured to interpret the most common variant of each datatype, so data that differs may need some adjustment at upload. It also can't hurt to run FastQC to confirm the quality score type, if you are not 100% certain.
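
FastQC reports the encoding for you. As a rough illustration of the kind of check involved, here is a Python sketch that guesses the quality offset from the range of ASCII codes in the quality lines. The thresholds are a commonly used heuristic, not FastQC's or Galaxy's exact logic, and `guess_quality_offset` is a hypothetical name.

```python
from typing import Iterable


def guess_quality_offset(quality_lines: Iterable[str]) -> str:
    """Guess the FASTQ quality encoding from the lowest character seen.

    Heuristic only: a file containing nothing but high scores can be ambiguous.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters supplied")
    lowest = min(codes)
    if lowest < 33 or max(codes) > 126:
        raise ValueError("characters outside the printable FASTQ range")
    if lowest < 59:
        return "phred+33"   # Sanger-style; codes this low cannot be +64
    if lowest < 64:
        return "solexa+64"  # Solexa scores can go below zero
    return "phred+64 or ambiguous"


# The quality line from the read in the question is all '#' (ASCII 35).
print(guess_quality_offset(["#" * 36]))  # phred+33
```

Sampling many reads rather than one makes the guess more reliable, since a single read may not contain any low scores.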

Best, Jen, Galaxy team

 

Written 4.4 years ago by Jennifer Hillman Jackson

Hi, Thanks for your reply. 

 

I am surprised that FASTQ Trimmer would remove a single base as an adaptor but leave the associated quality value behind, so that the sequence and quality lengths no longer match.

 

I have just tried Trim sequences (version 1.0.0), but for some unknown reason I cannot select my sequences as input. I changed the datatype from fastqcssanger to fastq and it made no difference, even though the tool says that it accepts FASTA/Q files.

 

Many thanks,

James

Reply written 4.4 years ago by james.lloyd

I have now changed the format to fastqsanger and the tool can see it so I will give it a try tomorrow. 

 

Thanks again,

James

Reply written 4.4 years ago by james.lloyd

Changing the input format from fastqcssanger to fastqsanger made the difference, and I now get 36 characters for both the sequence and the quality data. Thanks for the help. I will try to be more careful with things like this in the future.

Reply written 4.4 years ago by james.lloyd

I'm so glad that worked out for you. Datatype can be a tricky item to assign, not only for data from public sources but even for in-house sequencing. So much depends on getting these early QA steps correct: where the Support help states that this is the #1 underlying reason for analysis problems, that isn't an exaggeration. We have a lot of help on tool forms, Vimeo, the wiki, tutorials, Pages, etc., but of course not all use cases are covered; this is a constant work in progress. If the wiki doesn't cover something you think is important, please feel free to become an editor and create a page explaining it (just point me to it and I can link it into the appropriate hub), or send me the details and we can collaborate on format/content and get it added. Some redundancy, from a different perspective or example case, is totally fine. Others will surely benefit, and your wiki account name will be on the list of contributors for whatever is created (editors are tracked/credited). You can even create "Tutorial" help here in Galaxy Biostar, and I'll create a wiki match and/or link the two together for easy access from either side.

Anyone else reading this - same offer is open for all, across all topics/tools (including Tool Shed tools, or those on other Public Galaxies)! Sharing knowledge/experience is a wonderful gift to our open source (and open "protocol") community! :) 

Thanks! Jen, Galaxy team

Reply written 4.4 years ago by Jennifer Hillman Jackson

Thanks! If I get to the point where I feel I can add something, or find a gap in the resource info that needs filling, I will be in touch. One issue is that I am new to the lab and am starting a project using old data from another post-doc. That data has been analyzed before, but not in exactly the way I need, so I am not familiar with the early QA steps. I hope to generate my own data for this project soon, which should give me a better handle on what the data is. Until then I must mine this data to find a direction and to see if the project is feasible.

Thanks again, this is a great resource, and I hope to use it to answer a few questions that have started bubbling in my mind now that I have started down the computational route. James

Reply written 4.4 years ago by james.lloyd

Thanks James, I am so glad this worked. Your kind comments make what we do every day, together with our generous extended open source community, feel special. We think it is, but it is nice to know we are on the right track!

Whether using inherited data, public data, or even your own data that wasn't as well annotated as you wish you might have done way back when (!), there is almost always a way to figure it out and move forward. Do let us know whenever we can help again! Then I can't wait to see you handing out the advice from your vast experience, after navigating through your current project (yes, you can!) :)

All the best, Jen, Galaxy team

Reply written 4.4 years ago by Jennifer Hillman Jackson