Question: Fastq Groomer And Compute Quality Statistics
John David Osborne wrote (7.5 years ago):
I noticed that for our new Illumina data (which are generated in Sanger format) the FastQ Groomer output is identical to the Illumina FastQ input file. I was hoping to use the raw FastQ files directly as input (saving disk space) for computing quality statistics to look at box plots, but the tool "Compute Quality Statistics" appears to require that the data have been run through FastQ Groomer first. Is there a way to get around this, and is this a bug? I assume this is some sort of safety measure built into this tool? -John
fubar wrote (7.5 years ago):
You can avoid the space/time overhead of grooming and get comprehensive QC reports using the new wrapper for FastQC (under NGS: QC) - it takes fastq of any flavour (and bam) groomed or not, producing a superset of the compute quality stats output without the need for an intermediate step. Highly recommended. -- Ross Lazarus MBBS MPH; Associate Professor, Harvard Medical School; Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850; Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;
Thanks Ross, I don't see it under my local install - are there any pre-written scripts to integrate it with a local galaxy instance? I assume you are talking about this tool here: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/ -John
Reply written 7.5 years ago by John David Osborne
Hi, John. It's on main and test, i.e. the FastQC wrapper is distributed with the current stable and central branches, so your local tool_conf.xml may be out of date, since it's not automagically refreshed from the distro .sample. If you do a diff of your local tool_conf.xml with the current distributed sample, you should see the lines you need to add, which point to rgenetics/rgFastQC.xml:

    grep -i fastqc tool_conf.xml
    <label text="FastQC: fastq/sam/bam" id="fastqcsambam"/>
    <tool file="rgenetics/rgFastQC.xml"/>

Like everything else, you'll want to install the jar locally so it can be found by the cluster. The default location is tool-data/shared/jars/FastQC, so the tool can find the fastqc perl script (yes, I know...but it's worth it!):

    <command interpreter="python"> rgFastQC.py -i $input_file -d $html_file.files_path -o $html_file -n "$out_prefix" -f $input_file.ext -e ${GALAXY_DATA_INDEX_DIR}/shared/jars/FastQC/fastqc

I hope this helps? -- Ross Lazarus MBBS MPH; Associate Professor, Harvard Medical School; Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850; Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;
Reply written 7.5 years ago by fubar
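The comparison of a local tool_conf.xml against the distributed sample, as suggested in the reply above, could be sketched roughly as follows. This is only an illustration: the two XML snippets are made-up stand-ins for the real files (only the rgenetics/rgFastQC.xml entry and its label come from the thread), and `tool_files` and `missing_tools` are hypothetical helper names, not part of Galaxy.

```python
import xml.etree.ElementTree as ET


def tool_files(xml_text):
    """Return the set of 'file' attributes of every <tool> element."""
    root = ET.fromstring(xml_text)
    return {t.get("file") for t in root.iter("tool") if t.get("file")}


def missing_tools(local_xml, sample_xml):
    """List tool files present in the sample config but absent locally."""
    return sorted(tool_files(sample_xml) - tool_files(local_xml))


# Hypothetical, heavily shortened configs used only for illustration.
LOCAL = """<toolbox>
  <section name="NGS: QC and manipulation" id="ngs_qc">
    <tool file="fastq/fastq_groomer.xml"/>
  </section>
</toolbox>"""

SAMPLE = """<toolbox>
  <section name="NGS: QC and manipulation" id="ngs_qc">
    <label text="FastQC: fastq/sam/bam" id="fastqcsambam"/>
    <tool file="rgenetics/rgFastQC.xml"/>
    <tool file="fastq/fastq_groomer.xml"/>
  </section>
</toolbox>"""

missing = missing_tools(LOCAL, SAMPLE)
print(missing)
```

With real files, the same functions could be fed the contents of the local tool_conf.xml and of the distributed sample; the entries reported as missing are candidates to copy across by hand.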
Peter Cock wrote (7.5 years ago):
If you know your data is already in Sanger FASTQ format, you can say this when uploading the data into Galaxy. Or, use the "pencil" icon to edit the attributes and change the file type. This doesn't change the file itself on disk. Peter
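For anyone unsure whether a file really is Sanger-encoded before setting the type by hand, a rough check is to look at the range of quality characters. The sketch below is a heuristic, not what Galaxy or FastQ Groomer does internally: it assumes unwrapped four-line FASTQ records, and `guess_fastq_encoding` is a hypothetical name. Characters below ASCII 59 can only occur in Sanger (Phred+33) files; if none are seen, the file is either Phred+64 or a Sanger file with uniformly high qualities, so the answer stays ambiguous.

```python
import io


def guess_fastq_encoding(handle, max_records=10000):
    """Guess the quality encoding from the lowest quality character seen.

    Assumes each record is exactly four lines (no wrapped sequences).
    """
    lowest = None
    for i, line in enumerate(handle):
        if i // 4 >= max_records:
            break
        if i % 4 == 3:  # fourth line of each record holds the qualities
            qual = line.rstrip("\r\n")
            if qual:
                line_min = min(ord(c) for c in qual)
                lowest = line_min if lowest is None else min(lowest, line_min)
    if lowest is None:
        return "unknown"
    if lowest < 59:
        # Only the Sanger (Phred+33) encoding uses characters this low.
        return "sanger"
    return "ambiguous_or_phred64"


# Two made-up single-record files for illustration.
sanger_like = io.StringIO("@r1\nACGTA\n+\nIIII#\n")
phred64_like = io.StringIO("@r1\nACGT\n+\nhhhh\n")

print(guess_fastq_encoding(sanger_like))
print(guess_fastq_encoding(phred64_like))
```

A "sanger" result supports setting the file type directly as described above; an ambiguous result is a reason to check how the data were produced before skipping the groomer.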
Tilahun Abebe wrote (7.5 years ago):
Hi guys, We are trying to load Illumina data into our local Galaxy instance. The files are between 700 MB and 2.2 GB. Files below 2 GB load in less than 5 minutes. Files larger than 2 GB don't upload at all. We installed Galaxy locally because we thought loading files would be faster than on the server version. Any suggestions to solve this problem are highly appreciated. Tilahun