Question: What Is The Minimum Quality Should I Set For Filter Fastq?
Du, Jianguang wrote (5.8 years ago):
Dear All, I am analysing RNA-seq datasets for differential splicing events between cell types. Some of my reads contain bad nucleotides. Should I run Filter FASTQ to remove these "not so good" reads? If so, what "Minimum Quality" should I set for the filter? Thanks. Jianguang
Loraine, Ann wrote (5.8 years ago):
Hi,

If you are aligning against a reference genome, I don't think you really need to do this. In my experience, you don't need to filter low-quality bases for splicing or differential expression analysis. I've mainly worked with read lengths of 50 bases or more and genomes where 99% of introns are 5 kb or smaller, so this advice may not apply to your case.

Best wishes,
Ann

Ann Loraine, Ph.D.
Associate Professor, Department of Bioinformatics and Genomics
University of North Carolina at Charlotte
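For readers wondering what a "Minimum Quality" setting does in practice, here is a minimal Python sketch of a per-read quality filter. It is not the Galaxy Filter FASTQ implementation. It assumes Sanger-encoded qualities (Phred offset 33) and plain four-line FASTQ records, and the function names, the threshold of 20, and the `max_low_quality_bases` parameter are illustrative choices, not recommendations from this thread.

```python
import io


def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes Sanger encoding (offset 33); older Illumina files may use 64.
    """
    return [ord(ch) - offset for ch in quality_line]


def filter_fastq(handle, min_quality=20, max_low_quality_bases=0):
    """Yield four-line FASTQ records that pass the quality filter.

    A record passes when at most `max_low_quality_bases` of its bases
    have a Phred score below `min_quality`.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        low = sum(1 for q in phred_scores(qual) if q < min_quality)
        if low <= max_low_quality_bases:
            yield header, seq, plus, qual


# Made-up example: 'I' decodes to Phred 40, '#' decodes to Phred 2.
example = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\nII#I\n"
kept = list(filter_fastq(io.StringIO(example), min_quality=20))
print([record[0] for record in kept])  # ['@read1']
```

Raising `max_low_quality_bases` keeps reads with a few poor bases, which is closer to the advice above that strict filtering is often unnecessary before alignment.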
Mathew Bunj wrote (5.7 years ago):
I have ChIP-seq data that has been aligned against a bacterial genome. I am trying to figure out how I can use the MACS peak caller on the Galaxy main server. Do I need to select the bacterial genome (in the genome option of data upload) when uploading the data? Could someone tell me whether I can add my own custom genome for the MACS program within Galaxy main? Thanks
Hello Mathew,

If you already have mapped your data, then you can just upload the BAM/SAM dataset(s), sort if necessary, leave the database unassigned, and run MACS. This workflow has an example of how to sort a BAM file and send to MACS - you don't have to use this exactly, in fact the settings (especially for MACS) are likely not appropriate. Just examine the general sort rules and use the parts of it that make sense for your purposes, and run the tools independently or modify to create your own workflow:
http://main.g2.bx.psu.edu/u/jen-bx-galaxy-edu/w/sort-bam-for-peak-calling-macs-tool

If you want to convert SAM-to-BAM (not really necessary) or when starting with raw sequence data that needs to be mapped (or find that you want to map it again), then the reference custom genome should be loaded along with the sequence data. Again, leave the database unassigned for all. The general protocol is covered in #3 from the Using Galaxy paper (make adjustments for tag size, effective genome size, etc. as needed, using the MACS documentation linked from the tool's page as a guide):
http://main.g2.bx.psu.edu/u/galaxyproject/p/using-galaxy-2012

To prepare, load, troubleshoot, and use a custom reference genome with tools (such as mapping tools), please see this wiki and the links it points to:
http://wiki.g2.bx.psu.edu/Support#Custom_reference_genome

In short, tool forms that have a custom genome option will ask "Choose the source for the reference list:" or similar - you will select "History" and then select the dataset where your custom reference genome has been uploaded in fasta format and assigned the datatype "fasta". It is very important that the chromosome/scaffold identifiers in the reference genome and those in any other files that refer to it are identical (for example, in a SAM or GTF dataset). This is where doing all of the analysis within Galaxy can be sometimes easier, since our tools maintain this internal data consistency.

This should help to get you started, but please let us know if you need more help as the analysis proceeds,

Best,
Jen
Galaxy team
--
Jennifer Jackson
http://galaxyproject.org
written 5.7 years ago by Jennifer Hillman Jackson
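The point about identical chromosome/scaffold identifiers can be checked before running any tool. Below is a minimal Python sketch, assuming a FASTA reference and an uncompressed SAM file read as lines of text; the function names and the example identifiers are hypothetical and not part of Galaxy. It compares the first whitespace-delimited token of each FASTA header with the reference name column (column 3, RNAME) of the SAM alignment lines.

```python
def fasta_identifiers(lines):
    """Collect sequence identifiers: the first token after '>' on header lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def sam_reference_names(lines):
    """Collect reference names from column 3 (RNAME) of SAM alignment lines."""
    names = set()
    for line in lines:
        # Skip header lines (they start with '@') and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # '*' in RNAME marks an unmapped read with no reference.
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names


def missing_from_reference(fasta_lines, sam_lines):
    """Return SAM reference names that do not appear in the FASTA, sorted."""
    return sorted(sam_reference_names(sam_lines) - fasta_identifiers(fasta_lines))


# Made-up example data: the second alignment uses a name absent from the FASTA.
fasta = [">chrA my custom bacterial genome\n", "ACGTACGT\n"]
sam = [
    "@HD\tVN:1.0\tSO:coordinate\n",
    "r1\t0\tchrA\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchrA_alt\t5\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
print(missing_from_reference(fasta, sam))  # ['chrA_alt']
```

An empty result means every reference name used in the alignments exists in the custom genome; any name listed would need to be reconciled before downstream tools can use the two files together.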
Thanks, Jen, for the detailed explanation. One question: if I run MACS on my pre-aligned ChIP-seq reads, will I be able to annotate the peaks from MACS using either 'Fetch closest non-overlapping feature' or 'Profile Annotation'? In summary, is there a way to annotate peaks with bacterial genes using any other tool in Galaxy? Thanks, Mathew
written 5.7 years ago by Mathew Bunj
Hello Mathew, An interval file of annotations and most of the tools under the group 'Operate on Genomic Intervals' can be used to annotate (find overlap with) an interval/BED file of your peaks, with the exception of the 'Profile Annotation' tool. That tool works only on certain genomes, not on custom genomes. Take care, Jen Galaxy team -- Jennifer Jackson http://galaxyproject.org
written 5.7 years ago by Jennifer Hillman Jackson
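To make "find overlap with" concrete, here is a minimal Python sketch of annotating peaks against an interval file of genes. It is not how the Galaxy 'Operate on Genomic Intervals' tools are implemented. It assumes BED-style coordinates (0-based, half-open) already parsed into tuples, and the function names and example records are hypothetical.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True when two half-open intervals [start, end) share at least one base."""
    return start_a < end_b and start_b < end_a


def annotate_peaks(peaks, features):
    """Pair each peak with the names of overlapping features on the same chromosome.

    Both inputs are lists of (chrom, start, end, name) tuples.
    This is a simple all-against-all scan, fine for small bacterial genomes.
    """
    annotated = []
    for chrom, start, end, name in peaks:
        hits = [
            f_name
            for f_chrom, f_start, f_end, f_name in features
            if f_chrom == chrom and overlaps(start, end, f_start, f_end)
        ]
        annotated.append((name, hits))
    return annotated


# Made-up example records on a custom genome with one chromosome.
peaks = [("chrA", 100, 200, "peak1"), ("chrA", 500, 600, "peak2")]
genes = [("chrA", 150, 400, "geneA"), ("chrA", 600, 900, "geneB")]
print(annotate_peaks(peaks, genes))
# [('peak1', ['geneA']), ('peak2', [])]
```

In the example, peak2 ends at position 600 and geneB starts at 600, so under half-open coordinates they touch but do not overlap. The chromosome names in the peak file and the annotation file must match exactly for any hits to be found.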