4.2 years ago by
United States
Hi Zhenhua,
This is not needed for most analyses. It is generally more straightforward to go ahead and map the sequences as pairs (the statistics from the tool run will report mapping success rates), then filter afterwards for properly paired reads and, optionally, remove unmapped reads. You would want to do this filtering in preparation for variant analysis workflows. For RNA-seq workflows using the Tuxedo tools, however, no filtering is required: Cufflinks/Cuffdiff will only consider mapped pairs that pass the criteria set in the tool form parameters.
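To illustrate what the post-mapping filter does, here is a minimal Python sketch that works on plain SAM text. It relies only on the standard SAM FLAG bits (0x2 = read mapped in a proper pair, 0x4 = read unmapped). The function names `keep_record` and `filter_sam` are made up for this example; this is not how the Galaxy filter tools are implemented, just the same idea in a few lines.

```python
# Standard SAM FLAG bits (from the SAM specification).
PROPER_PAIR = 0x2  # each segment properly aligned according to the aligner
UNMAPPED = 0x4     # this read is unmapped


def keep_record(sam_line, drop_unmapped=True):
    """Return True if an alignment line passes the filter (hypothetical helper)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # FLAG is the second SAM column
    if not flag & PROPER_PAIR:
        return False
    if drop_unmapped and flag & UNMAPPED:
        return False
    return True


def filter_sam(lines, drop_unmapped=True):
    """Yield header lines unchanged and only the alignments that pass."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines are always kept
        elif keep_record(line, drop_unmapped):
            yield line
```

For example, a read with FLAG 99 (paired, proper pair, mate reverse, first in pair) is kept, while FLAG 77 (paired, unmapped, mate unmapped) and FLAG 65 (paired but not a proper pair) are dropped.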
However, if you wish to do this at the start anyway (it does make for simpler statistics), one way is the Published workflow 'Create matched paired end datasets' below (created by Dave Clements). I uploaded it to Main just now and have so far only tested it on a CloudMan Galaxy, but I wouldn't expect any problems using it on the public Main Galaxy instance, since the tools included are identical between the two. Any feedback about issues would be appreciated. You can of course modify the workflow yourself after importing it, but if you find a problem we'd love to know so we can fix ours, too (we will be testing it on Main very soon, this week; I am publishing it early for you!).
http://usegalaxy.org/u/galaxyproject/w/re-pair-paired-ends-after-qc-may-have-broken-them-imported-from-uploaded-file
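If it helps to see the idea outside of Galaxy, here is a rough Python sketch of re-pairing two FASTQ files after QC has dropped reads from one side or the other. This is not the workflow's actual implementation; the function names (`base_id`, `read_fastq`, `repair`) are invented for the example, and it assumes well-formed 4-line FASTQ records whose IDs match between files once an optional trailing `/1` or `/2` is removed. The reverse file is held in memory, so it is only meant for small inputs.

```python
def base_id(header):
    """Read name without the leading '@', any comment, or a /1 or /2 suffix."""
    name = header.strip().split()[0][1:]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def read_fastq(lines):
    """Yield (read_id, [4 record lines]) for each FASTQ record."""
    it = iter(lines)
    for header in it:
        seq, plus, qual = next(it), next(it), next(it)
        yield base_id(header), [header, seq, plus, qual]


def repair(forward_lines, reverse_lines):
    """Split reads into matched forward/reverse lists plus leftover singletons."""
    reverse = dict(read_fastq(reverse_lines))  # assumption: fits in memory
    paired_fwd, paired_rev, singles = [], [], []
    matched = set()
    for read_id, record in read_fastq(forward_lines):
        if read_id in reverse:
            paired_fwd.append(record)
            paired_rev.append(reverse[read_id])
            matched.add(read_id)
        else:
            singles.append(record)  # forward read whose mate was removed
    # Reverse reads whose forward mate was removed.
    singles.extend(rec for rid, rec in reverse.items() if rid not in matched)
    return paired_fwd, paired_rev, singles
```

The two paired lists come back in the same order, which is what mappers expect from paired-end input; the singletons can be mapped separately or discarded.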
Hopefully one of these alternatives works out for you! Best, Jen, Galaxy team