Hi folks,

Apologies up front for a newbie question. I have a set of short gene-targeting sequence strings for CRISPRs and TALs. I want to compare them against all the transcripts for a given gene and identify how many transcripts each one targets. I can write scripts to do this, but would like to try doing it in Galaxy.
Would the workflow look like this?

1. For each gene, download its transcript set from UCSC.
2. Upload the target sequences.
3. Use the Text Manipulation grep tool to search for sequences that match the targets, and record the matches for each transcript.
4. Use the EMBOSS revseq tool to reverse complement each target sequence, so that targets designed against the opposite orientation are also found.
5. Use the Text Manipulation tools to concatenate and format the result file, so that for a given target string I can see which transcripts it was found in.
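For comparison, the scripted version of the steps above would be roughly the following. This is only a minimal Python sketch of the logic I am trying to reproduce in Galaxy: the function names are made up for illustration, it assumes the transcripts arrive as plain FASTA text, and it only looks for exact matches.

```python
# Hypothetical helper names; exact-match search only, no mismatches allowed.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_fasta(text):
    """Parse FASTA text into {transcript_id: sequence}.

    Wrapped sequence lines are joined, so a target that spans a line
    break in the file is still found (a line-by-line grep would miss it).
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def find_targeted_transcripts(targets, transcripts):
    """Map each target ID to the transcript IDs containing it on either strand.

    targets:     {target_id: target_sequence}
    transcripts: {transcript_id: transcript_sequence}
    """
    hits = {}
    for target_id, target in targets.items():
        fwd = target.upper()
        rev = reverse_complement(fwd)
        hits[target_id] = [
            tx_id
            for tx_id, seq in transcripts.items()
            if fwd in seq or rev in seq
        ]
    return hits
```

The number of transcripts targeted by a given string is then just the length of its list in the returned dictionary.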
Does this sound about right? Any gotchas in here? Or better ways to do this? Thanks!