Manipulate Fastq Question


7.2 years ago by

graham etherington (TSL) • 100 wrote:

Hi,

I currently have read names with the format:

    @N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:

and would like to change them to the format:

    @N57638:1:64JU0AAXX:1:1:1057:943/1

I use Manipulate FASTQ on all reads and set 'Manipulate Reads on:' to 'Name/Identifier' ('String Translate' becomes the only option). I then set the 'From:' field to '1:Y:0:' and the 'To:' field to '/1' (without the literal quotes).

I get the following error:

    Traceback (most recent call last):
      File "/home/home/galaxy/software/galaxy-central/tools/fastq/fastq_manipulation.py", line 37, in <module>
        main()
      File "/home/home/galaxy/software/galaxy-central/tools/fastq/fastq_manipulation.py", line 25, in main
        new_read = fastq_manipulator.match_and_manipulate_read( fastq_read )
      File "/home/home/galaxy/software/galaxy-central/database/job_working_directory/942/tmpgp13Qy", line 15, in match_and_manipulate_read
        new_read = manipulate_read( fastq_read )
      File "/home/home/galaxy/software/galaxy-central/database/job_working_directory/942/tmpgp13Qy", line 8, in manipulate_read
        new_read.identifier = "@%s" % new_read.identifier[1:].translate( maketrans( binascii.unhexlify( "313a593a303a" ), binascii.unhexlify( "2f31" ) ) )
    ValueError: maketrans arguments must have same length

So, do the From and To fields really need to be the same length? This seems rather strange and unhelpful. Am I doing something wrong?

Many thanks,
Graham

Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory,
Norwich Research Park,
Norwich NR4 7UH.
UK
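For reference, the error in the traceback can be reproduced outside Galaxy. The hex strings decode to the 'From:' and 'To:' values, and a translation table maps characters one-for-one, which is why the two arguments must be the same length. The sketch below is only an illustration: it assumes Python 3 and uses `bytes.maketrans` in place of the Python 2 `maketrans` call shown in the traceback, and `try_translate` is a hypothetical helper, not part of the Galaxy tool.

```python
import binascii

# The hex strings from the traceback decode to the 'From:' and 'To:' fields.
src = binascii.unhexlify("313a593a303a")  # b"1:Y:0:"
dst = binascii.unhexlify("2f31")          # b"/1"


def try_translate(name, src, dst):
    """Hypothetical helper: attempt a per-character translation."""
    try:
        # A translation table pairs src[i] with dst[i], so the two
        # arguments must have the same length.
        table = bytes.maketrans(src, dst)
    except ValueError as exc:
        return "error: %s" % exc
    return name.translate(table)


name = b"N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:"

# Fails: 6 characters cannot be paired with 2.
result_translate = try_translate(name, src, dst)

# Works when lengths match, but only swaps single characters.
result_equal = try_translate(b"ACGT", b"AC", b"TG")

# A substring replacement (including the leading space) is what the
# desired identifier change actually needs.
result_replace = name.replace(b" " + src, dst)
```

So 'String Translate' appears to be a character-by-character substitution rather than a search-and-replace on whole strings.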

galaxy • 733 views


modified 7.2 years ago by Jennifer Hillman Jackson ♦ 25k • written 7.2 years ago by graham etherington (TSL) • 100

7.2 years ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hi Graham,

This may be the long way around the transformation, but the workflow shared here will convert the identifiers without requiring any programming or regular expression knowledge:

http://main.g2.bx.psu.edu/u/jen-bx-galaxy-edu/w/transform-fastq-nameidentifer

To use this:

1 - Log into Galaxy and switch histories to one containing this dataset (if needed).
2 - Click on the link above.
3 - Click on "Import workflow" at the top of the page, right of center, next to the green "+" icon.
4 - On the "Import successful" page, click on "start using this workflow".
5 - On the "Your workflows" page, click on the down arrow at the end of "imported: Transform fastq name/identifier" to open the menu, then click on "Run" (second choice in the list). If you ever need to reach this page again, just click on "Workflow" in the top menu bar.
6 - Your history from step 1 will now display with the workflow in the center panel.
7 - Set "Step 1: Input dataset", annotated as "CASAVA 1.8+ FASTQ file", to the FASTQ file with identifiers like "@N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:".
8 - Click on "Run workflow".

When the workflow has run to completion, the intermediate datasets will be hidden, leaving only the final dataset: a groomed FASTQ file (using quality score type "Sanger").

Hopefully this helps. Feel free to make changes; the imported copy of the workflow is yours to modify.

Best,
Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support
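For anyone who would rather do the same renaming outside Galaxy, a rough sketch in Python follows. It is not what the shared workflow does internally, and it does no quality-score grooming. It assumes four-line FASTQ records and identifiers with a single space before the "read:filter:control:index" part, as in the example above. The names `to_slash_style` and `convert_fastq` are made up for illustration.

```python
import re

# Assumed identifier layout: "@<name> <read>:<Y|N>:<control>:<index>",
# where <index> may be empty, as in "@... 1:Y:0:".
_CASAVA_HEADER = re.compile(r"^(@\S+) ([12]):[YN]:\d+:\S*$")


def to_slash_style(header):
    """Rewrite '@name 1:Y:0:' as '@name/1'; leave other headers unchanged."""
    match = _CASAVA_HEADER.match(header)
    if match is None:
        return header
    return "%s/%s" % (match.group(1), match.group(2))


def convert_fastq(lines):
    """Yield FASTQ lines with every identifier line (1st of 4) rewritten."""
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        # Only the first line of each four-line record is an identifier;
        # quality lines may also start with '@' and must not be touched.
        yield to_slash_style(line) if index % 4 == 0 else line


record = [
    "@N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:\n",
    "ACGT\n",
    "+\n",
    "@III\n",
]
converted = list(convert_fastq(record))
```

To process a real file, the same generator could be fed an open file handle and its output written line by line to a new file.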

written 7.2 years ago by Jennifer Hillman Jackson ♦ 25k
