Question: Wig To Bigwig Error
6.6 years ago by
Michael Sikes wrote:
Hi,

I have hit a brick wall when trying to convert wig files from GEO to bigWig files. Each time I try (and I have tried many times since October), I get the same error. For example, here is a downloaded wig file that I assigned to the mouse mm8 genome, and the error I got when I tried to convert it to a bigWig file. The dataset came from Bing Ren's lab, and its GEO record is here: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM560344

The wig file was uploaded to Galaxy Dec. 8, 2011, and I assigned mm8 rather than mm9 based on the GEO record:

49: GSM560344_03112009_313D2AAXX_B7.wi
~960,000 lines
format: wig, database: mm8
Info: uploaded wig file
display at UCSC main

The details for this upload are as follows:

Tool: Upload File
Name: GSM560344_03112009_313D2AAXX_B7.wi
Created: Dec 08, 2011
Filesize: 12.1 Mb
Dbkey: mm8
Format: wig
Tool Version:
Tool Standard Output: stdout
Tool Standard Error: stderr

Input Parameter Value
File Format auto
Genome Conditional (files_metadata) 32
Inheritance Chain
GSM560344_03112009_313D2AAXX_B7.wi

The wig-to-bigWig conversion on data 49 (using the wig to bigwig conversion tool in the convert formats toolbox) was run on March 21, 2012 and gave the following error:

77: Wig-to-bigWig on data 49
0 bytes
An error occurred running this job:
line 152351 of stdin: chromosome chr13 has 120614378 bases, but item ends at 120614600
line 298005 of stdin: chromosome chr17 has 95177420 bases, but item ends at 95177625
line 325066 of stdin: chromosome chr16 has 98252459 bases, but item ends at 9825252

The details for this operation are as follows:

Tool: Wig-to-bigWig
Name: Wig-to-bigWig on data 49
Created: Mar 21, 2012
Filesize: 0 bytes
Dbkey: mm8
Format: bigwig
Tool Version:
Tool Standard Output: stdout
Tool Standard Error: stderr

Input Parameter Value
Convert 49: GSM560344_03112009_313D2AAXX_B7.wi
Conditional (settings) 1
Items to bundle in r-tree 256
Data points bundled at lowest level 1024
Clip chromosome positions True
Do not use compression True
Inheritance Chain
Wig-to-bigWig on data 49

I gather that the chromosome ends are not being snipped off, even though I toggle this option on the Galaxy conversion tool. And I know it's doing something, because if I toggle that option off, I get an error that includes "broken pipe" and simply aborts.

I apologize for knowing so little about the bioinformatics involved here. And I'm sure I've overlooked something that is likely obvious to others and/or failed to provide some critical bit of info in this email. But any help would be greatly appreciated.

Thanks,
Mike
galaxy • 3.6k views
modified 6.6 years ago • written 6.6 years ago by Michael Sikes
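The error messages in the question report data items that run past the end of a chromosome. A rough way to screen a file for such items before converting it could look like the Python sketch below. The function name `find_overruns` and the start position in the sample are hypothetical; the chr13 length is taken from the error message above. The sketch assumes a single variableStep track and does not handle fixedStep or BED lines, so it is an illustration of the check rather than what wigToBigWig itself does.

```python
def find_overruns(wig_lines, chrom_sizes):
    """Return (line_number, chrom, item_end, chrom_size) for items past the chromosome end.

    Assumes a single variableStep track; chrom_sizes maps chromosome name to length.
    """
    overruns = []
    chrom, span = None, 1
    for lineno, line in enumerate(wig_lines, start=1):
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if not fields or line.startswith("#") or fields[0] in ("track", "browser"):
            continue
        if fields[0] == "variableStep":
            params = dict(f.split("=", 1) for f in fields[1:])
            chrom = params["chrom"]
            span = int(params.get("span", 1))
            continue
        # Data line: 1-based start position followed by a value.
        start = int(fields[0])
        end = start + span - 1  # last base covered by this item
        size = chrom_sizes.get(chrom)
        if size is not None and end > size:
            overruns.append((lineno, chrom, end, size))
    return overruns


# Hypothetical sample: the second data item extends past the end of chr13.
sample = [
    "track type=wiggle_0",
    "variableStep chrom=chr13 span=25",
    "120614300 1.0",
    "120614576 2.0",
]
print(find_overruns(sample, {"chr13": 120614378}))
```

Any line numbers this reports refer to the file as given, so they may differ slightly from the "line N of stdin" numbers in the tool's error output.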
6.6 years ago by
United States
Jennifer Hillman Jackson wrote:
Hi Michael,

This particular .wig file has a data format problem that is the root cause of the conversion error. Specifically, there is an extra track line in the file. This can be found with grep on the command line, or in Galaxy with the tool "Filter and Sort -> Select" by matching the pattern "track".

Ideally this would be corrected and resubmitted by the data author before use, since how and why the line was inserted, and what impact it has, would need to be examined. Since you noticed conversion problems with other GEO files, verifying the .wig format and making any necessary corrections would also be advised.

Hopefully this helps!

Best,
Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org
http://galaxyproject.org/wiki/Support
written 6.6 years ago by Jennifer Hillman Jackson
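For anyone screening files outside Galaxy, the "match the pattern track" check described in the answer above could be sketched in Python roughly as follows. The function name `find_track_lines` and the sample lines are made up for illustration, and the sketch assumes that track lines begin with the word "track" (the Galaxy Select tool matches the pattern anywhere in the line).

```python
def find_track_lines(lines):
    """Return (line_number, text) for every line that starts with 'track'."""
    return [
        (lineno, line.rstrip("\n"))
        for lineno, line in enumerate(lines, start=1)
        if line.startswith("track")
    ]


# Hypothetical miniature file with two track lines.
sample = [
    'track type=wiggle_0 name="a"\n',
    "variableStep chrom=chr11 span=25\n",
    "3000251 0.6\n",
    'track name="b"\n',
    "chr11 3023275 3023700\n",
]
for lineno, text in find_track_lines(sample):
    print(lineno, text)
```

A file with a single track group should yield exactly one entry; more than one suggests a multi-track file of the kind discussed in this thread.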
Jennifer,

Thanks for your help. I ran the filter and sort tool as advised, and then ran the wig to bigwig on the new history item generated by the filter. This time I got a different error:

84: Wig-to-bigWig on data 83
0 bytes
An error occurred running this job:
stdin is empty of data
Error running wigToBigWig.

83: Select on data 49
1 line, 1 comments
format: wig, database: mm8
Info: Matching pattern: track

Again, I'm sure I left off something obvious. Could you tell me what I did wrong?

Thanks,
Mike

Michael Sikes, Ph.D.
Associate Professor of Immunology
North Carolina State University
Microbiology Department
4524A Gardner Hall
Campus Box 7615
Raleigh, NC 27695
Ph: 919-513-0528
Fax: 919-515-7867
email: mlsikes@ncsu.edu
written 6.6 years ago by Michael Sikes
Hi Mike,

I apologize if I wasn't clear, but the 'Select' was to show you how to identify the multi-track group wig files. I wanted to give you a way to screen similar files going forward.

The wig-to-bigWig program in Galaxy comes from UCSC. It accepts .wig files with a single track group as input: http://genome.ucsc.edu/goldenPath/help/bigWig.html (see step #1)

The data author lab can either submit the data as single track group .wig files, or, if you are confident that the multiple track group .wig format is expected and OK from this source, you can split the file. There are no specific tools in Galaxy to do this, but something like this would work:

- Text Manipulation -> "Add column", "1", Iterate? = yes
- "Select", "track" - note the line numbers of the track lines
- "Remove beginning of a file", using the line numbers and the -original- .wig file, to break it up into individual .wig files.

Good luck!

Jen
Galaxy team

--
Jennifer Jackson
http://galaxyproject.org
written 6.6 years ago by Jennifer Hillman Jackson
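The splitting steps in the reply above can also be approximated outside Galaxy. The Python sketch below starts a new group at every line that begins with "track", so each group holds one track line plus its data. The function name `split_by_track` and the sample lines are hypothetical, and this is only one possible way to do the split, not a Galaxy tool.

```python
def split_by_track(lines):
    """Split lines into groups, starting a new group at each 'track' line."""
    groups = []
    for line in lines:
        # Start a group at each track line (or at the first line if the
        # file does not begin with a track line).
        if line.startswith("track") or not groups:
            groups.append([])
        groups[-1].append(line)
    return groups


# Hypothetical miniature file with two track groups.
sample = [
    'track type=wiggle_0 name="a"\n',
    "variableStep chrom=chr11 span=25\n",
    "3000251 0.6\n",
    'track name="b"\n',
    "chr11 3023275 3023700\n",
]
for index, group in enumerate(split_by_track(sample), start=1):
    print(index, len(group), group[0].strip())
```

Each group could then be written to its own file and inspected before any conversion is attempted.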
6.6 years ago by
Michael Sikes wrote:
Jen,

A couple of uninformed questions. I gather from your response that the author lab submitted a multiple track group .wig file instead of a single track group .wig file, and that I need to generate a single track group file before the bigWig conversion will work.

So, with regard to the instructions below, I am to run the text manipulation on the original author-submitted .wig file, then run "Filter and Sort -> Select lines that match an expression" on the newly created file, "Matching" the pattern "track". This generates yet another file that has the following info:

88: Select on data 87
1 line, 1 comments
format: wig, database: mm8
Info: Matching pattern: track

track type=wiggle_0 visibility=full name="Smc3_mES" autoScale=on color=100,0,100 1
track visibility=dense name="Smc3_mES enriched regions - 1e-09" color=100,0,100 892178

Is the number 892178 the number of track lines? If so, do I then do the "Remove beginning of a file" using 892178 on the original author .wig file?

Mike
written 6.6 years ago by Michael Sikes
Hi Mike,

Yes, to get the .wig file out of the data, select the first "892177" lines. (Selecting "892178" would include the second track line, which you don't want.)

After looking one more time, not all of the data appears to be .wig. This is a multiple track group file, labeled as .wig, but the second track is .bed, not .wig. The data didn't look right at the first pass examination (the second track line didn't have the "type=wiggle_0" declaration), which is why I suggested in my original reply that you contact the data authors and not attempt to manipulate the data yourself (instead ask them to have it reviewed and resubmitted, or at least confirmed). It is now pretty clear what the merged file consists of: .wig + .bed.

If you really wanted to try to use the data as-is, I would start by interpreting/labeling the first track as .wig and the second track as .bed (once split), and carefully examining the results from any research you perform with it.

Apologies for the complicated file analysis,

Jen
Galaxy team

Details about why the first track looks like a .wig file and the second track looks like a .bed file.

NOTE: these example data have line counts added for clarification. When you select the data to create working files, use the original dataset without line counts.

All "variableStep" declaration lines are before the second track line at 892,178, and after that the file continues to line 925,183 in bed format.

- Select on "Step"

variableStep chrom=chr11 span=25 2
variableStep chrom=chr10 span=25 74501
variableStep chrom=chr13 span=25 119959
variableStep chrom=chr12 span=25 152353
variableStep chrom=chr15 span=25 185476
variableStep chrom=chr14 span=25 224351
variableStep chrom=chr17 span=25 253339
variableStep chrom=chr16 span=25 298007
variableStep chrom=chr19 span=25 325068
variableStep chrom=chr18 span=25 352583
variableStep chrom=chrM span=25 377622
variableStep chrom=chr1 span=25 378109
variableStep chrom=chr3 span=25 431654
variableStep chrom=chr2 span=25 468728
variableStep chrom=chr5 span=25 538115
variableStep chrom=chr4 span=25 600376
variableStep chrom=chr7 span=25 663953
variableStep chrom=chr6 span=25 726093
variableStep chrom=chr9 span=25 770436
variableStep chrom=chrX span=25 819431
variableStep chrom=chr8 span=25 830175

- Select first lines from dataset = 10
- http://genome.ucsc.edu/goldenPath/help/wiggle.html

track type=wiggle_0 visibility=full name="Smc3_mES" autoScale=on color=100,0,100 1
variableStep chrom=chr11 span=25 2
3000251 0.6 3
3000276 1.5 4
3000301 1.6 5
3000326 1.7 6
3000351 1.7 7
3000376 1.7 8
3000401 1.7 9
3000426 1.6 10

- Select last lines from a dataset = 33006 (calculated from 925183-892178+1)
- http://genome.ucsc.edu/FAQ/FAQformat.html#format1

track visibility=dense name="Smc3_mES enriched regions - 1e-09" color=100,0,100 892178
chr11 3023275 3023700 892179
chr11 3028200 3028225 892180
chr11 3039225 3039275 892181
chr11 3040500 3040525 892182
chr11 3070325 3070375 892183
chr11 3080650 3080675 892184
chr11 3085850 3085950 892185
chr11 3097450 3097475 892186
chr11 3190200 3190275 892187
(...more until end of file...)

--
Jennifer Jackson
http://galaxyproject.org
written 6.6 years ago by Jennifer Hillman Jackson
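The line-count arithmetic in the reply above (first 892177 lines for the .wig part, last 33006 lines for the .bed part) can be checked with a small Python sketch. The function names `head_tail_counts` and `looks_like_wiggle` are hypothetical; the numbers and track lines come from this thread. The wiggle check only tests for the "type=wiggle_0" declaration mentioned in the reply, which is a rough heuristic and not a full format validation.

```python
def head_tail_counts(total_lines, second_track_line):
    """Return (lines before the second track line, lines from it to the end)."""
    head = second_track_line - 1                # everything before the second track line
    tail = total_lines - second_track_line + 1  # second track line through end of file
    return head, tail


def looks_like_wiggle(track_line):
    """Rough heuristic: a wiggle track line declares type=wiggle_0."""
    return "type=wiggle_0" in track_line


# Values quoted in the thread: the file has 925183 lines and the second
# track line sits at line 892178.
print(head_tail_counts(925183, 892178))

first_track = 'track type=wiggle_0 visibility=full name="Smc3_mES" autoScale=on color=100,0,100'
second_track = 'track visibility=dense name="Smc3_mES enriched regions - 1e-09" color=100,0,100'
print(looks_like_wiggle(first_track), looks_like_wiggle(second_track))
```

The two counts add up to the total number of lines, which is a quick sanity check that no line is dropped or duplicated by the split.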