Question: Questions About Column Output In Operate On Genomic Intervals -> Profile Annotations
5.3 years ago by
James Wagner20
James Wagner20 wrote:
Hello,

I tried using this tool today after inputting a BED file containing 1509 intervals of 100 bp each, spread across all 22 autosomes.

First of all, although my input file contained intervals for 22 chromosomes, the value of "allCoverage" seemed to be the same as the coverage of that table for chr1 only.

I was also not sure about the tableRegionCoverage column. For most of the autosomes my input data is spread throughout the chromosome, with points a few Mb away from either end, yet the value in this column was only about 1/3 of what I get when downloading the data directly from UCSC and summing the interval sizes.

There were also many cases where nrCoverage > allCoverage, even when I reduced each input genomic interval to only 1 bp to avoid redundancy in the input file. Based on the descriptions of the columns, I would expect allCoverage >= nrCoverage at all times.

Could you clarify what these columns are supposed to mean, or how to reconcile these apparent inconsistencies?
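For readers who want to reproduce the "summing the interval sizes" comparison described above, here is a minimal Python sketch. It is not the Profile Annotations implementation. It assumes that a total coverage counts every interval's length (so overlapping bases are counted more than once) and that a non-redundant coverage counts each covered base once. Under those assumptions the total can never be smaller than the non-redundant value. The function names and the three-line BED sample are invented for illustration.

```python
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end) from BED lines (0-based, half-open)."""
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and UCSC track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        yield chrom, int(start), int(end)


def total_coverage(intervals):
    """Sum of interval lengths; overlapping bases count once per interval."""
    return sum(end - start for _, start, end in intervals)


def nonredundant_coverage(intervals):
    """Number of distinct bases covered, merging overlaps per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    covered = 0
    for spans in by_chrom.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlapping or adjacent: extend the current merged span.
                cur_end = max(cur_end, end)
            else:
                covered += cur_end - cur_start
                cur_start, cur_end = start, end
        covered += cur_end - cur_start
    return covered


# Invented sample: two overlapping intervals on chr1, one on chr2.
bed = [
    "chr1\t100\t200",
    "chr1\t150\t250",
    "chr2\t100\t200",
]
intervals = list(parse_bed(bed))
print(total_coverage(intervals), nonredundant_coverage(intervals))
```

With 1 bp intervals at distinct positions the two numbers are equal, which is why a non-redundant value larger than the total looks like a calculation problem and not a property of the input.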
5.3 years ago by
United States
Jennifer Hillman Jackson25k wrote:
Hi James,

Full column descriptions are at the bottom of the Profile Annotations tool form. Are you working on the public Main Galaxy instance, or can you duplicate this on Main (https://main.g2.bx.psu.edu/, usegalaxy.org)?

It would be helpful if you could share a history link and point to the dataset(s) with these values. At first pass they do seem off, but we can look into why. Please leave all inputs and outputs undeleted in the history when you email back the share link. You can email me directly to keep your data private.

How to share a history: http://wiki.galaxyproject.org/Support#Shared_and_Published_data

Thanks!
Jen
Galaxy team
--
Jennifer Hillman-Jackson
http://galaxyproject.org
Hello Jen and other members, here is a history with the interval dataset I uploaded and the results I get when doing a Profile Annotations summary. In particular I am concerned about why in some cases tableChromosomeCoverage < tableRegionCoverage and allCoverage < nrCoverage. https://main.g2.bx.psu.edu/u/jwag/h/unnamed-history Thanks so much
Reply written 5.3 years ago by James Wagner
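The two relationships questioned in this reply can be checked mechanically once the tool output is loaded. The Python sketch below assumes each output row is available as a dictionary keyed by the column names mentioned in the thread (allCoverage, nrCoverage, tableChromosomeCoverage, tableRegionCoverage). The loading step, the helper name `find_inconsistent_rows`, and the sample numbers are all hypothetical. The expected orderings are the ones James states, not a confirmed specification of the tool.

```python
def find_inconsistent_rows(rows):
    """Return (row_index, message) pairs for rows that break the expected orderings.

    Expected, per the discussion in this thread (an assumption, not a spec):
      allCoverage >= nrCoverage
      tableChromosomeCoverage >= tableRegionCoverage
    """
    problems = []
    for i, row in enumerate(rows):
        if row["nrCoverage"] > row["allCoverage"]:
            problems.append((i, "nrCoverage > allCoverage"))
        if row["tableRegionCoverage"] > row["tableChromosomeCoverage"]:
            problems.append((i, "tableRegionCoverage > tableChromosomeCoverage"))
    return problems


# Invented values purely to exercise the check.
rows = [
    {"allCoverage": 500, "nrCoverage": 400,
     "tableChromosomeCoverage": 9000, "tableRegionCoverage": 3000},
    {"allCoverage": 100, "nrCoverage": 250,
     "tableChromosomeCoverage": 2000, "tableRegionCoverage": 2500},
]
for index, message in find_inconsistent_rows(rows):
    print(f"row {index}: {message}")
```

Listing the flagged rows alongside the table names makes it easier to see whether the odd values cluster in particular tables or chromosomes.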
5.3 years ago by
United States
Jennifer Hillman Jackson25k wrote:
Hi James,

The problem seems to have come from using the default "all tables" option when running the tool. The current assumption is that you will need to re-run all of the tables in this set, not just the ones you highlighted with the odd calculations. For now, run the tables in smaller groups to obtain the correct results.

We will be working out how best to address the "all data" query going forward. Running through every human track at UCSC (which, for certain genomes such as the one you are using, includes ENCODE) is a vast amount of data to pull and parse.

Thank you for sharing your history, and sorry for the inconvenience,
Jen
Galaxy team
--
Jennifer Hillman-Jackson
http://galaxyproject.org