Question: November 24, 2010 Galaxy Development News Brief
7.6 years ago by
Jennifer Hillman Jackson ♦ 25k
November 24, 2010 Galaxy Development News Brief

Here are the highlights of the following upgrade:

hg pull -u -r 8729d2e29b02

http://bitbucket.org/galaxy/galaxy-central/wiki/Features/DevNewsBrief/2010_11_24

Galaxy's FTP Server New Data Upload Option
* User how-to: http://bitbucket.org/galaxy/galaxy-central/wiki/UploadViaFTP
* Configuration instructions for local installs: http://bitbucket.org/galaxy/galaxy-central/wiki/Config/UploadViaFTP

OpenID Login
* User how-to and config instructions: http://bitbucket.org/galaxy/galaxy-central/wiki/OpenIDAuthentication

NGS Simulation Tool
* Allows the user to simulate multiple Illumina runs with several configurable parameters.
  o On each run, one position is randomly chosen to be polymorphic and sequencing errors are also simulated.
  o The primary output is a png with two different plots.
  o The other output shows summary statistics about the simulation.
* NGS simulation tool location: tools/ngs_simulation/ngs_simulation.xml

Tophat and Cufflinks RNA-seq Tools
* Addition of RNA-seq analysis tools Tophat and Cufflinks.
  o Together, these tools can be used to analyze RNA-seq data to understand alternative splicing and isoforms, measure gene and isoform expression, and perform statistical tests for differential expression.
  o Galaxy supports Tophat version 1.1.1 and later and Cufflinks version 0.9.1 and later. (These are the versions included in this distribution.)

Import or Export Workflows & Histories
* Workflows can now be downloaded/exported to a file and uploaded/imported into Galaxy, making it easy to move workflows between Galaxy instances.
* Beta feature: Histories can also be downloaded or moved from one Galaxy instance to another, subject to these limitations:
  o history archives can be uploaded/imported only via URL, not file
  o histories must be shared in order for them to be importable via archive
  o tags are not currently imported
  o reproducibility is limited as parameters for imported jobs are not always recovered and set

Even Better Data Visualization with Trackster
* Trackster now supports interactive filtering for VCF quality values and BED score values.
* For example, a user can drag a slider to filter a file of splice junctions to view junctions supported by different numbers of reads.
  (Image: Trackster splice example)
* Improved CIGAR support in BAM display: matches, deletions, skipped bases, and clipping are displayed properly. Padding for insertions is currently not represented in the display.
* GFF feature blocks are now displayed correctly, along with name, strand, and score information.
* General enhancements
  o Removed right-hand pane, allow inline re-ordering and configuration of elements
  o Moved navigational controls to the top
  o Histogram display for LineTracks and overview
  o New navigational slider and new overview settings under the dropdown corresponding to the track name
  o Summary view now shows maximum y-axis value
  o Can change draw color of LineTrack
  o When editing track config, "Enter" and "Esc" keys submit and cancel the changes, respectively
  o Don't index bottom level for summary_tree, greatly reducing computation time (>5x speedup) while not sacrificing usability
  o Refactored to pass JSLint
* Tuning
  o Fix ReferenceTrack issue.
  o Don't re-add new datasets when refreshing after using "Add into current viz" link.
  o To prevent browser lockup, only display up to 50 lines of features by default (user-editable in future). Coming soon: add warning message when this occurs.
  o Fix LineTrack rendering bug when more than one tile on screen.
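The score filtering described in the Trackster section above can be approximated offline. The following is a minimal Python sketch, not Trackster's own code (Trackster filters interactively in the browser). The function name filter_bed_by_score and the sample junction records are hypothetical; the only assumption about the data is the standard BED layout, where the score is the fifth tab-separated column.

```python
def filter_bed_by_score(lines, min_score):
    """Return BED feature lines whose score (column 5) is >= min_score.

    Illustrative helper only; the name and behavior are assumptions,
    not part of Galaxy or Trackster.
    """
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        # Skip blank lines and BED header/comment lines.
        if not text.strip() or text.startswith(("track", "browser", "#")):
            continue
        fields = text.split("\t")
        # The score column is optional in BED; ignore features without it.
        if len(fields) < 5:
            continue
        try:
            score = float(fields[4])
        except ValueError:
            continue
        if score >= min_score:
            kept.append(text)
    return kept


# Hypothetical splice-junction records: score = number of supporting reads.
junctions = [
    "chr1\t100\t200\tJUNC1\t3\t+",
    "chr1\t300\t400\tJUNC2\t12\t-",
]
well_supported = filter_bed_by_score(junctions, 10)
```

Raising or lowering min_score plays the role of dragging the slider: a higher threshold keeps only junctions with more supporting reads.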
Native Data set Re-organization
* Galaxy now uses a set of data tables instead of simple loc files to organize, document, and store native genome data sets.
* Why Data tables? Better data management for long term stability!
  o Allows the information in the loc file, including the path, to be changed.
  o By using a unique ID as the parameter value, data links in existing workflows are preserved.
* Most tools (PerM, Bowtie, BWA, Lastz, Megablast, SRMA, Tophat) that previously used loc files now have the new data tables organization implemented.
* Better data tracking has allowed for more informative genome name display in tool dropdown boxes.
* For local installations:
  o See the new wiki describing how to use data tables: https://bitbucket.org/galaxy/galaxy-central/wiki/DataTables
  o More help for NGS tool setup (update pending): https://bitbucket.org/galaxy/galaxy-central/wiki/NGSLocalSetup

Sample Tracking
* Complete re-write of the Framework and User interface (database schema unchanged).
* New interactive interface to select files to transfer from the sequencer to Galaxy data libraries.
* The data transfer feature now uses the Galaxy RESTful API.
* Full documentation detailing the new functionality and how to use it will be available within a few weeks through the home Galaxy Wiki.

Instantiating Galaxy
* New checkouts will now perform all necessary setup directly in run.sh; there is no longer a need to run setup.sh prior to run.sh. (setup.sh will be removed in a future distribution.)

Analysis Tools
* Enable 'FASTX-Toolkit for FASTQ data' as a subsection under 'NGS: QC and manipulation' in tool_conf.xml.sample/main. Includes special handling for when the shell only allows for strict Bourne syntax.
* Add descriptive labels to output dataset names for MACS peakcalling tool.
* Taxonomy tools updated for better error reporting. Includes special handling for when the shell only allows for strict Bourne syntax.
* Refactor sam_bitwise_flag_filter tool, simplifying it and making it faster when there are multiple flag criteria

Tool Dependency Enhancements
* Addition of the 'package' type to <requirement> tags in the tool config.
1 Syntax for tool configs is:
    <requirements>
      <requirement type="package" version="X.Y.Z">NAME</requirement>
    </requirements>
2 Next, a directory should be created, and the path to that directory should be set in universe_wsgi.ini as 'tool_dependency_dir'.
3 Galaxy will then source the following file prior to executing the tool's
4 The 'version' attribute of the 'requirement' tag is optional and if
left off, Galaxy will look for the following instead:
* UI: new style for dropdown menus.
* Now uses jStore to save folder expansion state.
* Pre-generate and cache variables so that expensive functions like
jQuery.siblings, jQuery.filter and jQuery.find are called as few times
as possible. Provides significant speedup to loading of
large data libraries.
* Add basic support for Bowtie indexes as a datatype
bowtie_color_index), available via datatype conversion. Currently, the
indexes need to be converted manually from the FASTA file before use
Bowtie, but they can be reused.
* A new sample loc file (tool-data/all_fasta.loc.sample) was added
lists fasta files. A script
was created that can be used to generate this loc file for local
* New gff2bed tool to convert GFF3 files to BED.
* Modified Filter and Sort -> Filter tool to operate correctly on
with a variable number of columns, such as in SAM files.
* New datatype added: VCF (variant call format).
* Add descriptive labels to output dataset names for MACS peakcalling
* Add name/designation to HDA name for new datasets created in
* Shift management of the interaction between workflow outputs and
HideDatasetActions to the front end editor.
* No usability changes, but this resolves the issue with multiple
HideDatasetActions being created.
* Existing workflows displaying multiple HideDatasetActions per step
the Run Workflow screen will persist. These extra HideDatasetActions
harmless, but a simple edit workflow -> save will remove them.
* Workflow Inputs change:
o Workflow inputs that aren't a subtype of text, were previously
o Added 'data' datatype to registry, which will allow both text and
binary inputs (and their subtypes) to workflow input steps.
o Note that this will allow a user to change the datatype of
something to 'data'.
User Interface (UI)
* New function for downloading metadata files associated with datasets
(such as bai indices for bam files). See the Save icon drop-down menu.
* Enable display of unicode characters in history and workflow
annotations and when listing and running workflows.
* Dynamically generated popup-style menus. Greatly improves load
especially for data libraries having potentially large menu.
* Labels next to checkboxes can now be clicked to check the
* Radio boxes in tool forms now also have clickable labels.
* New style for search boxes in grids. Grid items will no longer show
outline when hovered upon if there are no actions to be performed.
when the page is loaded.
* Remove the creation of a background element that closes the active
menu clicked. Instead, bind an event to close active menus to the
document object of current and all other framesets. Tested in IE.
* Make links in split menu buttons "go through" instead of popping up
the menu options.
* Functional Test Framework: new nose plugin that shows a diff between
tests failed this time and last time.
* Documentation update to add more options added to the sample config
* Fix for TextToolParameter.get_html_field when provided value is an
empty string but default value specified in tool is non-empty string.
Fixes issue with rerun button where if a user had input an empty
the form displayed when rerun would have the default value from the
and not the actual previously specified value.
* Fix for Integer/FloatToolParameter.get_html_field() when 'value' is
provided as an integer/float. Fixes an issue seen when saving
If an integer or float tool parameter is changed to a value of 0 or
and saved, the form field would be redisplayed using the default tool
value, and not the value that is now saved in the database.
* Fix for setting columns in workflow builder for ColumnListParameter.
e.g. allows splitting lists of columns by newlines and commas and
* Fixes for rerun action to recurse grouping options when checking
unvalidated values and cloned HDAs. Better selection of corresponding
HDAs from cloned histories, when multiple copies exist.
* Have rerun action make use of tool.check_and_update_param_values().
Fixes Server Error issue when trying to rerun updated tools.
* Fix for display framework to work with workflows that contain tools
that have been updated. Previously, this would cause a server error
trying to view a workflow or a page with an embedded workflow that
contained an updated tool.
* Fix bug that was causing Page item selection grids to be initialized
twice and hence causing grid paging to fail.
* Add some space between adjacent embedded items on Pages.
* Fix path to closebox.png image so screencast close button is shown
* Fix the Admin -> Manage Jobs interface when using multiple Galaxy
* When possible (e.g. Python >= 2.6), don't use tons of memory to
* Fix cluster stdout/stderr handling that could cause excessive memory
usage if stdout/stderr were very large.
* Make the PBS runner actually stop jobs when a user deletes output.
This would only work before if the Galaxy user was a PBS "operator"
only using a single process setup.
* Cause waiting jobs to fail if any of their inputs fail to set
* Fix 'import from current history' for Data Libraries that was
metadata files that are not visible. Fix this same issue for 'Copy
history items' feature.
* DRMAA runner now uses get_id_tag() in Wrapper instead of job_id
directly for creation of .sh .o and .e files, as well as some
* Prevent Rename Dataset Action from allowing a blank input.
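
The Integer/FloatToolParameter fix listed above describes a saved value of 0 being redisplayed as the tool default. One common way this kind of symptom arises in Python is a truthiness check, because 0 and 0.0 are falsy. Whether that was the actual cause in Galaxy is an assumption; the two functions below are hypothetical illustrations and are not Galaxy's get_html_field() code.

```python
def displayed_value_buggy(value, default):
    # Truthiness check: 0 and 0.0 are falsy, so the default is shown
    # even though the user explicitly saved a zero.
    return value or default


def displayed_value_fixed(value, default):
    # Fall back to the default only when no value was supplied at all.
    return default if value is None else value


# A tool parameter with default 5 that the user changed to 0 and saved.
shown_before = displayed_value_buggy(0, 5)
shown_after = displayed_value_fixed(0, 5)
```

Comparing against None rather than relying on truthiness keeps legitimate zero values while still falling back when nothing was saved.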
hg clone http://www.bx.psu.edu/hg/galaxy galaxy-dist
Galaxy is supported in part by NSF, NHGRI, the Huck Institutes of the
Life Sciences, and The Institute for CyberScience at Penn State.
-- Galaxy Team