Hi All, one quick question: how can I set the max per-file depth to
more than 8000 under mpileup? Thanks!
Michael
Hello,
The Galaxy mpileup wrapper, whether installed from the Tool Shed
(http://usegalaxy.org/tool shed) or as implemented on the public Main
instance (http://usegalaxy.org), just uses the default, which is 8000
(option -D).
The SAMtools manual has the details, including the command-line option
for adjusting the maximum per-file depth considered. This option has a
specific use case: it is meant for coverage calculations, not for SNP
calling:
http://samtools.sourceforge.net/mpileup.shtml
This is not to be confused with the per-sample position depth for SNP
calling (-d). That value can be adjusted within the 1-8000 window; the
default is 250. If it is set to a value over 8000, the option -D will
override it before it can be applied, should the tool be given input
data that meets this criterion. Depending on the size of your input,
memory could still be an issue if the depth is set very large. If there
is a memory-related error, this is a probable cause, and a local or
cloud Galaxy instance with sufficient resources is the alternative.
While the max per-file depth cannot be increased on the public server,
in a local instance the wrapper could be adjusted to expose the -D
parameter as an input option that can be modified. This is the tool
.xml (the link is for ease of viewing; downloading the repository is
really the best way to access the most current version):
http://toolshed.g2.bx.psu.edu/repository/view_changeset?ctx_str=44a18a94d7a9&id=01d08a1b766b864e
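
As a rough illustration of what exposing a new input in a wrapper
involves, here is a minimal Python sketch. The tool XML in it is a
hypothetical, heavily simplified stand-in and not the real mpileup
wrapper, and the parameter name max_depth is made up for the example.
It only adds an integer <param> to the <inputs> section.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a Galaxy tool wrapper.
# The real mpileup .xml has many more inputs and a longer command.
TOOL_XML = """<tool id="example_mpileup" name="Example mpileup">
  <command>samtools mpileup -f $ref $input &gt; $output</command>
  <inputs>
    <param name="input" type="data" format="bam" label="BAM file"/>
    <param name="ref" type="data" format="fasta" label="Reference"/>
  </inputs>
</tool>"""


def add_depth_param(tool_xml, default=8000):
    """Return the tool XML text with an extra integer input appended."""
    root = ET.fromstring(tool_xml)
    inputs = root.find("inputs")
    ET.SubElement(inputs, "param", {
        "name": "max_depth",  # hypothetical parameter name
        "type": "integer",
        "value": str(default),
        "label": "Maximum per-file depth",
    })
    return ET.tostring(root, encoding="unicode")


print(add_depth_param(TOOL_XML))
```

Adding the input is only half of the change: the <command> section of
the wrapper would also need to be edited by hand so that the new value
is actually passed to the depth option.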
Note from the usage in the SAMtools manual that BAQ calculations are
not compatible with the choice to increase -D (according to the tool
authors), so adjust the tool form options at execution time to account
for this (the tool .xml will help with mapping parameters, if the tool
help in the UI is not enough). And finally, be aware that this usage
could significantly increase the memory profile of the tool, so it is
probably not appropriate for a local instance on a personal
desktop/laptop, but testing on your own data will answer that
definitively. A local run on a server or a cloud instance with extended
memory resources is most likely a better choice.
In general, if it runs on the command line, it will run in Galaxy, and
the reverse holds too (if it fails on the command line, it will fail in
Galaxy, since the underlying tool is the same).
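
One way to try a higher depth setting on the command line before
touching a wrapper is sketched below in Python. The file names are
placeholders. The depth option letter is left as an argument on
purpose, because it should be confirmed against the usage text printed
by the installed samtools version; the default of -d used here, and -B
for disabling BAQ, are taken from the SAMtools manual and are
assumptions for any particular install.

```python
import shlex


def build_mpileup_cmd(bam, reference, max_depth,
                      depth_flag="-d", disable_baq=True):
    """Build an argument list for a samtools mpileup test run."""
    # depth_flag is an assumption; check `samtools mpileup` usage output.
    cmd = ["samtools", "mpileup", "-f", reference,
           depth_flag, str(max_depth)]
    if disable_baq:
        # BAQ is reported as incompatible with raised depth limits.
        cmd.append("-B")
    cmd.append(bam)
    return cmd


# Placeholder file names; substitute your own BAM and reference FASTA.
cmd = build_mpileup_cmd("sample.bam", "ref.fa", 20000)
print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be
passed to subprocess.run(cmd, check=True) while watching memory use on
a small test region first.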
These were the constraints the last time the development team gave
feedback about the tool. If there are any updates, we will post another
reply. It is also possible that a member of our development community
has already modified the tool wrapper (but not submitted it to the Tool
Shed yet) and they will respond. I ran a search on the getgalaxy
archives (which searches dev resources, including the
galaxy-dev@bx.psu.edu mailing list), and didn't find anything myself:
http://galaxyproject.org/search/getgalaxy/
Hopefully this helps to explain and offer some choices.
Jen
Galaxy team
--
Jennifer Hillman-Jackson
http://galaxyproject.org