Question: Suggestion For Multithreading
7.6 years ago, Louise-Amélie Schmitt wrote:
Hello everyone,

I'm using TORQUE with Galaxy, and we noticed that if a tool is multithreaded, the number of cores it needs is not communicated to PBS, which leads to job crashes if the required resources are not available when the job is submitted.

I therefore modified the code slightly, as follows, in lib/galaxy/jobs/runners/pbs.py:

    256 # define PBS job options
    257 attrs.append( dict( name = pbs.ATTR_N, value = str( "%s_%s_%s" % ( job_wrapper.job_id, job_wrapper.tool.id, job_wrapper.user ) ) ) )
    258 mt_file = open('tool-data/multithreading.csv', 'r')
    259 for l in mt_file:
    260     l = string.split(l)
    261     if ( l[0] == job_wrapper.tool.id ):
    262         attrs.append( dict( name = pbs.ATTR_l, resource = 'nodes', value = '1:ppn='+str(l[1]) ) )
    263         attrs.append( dict( name = pbs.ATTR_l, resource = 'mem', value = str(l[2]) ) )
    264         break
    265 mt_file.close()
    266 job_attrs = pbs.new_attropl( len( attrs ) + len( pbs_options ) )

The csv file contains a list of the multithreaded tools, each line containing:

    <tool id>\t<number of threads>\t<memory needed>\n

It works fine and the jobs wait for their turn properly, but information is duplicated. Perhaps there would be a way to include something similar in Galaxy's original code (if it is not already the case, I may not be up to date) without duplicating data.

I hope that helps :)

Best regards,
L-A
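For readers who want to try the lookup outside Galaxy, here is a minimal standalone sketch of the same idea. It assumes the tab-separated layout described above; the function name `lookup_resources` and the sample tool ids and values are hypothetical, and it only builds the resource strings rather than calling the `pbs` module. It also skips blank or short lines, which the patch as posted does not.

```python
import csv


def lookup_resources(tool_id, lines):
    """Return (nodes_spec, mem) for tool_id, or None if the tool is not listed.

    `lines` is any iterable of tab-separated lines:
    <tool id>\t<number of threads>\t<memory needed>
    """
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank or malformed lines instead of raising IndexError.
        if len(row) < 3:
            continue
        listed_id, threads, mem = row[0], row[1], row[2]
        if listed_id == tool_id:
            # Same shape as the value passed with resource='nodes' in the patch.
            return "1:ppn=%d" % int(threads), mem
    return None


# Hypothetical file contents, for illustration only.
sample_lines = [
    "bowtie_wrapper\t8\t16gb\n",
    "\n",
    "bwa_wrapper\t4\t8gb\n",
]

print(lookup_resources("bowtie_wrapper", sample_lines))
print(lookup_resources("upload1", sample_lines))
```

A tool that is not listed returns None, so the caller can fall back to the runner's defaults.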
7.6 years ago, Blake L. Dixon wrote:
If anyone could take me off this mailing list, that would be great.

Thanks,
Blake Dixon

(In reply to: [galaxy-user] suggestion for multithreading, Tuesday, April 19, 2011, 7:39 am, sent to galaxy-user@lists.bx.psu.edu)
7.6 years ago, Louise-Amélie Schmitt wrote:
Just one little fix on line 261:

    261     if ( len(l) > 1 and l[0] == job_wrapper.tool.id ):

Otherwise it pathetically crashes when non-multithreaded jobs are submitted. Sorry about that.

Regards,
L-A

(In reply to Louise-Amélie Schmitt's message of Tuesday 19 April 2011, 14:33 +0200.)
Hi Louise-Amélie, I haven't done anything with this code yet, but I wanted to let you know that we'll eventually be adding it, I'm just going to change the implementation slightly. I'd like to merge the functionality of the csv into an xml config I'm already working on (but haven't yet fully decided on the syntax). And it should be possible for tools to access these parameters in the command line template. A lot of our NGS tools have the number of threads to use hardcoded in the tool config, which is bad. --nate
Written 7.5 years ago by Nate Coraor
2011/6/2 Nate Coraor <nate@bx.psu.edu>: On a related point, I've previously suggested a $variable could be defined for use in tool XML wrappers to set the number of threads. This number could come from a general configuration file, or perhaps via the cluster settings - the point is the tool doesn't need to know, it just gets told how many threads it is allowed. Peter
Written 7.5 years ago by Peter Cock
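A rough sketch of that suggestion, under stated assumptions: the tool's command template contains a placeholder, and the thread count is filled in from a per-tool setting or a global default. The names `build_command`, `DEFAULT_THREADS` and `mytool` are hypothetical, and Python's `string.Template` only stands in for whatever templating the tool wrapper actually uses.

```python
from string import Template

# Assumed global fallback, used when no per-tool value is configured.
DEFAULT_THREADS = 1


def build_command(template, tool_id, per_tool_threads, default=DEFAULT_THREADS):
    """Fill $threads in a command template without the tool knowing the source."""
    threads = per_tool_threads.get(tool_id, default)
    return Template(template).substitute(threads=threads)


# Hypothetical per-tool settings, e.g. loaded from a configuration file.
settings = {"mytool": 8}

print(build_command("mytool --threads $threads in.fq", "mytool", settings))
print(build_command("other --threads $threads in.fq", "other", settings))
```

The tool wrapper never hardcodes a number; it is simply told how many threads it is allowed.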
Hello all,   Does anyone know if Galaxy can process Ion Torrent Data? Currently, it appears that Ion Torrent data is not a supported platform. I know that the data is in FastQ Sanger, so I would think there would be a way to incorporate it into one of the existing platform pipelines, but which one?   Ion Torrent protocols are very similar to 454 in terms of library construction, etc. Should I use Lastz - Roche -454 mapping mode? I want to compare the Ion Torrent data with Illumina data (from the same DNA), so perhaps I should use the illumina workflow.   Any insight would be very helpful,   Thanks, Mike Dufault
Written 7.5 years ago by Mike Dufault
Hello Mike, You may be interested in the wrapper for TMAP that has been added to the Tool Shed for use with local installs. Search for "tmap" to locate the tool. http://galaxyproject.org/Tool%20Shed Great question, thank you for your patience while we reviewed, Best, Jen Galaxy team -- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/Support
Written 7.3 years ago by Jennifer Hillman Jackson
Yeah, that's the idea.

job_conf.xml:

    <job_conf>
        <destinations>
            <destination name="default" runner="local"/>
            <destination name="pbs_default" runner="pbs">
                <native_param name="queue">batch</native_param>
            </destination>
        </destinations>
        <tools>
            <tool id="upload1" destination="default"/>
            <tool id="bowtie_wrapper" destination="pbs_default">
                <resource type="cores">8</resource>
            </tool>
        </tools>
    </job_conf>

pbs.py then knows to translate '<resource type="cores">8</resource>' to '-l nodes=1:ppn=8'. Your tool can access that value in its command line template, like $__resources__.cores. The same should be possible for other consumables.

--nate
Written 7.5 years ago by Nate Coraor
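To make the proposed translation concrete, here is a small sketch that reads a config in the syntax Nate outlines and derives the PBS resource string for a tool. The syntax was still a proposal at the time, so this is illustrative only; the function name `pbs_resource_for` is hypothetical and this is not Galaxy's actual implementation.

```python
import xml.etree.ElementTree as ET

# The proposed (not final) job_conf.xml syntax from the post above.
JOB_CONF = """
<job_conf>
    <destinations>
        <destination name="default" runner="local"/>
        <destination name="pbs_default" runner="pbs">
            <native_param name="queue">batch</native_param>
        </destination>
    </destinations>
    <tools>
        <tool id="upload1" destination="default"/>
        <tool id="bowtie_wrapper" destination="pbs_default">
            <resource type="cores">8</resource>
        </tool>
    </tools>
</job_conf>
"""


def pbs_resource_for(tool_id, conf_xml):
    """Return 'nodes=1:ppn=N' if the tool declares a cores resource, else None."""
    root = ET.fromstring(conf_xml)
    for tool in root.iter("tool"):
        if tool.get("id") != tool_id:
            continue
        for res in tool.findall("resource"):
            if res.get("type") == "cores":
                return "nodes=1:ppn=%d" % int(res.text)
    return None


print(pbs_resource_for("bowtie_wrapper", JOB_CONF))
print(pbs_resource_for("upload1", JOB_CONF))
```

Tools without a `cores` resource yield None, leaving the runner free to apply whatever default it has.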
Sounds good. Would there be a global default setting somewhere in universe.ini or elsewhere for when job_conf.xml didn't set a value? Peter
Written 7.5 years ago by Peter Cock
default_cluster_job_runner will remain for backwards compatibility, but we'll ship a sample job_conf.xml that runs everything locally by default. --nate
Written 7.5 years ago by Nate Coraor
Hello Galaxy Team,   Does anyone know if there is a problem with the most recent release of Galaxy Cloudman? I have launched several instances and when I try to access them using the Public DNS, it will not connect. I have given the AWS servers up to 30mins to establish the DNS web page, but it still does not work.   I have used Cloudman extensively, and my key-pair, security groups, etc have not changed. I even set up new ones in case that was an issue, but no luck.  I don't think the problem is on my end.   Any help would be appreciated.   Thanks, Mike
Written 7.5 years ago by Mike Dufault
Hi Mike, I just started a brand new instance with an account that is different from the one used to create required AWS components and Cloudman came up just fine, having started all of the required services. Not that it should matter much, but are you instantiating a brand new instance or recovering an existing one? Are you able to ssh into the instance? If yes, Cloudman log is accessible from /mnt/cm/paster.log so it should be possible to see what's going on. Enis
Written 7.5 years ago by Enis Afgan
Haha, and I did that before realizing I could do just what I needed by writing tool-specific pbs:// URLs at the end of the config file... I'm such an idiot.

But I really like what you did with it, and I have a couple of questions.

Concerning the single-threaded tools, what would happen if the number of threads set in the xml file was >1?

Would it be possible to forbid a tool from running on a given node?

Thanks,
L-A
Written 7.5 years ago by Louise-Amélie Schmitt
Haha, okay, I don't think I even noticed, since I was distracted by your implementation being a step in the direction we want to go.

It'd consume extra slots, but the tool itself would just run as usual.

Hrm. In PBS you could do it using node properties/neednodes or resource requirements. I'd have to think a bit about how to do this in a more general way in the XML.

--nate
Written 7.5 years ago by Nate Coraor
On 02/06/2011 at 21:39, Nate Coraor wrote:

Damn, I shouldn't have said anything :D

So what about an attribute in the tool tag that would indicate whether the tool is actually multithreaded, so that this doesn't happen? Something like multithreaded="true/false"?

OK, thank you! :)

L-A
Written 7.5 years ago by Louise-Amélie Schmitt
I'm not sure if it's something we need to enforce (who knows, maybe I have some reason for having a single-threaded tool reserve multiple slots), but I do think there should be some way for tool authors to make it known that their tool is multithreaded. --nate
Written 7.5 years ago by Nate Coraor