Question: Galaxy Join Takes Too Long?
5.8 years ago by
Milad Bastami wrote:
I'm trying to join two large interval datasets (one with 800,000 intervals and the other with about 350,000 intervals) using the Operate on Genomic Intervals > Join tool. I have no idea how long it should normally take. Two days have passed and it is still running. Is there any file size limit for this tool? Any help would be appreciated.

Regards,
Milad Bastami
PhD student of Medical Genetics, Department of Medical Genetics, Shahid Beheshti University of Medical Sciences
5.8 years ago by
United States
Jennifer Hillman Jackson wrote:
Hello Milad,

I am not sure whether you are using the public Galaxy Main server (https://main.g2.bx.psu.edu) or your own local instance, or whether the job is yellow (running) or still grey (waiting to run).

If you are using Galaxy Main and the job is actually running (the dataset is yellow), this type of job can sometimes take a while on very large datasets. To improve the chances of a successful run and to speed up any run, put the larger dataset as the first input and the smaller file as the second input. The interval operations tools use an indexing strategy in which the second input file is loaded into memory and the first file is processed against it.

Hopefully this helps,
Jen
Galaxy team

--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org
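To illustrate why input order matters, here is a minimal Python sketch of the strategy described above: the second (smaller) input is indexed in memory and the first (larger) input is streamed against it. This is not Galaxy's actual code. The function names, the (chrom, start, end) tuple layout, the half-open coordinates, and the `min_overlap` parameter are all assumptions made for the example, and a production tool would more likely use an interval tree than the simple sorted list shown here.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(intervals):
    """Hold the smaller input in memory, grouped by chromosome and sorted by start."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        starts = [s for s, _ in ivs]
        index[chrom] = (starts, ivs)
    return index


def join_intervals(large, small, min_overlap=1):
    """Stream `large` one record at a time against the in-memory index of `small`.

    Coordinates are assumed to be half-open (start inclusive, end exclusive).
    Yields (large_interval, small_interval) pairs that overlap by at least
    `min_overlap` bases.
    """
    index = build_index(small)  # only the second input is fully loaded
    for chrom, start, end in large:
        if chrom not in index:
            continue
        starts, ivs = index[chrom]
        # Indexed intervals starting at or after `end` cannot overlap, so skip them.
        hi = bisect_left(starts, end)
        for s, e in ivs[:hi]:
            if min(end, e) - max(start, s) >= min_overlap:
                yield (chrom, start, end), (chrom, s, e)


if __name__ == "__main__":
    large = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 10, 20)]
    small = [("chr1", 150, 250), ("chr1", 190, 195), ("chr2", 30, 40)]
    for pair in join_intervals(large, small):
        print(pair)
```

In this sketch, memory use grows with the size of the second input only, which is why passing the smaller file second keeps the in-memory index small while the larger file is read sequentially.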
Hi Jen,

Thanks for the information. I'm using the public Galaxy Main server and the job is yellow. That was a good point: I had put the large dataset as the second input. I will wait, and if the job does not succeed I will rerun it as you suggested.

Milad Bastami
PhD student of Medical Genetics, Department of Medical Genetics, Shahid Beheshti University of Medical Sciences
Reply written 5.8 years ago by Milad Bastami