Question: Galaxy crash messed up Nginx and the workflow editing tool?
2.8 years ago by
christophe.habib340 wrote:

Hi everyone.

I have a big issue on a local Galaxy instance used for diagnostic purposes. Here is what happened:

I installed this instance last October and had many issues due to the quality of the network and the proxy of the hospital where I work. I eventually got things working properly, but I did not install or configure any scheduler to manage the jobs.

Yesterday, my colleagues told me they had tried to launch the analysis workflow (22 steps, with 2 FASTQ files and 1 BED file as inputs) for several patients (about 10; we have a small server), and the instance was no longer accessible: we got a blank page when we tried to open it. I went to the IRC channel to find some help.

Thanks to Dannon, we found out that the issue was related to Nginx: all the requests were denied with a permission error:

GET [HTTP/1.1 403 Forbidden 1ms]

My colleagues really needed access to that instance again to perform diagnoses, so I restored access by changing the first line of nginx.conf from user www-data to user galaxy.

So here is the first thing: I would like to return to the original www-data user in Nginx, because I am afraid that a similar crash might change the permissions on Galaxy itself. Do you have any idea how to give www-data its rights back?

The second thing is that I thought everything had gone back to normal, BUT when I try to edit an existing workflow I get a server error message. In the paster.log I have this message:

And in Galaxy itself we have this error:

Server error
{"upgrade_messages": {}, "steps": {"0": {"tool_id": null, "data_inputs": [], "data_outputs": [{"extensions": ["input"], "name": "output"}], "form_html": "

Several important things regarding this matter:

  • I can export this workflow and the file's content is correct (and Guerler could import it on his instance).
  • If I import it back, it shows the same error.
  • If I create a NEW workflow, everything works properly.

I really think this is related to the first problem with Nginx. I mean, this workflow was created when Galaxy was served through Nginx with www-data as the worker user, not galaxy. Also, the whole Galaxy directory was owned by galaxy:www-data at that time. I tried changing the owner to galaxy:galaxy, but it didn't fix the problem.

So I think the most important problem is the first one, and the second one will be solved along with it. I would really appreciate your help fixing this. Setting up a new instance is such a mess because of the hospital proxy that I don't want to start from scratch again.

Thank you for your help!

workflow crash galaxy nginx
2.8 years ago • written 2.8 years ago by christophe.habib340

I just noticed that when I try to edit a workflow I get this error in the nginx log file:

So my guess may be right... but I still don't know how to fix nginx!

2.8 years ago • written 2.8 years ago by christophe.habib340

I found another "issue": Windows users can't download files anymore, probably because the nginx worker runs as the galaxy user instead of www-data.

Well, I really need to fix this...

2.8 years ago by christophe.habib340
2.8 years ago by
christophe.habib340 wrote:

I ran this command:

chmod -R g+rwx /home/galaxy/diag_19102015/

And the workflows are editable again. But I don't really like it.
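
A somewhat narrower variant of the command above, offered as an untested assumption rather than something verified on this instance: with a capital X, chmod adds the execute bit only to directories (and to files that are already executable by someone), so regular data files do not become group-executable. The function name grant_group_access is made up for illustration.

```shell
# Hypothetical helper: give the group read/write access to a whole tree.
# The capital X adds the execute (search) bit only on directories and on
# files that are already executable by someone; plain files stay as they are.
grant_group_access() {
    chmod -R g+rwX "$1"
}

# Example with the path from the command above (commented out on purpose):
# grant_group_access /home/galaxy/diag_19102015/
```

Whether group access is the right fix at all depends on which user and group the nginx worker runs as, so this only reduces the side effects of the recursive chmod.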

Also, to solve the download issue, I think I will have to give galaxy all the rights that belonged to www-data to begin with. And I don't like it... :(

2.8 years ago by christophe.habib340
2.8 years ago by
Nate Coraor3.2k
United States
Nate Coraor3.2k wrote:

For the record, the permissions problems have been fixed. They were caused by the Galaxy user's home directory (a parent of the Galaxy code directory) being set to drwx------ (0700).
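
This kind of cause can be checked by listing the permissions of every directory between / and the Galaxy code directory: a process such as an nginx worker needs the execute (search) bit on each parent directory to reach the files below it, so a home directory at 0700 blocks every user other than its owner (and root). A minimal sketch, assuming GNU stat (Linux); the function names are made up for illustration, and the example path is the one from the chmod command earlier in the thread.

```shell
# Print a path and each of its parent directories, one per line, ending at /
# (or at . for a relative path).
list_ancestors() {
    p=$1
    while [ "$p" != "/" ] && [ "$p" != "." ]; do
        printf '%s\n' "$p"
        p=$(dirname "$p")
    done
    printf '%s\n' "$p"
}

# Show symbolic mode, octal mode, owner:group and name for every directory
# on the way to the target. Uses GNU stat; the -c flag differs on BSD/macOS.
show_path_permissions() {
    list_ancestors "$1" | while IFS= read -r dir; do
        stat -c '%A %a %U:%G %n' "$dir"
    done
}

# Example (commented out on purpose):
# show_path_permissions /home/galaxy/diag_19102015
```

Any line in the output that shows drwx------ for a parent directory marks a point that other users cannot pass. On systems with util-linux, namei -l reports similar information.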

2.8 years ago by Nate Coraor3.2k
Yes, thank you for your help. I haven't reported it yet because I want to see if I can reproduce the issue, and I haven't had time to check whether the download issue is fixed. I will report all my conclusions next week.
2.8 years ago by christophe.habib340

Hi Nate,

I thought that everything was working properly. Unfortunately, this morning I noticed that the permissions on galaxy had gone back to drwx------ (0700) when I asked my colleague to connect to this instance from his Windows XP computer and he refreshed the welcome page.

I have no explanation for that. I had asked my colleagues not to use this instance, and I wasn't there for the past few days. So nothing from users should have led to this change (no overload, no misuse, etc.).

How could these permissions change on their own? This behaviour is a big issue for me...

Once again, I do not see where I should start to solve this problem. I checked /var/log/auth.log and nobody logged in as root to play this kind of bad joke (and I am supposed to be the only one who connects as root).

I hope you will have an idea about this.

Thank you!
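
One possible starting point, as a sketch rather than a known fix: record the directory's mode at regular intervals, so that the moment it flips back to 0700 can be matched against cron jobs, service restarts, or other log entries. The function name log_mode_changes, the sampling interval, the log file location, and the example path /home/galaxy are all assumptions (the thread only says the directory is the Galaxy user's home); GNU stat is assumed as well.

```shell
# Hypothetical helper: print a timestamped line each time the octal mode of
# a path differs from the previous sample (the first sample always prints).
# Usage: log_mode_changes PATH INTERVAL_SECONDS SAMPLES
log_mode_changes() {
    prev=""
    i=0
    while [ "$i" -lt "$3" ]; do
        cur=$(stat -c '%a' "$1")    # GNU stat: octal permission bits
        if [ "$cur" != "$prev" ]; then
            printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$cur" "$1"
            prev=$cur
        fi
        i=$((i + 1))
        sleep "$2"
    done
}

# Example (commented out): sample once a minute for a day, appending to a log.
# log_mode_changes /home/galaxy 60 1440 >> /tmp/galaxy_home_mode.log
```

This only narrows down when the change happens, not which process makes it; the timestamps still have to be compared with whatever else runs on the server at that time.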

2.8 years ago • written 2.8 years ago by christophe.habib340

What are the correct permission settings for the directories?

2.2 years ago by steve30
