Can't import pandas with python script

Good morning,

I have a python script that I've added to my Galaxy instance. When I try to run it I get the following error:

ImportError: No module named pandas

I've searched for a solution to this issue but found little information. The Tool Shed has two pandas packages that include the other libraries I need, but after installing one of them I get the same error. How can I fix this? Is there a way to manually install these Python libraries in Galaxy with conda? (I couldn't work out how to do this from the documentation.)

The script also uses LaTeX, since its output is a PDF file. I can run the script on the server that hosts the Galaxy instance, but since the libraries I need don't work inside Galaxy, I assume LaTeX won't work there either. Is there a way to use the script from Galaxy?

Thank you for the attention!

modified 4 months ago by gb • 60 • written 4 months ago by luciaaheitor • 20

gb • 60 wrote:

Galaxy executes Python scripts in a virtual environment, so it uses the packages installed there and not the ones installed on your machine. I think you need to add a requirement to the tool: https://docs.galaxyproject.org/en/master/admin/dependency_resolvers.html#dependency-resolvers. You can also run Galaxy without the venv: https://docs.galaxyproject.org/en/latest/admin/framework_dependencies.html. Someone else can probably explain this better, though.

Personally, I have the tool execute a bash script that calls the Python script, because that is easier in my current set-up. (I'm not sure whether this is recommended.)

So in my tool XML:
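To see which interpreter a tool actually runs under, a small diagnostic like the one below can help. This is only a sketch: the function name `describe_environment` is made up for illustration, and `pdflatex` is an assumption about which LaTeX executable the script calls. It reports the interpreter path, whether it is a virtual environment, and whether the listed modules and programs can be found.

```python
import importlib.util
import shutil
import sys


def describe_environment(modules=("pandas",), programs=("pdflatex",)):
    """Report which interpreter is running and what it can find.

    `modules` are top-level module names; `programs` are executable names
    looked up on PATH. Both defaults are assumptions for this thread.
    """
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # Inside a venv, sys.prefix differs from the base interpreter's prefix.
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        # find_spec returns None when a top-level module cannot be imported.
        "modules": {
            name: importlib.util.find_spec(name) is not None for name in modules
        },
        # shutil.which returns the full path, or None if not on PATH.
        "programs": {name: shutil.which(name) for name in programs},
    }


if __name__ == "__main__":
    # Print to stderr so the report is kept apart from the tool's normal output.
    for key, value in describe_environment().items():
        print(f"{key}: {value}", file=sys.stderr)
```

Running it once from your shell and once as a Galaxy tool should show whether the two use different interpreters, which would explain an ImportError that only appears inside Galaxy.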

<command interpreter="bash">
my_tool_name.sh '$input' '$output' '$param1' '$param2' '$param3'
</command>

Bash script:

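#!/bin/bash
# Quote the arguments so paths containing spaces stay intact
my_tool.py -i "$1" -o "$2" -t "$3" -f "$4" -m "$5"

My Python script starts with a shebang that points at the system Python, so it does not run in Galaxy's virtual environment: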

#!/usr/bin/python3

Bash script if it is a python pipeline:

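#!/bin/bash
# Create a unique working folder for this run
tempFolder=$(mktemp -d /your/location/XXXXXX)
my_tool.py -i "$1" -o "$tempFolder" -t "$3" -f "$4" -m "$5"
# Move the result to the Galaxy output location, then clean up
mv "$tempFolder/outputfile" "$2"
rm -rf "$tempFolder"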

I do something like this when a Python script uses multiple files or produces multiple outputs. The wrapper makes a temporary folder, and the script stores all of its temporary and output files there. When the script has finished, I move (mv) the results to the Galaxy output location and remove the temporary folder. This also means that when I run the Python pipeline from the command line, I get a nicely organised output folder. Most of the pipelines I write in Python contain a function with something like this in it:

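from subprocess import call

# args.tempdir is the working folder passed on the command line
call(["mkdir", "-p", args.tempdir])
call(["mkdir", "-p", args.tempdir + "/temp_files"])
call(["mkdir", "-p", args.tempdir + "/output_files"])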
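The same temp-folder pattern can also be written in pure Python. The sketch below is not the script from this answer: `run_in_temp_folder`, `pipeline`, `result_name` and `galaxy_output` are hypothetical names, and it assumes the pipeline writes its final file into the `output_files` sub-folder.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_temp_folder(pipeline, result_name, galaxy_output, parent=None):
    """Run `pipeline(work_dir)` in a scratch folder, then move one result out.

    `pipeline` is any callable that takes the working folder as a Path and
    writes `result_name` into its `output_files` sub-folder (an assumption).
    """
    # mkdtemp creates a uniquely named folder, like `mktemp -d` in bash.
    work_dir = Path(tempfile.mkdtemp(dir=parent))
    try:
        (work_dir / "temp_files").mkdir()
        (work_dir / "output_files").mkdir()
        pipeline(work_dir)
        # Hand the result to Galaxy at the path it supplied for the output.
        shutil.move(str(work_dir / "output_files" / result_name), str(galaxy_output))
    finally:
        # Remove the scratch folder even if the pipeline raised an error.
        shutil.rmtree(work_dir, ignore_errors=True)
```

The `try`/`finally` means the scratch folder is cleaned up on failure too, which the bash version above does not do if `my_tool.py` exits early.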

modified 4 months ago • written 4 months ago by gb • 60

I think this is exactly what I needed, thank you so much! I'm still working on getting everything to run, but the problem I wanted to address here is solved. I had added requirements to the tool's XML file, but I didn't see any change anywhere, and the available information didn't make it clear to me how to proceed.

Once again, thank you for sharing your method!

written 4 months ago by luciaaheitor • 20

Gitter thread: https://gitter.im/galaxyproject/Lobby?at=5b60c535ac380e3f3a10ac1d

Glad they were able to help you to sort out this issue, too, combined with gb's advice! Jen

written 4 months ago by Jennifer Hillman Jackson ♦ 25k

Also keep in mind that you need to restart Galaxy every time you change an XML file. Your problem is solved now, but for reference: if you install a Python package into Galaxy's venv, you will see the installation progress the first time you start Galaxy after adding it. If you do not see any installation progress, something went wrong.

written 4 months ago by gb • 60
