Question: Fetching a reference genome failed on a local Galaxy instance
4.4 years ago, patrick.augereau (France) wrote:

I am trying to get a Drosophila reference genome into my local Galaxy instance through the Tool Shed. The data manager installed successfully, but when I ran it ("Run Data Manager Tools", fetching a reference genome) the job failed. Under "View data manager jobs" I can see the command that was run:

python /home/patrick/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_all_fasta/2ebc856bce29/data_manager_fetch_genome_all_fasta/data_manager/data_manager_fetch_genome_all_fasta.py "/home/patrick/galaxy-dist/database/files/000/dataset_33.dat" --dbkey_description 'D. melanogaster Apr. 2006 (BDGP R5/dm3) (dm3)'

The job ended in an error. I opened the job details (the "i" button on the left), chose the complete view, and looked at the Tool Standard Error (stderr):

Traceback (most recent call last):
  File "/home/patrick/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_all_fasta/2ebc856bce29/data_manager_fetch_genome_all_fasta/data_manager/data_manager_fetch_genome_all_fasta.py", line 350, in <module>
    if __name__ == "__main__": main()
  File "/home/patrick/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_all_fasta/2ebc856bce29/data_manager_fetch_genome_all_fasta/data_manager/data_manager_fetch_genome_all_fasta.py", line 345, in main
    REFERENCE_SOURCE_TO_DOWNLOAD[ params['param_dict']['reference_source']['reference_source_selector'] ]( data_manager_dict, params, target_directory, dbkey, sequence_id, sequence_name )
  File "/home/patrick/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/data_manager_fetch_genome_all_fasta/2ebc856bce29/data_manager_fetch_genome_all_fasta/data_manager/data_manager_fetch_genome_all_fasta.py", line 182, in download_from_ucsc
    ftp = FTP( UCSC_FTP_SERVER )
  File "/usr/lib/python2.7/ftplib.py", line 120, in __init__
    self.connect(host)
  File "/usr/lib/python2.7/ftplib.py", line 135, in connect
    self.sock = socket.create_connection((self.host, self.port), self.timeout)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
socket.error: [Errno 111] Connection refused

So this is a connection problem. I usually see this message when the address is wrong or when a connection cannot be established. My lab is behind a proxy; could this be a port problem, and if so, which port must be open?

Thanks in advance for any help.

4.4 years ago, Daniel Blankenberg (United States) wrote:

This data manager was attempting to connect to the UCSC ftp site at ftp://hgdownload.cse.ucsc.edu. It is possible that the UCSC ftp server was down at the time that you attempted to run the tool. Can you try to manually ftp to the site from your machine using a shell? e.g.

$ ftp hgdownload.cse.ucsc.edu
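The same check can be scripted with Python's `ftplib`, the module that fails in the traceback. Here is a minimal sketch, assuming Python 3; `probe_ftp` is a hypothetical helper name. It only opens the FTP control connection (port 21 by default) and reports either the server's greeting or the error.

```python
import ftplib


def probe_ftp(host, port=21, timeout=10.0):
    """Open an FTP control connection and return (ok, detail).

    detail is the server greeting on success, or the error type and
    message on failure.
    """
    ftp = ftplib.FTP()
    try:
        # connect() returns the server's welcome line.
        banner = ftp.connect(host, port, timeout=timeout)
        return True, banner
    except ftplib.all_errors as exc:
        # all_errors covers ftplib errors, OSError (e.g. connection
        # refused, timeouts) and EOFError.
        return False, "{}: {}".format(type(exc).__name__, exc)
    finally:
        ftp.close()


# Example call (needs network access):
#   probe_ftp("hgdownload.cse.ucsc.edu")
```

If this fails with a connection error while a graphical client succeeds from the same machine, the difference is more likely in the network path (for example a proxy the client is configured to use) than in Galaxy itself.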

 


Thanks for your answer. Using ftp from the terminal, I get the same "connection refused" error; with FileZilla, however, I can connect to the server. So it seems to be either a proxy problem or something else I can't figure out. If it is a proxy problem, is it possible to specify a proxy for the Tool Shed, and how?

I set the proxy at the system level, and Internet access works (it also works if I set the proxy manually in Firefox), so everything should be OK.

I am using Ubuntu 14.04.
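One thing worth checking is whether the system-level proxy is visible to Python at all. Below is a minimal sketch using the standard library's `urllib.request.getproxies()`; the helper name `visible_proxies` is hypothetical. Note that the traceback shows `ftplib` opening a direct socket, and `ftplib` has no proxy support of its own, so a proxy that works in Firefox does not necessarily apply to this script.

```python
import urllib.request


def visible_proxies():
    """Return the proxies Python detects, keyed by URL scheme.

    On Linux these come from environment variables such as
    http_proxy, https_proxy and ftp_proxy. A value of None means
    no proxy was detected for that scheme.
    """
    proxies = urllib.request.getproxies()
    return {scheme: proxies.get(scheme) for scheme in ("http", "https", "ftp")}


# Example call:
#   print(visible_proxies())
```

If the `ftp` entry is `None`, programs started from that environment are probably attempting a direct FTP connection, which a lab firewall may refuse.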

 

Reply written 4.4 years ago by patrick.augereau

Using gftp instead of ftp, I got a slightly better idea of what is going on (or so I suppose): hgdownload.cse.ucsc.edu asks me for an ID and password, which is strange since I could connect to the server with FileZilla. I don't know what to enter; neither my usegalaxy.org credentials nor those from http://genome.ucsc.edu are accepted. Do I have to register specifically with hgdownload.cse.ucsc.edu?

 

Reply written 4.4 years ago by patrick.augereau

By the way, the failure may also come from my request being incomplete. Where can I find the value for the "UCSC's DBKEY for source FASTA" field in the data manager?

Thanks for any help.

Reply written 4.4 years ago by patrick.augereau

This is the dbkey that UCSC assigns to its genomes; it may or may not match the one that you designate for use inside Galaxy. However, this is unlikely to be the cause of the failure: the traceback shows that an FTP connection could not be established, and you have the same problem with the ftp command in the terminal.
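To illustrate the distinction, here is a minimal sketch of how a UCSC dbkey might select the download location while the Galaxy dbkey and description only label the genome inside Galaxy. The `goldenPath/<dbkey>/bigZips/` layout, the function name `ucsc_bigzips_url`, and the `galaxy_entry` dictionary are assumptions for illustration; they are not taken from the data manager's source.

```python
UCSC_FTP_SERVER = "hgdownload.cse.ucsc.edu"  # host named earlier in this thread


def ucsc_bigzips_url(ucsc_dbkey):
    """Build the (assumed) FTP directory URL for a UCSC genome build."""
    if not ucsc_dbkey or "/" in ucsc_dbkey:
        raise ValueError("expected a plain UCSC build name such as 'dm3'")
    return "ftp://{}/goldenPath/{}/bigZips/".format(UCSC_FTP_SERVER, ucsc_dbkey)


# Hypothetical record: the UCSC key decides what is fetched, while the
# Galaxy dbkey and description are labels used inside Galaxy.
galaxy_entry = {
    "dbkey": "dm3",
    "description": "D. melanogaster Apr. 2006 (BDGP R5/dm3) (dm3)",
    "source": ucsc_bigzips_url("dm3"),
}
```

In this example both keys happen to be `dm3`; they would differ only if you chose another name for the genome inside Galaxy.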

Reply written 4.4 years ago by Daniel Blankenberg

For the UCSC server:

username: anonymous

password: your full email address
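For reference, this is roughly how such an anonymous login looks with Python's `ftplib`. It is a minimal sketch, assuming Python 3; the function name `anonymous_login` and the email address in the example call are placeholders.

```python
import ftplib


def anonymous_login(host, email, port=21, timeout=30.0):
    """Log in as 'anonymous' with an email address as the password.

    Returns the server's reply to the login (a line starting with
    '230' on success). Connection or login failures raise one of
    ftplib.all_errors.
    """
    ftp = ftplib.FTP()
    ftp.connect(host, port, timeout=timeout)
    try:
        return ftp.login(user="anonymous", passwd=email)
    finally:
        ftp.close()


# Example call (needs network access; replace the address with your own):
#   anonymous_login("hgdownload.cse.ucsc.edu", "you@example.org")
```

No registration is involved: the user name is the literal string `anonymous`.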

Reply written 4.4 years ago by Daniel Blankenberg

Thanks for your hints. I have not solved my problem yet, but it seems related to how my FTP service works, and I am trying to sort that out at the moment.

 

Reply written 4.4 years ago by patrick.augereau