Question: rna-STAR in galaxy local
b.roux (United Kingdom) wrote, 3.5 years ago:

Hi all

I'm trying to run rnastar from my local Galaxy. I downloaded rnastar from the test toolshed, and when I executed it I got an error message:

failure preparing job

and for more details:

Traceback (most recent call last):
  File "/Users/br237/galaxy-dist/lib/galaxy/jobs/runners/__init__.py", line 157, in prepare_job
    job_wrapper.prepare()
  File "/Users/br237/galaxy-dist/lib/galaxy/jobs/__init__.py", line 828, in prepare
    self.command_line, self.extra_filenames = tool_evaluator.build()
  File "/Users/br237/galaxy-dist/lib/galaxy/tools/evaluation.py", line 371, in build
    self.__build_command_line( )
  File "/Users/br237/galaxy-dist/lib/galaxy/tools/evaluation.py", line 387, in __build_command_line
    command_line = fill_template( command, context=param_dict )
  File "/Users/br237/galaxy-dist/lib/galaxy/util/template.py", line 9, in fill_template
    return str( Template( source=template_text, searchList=[context] ) )
  File "/Users/br237/galaxy-dist/eggs/Cheetah-2.2.2-py2.7-macosx-10.6-intel-ucs2.egg/Cheetah/Template.py", line 1004, in __str__
    return getattr(self, mainMethName)()
  File "cheetah_DynamicallyCompiledCheetahTemplate_1425642260_92_41633.py", line 112, in respond
  File "/Users/br237/galaxy-dist/lib/galaxy/tools/wrappers.py", line 122, in __getattr__
    self._fields[ name ] = self._input.options.get_field_by_name_for_value( name, self._value, None, self._other_values )
  File "/Users/br237/galaxy-dist/lib/galaxy/tools/parameters/dynamic_options.py", line 578, in get_field_by_name_for_value
    assert field_name in self.columns, "Requested '%s' column missing from column def" % field_name
AssertionError: Requested 'pathls' column missing from column def

Any thoughts on that?

Thanks for your help

Dannon Baker (United States) wrote, 3.5 years ago:

Just looking at it, this may be a bug in the rna-star wrapper. I would try to contact the repository owner through the toolshed. Note the 'pathls' in the last line of the traceback:

AssertionError: Requested 'pathls' column missing from column def
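
The assertion in the last frame is a plain membership test on the column names declared for the data table the tool reads from. A minimal sketch of that kind of check follows. It assumes a simplified stand-in class: `ColumnTable`, the `value`/`name`/`path` column layout, and the example row are all hypothetical and are not Galaxy's actual code. It only illustrates why a stray name such as `pathls` fails where a declared column name passes.

```python
class ColumnTable:
    """Simplified, hypothetical stand-in for a data table with named columns."""

    def __init__(self, columns, rows):
        self.columns = columns  # maps column name -> position in each row
        self.rows = rows

    def get_field(self, field_name, value):
        # Same style of membership check as the assertion in the traceback.
        assert field_name in self.columns, (
            "Requested '%s' column missing from column def" % field_name
        )
        value_idx = self.columns["value"]
        for row in self.rows:
            if row[value_idx] == value:
                return row[self.columns[field_name]]
        return None


# Hypothetical column layout and row, for illustration only.
table = ColumnTable(
    columns={"value": 0, "name": 1, "path": 2},
    rows=[("hg19", "Human (hg19)", "/data/rnastar/hg19")],
)

# A declared column name: the lookup succeeds.
print(table.get_field("path", "hg19"))

# An undeclared column name: the assertion fails before any lookup happens.
try:
    table.get_field("pathls", "hg19")
except AssertionError as err:
    print("AssertionError:", err)
```

In other words, the error is raised by the name lookup itself, before any index file is touched, which is consistent with a typo in the wrapper rather than a problem with the local data.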

fubar (Australia) wrote, 3.5 years ago:

Thanks for reporting this. It's fixed in repository 'rgrnastar_203e', revision 936a6ca8d60f (repository tip). I'm not sure how that bogus text got there, but it's gone now. Please try updating the tool in your local Galaxy and let me know if that helps.

Note that there's a corresponding data manager for rnastar index files at https://testtoolshed.g2.bx.psu.edu/view/fubar/data_manager_rnasta (also in the test repository). The indexes are huge (~25GB for hg19 indexed with exon boundary annotation!). Once you've used the reference genome data manager to download some genomes, the rnastar index builder can be used to create indexes for rnastar. If you supply a gene model gff3 (obviously, it must match that reference sequence fasta), splice junctions will be reported as part of the mapping.

A final word: read the docs about shared memory. The rnastar tool is designed to share the index among multiple jobs to save on cluster RAM, so if you are running multiple mapping jobs, make sure your job configuration sends them all to queues on the same physical machine(s), where the index can be held in shared RAM.
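
For orientation, the two steps described here (building an index, then mapping against an index kept in shared memory) can be sketched as STAR command lines assembled in Python. This is a hedged sketch under stated assumptions, not the data manager's or the wrapper's actual code: the helper names `star_index_command` and `star_map_command`, the file paths, and the thread count are hypothetical. The STAR options used are standard ones from the STAR manual, which suggests `--sjdbGTFtagExonParentTranscript Parent` for GFF3 annotation; `--genomeLoad LoadAndKeep` asks STAR to keep the loaded genome in shared memory for later jobs on the same machine.

```python
import shlex


def star_index_command(genome_dir, fasta, gff3=None, threads=4):
    """Return the argument list for building a STAR genome index."""
    cmd = [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--runThreadN", str(threads),
    ]
    if gff3 is not None:
        # Gene models are optional; with them, splice junctions are annotated.
        cmd += ["--sjdbGTFfile", gff3,
                "--sjdbGTFtagExonParentTranscript", "Parent"]
    return cmd


def star_map_command(genome_dir, fastq, threads=4, share_index=True):
    """Return the argument list for mapping reads against an existing index."""
    load_mode = "LoadAndKeep" if share_index else "NoSharedMemory"
    return [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq,
        "--runThreadN", str(threads),
        "--genomeLoad", load_mode,
    ]


# Hypothetical paths, for illustration only; nothing is executed here.
index_cmd = star_index_command("/data/rnastar/hg19", "hg19.fa", gff3="genes.gff3")
map_cmd = star_map_command("/data/rnastar/hg19", "reads.fastq")
print(" ".join(shlex.quote(arg) for arg in index_cmd))
print(" ".join(shlex.quote(arg) for arg in map_cmd))
```

The sketch only prints the commands. Shared memory is per machine, which is why the job configuration needs to route all mapping jobs that use the same index to the same host.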

 

Written 3.5 years ago by fubar

Hi Fubar

Thanks for the update. I can now run it. However, I've got another issue: when the job finishes (in about 10 seconds), all my outputs are empty. Is there something else I need to install?

Note that I'm a novice in bioinformatics :)

Thanks for your help

Reply written 3.5 years ago by b.roux

Sorry to hear it's still not working for you. The good news is that the tool passes tests using planemo under Linux for me, so I'll need some more information to figure out what is going wrong for you!

What hardware are you testing it on? Specifically, how much free RAM is available? STAR is very memory hungry, and if you try to run it on a laptop or other machine without enough RAM it will fail, typically with an error message about a missing file called SAindex.

Can you please take a look for any messages from running the job in your Galaxy log and also please look at the stderr and stdout under the "i" tab for one of the outputs and paste anything you find there? 

Reply written 3.5 years ago by fubar (modified 3.5 years ago)

Hi Fubar

I'm running Galaxy on an iMac (OSX 10.2) with a 2.9 GHz i5 core and 32 GB of RAM.

I think I've got more than one issue here. I realised that the data manager actually failed to produce an index for rna-star. I tried both hg19 (fetched by Galaxy) and hg38 (manually installed), adding the gencode v19 and v21 annotations respectively. Both failed with this traceback:

Traceback (most recent call last):
  File "/Users/br237/shed_tools/testtoolshed.g2.bx.psu.edu/repos/fubar/data_manager_rnasta/363d6797d366/data_manager_rnasta/data_manager/rnastar_index_builder.py", line 119, in <module>
    if __name__ == "__main__": main()
  File "/Users/br237/shed_tools/testtoolshed.g2.bx.psu.edu/repos/fubar/data_manager_rnasta/363d6797d366/data_manager_rnasta/data_manager/rnastar_index_builder.py", line 116, in main
    n_threads=options.runThreadN )
  File "/Users/br237/shed_tools/testtoolshed.g2.bx.psu.edu/repos/fubar/data_manager_rnasta/363d6797d366/data_manager_rnasta/data_manager/rnastar_index_builder.py", line 64, in build_rnastar_index
    proc = subprocess.Popen( args=cl, shell=False, cwd=target_directory, stderr=tmp_stderr.fileno() )
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 709, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1326, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

So instead I tried to use the genome without an index. As I said, the job was reported as successful (green) but the outputs were empty. There were some warnings in stderr/stdout:

stdout:

Warning: Some stderr/stdout text
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM file.

stderr:

Warning: Some stderr/stdout text
/Users/br237/galaxy-dist/database/job_working_directory/000/8/galaxy_8.sh: line 17: STAR: command not found
[main_samview] fail to open "Aligned.out.sam" for reading.
/Users/br237/galaxy-dist/database/job_working_directory/000/8/galaxy_8.sh: line 17:  1694 Done(1)                 samtools view -Shb Aligned.out.sam
      1695 Segmentation fault: 11  | samtools sort - AlignedSorted 2> /dev/null
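
Both the `OSError: [Errno 2] No such file or directory` raised from `subprocess.Popen` and the `STAR: command not found` line are what one would expect when the `STAR` executable is not on the `PATH` of the process running the job. A minimal way to check this is sketched below in Python 3 (the log above comes from Python 2.7, where `shutil.which` is not available). The helper name `missing_executables` is hypothetical, and the check only inspects the `PATH` of the current process, which may differ from the environment Galaxy sets up for a job.

```python
import shutil


def missing_executables(names):
    """Return the subset of names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# The two programs the job script in the log above tries to run.
missing = missing_executables(["STAR", "samtools"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("STAR and samtools are both on PATH")
```

If `STAR` is reported as missing, the empty outputs follow naturally: no `Aligned.out.sam` is written, so the downstream `samtools` steps have nothing valid to read.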

I hope this helps.

Thank you

Reply written 3.5 years ago by b.roux

fubar (Australia) wrote, 3.5 years ago:

OK, thanks, this is helpful. These tools are relatively untested outside my lab, which is why they're in the test toolshed! Yes, that should be sufficient RAM. As long as you have heaps of spare disk space you should be fine with STAR, but be warned: SAindex for the tiny 2-sequence fasta file used in the tool test is 1.5GB on disk!

I'll take a look at the data manager later today to find the source of that error - will update here when I have something more to report.

One experiment to try, if you can be bothered: use a history fasta file as the genome. The rnastar mapper tool should index and use it automatically. That removes the data manager from the equation temporarily, although it adds substantially to job running time.

Written 3.5 years ago by fubar

Thanks, I'll try that tomorrow morning and let you know.

I also wanted to try downloading the SA file and SA index from here:

http://labshare.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STARgenomes/GENCODE/GRCh38_Gencode21/

That might do the job.

 

Reply written 3.5 years ago by b.roux

OK, I tried what you suggested and ran STAR with a reference genome from my history. The same thing happened: it ran and was reported as successful, but the outputs were still empty, and I got the same messages in stderr/stdout.

Reply written 3.5 years ago by b.roux

Hi Fubar

I don't know if you've had time to look at STAR.

Maybe this can help: when I go to "Manage installed tool shed repositories", rgrnastar_203e is listed as "installed, missing tool dependencies", and when I click on it, it says that the missing tool dependency is rnastar version 2.4.0d. I tried to reinstall it, but I get this message: "This tool dependency's required tool dependency rnastar version 2.4.0d has status Error.Installed tool dependencies: rnastar". I'm not sure whether this is why I can't get rnastar to work.

Thanks

Reply written 3.4 years ago by b.roux