Problem with new tool definition xml file

<tool id="vardict" name="VarDict" version="21.07.2018">
<description>Variant caller</description>
<requirements>
</requirements>
<command detect_errors="exit_code"><![CDATA[
    vardict -G "$input1" -f "$input3" -b "$input2" -F 0 -c 1 -S 2 -E 3 -g 4 "$input4" | teststrandbias.R | var2vcf_valid.pl -f "$input3" -A > "$output1"
]]></command>
<inputs>
    <param type="data" name="input1" format="fasta" />
    <param type="data" name="input2" format="bam" />
    <param type="data" name="input3" format="" />
    <param type="data" name="input4" format="bed" />
</inputs>
<outputs>
    <data name="output1" format="vcf" />
</outputs>
<help><![CDATA[
/home/lucia/galaxy/tools/vardict/vardict [-n name_reg] [-b bam] [-c chr] [-S start] [-E end] [-s seg_starts] [-e seg_ends] [-x #_nu] [-g gene] [-f freq] [-r #_reads] [-B #_reads] region_info

VarDict is a variant calling program for SNV, MNV, indels (<120 bp default, but can be set using -I option), and complex variants. It accepts any BAM format, either from DNA-seq or RNA-seq. There are several distinct features over other variant callers. First, it can perform local realignment over indels on the fly for more accurate allele frequencies of indels. Second, it rescues softly clipped reads to identify indels not present in the alignments or support existing indels. Third, when given the PCR amplicon information, it will perform amplicon-based variant calling and filter out variants that show amplicon bias, a common false positive in PCR based targeted deep sequencing. Forth, it has very efficient memory management and memory usage is linear to the region of interest, not the depth. Five, it can handle ultra-deep sequencing and the performance is only linear to the depth. It has been tested on depth over 2M reads. Finally, it has a build-in capability to perform paired sample analysis, intended for somatic mutation identification, comparing DNA-seq and RNA-seq, or resistant vs sensitive in cancer research. By default, the region_info is an entry of refGene.txt from IGV, but can be any region or bed files.

<param type="data_meta" name="Reference" format="fasta" />
<param type="data" name="BAM file" format="bam" />
<param type="integer" value="0.005" name="Frequency" />
<param type="data" name="BED file" format="bed" />

Hotz, Hans-Rudolf • 1.8k wrote:

The whitespace characters in the name attributes are most likely the culprit. The name is used as a variable in the command line. Also, type="data_meta" only works as an attribute for 'filter', as far as I know (for full details see: https://docs.galaxyproject.org/en/latest/dev/schema.html#tool-inputs-param ).

try:

<param type="data" name="Reference" format="fasta" />
<param type="data" name="BAMfile" format="bam" />
<param type="integer" value="0.005" name="Frequency" />
<param type="data" name="BEDfile" format="bed" />

Hope this helps

Regards, Hans-Rudolf

ADD COMMENT • link written 4 months ago by Hotz, Hans-Rudolf • 1.8k

Thank you for replying!

I'm sorry for the "data_meta" part, it was an oversight on my part. However, I did as you suggested and it still doesn't work... I've run the linting tool from planemo with the changed xml and this was the output:

Applying linter tests... WARNING
.. WARNING: No tests found, most tools should define test cases.
.. WARNING: No valid test(s) found.
Applying linter output... CHECK
.. INFO: 1 outputs found.
Applying linter inputs... CHECK
.. INFO: Found 4 input parameters.
Applying linter help... CHECK
.. CHECK: Tool contains help section.
.. CHECK: Help contains valid reStructuredText.
Applying linter general... CHECK
.. CHECK: Tool defines a version [21.07.2018].
.. CHECK: Tool defines a name [VarDict].
.. CHECK: Tool defines an id [vardict].
.. CHECK: Tool targets 16.01 Galaxy profile.
Applying linter command... CHECK
.. INFO: Tool contains a command.
Applying linter citations... CHECK
.. CHECK: Found 2 likely valid citations.
Applying linter tool_xsd... CHECK
.. INFO: File validates against XML schema.
Failed linting

The xml indeed doesn't include a test case but I don't think this is the issue. Does this give more insight on the issue?
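
For reference, the two test warnings in that output go away once the tool has a tests section, and they may be the only reason the run ends in "Failed linting". A minimal sketch of such a section for the parameter names used in this thread is below. The file names are hypothetical placeholders, assumed to sit in a test-data directory next to the tool XML; none of them come from the original post.

```xml
<tests>
    <test>
        <!-- Hypothetical input files, expected under test-data/ -->
        <param name="Reference" value="reference.fasta" />
        <param name="BAMfile" value="sample.bam" />
        <param name="BEDfile" value="regions.bed" />
        <!-- Hypothetical expected result, compared against output1 -->
        <output name="output1" file="expected.vcf" />
    </test>
</tests>
```

The Frequency parameter is left out of the sketch so that the test falls back to the default declared in the inputs section.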

The xml as it is now:

<tool id="vardict" name="VarDict" version="21.07.2018">
<description>Variant caller</description>
<requirements>
</requirements>
<command detect_errors="exit_code"><![CDATA[
    vardict -G "$input1" -f "$input3" -b "$input2" -F 0 -c 1 -S 2 -E 3 -g 4 "$input4" | teststrandbias.R | var2vcf_valid.pl -f "$input3" -A > "$output1"
]]></command>
<inputs>
<param type="data" name="Reference" format="fasta" />
<param type="data" name="BAMfile" format="bam" />
<param type="integer" value="0.005" name="Frequency" />
<param type="data" name="BEDfile" format="bed" />
</inputs>
<outputs>
    <data name="output1" format="vcf" />
</outputs>
<help><![CDATA[
        /home/lucia/galaxy/tools/vardict/vardict [-n name_reg] [-b bam] [-c chr] [-S start] [-E end] [-s seg_starts] [-e seg_ends] [-x #_nu] [-g gene] [-f freq] [-r #_reads] [-B #_reads] region_info

VarDict is a variant calling program for SNV, MNV, indels (<120 bp default, but can be set using -I option), and complex variants.  It accepts any BAM format, either
from DNA-seq or RNA-seq.  There are several distinct features over other variant callers.  First, it can perform local
realignment over indels on the fly for more accurate allele frequencies of indels.  Second, it rescues softly clipped reads
to identify indels not present in the alignments or support existing indels.  Third, when given the PCR amplicon information,
it will perform amplicon-based variant calling and filter out variants that show amplicon bias, a common false positive in PCR
based targeted deep sequencing.  Forth, it has very efficient memory management and memory usage is linear to the region of
interest, not the depth.  Five, it can handle ultra-deep sequencing and the performance is only linear to the depth.  It has
been tested on depth over 2M reads.  Finally, it has a build-in capability to perform paired sample analysis, intended for
somatic mutation identification, comparing DNA-seq and RNA-seq, or resistant vs sensitive in cancer research.  By default,
the region_info is an entry of refGene.txt from IGV, but can be any region or bed files.

(...)

]]></help>
<citations>
    <citation type="doi">10.1093/nar/gkw227</citation>
    <citation type="bibtex">
@misc{githubVarDict,
  author = {LastTODO, FirstTODO},
  year = {TODO},
  title = {VarDict},
  publisher = {GitHub},
  journal = {GitHub repository},
  url = {https://github.com/AstraZeneca-NGS/VarDict},
}</citation>
</citations>
</tool>

ADD REPLY • link modified 4 months ago • written 4 months ago by luciaaheitor • 20

You need to adjust the command line to use the variable names $Reference, $BAMfile, $Frequency and $BEDfile instead of $input1, etc.
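
A sketch of what the command block might look like after that renaming. It assumes each $inputN is simply replaced by the parameter that took its position in the inputs section (input1 to Reference, input2 to BAMfile, input3 to Frequency, input4 to BEDfile); the flags and the pipeline are left exactly as posted.

```xml
<command detect_errors="exit_code"><![CDATA[
    ## Assumed mapping: input1=Reference, input2=BAMfile, input3=Frequency, input4=BEDfile
    vardict -G "$Reference" -f "$Frequency" -b "$BAMfile" -F 0 -c 1 -S 2 -E 3 -g 4 "$BEDfile" | teststrandbias.R | var2vcf_valid.pl -f "$Frequency" -A > "$output1"
]]></command>
```

Every variable referenced here has to match a name attribute in the inputs or outputs section exactly, including case.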

ADD REPLY • link written 4 months ago by Hotz, Hans-Rudolf • 1.8k

As suggested, I did just that and it still doesn't appear in Galaxy. Running the linting tool again fails in the same way as before. I also tried going back to planemo's original xml and changing only the integer parameter (leaving the names as input1, etc.), and it still gives the same error.

    <command detect_errors="exit_code"><![CDATA[
    vardict -G "$input1" -f "$input3" -b "$input2" -F 0 -c 1 -S 2 -E 3 -g 4 "$input4" | teststrandbias.R | var2vcf_valid.pl -f "$input3" -A > "$output1"
]]></command>
<inputs>
    <param type="data" name="input1" format="fasta" />
    <param type="data" name="input2" format="bam" />
    <param type="integer" value="0.005" name="input3" />
    <param type="data" name="input4" format="bed" />
</inputs>

I'm really sorry for the trouble but I really can't figure the problem out...

ADD REPLY • link written 4 months ago by luciaaheitor • 20

What is the error you get when you restart Galaxy and it tries to load the tool?

ADD REPLY • link written 4 months ago by Hotz, Hans-Rudolf • 1.8k

The tool just doesn't show on Galaxy:

What I get with the planemo xml altered https://ibb.co/gfKMKT

What I get without altering it: https://ibb.co/kXY968

ADD REPLY • link modified 4 months ago • written 4 months ago by luciaaheitor • 20

I believe you that the tool doesn't show up. My suggestion was to restart Galaxy and look for the error you get when it tries to load the tool.

If you do that, you will get the following error: "ValueError: An integer is required"

Well, this is a typical oversight, hence change the line to:

and it should work
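
The error points at the Frequency parameter, which is declared as type="integer" but given the default 0.005. A minimal sketch of a line that would avoid the error, assuming the parameter is meant to carry a fractional threshold: switching to the float type is an assumption made here, and keeping type="integer" with a whole-number value would also load.

```xml
<!-- Assumption: Frequency holds a fractional value, so float replaces integer -->
<param type="float" value="0.005" name="Frequency" />
```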

ADD REPLY • link written 4 months ago by Hotz, Hans-Rudolf • 1.8k

You were right, I got that error. I have now changed the xml as you suggested and everything is now working properly. Thank you so much for your time!

ADD REPLY • link written 4 months ago by luciaaheitor • 20
