Question: Can I make a 'gene model for splice junctions' for use in RNA STAR inside Galaxy?
6 months ago, eelynlim.13 wrote:

I am a complete newbie to RNA-seq analysis. I am trying to follow online tutorials etc to learn the analysis by myself. I only have a bog-standard office desktop PC to work on, so I'm trying to do everything (as far as possible) in Galaxy.

Trying to map reads with RNA STAR, it is asking for a 'gene model for splice junctions'. Based on the information I have found on other forums, I have downloaded the genome file and the annotation file, but the instructions after that invariably require running STAR on my local machine. Is it possible to generate this gene model within Galaxy itself? If not, are there other workarounds?

Any suggestions immensely appreciated!

rna-seq star gff3 galaxy gtf
6 months ago, Jennifer Hillman Jackson (United States) wrote:


In the Galaxy-wrapped version of RNA STAR, the tool form setting "Gene model (gff3,gtf) file for splice junctions" expects an annotation dataset in either GFF3 or GTF format. The annotation should be based on exactly the same reference genome build/version used for mapping.

Where to obtain the annotation varies by genome. iGenomes, Gencode, and NCBI are common sources. When both GFF3 and GTF versions are available from the source, the GTF version is slightly preferred because several other RNA-seq tools accept only GTF annotation, not GFF3, and you may wish to reuse this annotation in downstream steps.

Support FAQs: >>

Galaxy tutorials, including those for RNA-seq analysis:

Thanks! Jen, Galaxy team


Thanks Jen! That's super helpful.

I've downloaded a file named 'gencode.vM17.primary_assembly.annotation.gtf' and uploaded it to Galaxy. I think this is the annotation file that you mention can be fed into RNA STAR, but for some reason it isn't being recognised as a valid file in the 'gene model' field, even though it is present in my history and can be viewed. Would you have any ideas as to what's gone wrong there? Much appreciated!!

Written 6 months ago by eelynlim.13

The datatype needs to be either gff3 or gtf. The datatype and database metadata assignment must be made correctly for a tool to recognize the input.

The Upload tool may not always guess the correct datatype when using "autodetect". You can assign this in Upload or Edit Attributes.

The genome "database" cannot be autodetected, but you can assign it in Upload or Edit Attributes. It should match the database of the built-in or custom reference genome used for mapping.

Please double-check that the file really is a GTF (inspect the data content and compare it to the GTF format specification), then assign the gtf datatype, either in the Upload tool or afterwards with the Edit Attributes functions. Then make sure the target database in RNA STAR is hg38 and that the GTF has hg38 assigned as its "database". This should resolve not only this problem but also others you may encounter when a gff3 is used.
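If you have a local copy of the annotation, the "inspect the data content" step can be roughed out in a few lines of Python. This is only a sketch, not how Galaxy's own datatype detection works: the function name `guess_annotation_format` and the sample lines are made up for illustration, and it looks only at the first feature line, relying on the two formats' attribute syntax (GTF uses `key "value";` pairs with a mandatory `gene_id`, GFF3 uses `key=value` pairs).

```python
import re


def guess_annotation_format(lines):
    """Return 'gtf', 'gff3', or 'unknown' based on the first feature line.

    Hypothetical helper for illustration; not part of Galaxy or STAR.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        if len(fields) != 9:
            return "unknown"  # both formats need 9 tab-separated columns
        attributes = fields[8]
        if re.search(r'\bgene_id "[^"]*"', attributes):
            return "gtf"  # GTF attributes look like: gene_id "g1";
        if re.search(r"(^|;)\s*\w+=", attributes):
            return "gff3"  # GFF3 attributes look like: ID=exon1;Parent=t1
        return "unknown"
    return "unknown"


# Made-up sample lines, only to show the two attribute styles.
gtf_line = 'chr1\tsource\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";'
gff3_line = "chr1\tsource\texon\t100\t200\t.\t+\t.\tID=exon1;Parent=t1"

print(guess_annotation_format(["##description: example\n", gtf_line]))  # gtf
print(guess_annotation_format(["##gff-version 3\n", gff3_line]))        # gff3
print(guess_annotation_format([gtf_line.replace("\t", " ")]))           # unknown
```

To check a real file, pass an open file handle instead of a list, for example `with open("annotation.gtf") as handle: print(guess_annotation_format(handle))`. The last sample shows a common cause of an "unknown" result: tabs that were converted to spaces somewhere along the way.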

Be aware that RNA STAR uses much more memory to process a job than HISAT2. If the job ends up failing with a memory problem, then try HISAT2 instead.


Specifically, review these for the "how-to":

  • Common datatypes explained
  • The tool I'm using does not recognize any input datasets. Why?
  • How do I find, adjust, and/or correct metadata?
  • Mismatched Chromosome identifiers (and how to avoid them)
  • (summary of common solutions, including the above) My job ended with an error. What can I do?
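The chromosome identifier mismatch covered in the FAQ list above can also be spotted outside Galaxy when you have local copies of both the annotation and a custom reference genome FASTA. The following is a minimal sketch under that assumption; the helper names (`fasta_names`, `annotation_names`, `unmatched_names`) and the sample data are hypothetical, and it does nothing more than compare the first column of the annotation with the FASTA header names.

```python
def fasta_names(lines):
    """Collect sequence names: the text after '>' up to the first whitespace."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:  # ignore a bare '>' with no name
                names.add(parts[0])
    return names


def annotation_names(lines):
    """Collect the first (sequence name) column of a GTF/GFF3 file."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def unmatched_names(annotation_lines, fasta_lines):
    """Sequence names used by the annotation but absent from the genome."""
    return sorted(annotation_names(annotation_lines) - fasta_names(fasta_lines))


# Made-up data: the genome uses 'chr1', one annotation line uses plain '1'.
genome = [">chr1 example sequence\n", "ACGT\n", ">chr2\n", "GGCC\n"]
annotation = [
    'chr1\tsource\texon\t1\t4\t.\t+\t.\tgene_id "g1";\n',
    '1\tsource\texon\t1\t4\t.\t+\t.\tgene_id "g2";\n',
]

print(unmatched_names(annotation, genome))  # ['1']
```

An empty list means every sequence name in the annotation also exists in the genome. Any names that are printed are the ones to reconcile, typically by getting the annotation and the genome from the same source and build.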
Modified 6 months ago • written 6 months ago by Jennifer Hillman Jackson

Thanks so much Jen, it's running now. I'll look up the rest of the stuff as I get to them! Really appreciate your help!

Written 6 months ago by eelynlim.13