Question: RNA-star: splice junction
8 weeks ago, flippy23 wrote:

Hello,

I am new to RNA-seq analysis. I am aligning with STAR using the built-in hg38 index, which does not include a built-in gene model. How do I find or construct the gene model file for splice junctions?

Thanks

Tags: rna-seq, star, galaxy, splice

Update: I used the "ALL" option in this database: https://www.gencodegenes.org/releases/current.html

Would this be appropriate for whole blood RNA-seq?

(Reply written 8 weeks ago by flippy23)
8 weeks ago, Jennifer Hillman Jackson (United States) wrote:

Hello,

That annotation is a good match for hg38. Be sure to use the GTF version of the annotation. There are three choices, and which one to use depends on what you are doing -- for many, the CHR version is the easiest to work with (simpler). You could run the analysis with each version and compare the results to decide for yourself.

The data will load with the datatype gff assigned due to the presence of header lines. Some tools can use the data that way, others will require that you remove the header lines and change the datatype to be gtf.

Remove header lines with the tool Select using the options "NOT Matching" and the regular expression ^#.
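For anyone preparing the file outside Galaxy, the same filtering can be sketched in a few lines of Python. This is only an illustration of the "NOT Matching ^#" step, not what the Select tool runs internally; the function name `strip_gtf_header` and the sample record are made up for the example.

```python
import io


def strip_gtf_header(src, dst):
    """Copy lines from src to dst, skipping lines that start with '#'.

    Returns the number of lines written.
    """
    kept = 0
    for line in src:
        if not line.startswith("#"):
            dst.write(line)
            kept += 1
    return kept


# Tiny in-memory example. The header and the record are illustrative
# placeholders, not lines copied from a real annotation file.
sample = (
    "##description: example header\n"
    "##format: gtf\n"
    'chr1\tEXAMPLE\tgene\t100\t200\t.\t+\t.\tgene_id "EXAMPLE_GENE";\n'
)
out = io.StringIO()
n_kept = strip_gtf_header(io.StringIO(sample), out)
print(n_kept)
```

For a real file, pass two open file handles instead of the `StringIO` objects, one opened for reading and one for writing.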

Thanks! Jen, Galaxy team

8 weeks ago, flippy23 wrote:

Hello

Thank you, that helps a lot. I used the GTF version provided in the link to the right of the files. Does this resolve the issues you are talking about? This is what appears at the top of the file:

##description: evidence-based annotation of the human genome (GRCh38), version 28 (Ensembl 92)
##provider: GENCODE
##contact: gencode-help@ebi.ac.uk
##format: gtf
##date: 2018-03-23

Should I remove these lines manually, or where can I find the tool you are referring to? Could these issues be why my RNA STAR job has not yet run on Galaxy? (I have no other jobs running.)

Thanks


The description lines need to be removed. The Select tool is in the tool panel -- search for it using the search box at the top.

Any jobs that were started using the data with a header still on it should be deleted/purged and rerun with the reformatted GTF.

(Reply written 8 weeks ago by Jennifer Hillman Jackson)