RNAseq: RNA STAR - Mapping to a viral genome


10 weeks ago by

Hi, I am analyzing an infection time course in a human cell line transduced with Vaccinia virus. The Vaccinia virus genome is not available in the genomes section. Following the instructions here (https://galaxyproject.org/learn/custom-genomes/), I downloaded the Vaccinia virus genome (https://www.ncbi.nlm.nih.gov/nuccore/AY243312.1?report=fasta) (approx. 195 kb), normalized it with the NormalizeFasta tool, trimmed the first line, and used the result as my custom genome. I set this up as a custom build and ran the alignment with RNA STAR, with no success.

1) The same setup works fine with BWA. Could someone offer any insight into why RNA STAR fails?

2) Viral genomes have no chr contig references. Has anyone used a viral genome as a custom genome? Could anyone let me know how to format the chromosome coordinates for RNA STAR alignment?

3) How do you append the viral genome to the human genome and merge the human and viral GTF/GFF files?

Thanks for your help and advice. Anything that uses Galaxy tools for the manipulation would be a plus. Cheers, Nambi

mixed genome viral infection alignment rna-seq • 221 views


modified 9 weeks ago by Jennifer Hillman Jackson ♦ 25k • written 10 weeks ago by esundaramoorthy • 0

9 weeks ago by

Jennifer Hillman Jackson ♦ 25k

United States


Hello,

RNA STAR is a splice-aware aligner and expects the input FASTQ reads to come from spliced transcripts (RNA). BWA is not splice-aware and expects unspliced input (DNA or unspliced RNA).
The ">" line of the custom genome is the "chromosome" name. It is OK if this is just "chr", or you can modify it yourself to be more specific. Just make sure that it is all "one_word" (no spaces) and that it exactly matches the chromosome identifiers in the related GTF dataset, to avoid a mismatch problem.
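
As an illustration of the naming rule above, here is a minimal Python sketch (outside Galaxy) that rewrites the ">" line of a single-sequence FASTA to a one-word identifier and lists any GTF "chromosome" names that have no match in the FASTA. The function names and the identifier "vaccinia" are hypothetical, chosen only for this example; inside Galaxy the same edit can be made with text manipulation tools.

```python
def rename_fasta_header(fasta_text, new_id):
    """Return a single-sequence FASTA with its '>' line replaced by '>new_id'."""
    if not new_id or any(ch.isspace() for ch in new_id):
        raise ValueError("identifier must be one word with no whitespace")
    lines = fasta_text.splitlines()
    n_headers = sum(1 for line in lines if line.startswith(">"))
    if n_headers != 1:
        raise ValueError(f"expected exactly one '>' line, found {n_headers}")
    renamed = [">" + new_id if line.startswith(">") else line for line in lines]
    return "\n".join(renamed) + "\n"


def fasta_ids(fasta_text):
    """Collect the first word of every '>' line."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gtf_seqnames(gtf_text):
    """Collect column 1 of every non-empty, non-comment GTF line."""
    names = set()
    for line in gtf_text.splitlines():
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t")[0])
    return names


def unmatched_gtf_seqnames(fasta_text, gtf_text):
    """Return GTF chromosome names that are absent from the FASTA, sorted."""
    return sorted(gtf_seqnames(gtf_text) - fasta_ids(fasta_text))
```

An empty list from `unmatched_gtf_seqnames` means every annotation line points at a sequence that exists in the custom genome; anything else is a name to fix before mapping.
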
You could upload both reference genomes and combine them (Concatenate) into a single FASTA custom genome to map against. However, with a genome as large as human, tools may run out of memory at runtime on a public server. You could set up your own Galaxy and allocate more memory, and/or index the custom genome once as a built-in genome (to avoid re-indexing every time you map against it). The reference GTFs can also be combined with Concatenate. Just make sure that each "chromosome" identifier is unique (no name shared between the human and viral sequences) and is an exact match to the associated content in the custom genome.
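
For anyone preparing the combined files outside Galaxy, the concatenation and the two checks described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the files are plain (uncompressed) text, and the files are streamed line by line so that the human genome is never held in memory.

```python
def fasta_ids(path):
    """Return the first word of every '>' line in a FASTA file, in order."""
    ids = []
    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                fields = line[1:].split()
                if not fields:
                    raise ValueError(f"empty '>' line in {path}")
                ids.append(fields[0])
    return ids


def gtf_seqnames(path):
    """Return the set of column-1 names from non-comment GTF lines."""
    names = set()
    with open(path) as fh:
        for line in fh:
            if line.strip() and not line.startswith("#"):
                names.add(line.split("\t")[0])
    return names


def concatenate(paths, out_path):
    """Append the files in order, ensuring each one ends with a newline."""
    with open(out_path, "w") as out:
        for path in paths:
            last = ""
            with open(path) as fh:
                for line in fh:
                    out.write(line)
                    last = line
            if last and not last.endswith("\n"):
                out.write("\n")


def combine_genomes(fasta_paths, out_path):
    """Concatenate FASTA files after checking that no identifier repeats."""
    seen = set()
    for path in fasta_paths:
        for seq_id in fasta_ids(path):
            if seq_id in seen:
                raise ValueError(f"duplicate identifier: {seq_id}")
            seen.add(seq_id)
    concatenate(fasta_paths, out_path)
    return seen


def combine_annotations(gtf_paths, genome_ids, out_path):
    """Concatenate GTF files after checking every name exists in the genome."""
    for path in gtf_paths:
        missing = gtf_seqnames(path) - set(genome_ids)
        if missing:
            raise ValueError(f"{path}: names not in genome: {sorted(missing)}")
    concatenate(gtf_paths, out_path)
```

With hypothetical file names, usage would look like `ids = combine_genomes(["human.fa", "vaccinia.fa"], "combined.fa")` followed by `combine_annotations(["human.gtf", "vaccinia.gtf"], ids, "combined.gtf")`. Both functions validate before writing, so a name clash or mismatch is reported before any output file is created.
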

FAQs: https://galaxyproject.org/support/

Tutorials: https://galaxyproject.org/learn/

Thanks! Jen, Galaxy team

modified 9 weeks ago • written 9 weeks ago by Jennifer Hillman Jackson ♦ 25k
