Reference genome/annotation

4.4 years ago by

United States

hsuehyuanc • 0 wrote:

Hello,

I am analyzing apple RNA-seq data. When I run Cufflinks, Cuffcompare, and Cuffdiff, which reference genome or annotation should I use for each step?

The reference data I have are:

Malus_x_domestica.v1.0-primary.transcripts.gff3

Malus_x_domestica.v1.0.consensus2contigs.gff

apple_genome_contigs.nuc

These are probably not the correct ones. What types of genome data should I use for each purpose?

Also, Galaxy only accepts GFF3/GTF files. What should I do with GFF files if I need to use them?

Thank you,

HsuehYuan

rna-seq • 1.3k views

modified 4.4 years ago by Jennifer Hillman Jackson ♦ 25k • written 4.4 years ago by hsuehyuanc • 0

4.4 years ago by

Jennifer Hillman Jackson ♦ 25k

United States

Jennifer Hillman Jackson ♦ 25k wrote:

Hello,

We can help. Here are the guidelines for common usage of the Tuxedo RNA-seq pipeline with a reference genome that is not native to the Galaxy instance you are working on:

1. The Tuxedo pipeline accepts "reference annotation" in GTF or GFF3 format. Plain GFF is not supported (it will not contain the transcript/gene identifiers necessary to be useful). So you will be using this file: Malus_x_domestica.v1.0-primary.transcripts.gff3

2. The "reference genome" you need is the same consensus genomic backbone that the GFF3 is based on (the chromosome identifiers and coordinates that the transcript/gene bounds are mapped to). This would be "Malus_x_domestica.v1.0"? You want the FASTA version of that genome loaded into Galaxy as a "Custom Reference Genome".

3. Your data in "Malus_x_domestica.v1.0.consensus2contigs.gff" and "apple_genome_contigs.nuc" describe how the "reference annotation" was derived from the source genomic contigs. They are useful if there is a transcript assembly, gene assembly, or splice variant question or discrepancy in a region and you wish to investigate whether it is real or an artifact/sequencing issue.
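
Since points 1 and 2 depend on the annotation and the genome sharing the same chromosome identifiers, it can help to check that before uploading anything. Below is a minimal sketch in Python, not part of Galaxy or the Tuxedo tools: the function names (fasta_ids, gff3_seqids, missing_seqids) are made up for illustration, and the sketch assumes that the sequence identifier is the first word of each FASTA header line and the first tab-separated column of each GFF3/GTF line.

```python
import io


def fasta_ids(handle):
    """Return the set of sequence identifiers (first word of each '>' header)."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gff3_seqids(handle):
    """Return the set of values found in column 1 of a GFF3/GTF file."""
    seqids = set()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no feature lines follow
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        seqids.add(line.split("\t", 1)[0])
    return seqids


def missing_seqids(gff_handle, fasta_handle):
    """List annotation identifiers that have no matching FASTA sequence."""
    return sorted(gff3_seqids(gff_handle) - fasta_ids(fasta_handle))


# Tiny in-memory example (hypothetical data, not the apple genome).
demo_fasta = io.StringIO(">chr1 demo\nACGT\n>chr2\nGGCC\n")
demo_gff = io.StringIO(
    "##gff-version 3\n"
    "chr1\tdemo\tgene\t1\t4\t.\t+\t.\tID=gene1\n"
    "chr3\tdemo\tgene\t1\t4\t.\t+\t.\tID=gene2\n"
)
print(missing_seqids(demo_gff, demo_fasta))  # ['chr3']
```

For real data you would pass open file handles for the GFF3 and the genome FASTA instead of the StringIO objects. An empty list means every identifier used in the annotation exists in the genome; anything listed points to a naming mismatch to resolve before running the pipeline.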

Key links to help and many more details (including tutorials, etc):

https://wiki.galaxyproject.org/Support#Tools_on_the_Main_server:_RNA-seq
https://wiki.galaxyproject.org/Support#Custom_reference_genome
https://wiki.galaxyproject.org/Support#Reference_genomes

I have made some assumptions from the given information, so please add clarification where I have misunderstood the context of the available inputs and we can work from there to further customize a solution.

Take care! Jen, Galaxy team
