Question: Gff3 And Metagenome Data?
Thomas Haverkamp • 30 wrote:
Hi all,

Does anybody have an idea how to do the following in Galaxy?

I have short (400 bp) metagenome reads, and I have used MetaGeneMark to find protein-coding regions in the unassembled reads. MetaGeneMark outputs a GFF3 file (a sample is at the bottom of this post).

I saw that Galaxy has a tool to fetch sequences from a genome file using a GFF-format file: "Extract Genomic DNA using coordinates from assembled/unassembled genomes" <http://main.g2.bx.psu.edu/tool_runner?tool_id=Extract+genomic+DNA+1>. I would like to use that tool, if possible.

The problem, however, is that I get the following error: "Unspecified genome build, click the pencil icon in the history item to set the genome build."

Of course I have no genome, so I am a bit stuck: I have no clue how to use the coordinates in my GFF file to extract those regions from my metagenome reads. Does anybody have an idea for a proper workflow?

Thomas
GFF3 output:
##source-version GeneMark.hmm_PROKARYOTIC 2.7d
##date Thu Mar 24 06:15:18 2011
##Sequence file name: ghm.mfa
##Model file name: /home/genmark/public_html/metagenome/Prediction/bin_MetaGeneMark/MetaGeneMark_v1.mod
FV4B4XA01C8BBF GeneMark.hmm gene 1 513 . + 0 gene_id 1
FV4B4XA01D6PDN GeneMark.hmm gene 2 334 . + 0 gene_id 2
FV4B4XA01DC6SS GeneMark.hmm gene 1 390 . - 0 gene_id 3
FV4B4XA01AOJUF GeneMark.hmm gene 2 400 . - 0 gene_id 4
FV4B4XA01CMP07 GeneMark.hmm gene 1 465 . + 0 gene_id 5
FV4B4XA01CIPQZ GeneMark.hmm gene 1 228 . + 0 gene_id 6
FV4B4XA01DWJZ1 GeneMark.hmm gene 1 459 . - 0 gene_id 7
FV4B4XA01AUE58 GeneMark.hmm gene 237 488 . + 0 gene_id 8
FV4B4XA01C56SJ GeneMark.hmm gene 1 309 . + 0 gene_id 9
FV4B4XA01C56SJ GeneMark.hmm gene 321 422 . + 0 gene_id 10
FV4B4XA01A3DSA GeneMark.hmm gene 3 143 . + 0 gene_id 11
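
For reference, the extraction itself can also be scripted outside Galaxy. Below is a minimal Python sketch, not a Galaxy tool. It assumes the reads are in a FASTA file whose header IDs (the first token after ">") match column 1 of the GFF, and that GFF columns can be split on whitespace as in the sample above. The function names (read_fasta, read_gff, extract) are made up for illustration. GFF coordinates are 1-based and inclusive, and features on the "-" strand are reverse-complemented.

```python
import sys

# Complement table for plain DNA; other characters pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Return {read_id: sequence}; read_id is the first token of the header."""
    chunks, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {read_id: "".join(parts) for read_id, parts in chunks.items()}


def read_gff(lines):
    """Yield (seqid, start, end, strand) for each feature row.

    Comment lines and rows with fewer than 8 fields are skipped
    (assumption: whitespace-separated columns, as in the sample).
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 8:
            continue
        yield fields[0], int(fields[3]), int(fields[4]), fields[6]


def extract(fasta_lines, gff_lines):
    """Return a list of (label, subsequence) for every GFF feature."""
    reads = read_fasta(fasta_lines)
    out = []
    for seqid, start, end, strand in read_gff(gff_lines):
        if seqid not in reads:
            continue  # feature refers to a read that is not in the FASTA
        # 1-based inclusive -> Python slice; an end past the read is truncated.
        sub = reads[seqid][start - 1:end]
        if strand == "-":
            sub = revcomp(sub)
        out.append((f"{seqid}_{start}_{end}_{strand}", sub))
    return out


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python script.py reads.fasta predictions.gff > genes.fasta
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as gff:
        for label, sub in extract(fa, gff):
            print(f">{label}\n{sub}")
```

Running it with the reads file and the GFF file as the two arguments writes one FASTA record per predicted gene to standard output.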