Question: Galaxy Fetch Sequences: how to extract genomic DNA from a FASTA file?
3.3 years ago by
United States
mohamed_ismail wrote:

I am trying to extract a viral genomic DNA sequence using the Fetch Sequences tools. The source of the genomic data is a FASTA file in my history (sequence name: >DQ900900.1).

Unlike human genomic DNA, a virus genome cannot be labelled with a chromosome number. Therefore, I labelled the first column in the interval file as >DQ900900.1. When I run the analysis, I get the warning message shown below:

  Unable to fetch the sequence from '35123' to '100' for chrom '>DQ900900.1'. 

I assume something is wrong with my labels in the first column of the interval file. Please advise.

Thanks

 

3.3 years ago by
United States
Jennifer Hillman Jackson wrote:

Hello,

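Remove the ">" from the identifiers in the interval dataset; this will likely solve part of the issue. The ">" only marks the start of a FASTA title line and is not part of the identifier itself. Beyond that, make certain that the identifiers in the reference FASTA dataset and the interval dataset are identical.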
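As a rough illustration of the identifier check, here is a minimal Python sketch. It assumes the identifier is the first whitespace-delimited token after ">" on the FASTA title line (Galaxy tools may compare against the whole title line, so keeping title lines short is safest). The function names and the sample data are hypothetical and only for illustration.

```python
def fasta_ids(fasta_text):
    """Collect identifiers from FASTA title lines.

    Assumption: the identifier is the text after '>' up to the first whitespace.
    """
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def clean_interval_chrom(name):
    """Strip a stray leading '>' and surrounding whitespace from an interval identifier."""
    return name.strip().lstrip(">").strip()


# Illustrative data only: a tiny FASTA record and one tab-separated interval line.
fasta_text = ">DQ900900.1 example title\nACGTACGTAC\n"
interval_line = ">DQ900900.1\t100\t35123"

chrom, start, end = interval_line.split("\t")
fixed = clean_interval_chrom(chrom)

# The raw label (with '>') does not match any FASTA identifier; the cleaned one does.
print(chrom in fasta_ids(fasta_text))   # False
print(fixed in fasta_ids(fasta_text))   # True
```

Running a check like this on the first column of the interval file before uploading it can catch mismatched labels early.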

The other item to check is that the start coordinate is smaller than the end coordinate. In your warning message the start (35123) is larger than the end (100), so those two columns appear to be swapped. Also make sure the start is "0-based", the same as used in BED format. If the sequence to be extracted is on the complementary strand, designate that by including a strand field.
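To make the coordinate rules concrete, here is a small Python sketch of BED-style extraction. It is not how the Galaxy tool is implemented; it only assumes 0-based, half-open intervals (start included, end excluded) and a "+" or "-" strand value. The `extract` function and the sample sequence are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract(seq, start, end, strand="+"):
    """Return seq[start:end] using 0-based, half-open (BED-style) coordinates.

    For strand '-', the reverse complement of that slice is returned.
    """
    if not 0 <= start < end <= len(seq):
        raise ValueError(
            f"bad interval: start={start}, end={end}, sequence length={len(seq)}"
        )
    sub = seq[start:end]
    if strand == "-":
        sub = sub.translate(COMPLEMENT)[::-1]
    return sub


seq = "AACCGGTTAC"  # illustrative sequence only

print(extract(seq, 0, 4))        # AACC
print(extract(seq, 0, 4, "-"))   # GGTT

# A start larger than the end, as in the warning message, is rejected.
try:
    extract(seq, 6, 2)
except ValueError as err:
    print("rejected:", err)
```

Note that with 0-based, half-open coordinates the first base of the sequence is position 0, and an interval of `0` to `4` covers exactly four bases.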

More about common bioinformatics file formats is in the Galaxy wiki (and also many other places across the internet):
http://wiki.galaxyproject.org/Learn/Datatypes

Best, Jen, Galaxy team
