Question: use HIV as reference genome
bhavna.hora (United States), 4.4 years ago, wrote:

Hi,

I have four sequence reads that together cover the whole HIV genome. I want to align them to form a contig and generate a consensus sequence, and then compare that consensus with a reference sequence. Can I do this using Galaxy?

There is no HIV genome option when I try to run a paired-end alignment. What should I use?

Thanks,

Bhavna

 

Tags: assembly, bwa, bowtie, galaxy
tom.bair (United States), 4.4 years ago, wrote:

You can try making your own custom genome build: User -> Custom Builds -> upload your own FASTA file.

You might need to create several builds to cover all of the subgroups, unless you know which one your sample is likely to be from.
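Before uploading, it can help to sanity-check the FASTA file locally. Below is a minimal sketch in Python; the specific rules it checks (no blank lines, a short identifier with no description after it, unique identifiers, consistent line wrapping) are assumptions about what typically causes trouble with custom references, and `check_fasta` is a hypothetical helper name, not part of Galaxy.

```python
def _check_wrapping(record_id, lengths, problems):
    """Flag a record whose sequence lines are not wrapped consistently."""
    if record_id is None:
        return
    if not lengths:
        problems.append(f"{record_id}: no sequence lines")
    # Every line except the last should have the same length,
    # and the last line must not be longer than the others.
    elif len(set(lengths[:-1])) > 1 or lengths[-1] > lengths[0]:
        problems.append(f"{record_id}: inconsistent line wrapping")


def check_fasta(text):
    """Return a list of human-readable problems found in FASTA text."""
    problems = []
    seen_ids = set()
    current_id = None
    lengths = []

    for n, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            problems.append(f"line {n}: blank line")
            continue
        if line.startswith(">"):
            _check_wrapping(current_id, lengths, problems)
            header = line[1:]
            fields = header.split()
            current_id = fields[0] if fields else ""
            lengths = []
            if not current_id:
                problems.append(f"line {n}: empty identifier")
            elif header != current_id:
                problems.append(f"line {n}: text after the identifier")
            if current_id in seen_ids:
                problems.append(f"line {n}: duplicate identifier {current_id}")
            seen_ids.add(current_id)
        elif current_id is None:
            problems.append(f"line {n}: sequence before the first '>' header")
        else:
            lengths.append(len(line.strip()))

    _check_wrapping(current_id, lengths, problems)
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that the file will load.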

 

 

bhavna.hora (United States), 4.4 years ago, wrote:

I am trying to upload the FASTA file but am unable to do so.

Can I get all the references from the HIV database?


Would you be able to explain more about the problems you are having with uploading the FASTA file? Where are you working: on the public Main Galaxy instance (http://usegalaxy.org), or on a local, cloud, or other public Galaxy? On Main and on a CloudMan Galaxy, FTP is set up by default. On a local Galaxy, it can be configured. On other public Galaxy servers, each site is different: quotas, and whether FTP is enabled, can make a difference.

Or maybe better: start with the wiki links below, see if they address the problem, then post back with more details and let us know how it goes.

For reference, you will want to use FTP if the data is over 2 GB (and often even if it is not, since FTP is simple and quick when there are multiple files). The help for Custom reference genomes also covers many specific troubleshooting tips for formatting the file once it is in Galaxy.
http://wiki.galaxyproject.org/Support#Loading_data
http://wiki.galaxyproject.org/Support#Custom_reference_genome
  -> review then follow links to details at
      http://wiki.galaxyproject.org/Learn/CustomGenomes
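
As a rough illustration of the FTP route, here is a minimal Python sketch using the standard-library `ftplib`. The host name, user name, and password are placeholders you would need to get from your Galaxy server's own instructions; `should_use_ftp` and `upload_via_ftp` are hypothetical helper names, and some servers may require a secure variant (for example `ftplib.FTP_TLS`) instead of plain FTP.

```python
import os
from ftplib import FTP

TWO_GB = 2 * 1024**3  # the "over 2 GB" rule of thumb from the post above


def should_use_ftp(size_bytes, n_files=1, limit=TWO_GB):
    """FTP is needed above the size limit and convenient for several files."""
    return size_bytes > limit or n_files > 1


def upload_via_ftp(host, user, password, path):
    """Upload one local file to the FTP server's default directory.

    host, user and password are placeholders (assumptions): use the
    values documented by the Galaxy server you are working on.
    """
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        with open(path, "rb") as handle:
            ftp.storbinary("STOR " + os.path.basename(path), handle)
```

For a small single FASTA file such as an HIV reference, `should_use_ftp(os.path.getsize(path))` would normally be false and the regular browser upload should be enough.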

The sequence data can come from any source, and can be almost any FASTA file. There is no requirement that it be genomic, or an entire genome. For example, sometimes it makes sense to perform analysis with a transcriptome, or with a single chromosome or region of interest, or to create multiple versions of a "reference genome" for sub-groups (as Tom mentioned).
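
If you do go the sub-group route, one way to prepare the files is to split a multi-sequence FASTA into one reference per sub-group before uploading each as its own custom build. The sketch below assumes, purely for illustration, that each header starts with the sub-group label followed by a dot (for example `>B.XX.00.sampleA.ACC0001`); that header layout, the example names, and the `group_by_subtype` helper are all hypothetical, so adjust the parsing to whatever your source files actually use.

```python
from collections import defaultdict


def group_by_subtype(fasta_text):
    """Split FASTA text into {subtype: fasta_text} using the header prefix.

    Assumption: the subtype is everything before the first '.' in the header.
    """
    groups = defaultdict(list)
    subtype = None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            subtype = line[1:].split(".", 1)[0]
        # Skip blank lines and anything before the first header.
        if subtype is not None and line.strip():
            groups[subtype].append(line)
    return {name: "\n".join(lines) + "\n" for name, lines in groups.items()}


if __name__ == "__main__":
    # Made-up records, only to show the grouping.
    example = (
        ">B.XX.00.sampleA.ACC0001\nACGT\n"
        ">C.XX.00.sampleB.ACC0002\nGGCC\n"
        ">B.XX.00.sampleC.ACC0003\nTTAA\n"
    )
    for name, text in sorted(group_by_subtype(example).items()):
        print(name, text.count(">"), "record(s)")
```

Each value in the returned dictionary can be written to its own file and uploaded as a separate custom reference.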

But that is content; let's deal with the loading issue first. Loading should never be a persistent problem, or at least not one that cannot be solved, unless you are on a public site that does not support FTP and the FASTA file is large (or perhaps Custom reference genomes are not supported there).

Thanks! Jen, Galaxy team

Written 4.4 years ago by Jennifer Hillman Jackson