Question: FASTA sequences not blasting.
13 days ago by
gbernard0 wrote:

Hello Great Minds,

My FASTA sequences (in tabular format) are not blasting. I downloaded the UNIPROT fasta.gz file to use as a database. Please provide some direction.

Best regards. GCB

rna-seq • 77 views

Not sure what you mean but this is already wrong:

My FASTA sequences (in tabular format) are not blasting

FASTA is a specific format: each record starts with a header line beginning with ">" followed by a description, and the next line (or lines) contains the sequence.

So your file with sequences needs to be in FASTA format, not tabular. And you would need to run blastp.
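To make the format described above concrete, here is a minimal Python sketch that parses FASTA text and rejects input that does not start with a ">" header line (for example, a tabular file). The function name `parse_fasta` and the example records are made up for illustration; they do not come from the original poster's data.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {header: sequence}.

    Raises ValueError if sequence data appears before any '>' header,
    which is what happens with tabular (non-FASTA) input.
    """
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = ""
        elif header is None:
            raise ValueError("not FASTA: data found before any '>' header line")
        else:
            # A sequence may be wrapped over several lines; join them.
            records[header] += line
    return records


# Hypothetical example records, for illustration only.
example = """>seq1 hypothetical example record
MKTAYIAKQR
QISFVKSHFS
>seq2 another hypothetical record
MADEEKLPPG
"""
records = parse_fasta(example.splitlines())
```

Feeding a tab-separated line such as `seq1<TAB>MKTAYIAKQR` to this function raises `ValueError`, which is a quick way to confirm that a file is not really FASTA.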

written 12 days ago by gb30

Thank you so much! You are correct. The sequences are in FASTA format with a Trinity header from the assembly. I downloaded the UNIPROT protein database and unzipped the file to compare against my sequences. Both are in FASTA format according to Galaxy. The issue now is figuring out what the input should be for the "subject/database sequences" and "protein database" sections in the NCBI BLASTx tool. The protein database section reads "no Blast dbp available". I tried BLASTing my assembled reads against the UNIPROT database, which is saved in my history, and got no results. Any suggestions? Is there a database I can import?

Best regards,

written 11 days ago by gbernard0

I am not familiar with the public servers, so I cannot help you much more. But you need to mention which Galaxy server you are using. Also check the input: before you can BLAST against a reference, you need to index the reference FASTA file. So basically you convert a FASTA file into a BLAST database.
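Outside Galaxy, the indexing step described above is normally done with the BLAST+ command-line program `makeblastdb`, after which `blastx` can search the resulting database. The Python sketch below only builds the two command lines and runs them if BLAST+ is installed. The file names (`uniprot_sprot.fasta`, `Trinity.fasta`, `uniprot_db`, `hits.tsv`) and the e-value cutoff are assumptions for illustration, not values from this thread.

```python
import shutil
import subprocess
from typing import List


def makeblastdb_command(fasta_path: str, db_name: str, dbtype: str = "prot") -> List[str]:
    """Build the makeblastdb call that turns a FASTA file into a BLAST database."""
    if dbtype not in ("prot", "nucl"):
        raise ValueError("dbtype must be 'prot' or 'nucl'")
    return ["makeblastdb", "-in", fasta_path, "-dbtype", dbtype, "-out", db_name]


def blastx_command(query_fasta: str, db_name: str, out_path: str) -> List[str]:
    """Build a blastx call: nucleotide queries translated and searched against a protein database."""
    return [
        "blastx",
        "-query", query_fasta,
        "-db", db_name,
        "-out", out_path,
        "-outfmt", "6",      # tabular output
        "-evalue", "1e-5",   # assumed cutoff, adjust as needed
    ]


def run_if_available(command: List[str]) -> bool:
    """Run the command only when its executable is on PATH; return True if it ran."""
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


# Hypothetical file names, for illustration only.
db_cmd = makeblastdb_command("uniprot_sprot.fasta", "uniprot_db")
search_cmd = blastx_command("Trinity.fasta", "uniprot_db", "hits.tsv")
```

Calling `run_if_available(db_cmd)` and then `run_if_available(search_cmd)` would perform the two steps in order on a machine with BLAST+ installed and the input files present.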


I just checked and went to the tool named "NCBI BLAST+ blastx". You need to change the setting "Subject database/sequences" to "fasta file from history". You also need to check the genetic code setting.

Another thing: I am not sure this is the best approach. UniProt is a large database and your Trinity output probably contains a lot of sequences, so I think the search will take a very long time. Maybe you can reduce your input, for example by removing duplicates. You can also try DIAMOND.
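As a rough illustration of the "remove duplicates" idea, the following Python sketch keeps only the first record for each distinct sequence. It is a simplification that assumes exact, case-insensitive matches only: it does not detect reverse complements or near-identical isoforms. The function names and example records are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_duplicate_sequences(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first record for each distinct sequence (compared case-insensitively)."""
    seen = set()
    unique: List[Tuple[str, str]] = []
    for header, seq in records:
        key = seq.upper()
        if key in seen:
            continue  # exact duplicate of an earlier sequence
        seen.add(key)
        unique.append((header, seq))
    return unique


# Hypothetical transcripts, for illustration only.
example = """>contig_1
ATGGCCAAGT
>contig_2
atggccaagt
>contig_3
ATGTTTCCCG
"""
unique_records = drop_duplicate_sequences(read_fasta(example.splitlines()))
```

Here `contig_2` is dropped because its sequence matches `contig_1` once case is ignored, so fewer queries would be sent to the search.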

modified 10 days ago • written 10 days ago by gb30