Question: No error with CD-HIT-EST-2D
2.7 years ago, peri.tobias wrote:

I am using an HPC at my facility to cluster 8 de novo transcriptomes for downstream analysis using CD-HIT-EST-2D. Here is my script, based on the manual at:

cd-hit-est-2d -i BU_Trinity.fasta -i2 BS_Trinity.fasta -o SYZ1.fasta -c 0.95 -n 10 -d 0 -M 16000 - T 8

There is no error output, so I can't work out the problem, but the job stops running as shown below.

Job Name: SYZ1_CD-HIT
Execution terminated
Exit_status=1
resources_used.cpupercent=0
resources_used.cput=00:00:00
resources_used.mem=0kb
resources_used.ncpus=20
resources_used.vmem=0kb
resources_used.walltime=00:00:02

I would be grateful if anyone can spot anything wrong with my script. Many thanks in advance.
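Because the job exits after two seconds without a visible message, one option is to capture everything the program prints, together with its exit status, in a log file. The sketch below is only an illustration: `run_logged` and `SYZ1.log` are hypothetical names, not part of CD-HIT or the scheduler. Schedulers often write stderr to a separate job file, so that is also worth checking.

```shell
#!/bin/bash
# run_logged is a hypothetical helper (not part of CD-HIT).
# It runs a command, sends stdout and stderr to a log file,
# and appends the command's exit status to that log.
run_logged() {
    local log="$1"
    shift
    local status=0
    "$@" > "$log" 2>&1 || status=$?
    echo "exit status: $status" >> "$log"
    return "$status"
}

# Example use inside the job script (SYZ1.log is an assumed file name):
# run_logged SYZ1.log cd-hit-est-2d -i BU_Trinity.fasta -i2 BS_Trinity.fasta \
#     -o SYZ1.fasta -c 0.95 -n 10 -d 0 -M 0 -T 0
```

Reading the log after the job ends should show any usage or parsing message the program printed before exiting.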



I have had success after changing my parameters slightly (see below), so here I am answering my own question.

cd-hit-est-2d -i BU_Trinity.fasta -i2 BS_Trinity.fasta -o SYZ1.fasta -c 0.95 -n 10 -d 0 -M 0 -T 0
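The only differences from the first command are `-M 0` and `-T 0` in place of `-M 16000 - T 8`. In the CD-HIT usage text, `-M 0` means no memory limit and `-T 0` means use all available CPUs. The first command also had a space between `-` and `T`, which may be what made the program exit immediately; that is an assumption, since the job reported no error. A small pre-flight check for a stray lone `-` can be sketched as follows (`check_args` is a hypothetical helper, not a CD-HIT feature):

```shell
#!/bin/bash
# check_args is a hypothetical helper: it scans an argument list
# for a lone "-" token, which usually indicates a typo such as "- T 8".
check_args() {
    local arg
    for arg in "$@"; do
        if [ "$arg" = "-" ]; then
            echo "stray '-' found"
            return 1
        fi
    done
    echo "ok"
}

# Arguments from the first command (contains "- T 8")
check_args -i BU_Trinity.fasta -i2 BS_Trinity.fasta -o SYZ1.fasta \
    -c 0.95 -n 10 -d 0 -M 16000 - T 8 || true

# Arguments from the working command
check_args -i BU_Trinity.fasta -i2 BS_Trinity.fasta -o SYZ1.fasta \
    -c 0.95 -n 10 -d 0 -M 0 -T 0
```

The first call reports the stray token and the second reports `ok`.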

I got two output files: the new clustered FASTA file and a list of the clustered contigs. Interestingly, the input FASTA files were 167M and 141M in size, while the new FASTA file is 91M. I hope I am not losing sequences that arise from closely related genes, as these are the ones I am hoping to review in my analysis. Any comments or advice appreciated.
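File size says little about how many sequences were kept, so counting FASTA header lines in the inputs and the output is a quick sanity check. According to the CD-HIT user guide, the 2D programs write only those sequences from the `-i2` file that are not similar to the `-i` file at the chosen threshold, and list the matches in the `.clstr` file; that is worth confirming against the cluster list before concluding anything was lost. The sketch below assumes the file names used above, and `count_seqs` is a hypothetical helper.

```shell
#!/bin/bash
# count_seqs is a hypothetical helper: it counts FASTA records
# by counting header lines, which start with '>'.
count_seqs() {
    grep -c '^>' "$1"
}

# Report counts for the inputs and the output, skipping missing files.
for f in BU_Trinity.fasta BS_Trinity.fasta SYZ1.fasta; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_seqs "$f")"
    fi
done
```

Comparing the count for `SYZ1.fasta` with the count for `BS_Trinity.fasta` shows how many contigs were grouped with a contig from the other transcriptome.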

written 2.7 years ago by peri.tobias