Question: Lost Sequence Info On Update
9.9 years ago, Chris Cole wrote:
Hi,

I've recently done an svn update on my local Galaxy install, which all seemed to go well. However, in the history pane, the information for fasta and fastqsolexa files no longer shows the number of sequences in the file; it only gives the file size. How do I get back the sequence information for those file types?

Thanks,
Chris
9.9 years ago, Greg Von Kuster wrote:
Hello Chris,

The more recent version of the Galaxy code to which you've upgraded changes the set_peek() methods of the data type classes so that they use less memory. The previous version reported the number of sequences in each file, but doing so was memory intensive for large files. To restore the old behavior in your local instance, you'll need to revert the set_peek() methods of the Fasta and FastqSolexa classes in ~/lib/galaxy/datatypes/sequence.py to the following:

    class Fasta( Sequence ):
        """Class representing a FASTA sequence"""
        file_ext = "fasta"

        def set_peek( self, dataset ):
            dataset.peek = data.get_file_peek( dataset.file_name )
            count = size = 0
            for line in file( dataset.file_name ):
                if line and line[0] == ">":
                    count += 1
                else:
                    line = line.strip()
                    size += len(line)
            if count == 1:
                dataset.blurb = '%d bases' % size
            else:
                dataset.blurb = '%d sequences' % count

    class FastqSolexa( Sequence ):
        """Class representing a FASTQ sequence ( the Solexa variant )"""
        file_ext = "fastqsolexa"

        def set_peek( self, dataset ):
            dataset.peek = data.get_file_peek( dataset.file_name )
            count = size = 0
            bases_regexp = re.compile("^[NGTAC]*$")
            for i, line in enumerate(file( dataset.file_name )):
                if line and line[0] == "@" and i % 4 == 0:
                    count += 1
                elif bases_regexp.match(line):
                    line = line.strip()
                    size += len(line)
            if count == 1:
                dataset.blurb = '%d bases' % size
            else:
                dataset.blurb = '%d sequences' % count

Greg Von Kuster
Galaxy Development Team
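To sanity-check the counts outside Galaxy, the same counting logic can be sketched as a small standalone Python 3 module. This is only a sketch under stated assumptions, not Galaxy code: the names `fasta_blurb`, `fastq_blurb`, and `_format_blurb` are hypothetical, and the functions take any iterable of lines (such as an open file handle) instead of a Galaxy dataset object.

```python
import re

# Same pattern as in the answer: a line made up only of base calls.
BASES_RE = re.compile(r"^[NGTAC]*$")


def _format_blurb(count, size):
    """Mirror the answer's blurb rule: bases for one record, else a record count."""
    if count == 1:
        return "%d bases" % size
    return "%d sequences" % count


def fasta_blurb(lines):
    """Summarize FASTA input given as an iterable of lines (hypothetical helper)."""
    count = size = 0
    for line in lines:
        if line.startswith(">"):
            # Header line: start of a new record.
            count += 1
        else:
            # Sequence line: add its length without surrounding whitespace.
            size += len(line.strip())
    return _format_blurb(count, size)


def fastq_blurb(lines):
    """Summarize Solexa FASTQ input given as an iterable of lines (hypothetical helper)."""
    count = size = 0
    for i, line in enumerate(lines):
        if line.startswith("@") and i % 4 == 0:
            # Every fourth line starting with "@" opens a new record.
            count += 1
        elif BASES_RE.match(line):
            # Lines consisting only of base calls count toward the total size.
            size += len(line.strip())
    return _format_blurb(count, size)
```

Because a file handle is read one line at a time, passing one in directly, for example `with open(path) as handle: print(fasta_blurb(handle))`, avoids loading the whole file at once. Here `path` stands for whatever file you want to check.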