Question: BAM Files Loading And Mandatory Grooming
Hello everyone, Whenever we try to load BAM files without copying them, we get an error stating that the files need grooming, and we can't use them at all. Is it really this serious? Is there a way to bypass it? Thanks! L-A
Hello L-A, Would you be able to help with a bit more detail and testing? #2 sounds like it may be the issue, but without knowing more right now, I'll provide the next troubleshooting steps.
1 - Is this in your own install, and is it current with the latest -dist, or are you using -central (which pull)? Did something change between the last successful BAM load and the new problem?
2 - Does "without copying" them mean that you are using a symbolic link at your own site? This "load" is into a Library (this is how to move data transferred outside of the Galaxy mechanisms, i.e. around a local file system by direct copy or similar, into a history). Load into a Library first, then copy to a history. Instructions are in the wiki here: 0Files
3 - Or do you mean "without copying" by using the FTP method set up locally, followed by a load into a history?
4 - Does the BAM file metadata (pencil icon -> Edit options) look OK? Or is this the problem?
5 - Is SAMTools loaded into your instance? Do the tools work on other BAM files, or do they fail on all of them? Do some simple SAMTools commands (ones that work with BAM files) run OK on the command line with these files? This is to rule out a problem with the BAM files themselves; a missing .bai index could be a problem.
6 - If you start with the same data and convert SAM->BAM within Galaxy, does the SAM load, and is the resulting BAM file OK?
7 - If you load the BAM file at the public Galaxy web site (using FTP), is the load successful, and does the BAM file appear OK once imported into a history? Please share a link from this test in case we need to examine it (Options -> Share or Publish, generate link, email it to me and I can share it with the dev team if needed).
Thanks for providing more info, or perhaps you will find that one of these uncovers the issue. Best, Jen Galaxy team -- Jennifer Jackson
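For point 5 above, a quick check outside Galaxy can show whether a file looks like a BAM at all and whether an index sits next to it. This is only a rough sketch: the helper names `is_bam` and `find_index` are made up for illustration, and it assumes the index, if present, is named either `x.bam.bai` or `x.bai`. It does not validate the alignments themselves, so it is no substitute for running samtools on the file.

```python
import gzip
from pathlib import Path


def is_bam(path):
    """True if the file decompresses to data starting with the BAM magic bytes."""
    try:
        # BAM is BGZF-compressed, which the gzip module can read.
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == b"BAM\x01"
    except (OSError, EOFError):
        return False  # not gzip/BGZF data, or a truncated file


def find_index(path):
    """Return the first existing index (x.bam.bai or x.bai), or None."""
    path = Path(path)
    # Assumed naming conventions for the index file.
    candidates = [Path(str(path) + ".bai"), path.with_suffix(".bai")]
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return None
```

If `is_bam` returns True but `find_index` returns None, the missing .bai mentioned in point 5 would be the first thing to look at.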
Hello Jennifer, and thanks for your answer. Yeah, I realise I was not very clear, sorry about that.
This is indeed a local install (two of them actually).
$ hg head
changeset: 5585:8c11dd28a3cf
tag: tip
user: Nate Coraor <>
date: Thu May 19 10:07:53 2011 -0400
summary: Add Picard and fastqc tools to Main
I can't remember whether I ever could successfully load a BAM file...
No, it's just that when I add new datasets to a library from filesystem paths, I choose the 'Link files without copying into Galaxy' option.
It looks OK. Well, I guess. When I click on the file in the library, in the 'Miscellaneous information' section I have: "The uploaded files need grooming, so change your Copy data into Galaxy? selection to be Copy files into Galaxy instead of Link to files without copying into Galaxy so grooming can be performed."
Samtools works like a charm. I've been using it through these Galaxy instances for a long time now.
I can't run anything on them: since they're tagged "error", no tool wants them as an input. If I could, I would just ignore the issue.
If you mean when the data is actually copied into Galaxy, yes, everything works fine, since the grooming is performed during the upload process.
I never tried the public instance though. If you need me to share one of the files with you there, I'll have to ask for permission first, since the data's not mine.
Thanks for your help :) Regards, L-A
Hello L-A, The load is necessary to create/link the .bai index for the BAM file. No work-around is available; however, if you develop one, we would be glad to consider it as an addition. If there are any updates on our side, we will send more information. Thanks! Jen Galaxy team -- Jennifer Jackson
Hello Jennifer, Aaaah, so the grooming means creating the .bai file? OK, I didn't know, thanks for the information. But that leaves me wondering why it is necessary to copy the data (unless the alignments are not sorted yet, which is possible knowing where the data comes from). I won't have much time left to look into it, but if I can, I'll let you know. Thanks, L-A On 26/07/2011 19:50, Jennifer Jackson wrote:
2011/7/27 Louise-Amélie Schmitt <>: Sadly, samtools sort doesn't set the header information, so looking at the header is not enough to say if and how a SAM/BAM file is sorted. I think what the Galaxy code does now is attempt to index the BAM file (as a safe way to find out if it really is suitably sorted) and, if that fails, sort it by co-ordinate (which will make a new BAM file) and then index the sorted version. Peter
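The index-then-sort fallback Peter describes could be sketched roughly as below. This is not Galaxy's actual code: `groom_bam` and `run_ok` are hypothetical names, the `.sorted.bam` output name is an arbitrary choice, and the `samtools sort -o` form assumes a samtools 1.x command line (the 2011-era releases took an output prefix instead).

```python
import subprocess
from pathlib import Path


def run_ok(cmd):
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


def groom_bam(bam_path, run=run_ok):
    """Return the path of a BAM file that samtools was able to index.

    Tries to index the file as-is; if that fails, writes a
    coordinate-sorted copy next to it and indexes the copy instead.
    """
    bam_path = Path(bam_path)
    if run(["samtools", "index", str(bam_path)]):
        return bam_path  # indexing worked, so the file was suitably sorted
    # Hypothetical naming scheme for the sorted copy.
    sorted_path = bam_path.with_name(bam_path.stem + ".sorted.bam")
    if not run(["samtools", "sort", "-o", str(sorted_path), str(bam_path)]):
        raise RuntimeError(f"samtools sort failed for {bam_path}")
    if not run(["samtools", "index", str(sorted_path)]):
        raise RuntimeError(f"samtools index failed for {sorted_path}")
    return sorted_path
```

Both branches write something: a .bai beside the original in the first case, or a whole new BAM plus its index in the second. That may be why the grooming step wants its own copy of the data rather than a link to files outside Galaxy, though this is a guess from the thread and not from the Galaxy source.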
On 27/07/2011 10:17, Peter Cock wrote: Ah OK, I understand! It should be fine though, since we won't need to load BAM files anymore shortly. Thanks for all the information! Best, L-A
Hi, is this still the case? I am having the same problem when trying to upload BAM files to a local Galaxy instance without copying them to the server.

