I have some ChIP-seq data that I aligned with Bowtie2 on Galaxy, which gave me BAM files. I have since run a couple of filtering modules on Galaxy without any problems (in particular, when I used the Filter SAM/BAM module to keep mapped reads, I selected the option to include the header in the output). However, when I download these BAM files and use them at the command line, I keep getting error messages saying that the files are missing an EOF marker and may be truncated. So I went back to Galaxy and converted BAM to SAM, which is supposed to keep the headers (in case that was the issue), and sorted by coordinates. In Galaxy the files look fine, but again, if I download them and try to use them at the command line, I get error messages saying the files are truncated. On the forums I have looked at, people suggest that the BAM/SAM files may be damaged.
So my question is: what is Galaxy doing to the files if they are fine and error-free in Galaxy but unusable outside of Galaxy? Is there some sort of Galaxy-specific format for the files?
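In case it helps with diagnosis: as far as I understand it, the EOF marker the error refers to is the 28-byte empty BGZF block that the SAM/BAM specification places at the end of every complete BAM file. Below is a minimal sketch for checking whether a downloaded file ends with that block. The helper name `has_bgzf_eof` is hypothetical and made up for illustration, and the check only applies to BAM (BGZF-compressed) files, not to plain-text SAM.

```python
import os

# The 28-byte empty BGZF block that the SAM/BAM specification defines
# as the end-of-file marker for BAM files.
BGZF_EOF = bytes([
    0x1F, 0x8B, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00,
    0x00, 0xFF, 0x06, 0x00, 0x42, 0x43, 0x02, 0x00,
    0x1B, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00,
])


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker.

    Hypothetical helper for illustration; it only inspects the last
    28 bytes and does not validate the rest of the file.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to 28 bytes before the end and compare the tail.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

Running this on the copy that Galaxy holds and on the downloaded copy, and comparing their file sizes, should show whether the download itself was cut short.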
My specific reason for wanting to download the files and use them at the command line is that I have a command that very effectively removes reads mapped to chrUn and random contigs. When I have tried to do this in Galaxy in the past, I couldn't get rid of them. If there is an effective Galaxy solution for this, I'll happily stay in Galaxy for my analysis, and I guess the above isn't a problem. I'd still like to know what's going on, though.
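To make the filtering step concrete, here is a rough sketch of the kind of filter I mean. It is not my actual command. It assumes UCSC-style contig names (for example `chrUn_gl000220` or `chr1_gl000191_random`) and works on plain-text SAM lines; the function names `is_unplaced` and `filter_sam_lines` are hypothetical.

```python
def is_unplaced(ref_name):
    """True for chrUn and *_random contigs (assumes UCSC-style names)."""
    return ref_name.startswith("chrUn") or ref_name.endswith("_random")


def filter_sam_lines(lines):
    """Yield SAM lines, skipping reads and @SQ headers on unwanted contigs.

    Only the read's own reference name (RNAME, column 3) is checked;
    the mate's reference (RNEXT, column 7) is left alone in this sketch.
    """
    for line in lines:
        if line.startswith("@"):
            fields = line.rstrip("\n").split("\t")
            if fields[0] == "@SQ":
                # Look up the SN: tag to get the contig name of this header.
                names = [f[3:] for f in fields[1:] if f.startswith("SN:")]
                if names and is_unplaced(names[0]):
                    continue
            yield line
            continue
        fields = line.split("\t")
        if len(fields) > 2 and is_unplaced(fields[2]):
            continue
        yield line
```

Because it takes any iterable of lines, it can be fed an open SAM file handle directly and the kept lines written to a new file.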