Question: SAM Filtering And Header/Sorting Issues
6.8 years ago by
denis puthier20 wrote:
Dear All,

I would like to add some filtering steps to my RNA-Seq pipeline. To do so, I took the accepted_hits output from TopHat and applied a filter using NGS: SAM Tools > Filter SAM, selecting reads with the bitwise flag 0x0002 set. This does the job. However, I am unable to run Cufflinks after this step: I get the following error message, which seems to indicate that the file contains no header and is unsorted. Is there a workaround?

Thanks a lot

http://main.g2.bx.psu.edu/u/dputhier/h/srx011549

Error running cufflinks. return code = 1
cufflinks: /lib64/libz.so.1: no version information available (required by cufflinks)
Command line:
cufflinks -q --no-update-check -s 20 -I 300000 -F 0.100000 -j 0.150000 -p 8 -m 200 -g /galaxy/main_pool/pool5/files/003/858/dataset_3858145.dat /galaxy/main_pool/pool1/files/003/858/dataset_3858306.dat
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File /galaxy/main_pool/pool1/files/003/858/dataset_3858306.dat doesn't appear to be a valid BAM file, trying SAM...
[14:11:28] Loading reference annotation.
[14:11:28] Inspecting reads and determining fragment length distribution.
Error: this SAM file doesn't appear to be correctly sorted!
current hit is at chr10:181061, last one was at chr1:245006405
Cufflinks requires that if your file has SQ records in the SAM header that they appear in the same order as the chromosomes names in the alignments.
If there are no SQ records in the header, or if the header is missing, the alignments must be sorted lexicographically by chromsome name and by position.
-- ==================================================================== Denis Puthier laboratoire INSERM TAGC/INSERM U928 Parc Scientifique de Luminy case 928 163, avenue de Luminy 13288 MARSEILLE cedex 09 FRANCE Mail: puthier@tagc.univ-mrs.fr Tel: (National) 04 91 82 87 11 / (International) 33 4 91 82 87 11 Fax: (National) 04 91 82 87 01 / (International) 33 4 91 82 87 01 Web: http://tagc.univ-mrs.fr/puthier http://biologie.univ-mrs.fr/view-data.php?id=245 http://tagc.univ-mrs.fr/tbrowser ====================================================================
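Editor's note: the filtering step described above can be sketched outside Galaxy in a few lines of Python. This is only an illustration of the idea, not what the Filter SAM tool does internally; the function name `filter_sam_keep_header` and the example records are made up for this sketch. It keeps every header line (lines starting with `@`) and only those alignments whose FLAG field (column 2) has the 0x0002 bit set, so the output still carries its @SQ records.

```python
def filter_sam_keep_header(lines, required_flag=0x2):
    """Yield SAM header lines unchanged, plus alignment lines whose FLAG
    has every bit of required_flag set (0x2 = read mapped in a proper pair).

    Hypothetical helper for illustration; it works on SAM text lines only.
    """
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("@"):
            yield line  # header line: pass through so @SQ records survive
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if (flag & required_flag) == required_flag:
            yield line


# Made-up example records (not taken from the dataset in the question).
example = [
    "@HD\tVN:1.0\tSO:coordinate\n",
    "@SQ\tSN:chr1\tLN:249250621\n",
    "r1\t99\tchr1\t100\t50\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII\n",
    "r2\t73\tchr1\t150\t50\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n",
]
kept = list(filter_sam_keep_header(example))
```

Here `r1` (FLAG 99) has the 0x2 bit set and is kept, while `r2` (FLAG 73) does not and is dropped; both header lines are preserved.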
rna-seq cufflinks • 2.7k views
6.8 years ago by
Washington Metropolitan Area
Carlos Borroto390 wrote:
Hi Denis,

In a similar situation I was able to move forward by using NGS: Picard (beta) / Replace SAM/BAM Header to copy the header back from the original, unfiltered BAM or SAM file.

Hope it helps,
Carlos
Hi Carlos,

I had finally found a workaround by selecting the header lines matching ^@ and merging them with the SAM file, but I think your solution is far more elegant. I'll try it.

Thanks
written 6.8 years ago by denis puthier20
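Editor's note: the workaround mentioned in this reply (take the lines matching ^@ and put them back in front of the filtered alignments) might look roughly like the sketch below. It assumes both files are plain-text SAM and that the header comes from the original, unfiltered file; the name `restore_header` and the example records are hypothetical. It is not the Picard tool, just the same idea in miniature.

```python
def restore_header(original_lines, filtered_lines):
    """Return the header of the original SAM followed by the alignments
    of the filtered SAM.

    Hypothetical helper for illustration. Header lines are those starting
    with '@'; any header already present in the filtered input is dropped
    so that header lines are not duplicated.
    """
    header = [line for line in original_lines if line.startswith("@")]
    alignments = [line for line in filtered_lines if not line.startswith("@")]
    return header + alignments


# Made-up example records (not taken from the dataset in the question).
original = [
    "@SQ\tSN:chr1\tLN:249250621\n",
    "@SQ\tSN:chr10\tLN:135534747\n",
    "r1\t99\tchr1\t100\t50\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII\n",
    "r2\t73\tchr10\t150\t50\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\n",
]
filtered = [original[2]]  # headerless output of a filtering step
restored = restore_header(original, filtered)
```

According to the Cufflinks message quoted in the question, the @SQ records must then appear in the same order as the chromosome names in the alignments, so the header should come from the file the alignments were filtered from.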