I have aligned my RNA-seq data using BWA for Illumina, as I would like to use Naive Variant Caller downstream.
I converted my BWA SAM files to BAM files, and merged the four technical replicates I had.
I then sorted the file using SAMtools, and although the dataset turned green, it had this error attached:
Ignoring SAM validation error: ERROR: Record 492151, Read name NS500557:56:H5W5MBGXY:1:11109:18381:11447, bin field of BAM record does not equal
When I filtered my dataset using Picard for Is Mapped, Is Proper Pair, MAPQ >= 20 and NM:>1, I got a vastly smaller file, and I am not sure that is correct. I ran Naive Variant Caller on it, and after almost 24 hours it is still running.
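To make the filter criteria concrete, here is a minimal Python sketch of what such a filter checks on each alignment line of a SAM file. It is only an illustration, not what the Picard tool actually runs: the function name `passes_filter`, its parameters, and the example reads are hypothetical, and it assumes "NM:>1" means "keep reads whose NM tag (edit distance) is greater than 1".

```python
def passes_filter(sam_line, min_mapq=20, nm_greater_than=1):
    """Return True if one SAM alignment line meets all four criteria.

    Hypothetical helper for illustration; thresholds mirror the post.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: bitwise FLAG
    mapq = int(fields[4])   # column 5: mapping quality

    is_mapped = not (flag & 0x4)        # bit 0x4 set means "read unmapped"
    is_proper_pair = bool(flag & 0x2)   # bit 0x2 set means "properly paired"

    # Optional tags start at column 12; NM:i:<n> is the edit distance.
    nm = None
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            nm = int(tag[5:])
            break

    return (
        is_mapped
        and is_proper_pair
        and mapq >= min_mapq
        and nm is not None
        and nm > nm_greater_than   # assumption: "NM:>1" means NM > 1
    )


# Made-up example reads (not taken from the real data set).
seq, qual = "A" * 50, "I" * 50
kept = f"read1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t{seq}\t{qual}\tNM:i:2"
perfect = f"read2\t99\tchr1\t100\t60\t50M\t=\t300\t250\t{seq}\t{qual}\tNM:i:0"

print(passes_filter(kept))     # mapped, proper pair, MAPQ 60, NM 2
print(passes_filter(perfect))  # same read but NM 0, so it fails NM > 1
```

Under this reading, every read that matches the reference perfectly or with a single mismatch is discarded, which on its own could account for a much smaller output file.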
I ran Picard ValidateSamFile on my dataset and got this summary:
Error Type                             Count
ERROR:INVALID_INDEXING_BIN             83
ERROR:INVALID_MAPPING_QUALITY          270
ERROR:INVALID_TAG_NM                   38
ERROR:MATE_NOT_FOUND                   3645430
ERROR:MISMATCH_FLAG_MATE_NEG_STRAND    758661
ERROR:MISMATCH_FLAG_MATE_UNMAPPED      270
ERROR:MISSING_READ_GROUP               1

I am not a bioinformatician and don't really understand computer language, so if anyone can help explain this and help me out I would be very grateful. Thanks!