I am running htseq-count, and it reports zero counts for every feature. Here is the summary output:
__no_feature 32723329
__ambiguous 0
__too_low_aQual 6544504
__not_aligned 2979996
__alignment_not_unique 0
I went back to double-check the BAM files, which I generated with bwa. I downloaded the FASTQ files and mapped them against a reference genome (FASTA format) that I had first run through the NormalizeFasta tool. I ran flagstat, and in every dataset over 90% of reads were mapped. I also ran the ValidateSamFile tool, which reported INVALID_TAG_NM and no other errors once I ignored that one. Here is an example of my BAM file output (all records have FLAG = 0 or 16, MAPQ = 37, CIGAR = #M, MRNM = 0, MPOS = 0):
@HD VN:1.3 SO:coordinate
@SQ SN:NC_011770.1 LN:6601757
@RG ID:bwa SM:bwa PL:ILLUMINA LB:SRP062593
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa samse -r @RG\tID:bwa\tSM:bwa\tPL:ILLUMINA\tLB:SRP062593 localref.fa first.sai /galaxy-repl/main/files/026/563/dataset_26563510.dat
SRR2288356.1616560 16 NC_011770.1 1 37 56M * 0 0 TTTAAAGAGACCGGCGATTCTAGTGAAATCGAACGGGCAGGTCAATTTCCAACCAG ;GEACGDF@;:?@7FBCFDG??@C98CC8EAAAEHEIGF@E>GFDDFDDADDD@@@ RG:Z:bwa XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:56
SRR2288356.3274251 0 NC_011770.1 2 37 40M * 0 0 TTAAAGAGACCGGCGATTCTAGTGAAATCGAACGGGCAGG CCCFFFFFHHHHHJJJJJJJJJIJJJJJJJJJJJJJIJJJ RG:Z:bwa XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:40
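One check that might be worth running (this is an assumption about the cause, not something the outputs above confirm) is whether the sequence name in the BAM header matches the first column of the annotation file given to htseq-count, and which feature types that annotation contains. Below is a minimal Python sketch of such a check. The function names are hypothetical, and the annotation line is a made-up example rather than a line from my real file:

```python
from collections import Counter


def sam_reference_names(header_text):
    """Collect the SN: values from the @SQ lines of a SAM header."""
    names = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split()[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def annotation_summary(annotation_text):
    """Return (sequence names, feature-type counts) from GFF/GTF text."""
    seqnames = set()
    feature_types = Counter()
    for line in annotation_text.splitlines():
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 3:
            continue
        seqnames.add(cols[0])        # column 1: sequence name
        feature_types[cols[2]] += 1  # column 3: feature type
    return seqnames, feature_types


# Header lines as in the BAM above (tab-separated in a real SAM header).
header = "@HD\tVN:1.3\tSO:coordinate\n@SQ\tSN:NC_011770.1\tLN:6601757\n"

# HYPOTHETICAL annotation line, only to illustrate a possible name mismatch.
annotation = "NC_011770\tRefSeq\tgene\t1\t1500\t.\t+\t.\tID=gene0\n"

bam_names = sam_reference_names(header)
gff_names, feature_types = annotation_summary(annotation)

print("in BAM only:", sorted(bam_names - gff_names))
print("in annotation only:", sorted(gff_names - bam_names))
print("feature types:", dict(feature_types))
```

If the two sets of names do not overlap, or the annotation has no features of the type htseq-count is asked to count (its --type option), that could be one explanation for every read ending up in __no_feature.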
Any suggestions would be greatly appreciated!