Maybe this is more a question of statistics than bioinformatics, but I thought your insight might be valuable. I ran a very standard RNA-Seq protocol (TopHat, Cufflinks, Cuffmerge, and Cuffdiff) to measure differential gene expression in muscle samples. Looking at the distribution of the adjusted p-values (q_value), I am surprised to see mostly discrete, sparse values rather than a continuous distribution, particularly among the small q-values. For instance, more than 100 transcripts share q = 0.00472172, then 19 transcripts share q = 0.00813184, and so on. I guess I am doing something wrong with Cuffdiff, but I have no idea what it is...
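For reference, here is a minimal sketch of how a tally like the one above can be produced. It assumes the Cuffdiff output is a tab-separated table with a header row containing `p_value` and `q_value` columns; the function name `tally_q_values` is hypothetical, and the inline rows are made-up placeholder data, not my real results.

```python
import csv
import io
from collections import Counter


def tally_q_values(handle, column="q_value"):
    """Count how many rows share each distinct value in `column`.

    `handle` is any open text file (or file-like object) holding a
    tab-separated table with a header row. Values are kept as strings
    so that identical printed values are grouped exactly.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter(row[column] for row in reader)
    # Sort by the numeric value, smallest first.
    return sorted(counts.items(), key=lambda item: float(item[0]))


# Made-up placeholder rows for illustration only (not real Cuffdiff output).
example = io.StringIO(
    "test_id\tp_value\tq_value\n"
    "tx_a\t0.00005\t0.00472172\n"
    "tx_b\t0.00010\t0.00472172\n"
    "tx_c\t0.00015\t0.00472172\n"
    "tx_d\t0.00030\t0.00813184\n"
)

for value, n in tally_q_values(example):
    print(value, n)
```

On a real file, the same function could be called as `tally_q_values(open(path), column="p_value")`, with `path` pointing at the Cuffdiff table, to check whether the ties are already present in the raw p-values or only appear after adjustment.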
Thanks a lot for any advice on how to fix this.