My earlier question, which I posted as a comment under my original post, did not seem to get a response, so I am re-posting it as a new question in the hope of getting some feedback.
It seems that GATK tools require all input files to be "read groups verified". However, I just realized that the "commonly used" setting of BWA under Galaxy, for example, does NOT specify read groups in the output. My questions are:
(a) Is there a reason why a critical piece of information like this is NOT included in the "commonly used" setting by default? (My understanding is that many non-GATK tools do not require read group information, but to me this is hardly a good reason NOT to include it. When a BAM file is anywhere from a few MB to several GB, adding read groups by default seems VERY innocuous.)
(b) Are all the read group fields (ID, CN, DS, DT, FO, KS, LB, PG, PI, PL, PU, SM) readily available in the original FASTQ? Is there a tool that could extract this information from the FASTQ and add it? If the information is not available in the FASTQ, where could we find it (when all we have are just FASTQ files)?
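To make (b) concrete, here is a rough Python sketch of what I imagine such a tool would do. It is only a guess under stated assumptions: the read names follow the Illumina Casava 1.8+ layout (instrument:run:flowcell:lane:...), the function name `guess_read_group` is something I made up, and the sample and library names are supplied by hand because I do not see them anywhere in the read names.

```python
import re

# Assumed Illumina (Casava 1.8+) read name layout:
# @<instrument>:<run>:<flowcell>:<lane>:<tile>:<x>:<y> <read>:<filtered>:<control>:<index>
HEADER_RE = re.compile(
    r"^@(?P<instrument>[^:\s]+):(?P<run>[^:\s]+):(?P<flowcell>[^:\s]+):(?P<lane>\d+):"
)


def guess_read_group(header, sample, library, platform="ILLUMINA"):
    """Build a tab-separated @RG line from one FASTQ read name (hypothetical helper).

    Only the flowcell and lane are taken from the read name; sample, library
    and platform must be provided by the caller.
    """
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError("read name does not look like Casava 1.8+ format")
    # Using flowcell.lane for both ID and PU is my own choice, not a rule.
    unit = f"{match.group('flowcell')}.{match.group('lane')}"
    fields = [
        "@RG",
        f"ID:{unit}",
        f"PU:{unit}",
        f"SM:{sample}",
        f"LB:{library}",
        f"PL:{platform}",
    ]
    return "\t".join(fields)


# Made-up read name, for illustration only.
example = "@M00123:45:000000000-A1B2C:1:1101:15589:1332 1:N:0:ATCACG"
print(guess_read_group(example, sample="sample1", library="lib1"))
```

If I understand correctly, a line like this is what ends up in the BAM header when read groups are set at alignment time (for example with the `-R` option of `bwa mem`), but I would like confirmation that deriving ID and PU this way is acceptable to GATK.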
Any help? Thank you.