Question: Synthetic Sequencing DataSet: Definition
22 months ago
Hello, I have searched the internet for the definition and purpose of a synthetic DNA database but cannot find an answer.

Can someone explain why one would want to generate such a database and how it can be used?

22 months ago
Devon Ryan
This is rather off-topic for this site, but I'll answer anyway. A synthetic DNA database is a set of data generated to resemble what you would get from a real sequencing experiment, except that the make-up of the input is known. This is useful for benchmarking different tools or methods, since you know the exact proportions and nature of the various species or sequences in the dataset and can therefore check each tool's output against the truth. I've usually only seen this term in the metagenomic literature. Elsewhere, people typically just say something like, "we generated X million synthetic reads from the Y genome with some error profile and used it to compare tool A against tool B".
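
To make the idea concrete, here is a minimal Python sketch of how such a dataset could be produced. Everything in it is an assumption made for illustration: the genomes are random strings rather than real references, the "error profile" is a single uniform substitution rate, and the function names (`random_genome`, `simulate_read`, `simulate_community`) and species labels are hypothetical. It is not how any particular published simulator works.

```python
import random

BASES = "ACGT"


def random_genome(length, rng):
    """Make a random reference sequence (a stand-in for a real genome)."""
    return "".join(rng.choice(BASES) for _ in range(length))


def simulate_read(genome, read_len, error_rate, rng):
    """Sample one read at a random position and add substitution errors."""
    start = rng.randrange(len(genome) - read_len + 1)
    bases = list(genome[start:start + read_len])
    for i, base in enumerate(bases):
        if rng.random() < error_rate:
            # Replace the base with one of the three other bases.
            bases[i] = rng.choice([b for b in BASES if b != base])
    return start, "".join(bases)


def simulate_community(genomes, proportions, n_reads, read_len, error_rate, seed=0):
    """Draw reads from several genomes in known proportions.

    Returns a list of (species, start, read) tuples, so the true origin
    of every read is stored next to the read itself.
    """
    rng = random.Random(seed)
    names = list(genomes)
    weights = [proportions[name] for name in names]
    reads = []
    for _ in range(n_reads):
        species = rng.choices(names, weights=weights, k=1)[0]
        start, read = simulate_read(genomes[species], read_len, error_rate, rng)
        reads.append((species, start, read))
    return reads


if __name__ == "__main__":
    rng = random.Random(42)
    genomes = {
        "species_A": random_genome(5000, rng),
        "species_B": random_genome(5000, rng),
    }
    # Known input make-up: 80% species_A, 20% species_B (illustrative values).
    truth = {"species_A": 0.8, "species_B": 0.2}
    reads = simulate_community(genomes, truth, n_reads=1000, read_len=100, error_rate=0.01)
    for name in genomes:
        observed = sum(1 for species, _, _ in reads if species == name) / len(reads)
        print(name, "expected", truth[name], "sampled", observed)
```

Because each read carries its true species and position, you can run a classifier or aligner on the reads alone and then score its answers against those labels, which is the benchmarking use described above.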

