The simplicity of the MQN system belies its apparent ability to effectively represent structural characteristics of compounds that imply important bioactivities. In order to obtain evidence of the utility of the system for drug discovery programs, this study assessed the potential of MQN representations to form the basis for predictive models and to estimate the structural diversity of chemical libraries.
Statistical analysis of the MQN descriptors for several sets of ligands established the presence of properties that serve to distinguish among the likely targets for the ligands. The construction of classifiers trained on these ligands and tested with both ligands and decoys produced predictive models with accuracy that is competitive with models currently used in drug discovery. It was further shown that transformation of the MQN space by dimensionality reduction techniques and kernel methods did not serve to improve the classification accuracy of the predictive models. Among the categories of descriptors widely used in the construction of QSAR and QSPR models are constitutional descriptors, which provide structural details without consideration of topology or geometry; the results of this study suggest that MQNs can be useful when such descriptors are appropriate.
The rapid calculation of chemical diversity is an important aspect of chemical and fragment library construction. Additionally, the ability of a representation to describe similarity without specifying the precise details of substructures has been shown to be useful in scaffold hopping (Böhm et al. 2004). Comparisons of structural diversity calculations using MQNs with existing diversity calculation techniques based on fingerprint similarity show that an MQN-based approach produces results that are likely competitive, and possibly superior. Because the properties are discrete, assessments of MQN-space consist primarily of integer calculations and make few demands on more expensive floating-point arithmetic.
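To illustrate why integer-valued descriptors keep diversity estimates cheap, the following minimal sketch computes a mean pairwise distance over a small set of count vectors. It is an illustration under stated assumptions, not the procedure used in this study: the city-block (Manhattan) metric, the function names, and the short six-component vectors are all chosen here for demonstration (real MQN vectors have 42 components).

```python
from itertools import combinations


def cityblock(a, b):
    """City-block (Manhattan) distance between two equal-length integer vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Only integer subtraction, absolute value, and addition are needed.
    return sum(abs(x - y) for x, y in zip(a, b))


def total_pairwise_distance(vectors):
    """Sum of city-block distances over all unordered pairs (stays an integer)."""
    return sum(cityblock(a, b) for a, b in combinations(vectors, 2))


def mean_pairwise_distance(vectors):
    """Mean pairwise distance, used here as a simple (assumed) diversity score."""
    n_pairs = len(vectors) * (len(vectors) - 1) // 2
    if n_pairs == 0:
        raise ValueError("need at least two vectors")
    # The single floating-point operation happens only at the very end.
    return total_pairwise_distance(vectors) / n_pairs


# Hypothetical count vectors standing in for MQN descriptors of three compounds.
library = [
    (6, 0, 1, 1, 0, 2),
    (7, 1, 1, 0, 0, 2),
    (12, 2, 3, 1, 1, 4),
]

if __name__ == "__main__":
    print(total_pairwise_distance(library))
    print(mean_pairwise_distance(library))
```

In this sketch every per-pair operation is integer arithmetic, and a single division is deferred to the final averaging step, which is the property the paragraph above attributes to assessments of MQN-space.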
Establishing the value of MQNs is significant because the computational intensity of many existing systems, which may perform expensive calculations of topological surface areas, pharmacophore fingerprints, and partition coefficients, prevents their widespread use on vast chemical spaces. The speed of calculation for MQN properties makes the system useful for the fast assessment and classification of huge chemical spaces, as evidenced by its use in characterizing all of the 977 million compounds in the chemical universe database GDB-13 (Blum et al. 2011) and the 166.4 billion compounds in GDB-17 (Ruddigkeit et al. 2013). While speed is an important factor in such scenarios, the representation must also be rich enough to produce useful insights. Though the MQN property space is limited by design to providing only constitutional descriptors, and the phenomenon of MQN isomers may produce difficulties yet to be characterized, this study provides evidence for its efficacy as a tool for cheminformatics tasks relevant to drug discovery.
- Böhm HJ, Flohr A, and Stahl M. 2004. Scaffold hopping. Drug Discovery Today: Technologies 1(3): 217-224.
- Blum LC, van Deursen R, and Reymond JL. 2011. Visualisation and subsets of the chemical universe database GDB-13 for virtual screening. J Comput Aided Mol Des 25(7): 637-647.
- Ruddigkeit L, Blum LC, and Reymond JL. 2013. Visualization and virtual screening of the chemical universe database GDB-17. J Chem Inf Model 53(1): 56-65.