Towards the Quantum Machine: Fast and Accurate Modeling of Quantum Chemical Properties Using Machine Learning

O. Anatole von Lilienfeld
Seminar

Recently, attempts have been made to apply intelligent data analysis methods to quantum mechanical atomistic simulation [1]. I will discuss our newly introduced machine learning approach for the prediction of quantum chemical properties, based on nuclear charges and atomic positions only [2]. The problem of obtaining molecular properties across chemical compound space, i.e., of solving Schrödinger's equation, is mapped onto a non-linear statistical regression problem of reduced complexity. We use the Coulomb matrix to encode the Cartesian coordinates and atomic numbers of any molecular compound. Based on this representation, kernel ridge regression models or neural networks are trained on, and compared to, various properties computed with hybrid density functional theory for a subset of the GDB-13 database [3] consisting of more than seven thousand organic molecules. Cross-validation routinely yields out-of-sample predictions with mean absolute errors in the single-digit percent range, competitive with the accuracy of density functional theory. Investigated properties include atomization energies, HOMO/LUMO eigenvalues, static polarizabilities, and excitation and absorption energies. Applicability and transferability are also demonstrated for the prediction of potential energy curves of unseen compounds.
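
To make the pipeline described above concrete, the following is a minimal sketch of a Coulomb matrix descriptor feeding a kernel ridge regression model. It is an illustration under stated assumptions, not the implementation used in the talk. The abstract does not spell out the matrix elements; the form used here (0.5 Z_i^2.4 on the diagonal, Z_i Z_j / |R_i - R_j| off the diagonal, positions in atomic units) is the commonly cited definition. The sorted-eigenvalue descriptor, the Gaussian kernel, all function names, and the values of sigma and lam are assumptions chosen for illustration, and the training labels are made-up numbers, not computed energies.

```python
import numpy as np

def coulomb_matrix(charges, positions):
    """Coulomb matrix: 0.5*Z_i**2.4 on the diagonal, Z_i*Z_j/|R_i - R_j| elsewhere."""
    z = np.asarray(charges, dtype=float)
    r = np.asarray(positions, dtype=float)
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder so the division below is safe
    m = np.outer(z, z) / dist
    np.fill_diagonal(m, 0.5 * z ** 2.4)
    return m

def descriptor(charges, positions, size):
    """Eigenvalues sorted by decreasing magnitude, zero-padded to `size` (assumed choice)."""
    eig = np.linalg.eigvalsh(coulomb_matrix(charges, positions))
    eig = eig[np.argsort(-np.abs(eig))]
    return np.pad(eig, (0, size - len(eig)))  # requires size >= number of atoms

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def krr_fit(x, y, sigma, lam):
    """Closed-form kernel ridge regression: solve (K + lam*I) alpha = y."""
    k = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x)), y)

def krr_predict(x_train, alpha, x_new, sigma):
    """Predict as a kernel-weighted sum over the training descriptors."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

# Toy data: a two-atom, hydrogen-like system at three separations (atomic units).
# The labels are arbitrary illustrative numbers, NOT quantum chemical results.
separations = [1.0, 1.5, 2.0]
x_train = np.array([descriptor([1, 1], [[0, 0, 0], [0, 0, d]], size=2)
                    for d in separations])
y_train = np.array([-1.0, -1.2, -1.1])

sigma, lam = 0.5, 1e-10  # hypothetical hyperparameters; normally set by cross-validation
alpha = krr_fit(x_train, y_train, sigma, lam)
x_new = descriptor([1, 1], [[0, 0, 0], [0, 0, 1.75]], size=2)[None, :]
prediction = krr_predict(x_train, alpha, x_new, sigma)
```

In practice the kernel width and the regularization strength would be selected by cross-validation on held-out molecules, which is also how the out-of-sample errors quoted above are estimated.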