
pubs.acs.org/JPCA Perspective

A Concise Review on Recent Developments of Machine Learning for the Prediction of Vibrational Spectra

Ruocheng Han,‡ Rangsiman Ketkaew,‡ and Sandra Luber*

Cite This: J. Phys. Chem. A 2022, 126, 801−812

ABSTRACT: Machine learning has become more and more popular in computational chemistry, as well as in the important field of spectroscopy. In this concise review, we walk the reader through a short summary of machine learning algorithms and a comprehensive discussion of the connection between machine learning methods and vibrational spectroscopy, particularly for the case of infrared and Raman spectroscopy. We also briefly discuss state-of-the-art molecular representations which serve as meaningful inputs for machine learning to predict vibrational spectra. In addition, this review provides an overview of the transferability and best practices of machine learning in the prediction of vibrational spectra as well as possible future research directions.

■ INTRODUCTION
Understanding vibrational spectroscopy is a key to unlocking knowledge for in-depth understanding of chemical compounds and to the development of advanced analytical techniques and instruments for determining their characteristics. Conventional spectroscopic computational techniques based on methods derived from quantum mechanics entail time-consuming computations, which frequently necessitate significant human effort. The molecular and materials design process could be greatly accelerated by utilizing machine learning (ML) tools capable of learning effectively from known historic or intentionally generated data that is already available for millions of chemical compounds. ML is a challenging protocol in which one has to create an independent system that can translate from one data set to another, using only molecular descriptors in one dimension and their translations in the other.

Back in the 1990s, ML studies were first carried out for the identification1−3 and prediction4,5 of vibrational spectra. These works established two categories of learning processes: from nuclear structures to vibrational spectra (Struc-to-Spec) and from vibrational spectra to nuclear structures (Spec-to-Struc). With the development of data-driven techniques and the gradual increase of computer capabilities, numerous applications of ML methods involving vibrational spectroscopic analysis of organic molecules or condensed phase systems have been reported in recent literature,6−25 with specific attention and domain knowledge contributed to infrared (IR) and Raman spectroscopy. To make a model intelligent, appropriate ML techniques are needed. Feature engineering (selection/construction of descriptors) and learning algorithms are usually used to transform chemical systems into numerical data and to provide a way for a computer to learn from that information, respectively. Often-used molecular representations include geometric/structural descriptors such as the Coulomb matrix26 and atom-centered symmetry functions,27 and density-based descriptors like the smooth overlap of atomic positions.28 Learning algorithms such as Gaussian process regression29 and different types of artificial neural networks are the most common choices for extraction of the targeted properties (see Categories of Learning Targets section).

Depending on the data required for training models and predicting the output variables, ML methods in chemistry can be grouped into two categories: electronic structure- and nuclear structure-based models. The former uses electronic information such as electronic configurations and electron density obtained from quantum chemistry calculations, while the latter learns the nuclear structures directly from either calculations or experiments. The relationship between electronic structure, nuclear structure, and spectroscopic properties is shown in Figure 1. On one hand, electronic structure determines the nuclear structure (and vice versa) and dynamics as, e.g., calculated with ab initio molecular dynamics (AIMD). On the other hand, properties such as multipole moments and

Received: December 9, 2021
Revised: January 21, 2022
Published: February 8, 2022

© 2022 American Chemical Society https://doi.org/10.1021/acs.jpca.1c10417



Figure 1. Relationship between electronic structure, nuclear structure, and spectroscopic properties: electronic structure determines the nuclear structure (and vice versa) and the vibrational spectra. Multiple methods can be applied to obtain the spectra from a given nuclear structure. Quantum chemistry methods (e.g., density functional theory (DFT) and post-Hartree−Fock (HF)) calculate vibrational spectra via (approximated) electronic structures, usually requiring rather costly computational resources ("hiking up mountains"). Nuclear structure-based ML builds a "tunnel" between nuclear structure and vibrational spectra and gives predictions on both sides (Struc-to-Spec or Spec-to-Struc) through a "bullet train". Yet another possibility is to estimate the electronic structure and use it to predict vibrational spectra (information on electronic structure passed through "satellites"), which can be called electronic structure-based ML.

polarizabilities, which are essential for the calculation of vibrational properties, also rely on the electronic structure. Though electronic structure can potentially provide a better description of the spectroscopic properties due to their "direct" relationship, the corresponding descriptors are usually computationally expensive (e.g., requiring ab initio calculations). Speed-up of static IR and Raman spectra has, for instance, been achieved by calculating only selected vibrational normal modes and associated intensities with high accuracy ("intensity-tracking").30−32 In the aspect of vibrational spectra prediction, most ML studies6−11,15−25 engage in building a "tunnel" between nuclear structure and spectra (or spectroscopic properties), while several others12−14 use atomic densities to represent the electronic structure and make use of the "satellite" link (see Figure 1).

In light of existing ML methods and their applications in vibrational spectroscopy, a concise review is a good kick-off and is essential for those who are interested in this field. The structure of this review is organized as follows:

1. Theoretical background: introduce static and dynamic calculation methods in vibrational spectroscopy and formulas for IR and Raman spectra.
2. Categories of learning targets: categorize the literature by different types of training targets, which correspond to the properties introduced in 1.
3. Data transformation: introduce various molecular representations/descriptors from the literature used for transforming molecules/crystal structures into numeric data.
4. The zoo of machine learning algorithms: introduce ML algorithms as well as their application conditions and features.
5. Discussion: discuss the connection between ML and vibrational spectroscopy and the transferability of an ML model.
6. Concluding remarks and outlook: summarize the entire review and outline possible future challenges for ML applied to vibrational spectra.

Note that the review is not about vibrational spectroscopy with machine-learned force fields (MLFF), which focuses only on power spectra.

■ THEORETICAL BACKGROUND
Two categories of methods relying on first-principles approaches are mainly adopted for vibrational spectroscopy: a static approach, usually based on the harmonic approximation, or a dynamic approach, often using density functional theory-based molecular dynamics (DFT-MD).33

Static Calculation. A vibrational spectrum can be calculated from normal modes which are obtained using the Hessian matrix (H) of an equilibrium structure R0, defined in eq 1. The expression of the matrix element Hα,β is shown in eq 2, where E is the electronic energy, and Rα and Rβ represent components of the atomic position vector in Cartesian coordinates (X, Y, Z); thus R = (X1, Y1, Z1, X2, Y2, Z2, ..., XM, YM, ZM), assuming M atoms in the system. The normal modes can be calculated by diagonalization of the mass-weighted Hessian matrix. The static calculation is usually based on only one equilibrium structure of the system under study, so thermal effects are either not considered or only accounted for after the calculation by applying an artificial broadening to the intensity bands.

$$
\mathbf{H} =
\begin{pmatrix}
H_{1,1} & H_{1,2} & \cdots & H_{1,3M} \\
H_{2,1} & H_{2,2} & \cdots & H_{2,3M} \\
\vdots & \vdots & \ddots & \vdots \\
H_{3M,1} & H_{3M,2} & \cdots & H_{3M,3M}
\end{pmatrix}
\quad (1)
$$

$$
H_{\alpha,\beta} = \left. \frac{\partial^2 E}{\partial R_\alpha \, \partial R_\beta} \right|_{\mathbf{R} = \mathbf{R}_0}, \qquad \alpha, \beta = 1, 2, \ldots, 3M \quad (2)
$$

Dynamic Calculation. Another way of computing vibrational spectra is to calculate the Fourier transform of time correlation functions obtained from MD simulations. This method takes into account the contribution of different nuclear configurations along MD trajectories, going beyond the harmonic approximation and including finite temperature effects. It is worth noting that DFT is usually employed for MD because of a good balance between the accuracy of the obtained results and the computational cost, but it might still be too computationally expensive for very large systems to capture the result within a reasonable time scale.

Infrared Spectroscopy. IR spectroscopy measures the absorption of radiation in the infrared region by matter, which corresponds to the change of electric dipole moments of the substances under study. The IR absorption intensity in a dynamic approach can be calculated as34

$$
I_{\mathrm{IR}}(\omega) \propto \int \langle \dot{\boldsymbol{\mu}}(\tau)\, \dot{\boldsymbol{\mu}}(\tau + t) \rangle_\tau \, e^{-i\omega t} \, dt \quad (3)
$$

where μ̇ is the time derivative of the electric dipole moment, ω is the vibrational frequency, τ is the time lag, and t is the time for integration. ⟨μ̇(τ)μ̇(τ + t)⟩τ denotes the time correlation of μ̇. In a static approach, IR spectra can be computed using derivatives of the electric dipole moment with respect to normal coordinates.

$$
\boldsymbol{\mu} = \sum_{\mu\nu} P_{\mu\nu} \langle \phi_\mu | \mathbf{r} | \phi_\nu \rangle \quad (4)
$$

$$
\boldsymbol{\mu} = \sum_J q_J \mathbf{R}_J \quad (5)
$$

In DFT as a quantum chemistry method, one for instance calculates the electric dipole moment via a trace of the density matrix (Pμν) and integrals of the electric dipole operator r (r = (x, y, z)) over (atomic orbital) basis functions {ϕμ} and {ϕν} (see eq 4). However, more approximate approaches have also been employed in which atoms are treated as point charges (atomic partial charge qJ of atom J), and the electric dipole moment is calculated via eq 5, where RJ represents the Cartesian coordinates of atom J.

Raman Spectroscopy. Raman spectroscopy results from the inelastic scattering of light in the infrared, visible, or ultraviolet region by matter, which corresponds to the change of the electric-dipole−electric-dipole polarizability of the substances under investigation. The Raman scattering intensity IRaman is given as34

$$
I_{\mathrm{Raman}}(\omega) \propto \frac{(\omega_{\mathrm{in}} - \omega)^4}{\omega} \, \frac{1}{1 - \exp\!\left(-\frac{\hbar\omega}{k_B T}\right)} \, S(a^2, \gamma^2) \quad (6)
$$

where S(a², γ²) is a combination of isotropic and anisotropic invariants of the Placzek-type polarizability tensor α,35 ω is the vibrational frequency, ωin is the frequency of the incident light, kB is the Boltzmann constant, and T is the temperature. The expression of S(a², γ²) depends on the experimental setup, and a time correlation formalism is given, e.g., in eqs 27 and 28 of ref 36. In a static approach, the derivatives of the α elements with respect to normal coordinates are needed. On the DFT level, each component of α can be analytically calculated, for instance, by density functional perturbation theory (DFPT),37−40 linear response time-dependent DFT (LR-TDDFT),41 or real-time time-dependent DFT (RT-TDDFT).36,42−44

■ CATEGORIES OF LEARNING TARGETS
Struc-to-Spec. Using a static approach, Ye et al. adopted a divide-and-conquer strategy and constructed a vibrational model Hamiltonian45 for an entire protein.6 The peptide bonds in the protein backbone are regarded as oscillators, and nearest-neighbor coupling is considered for the formulation of the Hamiltonian. Diagonal elements and off-diagonal elements of the vibrational Hamiltonian are separately predicted with neural networks.

More studies in the field have focused on the ML prediction of spectroscopic properties in dynamic calculations. Previous works have used atomic partial charges as a learning target and constructed molecular dipole moments for different nuclear configurations of MD trajectories.7−9 Zhang et al. focused on a condensed phase system, which requires cell polarization for the calculation of IR spectra; therefore, they provide a prediction on Wannier centers rather than atomic positions.10 Kananenka et al. made a direct prediction on the electric dipole derivative and transition frequency for the OH stretching mode of water.11 Several works have introduced ML on the polarizability tensor α so as to predict Raman intensities.12−14 In these studies, α is divided into several components according to its symmetry: Raimbault et al. applied ML on each individual polarizability component αγδ,12 Wilkins et al. decomposed α into its irreducible components (the scalar α(0) and the five-vector α(2)) for learning,13 and Zhang et al. decomposed α into three submatrices for learning.14 In the aspect of semiclassical dynamics, Gandolfi et al. introduced the idea of dividing the vibrational degrees of freedom into subspaces to reduce the dimensionality of the potential term.15

Aside from those based on spectroscopic properties (e.g., dipole moments, polarizabilities), some studies train directly on spectral frequencies or intensities. A counter-propagation (CPG) network46 has been used to predict discrete IR absorption intensities in several early works by Gasteiger and co-workers.4,5,16,17 The full spectrum is discretized into absorption intensities at every 4−40 cm−1, and each interval of the frequency range is predicted separately. In a similar manner, Yildiz et al. made predictions for both IR and Raman intensities.18,19 Recently, Ren et al. trained and evaluated a model on 22K molecules targeting the vibrational frequency, IR intensity, and Raman intensity of OH and CO bond stretches.20 For condensed phase systems, Kwac and Cho studied the solvent effect by investigating the vibrational frequency shift Δω of water using quantum mechanics/molecular mechanics (QM/MM) simulations,21,22 and Hu et al. predicted vibrational frequencies and Raman intensities of 8 vibrational modes for molecule−metal surface systems.23

Spec-to-Struc. Spec-to-Struc, the reverse approach of the Struc-to-Spec ansatz, was also introduced. At an early stage, ML studies were carried out for the recognition of vibrational spectroscopy. Visser et al. introduced partial least-squares (PLS) and ANN models for the band pattern recognition of IR spectra,

and certain functional groups/substructures could be determined.1,2 Similarly, Carrieri and Lim provided an NN solution to predict the existence of 9 compounds from calculated spectra.3 Fu and Hopkins provided an unsupervised learning scheme to cluster isomers of the protonated phenylalanine/serine dimer and identified them from the experimental IR spectra.24 Fine et al. made use of an autoencoder to connect the information between functional groups and peaks of Fourier transform infrared (FTIR) and mass spectroscopy (MS).25 One can then use this information to predict the functional groups or identify the composition of mixtures from the corresponding FTIR and MS data. Ren et al. applied a long short-term memory (LSTM) network for the structure recognition of molecules with up to 10 heavy atoms (C, N, O, F).20

■ MOLECULAR DATA TRANSFORMATION
An atom in a molecule can be thought of as a word in a sentence. Making a computer understand the informational correlation between atoms requires a representation that fits the target of the prediction (label). A representation should not only be rotationally and translationally invariant but also inexpensive, in terms of computational complexity, to generate from the Cartesian coordinates. The Z matrix (also known as internal coordinates) is a simple traditional representation that is often used to describe the geometry of an entire molecule, as it is invariant to rotation. For brevity, in the following, we discuss a few selected molecular representations.

Geometric Descriptors. All geometric descriptors that provide information about the spatial coordinates of atoms in a molecule relate to the symbolic representation of the molecule. There are several geometric descriptors, including the molecular Z matrix and standard and effective coordination numbers. The descriptor of the molecular matrix represents each atom's coordinates (x, y, z) in Cartesian space. In contrast to most descriptors, these descriptors are able to distinguish isomeric molecules (e.g., cis/trans stereoisomers). However, geometrically based representations can still be problematic, since they omit electronic structure. More sophisticated representations developed in the past decade leverage and include electronic descriptions such as atomic force, electron configuration, and correlation between orbitals.47 It can also be difficult to calculate the geometric descriptors due to their complexity.48

Coulomb Matrix. The Coulomb matrix (CM) introduced by Rupp et al. is widely employed because of its simplicity and relatively reduced requirement of a priori knowledge of chemical properties of a molecule.26 It stores information about how atoms interact with each other. Each pair of atoms in a molecule carries the pairwise electrostatic potential energy. In addition, every atom has a set of Cartesian coordinates representing its location in space and has a charge attached to it. The CM element between atoms i and j is simply given by

$$
C_{ij} =
\begin{cases}
0.5\, Z_i^{2.4} & \text{if } i = j \\[4pt]
\dfrac{Z_i Z_j}{R_{ij}} & \text{if } i \neq j
\end{cases}
\quad (7)
$$

where Z is the atomic number and Rij is the interatomic separation. The off-diagonal entries of the CM reflect the Coulomb repulsion between the nuclei, and the exponents in the diagonal entries correspond to a polynomial fit linking the atomic number to the overall energies of the unbound atoms.

Despite being invariant to molecular translation and rotation, the CM is not invariant to atom permutations. To solve this puzzle, a variety of alternative representation methods, such as Permutation-Invariant Polynomials (PIP),49 Randomly Sorted Coulomb Matrices (RSCM),50 Bag of Bonds (BoB),50 and Permutation Invariant Vectors (PIV),51 have been proposed for producing a representation of the molecule that is independent of the ordering of the atoms.

Smooth Overlap of Atomic Positions. Smooth overlap of atomic positions (SOAP) can encode atomic geometries in their chemical environment using a local expansion of a Gaussian smeared atomic density based on radial basis functions (gn(r)) and real spherical harmonic functions (Ylm(θ, ϕ)).28 SOAP is suitable for predicting local properties such as atomic forces or chemical shifts but requires partitioning of global properties such as total energies. The SOAP kernel between two atomic environments can be retrieved as a normalized polynomial kernel of the partial power spectra. The working equation of the SOAP kernel (KSOAP), retrieved as a normalized polynomial kernel of the two neighbor partial power spectra p and p′, is given by

$$
K^{\mathrm{SOAP}}(\mathbf{p}, \mathbf{p}') = \left( \frac{\mathbf{p} \cdot \mathbf{p}'}{\sqrt{\mathbf{p} \cdot \mathbf{p} \;\, \mathbf{p}' \cdot \mathbf{p}'}} \right)^{\xi} \quad (8)
$$

where ξ is a positive integer. The elements of the p vector are defined as

$$
p_{n n' l}^{Z_1 Z_2} = \pi \sqrt{\frac{8}{2l+1}} \sum_m \left( c_{nlm}^{Z_1} \right)^{\dagger} c_{n'lm}^{Z_2} \quad (9)
$$

where n and n′ are indices for radial basis functions up to nmax, l is the angular degree of the spherical harmonics up to lmax, m is an integer such that |m| ≤ l, and Z1 and Z2 are atomic species. The coefficients c_{n′lm}^{Z2} and (c_{nlm}^{Z1})† are defined as the inner products of spherical harmonic functions with the Gaussian smoothed atomic density for atoms with atomic number Z (ρZ), and its complex conjugate, respectively.28

Atom-Centered Symmetry Functions. Atom-centered symmetry functions (ACSFs) generalize the output of multiple two- and three-body functions to estimate the local environment near atoms using a fingerprint method that can be customized to detect specific structural features.27 Radial symmetry functions for the central atom i with neighboring atoms j are given as follows:

$$
G_i^{\mathrm{radial}} = \sum_{j \neq i}^{N_{\mathrm{atom}}} e^{-\eta (r_{ij} - \mu)^2} f_c(r_{ij}) \quad (10)
$$

where η and μ are the parameters controlling the width and the position of the Gaussian function, fc is a cutoff function that selects the relevant regions close to the central nucleus to be encoded into the ACSF, and rij is the distance between atoms i and j. Angular symmetry functions have also been defined.27 Moreover, an input required by ACSFs is the fine-tuning of internal parameters that define the Gaussian functions. A representation similar to ACSFs is the Spectrum of London and Axilrod−Teller−Muto (SLATM), which has recently been used in the literature, mostly with kernel ridge regression (KRR) models.52 Polynomial functions of the inverse of the interatomic distances have also been suggested but are not discussed in this review.47
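The radial symmetry function of eq 10 is easy to prototype. The sketch below uses the cosine cutoff f_c(r) = 0.5[cos(πr/r_c) + 1] for r < r_c, a common choice in the ACSF literature (the text above does not fix a particular f_c), with made-up values for η, μ, and the cutoff radius.

```python
import numpy as np

def cosine_cutoff(r, r_cut):
    """Smooth cutoff: 0.5*(cos(pi*r/r_cut) + 1) inside r_cut, zero outside."""
    return np.where(r < r_cut, 0.5 * (np.cos(np.pi * r / r_cut) + 1.0), 0.0)

def acsf_radial(distances, eta, mu, r_cut):
    """Radial ACSF (eq 10): sum over neighbors of exp(-eta*(r - mu)^2) * f_c(r).

    distances : distances r_ij from the central atom i to its neighbors j
    eta, mu   : Gaussian width and center parameters
    """
    r = np.asarray(distances, dtype=float)
    return float(np.sum(np.exp(-eta * (r - mu) ** 2) * cosine_cutoff(r, r_cut)))

# Made-up example: a single neighbor at r = 1.0 with eta = 1, mu = 0, r_cut = 2,
# giving exp(-1) * 0.5.
g = acsf_radial([1.0], eta=1.0, mu=0.0, r_cut=2.0)
```

In applications one evaluates many such functions with different (η, μ) pairs per atom, so that the resulting fingerprint vector resolves the radial distribution of neighbors; the smooth cutoff ensures the descriptor goes continuously to zero at the edge of the local environment.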

Gaussian-type Orbital-Based Density Vectors. The Gaussian-type orbital (GTO)-based density vector is a descriptor function alternative to the ACSFs and symmetric polynomial functions.22 The GTO-based density vector is given by

$$
\rho_i^{L,\alpha,r_s} = \sum_{l_x + l_y + l_z = L} \frac{L!}{l_x!\, l_y!\, l_z!} \left( \sum_{t=1}^{n_{\mathrm{type}}} c_t \sum_{j=1}^{N_{\mathrm{atom}}^t} \phi_{l_x l_y l_z}^{\alpha, r_s}(\mathbf{r}_{ij}) \right)^{2} \quad (11)
$$

where lx + ly + lz = L specifies the orbital angular momentum, ntype is the number of atom types in the molecule under study, ct is the type-dependent weight, N^t_atom is the number of atoms of type t, and ϕ^{α,rs}_{lxlylz}(rij) is the Gaussian orbital centered at each atom, with the parameters α and rs determining the radial distributions of the orbital functions, given as

$$
\phi_{l_x l_y l_z}^{\alpha, r_s}(\mathbf{r}_{ij}) = x^{l_x}\, y^{l_y}\, z^{l_z}\, e^{-\alpha |r - r_s|^2} \quad (12)
$$

where rij = (x, y, z) indicates the vector from nucleus i to nucleus j, and r is the magnitude of rij. GTOs with L = 0, 1 are usually considered for constructing the density vectors of small organic molecules containing light atoms such as C, H, O, and N.22
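As a concrete illustration of eqs 11 and 12, the L = 0 component reduces to a squared sum of atom-centered Gaussians. The sketch below is a minimal NumPy version of that special case; the atom positions, weights c_t, and parameters α and r_s are made-up example values, and the squared-sum form is the rotationally invariant convention used by embedded-atom-type density descriptors.

```python
import numpy as np

def gto_density_L0(center, neighbor_groups, weights, alpha, r_s):
    """L = 0 GTO-based density component (eq 11 with lx = ly = lz = 0).

    center          : (3,) position of the central atom i
    neighbor_groups : list of neighbor-position lists, one list per atom type t
    weights         : per-type weights c_t, one per group
    """
    total = 0.0
    for c_t, group in zip(weights, neighbor_groups):
        for pos in group:
            r = np.linalg.norm(np.asarray(pos, float) - np.asarray(center, float))
            total += c_t * np.exp(-alpha * (r - r_s) ** 2)  # eq 12 with L = 0
    return total ** 2  # squared sum -> rotationally invariant scalar

# Made-up example: one atom type, a single neighbor exactly at r = r_s,
# so the Gaussian evaluates to 1 and the descriptor is (c_t * 1)^2.
rho = gto_density_L0([0.0, 0.0, 0.0], [[[1.0, 0.0, 0.0]]],
                     weights=[1.0], alpha=0.5, r_s=1.0)
```

For L = 1 one would repeat the inner sum for the three Cartesian components (x, y, z), square each, and add them with the multinomial prefactor; rotating the whole neighbor set leaves the resulting scalar unchanged.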
The Zoo of Machine Learning Algorithms. ML approaches have proven their success in tackling fundamental problems in theoretical chemistry, especially the ability to scale down the size of the system that can be computationally investigated and to predict molecular properties. Molecular ML models are generally classified into four categories: supervised learning, semisupervised learning, unsupervised learning, and reinforcement learning. In this concise review, we specifically focus on (un)supervised learning, which has been widely used in vibrational spectra prediction (see Figure 2). Clearly, there are a lot of factors to consider as criteria when choosing ML algorithms, not only for vibrational spectra prediction but also for other kinds of molecular property prediction. Nevertheless, there are a number of software packages as well as useful libraries that can help those who are not familiar with fundamental statistical learning to use ML models for their problems.53−55

Figure 2. Graphical representations of (top) unsupervised machine learning methods (WPGMA, PCA, and autoencoder) and (bottom) supervised machine learning methods (PLS, NN, GPR, and RF).

In the next subsections, we discuss ML methods which have been used for predicting vibrational spectra.

■ UNSUPERVISED LEARNING
Weighted Pair Group Method with Arithmetic Mean (WPGMA). WPGMA is a hierarchical clustering method based on a pairwise similarity matrix.56 In the subdivision of the vibrational space of semiclassical dynamics, WPGMA works as an approximated (in the scope of two-mode interactions) representation of nuclear vibrational subspaces (nodes) and the links between them (edges), with the similarity defined as the interaction between a couple of modes.15 In the identification of experimental spectra, WPGMA is constructed based on cosine distances (inner products of two normalized representation vectors) between the experimental spectra and calculated harmonic spectra for a matching algorithm.24

Principal Component Analysis (PCA). PCA is a statistical method adopted for the dimensionality reduction of data: the data space is projected from n dimensions onto k dimensions, with these k orthogonal features taken as the first k principal components.57 Previous studies using PCA adopt it as a preprocessing technique for the input data of subsequent supervised learning (e.g., a neural network).2,4

Autoencoder (AE). AE is an unsupervised learning algorithm that constructs an artificial neural network (ANN) with a low-dimensional representation and high-dimensional input.58 Differently from PCA, the autoencoder is able to realize nonlinear compression with nonlinear activation functions. In a study of IR and mass spectroscopy recognition, an autoencoder with rectified linear unit (ReLU) activation was applied to remove redundant information and noise from these two types of spectra.25 It was found that the AE-based model gives better training scores for all types of functional groups in this study than the original (non-AE) model, which was constructed only with neural networks.

■ SUPERVISED LEARNING
Partial Least-Squares (PLS). PLS was first introduced to solve multicollinearity problems and, similar to PCA, to reduce the dimensionality of data.59 At the same time, it takes into account the correlation to the response/dependent variables, which is omitted in any unsupervised learning method. Early works on spectroscopy recognition made use of this technique in assigning vibrational bands and compared it to results from ANN and PCA-ANN.1,2 It was pointed out that PLS provides predictions comparably good to the ANN-based approach and offers better interpretation regarding the modeling aspect, though it does not intrinsically support nonlinear variations.2

Gaussian Process Regression (GPR). GPR is a Bayesian regression approach based on a kernel function that represents the covariance in a Gaussian process.29 GPR builds a nonparametric model and can provide confidence intervals
Table 1. Literature References That Rely on the Struc-to-Spec Approach

| no | learning target | ML method | representation | system | ref |
|----|-----------------|-----------|----------------|--------|-----|
| 1 | IR absorption intensities | PCA+CPG;46 CPG46 | Atomic properties, interatomic distances | Chemical space. Monosubstituted benzene derivatives (training 185 and test 110); benzene derivatives (training 487 and test 384) | 4,5 |
| 2 | IR absorption intensities | Query driven selection + RDF + CPG46 | Atomic properties, interatomic distances | Chemical space. Training: SpecInfo database; testing: 16 compounds; trietazine metabolites | 16,17 |
| 3 | IR/Raman intensities | LFFN-EPFs18 | IR/Raman intensities | N-(2-Methylphenyl) and N-(3-methylphenyl) methanesulfonamides (training 80% and test 20%); 6-chloronicotinic acid (training 80% and test 20%). Tested on the same molecules, but different regions of the spectra. | 18,19 |
| 4 | Atomic partial charge | HDNNP65 + ED-GEKF68 | ACSFs;27 geometric descriptors | Configurational space. Methanol, n-alkanes (C69H140), protonated alanine tripeptide | 9 |
| 5 | Atomic partial charge | HIP-NN64 | Atomic number, interatomic distances | Chemical space. Training: ANI-1x data set; test: Drugbank and Tripeptide data sets. Training: GDB-5 or ANI-1x data sets; testing: GDB-7to9 (built from refs 69,70), GDB-10to13 (built from ref 71), Drugbank,72 Tripeptides, ANI-MD, S66x873 data sets/benchmarks | 7,8 |
| 6 | Dipole derivative and transition frequency | GPR29 + MLP | ACSFs27 | Configurational space. Condensed phase water (OH stretching mode) | 11 |
| 7 | Polarizability tensor | SA-GPR + λ-SOAP kernel60 | Atomic density74 | Configurational space. Paracetamol molecule (training up to 1500 and test 500), paracetamol crystal (training up to 2000 and testing all 2500) | 12 |
| 8 | Polarizability tensor | SA-GPR + λ-SOAP kernel60 | Atomic density74 | Chemical space. QM7b75 data set (training 5400 and test 1811) | 13 |
| 9 | Vibrational Hamiltonian matrix | MLP | CM26 | Chemical space. Training: 9660 NMA configurations, 5128 GLDP configurations; test: 12 proteins (1000 configurations each) | 6 |
| 10 | Wannier center | DPMD66 + DW10 | Geometric descriptors | Configurational space. Condensed phase water/ice at different pressures | 10 |
| 11 | Vibrational frequency shift | MLP; CNN | ACSFs;27 simple polynomial functions | Configurational space. NMA/water solution (training 1250 and test 250) | 21 |
| 12 | Vibrational frequencies and intensities | RF61 | Atomic distances/angles/torsions, angle to gold surface | Configurational space. BPE on gold surface (training 3600 and test 400) | 23 |
| 13 | Permanent dipole moment and polarizability | EANN67 + T-EANN14 | Atomic density74 | Configurational space. Water molecule(s), condensed phase water (training 500 and test 500) | 14 |
| 14 | IR/Raman vibrational modes | PG-EA; WPGMA56 | Clustered vibrational subspaces15 | Configurational space. CH4, trans-NMA | 15 |
| 15 | Vibrational frequencies and intensities of OH and CO vibrational modes | MLP | Symmetry functions76 | Chemical space. QM877,78 data set and some molecules from the QM10 data set (training:test = 9:1) | 20 |
| 16 | Vibrational frequency shift of water vibrational modes | MLP | ACSFs,27 polynomial functions,79 GTO-based density vectors22 | Configurational space. Condensed phase water (training/validation 1250 and test 250) | 22 |


Table 2. Literature References That Rely on the Spec-to-Struc Approach^a

no | learning target | ML method | representation | system | refs
1 | Certain structural fragment/functional groups | PLS;59 PCA-MLP; MLP | Band pattern/Full spectrum | Chemical space. Organic, inorganic, polyaromatic compounds (specific bonding types); 7 structural fragments (training:testing ≈ 1:1) | 1,2
2 | Matter that contains 9 analyte compounds | Binary and decimal-based NN | Band pattern | Chemical space. 9 analyte compounds | 3
3 | Nuclear configurations | Basin-hopping algorithm80 | Cosine distances | Configurational space. Phenylalanine/serine dimer | 24
4 | Functional groups | Autoencoder58 + MLP; RF61 | Fourier transform IR peaks | Chemical space. 7393 compounds | 25
5 | Local structures of OH and CO | LSTM63 | DFT-calculated IR and Raman peaks | Chemical space. Training: QM877,78 data set; testing: 6000 molecules from QM977,78 and QM10 data sets | 20

^a BPE: trans-1,2-bis(4-pyridyl)ethylene; ED-GEKF: element-decoupled global extended Kalman filter; GLDP: glycine dipeptide; LFFN-EPFs: layered feedforward neural network-empirical physical formulas; NMA: N-methylacetamide; PG-EA: probability graph-evolutionary algorithm; RDFs: radial distribution functions.
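To make the Struc-to-Spec pattern summarized in Tables 1 and 2 concrete, the following minimal sketch builds the Coulomb matrix (CM) representation26 for a set of distorted water-like geometries and fits a small kernel ridge regression model (closely related to the mean prediction of the GPR schemes discussed in this section). The geometries and the scalar target are synthetic placeholders, and the helper names (coulomb_matrix, krr_fit, krr_predict) are illustrative only, not taken from any cited code.

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Flattened Coulomb-matrix representation (Rupp et al., ref 26).
    Z: nuclear charges (n,); R: Cartesian coordinates (n, 3)."""
    n = len(Z)
    M = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                M[i, j] = 0.5 * Z[i] ** 2.4  # diagonal: atomic self-interaction term
            else:
                M[i, j] = Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])
    # Upper triangle (incl. diagonal) gives a fixed-size feature vector.
    return M[np.triu_indices(n)]

def krr_fit(X, y, sigma=5.0, lam=1e-6):
    """Kernel ridge regression with a Gaussian kernel: alpha = (K + lam*I)^-1 y."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(Xtrain, alpha, Xnew, sigma=5.0):
    d2 = np.sum((Xnew[:, None, :] - Xtrain[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2)) @ alpha

# Toy data: 120 distorted water-like geometries; the target is a placeholder
# scalar standing in for, e.g., a vibrational frequency shift.
rng = np.random.default_rng(0)
Z = np.array([8.0, 1.0, 1.0])
ref = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
configs = ref + rng.normal(scale=0.02, size=(120, 3, 3))
X = np.array([coulomb_matrix(Z, R) for R in configs])
y = np.array([np.linalg.norm(R[1] - R[0]) for R in configs])  # toy target: one OH distance

alpha = krr_fit(X[:100], y[:100])
pred = krr_predict(X[:100], alpha, X[100:])
```

In an actual Struc-to-Spec application, y would hold reference frequencies, intensities, dipole moments, or polarizabilities from quantum chemistry calculations rather than a toy geometric quantity.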

along with the prediction (mean value). Two studies12,13 have used an alternative GPR scheme, symmetry-adapted GPR (SA-GPR) with λ-SOAP kernels,60 for learning the polarizability tensor of molecules/molecular crystals.

Random Forest (RF). RF is ensemble learning with a multitude of decision trees that can be used for both classification and regression tasks.61,62 It has recently been applied for the regression on the Raman frequency and intensity of a metal surface system23 and the classification of molecular functional groups from an encoded latent vector.25

Artificial Neural Network (ANN). ANN is an algorithm referring to a learning model with a collection of connected nodes called artificial neurons. The most often used ANN structure is a feed-forward network, including, e.g., multilayer perceptron (MLP) and convolutional neural network (CNN). References 6, 11, and 18−22 (Struc-to-Spec) and refs 1−3 and 25 (Spec-to-Struc) directly use MLP or CNN for the learning procedure. References 4 and 5 apply a counter-propagation (CPG) neural network46 that uses counter-propagation for the training rather than commonly used back-propagation (e.g., in MLP and CNN). Long short-term memory (LSTM) is one of the recurrent neural network structures that can deal with time series problems with long-term dependencies.63 It is utilized for OH and CO functional group recognition in data sets of molecules from computed IR and Raman spectra.20 Several studies utilize or modify neural networks designed for machine learning potentials/force fields, namely, hierarchically interacting particle neural network (HIP-NN),64 high-dimensional neural network potential (HDNNP),65 deep potential molecular dynamics/deep Wannier (DPMD/DW),10,66 and embedded atom neural network/tensorial embedded atom neural network (EANN/T-EANN),14,67 so that they can be applied for learning of partial charges, electric dipole moments, or polarizabilities.7−9 Note that EANN/T-EANN can also be used to predict electronic spectroscopy.

More details of the literature references regarding ML methods, representations (descriptors), and corresponding studied systems including information about training and test data sets for vibrational spectroscopy are provided in Tables 1 and 2.

■ DISCUSSION

Building an ML framework to identify a diverse set of molecular conformations (configurational space) and an ensemble of molecules (chemical space) requires extensive training input data with respect to a number of molecular features that scale rapidly with the number of selected atomic or molecular properties. It also requires vast amounts of reference data points for the selection and training of those ML features. These issues have hampered the degree of transferability of many ML models for predicting the targets. When one develops an ML model for vibrational spectra, the model is modified and fine-tuned until it can excel at the prediction of spectra with the error as low as possible. However, the ability of the ML model shall not be validated by considering only the accuracy of the model based on the test sets but rather by the transferability across other molecules out of configurational space.

One of the concerning issues of ML methods that is still debatable nowadays is the transferability (generalization capabilities) of learning when one applies models to new systems with the same number of atoms. Another issue is transferability to systems with different sizes (i.e., different numbers of atoms) compared to that in the training set. For the kernel-based method, a model relies on the optimization of a linear combination of similarity basis (kernel) functions that enforces the agreement between objects centered on each point in the training set. The transferability of this method generally depends on the extraction of various discriminative features related to the similarity among selected samples in the kernel.26 Quite interestingly, there is, nevertheless, evidence of a significant increase in the transferability of kernel methods when high-resolution representations are employed to fit the kernel.81 On the contrary, by extracting features from multiple layers of a neural network approach, the neural network-based method has proven to be independent of such task-oriented features and hence more flexible for addressing various tasks related to automation, therefore offering better transfer learning to unseen systems.82 More specifically, since the input layer of a feed-forward neural network has a fixed size, the input vectors cannot change dimensions. Once a feed-forward neural network has been trained on input data represented by vectors of representations of length N, it cannot be used to predict the properties of a molecule represented by a vector of length M. Many attempts including divide-and-conquer techniques have been made to overcome these issues. In ANN, the idea is to decompose the target at the global (molecular) level into the contribution of the characteristics of individual atoms. For the Struc-to-Spec, one can decompose the structure into fragments or many chunks of atoms and learn on them individually. Likewise, in the Spec-to-Struc framework, vibrational frequencies and peaks can be disintegrated and generalized, and one can
Figure 3. Illustration of the best practice for the machine learning pipeline workflow in data science in six steps.

find the direct correspondence between a peak and its relevant fragment.

The intensities of vibrational peaks depend among other things on the local environment around a particular atom (e.g., functional group, aromatic ring). For the case of using ANN as a method, it performs excellently in the provision of high transferability of the model because the ANN is composed of many feed-forward networks that contain a cluster of learning units, generally one for each type of chemical element present in the system. Each one of them can take as input a fixed-size vector describing the local environment around a certain fragment. All corresponding vibrational peaks are then summed together to give the overall spectra of the molecule. The cost function also remains identical to that in the feed-forward MLP.

The learning targets adopted in previous studies vary from system properties (e.g., partial charges, electric dipole moments, polarizabilities) to exact spectra (e.g., intensities with associated frequencies). The quality of the properties determines the accuracy of the predicted spectra; that is, how well these properties can be represented by the information on the system is a crucial part of ML. We notice that many studies applied similar descriptors as in MLFFs and achieved reasonable predictions. It would be interesting to see differences between the performance of these descriptors on force/energy and spectroscopic property predictions. As illustrated in Figure 1, both nuclear and electronic structures can provide meaningful information. However, only a few studies use the electronic structures as descriptors and are limited to atomic density (SOAP kernel). Considering the emerging ML studies on electronic structures,83−88 there might be a possibility to use density/wave function-based representations or predicted properties for the estimation of spectroscopic properties at a higher level of calculations for complex molecules. On the topic of learning targets, up to now, ML studies have only been focused on IR/Raman or related system properties. Plenty of other vibrational spectroscopy techniques in which potentially more challenging properties, e.g., higher electric multipole moments, higher-order polarizabilities, magnetic moments, etc., are involved can also become ML targets.

Some works learn directly from vibrational frequencies and intensities of specific modes in the spectra. Together with Spec-to-Struc, several attempts have been made to find a mapping between chemical structures/substructures/functional groups and vibrational peaks. Although this relationship is more chemically intuitive than spectroscopic properties, the idea behind it is empirical (e.g., characteristic frequency-based spectrum recognition), making quantification a challenging process. Considering the structure−spectrum bidirectional relationship, it will in principle also be possible to make use of well-developed ML strategies of language translation (e.g., seq2seq,89 attention,90 transformers91). Studies92−94 applying generative adversarial networks to extract descriptors for the Raman spectrum predictions are also known, and the same NN structure can potentially be used in the aspect of molecular representations. Besides, ΔML (an algorithm that learns from the corrections from low-level theory to high-level theory), which has been applied in other fields of computational chemistry (e.g., force field development), can also be utilized for achieving high accuracy.95−97

The training data, depending on the systems of interest, can be either manually selected/constructed or obtained from available data sets. However, to evaluate the performance of different models, data sets specially designed in the aspect of vibrational spectroscopy, similar to, e.g., the QM9 data set77,78 in the field of MLFFs, are needed. As with other topics using ML, the transferability and interpretability of the model are major issues of these data-driven techniques. Transferability in chemical/configurational space determines the scope of the application of models, and interpretability of models usually provides an in-depth view of the learning process. These should be extensively discussed in ML manuscripts. Except for accuracy and precision, other evaluation metrics like sensitivity and specificity should also be reported if classification techniques are applied.

In addition to the general points of ML and vibrational spectra discussed above, several techniques that are normally used in, e.g., data science competitions (e.g., Kaggle98), have moved into scientific research. Figure 3 shows the six steps of a typical best practice of an ML workflow that provides a powerful way to preprocess the data and make a model. Previous reviews99,100 have suggested adopting many of those techniques such as exploratory data analysis (EDA)101 and feature engineering in step 3 in order to inform the modeling process. Other techniques such as balanced cross-validation (CV), metrics and hyperparameter optimizations, regularization, and ensembling in step 4 of the workflow are important for avoiding the overfitting problem.102 Subsequently, it is necessary to validate an ML model with a testing set and test it using independent data sets in step 5. The prediction results from the ML model in step 6 can be validated with either quantum chemistry references or experimental results.

Many advances in statistical learning have been brought to the ML field, witnessing the rebirth of ANN in the past few years.103 That is, the backpropagation algorithm has not fundamentally changed since it was invented, but rather we have a million times more central and graphics processing unit (CPU and GPU, respectively) power. We also have algorithmic advances, like transfer learning with a pretrained model or Hessian-free optimization for ANN, but actually the most significant advancement is due to Moore's law. Furthermore, many ML software packages and libraries bring convenience to the community; in particular, one can build and train the ML models, predict the targets, optimize the workflow, and analyze the results with less effort compared to a decade ago. With the ML methods and techniques developed so far, a few attempts have also been made in the community to make ML, especially
deep learning, more user-friendly and efficient. Widely used libraries such as Scikit-learn104 for feature engineering and kernel methods and TensorFlow54 and PyTorch105 for easily trainable deep learning have been developed to create user-centered experiences, helping alleviate the struggle faced by new users.

■ CONCLUDING REMARKS AND OUTLOOK

We have reviewed a number of previous works indicating that ML methods have become popular for the development of computer techniques for achieving accurate vibrational spectra compared with reference spectra obtained by quantum chemistry methods. In this review, we have classified the methodological paradigm into Struc-to-Spec and Spec-to-Struc. The former is a direct learning of the structural properties of a molecule to predict vibrational spectra of interest, and vice versa for the latter. Only IR and nonresonant Raman spectroscopy techniques have been targeted up to the present time by ML; there are therefore various ways in terms of development and application of ML for other types of techniques, e.g., time-resolved vibrational spectroscopy, spectroscopy for chiral compounds, or resonance enhanced vibrational spectroscopy, which are in general more challenging than the (non-time-resolved) IR/Raman spectroscopy investigated with ML up to now. Previous works reviewed in this article also imply that ML can serve as a game-changer for computational spectroscopy to address accurate vibrational spectra in comparison with experiments. That is, ML is in comparison to conventional methods relatively accurate, low-cost, and user-friendly, and enables extremely fast predictions of desired spectra. We also suggest using ML methods that are well-developed for language translation and the ΔML technique in the future prediction of vibrational spectroscopy.

Sophisticated molecular representations (descriptors) serve various purposes in terms of inputs and outcomes. Whole-system representations are better suited to global properties (e.g., for predicting the vibrational spectrum of the whole molecule), but they must be modified to represent local environments. Global representations such as the CM and BoB are covered in this review. On the other hand, promising local representations including the interatomic distance, ACSFs, and GTO-based density function are used instead to describe the local environment around each atom in the system. These are possibly the most commonly used representations in recent works. Besides, density/wave function-based representations for electronic structures as discussed are good candidates for describing chemical systems.

We also discussed the transferability of ML, which currently causes serious limitations in the practical prediction of real-world systems and vibrational spectra. When one wants to increase learning speed and reduce the amount of data (training set) required, a transferable ML model is a must for long-term sustainability for the prediction of molecular properties in terms of vibrational spectra. This technique, so-called transfer learning106 (an approach providing a path toward fitting general-purpose ML), can unlock these two major benefits, enabling algorithms to learn a new molecular structure task by using pretrained models and predict vibrational frequencies, and vice versa. Besides, we suggest using specially designed data sets in the field of ML for vibrational spectroscopy to ensure the objective evaluation of the model transferabilities. We additionally discussed the best practice for improving and putting ML models into production for real-world problems.

Ultimately, computational chemists were enormously excited about ML as far back as the 1990s (then known as pattern recognition), and many of the implications were clear even then. With the ability of ML methods developed to date, one can foresee that ML has a bright future and is opening the door for chemists to continue to advance research in the field of vibrational spectroscopy.

■ AUTHOR INFORMATION

Corresponding Author
Sandra Luber − Department of Chemistry, University of Zurich, CH-8057 Zürich, Switzerland; orcid.org/0000-0002-6203-9379; Email: [email protected]

Authors
Ruocheng Han − Department of Chemistry, University of Zurich, CH-8057 Zürich, Switzerland
Rangsiman Ketkaew − Department of Chemistry, University of Zurich, CH-8057 Zürich, Switzerland

Complete contact information is available at: https://pubs.acs.org/10.1021/acs.jpca.1c10417

Author Contributions
‡R.H. and R.K. contributed equally to this work.

Notes
The authors declare no competing financial interest.

Biographies

Ruocheng Han completed his Bachelor's degree at University of Science and Technology of China and The University of Tokyo, Japan, and obtained his Master's degree in chemistry from ETH Zurich, Switzerland. In 2018, he joined the group of Prof. Sandra Luber at University of Zurich to pursue his PhD in theoretical and computational chemistry. The topics of his research are across methodology development in quantum chemistry, machine learning, and advancing electronic structure.


workempirical physical formulas; NMA, N-methylacetamide;


PG-EA, Probability graph-evolutionary algorithm; RDFs, Radial
distribution functions

■ REFERENCES
(1) Visser, T.; Luinge, H.; van der Maas, J. Recognition of visual
characteristics of infrared spectra by artificial neural networks and
partial least squares regression. Anal. Chim. Acta 1994, 296, 141−154.
(2) Luinge, H.; van der Maas, J.; Visser, T. Partial least squares
regression as a multivariate tool for the interpretation of infrared
spectra. Chemom. Intell. Lab. Syst. 1995, 28, 129−138.
(3) Carrieri, A. H.; Lim, P. I. Neural network pattern recognition of
thermal-signature spectra for chemical defense. Appl. Opt. 1995, 34,
2623.
(4) Schuur, J. H.; Selzer, P.; Gasteiger, J. The Coding of the Three-
Rangsiman Ketkaew completed his Bachelor’s and Master’s degrees in
Dimensional Structure of Molecules by Molecular Transforms and Its
chemistry at Thammasat University, Thailand. In 2019, he joined New
Application to Structure-Spectra Correlations and Studies of Biological
Equilibrium Biosciences, Boston, MA as an AWS cloud and quantum Activity. J. Chem. Inf. Comput. Sci. 1996, 36, 334−344.
chemistry consultant. In 2020, he started to pursue a PhD degree in (5) Schuur, J.; Gasteiger, J. Infrared Spectra Simulation of Substituted
theoretical chemistry under the supervision of Prof. Sandra Luber at the Benzene Derivatives on the Basis of a 3D Structure Representation.
University of Zurich, Switzerland. His research focuses on ab initio Anal. Chem. 1997, 69, 2398−2405.
molecular dynamics, enhanced sampling, machine learning, and open- (6) Ye, S.; Zhong, K.; Zhang, J.; Hu, W.; Hirst, J. D.; Zhang, G.;
source software development in the field of catalysis. Mukamel, S.; Jiang, J. A Machine Learning Protocol for Predicting
Protein Infrared Spectra. J. Am. Chem. Soc. 2020, 142, 19071−19077.
(7) Sifain, A. E.; Lubbers, N.; Nebgen, B. T.; Smith, J. S.; Lokhov, A.
Y.; Isayev, O.; Roitberg, A. E.; Barros, K.; Tretiak, S. Discovering a
Transferable Charge Assignment Model Using Machine Learning. J.
Phys. Chem. Lett. 2018, 9, 4495−4501.
(8) Nebgen, B.; Lubbers, N.; Smith, J. S.; Sifain, A. E.; Lokhov, A.;
Isayev, O.; Roitberg, A. E.; Barros, K.; Tretiak, S. Transferable Dynamic
Molecular Charge Assignment Using Deep Neural Networks. J. Chem.
Theory Comput. 2018, 14, 4687−4698.
(9) Gastegger, M.; Behler, J.; Marquetand, P. Machine learning
molecular dynamics for the simulation of infrared spectra. Chem. Sci.
2017, 8, 6924−6935.
(10) Zhang, L.; Chen, M.; Wu, X.; Wang, H.; E, W.; Car, R. Deep
neural network for the dielectric response of insulators. Phys. Rev. B
2020, 102, 102.
(11) Kananenka, A. A.; Yao, K.; Corcelli, S. A.; Skinner, J. L. Machine
Sandra Luber studied chemistry at the University of Erlangen- Learning for Vibrational Spectroscopic Maps. J. Chem. Theory Comput.
Nuremberg, Germany and ETH Zurich, where she received the 2019, 15, 6850−6858.
Master’s degree in 2007. She completed her PhD in (relativistic) (12) Raimbault, N.; Grisafi, A.; Ceriotti, M.; Rossi, M. Using Gaussian
quantum chemistry and theoretical spectroscopy in 2009. After a process regression to simulate the vibrational Raman spectra of
postdoctoral stay in the field of bioinformatics at Biozentrum of the molecular crystals. New J. Phys. 2019, 21, 105001.
(13) Wilkins, D. M.; Grisafi, A.; Yang, Y.; Lao, K. U.; DiStasio, R. A.;
University of Basel, she joined Yale University, USA. After a stay in
Ceriotti, M. Accurate molecular polarizabilities with coupled cluster
industry, she became project group leader at University of Zurich. The theory and machine learning. Proc. Natl. Acad. Sci. U. S. A. 2019, 116,
habilitation thesis was finished in 2016, and she has been professor at 3401−3406.
University of Zurich since 2017. Her research group focuses on the (14) Zhang, Y.; Ye, S.; Zhang, J.; Hu, C.; Jiang, J.; Jiang, B. Efficient
development and application of methods derived from quantum and Accurate Simulations of Vibrational and Electronic Spectra with
mechanics with emphasis on novel methods for spectroscopy, catalysis, Symmetry-Preserving Neural Network Models for Tensorial Proper-
and design of functional compounds. In recent years, focus has been, ties. J. Phys. Chem. B 2020, 124, 7284−7290.
among others, on dynamic first-principles methods for accurate and (15) Gandolfi, M.; Rognoni, A.; Aieta, C.; Conte, R.; Ceotto, M.
efficient simulation of gas and condensed phase systems. Machine learning for vibrational spectroscopy via divide-and-conquer


semiclassical initial value representation molecular dynamics with
application to N-methylacetamide. J. Chem. Phys. 2020, 153, 204104.
ACKNOWLEDGMENTS (16) Selzer, P.; Gasteiger, J.; Thomas, H.; Salzer, R. Rapid Access to
This work is supported by the University of Zurich. We Infrared Reference Spectra of Arbitrary Organic Compounds: Scope
acknowledge funding by The National Centre of Competence in and Limitations of an Approach to the Simulation of Infrared Spectra by
Research (NCCR) “Sustainable chemical processes through Neural Networks. Chem.Eur. J. 2000, 6, 920−927.
catalysis (Catalysis)” of the Swiss National Science Foundation. (17) Kostka, T.; Selzer, P.; Gasteiger, J. A Combined Application of


Reaction Prediction and Infrared Spectra Simulation for the
ABBREVIATIONS Identification of Degradation Products ofs-Triazine Herbicides.
Chem.Eur. J. 2001, 7, 2254−2260.
BPE, Trans-1,2-bis (4-pyridyl) ethy; ED-GEKF, Element- (18) Yildiz, N.; Karabacak, M.; Kurt, M. Neural network consistent
decoupled global extended Kalman filter; GLDP, Glycine empirical physical formula construction for DFT based nonlinear
dipeptide; LFFN-EPFs, Layered feedforward neural net- vibrational spectra intensities of N-(2-methylphenyl) and N-(3-

810 https://doi.org/10.1021/acs.jpca.1c10417
J. Phys. Chem. A 2022, 126, 801−812
The Journal of Physical Chemistry A pubs.acs.org/JPCA Perspective

methylphenyl) methanesulfonamides. J. Mol. Struct. 2011, 1006, 642− (41) Runge, E.; Gross, E. K. U. Density-Functional Theory for Time-
649. Dependent Systems. Phys. Rev. Lett. 1984, 52, 997−1000.
(19) Yildiz, N.; Karabacak, M.; Kurt, M.; Akkoyun, S. Neural network (42) Yabana, K.; Bertsch, G. F. Time-dependent local-density
consistent empirical physical formula construction for density func- approximation in real time. Phys. Rev. B 1996, 54, 4484−4487.
tional theory based nonlinear vibrational absorbance and intensity of 6- (43) Mattiat, J.; Luber, S. Efficient calculation of (resonance) Raman
choloronicotinic acid molecule. Spectrochim. Acta, Part A 2012, 90, 55− spectra and excitation profiles with real-time propagation. J. Chem. Phys.
62. 2018, 149, 174108.
(20) Ren, H.; Li, H.; Zhang, Q.; Liang, L.; Guo, W.; Huang, F.; Luo, (44) Mattiat, J.; Luber, S. Vibrational (resonance) Raman optical
Y.; Jiang, J. A machine learning vibrational spectroscopy protocol for activity with real time time dependent density functional theory. J.
spectrum prediction and spectrum-based structure recognition. Chem. Phys. 2019, 151, 234110.
Fundam. Res. 2021, 1, 488−494. (45) Hamm, P.; Zanni, M. Concepts and Methods of 2D Infrared
(21) Kwac, K.; Cho, M. Machine learning approach for describing Spectroscopy; Cambridge University Press, 2009.
vibrational solvatochromism. J. Chem. Phys. 2020, 152, 174101. (46) Hecht-Nielsen, R. Counterpropagation networks. Appl. Opt.
(22) Kwac, K.; Freedman, H.; Cho, M. Machine Learning Approach 1987, 26, 4979−4984.
for Describing Water OH Stretch Vibrations. J. Chem. Theory Comput. (47) Musil, F.; Grisafi, A.; Bartók, A. P.; Ortner, C.; Csányi, G.;
2021, 17, 6353. Ceriotti, M. Physics-Inspired Structural Representations for Molecules
(23) Hu, W.; Ye, S.; Zhang, Y.; Li, T.; Zhang, G.; Luo, Y.; Mukamel, S.; and Materials. Chem. Rev. 2021, 121, 9759−9815.
Jiang, J. Machine Learning Protocol for Surface-Enhanced Raman (48) Keith, J. A.; Vassilev-Galindo, V.; Cheng, B.; Chmiela, S.;
Spectroscopy. J. Phys. Chem. Lett. 2019, 10, 6026−6031. Gastegger, M.; Müller, K.-R.; Tkatchenko, A. Combining Machine
(24) Fu, W.; Hopkins, W. S. Applying Machine Learning to Learning and Computational Chemistry for Predictive Insights Into
Vibrational Spectroscopy. J. Phys. Chem. A 2018, 122, 167−171. Chemical Systems. Chem. Rev. 2021, 121, 9816−9872.
(25) Fine, J. A.; Rajasekar, A. A.; Jethava, K. P.; Chopra, G. Spectral (49) Braams, B. J.; Bowman, J. M. Permutationally invariant potential
deep learning for prediction and prospective validation of functional energy surfaces in high dimensionality. Int. Rev. Phys. Chem. 2009, 28,
groups. Chem. Sci. 2020, 11, 4618−4630. 577−606.
(26) Rupp, M.; Tkatchenko, A.; Müller, K.-R.; von Lilienfeld, O. A. (50) Hansen, K.; Montavon, G.; Biegler, F.; Fazli, S.; Rupp, M.;
Fast and Accurate Modeling of Molecular Atomization Energies with Scheffler, M.; von Lilienfeld, O. A.; Tkatchenko, A.; Müller, K.-R.
Machine Learning. Phys. Rev. Lett. 2012, 108, 1 DOI: 10.1103/ Assessment and Validation of Machine Learning Methods for Predicting
PhysRevLett.108.058301. Molecular Atomization Energies 2013, 9, 3404−3419.
(27) Behler, J. Atom-centered symmetry functions for constructing (51) Gallet, G. A.; Pietrucci, F. Structural cluster analysis of chemical
reactions in solution. J. Chem. Phys. 2013, 139, 074101.
high-dimensional neural network potentials. J. Chem. Phys. 2011, 134,
(52) Faber, F. A.; Christensen, A. S.; Huang, B.; von Lilienfeld, O. A.
074106.
Alchemical and structural distribution based representation for
(28) De, S.; Bartók, A. P.; Csányi, G.; Ceriotti, M. Comparing
universal quantum machine learning. J. Chem. Phys. 2018, 148, 241717.
molecules and solids across structural and alchemical space. Phys. Chem.
(53) Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion,
Chem. Phys. 2016, 18, 13754−13769.
B.; Grisel, O.; Blondel, M.; Müller, A.; Nothman, J.; Louppe, G. et al.
(29) Williams, C. K. I.; Rasmussen, C. E. Gaussian Processes for
Scikit-learn: Machine Learning in Python; 2018.
Regression. Advances in Neural Information Processing Systems 1996, 8,
(54) Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro,
514−520. C.; Corrado, G. S.; Davis, A.; Dean, J.; Devin, M. et al. TensorFlow:
(30) Kiewisch, K.; Luber, S.; Neugebauer, J.; Reiher, M. Intensity
Large-Scale Machine Learning on Heterogeneous Systems; 2015; https://
Tracking for Vibrational Spectra of Large Molecules. CHIMIA www.tensorflow.org/, Software available from tensorflow.org.
International Journal for Chemistry 2009, 63, 270−274. (55) Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan,
(31) Luber, S.; Reiher, M. Intensity-Carrying Modes in Raman and G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L. et al. PyTorch: An
Raman Optical Activity Spectroscopy. ChemPhysChem 2009, 10, Imperative Style. High-Performance Deep Learning Library, 2019.
2049−2057. (56) Sokal, R. R.; Michener, C. D. A statistical method for evaluating
(32) Luber, S.; Neugebauer, J.; Reiher, M. Intensity tracking for systematic relationships. University of Kansas Science Bulletin 1958, 38,
theoretical infrared spectroscopy of large molecules. J. Chem. Phys. 1409−1438.
2009, 130, 064105. (57) Pearson, K. LIII. On lines and planes of closest fit to systems of
(33) Luber, S. Dynamic ab initio Methods for Vibrational Spectros- points in space. London Edinb. Dubl. Philos. Mag. 1901, 2, 559−572.
copy. CHIMIA International Journal for Chemistry 2018, 72, 328−332. (58) Ballard, D. H. Modular learning in neural networks. AAAI 1987,
(34) Thomas, M.; Brehm, M.; Fligg, R.; Vöhringer, P.; Kirchner, B. 279−284.
Computing vibrational spectra from ab initio molecular dynamics. Phys. (59) Wold, S.; Ruhe, A.; Wold, H.; Dunn, W. J., III The Collinearity
Chem. Chem. Phys. 2013, 15, 6608. Problem in Linear Regression. The Partial Least Squares (PLS)
(35) Jensen, L.; Zhao, L. L.; Autschbach, J.; Schatz, G. C. Theory and Approach to Generalized Inverses. SIAM J. Sci. Comput. 1984, 5, 735−
method for calculating resonance Raman scattering from resonance 743.
polarizability derivatives. J. Chem. Phys. 2005, 123, 174110. (60) Grisafi, A.; Wilkins, D. M.; Csányi, G.; Ceriotti, M. Symmetry-
(36) Mattiat, J.; Luber, S. Time Domain Simulation of (Resonance) Adapted Machine Learning for Tensorial Properties of Atomistic
Raman Spectra of Liquids in the Short Time Approximation. J. Chem. Systems. Phys. Rev. Lett. 2018, 120, 1 DOI: 10.1103/PhysRev-
Theory Comput. 2021, 17, 344−356. Lett.120.036002.
(37) Gonze, X. Perturbation expansion of variational principles at (61) Breiman, L. Random Forests. Mach. Learn 2001, 45, 5−32.
arbitrary order. Phys. Rev. A 1995, 52, 1086−1095. (62) Quinlan, J. R. Induction of decision trees. Mach. Learn. 1986, 1,
(38) Gonze, X. Adiabatic density-functional perturbation theory. Phys. 81−106.
Rev. A 1995, 52, 1096−1114. (63) Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory.
(39) Luber, S.; Iannuzzi, M.; Hutter, J. Raman spectra from ab initio Neural Comput 1997, 9, 1735−1780.
molecular dynamics and its application to liquid S-methyloxirane. J. (64) Lubbers, N.; Smith, J. S.; Barros, K. Hierarchical modeling of
Chem. Phys. 2014, 141, 094503. molecular energies using a deep neural network. J. Chem. Phys. 2018,
(40) Luber, S. Raman Optical Activity Spectra from Density 148, 241715.
Functional Perturbation Theory and Density-Functional-Theory- (65) Behler, J.; Parrinello, M. Generalized Neural-Network
Based Molecular Dynamics. J. Chem. Theory Comput. 2017, 13, Representation of High-Dimensional Potential-Energy Surfaces. Phys.
1254−1262. Rev. Lett. 2007, 98, 1 DOI: 10.1103/PhysRevLett.98.146401.

811 https://doi.org/10.1021/acs.jpca.1c10417
J. Phys. Chem. A 2022, 126, 801−812
The Journal of Physical Chemistry A pubs.acs.org/JPCA Perspective
