Patra Et Al. - 2020 - Accelerating Copolymer Inverse Design Using Monte

The paper discusses the application of Monte Carlo tree search (MCTS) to accelerate the inverse design of copolymers, addressing challenges in exploring vast material search spaces efficiently. By integrating MCTS with molecular dynamics simulations, the authors demonstrate the ability to identify optimal copolymer sequences that minimize interfacial energy between immiscible polymers with significantly fewer evaluations. This approach shows promise for broader applications in materials design, particularly in cases where sequence-property data is limited or resource-intensive to obtain.

Nanoscale

PAPER
Published on 14 November 2020. Downloaded by University of Cambridge on 11/20/2020 11:14:47 PM.

Accelerating copolymer inverse design using Monte Carlo tree search

Cite this: DOI: 10.1039/d0nr06091g

Tarak K. Patra,*a Troy D. Loeffler b,c and Subramanian K. R. S. Sankaranarayanan*b,c

There exists a broad class of sequencing problems in soft materials such as proteins and polymers that
can be formulated as a heuristic search that involves decision making akin to a computer game. AI
gaming algorithms such as Monte Carlo tree search (MCTS) gained prominence after their exemplary per-
formance in the computer Go game and are decision trees aimed at identifying the path (moves) that
should be taken by the policy to reach the final winning or optimal solution. Major challenges in inverse
sequencing problems are that the materials search space is extremely vast and property evaluation for
each sequence is computationally demanding. Reaching an optimal solution by minimizing the total
number of evaluations in a given design cycle is therefore highly desirable. We demonstrate that one can
adopt this approach for solving the sequencing problem by developing and growing a decision tree,
where each node in the tree is a candidate sequence whose fitness is directly evaluated by molecular
simulations. We interface MCTS with MD simulations and use a representative example of designing a
copolymer compatibilizer, where the goal is to identify sequence specific copolymers that lead to zero
interfacial energy between two immiscible homopolymers. We apply the MCTS algorithm to polymer
chain lengths varying from 10-mer to 30-mer, wherein the overall search space varies from $2^{10}$ (1024)
to $2^{30}$ (∼1 billion). In each case, we identify a target sequence that leads to zero interfacial energy within a
few hundred evaluations, demonstrating the scalability and efficiency of MCTS in exploring practical
materials design problems with exceedingly vast chemical/material search space. Our MCTS-MD framework
can be easily extended to several other polymer and protein inverse design problems, in particular
for cases where sequence-property data is either unavailable or resource intensive to obtain.

Received 21st August 2020, Accepted 13th November 2020
DOI: 10.1039/d0nr06091g
rsc.li/nanoscale

Introduction

There exists a broad class of soft materials, such as proteins and polymers, where the arrangement of moieties, i.e. the sequence, plays a critical role in determining their functionality. For instance, the activities and functionalities of DNA and other biomolecules are determined by the exact sequence of amino acids and other chemical moieties in their backbones.1–3 As an example, the arrangement of the amino acid sequence in viruses plays a key role in determining their mutations and hence the effectiveness of the drugs or vaccines used to treat them. Likewise, several recent studies indicate that the sequence specificity of the constituent chemical moieties of a copolymer can lead to more efficient materials: their thermodynamic properties such as miscibility and surface tension, as well as their structure/morphology, are strongly influenced by the sequence in oligomers. It is therefore not surprising that a lot of effort has focused on controlling the sequence specificity in polymers, proteins and other biomolecules.

On the experimental front, progress in synthetic chemistry has enabled an unprecedented control over sequences in copolymers; such precision polymers remain an area of major focus in current fundamental and applied polymer research.4–6 Copolymers are a special class of polymers that comprise more than one type of chemical species, and they show rich phase behaviour7–9 and tunability in their thermophysical properties.10–15 These copolymers are usually characterized by their mean block length and mass fraction. One area where sequence specificity is found to play an important role is the use of copolymers as interfacial compatibilizers.16,17 Copolymer compatibilizers are commonly employed to improve the thermodynamic stability of polymer interfaces, and they therefore have wide applicability in emulsions and composite materials.11,18

a Department of Chemical Engineering, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India. E-mail: [email protected]
b Center for Nanoscale Materials, Argonne National Laboratory, Lemont, Illinois 60439, USA. E-mail: [email protected]
c Department of Mechanical and Industrial Engineering, University of Illinois at Chicago, Chicago, Illinois 60607, USA. E-mail: [email protected]

This journal is © The Royal Society of Chemistry 2020 Nanoscale


A major challenge in the design of sequence specific polymers lies in the vast combinatorial space that precludes efficient exploration. For instance, a polymer chain with n monomers and m types of monomers has $m^n$ possible combinations to explore. Even for a binary polymer, i.e. 2 types, with a chain length of ∼30 units, the total number of possible combinations (accounting for double counting) is $2^{29}$, which is close to 0.5 billion. Given such an enormous sequence space that needs to be explored, it is highly desirable to minimize the number of trials needed to arrive at a sequence that corresponds to a desired target property. Fortunately, the emergence of artificial intelligence (AI) positions us uniquely to solve this seemingly intractable inverse design problem.

AI and machine learning (ML) have been increasingly interfaced with molecular simulations to solve the inverse problem and accelerate materials discovery/design. Molecular simulations such as molecular dynamics (MD) are powerful techniques to evaluate sequence-property relationships. Typically, MD simulations can sample the configurational and property space and create adequately large structure–property training datasets of a material. On the other hand, ML methods can very efficiently screen this extensive dataset and identify sequences or configurations that correspond to desired optimal material properties. Such inverse problems have been traditionally addressed using evolutionary methods such as genetic algorithms (GA), or Bayesian optimization (BO); these optimizers are combined with MD simulations to identify target properties for a wide range of materials, from inorganic to semiconductor to polymers.17,19–23 In the last few years, such combinations of MD and ML have been successfully deployed to explore the vast configurational space of materials. Their widespread application to problems of practical interest, however, requires addressing three bottlenecks. First, the GA and BO methods exhibit poor scalability as the design space increases.24 Typically, the search space of most practical sequencing problems exceeds several millions. Second, they have difficulty surmounting suboptimal solutions and tend to slow down near the optimal points. Third, each property evaluation in many design problems is computationally intensive (trajectories over several tens of nanoseconds and more), which precludes high-throughput exploration.

Within a typical MD-ML materials design framework, one often requires several thousands to millions of direct evaluations or computations of materials properties. In the context of soft materials, this poses a major limitation when the MD calculation for each sequence is computationally very expensive, for instance when computing the thermophysical properties of polymeric materials. We note that relaxation in polymeric materials is inherently slow and requires significantly long MD simulations to calculate equilibrium properties. A key challenge in accelerating computer-aided molecular-scale polymer design and addressing the sequence problems in materials design is to significantly reduce the number of direct computational evaluations of materials properties that are required to identify an optimal candidate corresponding to the target property.

The advent of big data analytics and powerful supercomputers has brought AI and ML techniques that can address the above challenges in materials design. On this front, Monte Carlo tree search (MCTS) has emerged as a powerful global optimization method that has found widespread application in computer Go (AlphaGo), in games such as Bridge and Poker, and in many video games.25,26 MCTS is a probabilistic and heuristic search algorithm that integrates a tree search algorithm with the machine learning principles of reinforcement learning. MCTS is a decision tree-based approach that builds a shallow tree of nodes, where each node represents a point in the search space and downstream pathways are generated by a rollout procedure. The algorithm simultaneously explores potentially better pathways to reach the optimal point in a search space and exploits the single pathway that has the greatest estimated value of the search function. This combination of exploration and exploitation, with an appropriate trade-off mechanism between them, is found to be the most efficient strategy for identifying the optimal point of a given function. An advantage of MCTS is that if the search gets trapped in a metastable or suboptimal point, it can quickly find another pathway by growing other branches of the tree, utilizing the trade-off mechanism between exploration and exploitation. Recently, MCTS has been successfully adopted for materials science problems such as predicting silicon–germanium alloy structures with optimal thermal conductance,24,27 discovering new synthetic routes for making organic molecules,28 optimal atom segregation at grain boundaries,29 predicting organic molecules with optimal partition coefficient and other properties,30 and enhancing biomolecular sampling.31

Here, we draw inspiration from the recent success of AI algorithms such as MCTS in computer games and aim to develop a design algorithm for sequence problems that is fast (time to solution) as well as highly scalable. We focus on a representative albeit complex polymer inverse design problem, viz., the design of the sequence of copolymer molecules that corresponds to a user-desired property. Our goal is to design the sequence of the compatibilizer that minimizes the interfacial tension between immiscible polymers. Block copolymers and random copolymers have long been used as compatibilizers that reduce interfacial tension between immiscible polymers and improve the stability of the composite materials.16,32 These copolymers manipulate nanoscale domain structure and interaction to enhance the stability and mechanical properties of composite materials. Recently, evolutionary search based on MD simulations (MD-GA) has identified sequence specific copolymers that outperform block and random copolymers.17 However, an optimal solution (a sequence specific copolymer) for a 20 bead polymer chain via evolutionary search required several thousands of MD simulations within the MD-GA framework. In practice, polymer chains can often involve several tens to several hundreds of monomers, necessitating algorithms that are efficient and scalable. In our design workflow, we interface MCTS with MD simulations that are used for evaluating the objective function for any specific sequence. Our MD simulation based Monte Carlo tree search


(MD-MCTS) workflow rapidly identifies the optimal sequence of copolymers corresponding to our desired interfacial tension. We demonstrate the scalability of our workflow by simulating chain lengths from 10 to 30 monomers; for each case the search required only a few hundred evaluations despite the search space extending from 1024 to 1 billion, respectively. Our work demonstrates the success of AI in efficient and faster materials search and is applicable to a broad class of sequence related materials design problems.

Model and methodology

Molecular dynamics of polymers

We use a generic coarse-grained model to represent two homopolymers A and B that are immiscible. The compatibilizer is a copolymer consisting of both the A and B type moieties. Within this model system, two adjacent coarse-grained monomers of a polymer are connected by the Finitely Extensible Nonlinear Elastic (FENE) potential of the form:

$$E = -\frac{1}{2} K R_0^2 \ln\left[1 - \left(\frac{r}{R_0}\right)^2\right]$$

Here, $K = 30\varepsilon/\sigma^2$ and $R_0 = 1.5\sigma$. Any two monomers in the system interact via the Lennard-Jones (LJ) potential of the form:

$$V(r_{ij}) = 4\varepsilon_{ij}\left[\left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^{6}\right]$$

Here, $\varepsilon_{ij}$ is the interaction energy between any two monomers i and j. The size of all the monomers is σ. The LJ interaction is truncated at a cut-off distance $r_c = 2.5\sigma$ to represent attractive interaction among the monomers of the homopolymers, viz., A–A and B–B interactions. The immiscibility between homopolymers A and B is modelled by pure repulsion between A and B moieties. This is achieved by choosing $r_c = 2^{1/6}\sigma$ for the A–B interaction. The orthogonal simulation box consists of a total of 693 homopolymer A and 693 homopolymer B chains that form two interfaces, as shown in Fig. 1a. The compatibilizer chains are placed at both the interfaces. The interfacial area between the two homopolymers (36σ × 36σ) is kept constant during the simulations. The compatibilizer concentration at each interface, which is defined as the number of compatibilizer monomers per unit area of an interface, is also kept constant. Three case studies are conducted, one for each compatibilizer chain length. A total of 414, 207 and 138 compatibilizer copolymer chains of length N = 10, 20 and 30, respectively, are placed at both the interfaces. This leads to a compatibilizer density of $1.59/\sigma^2$. All the systems consist of 36 000 CG beads. The system is periodic in all three directions. All the simulations employ the Verlet time integration scheme33 with a time step of 0.005τ, where the unit of time is $\tau = \sigma\sqrt{m/\varepsilon}$. The Nosé–Hoover thermostat and barostat34 are employed to keep the temperature and pressure constant during the simulations.

Property evaluation

All the MD calculations are conducted at a reduced temperature T = 1 and zero pressure in the direction normal to the interface. During an MD simulation, a system is initially equilibrated for $2 \times 10^6$ MD steps, followed by a production run of another $2 \times 10^6$ steps. During the production run, pressure tensor data are collected, and the surface tension of a system is calculated as:

$$\gamma_{12} = \frac{L_z}{2}\left[P_{zz} - \frac{1}{2}\left(P_{xx} + P_{yy}\right)\right]$$

Here z is the direction normal to the interface, and the equilibrium box length along z is represented by $L_z$. $P_{xx}$, $P_{yy}$ and $P_{zz}$ are the pressure components along the three directions. All the MD simulations are conducted using the LAMMPS molecular dynamics simulation package.35

Material systems

We first model a base material where two immiscible homopolymers of type A and B form interfaces, as shown by the MD snapshot in Fig. 1a. A representative MD simulation depicting the equilibrium energy and surface tension as a function of time is shown in Fig. 1b. The system attains a steady state

Fig. 1 A polymer blend: (a) MD snapshot of two immiscible homopolymers forming interfaces in a MD simulation box. (b) energy and surface
tension of the system is shown as a function of time during the production run. The energy, time and surface tension values are in LJ units.
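As a minimal sketch of how a surface tension trace like the one in Fig. 1b could be reduced to a single number, the snippet below time-averages the pressure-tensor expression γ12 = (Lz/2)[Pzz − (Pxx + Pyy)/2] from the Property evaluation section. The function name `surface_tension` and the plain-list inputs are assumptions made for illustration; they are not part of the authors' LAMMPS workflow.

```python
def surface_tension(pxx, pyy, pzz, lz):
    """Time-averaged surface tension for a box containing two interfaces.

    pxx, pyy, pzz: sequences of diagonal pressure-tensor samples (LJ units)
    lz: equilibrium box length normal to the interfaces (LJ units)
    Hypothetical helper: evaluates (Lz/2) * <Pzz - (Pxx + Pyy)/2>.
    """
    if not (len(pxx) == len(pyy) == len(pzz)) or len(pzz) == 0:
        raise ValueError("need equally long, non-empty pressure series")
    # Average the normal-minus-tangential pressure difference over all samples.
    mean_diff = sum(zz - 0.5 * (xx + yy)
                    for xx, yy, zz in zip(pxx, pyy, pzz)) / len(pzz)
    # The factor 1/2 accounts for the two interfaces in the periodic box.
    return 0.5 * lz * mean_diff


# Illustrative numbers only, not data from the paper.
print(surface_tension([1.0, 1.0], [1.0, 1.0], [2.0, 2.0], lz=10.0))
```

An isotropic pressure tensor (Pxx = Pyy = Pzz) gives zero in this expression, which is the condition the design targets.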


Fig. 2 Binary mapping of copolymers. Copolymer of length N = 10, 20 and 30 are shown in (a), (b) and (c), respectively. The grey and blue beads
represent monomer of type A and B, respectively. The arrows point to the one dimensional binary strings where 0 and 1 correspond to monomer A
and B, respectively. The binary string length is same as the copolymer chain length N. Here, the C represent total number of sequences possible for
a given copolymer of size N.
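The binary mapping of Fig. 2 can be sketched in a few lines. The snippet maps a chain of A/B monomers to 0/1 digits, picks one representative of a sequence and its reverse, and evaluates the search-space size with the formula used in the text, C = 2^N/2. The helper names (`to_bits`, `canonical`, `search_space_size`) are hypothetical and chosen only for this illustration.

```python
def to_bits(chain):
    """Map a string of 'A'/'B' monomers to a tuple of digits (A -> 0, B -> 1)."""
    mapping = {"A": 0, "B": 1}
    return tuple(mapping[m] for m in chain)


def canonical(bits):
    """Return one representative of a sequence and its reverse.

    A chain read from either end is the same molecule, so the
    lexicographically smaller of the two readings is kept.
    """
    return min(bits, bits[::-1])


def search_space_size(n):
    """Number of candidates for chain length n, using C = 2^N / 2 from the text."""
    return 2 ** n // 2


print(to_bits("AABAB"))
print(canonical((1, 1, 0)))
for n in (10, 20, 30):
    print(n, search_space_size(n))
```

The formula treats every sequence as having a distinct reverse; palindromic sequences are their own reverse, so C = 2^N/2 is a close estimate of the number of distinct chains rather than an exact count.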

within the equilibrium run, and all the properties are calculated by time averaging of data collected in this equilibrium region of the trajectory. The time-averaged energy and surface tension of the system are −4.09ε and $1.8\varepsilon/\sigma^2$, respectively. Here, ε and σ are the units of energy and length, respectively. Next, the copolymer chains of both A and B type moieties, at the interfaces, are simulated. We subsequently interface the MD simulations with MCTS to explore the sequence of A and B type moieties in a given copolymer or compatibilizer that reduces the surface tension of the system. Three compatibilizer chains of length N = 10, 20 and 30 are considered and are shown in Fig. 2. The total number of possible candidate structures for a binary chain is $C = 2^N/2$. The denominator 2 avoids double counting of configurations: a sequence and its reverse sequence are identical in this context. Therefore, the total number of candidate structures or sequences in the search space is 512, 524 288 and 536 870 912 for N = 10, 20 and 30, respectively. We seek to identify the sequence of moieties A and B that leads to the lowest surface tension of the system for all three cases by combining MD and MCTS.

Monte Carlo tree search for co-polymer design

The MD-MCTS workflow for exploring the sequence search space is shown schematically in Fig. 3. The objective is to minimize the surface tension of the system in the presence of compatibilizer chains; the surface tension can be written as $\gamma_{12} = \gamma_{12}(x)$. Here, $x \in \{0,1\}^N$ represents a sequence of 0s and 1s of size N, where 0 and 1 correspond to moieties A and B, respectively. We

Fig. 3 MD-MCTS design scheme for copolymers. It comprises four steps (selection, expansion, simulation and backpropagation) that are conducted sequentially in a given iteration, as shown by the arrows. In the simulation step, MD calculations of a set of candidate structures are conducted in parallel.
In the MD snapshot, homopolymers are shown as lines and copolymers are shown as beads. The termination criterion is chosen to be γ12 ≈ 0.0.
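The four steps named in the Fig. 3 caption can be sketched as a small, self-contained loop. This is an illustrative toy under stated assumptions, not the authors' implementation: the MD evaluation is replaced by a hypothetical objective (normalized Hamming distance to a hidden target sequence), the reward is taken as the negative of that objective so that the highest UCB score favours low values, the exploration constant `c` is fixed rather than adapted per node, and `n_children=3` is an arbitrary choice. The per-generation flip fractions and the 10 playouts per child follow the values quoted in the text.

```python
import math
import random

# Fraction of the chain allowed to change at each tree generation (values quoted in the text).
FLIP_FRACTION = [1.0, 0.6, 0.3, 0.2, 0.1, 0.05]


class Node:
    def __init__(self, seq, parent=None):
        self.seq = seq
        self.parent = parent
        self.depth = 0 if parent is None else parent.depth + 1
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

    def ucb(self, c):
        # Exploitation term (mean reward) plus exploration bonus.
        mean = self.total_reward / self.visits
        return mean + c * math.sqrt(2.0 * math.log(self.parent.visits) / self.visits)


def perturb(seq, n_sites, rng):
    """Copy seq; flip one random digit, or re-draw several random digits."""
    out = list(seq)
    for i in rng.sample(range(len(out)), n_sites):
        out[i] = 1 - out[i] if n_sites == 1 else rng.randint(0, 1)
    return tuple(out)


def make_toy_objective(target):
    """Hypothetical stand-in for the MD surface tension: 0 only at `target`."""
    def gamma(seq):
        return sum(a != b for a, b in zip(seq, target)) / len(target)
    return gamma


def mcts_minimize(objective, n, budget=300, n_children=3, n_playouts=10,
                  c=0.5, tol=1e-9, seed=0):
    rng = random.Random(seed)
    root = Node(tuple(rng.randint(0, 1) for _ in range(n)))
    best_seq, best_val, evals = root.seq, objective(root.seq), 1
    max_depth = len(FLIP_FRACTION) - 1
    while evals < budget and best_val > tol:
        # 1. Selection: descend by highest UCB until a leaf is reached.
        node = root
        while node.children:
            node = max(node.children, key=lambda ch: ch.ucb(c))
        # 2. Expansion: each child differs from its parent by one flipped digit.
        if node.depth < max_depth:
            node.children = [Node(perturb(node.seq, 1, rng), node)
                             for _ in range(n_children)]
            targets = node.children
        else:
            targets = [node]  # deepest level: keep sampling around this leaf
        for target in targets:
            n_sites = max(1, round(FLIP_FRACTION[target.depth] * n))
            for _ in range(n_playouts):
                if evals >= budget or best_val <= tol:
                    break
                # 3. Simulation: evaluate one playout structure.
                cand = perturb(target.seq, n_sites, rng)
                val = objective(cand)
                evals += 1
                if val < best_val:
                    best_seq, best_val = cand, val
                # 4. Backpropagation: update this node and all its ancestors.
                anc = target
                while anc is not None:
                    anc.visits += 1
                    anc.total_reward -= val
                    anc = anc.parent
    return best_seq, best_val, evals


hidden = (0, 1, 1, 0, 1, 0, 0, 1, 1, 0)
seq, val, evals = mcts_minimize(make_toy_objective(hidden), n=len(hidden))
print(seq, val, evals)
```

In a real workflow, `objective` would launch an MD simulation and return the measured surface tension, and the playouts of one iteration could run in parallel as the caption describes.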


conduct three MD-MCTS calculations, each for a specific value of N. For a given N, the MD-MCTS begins by randomly generating a sequence of 0s and 1s of size N. This candidate serves as the root node of the search tree. A search tree is built in an incremental and iterative way to search for the optimal sequence of a copolymer, as shown in Fig. 3. Each node of the tree represents a specific sequence of the copolymer. Here the sequence is represented by a binary number of N digits for a copolymer of chain length N. Once the termination criterion is reached (set to be γ12 ≈ 0), the search (growth of the tree) is stopped and the best performing candidate is returned. In each iteration, four steps (selection, expansion, simulation and back-propagation) are carried out. A child node is selected during the selection process based on the upper confidence bound (UCB) score.36 The UCB of a node is defined as

$$\mathrm{ucb}_i = \frac{\sum_{k=1}^{N_i} \gamma_k}{v_i} + C\sqrt{\frac{2 \ln v_p}{v_i}}$$

Here, $\gamma_k$ is the surface tension of the k-th playout performed by this node and all of its downstream child nodes, $v_i$ is the visit/playout count of the node, $v_p$ is the visit/playout count of the parent of this node, and C is a constant for balancing exploration (the term to the right of the plus sign) and exploitation (the term to the left of the plus sign). All variables in this equation are aggregates of a node and its child nodes, with the exception of the exploration constant. The exploration constant, C, is a hyperparameter that is chosen by the user. However, this is a highly critical choice, given that the efficiency of the algorithm is determined by it. If the value of C is too small, the exploitation side of the equation becomes the dominant term and, as a result, nodes are selected only according to which node has the best score at that moment. This will of course cause the algorithm to quickly flow to the nearest local minimum and get trapped. If the value is too large, the selection process will wander aimlessly. Note that the node with the highest UCB score is always selected for a given step. The MCTS algorithm does not directly pick to exploit or explore; it instead picks according to the best combined total. For a node to have a high UCB score, the node must have a good combination of both terms. A good choice of the exploration constant is critical, as this determines how these two terms combine to give a UCB score.

The primary benefit of the MCTS formalism is that, when properly tuned, the algorithm will make a weighted choice to continue down a path that appears to have a good solution hidden behind it or to cut its losses and look somewhere else when it appears to have exhausted its search space down a given path. As such, the choice of the exploration constant is very critical for optimal performance: the algorithm should search a path well enough to potentially find a solution, but not waste valuable simulation time by over-sampling a given path. Instead of trying to manually tune the parameter, the value of C was controlled adaptively at each node according to the formula

$$C = \frac{\sqrt{2}\,J}{4}\left(z_{\max} - z_{\min}\right)$$

Here, J is the meta parameter, which is initially set to one and increases whenever the algorithm reaches a “dead end” node, to allow more exploration. At a dead-end node, the number of possible structures narrows to one. This happens when the numbers of k − 1 candidate structures reach the limit. Here, J is updated as

$$J \leftarrow J + \max\left(\frac{T - t}{T},\ 0.1\right)$$

where T is the total number of candidates to be evaluated, and t is the number of candidates for which the surface tension is already evaluated. Whenever a new node is added, it is by default selected for one simulation cycle as part of the initialization process. Next, we perform the expansion of the tree by adding child nodes to the selected node. A new child node is created by randomly flipping a digit of the parent node. In the simulation step, a playout is performed from each of the added children. We roll out 10 structures randomly during a playout from a child node. In these playouts the structure contained within this node is changed by randomly flipping several polymer groups. Initially the entire chain from the initial structure is allowed to flip randomly, but for nodes that are deeper in the tree (i.e. 2nd, 3rd, or 4th generation nodes) fewer polymer groups are allowed to flip. This ensures a convergent behaviour where less and less of the chain is modified as the algorithm picks a path in the node tree to travel down. The scaling was set to be 100%, 60%, 30%, 20%, 10%, 5% for the 0th, 1st, 2nd, 3rd, etc. generation node, respectively, in the tree. The final level was always set such that only 1 polymer unit was flipped. All 10 playout structures are evaluated via long-time scale MD simulations. Finally, in the back propagation step, the visit count of each ancestor node of i is incremented by one and the cumulative value is also updated to keep consistency. Note that the concept of back propagation (i.e. child nodes feeding their information back up the tree) is a key feature of this algorithm. Owing to this, a parent also shares the reward that is discovered by its child node. As such, the reward is updated for every node in the branch, starting from the child that discovered it up to the head node. This makes it possible for the algorithm to either choose to simply flip a few polymer units by selecting the deepest child node or return back up the tree to one of the higher parent nodes in the branch, where it has more freedom in how many flips it can perform. This effectively means the algorithm is allowed to pick how many flips it wishes to try by selecting a node of a given depth. This balance allows it to make either fine-tuned adjustments or larger adjustments as needed. In the absence of this, either the algorithm never converged because it was changing too much all the time, or it would become trapped because it was not making a large enough change to the polymer.

The initial layer of nodes is effectively given a random sequence of polymer chains, and the child nodes are refined based on the reward. MCTS then picks between either large changes or small ones depending on finding sufficient reward down a particular branch. If a child node is failing to find better rewards than its parent, the UCB will start favoring the parent node instead of the child node. This will result in the algorithm making larger adjustments instead of smaller ones.


(i.e. single polymer unit flips are failing to further minimize the surface tension, suggesting that a larger number of flips is required). At any given point, any node in the tree, regardless of depth, can be chosen if its UCB score becomes the largest. As such, it is possible to sample any point in the total statistical space. Finally, we point out that the depth scaling rates are an arbitrary hyperparameter, but in this case were chosen to ensure a smooth transition between all the polymer units being randomly assigned and only 1 being flipped. We find this is obtained by a tree depth of 6 nodes, which strikes a good balance between speed and accuracy.

Results

We first assess the performance of our MCTS-MD workflow as shown in Fig. 4. The lowest surface tension as a function of the total number of candidates evaluated during the MCTS iterations is shown in Fig. 4a for the three different polymer chain lengths. We find that the MD-MCTS is able to identify optimal sequences that bring down the surface tension to near zero for each of the three cases. An optimal sequence for γ12 ≈ 0.0 for each case is shown in Fig. 4b. It is interesting to note that all the optimal sequences are non-periodic and non-intuitive. These machine learned sequences combine small and long segments of blocks. These specific arrangements of blocks in the compatibilizer structure lead to more interfacial crossings at the interface than regular di-block copolymers or other periodic copolymers of large block length. The ML identified polymers outperform di-block copolymers due to their ability to form a large number of interfacial crossings at the interface. MCTS demonstrates exceptional scalability and is able to achieve zero interfacial energy irrespective of the size or chain length of the polymer compatibilizer.

Fig. 4 Performance and prediction of MD-MCTS. (a) The lowest surface tension achieved during the MD-MCTS run is shown as a function of the total number of candidate materials directly evaluated via MD simulations for all three cases. The optimal sequences for copolymers for all three cases are depicted in (b).

We assess the scalability of the MD-MCTS algorithm by plotting the total number of candidate structures evaluated during a given search cycle as a function of the size of the compatibilizer (Fig. 5a). This is especially important considering the long timescale MD simulations that are necessary to perform each evaluation for a specific sequence. MCTS performs direct evaluation of the materials properties of 114, 388 and 415 candidates to achieve an optimal sequence that corresponds to our target, i.e. zero interfacial energy of the system, for copolymers of length 10, 20 and 30, respectively. We note only a marginal increase in the number of MD simulations with an increase in the system size. This is remarkable considering that the design space, i.e. the total number of candidates C, increases from 1024 to 1 billion when the chain length increases from 10 to 30. MCTS is thus able to attain an optimal solution by screening a small percentage of candidates, viz., 22%, 0.07% and $7.7 \times 10^{-5}$% of the total possible structures for chain lengths 10, 20 and 30, respectively. Fig. 5b depicts this ratio Cs/C as a function of N, clearly indicating the exponentially lower fraction of candidates required to be screened during the design cycle as the polymer chain length increases. This strongly suggests that the MD-MCTS design scheme is scalable to extremely large system sizes, which has hitherto posed a challenge to evolutionary search strategies.

Fig. 5 Scalability of the MCTS based optimization. (a) The total number of candidate structures screened (Cs) during an MCTS cycle is shown as a function of the number of monomers in a copolymer chain. (b) The fraction of candidate structures (Cs/C) screened during a design cycle is plotted as a function of polymer chain length.

Mean block length of a copolymer has long been perceived as an important descriptor of a copolymer’s properties. Therefore, we closely analyse the correlation between any given sequence, i.e. the relative statistics of type-1 and type-0 moieties present in a copolymer chain, and the computed surface tension of the system. Surface tension vs. mean block length for each of the three systems (N = 10, 20 and 30) is shown in Fig. 6. Here, the mean block length is calculated as the arithmetic mean of the sizes of all the blocks of 0’s and 1’s present in a copolymer. The MCTS identifies multiple mean block lengths that correspond to the highest performance compatibilizer (γ12 ≈ 0). For example, there are copolymers of size N = 10 with mean block length $l_b$ = 1.8, 2.6 and 3.3 that yield γ12 ≈ 0 (Fig. 6a). Similar observations can be made for N = 20 and N = 30 from Fig. 6b and c, respectively. We also find that sequences with the same mean block length as the optimal sequence can exhibit a wide range of interfacial energies. The interfacial energy varies from 0 to approximately $1.2\varepsilon/\sigma^2$ for a mean block length of 3.3 for N = 10. A similar variation of surface tension for a given mean block length is observed for N = 20 and 30, as evident in Fig. 6b and c, respectively. Thus, the mean block length alone is a poor predictor of a compatibilizer’s performance.

To further understand the uniqueness of the optimal polymer sequence, we study the interrelationship between mean block length, monomer mole fraction and the interfacial energy of the system. Here, the mole fraction is defined as the ratio of the number of monomers of one particular moiety, viz. the type-1 moiety, to the total number of monomers in a chain. Fig. 7a, b and c show the variation of interfacial energy for N = 10, 20 and 30, respectively, as a function of mean block length ($l_b$) and mole fraction of type-1 (q). The deep blue contours in Fig. 7 correspond to the lowest surface tension of the system. For N = 10, we observe γ12 ≈ 0.0 for q ≈ 0.5 and $l_b$ = 3.3. As N increases, there are larger patches of isolated blue contours that appear in the q − $l_b$ surface, indicating greater number of

Fig. 6 Variation of interfacial energy as a function of the mean block length for all the candidate structures screened in this study for N = 10, 20 and 30 in (a), (b) and (c), respectively. Both the surface tension and the mean block length are in LJ units.
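
The mean block length plotted in Fig. 6 can be computed directly from a sequence. The following is a minimal Python sketch, assuming a copolymer is encoded as a non-empty string of '0' and '1' characters; the function names and the two example sequences are illustrative assumptions and are not taken from the paper.

```python
from itertools import groupby
from typing import List


def block_lengths(sequence: str) -> List[int]:
    """Return the sizes of consecutive runs of identical monomers."""
    return [len(list(run)) for _, run in groupby(sequence)]


def mean_block_length(sequence: str) -> float:
    """Arithmetic mean of the sizes of all blocks of 0's and 1's.

    The block sizes always sum to the chain length, so the mean is
    simply chain length divided by the number of blocks.
    """
    blocks = block_lengths(sequence)
    return len(sequence) / len(blocks)


# Two hypothetical N = 10 sequences, each made of three blocks.
seq_a = "0001111000"  # block sizes 3, 4, 3
seq_b = "0111111110"  # block sizes 1, 8, 1

print(block_lengths(seq_a), mean_block_length(seq_a))
print(block_lengths(seq_b), mean_block_length(seq_b))
```

Both example sequences contain three blocks and therefore share the same mean block length of 10/3 ≈ 3.3, although their monomer arrangements differ. This illustrates why a single mean block length need not identify a sequence, or its interfacial energy, uniquely.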

Fig. 7 Heat maps of interfacial energy for chain lengths (a) N = 10, (b) N = 20 and (c) N = 30. The interfacial energy γ12 is shown as a function of the mole fraction of A-type monomers, q, and the mean block length, lb. The surface tension and block length are in LJ units.
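
The two axes of Fig. 7, the mole fraction q and the mean block length lb, can be tabulated for an entire design space when N is small. The Python sketch below enumerates all binary sequences of length 10 and counts how many share each (q, lb) pair. The helper `descriptors` and the 0/1 tuple encoding are assumptions made for illustration; no interfacial energies are computed here.

```python
from collections import Counter
from fractions import Fraction
from itertools import groupby, product

N = 10  # chain length; the design space holds 2**N candidates


def descriptors(sequence):
    """Return (mole fraction of type-1, mean block length) as exact fractions."""
    n_blocks = sum(1 for _ in groupby(sequence))
    q = Fraction(sequence.count(1), len(sequence))
    lb = Fraction(len(sequence), n_blocks)
    return q, lb


# Count how many sequences fall on each (q, lb) point.
degeneracy = Counter(descriptors(seq) for seq in product((0, 1), repeat=N))

total = sum(degeneracy.values())
same_lb = sum(c for (q, lb), c in degeneracy.items() if lb == Fraction(10, 3))
at_point = degeneracy[(Fraction(1, 2), Fraction(10, 3))]

print(f"{total} sequences map onto {len(degeneracy)} distinct (q, lb) points")
print(f"{same_lb} sequences have lb = 10/3; {at_point} of them also have q = 1/2")
```

Exact fractions are used so that points such as lb = 10/3 can be compared without floating-point tolerance. The same brute-force enumeration quickly becomes impractical for N = 20 or 30, where the design space grows to roughly a million and a billion sequences, respectively.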


global optimal points in the q − lb surface. As the chain length further increases, it is possible to achieve many more sequences that can lead to zero interfacial energy of the system. Oftentimes, the optimal regions are separated by large regions of suboptimal points, as evident from Fig. 7b and c. Many local optimization strategies, and even global ones, struggle to navigate around these suboptimal points. Thus, the complexity as well as the degeneracy of this configurational space increases for longer chain lengths and poses challenges. The MCTS algorithm is, however, able to navigate around these suboptimal regions by growing other branches of the tree, effectively utilizing the trade-off mechanism between exploration and exploitation. MCTS thus simultaneously explores potentially better pathways to reach the optimal point in a search space and exploits pathways that have the greatest estimated value of the search function. This combination of exploration and exploitation, with an appropriate trade-off mechanism between them, represents a powerful strategy for identifying the optimal point of a given function.

Conclusions

In summary, sequence control in soft matter systems such as polymers and proteins has been a longstanding goal and is highly desirable for a wide variety of applications. In particular, the emergence of sequence-controlled polymers provides tremendous opportunities for material design. By controlling the sequence of a copolymer, one can reliably tune its functionality over a wide range, which can be mapped on to its chemical information. The vast combinatorial search space that must be explored for these sequence problems poses a major challenge. To overcome this, we interface a gaming AI algorithm, viz. MCTS, with an MD simulator to tackle the complexity of copolymer design. Our MD-MCTS design algorithm is employed to identify optimal copolymer sequences which can be used as compatibilizers to improve the thermodynamic stability of blends of two immiscible homopolymers. Unlike other molecular inverse design strategies such as genetic algorithm (GA) or Bayesian optimization (BO), where design time grows rapidly with the system size, MCTS interfaced with MD appears to require a much smaller number of candidate evaluations in any given design cycle. We show that one can engineer the sequence of chemical moieties that will nullify the interfacial tension between immiscible polymers irrespective of the size of the compatibilizer polymer chain. Our work also elucidates the correlation between interfacial energy and the sequence statistics of copolymer compatibilizer molecules and illustrates the complexities associated with sequence control over larger polymer chain lengths. Finally, we show that MCTS is highly scalable and able to efficiently navigate the large design spaces typical of most practical sequence-control design problems. More broadly, the work provides a new strategy that can be used for sequence control and inverse design of copolymers for materials applications.

Author contributions

T.P., T.L. and S.K.R.S. conceived and designed the project. T.P. and T.L. contributed equally. T.P., T.L. and S.K.R.S. developed the inverse design framework using MCTS. T.L. developed the MCTS workflow while T.P. carried out the MCTS optimization. T.P. carried out the MD simulations of polymers. All authors interpreted the data and performed analysis. T.P. and S.K.R.S. wrote the manuscript. All the authors commented and contributed to the preparation of the manuscript. S.K.R.S. supervised the entire project.

Conflicts of interest

The authors declare no competing financial or non-financial interests.

Acknowledgements

The use of the Center for Nanoscale Materials, an Office of Science user facility, was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357. This research used resources of the National Energy Research Scientific Computing Center, which was supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. An award of computer time was provided by the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program of the Argonne Leadership Computing Facility at the Argonne National Laboratory, which was supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357. The authors would like to acknowledge the support from the Argonne LDRD-2017-012-N0 project, UIC faculty start-up fund and IIT Madras faculty start-up fund. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences Data, Artificial Intelligence and Machine Learning at DOE Scientific User Facilities program under Award Number 34532.

References

1 Y. Gruenbaum, T. Naveh-Many, H. Cedar and A. Razin, Sequence Specificity of Methylation in Higher Plant DNA, Nature, 1981, 292(5826), 860–862, DOI: 10.1038/292860a0.
2 P. J. Mitchell and R. Tjian, Transcriptional Regulation in Mammalian Cells by Sequence-Specific DNA Binding Proteins, Science, 1989, 245(4916), 371–378, DOI: 10.1126/science.2667136.
3 M. A. Lemmon, J. M. Flanagan, H. R. Treutlein, J. Zhang and D. M. Engelman, Sequence Specificity in the Dimerization of Transmembrane α-Helixes, Biochemistry, 1992, 31(51), 12719–12725, DOI: 10.1021/bi00166a002.


4 J.-F. Lutz, Defining the Field of Sequence-Controlled Polymers, Macromol. Rapid Commun., 2017, 38(24), 1700582, DOI: 10.1002/marc.201700582.
5 J. D. Neve, J. J. Haven, L. Maes and T. Junkers, Sequence-Definition from Controlled Polymerization: The next Generation of Materials, Polym. Chem., 2018, 9(38), 4692–4705, DOI: 10.1039/C8PY01190G.
6 S. L. Perry and C. E. Sing, 100th Anniversary of Macromolecular Science Viewpoint: Opportunities in the Physics of Sequence-Defined Polymers, ACS Macro Lett., 2020, 9(2), 216–225, DOI: 10.1021/acsmacrolett.0c00002.
7 F. S. Bates, Polymer-Polymer Phase Behavior, Science, 1991, 251(4996), 898–905, DOI: 10.1126/science.251.4996.898.
8 A. Chremos, A. Nikoubashman and A. Z. Panagiotopoulos, Flory-Huggins Parameter χ, from Binary Mixtures of Lennard-Jones Particles to Block Copolymer Melts, J. Chem. Phys., 2014, 140(5), 054909, DOI: 10.1063/1.4863331.
9 F. S. Bates and G. H. Fredrickson, Block Copolymer Thermodynamics: Theory and Experiment, Annu. Rev. Phys. Chem., 1990, 41(1), 525–557, DOI: 10.1146/annurev.pc.41.100190.002521.
10 I. W. Hamley, Ordering in Thin Films of Block Copolymers: Fundamentals to Potential Applications, Prog. Polym. Sci., 2009, 34(11), 1161–1210, DOI: 10.1016/j.progpolymsci.2009.06.003.
11 P. Cigana, B. D. Favis and R. Jerome, Diblock Copolymers as Emulsifying Agents in Polymer Blends: Influence of Molecular Weight, Architecture, and Chemical Composition, J. Polym. Sci., Part B: Polym. Phys., 1996, 34(9), 1691–1700, DOI: 10.1002/(SICI)1099-0488(19960715)34:9<1691::AID-POLB18>3.0.CO;2-2.
12 H. E. H. Meijer, P. J. Lemstra and P. H. M. Elemans, Structured Polymer Blends, Makromol. Chem., Macromol. Symp., 1988, 16(1), 113–135, DOI: 10.1002/masy.19880160109.
13 U. Sundararaj and C. W. Macosko, Drop Breakup and Coalescence in Polymer Blends: The Effects of Concentration and Compatibilization, Macromolecules, 1995, 28(8), 2647–2657, DOI: 10.1021/ma00112a009.
14 A. R. Khokhlov and P. G. Khalatur, Conformation-Dependent Sequence Design (Engineering) of AB Copolymers, Phys. Rev. Lett., 1999, 82(17), 3456–3459, DOI: 10.1103/PhysRevLett.82.3456.
15 C. E. Sing, J. W. Zwanikken and M. Olvera de la Cruz, Electrostatic Control of Block Copolymer Morphology, Nat. Mater., 2014, 13(7), 694–698, DOI: 10.1038/nmat4001.
16 Y. Lyatskaya, D. Gersappe, N. A. Gross and A. C. Balazs, Designing Compatibilizers To Reduce Interfacial Tension in Polymer Blends, J. Phys. Chem., 1996, 100(5), 1449–1458, DOI: 10.1021/jp952422e.
17 V. Meenakshisundaram, J.-H. Hung, T. K. Patra and D. S. Simmons, Designing Sequence-Specific Copolymer Compatibilizers Using a Molecular-Dynamics-Simulation-Based Genetic Algorithm, Macromolecules, 2017, 50(3), 1155–1166, DOI: 10.1021/acs.macromol.6b01747.
18 C. W. Macosko, P. Guégan, A. K. Khandpur, A. Nakayama, P. Marechal and T. Inoue, Compatibilizers for Melt Blending: Premade Block Copolymers, Macromolecules, 1996, 29(17), 5590–5598, DOI: 10.1021/ma9602482.
19 T. K. Patra, V. Meenakshisundaram, J.-H. Hung and D. S. Simmons, Neural-Network-Biased Genetic Algorithms for Materials Design: Evolutionary Algorithms That Learn, ACS Comb. Sci., 2017, 19(2), 96–107, DOI: 10.1021/acscombsci.6b00136.
20 T. K. Patra, F. Zhang, D. S. Schulman, H. Chan, M. J. Cherukara, M. Terrones, S. Das, B. Narayanan and S. K. R. S. Sankaranarayanan, Defect Dynamics in 2-D MoS2 Probed by Using Machine Learning, Atomistic Simulations, and High-Resolution Microscopy, ACS Nano, 2018, 12(8), 8006–8016, DOI: 10.1021/acsnano.8b02844.
21 T. Lookman, P. V. Balachandran, D. Xue and R. Yuan, Active Learning in Materials Science with Emphasis on Adaptive Sampling Using Uncertainties for Targeted Design, npj Comput. Mater., 2019, 5(1), 1–17, DOI: 10.1038/s41524-019-0153-8.
22 N. E. Jackson, M. A. Webb and J. J. de Pablo, Recent Advances in Machine Learning towards Multiscale Soft Materials Design, Curr. Opin. Chem. Eng., 2019, 23, 106–114, DOI: 10.1016/j.coche.2019.03.005.
23 J. Schmidt, M. R. G. Marques, S. Botti and M. A. L. Marques, Recent Advances and Applications of Machine Learning in Solid-State Materials Science, npj Comput. Mater., 2019, 5(1), 1–36, DOI: 10.1038/s41524-019-0221-0.
24 T. M. Dieb, S. Ju, K. Yoshizoe, Z. Hou, J. Shiomi and K. Tsuda, MDTS: Automatic Complex Materials Design Using Monte Carlo Tree Search, Sci. Technol. Adv. Mater., 2017, 18(1), 498–503, DOI: 10.1080/14686996.2017.1344083.
25 D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel and D. Hassabis, Mastering the Game of Go with Deep Neural Networks and Tree Search, Nature, 2016, 529(7587), 484–489, DOI: 10.1038/nature16961.
26 C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis and S. Colton, A Survey of Monte Carlo Tree Search Methods, IEEE Trans. Comput. Intell. AI Games, 2012, 4(1), 1–43, DOI: 10.1109/TCIAIG.2012.2186810.
27 T. M. Dieb, S. Ju, J. Shiomi and K. Tsuda, Monte Carlo Tree Search for Materials Design and Discovery, MRS Commun., 2019, 9(2), 532–536, DOI: 10.1557/mrc.2019.40.
28 M. H. S. Segler, M. Preuss and M. P. Waller, Planning Chemical Syntheses with Deep Neural Networks and Symbolic AI, Nature, 2018, 555(7698), 604–610, DOI: 10.1038/nature25978.
29 S. Kiyohara and T. Mizoguchi, Searching the Stable Segregation Configuration at the Grain Boundary by a Monte Carlo Tree Search, J. Chem. Phys., 2018, 148(24), 241741, DOI: 10.1063/1.5023139.


30 X. Yang, J. Zhang, K. Yoshizoe, K. Terayama and K. Tsuda, ChemTS: An Efficient Python Library for de Novo Molecular Generation, Sci. Technol. Adv. Mater., 2017, 18(1), 972–976, DOI: 10.1080/14686996.2017.1401424.
31 K. Shin, D. P. Tran, K. Takemura, A. Kitao, K. Terayama and K. Tsuda, Enhancing Biomolecular Sampling with Reinforcement Learning: A Tree Search Molecular Dynamics Simulation Method, ACS Omega, 2019, 4(9), 13853–13862, DOI: 10.1021/acsomega.9b01480.
32 W. Dong, H. Wang, M. He, F. Ren, T. Wu, Q. Zheng and Y. Li, Synthesis of Reactive Comb Polymers and Their Applications as a Highly Efficient Compatibilizer in Immiscible Polymer Blends, Ind. Eng. Chem. Res., 2015, 54(7), 2081–2089, DOI: 10.1021/ie503645a.
33 M. E. Tuckerman, G. J. Martyna and B. J. Berne, Molecular Dynamics Algorithm for Condensed Systems with Multiple Time Scales, J. Chem. Phys., 1990, 93(2), 1287–1291, DOI: 10.1063/1.459140.
34 M. E. Tuckerman, J. Alejandre, R. López-Rendón, A. L. Jochim and G. J. Martyna, A Liouville-Operator Derived Measure-Preserving Integrator for Molecular Dynamics Simulations in the Isothermal–Isobaric Ensemble, J. Phys. A: Math. Gen., 2006, 39(19), 5629, DOI: 10.1088/0305-4470/39/19/S18.
35 S. Plimpton, Fast Parallel Algorithms for Short-Range Molecular Dynamics, J. Comput. Phys., 1995, 117(1), 1–19, DOI: 10.1006/jcph.1995.1039.
36 L. Kocsis and C. Szepesvári, Bandit Based Monte-Carlo Planning, in Machine Learning: ECML 2006, ed. J. Fürnkranz, T. Scheffer and M. Spiliopoulou, Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2006, pp. 282–293, DOI: 10.1007/11871842_29.