
Learning to Simulate Complex Physics with Graph Networks

Alvaro Sanchez-Gonzalez*¹  Jonathan Godwin*¹  Tobias Pfaff*¹  Rex Ying*¹²  Jure Leskovec²  Peter W. Battaglia¹

Abstract

Here we present a machine learning framework and model implementation that can learn to simulate a wide variety of challenging physical domains, involving fluids, rigid solids, and deformable materials interacting with one another. Our framework—which we term "Graph Network-based Simulators" (GNS)—represents the state of a physical system with particles, expressed as nodes in a graph, and computes dynamics via learned message-passing. Our results show that our model can generalize from single-timestep predictions with thousands of particles during training, to different initial conditions, thousands of timesteps, and at least an order of magnitude more particles at test time. Our model was robust to hyperparameter choices across various evaluation metrics: the main determinants of long-term performance were the number of message-passing steps, and mitigating the accumulation of error by corrupting the training data with noise. Our GNS framework advances the state-of-the-art in learned physical simulation, and holds promise for solving a wide range of complex forward and inverse problems.

Figure 1. Rollouts of our GNS model for our WATER-3D, GOOP-3D and SAND-3D datasets. It learns to simulate rich materials at resolutions sufficient for high-quality rendering [video].

* Equal contribution. ¹DeepMind, London, UK. ²Department of Computer Science, Stanford University, Stanford, CA, USA. Email to: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected].

Proceedings of the 37th International Conference on Machine Learning, Online, PMLR 119, 2020. Copyright 2020 by the author(s).

1. Introduction

Realistic simulators of complex physics are invaluable to many scientific and engineering disciplines, however traditional simulators can be very expensive to create and use. Building a simulator can entail years of engineering effort, and often must trade off generality for accuracy in a narrow range of settings. High-quality simulators require substantial computational resources, which makes scaling up prohibitive. Even the best are often inaccurate due to insufficient knowledge of, or difficulty in approximating, the underlying physics and parameters. An attractive alternative to traditional simulators is to use machine learning to train simulators directly from observed data, however the large state spaces and complex dynamics have been difficult for standard end-to-end learning approaches to overcome.

Here we present a powerful machine learning framework for learning to simulate complex systems from data—"Graph Network-based Simulators" (GNS). Our framework imposes strong inductive biases, where rich physical states are represented by graphs of interacting particles, and complex dynamics are approximated by learned message-passing among nodes.

We implemented our GNS framework in a single deep learning architecture, and found it could learn to accurately simulate a wide range of physical systems in which fluids, rigid solids, and deformable materials interact with one another. Our model also generalized well to much larger systems and longer time scales than those on which it was trained.
Figure 2. (a) Our GNS predicts future states represented as particles using its learned dynamics model, d_θ, and a fixed update procedure. (b) The d_θ uses an "encode-process-decode" scheme, which computes dynamics information, Y, from input state, X. (c) The ENCODER constructs latent graph, G^0, from the input state, X. (d) The PROCESSOR performs M rounds of learned message-passing over the latent graphs, G^0, ..., G^M. (e) The DECODER extracts dynamics information, Y, from the final latent graph, G^M.

While previous learning simulation approaches (Li et al., 2018; Ummenhofer et al., 2020) have been highly specialized for particular tasks, we found our single GNS model performed well across dozens of experiments and was generally robust to hyperparameter choices. Our analyses showed that performance was determined by a handful of key factors: its ability to compute long-range interactions, inductive biases for spatial invariance, and training procedures which mitigate the accumulation of error over long simulated trajectories.

2. Related Work

Our approach focuses on particle-based simulation, which is used widely across science and engineering, e.g., computational fluid dynamics, computer graphics. States are represented as a set of particles, which encode mass, material, movement, etc. within local regions of space. Dynamics are computed on the basis of particles' interactions within their local neighborhoods. One popular particle-based method for simulating fluids is "smoothed particle hydrodynamics" (SPH) (Monaghan, 1992), which evaluates pressure and viscosity forces around each particle, and updates particles' velocities and positions accordingly. Other techniques, such as "position-based dynamics" (PBD) (Müller et al., 2007) and "material point method" (MPM) (Sulsky et al., 1995), are more suitable for interacting, deformable materials. In PBD, incompressibility and collision dynamics involve resolving pairwise distance constraints between particles, and directly predicting their position changes. Several differentiable particle-based simulators have recently appeared, e.g., DiffTaichi (Hu et al., 2019), PhiFlow (Holl et al., 2020), and Jax-MD (Schoenholz & Cubuk, 2019), which can backpropagate gradients through the architecture.

Learning simulations from data (Grzeszczuk et al., 1998) has been an important area of study with applications in physics and graphics. Compared to engineered simulators, a learned simulator can be far more efficient for predicting complex phenomena (He et al., 2019); e.g., (Ladickỳ et al., 2015; Wiewel et al., 2019) learn parts of a fluid simulator for faster prediction.

Graph Networks (GN) (Battaglia et al., 2018)—a type of graph neural network (Scarselli et al., 2008)—have recently proven effective at learning forward dynamics in various settings that involve interactions between many entities. A GN maps an input graph to an output graph with the same structure but potentially different node, edge, and graph-level attributes, and can be trained to learn a form of message-passing (Gilmer et al., 2017), where latent information is propagated between nodes via the edges. GNs and their variants, e.g., "interaction networks", can learn to simulate rigid body, mass-spring, n-body, and robotic control systems (Battaglia et al., 2016; Chang et al., 2016; Sanchez-Gonzalez et al., 2018; Mrowca et al., 2018; Li et al., 2019; Sanchez-Gonzalez et al., 2019), as well as non-physical systems, such as multi-agent dynamics (Tacchetti et al., 2018; Sun et al., 2019), algorithm execution (Veličković et al., 2020), and other dynamic graph settings (Trivedi et al., 2019; 2017; Yan et al., 2018; Manessi et al., 2020).
Our GNS framework builds on and generalizes several lines of work, especially Sanchez-Gonzalez et al. (2018)'s GN-based model which was applied to various robotic control systems, Li et al. (2018)'s DPI which was applied to fluid dynamics, and Ummenhofer et al. (2020)'s Continuous Convolution (CConv) which was presented as a non-graph-based method for simulating fluids. Crucially, our GNS framework is a general approach to learning simulation, is simpler to implement, and is more accurate across fluid, rigid, and deformable material systems.

3. GNS Model Framework

3.1. General Learnable Simulation

We assume X^t ∈ 𝒳 is the state of the world at time t. Applying physical dynamics over K timesteps yields a trajectory of states, X^{t_0:K} = (X^{t_0}, ..., X^{t_K}). A simulator, s : 𝒳 → 𝒳, models the dynamics by mapping preceding states to causally consequent future states. We denote a simulated "rollout" trajectory as X̃^{t_0:K} = (X^{t_0}, X̃^{t_1}, ..., X̃^{t_K}), which is computed iteratively by X̃^{t_{k+1}} = s(X̃^{t_k}) for each timestep. Simulators compute dynamics information that reflects how the current state is changing, and use it to update the current state to a predicted future state (see Figure 2(a)). An example is a numerical differential equation solver: the equations compute dynamics information, i.e., time derivatives, and the integrator is the update mechanism.

A learnable simulator, s_θ, computes the dynamics information with a parameterized function approximator, d_θ : 𝒳 → 𝒴, whose parameters, θ, can be optimized for some training objective. The Y ∈ 𝒴 represents the dynamics information, whose semantics are determined by the update mechanism. The update mechanism can be seen as a function which takes the X̃^{t_k}, and uses d_θ to predict the next state, X̃^{t_{k+1}} = Update(X̃^{t_k}, d_θ). Here we assume a simple update mechanism—an Euler integrator—and Y that represents accelerations. However, more sophisticated update procedures which call d_θ more than once can also be used, such as higher-order integrators (e.g., Sanchez-Gonzalez et al. (2019)).
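To make this update mechanism concrete, the following minimal sketch applies semi-implicit Euler integration to predicted accelerations over a rollout. It is an illustration, not the authors' released code; `predict_acceleration` stands in for the learned d_θ and is assumed to map the recent position history to per-particle accelerations.

```python
import numpy as np

def euler_update(positions, prev_positions, accelerations, dt=1.0):
    # Semi-implicit Euler: update velocity from the predicted acceleration,
    # then update position from the new velocity.
    velocities = (positions - prev_positions) / dt   # finite-difference velocity
    new_velocities = velocities + accelerations * dt
    return positions + new_velocities * dt

def rollout(initial_history, predict_acceleration, num_steps, dt=1.0):
    """Iteratively computes X̃^{t_{k+1}} = Update(X̃^{t_k}, d_θ).

    initial_history: array [C+1, N, D] of the most recent positions.
    predict_acceleration: callable standing in for the learned d_θ.
    """
    history = list(initial_history)
    trajectory = []
    for _ in range(num_steps):
        accelerations = predict_acceleration(np.stack(history))   # [N, D]
        next_positions = euler_update(history[-1], history[-2], accelerations, dt)
        trajectory.append(next_positions)
        history = history[1:] + [next_positions]   # slide the input window
    return np.stack(trajectory)
```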
3.2. Simulation as Message-Passing on a Graph

Our learnable simulation approach adopts a particle-based representation of the physical system (see Section 2), i.e., X = (x_0, ..., x_N), where each of the N particles' x_i represents its state. Physical dynamics are approximated by interactions among the particles, e.g., exchanging energy and momentum among their neighbors. The way particle-particle interactions are modeled determines the quality and generality of a simulation method—i.e., the types of effects and materials it can simulate, in which scenarios the method performs well or poorly, etc. We are interested in learning these interactions, which should, in principle, allow learning the dynamics of any system that can be expressed as particle dynamics. So it is crucial that different θ values allow d_θ to span a wide range of particle-particle interaction functions.

Particle-based simulation can be viewed as message-passing on a graph. The nodes correspond to particles, and the edges correspond to pairwise relations among particles, over which interactions are computed. We can understand methods like SPH in this framework—the messages passed between nodes could correspond to, e.g., evaluating pressure using the density kernel.

We capitalize on the correspondence between particle-based simulators and message-passing on graphs to define a general-purpose d_θ based on GNs. Our d_θ has three steps—ENCODER, PROCESSOR, DECODER (Battaglia et al., 2018) (see Figure 2(b)).

ENCODER definition. The ENCODER : 𝒳 → 𝒢 embeds the particle-based state representation, X, as a latent graph, G^0 = ENCODER(X), where G = (V, E, u), v_i ∈ V, and e_{i,j} ∈ E (see Figure 2(b,c)). The node embeddings, v_i = ε^v(x_i), are learned functions of the particles' states. Directed edges are added to create paths between particle nodes which have some potential interaction. The edge embeddings, e_{i,j} = ε^e(r_{i,j}), are learned functions of the pairwise properties of the corresponding particles, r_{i,j}, e.g., displacement between their positions, spring constant, etc. The graph-level embedding, u, can represent global properties such as gravity and magnetic fields (though in our implementation we simply appended those as input node features—see Section 4.2 below).

PROCESSOR definition. The PROCESSOR : 𝒢 → 𝒢 computes interactions among nodes via M steps of learned message-passing, to generate a sequence of updated latent graphs, G = (G^1, ..., G^M), where G^{m+1} = GN^{m+1}(G^m) (see Figure 2(b,d)). It returns the final graph, G^M = PROCESSOR(G^0). Message-passing allows information to propagate and constraints to be respected: the number of message-passing steps required will likely scale with the complexity of the interactions.

DECODER definition. The DECODER : 𝒢 → 𝒴 extracts dynamics information from the nodes of the final latent graph, y_i = δ^v(v_i^M) (see Figure 2(b,e)). Learning δ^v should cause the Y representations to reflect relevant dynamics information, such as acceleration, in order to be semantically meaningful to the update procedure.
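The sketch below shows one interaction-network-style message-passing step of the kind the PROCESSOR stacks M times: edges are updated from their endpoint nodes, messages are aggregated at their receivers, and nodes are updated from the aggregated messages. The `edge_mlp` and `node_mlp` callables are illustrative placeholders for the learned update functions, not the paper's exact parameterization.

```python
import numpy as np

def gn_step(nodes, edges, senders, receivers, edge_mlp, node_mlp):
    """One learned message-passing step (interaction-network style).

    nodes: [N, F] latent node features v_i.
    edges: [E, F] latent edge features e_ij.
    senders, receivers: [E] int arrays defining the directed edges.
    """
    # Update each edge from its current feature and its endpoint nodes.
    edge_inputs = np.concatenate([edges, nodes[senders], nodes[receivers]], axis=-1)
    new_edges = edge_mlp(edge_inputs)

    # Sum incoming messages at each receiver node.
    aggregated = np.zeros_like(nodes)
    np.add.at(aggregated, receivers, new_edges)

    # Update each node from its own feature and its aggregated messages.
    node_inputs = np.concatenate([nodes, aggregated], axis=-1)
    new_nodes = node_mlp(node_inputs)

    # Residual connections on node and edge latents (a detail given in Section 4.2).
    return nodes + new_nodes, edges + new_edges
```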
4. Experimental Methods

Code and data are available at github.com/deepmind/deepmind-research/tree/master/learning_to_simulate.
4.1. Physical Domains

We explored how our GNS learns to simulate in datasets which contained three diverse, complex physical materials: water as a barely damped fluid, chaotic in nature; sand as a granular material with complex frictional behavior; and "goop" as a viscous, plastically deformable material. These materials have very different behavior, and in most simulators, require implementing separate material models or even entirely different simulation algorithms.

For one domain, we use Li et al. (2018)'s BOXBATH, which simulates a container of water and a cube floating inside, all represented as particles, using the PBD engine FleX (Macklin et al., 2014).

We also created WATER-3D, a high-resolution 3D water scenario with randomized water position, initial velocity and volume, comparable to Ummenhofer et al. (2020)'s containers of water. We used SPlisHSPlasH (Bender & Koschier, 2015), an SPH-based fluid simulator with strict volume preservation, to generate this dataset.

For most of our domains, we use the Taichi-MPM engine (Hu et al., 2018) to simulate a variety of challenging 2D and 3D scenarios. We chose MPM for the simulator because it can simulate a very wide range of materials, and also has some different properties than PBD and SPH, e.g., particles may become compressed over time.

Our datasets typically contained 1000 train, 100 validation and 100 test trajectories, each simulated for 300-2000 timesteps (tailored to the average duration for the various materials to come to a stable equilibrium). A detailed listing of all our datasets can be found in the Supplementary Materials B.

4.2. GNS Implementation Details

We implemented the components of the GNS framework using standard deep learning building blocks, and used standard nearest neighbor algorithms (Dong et al., 2011; Chen et al., 2009; Tang et al., 2016) to construct the graph.

Input and output representations. Each particle's input state vector represents position, a sequence of C = 5 previous velocities¹, and features that capture static material properties (e.g., water, sand, goop, rigid, boundary particle), x_i^{t_k} = [p_i^{t_k}, ṗ_i^{t_{k−C+1}}, ..., ṗ_i^{t_k}, f_i], respectively. The global properties of the system, g, include external forces and global material properties, when applicable. The prediction targets for supervised learning are the per-particle average acceleration, p̈_i. Note that in our datasets, we only require p_i vectors: the ṗ_i and p̈_i are computed from p_i using finite differences. For full details of these input and target features, see Supplementary Material Section B.

¹ C is a hyperparameter which we explore in our experiments.
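As a concrete illustration of this input encoding, the sketch below builds x_i^{t_k} from a window of raw positions, deriving the C previous velocities by finite differences. The `material_onehot` argument and the exact feature ordering are assumptions for illustration, not necessarily the layout of the released datasets.

```python
import numpy as np

def build_node_features(position_window, material_onehot, dt=1.0):
    """Builds per-particle input features from raw positions.

    position_window: [C+1, N, D] most recent positions (C = 5 in the paper).
    material_onehot: [N, F] static material features f_i (e.g., water, sand).
    Returns: [N, D + C*D + F] features [p_i, ṗ_i^(k-C+1..k), f_i].
    """
    velocities = np.diff(position_window, axis=0) / dt    # [C, N, D]
    num_particles = position_window.shape[1]
    current_position = position_window[-1]                 # [N, D]
    # Flatten the C velocities per particle into one vector.
    flat_velocities = np.transpose(velocities, (1, 0, 2)).reshape(num_particles, -1)
    return np.concatenate([current_position, flat_velocities, material_onehot], axis=-1)
```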
ENCODER details. The ENCODER constructs the graph structure G^0 by assigning a node to each particle and adding edges between particles within a "connectivity radius", R, which reflected local interactions of particles, and which was kept constant for all simulations of the same resolution. For generating rollouts, on each timestep the graph's edges were recomputed by a nearest neighbor algorithm, to reflect the current particle positions.

The ENCODER implements ε^v and ε^e as multilayer perceptrons (MLP), which encode node features and edge features into the latent vectors, v_i and e_{i,j}, of size 128.

We tested two ENCODER variants, distinguished by whether they use absolute versus relative positional information. For the absolute variant, the input to ε^v was the x_i described above, with the global features concatenated to it. The input to ε^e, i.e., r_{i,j}, did not actually carry any information and was discarded, with the e^0_{i,j} in G^0 set to a trainable fixed bias vector. The relative ENCODER variant was designed to impose an inductive bias of invariance to absolute spatial location. The ε^v was forced to ignore p_i information within x_i by masking it out. The ε^e was provided with the relative positional displacement, and its magnitude², r_{i,j} = [(p_i − p_j), ‖p_i − p_j‖]. Both variants concatenated the global properties g onto each x_i before passing it to ε^v.

² Similarly, relative velocities could be used to enforce invariance to inertial frames of reference.
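A minimal way to build the edges within the connectivity radius R is a k-d tree query, sketched below with SciPy. The paper's released code uses standard nearest-neighbor algorithms; this particular routine is our illustrative stand-in.

```python
import numpy as np
from scipy.spatial import cKDTree

def radius_graph(positions, radius):
    """Directed sender/receiver indices for all particle pairs within R."""
    tree = cKDTree(positions)
    pairs = np.array(list(tree.query_pairs(radius)))   # unique undirected pairs
    if pairs.size == 0:
        return np.empty(0, dtype=int), np.empty(0, dtype=int)
    # Make edges directed in both directions, for a symmetric neighborhood.
    senders = np.concatenate([pairs[:, 0], pairs[:, 1]])
    receivers = np.concatenate([pairs[:, 1], pairs[:, 0]])
    return senders, receivers

def relative_edge_features(positions, senders, receivers):
    """Relative edge features r_ij = [(p_i - p_j), ||p_i - p_j||]."""
    displacement = positions[senders] - positions[receivers]
    distance = np.linalg.norm(displacement, axis=-1, keepdims=True)
    return np.concatenate([displacement, distance], axis=-1)
```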
PROCESSOR details. Our processor uses a stack of M GNs (where M is a hyperparameter) with identical structure, MLPs as internal edge and node update functions, and either shared or unshared parameters (as analyzed in Results Section 5.4). We use GNs without global features or global updates (similar to an interaction network)³, and with residual connections between the input and output latent node and edge attributes.

DECODER details. Our decoder's learned function, δ^v, is an MLP. After the DECODER, the future position and velocity are updated using an Euler integrator, so the y_i corresponds to accelerations, p̈_i, with 2D or 3D dimension, depending on the physical domain. As mentioned above, the supervised training outputs were simply these p̈_i vectors⁴.

³ In preliminary experiments we also attempted using a PROCESSOR with a full GN and a global latent state, for which the global features g are encoded with a separate ε^g MLP.

⁴ Note that in this case optimizing for acceleration is equivalent to optimizing for position, because the acceleration is computed as a first-order finite difference from the position and we use an Euler integrator to update the position.

Neural network parameterizations. All MLPs have two hidden layers (with ReLU activations), followed by a non-activated output layer, each layer with size of 128. All MLPs (except the output decoder) are followed by a LayerNorm (Ba et al., 2016) layer, which we generally found improved training stability.
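For completeness, here is a plain-NumPy sketch of this MLP shape: two ReLU hidden layers and a non-activated output layer of width 128, followed by LayerNorm everywhere except at the decoder output. The weight-handling convention is an illustrative assumption.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each feature vector to zero mean, unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def mlp_forward(x, weights, biases, apply_layer_norm=True):
    """Two ReLU hidden layers + linear output, all of width 128."""
    h = np.maximum(0.0, x @ weights[0] + biases[0])
    h = np.maximum(0.0, h @ weights[1] + biases[1])
    out = h @ weights[2] + biases[2]    # non-activated output layer
    return layer_norm(out) if apply_layer_norm else out
```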
Domain              N      K     One-step MSE (×10⁻⁹)   Rollout MSE (×10⁻³)
WATER-3D (SPH)      13k    800    8.66                   10.1
SAND-3D             20k    350    1.42                    0.554
GOOP-3D             14k    300    1.32                    0.618
WATER-3D-S (SPH)    5.8k   800    9.66                    9.52
BOXBATH (PBD)       1k     150   54.5                     4.2
WATER               1.9k   1000   2.82                   17.4
SAND                2k     320    6.23                    2.37
GOOP                1.9k   400    2.91                    1.89
MULTIMATERIAL       2k     1000   1.81                   16.9
FLUIDSHAKE          1.3k   2000   2.1                    20.1
WATERDROP           1k     1000   1.52                    7.01
WATERDROP-XL        7.1k   1000   1.23                   14.9
WATERRAMPS          2.3k   600    4.91                   11.6
SANDRAMPS           3.3k   400    2.77                    2.07
RANDOMFLOOR         3.4k   600    2.77                    6.72
CONTINUOUS          4.3k   400    2.06                    1.06

Table 1. List of maximum number of particles N, sequence length K, and quantitative model accuracy (MSE) on the held-out test set. All domain names are also hyperlinks to the video website.

4.3. Training

Software. We implemented our models using TensorFlow 1, Sonnet 1, and the "Graph Nets" library (2018).

Training noise. Modeling a complex and chaotic simulation system requires the model to mitigate error accumulation over long rollouts. Because we train our models on ground-truth one-step data, they are never presented with input data corrupted by this sort of accumulated noise. This means that when we generate a rollout by feeding the model with its own noisy, previous predictions as input, the fact that its inputs are outside the training distribution may lead it to make more substantial errors, and thus rapidly accumulate further error. We use a simple approach to make the model more robust to noisy inputs: at training time we corrupt the input velocities of the model with random-walk noise 𝒩(0, σ_v = 0.0003) (adjusting input positions accordingly), so the training distribution is closer to the distribution generated during rollouts. See Supplementary Materials B for full details.
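A sketch of this random-walk velocity corruption, under our reading of the description above: independent Gaussian steps are accumulated across the input velocities so noise grows along the history, and positions are re-integrated to stay consistent with the corrupted velocities. The per-step scaling is an assumption; see the paper's Supplementary Materials B for the precise scheme.

```python
import numpy as np

def corrupt_with_random_walk_noise(position_window, noise_std=3e-4, dt=1.0, rng=None):
    """Applies random-walk noise to the finite-difference velocities of a
    position window [C+1, N, D], then rebuilds positions to match."""
    rng = rng or np.random.default_rng()
    velocities = np.diff(position_window, axis=0) / dt     # [C, N, D]
    num_steps = velocities.shape[0]
    # Independent steps accumulated over time form a random walk; each step is
    # scaled so the final accumulated noise has std ~ noise_std (assumption).
    steps = rng.normal(0.0, noise_std / np.sqrt(num_steps), velocities.shape)
    noisy_velocities = velocities + np.cumsum(steps, axis=0)
    # Re-integrate positions from the earliest position and noisy velocities.
    noisy_positions = np.concatenate(
        [position_window[:1],
         position_window[:1] + np.cumsum(noisy_velocities * dt, axis=0)], axis=0)
    return noisy_positions
```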
Normalization. We normalize all input and target vectors elementwise to zero mean and unit variance, using statistics computed online during training. Preliminary experiments showed that normalization led to faster training, though converged performance was not noticeably improved.
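One standard way to compute such statistics online is Welford-style accumulation, sketched below; this is our illustrative assumption, not necessarily the exact accumulator in the released code.

```python
import numpy as np

class OnlineNormalizer:
    """Tracks elementwise mean/variance online (Welford's algorithm) and
    normalizes vectors to zero mean, unit variance."""

    def __init__(self, size, eps=1e-8):
        self.count = 0
        self.mean = np.zeros(size)
        self.m2 = np.zeros(size)    # sum of squared deviations
        self.eps = eps

    def update(self, batch):        # batch: [B, size]
        for x in batch:
            self.count += 1
            delta = x - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        std = np.sqrt(self.m2 / max(self.count - 1, 1)) + self.eps
        return (x - self.mean) / std
```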
Loss function and optimization procedures. We randomly sampled particle state pairs (x_i^{t_k}, x_i^{t_{k+1}}) from training trajectories, calculated target accelerations p̈_i^{t_k} (subtracting the noise added to the most recent input velocity), and computed the L2 loss on the predicted per-particle accelerations, i.e., L(x_i^{t_k}, x_i^{t_{k+1}}; θ) = ‖d_θ(x_i^{t_k}) − p̈_i^{t_k}‖². We optimized the model parameters θ over this loss with the Adam optimizer (Kingma & Ba, 2014), using a nominal⁵ mini-batch size of 2. We performed a maximum of 20M gradient update steps, with exponential learning rate decay from 10⁻⁴ to 10⁻⁶. While models can train in significantly fewer steps, we avoided aggressive learning rates to reduce variance across datasets and make comparisons across settings more fair.

⁵ The actual batch size varies at each step dynamically. See Supplementary Material for more details.
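A one-step training loss of this form, sketched with the helper conventions used earlier (the names are our own, and the noise correction on the target is omitted for brevity):

```python
import numpy as np

def one_step_loss(position_window, next_position, predict_acceleration, dt=1.0):
    """L2 loss between predicted and finite-difference target accelerations.

    position_window: [C+1, N, D]; next_position: [N, D] ground truth.
    """
    p_prev, p_curr = position_window[-2], position_window[-1]
    # Second-order central finite difference gives the target acceleration,
    # consistent with the Euler integrator used at rollout time.
    target_acceleration = (next_position - 2.0 * p_curr + p_prev) / (dt ** 2)
    predicted = predict_acceleration(position_window)    # stand-in for d_θ
    return np.mean(np.sum((predicted - target_acceleration) ** 2, axis=-1))
```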
We evaluated our models regularly during training by producing full-length rollouts on 5 held-out validation trajectories, and recorded the associated model parameters for best rollout MSE. We stopped training when we observed negligible decrease in MSE, which, on GPU/TPU hardware, was typically within a few hours for smaller, simpler datasets, and up to a week for the larger, more complex datasets.

4.4. Evaluation

To report quantitative results, we evaluated our models after training converged by computing one-step and rollout metrics on held-out test trajectories, drawn from the same distribution of initial conditions used for training. We used particle-wise MSE as our main metric between ground truth and predicted data, both for rollout and one-step predictions, averaging across time, particle and spatial axes. We also investigated distributional metrics including optimal transport (OT) (Villani, 2003) (approximated by the Sinkhorn Algorithm (Cuturi, 2013)), and Maximum Mean Discrepancy (MMD) (Gretton et al., 2012). For the generalization experiments we also evaluate our models on a number of initial conditions drawn from distributions different from those seen during training, including different numbers of particles, different object shapes, different numbers of objects, different initial positions and velocities, and longer trajectories. See Supplementary Materials B for full details on metrics and evaluation.
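For a full rollout, the particle-wise MSE described above reduces to an average over time, particle, and spatial axes; a minimal sketch:

```python
import numpy as np

def rollout_mse(predicted_rollout, ground_truth_rollout):
    """Particle-wise MSE averaged across time, particle and spatial axes.
    Both arrays: [T, N, D]."""
    return float(np.mean((predicted_rollout - ground_truth_rollout) ** 2))
```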
settings. In Section 5.5 below, we compare our GNS model
5. Results

Our main findings are that our GNS model can learn accurate, high-resolution, long-term simulations of different fluids, deformables, and rigid solids, and it can generalize well beyond training to much longer, larger, and challenging settings. In Section 5.5 below, we compare our GNS model to two recent, related approaches, and find our approach was simpler, more generally applicable, and more accurate.
Figure 3. We can simulate many materials, from (a) GOOP over (b) WATER to (c) SAND, and (d) their interaction with rigid obstacles (WATERRAMPS). We can even train a single model on (e) multiple materials and their interaction (MULTIMATERIAL). We applied pre-trained models on several out-of-distribution tasks, involving (f) high-res turbulence (trained on WATERRAMPS), (g) multi-material interactions with unseen objects (trained on MULTIMATERIAL), and (h) generalizing on significantly larger domains (trained on WATERRAMPS). In the two bottom rows, we show a comparison of our model's prediction with the ground truth on the final frame for goop and sand, and on a representative mid-trajectory frame for water.

To challenge the robustness of our architecture, we used a single set of model hyperparameters for training across all of our experiments. Our GNS architecture used the relative ENCODER variant, 10 steps of message-passing, with unshared GN parameters in the PROCESSOR. We applied noise with a scale of 3·10⁻⁴ to the input states during training.

5.1. Simulating Complex Materials

Our GNS model was very effective at learning to simulate different complex materials. Table 1 shows the one-step and rollout accuracy, as MSE, for all experiments. For intuition about what these numbers mean, the edge length of the container was approximately 1.0, and Figure 3(a-c) shows rendered images of the rollouts of our model, compared to ground truth⁶. Visually, the model's rollouts are quite plausible. Though specific model-generated trajectories can be distinguished from ground truth when compared side-by-side, it is difficult to visually classify individual videos as generated from our model versus the ground truth simulator.

Our GNS model scales to large numbers of particles and very long rollouts. With up to 19k particles in our 3D domains—substantially greater than demonstrated in previous methods—GNS can operate at resolutions high enough for practical prediction tasks and high-quality 3D renderings (e.g., Figure 1). And although our models were trained to make one-step predictions, the long-term trajectories remain plausible even over thousands of rollout timesteps.

The GNS model could also learn how the materials respond to unpredictable external forces. In the FLUIDSHAKE domain, a container filled with water is being moved side-to-side, causing splashes and irregular waves.

Our model could also simulate fluid interacting with complicated static obstacles, as demonstrated by our WATERRAMPS and SANDRAMPS domains in which water or sand pour over 1-5 obstacles. Figure 3(d) depicts comparisons between our model and ground truth, and Table 1 shows quantitative performance measures.

We also trained our model on continuously varying material parameters. In the CONTINUOUS domain, we varied the friction angle of a granular material, to yield behavior similar to a liquid (0°), sand (45°), or gravel (> 60°). Our results and videos show that our model can account for these continuous variations, and even interpolate between them: a model trained with the region [30°, 55°] held out in training can accurately predict within that range. Additional quantitative results are available in Supplementary Materials C.

⁶ All rollout videos can be found here: https://sites.google.com/view/learning-to-simulate

5.2. Multiple Interacting Materials

So far we have reported results of training identical GNS architectures separately on different systems and materials. However, we found we could go a step further and train a single architecture with a single set of parameters to simulate all of our different materials, interacting with each other in a single system.
means the model had to effectively learn the product space
In our MULTIMATERIAL domain, the different materials could interact with each other in complex ways, which means the model had to effectively learn the product space of different interactions (e.g., water-water, sand-sand, water-sand, etc.). The behavior of these systems was often much richer than the single-material domains: the stiffer materials, such as sand and goop, could form temporary semi-rigid obstacles, which the water would then flow around. Figure 3(e) and this video show renderings of such rollouts. Visually, our model's performance in MULTIMATERIAL is comparable to its performance when trained on those materials individually.

5.3. Generalization

We found that the GNS generalizes well even beyond its training distributions, which suggests it learns a more general-purpose understanding of the materials and physical processes experienced during training.

To examine its capacity for generalization, we trained a GNS architecture on WATERRAMPS, whose initial conditions involved a square region of water in a container, with 1-5 ramps of random orientation and location. After training, we tested the model on several very different settings. In one generalization condition, rather than all water being present in the initial timestep, we created an "inflow" that continuously added water particles to the scene during the rollout, as shown in Figure 3(f). When unrolled for 2500 time steps, the scene contained 28k particles—an order of magnitude more than the 2.5k particles used in training—and the model was able to predict complex, highly chaotic dynamics not experienced during training, as can be seen in this video. The predicted dynamics were visually similar to the ground truth sequence.

Because we used relative displacements between particles as input to our model, in principle the model should handle scenes with much larger spatial extent at test time. We evaluated this on a much larger domain, with several inflows over a complicated arrangement of slides and ramps (see Figure 3(h), video here). The test domain's spatial width × height were 8.0 × 4.0, which was 32x larger than the training domain's area; at the end of the rollout, the number of particles was 85k, which was 34x more than during training; we unrolled the model for 5000 steps, which was 8x longer than the training trajectories. We conducted a similar experiment with sand on the SANDRAMPS domain, testing model generalization to hourglass-shaped ramps.

As a final, extreme test of generalization, we applied a model trained on MULTIMATERIAL to a custom test domain with inflows of various materials and shapes (Figure 3(g)). The model learned about frictional behavior between different materials (sand on sticky goop, versus slippery floor), and generalized well to unseen shapes, such as hippo-shaped chunks of goop and water falling from mid-air, as can be observed in this video.

5.4. Key Architectural Choices

We performed a comprehensive analysis of our GNS's architectural choices to discover what influenced performance most heavily. We analyzed a number of hyperparameter choices—e.g., number of MLP layers, linear encoder and decoder functions, global latent state in the PROCESSOR—but found these had minimal impact on performance (see Supplementary Materials C for details).

While our GNS model was generally robust to architectural and hyperparameter settings, we also identified several factors which had more substantial impact:
1. the number of message-passing steps,
2. shared vs. unshared PROCESSOR GN parameters,
3. the connectivity radius,
4. the scale of noise added to the inputs during training,
5. relative vs. absolute ENCODER.
We varied these choices systematically for each axis, fixing all other axes with the default architecture's choices, and report their impact on model performance in the GOOP domain (Figure 4).

For (1), Figure 4(a,b) shows that a greater number of message-passing steps M yielded improved performance in both one-step and rollout accuracy. This is likely because increasing M allows computing longer-range, and more complex, interactions among particles. Because computation time scales linearly with M, in practice it is advisable to use the smallest M that still provides desired performance.

For (2), Figure 4(c,d) shows that models with unshared GN parameters in the PROCESSOR yield better accuracy, especially for rollouts. Shared parameters impose a strong inductive bias that makes the PROCESSOR analogous to a recurrent model, while unshared parameters are more analogous to a deep architecture, which incurs M times more parameters. In practice, we found marginal differences in computational costs and overfitting, so we conclude that using unshared parameters has little downside.

For (3), Figure 4(e,f) shows that greater connectivity R values yield lower error. Similar to increasing M, larger neighborhoods allow longer-range communication among nodes. Since the number of edges increases with R, more computation and memory is required, so in practice the minimal R that gives desired performance should be used.

For (4), we observed that rollout accuracy is best for an intermediate noise scale (see Figure 4(g,h)), consistent with our motivation for using it (see Section 4.3). We also note that one-step accuracy decreases with increasing noise scale. This is not surprising: adding noise makes the training distribution less similar to the uncorrupted distribution used for one-step evaluation.
Figure 4. (left) Effect of different ablations (grey) against our model (red) on the one-step error (a,c,e,g,i) and the rollout error (b,d,f,h,j), for the number of message-passing steps, shared vs. unshared GNs, connectivity radius, noise std, and use of relative positions. Bars show the median seed performance averaged across the entire GOOP test dataset. Error bars display lower and higher quartiles, and are shown for the default parameters. (right) Comparison of average performance of our GNS model to CConv. (k,l) Qualitative comparison between GNS (k) and CConv (l) in BOXBATH after 50 rollout steps (video link). (m) Quantitative comparison of our GNS model (red) to the CConv model (grey) across the test set. For our model, we trained one or more seeds using the same set of hyper-parameters and show results for all seeds. For the CConv model we ran several variations including different radius sizes, noise levels, and number of unroll steps during training, and show the result for the best seed. Error bars show the standard error of the mean across all of the trajectories in the test set (95% confidence level).

For (5), Figure 4(i,j) shows that the relative ENCODER is clearly better than the absolute version. This is likely because the underlying physical processes that are being learned are invariant to spatial position, and the relative ENCODER's inductive bias is consistent with this invariance.

5.5. Comparisons to Previous Models

We compared our approach to two recent papers which explored learned fluid simulators using particle-based approaches. Li et al. (2018)'s DPI studied four datasets of fluid, deformable, and solid simulations, and presented four different, distinct architectures, which were similar to Sanchez-Gonzalez et al. (2018)'s, with additional features such as hierarchical latent nodes. When training our GNS model on DPI's BOXBATH domain, we found it could learn to simulate the rigid solid box floating in water, faithfully maintaining the stiff relative displacements among rigid particles, as shown in Figure 4(k) and this video. Our GNS model did not require any modification—the box particles' material type was simply a feature in the input vector—while DPI required a specialized hierarchical mechanism and forced all box particles to preserve their relative displacements with each other. Presumably the relative ENCODER and training noise alleviated the need for such mechanisms.

Ummenhofer et al. (2020)'s CConv propagates information across particles⁷, and uses particle update functions and training procedures which are carefully tailored to modeling fluid dynamics (e.g., an SPH-like local kernel, different sub-networks for fluid and boundary particles, a loss function that weights slow particles with few neighbors more heavily). Ummenhofer et al. (2020) reported CConv outperformed DPI, so we quantitatively compared our GNS model to CConv. We implemented CConv as described in its paper, plus two additional versions which borrowed our noise and multiple input states, and performed hyperparameter sweeps over various CConv parameters. Figure 4(m) shows that across all six domains we tested, our GNS model with default hyperparameters has better rollout accuracy than the best CConv model (among the different versions and hyperparameters) for that domain. In this comparison video, we observe that CConv performs well for domains like water, which it was built for, but struggles with some of our more complex materials. Similarly, in a CConv rollout of the BOXBATH domain the rigid box loses its shape (Figure 4(l)), while our method preserves it. See Supplementary Materials D for full details of our DPI and CConv comparisons.

⁷ The authors state CConv does not use an explicit graph representation, however we believe their particle update scheme can be interpreted as a special type of message-passing on a graph. See Supplementary Materials D.

6. Conclusion

We presented a powerful machine learning framework for learning to simulate complex systems, based on particle-based representations of physics and learned message-passing on graphs. Our experimental results show our single GNS architecture can learn to simulate the dynamics of fluids, rigid solids, and deformable materials, interacting with one another, using tens of thousands of particles over thousands of time steps. We find our model is simpler, more accurate, and has better generalization than previous approaches.
While here we focus on mesh-free particle methods, our GNS approach may also be applicable to data represented using meshes, such as finite-element methods. There are also natural ways to incorporate stronger, generic physical knowledge into our framework, such as Hamiltonian mechanics (Sanchez-Gonzalez et al., 2019) and rich, architecturally imposed symmetries. To realize advantages over traditional simulators, future work should explore how to parameterize and implement GNS computations more efficiently, and exploit the ever-improving parallel compute hardware. Learned, differentiable simulators will be valuable for solving inverse problems, by not strictly optimizing for forward prediction, but for inverse objectives as well.

More broadly, this work is a key advance toward more sophisticated generative models, and furnishes the modern AI toolkit with a greater capacity for physical reasoning.

Acknowledgements

We thank Victor Bapst, Jessica Hamrick and our reviewers for valuable feedback on the work and manuscript, and we thank Benjamin Ummenhofer for advice on implementing the continuous convolution baseline model.

References

Ba, J. L., Kiros, J. R., and Hinton, G. E. Layer normalization. arXiv preprint arXiv:1607.06450, 2016.

Battaglia, P., Pascanu, R., Lai, M., Rezende, D. J., et al. Interaction networks for learning about objects, relations and physics. In Advances in Neural Information Processing Systems, pp. 4502–4510, 2016.

Battaglia, P. W., Hamrick, J. B., Bapst, V., Sanchez-Gonzalez, A., Zambaldi, V., Malinowski, M., Tacchetti, A., Raposo, D., Santoro, A., Faulkner, R., et al. Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261, 2018.

Bender, J. and Koschier, D. Divergence-free smoothed particle hydrodynamics. In Proceedings of the 2015 ACM SIGGRAPH/Eurographics Symposium on Computer Animation. ACM, 2015. doi: 10.1145/2786784.2786796.

Chang, M. B., Ullman, T., Torralba, A., and Tenenbaum, J. B. A compositional object-based approach to learning physical dynamics. arXiv preprint arXiv:1612.00341, 2016.

Chen, J., Fang, H.-r., and Saad, Y. Fast approximate kNN graph construction for high dimensional data via recursive Lanczos bisection. Journal of Machine Learning Research, 10(Sep):1989–2012, 2009.

Cuturi, M. Sinkhorn distances: Lightspeed computation of optimal transportation distances, 2013.

Graph Nets Library. DeepMind, 2018. URL https://github.com/deepmind/graph_nets.

Dong, W., Moses, C., and Li, K. Efficient k-nearest neighbor graph construction for generic similarity measures. In Proceedings of the 20th International Conference on World Wide Web, pp. 577–586, 2011.

Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., and Dahl, G. E. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 1263–1272. JMLR.org, 2017.

Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., and Smola, A. A kernel two-sample test. Journal of Machine Learning Research, 13(Mar):723–773, 2012.

Grzeszczuk, R., Terzopoulos, D., and Hinton, G. Neuroanimator: Fast neural network emulation and control of physics-based models. In Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, pp. 9–20, 1998.

He, S., Li, Y., Feng, Y., Ho, S., Ravanbakhsh, S., Chen, W., and Póczos, B. Learning to predict the cosmological structure formation. Proceedings of the National Academy of Sciences, 116(28):13825–13832, 2019.

Holl, P., Koltun, V., and Thuerey, N. Learning to control PDEs with differentiable physics. arXiv preprint arXiv:2001.07457, 2020.

Hu, Y., Fang, Y., Ge, Z., Qu, Z., Zhu, Y., Pradhana, A., and Jiang, C. A moving least squares material point method with displacement discontinuity and two-way rigid body coupling. ACM Trans. Graph., 37(4), July 2018.

Hu, Y., Anderson, L., Li, T.-M., Sun, Q., Carr, N., Ragan-Kelley, J., and Durand, F. DiffTaichi: Differentiable programming for physical simulation. arXiv preprint arXiv:1910.00935, 2019.

Kingma, D. P. and Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

Kipf, T. N. and Welling, M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.

Kumar, S., Bitorff, V., Chen, D., Chou, C., Hechtman, B., Lee, H., Kumar, N., Mattson, P., Wang, S., Wang, T., et al. Scale MLPerf-0.6 models on Google TPU-v3 pods. arXiv preprint arXiv:1909.09756, 2019.

Ladickỳ, L., Jeong, S., Solenthaler, B., Pollefeys, M., and Gross, M. Data-driven fluid simulations using regression forests. ACM Transactions on Graphics (TOG), 34(6):1–9, 2015.

Li, Y., Wu, J., Tedrake, R., Tenenbaum, J. B., and Torralba, A. Learning particle dynamics for manipulating rigid bodies, deformable objects, and fluids. arXiv preprint arXiv:1810.01566, 2018.

Li, Y., Wu, J., Zhu, J.-Y., Tenenbaum, J. B., Torralba, A., and Tedrake, R. Propagation networks for model-based control under partial observation. In 2019 International Conference on Robotics and Automation (ICRA), pp. 1205–1211. IEEE, 2019.

Macklin, M., Müller, M., Chentanez, N., and Kim, T.-Y. Unified particle physics for real-time applications. ACM Transactions on Graphics (TOG), 33(4):1–12, 2014.

Manessi, F., Rozza, A., and Manzo, M. Dynamic graph convolutional networks. Pattern Recognition, 97:107000, 2020.

Monaghan, J. J. Smoothed particle hydrodynamics. Annual Review of Astronomy and Astrophysics, 30(1):543–574, 1992.

Mrowca, D., Zhuang, C., Wang, E., Haber, N., Fei-Fei, L. F., Tenenbaum, J., and Yamins, D. L. Flexible neural representation for physics prediction. In Advances in Neural Information Processing Systems, pp. 8799–8810, 2018.

Müller, M., Heidelberger, B., Hennix, M., and Ratcliff, J. Position based dynamics. Journal of Visual Communication and Image Representation, 18(2):109–118, 2007.

Sanchez-Gonzalez, A., Heess, N., Springenberg, J. T., Merel, J., Riedmiller, M., Hadsell, R., and Battaglia, P. Graph networks as learnable physics engines for inference and control. arXiv preprint arXiv:1806.01242, 2018.

Sanchez-Gonzalez, A., Bapst, V., Cranmer, K., and Battaglia, P. Hamiltonian graph networks with ODE integrators. arXiv preprint arXiv:1909.12790, 2019.

Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., and Monfardini, G. The graph neural network model. IEEE Transactions on Neural Networks, 20(1):61–80, 2008.

Schoenholz, S. S. and Cubuk, E. D. JAX, MD: End-to-end differentiable, hardware accelerated, molecular dynamics in pure Python. arXiv preprint arXiv:1912.04232, 2019.

Sulsky, D., Zhou, S.-J., and Schreyer, H. L. Application of a particle-in-cell method to solid mechanics. Computer Physics Communications, 87(1-2):236–252, 1995.

Sun, C., Karlsson, P., Wu, J., Tenenbaum, J. B., and Murphy, K. Stochastic prediction of multi-agent interactions from partial observations. arXiv preprint arXiv:1902.09641, 2019.

Tacchetti, A., Song, H. F., Mediano, P. A., Zambaldi, V., Rabinowitz, N. C., Graepel, T., Botvinick, M., and Battaglia, P. W. Relational forward models for multi-agent learning. arXiv preprint arXiv:1809.11044, 2018.

Tang, J., Liu, J., Zhang, M., and Mei, Q. Visualizing large-scale and high-dimensional data. In Proceedings of the 25th International Conference on World Wide Web, pp. 287–297, 2016.

Trivedi, R., Dai, H., Wang, Y., and Song, L. Know-Evolve: Deep temporal reasoning for dynamic knowledge graphs. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 3462–3471. JMLR.org, 2017.

Trivedi, R., Farajtabar, M., Biswal, P., and Zha, H. DyRep: Learning representations over dynamic graphs. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=HyePrhR5KX.

Ummenhofer, B., Prantl, L., Thürey, N., and Koltun, V. Lagrangian fluid simulation with continuous convolutions. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=B1lDoJSYDH.

Veličković, P., Ying, R., Padovano, M., Hadsell, R., and Blundell, C. Neural execution of graph algorithms. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=SkgKO0EtvS.

Villani, C. Topics in Optimal Transportation. American Mathematical Soc., 2003.

Wiewel, S., Becher, M., and Thuerey, N. Latent space physics: Towards learning the temporal evolution of fluid flow. In Computer Graphics Forum, pp. 71–82. Wiley Online Library, 2019.

Yan, S., Xiong, Y., and Lin, D. Spatial temporal graph convolutional networks for skeleton-based action recognition. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
