Du and He - 2023 - Neural-Integrated Meshfree (NIM) Method: A Differentiable Programming-Based Hybrid Solver for Computational Mechanics
Abstract
While deep learning and data-driven modeling approaches based on deep neu-
ral networks (DNNs) have recently attracted increasing attention for solving
partial differential equations, their practical application to real-world scientific
and engineering problems remains limited due to the relatively low accuracy
and high computational cost. In this study, we present the neural-integrated
meshfree (NIM) method, a differentiable programming-based hybrid meshfree
approach within the field of computational mechanics. NIM seamlessly inte-
grates traditional physics-based meshfree discretization techniques with deep
learning architectures. It employs a hybrid approximation scheme, NeuroPU, to
effectively represent the solution by combining continuous DNN representations
with partition of unity (PU) basis functions associated with the underlying
spatial discretization. This neural-numerical hybridization not only enhances
the solution representation through functional space decomposition but also
reduces both the size of the DNN model and the need for spatial gradient computations based on automatic differentiation, leading to a significant improvement in
training efficiency. Under the NIM framework, we propose two truly meshfree
solvers: the strong form-based NIM (S-NIM) and the local variational form-
based NIM (V-NIM). In the S-NIM solver, the strong-form governing equation is
directly considered in the loss function, while the V-NIM solver employs a local
Petrov-Galerkin approach that allows the construction of variational residuals
based on arbitrary overlapping subdomains. This ensures both the satisfaction
of underlying physics and the preservation of meshfree property. We perform
extensive numerical experiments on both stationary and transient benchmark
problems to assess the effectiveness of the proposed NIM methods in terms of
accuracy, scalability, generalizability, and convergence properties. Moreover,
comparative analysis with other physics-informed machine learning methods
demonstrates that NIM, especially V-NIM, significantly enhances both accuracy
and efficiency in end-to-end predictive capabilities.
∗ Corresponding author
Email address: [email protected] (QiZhi He)
1. Introduction
models are employed individually to approximate different components of the
PDE-governing systems while they can be integrated seamlessly to perform
predictive simulation. In the area of computational mechanics, exemplary
approaches such as model-free data-driven computing [48, 23, 34, 37] and the
coupled data-driven/numerical solvers [80, 35] have received extensive attention.
by the test functions. Recently, Kharazmi et al. extended the VPINN method
to hp-VPINN [46] by introducing domain decomposition, allowing localized network parameter optimization and improved training accuracy.
functions. This new framework is coined the neural-integrated meshfree (NIM)
method.
In this work, without loss of generality, we chose to employ the reproducing
kernel (RK) meshfree shape functions [56, 16] in the NeuroPU approximation
since the RK shape functions, constructed based on spatially distributed nodes over the physical domain, offer the flexibility to design an arbitrary order of accuracy, smoothness, and compactness. The motivation for introducing NeuroPU for solu-
tion approximation is to regulate the solution representation by well-established
PU basis functions and mitigate the need for intricate computations of high-order
gradients that usually rely on AD, ultimately improving training efficiency. On
the other hand, NeuroPU leverages embedded neural networks to represent
the functional space related to problem-related parameters, enabling effective
surrogate modeling of parameterized or time-dependent problems.
In this study, we will present two NIM solvers for computational mechanics
modeling: strong form-based NIM (S-NIM) and local variational form-based
NIM (V-NIM). Notably, both of these solvers are genuinely meshfree, elim-
inating the need for high-cost conforming mesh generation. Like the PINN
methods [65, 44, 36, 19], the former one considers the strong-form governing
equations in the associated loss function. We will demonstrate the superior
performance of S-NIM over standard PINNs due to the incorporation of Neu-
roPU approximation, which substantially reduces the DNN solution space and
improves the training efficiency and accuracy. In order to further improve the
accuracy and stability, we propose the V-NIM solver, which is inspired by the
meshless local Petrov–Galerkin (MLPG) method [2, 1] that derives the consistent weak formulation over local subdomains. Distinct from other variational PINN methods [47, 46, 7], the proposed V-NIM allows for using arbitrary overlapping subdomains to formulate the loss function, thus upholding the meshfree
property. We demonstrate the outstanding performance of the proposed NIM
methods, especially V-NIM, through extensive experiments on the benchmark
examples (e.g., Poisson equation, linear elasticity, time-dependent problem, and
a parameterized PDE), where the accuracy, convergence property, generaliz-
ability, and efficiency are compared against other baseline methods. To the
best of the authors’ knowledge, this study represents the first attempt to de-
velop differentiable programming-based meshfree solvers integrated with hybrid
neuro-numerical approximation for computational mechanics modeling.
The remainder of the paper is organized as follows: Section 2 provides a back-
ground review of numerical discretization and DNNs for function approximation.
In Section 3, we delve into the construction of hybrid NeuroPU approximation.
We present the methodology development of the NIM framework based on the
NeuroPU approximation and meshfree discretization in Section 4, followed by
the detailed solution procedures of the proposed S-NIM and V-NIM methods
provided in Section 5. Numerical tests on static and time-dependent bench-
mark problems are presented in Section 6 and Section 7, respectively. Section 8
concludes the paper by summarizing the main findings and contributions.
2. Preliminaries
In this section, we first present a background review of the two fundamental
methodologies of function approximation based on numerical discretization and
deep neural networks (DNNs), as well as their applications to solving PDEs in
computational mechanics.
For demonstration, a classical linear elastostatics problem is taken as the
model problem. Let us consider an elastic solid defined in a bounded domain
Ω ⊂ Rd , where d denotes the spatial dimension, and its boundary ∂Ω ⊂ Rd−1 is
split as the essential boundary condition (EBC) on Γg and the natural boundary
condition (NBC) on Γt , i.e., ∂Ω = Γg ∪ Γt with Γg ∩ Γt = ∅. The governing PDE
describing the static equilibrium is written as:
∇ · σ(u) + f = 0,   in Ω
n · σ(u) = t̄,   on Γt        (1)
u = ū,   on Γg
where u is the displacement vector, σ is the Cauchy stress tensor, f is the body force, ū and t̄ are the displacement and traction values prescribed on Γg and Γt ,
respectively, and n is the surface normal on Γt .
For solid mechanics problems, the constitutive law that relates stress and
strain, e.g., σ = σ(ε(u)), is required to solve the boundary value problem (BVP)
in (1), where the linear strain tensor is given by
ε := ∇^sym u = (1/2)(∇u + ∇uᵀ)        (2)
For linear elastic materials, the strain-stress relation becomes
σ = C : ε        (3)
where C is the elasticity tensor.
For many engineering applications, such as inverse design, uncertainty quan-
tification, and surrogate modeling, varying model parameters are considered in
the solid mechanics analysis (1). Here, µ ∈ Rc is used to denote the parameter
vector associated with the problem-specific variables, such as material properties, loading, and boundary conditions. In this scenario, the displacement solution
is then generalized as a parameter-dependent field u(x, µ) : Ω × Rc 7→ Rd .
where Nh represents the number of nodes, dI ∈ Rd is the nodal coefficient at
location xI , and ΨI is the shape function associated with the Ith node.
While the shape functions ΨI can be defined in many different forms, they
are required to satisfy specific conditions (completeness and continuity) to ensure
the convergence property of approximate solutions. One essential condition is
the so-called partition of unity (PU) [3], which requires that the ensemble of the compact supports of the shape functions generates a covering of the domain Ω, i.e., Ω ⊂ ∪_{I=1}^{Nh} supp{ΨI }, and ∑_{I=1}^{Nh} ΨI (x) = 1 with 0 ≤ ΨI (x) ≤ 1.
For x ∈ Ω, let Sx = {I | x ∈ supp(ΨI )} be the index set of the nodal shape functions whose influence does not vanish at location x; the PU approximation (4) can then be rewritten as
u^h(x) = ∑_{I∈Sx} ΨI (x) dI        (5)
with z = ∥xI − x∥/a. In (6), p^[p](x) is a vector of monomial basis functions up to the pth order
p^[p](x) = {1, x1 , x2 , x3 , · · · , x1^i x2^j x3^k , . . . , x3^p }ᵀ,   0 ≤ i + j + k ≤ p        (8)
and the parameter vector b(x) is determined by enforcing the following pth-order reproducing conditions:
∑_{I∈Sx} ΨI (x) p^[p](xI ) = p^[p](x)        (9)
Substituting (10) into (6), we obtain the following expression for the RK shape functions:
ΨI (x) = p^[p]T(x) A^{−1}(x) p^[p](xI − x) ϕa (xI − x)        (12)
Subsequently, the nodal coefficients in Eq. (5) can be determined by solving the
Galerkin weak formulation [40] corresponding to the elasticity problem (1).
Specifically, the RK shape function with a quadratic basis (p = 2) and the cubic B-spline kernel function (7) is employed by default in the following numerical study, which will be further discussed in Sections 3 and 6. Additionally, we denote the normalized support size as ā = a/h, where h is the characteristic nodal distance.
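For illustration, the following minimal Python sketch evaluates 1D RK shape functions at a single point with the cubic B-spline kernel and a shifted monomial basis, which is one common implementation convention for the construction summarized in Eqs. (6)-(12); the node layout and function names are illustrative only.

import numpy as np

def cubic_bspline(z):
    # Cubic B-spline kernel phi_a at the normalized distance z = |x_I - x| / a
    z = abs(z)
    if z <= 0.5:
        return 2.0 / 3.0 - 4.0 * z**2 + 4.0 * z**3
    elif z <= 1.0:
        return 4.0 / 3.0 - 4.0 * z + 4.0 * z**2 - 4.0 / 3.0 * z**3
    return 0.0

def rk_shape_functions(x, nodes, a, p=2):
    # Shifted monomial basis H(s) = [1, s, ..., s^p]; moment matrix
    # A(x) = sum_I H(x_I - x) H(x_I - x)^T phi_a(x_I - x)
    H = lambda s: np.array([s**k for k in range(p + 1)])
    phi = np.array([cubic_bspline((xI - x) / a) for xI in nodes])
    A = sum(phi[I] * np.outer(H(nodes[I] - x), H(nodes[I] - x)) for I in range(len(nodes)))
    b = np.linalg.solve(A, H(0.0))      # enforces the reproducing conditions up to order p
    return np.array([b @ H(nodes[I] - x) * phi[I] for I in range(len(nodes))])

nodes = np.linspace(0.0, 1.0, 11)                   # uniform nodes with spacing h = 0.1
psi = rk_shape_functions(0.37, nodes, a=2.5 * 0.1)  # support size a = a_bar * h with a_bar = 2.5
print(psi.sum())                                    # partition of unity: prints ~1.0

Shape functions and their derivatives produced in this way can be tabulated once at all sample points before training, which is the property exploited by the NeuroPU approximation introduced in Section 3.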
2.2.1. PINNs
When using the PINN method [65, 44] for solving the elasticity problem in
Eqs. (1)-(3), a DNN model is used to approximate the displacement solution,
denoted as û(x), with the spatial coordinates x as the network inputs. The
differential operators, e.g., ∇û(x; θ), are computed by performing Automatic
Differentiation (AD) [4] with respect to the inputs. As a result, the mean
square errors (MSEs) of the governing partial differential equations (PDEs) are
subsequently incorporated into the loss function by feeding both the approximate
solution and its derivatives at the given set of collocation points. With a slight abuse of notation, the corresponding loss for the elasticity problem in (1)-(3) is given as
The parameters θ of neural networks are optimized through the minimization
of the loss function. For a more formal formulation regarding the standard
PINN [31] or the mixed-form PINN [68] for linear elasticity, we refer readers to
the following studies [65, 31, 68, 36].
Aided by DNNs, the traditional PINN methods offer a straightforward way
to approximate the solution fields by incorporating the given physical laws.
Nevertheless, the complex solution fields often necessitate a relatively large
size of neural networks to ensure sufficient approximation capacity, inevitably
leading to over-parameterized search space that poses challenges in training the
non-convex PDE-informed loss functions [33, 19, 17]. Consequently, a huge number of collocation points as well as expensive optimizer iterations are usually required to train PINN models for complex high-dimensional PDE problems [19, 50, 22]. Moreover, performing AD with respect to a large number of collocation points over the spatiotemporal domain usually takes a considerable portion of the computational time, especially when dealing with high-order PDEs.
In order to enhance the training process and accuracy within the field of
PIML, we propose a hybrid approach called neural-partition of unity (NeuroPU)
approximation, which integrates numerical discretization generated by PU shape
functions (Section 2.1) over the physical domain and the neural network-based
approximation (Section 2.2) for nodal coefficient functions. This hybrid approx-
imation method is the building block of the proposed NIM framework, which
will be further elaborated in Section 4.
Given a set of predefined nodal shape functions, the NeuroPU approximation
of the parametric solution u(x, µ) is expressed as
û^h(x, µ) = ∑_{I∈Sx} ΨI (x) d̂I (µ)        (16)
where ΨI is the PU shape function and specifically the RK shape function defined
in (6) is adopted, and Sx is the set of nodes that contribute to the interpolation
at x (see Section 2.1). In Eq. (16), the nodal coefficient function dˆI (µ) ∈ Rd is
the corresponding Ith output of a neural network Nθ which is parametrized by θ
as explained in Section 2.2. Let the collection of nodal coefficients be denoted as d̂ := {d̂I }_{I=1}^{Nh} ∈ R^{d×Nh}, which defines a mapping from the inputs µ to the discrete nodal variables d̂ through a multi-layer neural network, i.e.,
d̂ : µ ∈ R^c ↦ Nθ (µ)        (17)
Notably, the utilization of the DNN model allows the nodal coefficients to be
expressed as a function of problem-defined system parameters µ, e.g., tempo-
ral coordinates and material coefficients. Thus, the NeuroPU approximation
can be readily employed for a surrogate model for parameterized systems, as
demonstrated in Section 6.3.
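A minimal sketch of this construction is given below, assuming PyTorch, a scalar solution field, and a fully connected network; the layer sizes and node count follow settings used later in Section 6.3, and the random Psi matrix is only a placeholder for the precomputed shape-function values.

import torch
import torch.nn as nn

class NeuroPU(nn.Module):
    # Sketch of Eqs. (16)-(17): an MLP maps the parameters mu to N_h nodal coefficients,
    # which are contracted with precomputed shape-function values Psi_I(x).
    def __init__(self, n_params, n_nodes, width=40, depth=4):
        super().__init__()
        layers, in_dim = [], n_params
        for _ in range(depth):
            layers += [nn.Linear(in_dim, width), nn.Tanh()]
            in_dim = width
        layers.append(nn.Linear(in_dim, n_nodes))   # one coefficient per node (scalar field)
        self.net = nn.Sequential(*layers)

    def forward(self, mu, Psi):
        d_hat = self.net(mu)         # (batch, N_h) nodal coefficient functions d_I(mu)
        return d_hat @ Psi.T         # (batch, n_points) field values u_h(x, mu), Eq. (16)

model = NeuroPU(n_params=2, n_nodes=441)            # e.g., inputs (mu1, mu2), 441 nodes
Psi = torch.rand(1000, 441)                         # placeholder for Psi_I(x_i) at 1000 points
u_h = model(torch.tensor([[10.0, 10.0]]), Psi)      # prediction for one parameter sample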
Due to the compact support of the shape functions, only the few entries of ΨI and d̂I associated with the index set Sx are active in the dot product in Eq. (16). This results in a sparse structure that can streamline matrix
calculations and save storage. Besides, the hybrid NeuroPU approximation
offers an efficient computation of all the space-dependent derivative terms. To
wit, since spatial coordinates are only involved via the shape functions in the
NeuroPU construction, the spatial gradient of û^h yields
∇û^h(x, µ) = ∑_{I∈Sx} ∇ΨI (x) d̂I (µ)        (18)
In this setting, the derivatives of shape functions, ∇ΨI , can be pre-computed and
stored in advance for the subsequent derivation of ∇ûh when establishing the loss
function. This is essentially distinct from the PINN method where the differential
and gradient operators in governing equations are computed through automatic
differentiation during training. Furthermore, in contrast to the studies [53, 69],
where shape functions are implicitly encoded using architectural or hierarchical
neural networks, the NeuroPU approximation explicitly expresses these shape
functions, which preserves the necessary simplicity and compatibility for scalable
implementation, akin to classical numerical methods.
Remark 3.1. It is noted that while the reproducing kernel (RK) approximation
is adopted as the PU shape functions in Eq. (16), other types of PU shape functions, e.g., Lagrange polynomial basis [18, 40], spectral series [64], and NURBS [41], are also applicable to the proposed framework. However, to ensure truly meshfree properties and avoid tedious mesh generation, we adopt meshfree-type shape functions that are defined on the physical spatial domain and allow overlapping compact supports.
Remark 3.2. The proposed NeuroPU approach offers the flexibility to utilize
various parameter inputs µ and neural network architectures for nodal coefficient
functions dˆI (µ), depending on the specific problem at hand. For example, when
dealing with the temporal variable as input, fully connected neural networks (FCNN)
provide a continuous representation that captures temporal evolution, while
convolutional neural networks (CNN) can be employed to process the image
inputs, obtaining nodal coefficient functions with discrete representations.
where the matrix forms of û^h, σ̂^h and t̂^h are provided in Appendix A. The superscript h denotes the variables involving the NeuroPU approximation. Nf represents the number of residual points Sf = {xi }_{i=1}^{Nf} sampled over the computation domain Ω (denoted by blue stars in Figure 2a), and Nt and Ng are the numbers of sampling points St = {xi }_{i=1}^{Nt} and Sg = {xi }_{i=1}^{Ng} whereby the EBC on Γg and
parameterized by the parameter vector µ, and Nµ denotes the number of sample points in the parameter set, i.e., Sµ = {µi }_{i=1}^{Nµ}.
In Eq. (19), the weight coefficients α1 and α2 are used to penalize the loss terms associated with the boundary conditions. It has been reported that these weights are critical to the convergence rate of the loss function and the accuracy of the approximate solution [36, 77]. While the loss function of S-NIM resembles that of the standard PINN method for elasticity problems [31, 68], distinct approximation functions are used. Our numerical studies in Section 6 will demonstrate that the introduction of the NeuroPU approximation in S-NIM significantly boosts both the training efficiency and accuracy, owing to the reduced dimensionality of the approximation space and the efficient, high-order accurate spatial gradients provided by the RK shape functions.
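As an illustration of this structure (a sketch only, not the exact form of Eq. (19)), consider a scalar Poisson problem of the form −∆u = f, one of the benchmarks in Section 6, with precomputed shape-function Laplacians and a penalized essential boundary term; the function and variable names are illustrative.

import torch

def s_nim_style_loss(d_hat, lap_Psi_f, f_vals, Psi_g, g_vals, alpha=100.0):
    # lap_Psi_f: (N_f, N_h) precomputed Laplacian of the shape functions at the residual points
    # Psi_g:     (N_g, N_h) shape functions at the essential-boundary points
    pde_residual = -(lap_Psi_f @ d_hat) - f_vals    # strong-form residual at the N_f points
    bc_residual = Psi_g @ d_hat - g_vals            # mismatch with the prescribed EBC values
    return (pde_residual**2).mean() + alpha * (bc_residual**2).mean()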
Figure 2: Schematics of the S-NIM and V-NIM methods, where the nodes are represented
by black points, and the influence domains supp(xI ) of the trial shape functions (ΨI ) are
indicated by grey rectangular or square areas around nodes. (a) S-NIM: The sample points are
represented by blue stars; (b) V-NIM: The local subdomains, denoted by Ωs , are represented by
blue rectangles, with their centers marked as blue stars. The boundary of the local subdomain,
∂Ωs , is divided into two parts, Γs and Ls , where Γs represents the portion of ∂Ωs that lies
on the global boundary ∂Ω, whereas Ls corresponds to the portion of the boundary located
within the domain Ω.
4.2. Approach II: Local variational form-based neural integrated meshfree frame-
work (V-NIM)
While formulating the loss function for the proposed S-NIM method is
straightforward due to the employment of strong-form governing PDEs, the
involvement of higher-order derivatives in loss functions could potentially impact
the efficiency and accuracy. Several variational physics-informed machine learn-
ing methods [47, 46, 7], as discussed in the introduction, have been developed to
alleviate these issues, where the corresponding loss function is constructed based
12
on the variational (weak) form of the governing equations derived by using the
weighted residual methods along with various testing functions. The use of weak
form decreases the required regularity of the approximate solution. Consequently,
a reduction in the highest order of derivatives and an improvement in training
accuracy can be achieved when compared to the strong form counterpart.
However, these variational approaches typically require a conforming dis-
cretization to construct the loss function described by an integral form on the
entire computation domain, which inevitably results in the loss of the truly
meshless characteristic. Motivated by this concern, we propose to introduce a
local weak formulation in the NIM method, namely V-NIM, which effectively
preserves the discretization/mesh-free property. This V-NIM approach admits
the local incorporation of underlying physics and avoids the significant costs
associated with mesh generation.
Applying integration by parts and the divergence theorem to the above equation, and introducing the traction (natural) boundary condition, yields the following local variational (weak) formulation
∫_{Ωs} ∇v : σ dΩ − ∫_{Ls} v · t dΓ − ∫_{Γsg} v · t dΓ = ∫_{Ωs} v · f dΩ + ∫_{Γst} v · t̄ dΓ        (21)
where t = σ · n. In general, the boundary of the local domain Ωs is ∂Ωs = Γs ∪ Ls , in which Γs denotes the portion of the local boundary ∂Ωs located on the global boundary Γ = ∂Ω, i.e., Γs = ∂Ωs ∩ Γ, whereas Ls represents the remaining part of the local boundary that lies inside the domain, as shown in Figure 2b.
Specifically, we denote Γsg = Γs ∩ Γg and Γst = Γs ∩ Γt as the parts of the
local boundary Γs on which the EBC and NBC are specified, respectively. For a
subdomain located entirely within the global domain, there exists no Γs , and
thus, the boundary integrals over Γsg and Γst vanish and Ls = ∂Ωs . It should
be emphasized that the essential boundary conditions have not yet been imposed
in Eq. (21), and this will be addressed in Section 4.2.4.
Given a selected test function v defined on Ωs , the discretization of Eq. (21) will only yield one linear algebraic equation. To ensure obtaining a sufficient number of linearly independent equations for the displacement solution u ∈ R^d, we can apply Nv (Nv ≥ d) independent sets of test functions {v^(k)}_{k=1}^{Nv} to Eq. (21). As such, the local variational residual associated with the kth test function v^(k) over Ωs is defined as
R_s^(k) = ∫_{Ls} v^(k) · t dΓ + ∫_{Γsg} v^(k) · t dΓ + ∫_{Γst} v^(k) · t̄ dΓ − ∫_{Ωs} ε_v^(k) : σ dΩ + ∫_{Ωs} v^(k) · f dΩ        (22)
where ε_v^(k) = (1/2)(∇v^(k) + ∇v^(k)ᵀ) is the symmetric part of ∇v^(k), considering the symmetry of the Cauchy stress σ.
In practice, Nv = d is commonly considered in studies using the local variational form [1]. In the following, we take a 2D solid problem as an example, i.e., d = 2. The sets of test functions {v^(k)}_{k=1}^{2} and {ε_v^(k)}_{k=1}^{2} can be assembled into the matrices w and εw , respectively, which are
w = [v_i^(j)] = [ v_1^(1)  v_1^(2) ;  v_2^(1)  v_2^(2) ],   εw = [ v^(1)_{1,1}  v^(1)_{2,2}  v^(1)_{1,2} + v^(1)_{2,1} ;  v^(2)_{1,1}  v^(2)_{2,2}  v^(2)_{1,2} + v^(2)_{2,1} ]ᵀ        (23)
As a result, invoking the NeuroPU approximation (16) in Eq. (22), the matrix form of the local variational residual at Ωs is formulated as
R_s^h = ∫_{Ls} wᵀ t̂^h dΓ + ∫_{Γsg} wᵀ t̂^h dΓ + ∫_{Γst} wᵀ t̄ dΓ − ∫_{Ωs} εwᵀ σ̂^h dΩ + ∫_{Ωs} wᵀ f dΩ        (24)
Unless otherwise stated, the setting of test functions in Eq. (25) is employed
in V-NIM across the present study. The matrix forms of the other variables can be found in Appendix A.
Remark 4.1. It is noted that the trial shape functions ΨI (x) in the NeuroPU
approximation and the test function v(x) can be chosen from different function
spaces, leading to the Petrov-Galerkin method [39, 10, 2]. This flexibility allows the test functions to be defined on overlapping subdomains, and the test functions
are not required to vanish on the boundary where EBCs are specified (more
discussion will be shown in Section 4.2.4).
Remark 4.2. We also note that the natural boundary condition (NBC) on Γst is consistently imposed in V-NIM, as its boundary integral terms are involved in the local variational form (21) and the associated residual (24). Due to this local consistency, we argue that V-NIM can provide a more accurate and stable approximation compared to the weak imposition of the NBC via the penalty method, as in PINN [65, 68] and S-NIM (19).
Remark 4.3. V-NIM allows the local weak form (21) (or the local residual
form (24)) to be constructed locally with Galerkin consistency. Given this unique
feature, the proposed V-NIM is a truly meshfree framework, distinct from other
global variational form-based methods [47, 46, 21, 70, 81], where the loss function
is constructed by using globally defined test functions or conforming background
integration cells over the entire domain.
2) Heaviside step function. The cumbersome computation of the domain
integral in Eq. (24) can be fully circumvented if we adopt the Heaviside step
function as the test function, namely,
v(x) = { 0,  x ∉ (Ωs ∪ Ls );  1,  x ∈ (Ωs ∪ Ls ) }        (27)
where the domain integral term containing the derivative of v(x) is canceled due
to the property of Heaviside step function.
It is also worthwhile pointing out that if the Dirac delta function δ(x − xs ), where xs is the center of subdomain Ωs , is adopted for the test functions, the local variational form (21) that V-NIM is based on degenerates to the strong formulation, so that the S-NIM method is recovered. In this case, the centers of the subdomains {xs }_{s=1}^{NT} are considered as the sample points {xi }_{i=1}^{Nf} in S-NIM; refer to Figure 2.
written as:
L^V(θ) = (1/(Nµ NT)) ∑_{j=1}^{Nµ} ∑_{s=1}^{NT} ‖R_s^{h,g}(µj )‖²
       = (1/(Nµ NT)) ∑_{j=1}^{Nµ} ∑_{s=1}^{NT} ‖R_s^h + α ∫_{Γsg} wᵀ(û^h − ū) dΓ‖²        (30)
where some examples of the local variational residual R_s^h are provided in Section 4.2.3. Again, û^h(x, µ; θ) = ∑_{I∈Sx} ΨI (x) d̂I (µ; θ) adopts the NeuroPU approximation, where θ are the trainable parameters of the neural networks.
As reported in Remark 4.3, it can be seen that the loss LV for V-NIM simply
relies on the summation of local variational residuals instead of a full integral
form over the whole domain to enforce equilibrium [47, 70, 81, 7]. Thanks to this
collocation-like construction, each local residual can be minimized separately. On
the other hand, compared to other domain decomposition-based DNN methods, such as hp-VPINN [46] or local extreme learning machines [20], the proposed V-NIM does not require conforming subdomains to formulate the local residuals, which offers great flexibility in constructing the loss function. Overall, the
local feature embedded in V-NIM enables the employment of an efficient and
scalable (mini-)batch training procedure. The enhanced computational efficiency
and accuracy will be highlighted in the following numerical experiments.
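A minimal sketch of such a mini-batch procedure is given below, assuming PyTorch and a user-supplied routine subdomain_residual(model, mu, idx) that returns the penalized residuals R_s^{h,g} of the subdomains indexed by idx for one parameter sample; both the routine and the hyperparameters are illustrative.

import torch

def train_v_nim(model, mu_samples, subdomain_residual, n_subdomains,
                n_epochs=50000, batch_size=128, lr=1e-3):
    # Mini-batch training over local subdomain residuals, cf. Eq. (30)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(n_epochs):
        idx = torch.randperm(n_subdomains)[:batch_size]   # random batch of overlapping subdomains
        loss = torch.stack([
            (subdomain_residual(model, mu, idx) ** 2).mean() for mu in mu_samples
        ]).mean()                                         # average over parameter/time samples
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model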
5. Solution Procedures
A summary of the numerical procedures for the proposed S-NIM and V-NIM
methods is provided in this section.
{xI }_{I=1}^{Nh}, and initialize the nodal coefficient network d̂(µ; θ) = Nθ (µ).
3. Construct the loss function based on Eq. (19).
4. Define the optimizer and minimize the loss function until convergence.
5. Output the trained network d̂(µ; θ*).
Implementation of V-NIM method
1. Define the following parameters.
1.a. The set of meshfree nodes {xI }_{I=1}^{Nh} and the set of center points of subdomains {xs }_{s=1}^{NT} distributed over Ω.
1.b. The parameters for NeuroPU approximation (Figure 1) including the
neural network architecture, the order of basis function p, the support
size a, and the type of shape functions.
1.c. The parameters for local variational form including the size of subdo-
mains r and the type of test functions.
2. Calculate and store the trial shape functions {ΨI }_{I=1}^{Nh} associated with nodes {xI }_{I=1}^{Nh}, and initialize the nodal coefficient network d̂(µ; θ) = Nθ (µ).
3. Construct the loss function based on Eq. (30) by using the Gauss quadra-
ture points.
3.a. By introducing the quadrature rules, the discrete forms of Eq. (26) and Eq. (28) are, respectively, given as follows
R_s^h = ∫_{Γsg} wᵀ t^h dΓ + ∫_{Γst} wᵀ t̄ dΓ − ∫_{Ωs} εwᵀ σ^h dΩ + ∫_{Ωs} wᵀ f dΩ
      = ∑_{E=1}^{NL} J^E_{Γsg} { ∑_{B=1}^{NB} wᵀ(xB ) t^h(xB ) ωB }
      + ∑_{E=1}^{NL} J^E_{Γst} { ∑_{B=1}^{NB} wᵀ(xB ) t̄(xB ) ωB }
      − ∑_{Ex=1}^{NEx} ∑_{Ey=1}^{NEy} J^{(Ex,Ey)}_{Ωs} { ∑_{G=1}^{NG} εwᵀ(xG ) σ^h(xG ) ωG }
      + ∑_{Ex=1}^{NEx} ∑_{Ey=1}^{NEy} J^{(Ex,Ey)}_{Ωs} { ∑_{G=1}^{NG} wᵀ(xG ) f (xG ) ωG }        (31)
and
R_s^h = ∫_{Ls} t^h dΓ + ∫_{Γsg} t^h dΓ + ∫_{Γst} t̄ dΓ + ∫_{Ωs} f dΩ
      = ∑_{E=1}^{NL} ∑_{B=1}^{NB} [ J^E_{Ls} t^h(xB ) ωB ] + ∑_{E=1}^{NL} ∑_{B=1}^{NB} [ J^E_{Γsg} t^h(xB ) ωB ]
      + ∑_{E=1}^{NL} ∑_{B=1}^{NB} [ J^E_{Γst} t̄(xB ) ωB ] + ∑_{Ex=1}^{NEx} ∑_{Ey=1}^{NEy} ∑_{G=1}^{NG} [ J^{(Ex,Ey)}_{Ωs} f (xG ) ωG ]        (32)
where xB ’s and ωB ’s are the locations and weights of quadrature
points for boundary integrals, while xG ’s and ωG ’s correspond to
those for domain integrals. J stands for the corresponding Jacobian.
NL represents the number of segments for the boundary integral. NEx and NEy are the numbers of segments for the domain integral along the x and y directions, respectively, when considering a 2D rectangular subdomain.
3.b. The essential boundary terms in (30) are discretized in a similar way to that shown in Eqs. (31) and (32).
4. Define the optimizer and minimize the loss function until convergence.
5. Output the trained network d̂(µ; θ*).
Once the trained network d̂(µ; θ*) is given, the solution field is obtained by using the NeuroPU approximation: û^h(x, µ) = ∑_{I∈Sx} ΨI (x) d̂I (µ; θ*).
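As an illustration of step 3.a, the following sketch builds the tensor-product Gauss points and weights over a square subdomain divided into integration segments (the 4 × 4 segments and 5 points per direction used by default in Section 6 are assumed), so that a domain-integral term of Eq. (31) reduces to a weighted sum; the subdomain size in the usage line is illustrative.

import numpy as np

def subdomain_quadrature(center, r, n_seg=4, n_gauss=5):
    # Tensor-product Gauss points/weights over a square subdomain of half-size r centered
    # at `center`, with n_seg x n_seg integration segments (cf. Eq. (31))
    xi, wi = np.polynomial.legendre.leggauss(n_gauss)   # rule on the reference interval [-1, 1]
    edges = np.linspace(-r, r, n_seg + 1)
    pts_1d, w_1d = [], []
    for a, b in zip(edges[:-1], edges[1:]):             # map [-1, 1] onto each segment
        jac = 0.5 * (b - a)                             # 1D Jacobian of the mapping
        pts_1d.append(0.5 * (a + b) + jac * xi)
        w_1d.append(jac * wi)
    pts_1d, w_1d = np.concatenate(pts_1d), np.concatenate(w_1d)
    X, Y = np.meshgrid(center[0] + pts_1d, center[1] + pts_1d)
    W = np.outer(w_1d, w_1d)
    return np.column_stack([X.ravel(), Y.ravel()]), W.ravel()

pts, w = subdomain_quadrature(center=(0.5, 0.5), r=0.0625)       # one illustrative subdomain
integral = np.sum(w * (pts[:, 0] * pts[:, 1] + 1.0))             # e.g., integral of f(x, y) = x*y + 1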
Table 1: Summary of the two proposed NIM methods: S-NIM and V-NIM.
where time is considered as the model parameter. The solutions obtained by
the PINN and hp-VPINN methods [46] are also provided for comparison to
underscore the advantages of the NIM solver in both approximation accuracy
and computational efficiency.
Unless stated otherwise, the construction of the NeuroPU approximation (16) in the NIM methods employs quadratic meshfree shape functions, i.e., p = 2 in (8), with the normalized support size ā = 2.5. To simplify notation, a neural
network with l hidden layers, each of which contains m neurons, is denoted as
l × [m]. The input size of the neural network in NeuroPU approximation is given
by the dimension of parameters µ ∈ Rc , and the output size is Nh .
As shown in Table 1, two different test functions are examined in V-NIM. To
distinguish the employment of the Heaviside function and cubic B-spline function
as the test functions, we denote the resultant V-NIM solvers as V-NIM/h and
V-NIM/c, respectively. As domain integration is required in V-NIM, we let each subdomain be uniformly divided into 4 × 4 segments in the 2D case (or 4 segments in the 1D case), where 5 Gauss quadrature points per direction are used for the segment integration.
The weights and biases of the neural networks in the NIM framework are initialized using the Xavier scheme [27]. For the penalty parameter α adopted in the S-NIM (19) and V-NIM (30) methods, we determine the optimal value from heuristic tests over the values [1, 10, 10², 10³]. In order to conduct a
fair comparison, we consider the same training scheme for different methods
to evaluate their training efficiency and accuracy. The Adam optimizer with a
learning rate of 0.001 is used by default unless stated otherwise.
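For reference, this initialization and optimizer setup can be written in a few lines of PyTorch (a sketch only; the small network below is a stand-in for the NeuroPU coefficient network of Section 3):

import torch
import torch.nn as nn

def init_xavier(module):
    # Xavier (Glorot) initialization of the weights, zero initialization of the biases
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
        nn.init.zeros_(module.bias)

model = nn.Sequential(nn.Linear(2, 40), nn.Tanh(), nn.Linear(40, 441))  # stand-in network
model.apply(init_xavier)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)               # default Adam setting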
where f (x, y) and ū are the prescribed force term and the essential boundary
condition (EBC) on Γg , respectively. Let the analytical solution [45, 46] be
Figure 3: Left: The discretization for NIM, where the nodes are used to define the nodal
shape function and the center points represent the sample points for S-NIM and centers of
subdomains for V-NIM; Right: The reference solution for the 2D Poisson’s problem.
Figure 4: The comparisons of point-wise displacement errors obtained by S-NIM, V-NIM/h
and V-NIM/c using different orders of NeuroPU shape functions: (a) Linear; (b) Quadratic;
(c) Cubic. The reference solution is provided in the right panel of Figure 3.
2. To ensure sufficient accuracy in the PINN solution, we utilize Nf = 10,400 uniformly distributed residual points, whereas only NT = 2,601 subdomains (or Nf = 2,601 sample points) are considered in V-NIM (or S-NIM). A neural
network with hidden layers 4 × [40] is adopted for the PINN method.
Figure 5 shows the comparison of the approximate solutions obtained by dif-
ferent NIMs and PINN, and their corresponding absolute point-wise errors. The
maximum errors in solution approximation for all cases are less than 0.01. Compared
to PINN, the results indicate that S-NIM generally provides a slightly improved
approximation on the edge regions, albeit with a slightly larger error over the
higher-gradient areas. Nevertheless, V-NIM yields the most accurate results,
indicating the preferable accuracy of the proposed V-NIM methods over the
PINN method.
Furthermore, the comparison of the corresponding mean absolute errors (MAE) and training costs is listed in Table 3. The MAE of S-NIM using 2,601 residual points is 1.11 × 10−3 , which shows a slight degradation relative to the PINN method with MAE = 6.32 × 10−4 . However, this is because S-NIM only utilizes one quarter of the sample points used by PINN. In terms of training cost, PINN takes approximately 10.65s for every 1000 epochs, approximately 8 times slower than S-NIM, as illustrated in Table 3. Furthermore, in order to demonstrate the superior ability of S-NIM over the PINN method, we adopt an increased number of residual points (Nf = 10,000) to train the S-NIM model, denoted as S-NIM2 in Table 3. The corresponding MAE is substantially reduced from 1.11 × 10−3 to 2.19 × 10−4 . The training time for S-NIM2 becomes 2.01s/1000 epochs, which is 1.4 times longer than the S-NIM case with Nf = 2,601. Overall, this refined S-NIM solution yields 3 times higher accuracy and 5 times higher efficiency compared with the PINN method, demonstrating the enhancement brought by introducing the NeuroPU approximation in S-NIM.
The V-NIM methods further improve the performance, outperforming PINN
by approximately 1.5 times and 8 times in terms of accuracy when using V-NIM/h
and V-NIM/c, respectively. This remarkably higher accuracy is attributed to
the local variational form as we described in Section 4.2. Consistent with the
observation in Section 6.1.1, the employment of smooth test functions (V-NIM/c)
leads to remarkably higher accuracy in approximating the solution.
Because the shape functions in NIMs are pre-computed and stored, similar
to the approach in FEM, the additional computational cost incurred from using
higher-order approximation is, in fact, quite marginal during online training. It
is interesting to observe that, owing to the lower order derivatives involved in
the weak-form residuals, V-NIM even demonstrates superior training efficiency
to S-NIM. Specifically, the training of V-NIM for every 1000 epochs costs
approximately 1.20s, resulting in a 1.2x faster training rate than that of S-NIM.
The efficiency enhancement becomes more pronounced when compared to PINN,
which leads to a 10x speedup.
Methods                          PINN       S-NIM      S-NIM2     V-NIM/h    V-NIM/c
Epochs                           50000 (all methods)
Training time [s]/1000 epochs    10.65      1.46       2.01       1.21       1.20
MAE                              6.32e-04   1.11e-03   2.19e-04   4.21e-04   7.89e-05
Table 3: Training results of PINN, S-NIM, V-NIM/h and V-NIM/c for the 2D Poisson problem, where the corresponding hyperparameters are listed in Table 2. To demonstrate the effect of the number of sample points, we also provide the S-NIM2 solution, which is obtained by using Nf = 10,000 sample points.
Figure 5: Comparison of the approximated displacement (upper row) and the absolute point-
wise errors (lower row) obtained by (a) PINN, (b) S-NIM, (c) V-NIM/h and (d) V-NIM/c.
The cubic NeuroPU shape function is adopted for all NIM methods.
Figure 6 displays the convergence results of V-NIM/h (left) and V-NIM/c
(right) when employing different-order bases (p = 1, 2, 3) in the trial (NeuroPU)
shape functions. In the case of V-NIM/h, we do not consider the linear shape
function (p = 1) as the resultant weak form solution becomes unstable due to
the low-order continuity of the Heaviside step function. Overall, the results show that both the e0 and e1 errors of the NIM methods progressively decrease as the meshfree discretization is refined, and the asymptotic convergence rates are also provided in the figure. It is interesting to notice that these convergence
rates approximately follow the error estimates derived from the classical FEM
or Meshfree approximation [40, 15], i.e., e0 ∼ O(hp+1 ) and e1 ∼ O(hp ), where
h represents the characteristic nodal distance and p is the order of basis for
trial functions. For example, the convergence rates of e0 and e1 produced by
V-NIM/h, using quadratic (p = 2) NeuroPU shape functions, are approximately
re0 = 2.9 and re1 = 2.2, respectively. On the other hand, when employing
V-NIM/c with linear and quadratic NeuroPU shape functions, the convergence
rates for e0 are observed to be re0 = 2.3 and 3.4, while those for e1 are re1 = 1.1
and 2.5, respectively. Surprisingly, we notice that the cases with cubic shape
functions (p = 3) yield increased convergence rates in V-NIMs.
It is clearly shown in Figure 6 that in comparison to V-NIM/h, V-NIM/c
tends to offer better stability and attain favorable accuracy due to the high
continuity of the test function.
Figure 6: Convergence study of the V-NIM methods using different test functions: (left)
V-NIM/h with Heaviside step function; (right) V-NIM/c with cubic B-spline function. The
convergence rates presented in the legend are determined by calculating the average slope
of the last two segments. p denotes the order of basis used for the NeuroPU approximation
function in the NIM methods.
Figure 7: Left: Schematic of defected plate under normal traction; Right: Distribution of
nodes (black points) and centers of subdomains (green points).
For demonstration, the V-NIM solver with Heaviside test function (V-NIM/h)
is adopted to solve this problem, where the associated normalized size of sub-
domains is set as r̄ = 1.2. The model discretization with 727 nodes and 4117
subdomains is shown in Figure 7(right). A neural network with one hidden layer
(10 neurons), i.e., 1 × [10], and the quadratic trial shape function with normalized
support size of ā = 2.4 are employed to construct the NeuroPU approximation
in V-NIM/h. Since it has been reported that the standard PINN performs inefficiently in elasticity problems, the mixed-variable PINN method developed in [68] for elasticity problems is adopted here as the baseline for comparison. We note
that, consistent with the previous examples, the penalty method is still used
to impose the boundary conditions for both V-NIM/h and the mixed-variable
PINN, in contrast to the hard enforcement of boundary conditions as reported
in [68]. The architecture of PINN is set as 5 × [50], and 22000 residual points
are fed for training. An Adam optimizer with 150,000 epochs and a decaying
learning rate from 10−3 to 10−5 is utilized for both PINN and V-NIM/h methods.
All the hyperparameters of method settings are listed in Table 4.
The V-NIM/h method exhibits remarkably high training efficiency, as shown
in Table 4, with a training speedup of nearly 15 times compared to the mixed-variable PINN method (i.e., 2.25s vs. 30.24s for every 1000 epochs of
training). Apart from the training time, Table 4 shows the comparison of
e0 and e1 errors against the FEM reference solution, revealing approximately
5 times higher accuracy in V-NIM/h compared with PINN. This illustrates
the effectiveness of the end-to-end differentiation capacity within the proposed
variational framework, as well as the enhanced efficiency and accuracy achieved
through the use of NeuroPU for computing approximations and spatial gradients.
The approximated displacement and stress distributions generated by V-
NIM/h, PINN and FEM methods are visualized in Figures 8 and 9, respectively.
It is observed that V-NIM/h consistently yields results in good agreement with the reference solution obtained by the FEM method, while the PINN model is not
Methods                          V-NIM/h        PINN
Neural networks                  1 × [10]       5 × [50]
Nh                               727            N/A
Subdomain size r̄                 1.2            N/A
Support size ā                   2.4            N/A
NT or Nf                         4117           22000
α                                100            100
Segments                         2 × 2          N/A
Quadrature rule                  3 × 3          N/A
Epochs                           150,000 (both methods)
Training time [s]/1000 epochs    2.25           30.24
Error                            e0: 1.44e-2    e0: 8.02e-2
                                 e1: 1.37e-2    e1: 2.84e-1
Table 4: Hyperparameters and training results of the V-NIM/h and the mixed-variable PINN
for 2D elasticity problem, where the FEM reference solution is used for calculating the errors.
Figure 8: Comparison of the approximated displacement computed by (a) V-NIM/h, (b) FEM
and (c) Mixed-variable PINN.
capable of capturing the stress concentration around the notch region, leading
to a less satisfactory solution. The point-wise error of the displacement (ux
and uy ) and two stress fields (σxx and σyy ) are also portrayed in Figure 10.
It shows that the errors of PINN are normally 5 ∼ 10 times larger than the
corresponding errors produced by V-NIM/h. The improvement of V-NIM is
particularly pronounced in the stress field, which further confirms the proposed
method offers enhanced approximation capability in the higher-order derivatives
of solution.
Taking a closer observation, the stress distributions around the notch are
plotted in Figure 11, where the reference FEM, PINN, and V-NIM/h are com-
pared. It is evident that the solutions obtained by V-NIM/h method closely
align with the reference solution, demonstrating its advantage of accuracy in
critical regions. In contrast, the results obtained through the PINN method
Figure 9: Comparison of the approximated stress components σxx , σyy and σxy computed by
(a) V-NIM/h, (b) FEM and (c) Mixed-variable PINN.
show significant deviations from the reference solution, failing to capture the steep variation of stress along the notch surface. This could be due to the insufficient approximation capacity of the network model and the training difficulty associated with the elasticity problem.
where µ1 and µ2 are the system parameters, with (µ1 , µ2 ) ∈ D = [0.01, 10]2 .
Following the network architecture of NeuroPU shown in Figure 1, we input
(µ1 , µ2 ) into the neural network. The resulting outputs are the respective Nh
nodal coefficient functions obtained after passing through hidden layers structured
as 4 × [40], representing the surrogate model associated with the two parameters.
We consider the V-NIM model with the cubic B-spline test function (V-
NIM/c), where the computation domain is discretized using 441 uniform nodes,
with an equal number of subdomains, i.e., Nh = NT = 441. The penalty number
for essential boundary enforcement is set as 10. The training of the V-NIM
Figure 10: Comparison of the point-wise error of displacement and stress components obtained
by (a) Mixed-variable PINN, (b) V-NIM/h.
Figure 11: Comparison of the stress distribution (a) σxx , (b) σyy and (c) σxy on the notched
surface obtained by the PINN, V-NIM/h and FEM methods.
Figure 12: The evolution of L2 error (Left) and the M AE loss function (Right) during training
the V-NIM/c surrogate. The loss term associated with the Dirichlet boundary condition is
also provided.
region.
Additionally, Figure 14 presents the e0 error of the predicted solution for
(µ1 , µ2 ) ∈ D, where the training set is enclosed within a white box. The
remaining area represents the extrapolation region of the surrogate model,
illustrating the capability of V-NIM in surrogate modeling. It is noteworthy that
once the NIM surrogate model is trained, the predictions for other parameter
points can be generated in real-time with minimal additional computational
cost. This surrogate modeling capacity of the NIM framework offers a unique
advantage compared to conventional numerical solvers that require repeated full
computation for each case.
Figure 13: (Left) The reference solution, (Middle) predicted solution by the V-NIM/c surrogate;
(Right) the point-wise error. The parameter testing point is (µ1 , µ2 ) = [10, 10].
Figure 14: The distribution of the e0 error of the V-NIM surrogate over the parameter space (µ1 , µ2 ) ∈ D.
where Ω = [−1, 1], Ωt = [0, 1], the Dirichlet (essential) boundary is Γg = {(x, t) | x = ±1, t ∈ Ωt }, the initial boundary is Γ0 = {(x, t) | x ∈ Ω, t = 0}, and a = 1
and κ = 0.1/π represent the advection velocity and the diffusivity coefficient,
respectively. The analytical solution of Eq. (38) is given in [62] with infinite
series summation.
Table 5: Hyperparameters of the V-NIM and hp-VPINN for the advection-diffusion equation.
The support size a and the subdomain size r are calculated by a = āh = 1.5/20 = 0.075 and
r = r̄h = 2.5/20 = 0.125, where h = 1/20 is the characteristic nodal distance.
Instead, NIM is able to directly approximate the solution as a continuous function of the temporal variable. While it is not within the scope of this study,
this continuous representation opens up a potential avenue to deal with time-
dependent observation data for inverse dynamics problems. On the other hand,
NIM simplifies the integration process by decoupling the spatial and temporal
domains, unlike the two-dimensional spatiotemporal integration required for the
weak form-based PINN methods [46].
For demonstration, we utilize the V-NIM method with cubic B-spline function
as the test functions, i.e., V-NIM/c method, to solve this problem. With the
employment of Eq. (39), the local variational residual for the AD equation (38)
over Ωs can be derived as
Z
h
v ûh,t + aûh,x − κûh,xx dΩ
Rs =
Ωs
Z Z Z (40)
h h h h
=κ v,x û,x dΩ − κ vû,x nx dΓ + v û,t + aû,x dΩ
Ωs Γsu Ωs
where the divergence theorem is applied and the boundary integral term on
Ls is canceled due to the boundary vanishing property of the cubic B-spline
function employed as test functions (refer to Section 4.2.3). The corresponding
loss function, modified based on (30), can be formulated as
L^V(θ) = (1/(Nµ NT)) ∑_{j=1}^{Nµ} ∑_{s=1}^{NT} |R_s^h(tj )|² + (α1 /Nµ) ∑_{j=1}^{Nµ} |û^h(±1, tj ) − ū(±1, tj )|² + (α2 /NT) ∑_{s=1}^{NT} |û^h(xs , 0) − u0 (xs )|²        (41)
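As a sketch of how these terms can be evaluated with time as the network input, the fragment below obtains the nodal coefficients d_I(t) and their time derivatives by AD through the network, while the spatial derivatives of û^h follow from the precomputed shape-function gradients as in Eq. (18); the per-node loop is a simple, though not necessarily the fastest, choice.

import torch

def coefficients_and_time_derivative(model, t):
    # t: (n_times, 1) time samples; model: network mapping t to (n_times, N_h) coefficients d_I(t)
    t = t.clone().requires_grad_(True)
    d_hat = model(t)
    # d(d_I)/dt for every time sample, one node at a time, via automatic differentiation
    cols = [torch.autograd.grad(d_hat[:, I].sum(), t, create_graph=True)[0][:, 0]
            for I in range(d_hat.shape[1])]
    return d_hat, torch.stack(cols, dim=1)      # both of shape (n_times, N_h)

# With precomputed Psi, dPsi_dx (and dPsi_dxx where needed), the terms in Eqs. (40)-(41) follow as
#   u_h = d_hat @ Psi.T,   u_h_x = d_hat @ dPsi_dx.T,   u_h_t = d_hat_t @ Psi.T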
Methods     Neural networks   Time [s]/100 epochs   MAE        e0
V-NIM/c     1 × [10]          0.98                  2.64e-03   9.59e-02
V-NIM/c     2 × [10]          1.16                  2.04e-03   7.64e-02
V-NIM/c     3 × [20]          1.37                  1.37e-03   3.24e-02
V-NIM/c     4 × [30]          1.67                  1.17e-03   2.59e-02
hp-VPINN    3 × [5]           1.23                  1.81e-02   7.24e-01
hp-VPINN    3 × [20]          1.91                  1.69e-02   5.83e-01
Table 6: Comparison of V-NIM/c and hp-VPINN under different sizes of neural networks. The model parameters are listed in Table 5.
Numerical tests with different sizes of neural networks are conducted to assess the accuracy and robustness of the proposed V-NIM/c method and of hp-VPINN. As can be seen from the MAE and e0 errors in Table 6, an
enhancement in accuracy is attained as a larger neural network is adopted in the V-NIM method, with increasing training cost (see the "Time [s]/100 epochs" column in Table 6), as expected. This demonstrates that a larger network can provide a better approximation capacity to capture the time-dependent behaviors. Due to the hybrid approximation property obtained by integrating high-order meshfree shape functions and neural networks, the search space in V-NIM is greatly reduced compared to conventional methods (e.g., hp-VPINN) that rely purely on neural network approximation. Consequently, the proposed V-NIM is capable of attaining preferable accuracy relative to hp-VPINN with a much smaller network (1 × [10] vs. 3 × [20]), leading to a fraction of the training cost (as shown in Table 6). In addition, rather than relying on expensive AD to calculate high-order derivatives, the introduction of the NeuroPU approximation in NIM enables pre-computed spatial gradients that operate solely on the shape functions (18). This also contributes to a reduction in the computational complexity.
The snapshots of the solutions of the V-NIM/c and hp-VPINN methods at
t = 0.2s, 0.6s and 1.0s are plotted in Figure 15, which evinces that the V-NIM/c
solution agrees well with the analytical solution, while for the hp-VPINN result
a small deviation is observed in the region where the solution has relatively large
magnitudes. This implies the enhanced approximation of V-NIM for the sharply
changing solution and higher-order derivatives. Figure 16 shows the comparison
of point-wise errors of hp-VPINN and V-NIM/c over the whole spatial-temporal
domain, with different sizes of neural networks. The maximum point-wise error
in V-NIM/c solutions with different sizes of neural networks consistently remains
under 0.04, with only a minor error observed at the top edge. In contrast, the
error snapshot of the hp-VPINN solution exhibits larger errors over most of
the domain. It is also noted that 50,000 more epochs are used in training the
hp-VPINN model to obtain a satisfactory result.
As a supplement, we provide the evolution of the training loss of V-NIM/c using neural networks with dimensions 1 × [10], 3 × [20] and 4 × [30] in Figure 17, which
showcases the stable convergence property of the NIM method.
Figure 15: The snapshots of analytical solution, and approximation solutions obtained by
V-NIM/c and hp-VPINN methods at t = 0.2s, 0.6s and 1.0s.
Figure 16: Comparison of the point-wise errors of the approximate solution obtained by V-
NIM/c and hp-VPINN. The horizontal and vertical axes denote the time and space coordinates,
respectively.
8. Conclusion
Figure 17: The evolution of training loss and e0 error generated by V-NIM/c using different
neural network architectures: (a) 1 × [10], (b) 3 × [20] and (c) 4 × [30].
The versatility of the proposed differentiable method beyond conventional
simulation techniques is also demonstrated in this study. Specifically, we highlight
the desirable extrapolative ability and the real-time prediction of the V-NIM
model for surrogate modeling of a parameterized elliptic PDE. It is also shown
that, by considering the temporal variable as an input to the NeuroPU approximation, the NIM method yields more stable results while maintaining superior
training efficiency compared to the hp-VPINN method for the time-dependent
(advection-diffusion equation) problem.
In conclusion, we describe the NIM method as a differentiable meshfree
solver, enabling end-to-end gradient-based optimization procedures for seeking
solutions. This innovative method holds great promise for the development
of next-generation physics-based data-driven solvers that offer a remarkable
balance between accuracy and computational efficiency. Additionally, it provides
a versatile framework for data assimilation and adaptive refinement due to
its meshfree nature. It is also worth noting that NIM opens a new avenue
for creating more efficient deep learning models through seamlessly blending
shape functions that represent the finite discretized domain with DNNs that
represent the problem-parameter space. In the future, we plan to explore the
performance of the NIM method in operator learning, inverse modeling, and
diverse applications related to nonlinear material modeling.
Acknowledgment
This research was partially supported by Q.Z. He’s Startup Fund and Data
Science Initiative (DSI) Seed Grant at the University of Minnesota. The authors
also acknowledge the Minnesota Supercomputing Institute (MSI) for providing
resources that contributed to the research results reported within this paper.
Appendix A.
For 2D elasticity associated with the NIM framework (Section 4), the ap-
proximated displacement ûh (x) is given as
û^h(x) = ∑_{I∈Sx} NI (x) d̂I        (A.1)
also the stress σ̂^h(x) and traction t̂^h(x) tensors are given as
σ̂^h(x) = ∑_{I∈Sx} D BI (x) d̂I        (A.2)
t̂^h(x) = ∑_{I∈Sx} n D BI (x) d̂I        (A.3)
where
NI = [ ΦI  0 ;  0  ΦI ],   BI = [ ΦI,x  0 ;  0  ΦI,y ;  ΦI,y  ΦI,x ],   n = [ nx  0 ;  0  ny ;  ny  nx ]ᵀ        (A.4)
and
D = Ē/(1 − ν̄²) [ 1  ν̄  0 ;  ν̄  1  0 ;  0  0  (1 − ν̄)/2 ],   with  Ē = E, ν̄ = ν (plane stress);  Ē = E/(1 − ν²), ν̄ = ν/(1 − ν) (plane strain)        (A.5)
with E and ν being the Young’s modulus and Poisson’s ratio, respectively.
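A direct transcription of these matrices is sketched below, assuming the standard displacement-interpolation form of NI and Voigt notation for the stress; the function name is illustrative.

import numpy as np

def elasticity_matrices(phi, dphi_dx, dphi_dy, E, nu, plane_stress=True):
    # Per-node matrices of Appendix A evaluated at one point x for node I
    N_I = np.array([[phi, 0.0],
                    [0.0, phi]])
    B_I = np.array([[dphi_dx, 0.0],
                    [0.0, dphi_dy],
                    [dphi_dy, dphi_dx]])
    if plane_stress:
        E_bar, nu_bar = E, nu                                # plane stress, Eq. (A.5)
    else:
        E_bar, nu_bar = E / (1.0 - nu**2), nu / (1.0 - nu)   # plane strain, Eq. (A.5)
    D = E_bar / (1.0 - nu_bar**2) * np.array([[1.0, nu_bar, 0.0],
                                              [nu_bar, 1.0, 0.0],
                                              [0.0, 0.0, (1.0 - nu_bar) / 2.0]])
    return N_I, B_I, D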
References
[1] Atluri, S., Zhu, T., 2000. New concepts in meshless methods. International
journal for numerical methods in engineering 47, 537–556.
[2] Atluri, S.N., Zhu, T., 1998. A new meshless local petrov-galerkin (mlpg)
approach in computational mechanics. Computational mechanics 22, 117–
127.
[3] Babuška, I., Melenk, J.M., 1997. The partition of unity method. Interna-
tional journal for numerical methods in engineering 40, 727–758.
[4] Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M., 2018. Auto-
matic differentiation in machine learning: a survey. Journal of Marchine
Learning Research 18, 1–43.
[5] Belytschko, T., Lu, Y.Y., Gu, L., 1994. Element-free galerkin methods.
International journal for numerical methods in engineering 37, 229–256.
[6] Berg, J., Nyström, K., 2018. A unified deep artificial neural network
approach to partial differential equations in complex geometries. Neurocom-
puting 317, 28–41.
[7] Berrone, S., Canuto, C., Pintore, M., 2022. Variational physics informed
neural networks: the role of quadratures and test functions. Journal of
Scientific Computing 92, 100.
[8] Bezgin, D.A., Buhendwa, A.B., Adams, N.A., 2023. Jax-fluids: A fully-
differentiable high-order computational fluid dynamics solver for compress-
ible two-phase flows. Computer Physics Communications 282, 108527.
[9] Blum, E.K., Li, L.K., 1991. Approximation theory and feedforward networks.
Neural networks 4, 511–515.
[10] Bottasso, C.L., Micheletti, S., Sacco, R., 2002. The discontinuous petrov–
galerkin method for elliptic problems. Computer Methods in Applied
Mechanics and Engineering 191, 3391–3409.
[11] Bradbury, J., Frostig, R., Hawkins, P., Johnson, M.J., Leary, C., Maclaurin,
D., Necula, G., Paszke, A., VanderPlas, J., Wanderman-Milne, S., et al.,
2018. Jax: composable transformations of python+numpy programs.
[12] Brunton, S.L., Kutz, J.N., 2019. Data-driven science and engineering:
Machine learning, dynamical systems, and control. Cambridge University
Press.
[13] Cai, S., Mao, Z., Wang, Z., Yin, M., Karniadakis, G.E., 2021. Physics-
informed neural networks (pinns) for fluid mechanics: A review. Acta
Mechanica Sinica 37, 1727–1738.
[14] Cardiff, P., Demirdžić, I., 2021. Thirty years of the finite volume method
for solid mechanics. Archives of Computational Methods in Engineering 28,
3721–3780.
[15] Chen, J.S., Hillman, M., Chi, S.W., 2017. Meshfree methods: progress
made after 20 years. Journal of Engineering Mechanics 143, 04017001.
[16] Chen, J.S., Pan, C., Wu, C.T., Liu, W.K., 1996. Reproducing kernel particle
methods for large deformation analysis of non-linear structures. Computer
methods in applied mechanics and engineering 139, 195–227.
[17] Chiu, P.H., Wong, J.C., Ooi, C., Dao, M.H., Ong, Y.S., 2022. Can-pinn: A
fast physics-informed neural network based on coupled-automatic–numerical
differentiation method. Computer Methods in Applied Mechanics and
Engineering 395, 114909.
[18] Clough, R., 1960. The Finite Element Method in Plane Stress Analysis.
American Society of Civil Engineers.
[19] Cuomo, S., Di Cola, V.S., Giampaolo, F., Rozza, G., Raissi, M., Piccialli, F.,
2022. Scientific machine learning through physics–informed neural networks:
Where we are and what’s next. Journal of Scientific Computing 92, 88.
[20] Dong, S., Li, Z., 2021. Local extreme learning machines and domain
decomposition for solving linear and nonlinear partial differential equations.
Computer Methods in Applied Mechanics and Engineering 387, 114129.
[21] Dong, Y., Liu, T., Li, Z., Qiao, P., 2023. Deepfem: A novel element-based
deep learning approach for solving nonlinear partial differential equations
in computational solid mechanics. Journal of Engineering Mechanics 149,
04022102.
[22] Du, H., Zhao, Z., Cheng, H., Yan, J., He, Q., 2023. Modeling density-
driven flow in porous media by physics-informed neural networks for co2
sequestration. Computers and Geotechnics 159, 105433.
[23] Eggersmann, R., Kirchdoerfer, T., Reese, S., Stainier, L., Ortiz, M., 2019.
Model-free data-driven inelasticity. Computer Methods in Applied Mechan-
ics and Engineering 350, 81–99.
[24] Fang, Z., 2021. A high-efficient hybrid physics-informed neural networks
based on convolutional neural network. IEEE Transactions on Neural
Networks and Learning Systems 33, 5514–5526.
[25] Gao, H., Sun, L., Wang, J.X., 2021. Phygeonet: Physics-informed geometry-
adaptive convolutional neural networks for solving parameterized steady-
state pdes on irregular domain. Journal of Computational Physics 428,
110079.
[26] Gasick, J., Qian, X., 2023. Isogeometric neural networks: A new deep
learning approach for solving parameterized partial differential equations.
Computer Methods in Applied Mechanics and Engineering 405, 115839.
[27] Glorot, X., Bengio, Y., 2010. Understanding the difficulty of training deep
feedforward neural networks, in: Proceedings of the thirteenth international
conference on artificial intelligence and statistics, JMLR Workshop and
Conference Proceedings. pp. 249–256.
[28] Goodfellow, I., Bengio, Y., Courville, A., Bengio, Y., 2016. Deep learning.
volume 1. MIT press Cambridge.
[29] Grepl, M.A., Maday, Y., Nguyen, N.C., Patera, A.T., 2007. Efficient reduced-
basis treatment of nonaffine and nonlinear partial differential equations.
ESAIM: Mathematical Modelling and Numerical Analysis 41, 575–605.
[30] Haghighat, E., Bekar, A.C., Madenci, E., Juanes, R., 2021a. A nonlocal
physics-informed deep learning framework using the peridynamic differential
operator. Computer Methods in Applied Mechanics and Engineering 385,
114012.
[31] Haghighat, E., Raissi, M., Moure, A., Gomez, H., Juanes, R., 2021b.
A physics-informed deep learning framework for inversion and surrogate
modeling in solid mechanics. Computer Methods in Applied Mechanics and
Engineering 379, 113741.
[32] Han, Z., Atluri, S., 2004. A meshless local petrov-galerkin (mlpg) approach
for 3-dimensional elasto-dynamics. CMC: Computers, Materials & Continua
1, 129–140.
[33] He, Q., Barajas-Solano, D., Tartakovsky, G., Tartakovsky, A.M., 2020.
Physics-informed neural networks for multiphysics data assimilation with
application to subsurface transport. Advances in Water Resources 141,
103610.
[34] He, Q., Chen, J.S., 2020. A physics-constrained data-driven approach based
on locally convex reconstruction for noisy database. Computer Methods in
Applied Mechanics and Engineering 363, 112791.
[35] He, Q., Perego, M., Howard, A.A., Karniadakis, G.E., Stinis, P., 2023. A hybrid deep neural operator/finite element method for ice-sheet modeling. Journal of Computational Physics 492, 112428. URL: https://www.sciencedirect.com/science/article/pii/S0021999123005235, doi:10.1016/j.jcp.2023.112428.
[36] He, Q., Tartakovsky, A.M., 2021. Physics-informed neural network method
for forward and backward advection-dispersion equations. Water Resources
Research 57, e2020WR029479.
[37] He, X., He, Q., Chen, J.S., 2021. Deep autoencoders for physics-constrained
data-driven nonlinear materials modeling. Computer Methods in Applied
Mechanics and Engineering 385, 114034.
[38] Hornik, K., 1991. Approximation capabilities of multilayer feedforward
networks. Neural Networks 4, 251–257.
[39] Hughes, T.J., 1982. A theoretical framework for Petrov-Galerkin methods
with discontinuous weighting functions: Application to the streamline-
upwind procedure. Finite Elements in Fluids 4, Chapter 3.
[40] Hughes, T.J., 2012. The finite element method: linear static and dynamic
finite element analysis. Courier Corporation.
[41] Hughes, T.J., Cottrell, J.A., Bazilevs, Y., 2005. Isogeometric analysis: CAD,
finite elements, NURBS, exact geometry and mesh refinement. Computer
Methods in Applied Mechanics and Engineering 194, 4135–4195.
[42] Innes, M., Edelman, A., Fischer, K., Rackauckas, C., Saba, E., Shah, V.B.,
Tebbutt, W., 2019. A differentiable programming system to bridge machine
learning and scientific computing. arXiv preprint arXiv:1907.07587.
[43] Johnson, D., Maxfield, T., Jin, Y., Fedkiw, R., 2023. Software-based
automatic differentiation is flawed. arXiv preprint arXiv:2305.03863.
[44] Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang,
L., 2021. Physics-informed machine learning. Nature Reviews Physics 3,
422–440.
[45] Kharazmi, E., Zhang, Z., Karniadakis, G.E., 2019. Variational physics-
informed neural networks for solving partial differential equations. arXiv
preprint arXiv:1912.00873.
[46] Kharazmi, E., Zhang, Z., Karniadakis, G.E., 2021. hp-VPINNs: Variational
physics-informed neural networks with domain decomposition. Computer
Methods in Applied Mechanics and Engineering 374, 113547.
[47] Khodayi-Mehr, R., Zavlanos, M., 2020. VarNet: Variational neural networks
for the solution of partial differential equations, in: Learning for Dynamics
and Control, PMLR. pp. 298–307.
[48] Kirchdoerfer, T., Ortiz, M., 2016. Data-driven computational mechanics.
Computer Methods in Applied Mechanics and Engineering 304, 81–101.
[49] Kochkov, D., Smith, J.A., Alieva, A., Wang, Q., Brenner, M.P., Hoyer, S.,
2021. Machine learning–accelerated computational fluid dynamics. Proceed-
ings of the National Academy of Sciences 118, e2101784118.
[50] Krishnapriyan, A., Gholami, A., Zhe, S., Kirby, R., Mahoney, M.W., 2021.
Characterizing possible failure modes in physics-informed neural networks.
Advances in Neural Information Processing Systems 34, 26548–26560.
[51] Lagaris, I.E., Likas, A., Fotiadis, D.I., 1998. Artificial neural networks for
solving ordinary and partial differential equations. IEEE Transactions on
Neural Networks 9, 987–1000.
[52] Lee, H., Kang, I.S., 1990. Neural algorithm for solving differential equations.
Journal of Computational Physics 91, 110–131.
[53] Lee, K., Trask, N.A., Patel, R.G., Gulian, M.A., Cyr, E.C., 2021. Partition
of unity networks: deep hp-approximation. arXiv preprint arXiv:2101.11256.
[54] LeVeque, R.J., 2007. Finite difference methods for ordinary and partial
differential equations: steady-state and time-dependent problems. SIAM.
[55] Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart,
A., Anandkumar, A., 2020. Fourier neural operator for parametric partial
differential equations. arXiv preprint arXiv:2010.08895.
[56] Liu, W.K., Jun, S., Zhang, Y.F., 1995. Reproducing kernel particle methods.
International Journal for Numerical Methods in Fluids 20, 1081–1106.
[57] Liu, W.K., Li, S., Park, H.S., 2022. Eighty years of the finite element
method: Birth, evolution, and future. Archives of Computational Methods
in Engineering 29, 4431–4453.
[58] Lu, L., Jin, P., Pang, G., Zhang, Z., Karniadakis, G.E., 2021. Learning
nonlinear operators via DeepONet based on the universal approximation
theorem of operators. Nature Machine Intelligence 3, 218–229.
[59] McClenny, L., Braga-Neto, U., 2020. Self-adaptive physics-informed neural
networks using a soft attention mechanism. arXiv preprint arXiv:2009.04544.
[60] Meade Jr, A.J., Fernandez, A.A., 1994. The numerical solution of linear or-
dinary differential equations by feedforward neural networks. Mathematical
and Computer Modelling 19, 1–25.
[61] Mistani, P.A., Pakravan, S., Ilango, R., Gibou, F., 2023. JAX-DIPS: Neural
bootstrapping of finite discretization methods and application to elliptic
problems with discontinuities. Journal of Computational Physics, 112480.
[62] Mojtabi, A., Deville, M.O., 2015. One-dimensional linear advection–diffusion
equation: Analytical and finite element solutions. Computers & Fluids 107,
189–195.
[63] Montáns, F.J., Chinesta, F., Gómez-Bombarelli, R., Kutz, J.N., 2019. Data-
driven modeling and learning in science and engineering. Comptes Rendus
Mécanique 347, 845–855.
[64] Patera, A.T., 1984. A spectral element method for fluid dynamics: laminar
flow in a channel expansion. Journal of Computational Physics 54, 468–488.
[65] Raissi, M., Perdikaris, P., Karniadakis, G.E., 2019. Physics-informed neural
networks: A deep learning framework for solving forward and inverse
problems involving nonlinear partial differential equations. Journal of
Computational Physics 378, 686–707.
[66] Raissi, M., Yazdani, A., Karniadakis, G.E., 2020. Hidden fluid mechanics:
Learning velocity and pressure fields from flow visualizations. Science 367,
1026–1030.
[67] Ranade, R., Hill, C., Pathak, J., 2021. DiscretizationNet: A machine-learning
based solver for Navier–Stokes equations using finite volume discretization.
Computer Methods in Applied Mechanics and Engineering 378, 113722.
[68] Rao, C., Sun, H., Liu, Y., 2021. Physics-informed deep learning for com-
putational elastodynamics without labeled data. Journal of Engineering
Mechanics 147, 04021043.
[69] Saha, S., Gan, Z., Cheng, L., Gao, J., Kafka, O.L., Xie, X., Li, H., Tajdari,
M., Kim, H.A., Liu, W.K., 2021. Hierarchical deep learning neural network
(HiDeNN): An artificial intelligence (AI) framework for computational science
and engineering. Computer Methods in Applied Mechanics and Engineering
373, 113452.
[70] Samaniego, E., Anitescu, C., Goswami, S., Nguyen-Thanh, V.M., Guo, H.,
Hamdia, K., Zhuang, X., Rabczuk, T., 2020. An energy approach to the
solution of partial differential equations in computational mechanics via
machine learning: Concepts, implementation and applications. Computer
Methods in Applied Mechanics and Engineering 362, 112790.
[71] Sanchez-Gonzalez, A., Godwin, J., Pfaff, T., Ying, R., Leskovec, J.,
Battaglia, P., 2020. Learning to simulate complex physics with graph
networks, in: International conference on machine learning, PMLR. pp.
8459–8468.
[72] Schmidt, M., Lipson, H., 2009. Distilling free-form natural laws from
experimental data. Science 324, 81–85.
[73] Shukla, K., Jagtap, A.D., Karniadakis, G.E., 2021. Parallel physics-informed
neural networks via domain decomposition. Journal of Computational
Physics 447, 110683.
[74] Tancik, M., Srinivasan, P., Mildenhall, B., Fridovich-Keil, S., Raghavan, N.,
Singhal, U., Ramamoorthi, R., Barron, J., Ng, R., 2020. Fourier features
let networks learn high frequency functions in low dimensional domains.
Advances in Neural Information Processing Systems 33, 7537–7547.
[75] Taneja, K., He, X., He, Q., Chen, J., 2023. A multi-resolution physics-
informed recurrent neural network: Formulation and application to muscu-
loskeletal systems. arXiv preprint arXiv:2305.16593.
[76] Tartakovsky, A.M., Marrero, C.O., Perdikaris, P., Tartakovsky, G.D.,
Barajas-Solano, D., 2020. Physics-informed deep neural networks for learn-
ing parameters and constitutive relationships in subsurface flow problems.
Water Resources Research 56, e2019WR026731.
[77] Wang, S., Teng, Y., Perdikaris, P., 2021a. Understanding and mitigating
gradient flow pathologies in physics-informed neural networks. SIAM Journal
on Scientific Computing 43, A3055–A3081.
[78] Wang, S., Wang, H., Perdikaris, P., 2021b. On the eigenvector bias of
Fourier feature networks: From regression to solving multi-scale PDEs with
physics-informed neural networks. Computer Methods in Applied Mechanics
and Engineering 384, 113938.
[79] Xue, T., Liao, S., Gan, Z., Park, C., Xie, X., Liu, W.K., Cao, J., 2023.
JAX-FEM: A differentiable GPU-accelerated 3D finite element solver for au-
tomatic inverse design and mechanistic data science. Computer Physics
Communications, 108802.
[80] Yin, M., Zhang, E., Yu, Y., Karniadakis, G.E., 2022. Interfacing finite ele-
ments with deep neural operators for fast multiscale modeling of mechanics
problems. Computer Methods in Applied Mechanics and Engineering 402,
115027.
[81] Yu, B., et al., 2018. The deep Ritz method: a deep learning-based numerical
algorithm for solving variational problems. Communications in Mathematics
and Statistics 6, 1–12.
[82] Zhang, R., Liu, Y., Sun, H., 2020. Physics-informed multi-LSTM networks
for metamodeling of nonlinear structures. Computer Methods in Applied
Mechanics and Engineering 369, 113226.