
Version of Record: https://www.sciencedirect.com/science/article/pii/S2214860421006011

Geometry-Agnostic Data-Driven Thermal Modeling of Additive Manufacturing Processes using Graph Neural Networks

Mojtaba Mozaffar, Shuheng Liao, Hui Lin, Kornel Ehmann, Jian Cao1

Department of Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA

Abstract

Additive manufacturing (AM) is commonly used to produce builds with complex geometries.

Despite recent advancements in data-driven modeling of AM processes, the generalizability of

such models across a wide range of geometries has remained a challenge. Here, a graph-based

representation using neural networks is proposed to capture spatiotemporal dependencies of

thermal responses in AM processes. Our results, tested on the Directed Energy Deposition process, indicate that our deep learning architecture accurately predicts long thermal histories for geometries unseen during the training process, offering a viable alternative to expensive computational mechanics or experimental solutions.

Keywords: Additive Manufacturing, Machine Learning, Graph Neural Network

1 Corresponding author: [email protected] (J. Cao)

© 2021 published by Elsevier. This manuscript is made available under the Elsevier user license
https://www.elsevier.com/open-access/userlicense/1.0/
1. Introduction

Additive manufacturing (AM) processes are increasingly utilized in the aerospace, medical, and automotive industries due to their superior capability to produce high-value-added parts with complex geometrical features and limited material consumption [1, 2].

Despite AM’s immense potential, its widespread adoption is hampered by challenges such as

quality variability and process inefficiencies. For example, Glerum et al. [3] show that various

material properties can result from the same set of process parameters. Therefore, much state-of-the-art research aims to establish a direct mapping between AM process parameters and

resulting part properties [4]. The thermal profile during AM processing, resulting from process

parameters, is a key signature connecting process parameters to defects (e.g., lack of fusion,

distortion, hot cracking), melt pool characteristics, and consequently build properties. However,

mapping the thermal profile and the process parameters based on experimental data is not just

expensive, due to the high cost of material and equipment, but more importantly, incomplete due

to the infeasibility of measuring experimental thermal profiles at every location of a part. Thus,

thermal profile prediction in AM processes is essential in meeting the ever-increasing requirements

for AM part quality and reliability.

Physics-based modeling has been the dominant approach to capture the thermal behavior of

AM processes. High-fidelity computational models such as the finite element method (FEM) and

computational fluid dynamics (CFD) offer high accuracy; however, they require massive

computational resources and time, making them infeasible for most time-sensitive applications.

Therefore, many research attempts investigated alternative approaches to thermal modeling in AM

processes. Empirical solutions based on experimental data are proposed to model melt pool

temperature for real-time prediction and control systems [5, 6]. Yet, they offer low accuracy as

they over-rely on limited experimental data and fail to capture interconnected dependencies in

process parameters. Alternatively, many analytical and semi-analytical solutions are proposed to

offer a trade-off between the computation cost and the accuracy of high-fidelity models. While

early versions of such analytical solutions were only applicable to single-track scenarios, recent

publications extend their capability to multi-track and multi-layer cases [7-9]. Despite recent

progress in the field, analytical solutions deviate from realistic responses even in moderately

complex geometries as they include numerous simplifying assumptions.

Recently, data-driven modeling has shown the potential to contribute to the design, discovery,

diagnosis, and optimization of advanced manufacturing processes [10, 11]. In our previous work

[12], we have demonstrated that data-driven methods for thermal profile prediction in AM offer a

viable alternative to physics-based models. We developed a recurrent neural network (RNN) based

architecture to predict the thermal histories of arbitrary points in AM builds. We have shown that

our RNN architecture predicts the behavior of samples with similar classes of geometries to the

ones in the training database with a scaled mean-squared error of 3e-5; however, the error can

significantly increase for unseen classes of geometries. A surrogate model was proposed

by Roy et al. [13] to achieve a real-time thermal history prediction. A set of input features (e.g.,

distance from the heat source and cooling surface) is selected from the geometry representations using GCode to reduce the computational demands. Their method can be generalized to different

sizes of similar geometries, various process parameters, and different materials achieving a

prediction accuracy of 95%. Two machine learning-based models using extreme gradient boosting

(XGBoost) and long short-term memory (LSTM) are proposed by Zhang et al. [14] to predict the

thermal histories using six input variables, including laser power, scan speed, layer index, time

index, average height, and width. Models are trained and validated using real-time experimental

measurements taken by IR thermal cameras under setups with varying process parameters. The

authors reported a best runtime of 0.34 s for XGBoost, which is small enough for real-time

prediction of data captured by IR cameras. A framework combining an RNN and a deep neural

network (DNN) is developed by Ren et al. [15] to establish a mapping between the thermal history

distributions and laser scanning patterns in the laser directed deposition process. The RNN-DNN

model, trained with numerical simulation results analyzed by finite element methods as the ground

truth, achieves accuracies higher than 95%, but it can only predict thermal behaviors for single-layer cases rather than multilayer or complex geometries, which severely limits the applicability of the proposed algorithm. In the study by Haghighi et al. [16], physics-based and data-driven

approaches are combined to characterize the filament bonding and the porosity distribution in an

extrusion-based AM process for PLA (a thermoplastic polyester material). An analytical heat

transfer model was applied for thermal profile characterization and an artificial neural network

was adopted for filament deformation characterization. Their hybrid model achieved an average

accuracy of 95% and 94% in predicting inter-layer and intra-layer bonding, respectively. In

summary, however, the aforementioned data-driven approaches demonstrate their results only for

simple geometries, such as thin walls and cubic structures.

As one can see, despite major achievements in data-driven modeling, state-of-the-art

approaches fall short on the key issue of generalization across unseen geometries, which is crucial

for AM modeling as AM is mostly used for producing one-of-a-kind complex geometries. The

objective of this work is to introduce a novel physics-aware data-driven thermal model for AM

processes that can benefit from the high predictive power and computational efficiency of an

artificial intelligence-empowered model while drastically improving its ability to generalize across

challenging geometries. Our approach captures the intricacies of the physics through graph

modeling, which provides a flexible representation of complex unstructured geometries.

Additionally, the approach follows the local contributions of each node to other nodes within an

element (in our case, a hexahedron connecting 8 nodes) and provides fundamentally similar

calculation pathways to primary physics-based approaches, i.e., FEM. We demonstrate our

approach on the Directed Energy Deposition (DED) process. DED is one of seven major AM

families [17] where metal powders are blown into a melt pool generated from a laser or electron

beam. Note that although the methodology is demonstrated for DED, it can be easily extended

across many element-based predictive modeling approaches in applied physics and engineering.

In what follows, first we introduce the key elements of our methodology in Section 2, including

our proposed architecture (Section 2.1) and database generation approach (Section 2.2). The results

are discussed in Section 3 and we conclude this paper with our final remarks and future research

directions in Section 4.

2. Graph-Based Modeling Methodology

Graph neural networks (GNNs) are deep learning architectures that capture dependencies in

unstructured graphs via message passing between neighboring nodes. Due to their flexibility,

GNNs have rapidly gained popularity in social sciences [18], chemistry [19], and image processing

[20]. The two major categories of GNN include spectral-based and spatial-based methods.

Spectral-based methods (e.g., GCN [21], AGCN [22], CHEBNET [23]) define a convolution

operation based on the Laplacian eigen-basis, which makes this class most applicable to problems

with a static graph representation [24]. In contrast, spatial-based GNNs define the convolution

operation on the graph by aggregating the information from neighboring nodes and edges. Many

researchers have successfully extracted spatial and temporal dependencies in traffic flow

forecasting by considering traffic networks as graphs and integrating a GNN with an RNN (e.g.,

ST-GCN [25], TGC-LSTM [26], GGRU [27]). Spatial-based GNN formulations consist of three

steps: (i) message creation and propagation at the starting nodes, (ii) message aggregation at the

target nodes, and (iii) updating calculations based on the aggregated message. Here, two nodes are

considered neighbors if they belong to the same element. A schematic of our definition of

neighboring nodes and the three message-passing steps are demonstrated in Figure 1.
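The three message-passing steps can be sketched in plain Python (a toy illustration of our own, using a sum aggregator and an averaging update for simplicity; the learned formulation actually used in this work is given by Eq. (1) in Section 2.1.1):

```python
import numpy as np

def message_passing_step(node_feats, edges):
    """One generic spatial message-passing step on a small graph.

    node_feats: (num_nodes, feat_dim) array of node features
    edges: list of (source, target) pairs; in our setting two mesh
           nodes are neighbors if they belong to the same element
    """
    agg = np.zeros_like(node_feats)
    # (i) message creation at the starting nodes: here simply the
    #     source node's features (a real model applies a learned map)
    # (ii) message aggregation at the target nodes (sum aggregator)
    for src, dst in edges:
        agg[dst] += node_feats[src]
    # (iii) update of the target node from the aggregated message
    #       (a real model uses a learned update; we average here)
    return 0.5 * (node_feats + agg)

# toy graph: 3 nodes sharing one element, connected both ways
feats = np.array([[1.0], [2.0], [3.0]])
edges = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1)]
updated = message_passing_step(feats, edges)
```

Each node ends up with the average of its own features and the summed messages from its element neighbors, mirroring the construct-aggregate-update pattern of Figure 1.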

Figure 1. Schematic of a target node and its neighboring nodes within an element. Message passing includes three fundamental steps: (i) message construction, (ii) message aggregation, and (iii) target update.

In this work, we hypothesize that the spatiotemporal dependencies in AM processes, which are

traditionally modeled via physics-based simulations, such as FEM, can be alternatively captured

using graph neural networks. This hypothesis stems from the resemblance between finite element

matrix assembly operations (which combine the local contributions of element interconnectivity)

and graph neural network message-passing formulations. To test this hypothesis, a database of

simulation-based thermal responses is developed while building a variety of industrial-grade AM

parts. We investigate two GNN architectures to reproduce AM responses. The schematics of the

two architectures are depicted in Figure 2 and their details are elaborated upon in the following

subsections. Additionally, detailed block diagrams of the networks and a schematic of the training process along with hyperparameters are provided in Appendix A - Figures 6 and 7.

Figure 2. Schematics of the two architectures for spatiotemporal prediction of AM thermal responses: (A)

The GNN architecture predicts the single-time step update in each training instance given the node and

element features at the time-step; (B) The RGNN architecture predicts and trains multi-time step

interactions where at each time step the network receives a temporal nodal-based encoded representation,

a non-temporal element-based representation, and the hidden state of the previous stacked GRU cell and

outputs the thermal distribution over the geometry. Both architectures can be recursively evaluated to

produce thermal outputs of arbitrary length.

2.1. Proposed Network Architectures

2.1.1. GNN Architecture

As the first option, we devised a GNN-based architecture in which the network receives the

thermal responses in the previous time step as well as process parameters at that time including

nodal and element-based features (detailed in Section 2.2), and outputs the thermal responses in

the current time step, as depicted in Figure 2A. In essence, the network is responsible for

calculating thermal fluxes at each time step to properly update the thermal fields. In this network,

each training or evaluation step predicts one forward time increment; however, by recursively

providing the network output as the next time increment’s input we can produce time-series

responses of arbitrary length. For the GNN cell formulations in Figure 2A, we found that

DeeperGCN [28] empirically leads to better results compared to existing alternatives, though it is

noteworthy that in our experience the difference in performance has been insignificant. In the

DeeperGCN formulation, ℎ𝑣𝑙 (node features for the 𝑙 th layer) will be passed to a layer-

normalization layer, an activation layer with a rectified linear unit, and a drop-out layer for feature

preprocessing. Subsequently, a spatial convolution is performed with the preprocessed features to

update the features for the next layer. The spatial convolution operation is defined as [28]:

h_v^(l+1) = MLP( h_v^(l) + AGG({ ReLU(h_u^(l) + e_vu^(l)) + ε : u ∈ 𝒩(v) }) ) + h_v^(l)     (1)

where 𝒩(v) is the set of neighboring nodes of v, h_u^(l) represents the neighbor node features, and e_vu^(l) represents the features of edges connecting v and u. ReLU is the rectified linear unit activation function and AGG is the aggregation function, which is the Softmax function in the current work. MLP is a multi-layer perceptron and ε is a small positive constant to ensure numerical stability. As shown in Eq. (1), the encoded neighboring node features h_u^(l) and edge features e_vu^(l) are first added

and fed to an activation function to construct the message from each neighboring node. The

message is aggregated and then combined with nodal features in an 𝑀𝐿𝑃 to calculate an update.

Finally, the updates are added to the previous node to construct the output of the GNN layer. The

last step provides the residual connection which not only facilitates the network training process,

but also improves the interpretability of the models because it resembles thermal fluxes.
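As an illustration, the Eq. (1) update can be sketched in NumPy (a minimal sketch, not the trained implementation: the softmax aggregation here is applied per feature over the neighbor messages, and `mlp` is a placeholder for the learned perceptron):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def deeper_gcn_layer(h, e, neighbors, mlp, eps=1e-7):
    """Sketch of the Eq. (1) update with a softmax aggregator.

    h: (num_nodes, d) node features; e[(v, u)]: (d,) edge features;
    neighbors[v]: list of neighbor indices of node v; mlp: callable
    standing in for the learned multi-layer perceptron.
    """
    h_new = np.empty_like(h)
    for v in range(h.shape[0]):
        # messages ReLU(h_u + e_vu) + eps from each neighbor u
        msgs = np.stack([relu(h[u] + e[(v, u)]) + eps for u in neighbors[v]])
        # softmax-weighted aggregation over the neighbor axis
        w = np.exp(msgs - msgs.max(axis=0))
        agg = (w / w.sum(axis=0) * msgs).sum(axis=0)
        # residual connection: the MLP output acts like a flux update
        h_new[v] = mlp(h[v] + agg) + h[v]
    return h_new

# toy two-node graph with zero edge features and an identity "MLP"
h = np.array([[1.0], [2.0]])
e = {(0, 1): np.array([0.0]), (1, 0): np.array([0.0])}
neighbors = {0: [1], 1: [0]}
identity_mlp = lambda x: x  # stand-in for the learned MLP
h_next = deeper_gcn_layer(h, e, neighbors, identity_mlp)
```

The final `+ h[v]` is the residual connection discussed above, which makes the learned update resemble a thermal flux added to the previous state.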

2.1.2. RGNN Architecture

As an alternative architecture, to potentially better model long temporal dependencies of the

thermal system, we developed a Recurrent Graph Neural Network (RGNN) architecture as

depicted in Figure 2B. In this architecture, the network takes the time-series nodal features and

edge features as the inputs and directly generates time-series thermal responses over an arbitrary

number of time steps. At each time step, a GNN cell (like the cell used in Section 2.1.1) is used to

capture local interactions between nodes and elements using their corresponding features. By

concatenating the candidate updates generated by the GNN cells with shared parameters, a time-

series candidate update (𝑐𝑡 ) is assembled, which feeds into a stacked RNN layer. For the RNN

formulation, the Gated Recurrent Unit (GRU) [29] is adopted in this work as follows:

r_t = sig(W_r · [s_{t−1}, c_t] + b_r)     (2)

z_t = sig(W_z · [s_{t−1}, c_t] + b_z)     (3)

ŝ_t = tanh(W · [r_t × s_{t−1}, c_t] + b)     (4)

T_t = s_t = (1 − z_t) × s_{t−1} + z_t × ŝ_t     (5)

where the GRU cell takes a candidate update (c_t) and the hidden state s_{t−1} as the input, and outputs the temperature field at the current time, T_t. r_t and z_t represent the reset and update gates, respectively, where each uses a separate set of weights (W), biases (b), and a sigmoid

activation function. The output (𝑇𝑡 ) is used as the hidden state for the next time step (𝑠𝑡 ) and the

hidden state is initialized with the initial temperature field 𝑇0 . Further details on both networks can

be found in Appendix A as well as our publicly available code2.
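Eqs. (2)-(5) can be sketched directly in NumPy (a minimal single-step sketch; the `gru_step` helper and the weight shapes are our own illustration, with the bracket notation [a, b] implemented as concatenation):

```python
import numpy as np

def sig(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(c_t, s_prev, Wr, br, Wz, bz, W, b):
    """One GRU step following Eqs. (2)-(5); weights are assumed to
    come from training. Returns the new state s_t, i.e. T_t."""
    x = np.concatenate([s_prev, c_t])
    r = sig(Wr @ x + br)                 # reset gate, Eq. (2)
    z = sig(Wz @ x + bz)                 # update gate, Eq. (3)
    x_hat = np.concatenate([r * s_prev, c_t])
    s_hat = np.tanh(W @ x_hat + b)       # candidate state, Eq. (4)
    return (1.0 - z) * s_prev + z * s_hat  # output T_t = s_t, Eq. (5)

# toy dimensions: hidden state of size 2, candidate update of size 2
n, m = 2, 2
Wr = np.zeros((n, n + m)); br = np.zeros(n)
Wz = np.zeros((n, n + m)); bz = np.zeros(n)
W = np.zeros((n, n + m)); b = np.zeros(n)
s = gru_step(np.zeros(m), np.ones(n), Wr, br, Wz, bz, W, b)
```

With all-zero weights both gates evaluate to 0.5 and the candidate state to 0, so the new state is simply half the previous one, which makes the gating mechanics easy to verify by hand.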

2.2. AM Database Development

We developed a database based on high-fidelity finite element simulations, which gives us access to the thermal histories of all geometric points. An explicit in-house AM

simulation package, GAMMA [30, 31], is used to solve the transient heat transfer equations while

we gradually activate elements as the toolpath passes over a predefined mesh. Heat conduction,

convection, radiation, and external heat flux as the result of the laser beam, are modeled in our

simulations, while stainless steel 316L is used as the material. Key process and material properties

are reported in Table 1. In principle, one can add variations to the process and material properties

and train the network to generalize over a range of such parameters (e.g., as in [12]). However,

2 https://github.com/AMPL-NU/Graph_AM_Modeling

here, the parameters provided in Table 1 are kept constant to focus on the capability of the model

to predict challenging geometry and toolpath-related features.

Table 1. Process and material properties for the generated database. Material properties are obtained from

[32].

Material Properties (SS316L)                      Process and Environmental Properties
Density                  8000 kg/m³               Ambient Temperature   300 K
Heat Capacity            0.5 J/g·K                Laser Power           1 kW
Latent Heat              272.5 J/g                Laser Diameter        2 mm
Conductivity Coefficient 21.4 W/m·K               Report time step      0.1 s
Solidus/Liquidus         1648.15/1673.15 K        Scan speed            10 mm/s

To ensure that the proposed models are trained and tested on diverse geometries, we selected

55 different industrial-grade geometries from the ABC database, where 45 of them are used for

training and 10 geometries are randomly separated for testing. Four samples of the geometries are

shown in Figure 3, while the complete set of geometries is provided in Appendix B - Figure 8. The

selected geometries vary in their size, the number of layers, shapes, and geometric features (e.g.,

wall thickness) to capture a wide range of AM builds. CAD geometries are scaled and placed on a

substrate of 20 mm height and 100 mm diameter to make them suitable for sample DED

manufacturing. The geometries are meshed using ABAQUS with 8-node hexahedron elements,

where each element has an approximate edge size of 5 mm for the substrate and 1 mm for the part,

resulting in about 10k - 100k elements in the meshes. To incentivize the network to generalize

across different toolpaths, a Python script is developed to automatically generate layer-by-layer

toolpaths for arbitrary geometries, where the contour patterns are randomly selected from 9

toolpath strategies varying in their motion directions, patterns (zigzag versus spiral), and starting

positions. Due to computational restrictions, we limit this study to geometries that can be deposited

on a substrate with a 100 mm diameter. Statistics on the geometric feature sizes in the database

are provided in Appendix B - Figure 9.
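As a toy illustration of such layer-by-layer toolpath generation, one of the zigzag strategies could look like the following (our own simplified generator, not the paper's script; the function name and parameters are assumptions):

```python
import numpy as np

def zigzag_layer(width, height, hatch, z, flip=False):
    """Toy zigzag infill for one rectangular layer: back-and-forth
    passes spaced by `hatch`, optionally flipped in direction."""
    ys = np.arange(0.0, height + 1e-9, hatch)
    pts = []
    for i, y in enumerate(ys):
        # alternate the scan direction on every pass
        xs = (0.0, width) if i % 2 == 0 else (width, 0.0)
        for x in xs:
            pts.append((x, y, z))
    if flip:
        pts.reverse()  # one way to vary the starting position
    return pts

path = zigzag_layer(width=10.0, height=2.0, hatch=1.0, z=0.0)
```

Varying the hatch direction, the zigzag-versus-spiral pattern, and the starting position, as described above, yields a family of such strategies to randomize over.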

Figure 3. Sample AM builds adapted from the ABC online repository [33] for industrial-grade

geometries. Geometries are oriented and placed on substrate plates to construct the AM simulation

database. The three geometries within the blue border are in the training dataset while the geometry

within the red border is used as one of the test samples. All geometries are provided in the Appendix B -

Figure 8.

The database is created using graph representations, where the graph topology is constructed

based on the meshed geometry, i.e., every node of the mesh is defined as a node in the graph, and

the edges are defined according to the connectivity matrix that indicates which nodes are within a

common element. Each node of the graph is embedded with three features: (i) birth flag (indicating

whether a node is active at that time step), (ii) layer height, and (iii) laser distance feature defined

as the inverse of the distance between each node and the laser beam at each time step. In addition to the connectivity matrix, we consider an element-based distance feature indicating the distances

between any two nodes of an element. Examples of the input features are provided in Appendix A

– Figures 6 and 7. The three nodal features and one element-based feature are selected to provide the minimal information necessary about the boundary conditions and fluxes to the data-driven

model. The laser distance determines the external flux input into the system, the layer height

indicates the distance from the boundary condition (the bottom of the substrate), and finally the

birth flag and edge distance determine the neighboring interactions between nodes. Only minimal

feature engineering is applied (e.g., inverting the laser distance), as we expect the data-driven

network to be able to automatically extract hidden correlations between input features.
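A minimal sketch of this graph construction, assuming a list of 8-node hexahedral element index tuples (the helper name and signature are ours, not from the released code):

```python
import numpy as np
from itertools import combinations

def build_graph_features(node_xyz, elements, laser_xyz, active):
    """Sketch: graph topology and features from a hex mesh.

    node_xyz: (N, 3) node coordinates; elements: list of 8-node
    index tuples; laser_xyz: (3,) beam position; active: (N,) bool
    birth flags. Feature choices follow the paper's description.
    """
    # edges: every node pair sharing an element is a neighbor pair
    edges = set()
    for elem in elements:
        for u, v in combinations(elem, 2):
            edges.add((u, v))
            edges.add((v, u))
    edge_list = sorted(edges)
    # element-based feature: distance between the two end nodes
    edge_dist = np.array([np.linalg.norm(node_xyz[u] - node_xyz[v])
                          for u, v in edge_list])
    # nodal features: birth flag, layer height (z), inverse laser distance
    laser_feat = 1.0 / (np.linalg.norm(node_xyz - laser_xyz, axis=1) + 1e-6)
    nodal = np.column_stack([active.astype(float), node_xyz[:, 2], laser_feat])
    return edge_list, edge_dist, nodal

# toy example: one 8-node unit-cube element, laser 10 mm above the origin
node_xyz = np.array([[x, y, z] for z in (0.0, 1.0)
                     for y in (0.0, 1.0) for x in (0.0, 1.0)])
elements = [tuple(range(8))]
laser_xyz = np.array([0.0, 0.0, 10.0])
active = np.ones(8, dtype=bool)
edge_list, edge_dist, nodal = build_graph_features(node_xyz, elements,
                                                   laser_xyz, active)
```

A single hexahedron yields 8 × 7 = 56 directed edges, and the laser distance feature decays toward zero for nodes far from the beam, encoding the external flux as described above.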

As element connectivity and features are constant in time, we implement them as static inputs to

the model across all time steps, whereas nodal features are provided as time-series inputs (see

Figure 2B). The two networks require different sampling methods. For the GNN architecture, we

randomly sample pairs of inputs (process features and thermal responses at time 𝑛) and outputs

(thermal responses at time 𝑛 + 1) at different times of the simulations. In contrast, the RGNN

model receives time-series inputs and outputs of 50 consequent time steps (equivalent to 25 cm

of deposition), where the starting times are sampled from the length of the simulations. To provide

a fair comparison between the performances of the two networks, they train over the same amount

of data collected from the same simulations split into two sets of training and test data. All inputs

and outputs of the models are normalized to values between 0 and 1 to assist the optimization process. While cross-validation methods can often increase the accuracy of the final model, we design this analysis based on two completely separate training and test datasets because it allows

for a better analysis of the generalization capabilities of the model. Additionally, note that the

computational cost of cross validation during our training is prohibitively high.
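The two sampling schemes and the normalization described above can be sketched as follows (our own minimal illustration; the function names and the toy thermal history are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gnn_pair(thermal, rng):
    """One GNN training instance: state at step n -> state at n + 1."""
    n = int(rng.integers(0, thermal.shape[0] - 1))
    return thermal[n], thermal[n + 1]

def sample_rgnn_window(thermal, rng, span=50):
    """One RGNN training instance: `span` consecutive time steps."""
    start = int(rng.integers(0, thermal.shape[0] - span + 1))
    return thermal[start:start + span]

def minmax_normalize(x, lo, hi):
    """Scale inputs/outputs to [0, 1] using database-wide bounds."""
    return (x - lo) / (hi - lo)

# toy monotone thermal history: 100 report steps, one node
thermal = np.linspace(300.0, 1673.15, 100)[:, None]
x_n, y_n = sample_gnn_pair(thermal, rng)
window = sample_rgnn_window(thermal, rng)
scaled = minmax_normalize(thermal, 300.0, 1673.15)
```

Both samplers draw from the same simulated histories, which is what makes the comparison between the single-step and 50-step training regimes fair.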

There are several scientific and practical merits in our data-driven solution, even if it is trained

on a synthetic database. First, once trained, the presented data-driven solution offers orders of

magnitude faster evaluation compared to FEM simulations (see the last paragraph in Section 3).

Therefore, the model has practical uses in many time-sensitive applications. Second, we believe

investigating the capability of the model with synthesized data is an essential scientific step before

adopting an experimental database because it allows us to isolate, study, and resolve important

computational problems. Third, in the most realistic scenarios, it is conceivable that the ultimate

solution for data-driven modeling involves both experimental and simulation data as they can

complement each other’s features and augment the size of the database. Therefore, the capability

of models to work with simulation data is an important topic of research.

3. Results and Discussion

We implemented the two architectures using the PyTorch deep learning library [34] and the PyTorch Geometric package [35] in Python. The models are trained on their corresponding training sets,

while the test sets are only deployed for evaluation without contributing to backpropagation. Each

training epoch consists of a complete pass on training samples in which we update model

parameters and simultaneously calculate a mean squared error (MSE) of the predictions versus

database ideal responses. After each epoch, all test samples are evaluated and their MSE errors are

stored. We stop each training process when no improvements in the MSE of the test samples are

seen.

We also introduce a baseline solution, which is the same as the developed GNN architecture,

only without the previous temperature as the input. The reason behind this choice is two-fold. First,

training a GNN model without access to its past allows us to evaluate the time-dependent nature

of the problem. If a data-driven model without temporal input can lead to a reasonable

performance, the modeling approach can be greatly simplified. Second, a baseline allows for a

more intuitive assessment of the proposed models compared to raw error metrics.

Figure 4. Training and evaluation results for the baseline, GNN and RGNN formulations: (A) The

evolution of the train and test losses over training epochs is normalized per node per time step; (B) An

example simulation and the predicted thermal history at three points with the location of points depicted

on the top right and the comparison of histories between baseline, GNN, RGNN and the ground truth on

the lower right. Note that 𝑡 = 0 refers to the starting time of the 50 time-step test sample and not the

entire build.

Our results, depicted in Figure 4, indicate that the baseline solution leads to an MSE of 2.74e-4 on the training set and an MSE of 2.86e-4 on the test set. The GNN architecture rapidly captures correlations in the data and reaches an error of 4.88e-6 MSE on the training set and 4.36e-6 MSE on the test set in just 40 training epochs. The RGNN architecture requires 5X more epochs to stabilize on the test set MSE, with an error of 2.75e-5 MSE on the training set and 2.47e-5 MSE

on the test set. However, note that the RGNN model attempts to predict 50 steps into the future—

a drastically more difficult task compared to the single time step prediction; thus, a higher error is

reasonable and expected for the RGNN model. In the presented MSE metric, we assign the same

weight to all nodes, even though the behavior of nodes far from the laser can be trivial. With this in mind,

one can devise alternative metrics to put more focus on nodes with more challenging behavior. As

an example of such a metric, the evolution of training and test losses only on 30% of the nodes

with the highest temperature is provided in Appendix C - Figure 10.
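Such a focused metric could, for instance, restrict the error to the hottest fraction of nodes (a sketch of one possible variant, not necessarily the exact metric used for Appendix C):

```python
import numpy as np

def hot_node_mse(pred, truth, fraction=0.3):
    """MSE restricted to the `fraction` of nodes with the highest
    ground-truth temperature, de-emphasizing trivially cold nodes."""
    k = max(1, int(fraction * truth.shape[0]))
    hot = np.argsort(truth)[-k:]  # indices of the hottest nodes
    return float(np.mean((pred[hot] - truth[hot]) ** 2))

# toy check: 10 nodes, predictions uniformly off by 1
truth = np.arange(10.0)
pred = truth + 1.0
err = hot_node_mse(pred, truth)
```

By construction, nodes far from the laser, whose behavior is nearly constant, no longer dilute the reported error.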

We further demonstrate the output of the developed data-driven models for a sample case in the

test set (see Figure 4B). The output of each trained model is compared to the ground truth

simulation results for three randomly selected points on the geometry surface for 50 time-steps.

This is possible because for all models, although being trained on a fixed number of time steps (1

and 50 respectively), can be recursively evaluated for any arbitrary length of time. The baseline

results in a 4.49𝑒 −4 MSE, the GNN model in 3.57𝑒 −5 MSE, and the RGNN model in 5.32𝑒 −5

MSE averaged over all nodes of the simulation. Qualitatively, the results show a good agreement

between both GNN and RGNN models and the ground truth; however, the baseline solution falls

short of accurately capturing the thermal behavior. Thus, the temporal aspect is an essential component of data-driven thermal modeling. The error values are summarized in Table 2.
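The recursive unrolling used for these demonstration cases can be sketched as follows (`toy_model` is a placeholder standing in for a trained one-step predictor, not the actual network):

```python
import numpy as np

def unroll(model, T0, features, n_steps):
    """Recursively evaluate a one-step predictor: each output becomes
    the thermal input for the next step, so a model trained on a fixed
    span can produce histories of arbitrary length."""
    history = [T0]
    T = T0
    for t in range(n_steps):
        T = model(T, features[t])  # one forward time increment
        history.append(T)
    return np.stack(history)

# toy one-step "model": adds the (fake) flux feature to the field
toy_model = lambda T, f: T + f
T_hist = unroll(toy_model, np.zeros(3), np.ones((5, 3)), 5)
```

The same loop applies to both architectures; only the span of the model's own training (1 versus 50 steps) differs.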

Table 2. Summary of MSE results for the baseline, GNN, and RGNN models on the training and test sets and the demonstration case in Figure 4B. Additionally, the errors are reported as percentages of the baseline performance in parentheses. Note that for the training and test sets, the reported MSE corresponds to each model's training span, i.e., 50 time steps for the RGNN and one time step for the baseline and GNN. The demonstration case results are provided for 50 time steps for all three models by recursively unrolling the predictions.

             Training set MSE        Test set MSE            Demonstration case MSE in
             (% of baseline)         (% of baseline)         Figure 4B (% of baseline)
Baseline     2.74e-4 (100%)          2.86e-4 (100%)          4.49e-4 (100%)
GNN          4.88e-6 (1.78%)         4.36e-6 (1.52%)         3.57e-5 (7.95%)
RGNN         2.75e-5 (10.0%)         2.47e-5 (8.63%)         5.32e-5 (11.8%)

To further investigate the stability and capability of the model for long simulations, we evaluate

the models on 55 samples (45 for training and 10 for test sets) over 1,000 time steps, which is

1,000X and 20X the training span of the GNN and RGNN models, respectively (see Figure 5). Often, such extreme extrapolations fail in machine learning; however, we see that both models are capable of reasonable predictions over long periods of time and their errors, although rising over time, remain stable. In Figure 5A-5C, we demonstrate a sample case study for both GNN and

RGNN models over 1,000 time steps including their thermal contours as well as root mean

squared errors (RMSE) for all material points in each data-driven simulation. Our results indicate

that the RGNN model significantly outperforms the GNN model, showing a superior capability to

capture long interactions. A similar conclusion can be drawn by observing the RMSE evolution

over all training and test samples, as shown in Figure 5D, where the GNN model results in an RMSE of 1.5e-2 for the training set and 1.44e-2 for the test set, while the RGNN model shows small error propagation with an RMSE of 9.58e-3 on the training set and 9.32e-3 on the test set (see

Table 3 for a summary of the results). Both models result in close performance between the training

sets and test sets, which shows that they can generalize well across completely unseen geometries

for long simulations. A comprehensive view of the results of all test samples over 1,000 time steps

is provided in Appendix D - Figures 11 and 12 for the RGNN and GNN models, respectively.

Figure 5. Evaluation of the trained models’ capability to produce long-term simulations. The evolution of

the thermal field on a sample simulation is depicted for the GNN and RGNN models (A and B). The error

propagation of the sample simulation and all database simulations for both models are shown (C and D).

Table 3. Summary of RMSE results for GNN and RGNN models unrolled over 1,000 time steps.

           Training set RMSE    Test set RMSE    Demonstration case RMSE (Figure 5)
GNN        1.50 × 10⁻²          1.44 × 10⁻²      1.47 × 10⁻²
RGNN       9.58 × 10⁻³          9.32 × 10⁻³      8.53 × 10⁻³
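As an illustration, the unrolled evaluation reported above amounts to an autoregressive rollout in which the model's own prediction is fed back as the next input, with the RMSE over all material points recorded at each step. The sketch below uses a toy stand-in for the trained network (the exponential-cooling `toy_model`, the array shapes, and all names are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def rollout_rmse(model, T0, n_steps, T_ref):
    """Autoregressively unroll `model` from initial temperatures T0 and
    compare each step against reference temperatures T_ref (n_steps, n_nodes)."""
    T = T0
    errors = []
    for step in range(n_steps):
        T = model(T)  # the prediction becomes the next input
        errors.append(np.sqrt(np.mean((T - T_ref[step]) ** 2)))
    return np.array(errors)

# Toy stand-in "model": exponential cooling toward ambient temperature
ambient, rate = 0.0, 0.9
toy_model = lambda T: ambient + rate * (T - ambient)

T0 = np.full(5, 1.0)  # 5 material points, normalized initial temperature
T_ref = np.stack([ambient + rate ** (k + 1) * (T0 - ambient) for k in range(10)])
errs = rollout_rmse(toy_model, T0, 10, T_ref)  # here zero: model matches reference
```

In the paper's setting, error accumulates because each step's input already contains the previous step's prediction error; the per-step RMSE curve is exactly what Figures 5C and 5D track.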

Finally, we discuss the computational costs in the training and deployment process. In our case,

the database creation from FEM simulations takes 2 weeks, including 0.5 weeks of simulation

design and 1.5 weeks of FEM execution time. The training process takes approximately 3 and 6

weeks on an Nvidia RTX 8000 GPU for the GNN and RGNN, respectively. However, once trained, the data-driven model runs far faster than its FEM counterpart. For example, the RGNN model takes on average 20 s to run an entire simulation from the database, whereas the average FEM execution time is 3 hours (a per-run speedup of roughly 540×). This means that a data-driven model can be far more

suitable where there is a need for iterative evaluation, such as optimization, robust design, and

uncertainty quantification.
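To put these costs in perspective, a back-of-the-envelope estimate shows when the up-front investment pays off. The calculation below uses only the timings quoted above; the week-to-hour conversion and the break-even framing are our own illustration:

```python
# One-time costs (hours): 2 weeks of database creation + 6 weeks of RGNN training
HOURS_PER_WEEK = 7 * 24
upfront_h = (2 + 6) * HOURS_PER_WEEK            # 1,344 hours

# Per-simulation cost (hours)
fem_h = 3.0                                      # average FEM run
rgnn_h = 20.0 / 3600.0                           # 20 s surrogate run

saving_per_run_h = fem_h - rgnn_h                # ~2.99 h saved per simulation
break_even_runs = upfront_h / saving_per_run_h   # ~449 simulations
speedup = fem_h * 3600.0 / 20.0                  # 540x per-run speedup
```

Under these assumptions the surrogate amortizes its training cost after a few hundred simulations, which is easily reached in iterative workflows such as optimization or uncertainty quantification.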

4. Conclusions and Future Work

Recent advancements in high-throughput computing combined with the popularity of data-

sharing protocols and cyber-physical systems create a unique opportunity to develop data-driven

models for heterogeneous materials and challenging manufacturing processes, especially AM

processes. Here, we address a key gap in the capability of state-of-the-art data-driven AM models

related to their poor generalizability across geometries. We demonstrate that our proposed RGNN

architecture effectively captures local intricacies of the process through a graph representation and

long-term temporal correlations via a recurrent network structure. This achieves unprecedented generalizability over unseen geometries and maintains it through 1,000 time steps, over 20× its training span.
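As a schematic of this idea (not the paper's exact architecture), a single recurrent graph step can combine neighborhood message passing with a GRU-style hidden-state update. All weights, the mean aggregation, and the tiny three-node graph below are illustrative placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def message_pass(h, edges):
    """Mean-aggregate neighbor states: one round of message passing."""
    n, d = h.shape
    agg = np.zeros_like(h)
    deg = np.zeros(n)
    for src, dst in edges:
        agg[dst] += h[src]
        deg[dst] += 1
    deg[deg == 0] = 1.0  # avoid division by zero for isolated nodes
    return agg / deg[:, None]

def gru_update(h, m, W):
    """GRU-style update of node states h given aggregated messages m."""
    z = sigmoid(m @ W["z"] + h @ W["uz"])          # update gate
    r = sigmoid(m @ W["r"] + h @ W["ur"])          # reset gate
    h_tilde = np.tanh(m @ W["h"] + (r * h) @ W["uh"])
    return (1 - z) * h + z * h_tilde               # blend old and candidate state

d = 4
W = {k: rng.normal(scale=0.1, size=(d, d)) for k in ["z", "uz", "r", "ur", "h", "uh"]}
h = rng.normal(size=(3, d))                # 3 nodes with d-dimensional states
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]   # undirected chain 0-1-2

m = message_pass(h, edges)
h_next = gru_update(h, m, W)
```

The message-passing step captures local spatial interactions on the part's graph, while the gated recurrence carries information across time steps, which is the combination the RGNN exploits for long rollouts.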

Further improvement can be readily achieved by expanding on both key elements of the data-driven model: the database and the network. As the model depends heavily on the size and quality of the

database, an improvement avenue is to broaden the database to different process parameters,

materials, and geometries. Similarly, one can improve the network by deploying larger networks

(as we do not observe overfitting in our models) and training over longer time step periods to

further reduce the errors. While the presented work trains on simulation data, this framework can

be deployed directly on experimental data as high-quality shared repositories for AM processes

are growing. As another future research direction, we believe that our approach opens the path for

new generations of physics-informed data-driven modeling in AM processes with the flexibility

to go beyond thermal responses and also predict other challenging aspects of AM processes, such

as residual stresses and porosities.

5. Acknowledgements

This work was supported by the Vannevar Bush Faculty Fellowship N00014-19-1-2642,

National Institute of Standards and Technology (NIST) – Center for Hierarchical Material Design

(CHiMaD) under grant No.70NANB14H012, and the National Science Foundation (NSF) –

Cyber-Physical Systems (CPS) under grant No.CPS/CMMI-1646592.


Appendix A – Network block diagrams and training details

Figure 6. GNN network architecture, input features, outputs, and key hyperparameters.

Figure 7. RGNN network architecture, input features, outputs, and key hyperparameters.

Appendix B – Database geometries and characteristics

Figure 8. List of geometries used to produce training and test sets.

Figure 9. Database size statistics in XY and Z dimensions.

Appendix C – Alternative metric

Figure 10. The evolution of an alternative train and test metric over training epochs for the baseline, GNN, and RGNN models. The metric is computed as the MSE between the predicted and database temperatures at nodes in the top 30% thermal range at any given time step.

Appendix D – Test-Set results

Figure 11. Results of all test-set geometries with the pretrained RGNN network for 1,000 time steps, including 3D contours at time steps 0, 500, and 1,000 (columns 2-4), a cross-section plot of the prediction error at the location of the laser (column 5), and the error growth over time averaged over all nodes (column 6).

Figure 12. Results of all test-set geometries with the pretrained GNN network for 1,000 time steps, including 3D contours at time steps 0, 500, and 1,000 (columns 2-4), a cross-section plot of the prediction error at the location of the laser (column 5), and the error growth over time averaged over all nodes (column 6).

