
Computers and Chemical Engineering 186 (2024) 108680


A method for the rapid creation of AI driven crystallization process controllers

Conrad Meyer *, Arjun Arora, Stephan Scholl
Technische Universität Braunschweig, Institute for Chemical and Thermal Process Engineering, Langer Kamp 7, 38106 Braunschweig, Germany

ARTICLE INFO

Keywords: Crystallization; Machine learning; Artificial neural network; Process control; Model predictive control

ABSTRACT

The implementation and optimization of crystallization process control can be crucial to achieve quality standards consistently in various industrial applications. To overcome several of the challenges that arise when creating advanced, model-based control approaches, this work introduces a new method for rapid and flexible creation of Artificial Neural Network (ANN) based Model Predictive Controllers (NN-MPC). It includes all procedures and software necessary for users to easily extract data from crystallization processes and automatically train optimized ANN on it. The ANN are then inserted into a simple MPC scheme as a means to predict future system states. The whole process from generating training data to a functional MPC controlling a batch cooling crystallization is demonstrated in a case study on two model substances.

1. Introduction

While crystallization has been used for a long time and is an important unit operation in many production processes across several branches of the process industry, it plays an especially crucial role in the pharmaceutical industry (Braatz, 2002). Almost all active pharmaceutical ingredients are delivered in the form of crystals (Alvarez and Myerson, 2010). As a central part of many production pipelines, crystallization can have a significant impact not only on the final product, but also on the efficiency of other operations further downstream, such as filtration, washing and drying (Nagy et al., 2013; Kim et al., 2005). A suitable process control can be an invaluable tool to guarantee consistent crystal quality regarding crystal size distribution (CSD), purity, morphology, solubility, and other criteria, and has therefore been researched for decades (Gao et al., 2017).

Only in recent years, however, have advanced control techniques become increasingly feasible. This has been mainly attributed to two factors: First, technological advancements are making the use of more advanced, computationally intensive control techniques possible. Second, focused research on Process Analytical Technologies (PAT) has yielded instruments capable of providing more accurate process data (Nagy and Braatz, 2012). For the analysis of the crystalline phase, Focused Beam Reflectance Measurement (FBRM) has been established as a valuable tool to get in-line information on the crystals (Ruf et al., 2000; Griffin et al., 2015; Kutluay et al., 2017). It features a rapidly rotating infrared laser, which shines through the sapphire probe window into the crystallizer. When the beam hits a crystal, the time during which light is backscattered is measured. Based on the number of crystals hit and the time during which the backscattering occurs, a Chord Length Distribution (CLD) is calculated online. While not the same as the CSD, it can still be correlated to it (Heinrich and Ulrich, 2012). Other methods for measuring crystal properties include image analysis (Borsos et al., 2017), ultrasonic attenuation (Li et al., 2004) and Raman spectroscopy for polymorph detection (Qu et al., 2009). In the continuous phase, measurements of the concentration are crucial to calculate the supersaturation, the driving force for nucleation and crystal growth. For this purpose, ATR-FTIR spectroscopy has been established as a popular tool to provide accurate measurements even in the presence of crystals (Lewiner et al., 2001). Alternative methods include NIR and mid-IR spectroscopy (Simone et al., 2014).

Traditionally, most pharmaceutical crystallizations have been implemented as recipe-based batch processes, and control schemes are developed through trial-and-error experimentation (Yu et al., 2007; Fujiwara et al., 2005). Such model-free control approaches include static cooling ramps (Mayrhofer and Nývlt, 1988), Supersaturation Control (SC) (Liotta and Sabesan, 2004) and Direct Nucleation Control (DNC) (Abu Bakar et al., 2009). These techniques typically suffer from the possibility of batch-to-batch variability caused by the inability of the process control to react dynamically to disturbances or changed requirements (Gerlinger et al., 2019; Chen et al., 2011).

* Corresponding author.
E-mail address: [email protected] (C. Meyer).

https://doi.org/10.1016/j.compchemeng.2024.108680
Received 6 December 2023; Received in revised form 3 April 2024; Accepted 4 April 2024
Available online 5 April 2024
0098-1354/© 2024 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

As a dynamic and more precise alternative, model-based approaches are being developed. They usually feature a knowledge-based model of the process, often formulated as Population Balance Equations (PBE) (Ramkrishna, 2000). A prominent example in this category is Model Predictive Control (MPC), which uses a process model to predict future trends of process variables and, based on them, dynamically generates optimized control inputs for the system (García et al., 1989). Creating suitable models of a crystallization process can be very challenging, due to the complexities of the effects involved and the highly nonlinear interactions of process variables (Wu and Hussain, 2005). For accuracy, the model needs to consider all relevant effects. These include the basic mechanisms of nucleation and growth, but other factors like agglomeration, breakage, and polymorph transformation may also have a non-negligible impact (Erdemir et al., 2009; Mazotti et al., 2018).

A powerful alternative to knowledge-based models can be the use of AI driven, empirical models like Artificial Neural Networks (ANN). Simple Feed-Forward Neural Networks (FFN) have been successfully used to model crystallization processes and to control them in MPCs for several processes (Velasco-Mejia et al., 2016; Dasoud et al., 2017; Öner et al., 2020). Additionally, Recurrent Neural Networks (RNN) have been established as a subcategory of ANNs specifically designed for handling timeseries data (Hochreiter, 1998), making them promising candidates for capturing the complex kinetics of crystallizations. Two prominent examples are Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber, 1997) and Gated Recurrent Unit (GRU) (Cho et al., 2014). Both FFN and RNN have been demonstrated as suitable models in NN-MPC schemes for various crystallization applications, either standalone (Damour et al., 2010; Zheng et al., 2022), or in combination with physics-based models (Griffin et al., 2016; Wu et al., 2023).

In recent years, the increase of available computing power as well as the development of several open-source frameworks like TensorFlow (Abadi et al., 2016) and PyTorch (Paszke et al., 2017) have made the development of ANNs more feasible for a broader audience. Still, due to the lack of generally applicable rules for designing ANNs and optimizing the many hyperparameters, their training is a tedious and mostly empirical task. And even if the training was successful, the model is usually only applicable to the process it was trained for, and cannot be transferred to new substances, process conditions or different crystallizers in case of a scale-up or scale-down. One way around this limitation of such specialized models would be to train a bigger, more general model to cover all possible process variations. But this approach needs much bigger datasets and will most likely result in worse accuracy (Hinton et al., 2015).

Addressing these constraints, this work introduces a method for the rapid and flexible creation and optimization of data driven, specialized models for direct use in a model-based control scheme, aiming to overcome the previously stated limitations regarding transferability of ANNs to different processes. It includes the software, servers, and configurations necessary to generate and preprocess training data, to design, train and optimize ANNs, and to use them in an MPC process control. By implementing a high degree of automation, the tuning of hyperparameters as well as model design is streamlined.

We demonstrate the method's effectiveness in a case study for batch cooling crystallizations of two model substances: adipic acid (AA) and potassium dihydrogen phosphate (KDP). Three model types are trained for this purpose: a simple FFN, as well as LSTM and GRU RNN. All models are optimized and compared based on their accuracy in predicting the crystallization process. They are then used in an MPC for crystallizing the model substances while aiming for a target FBRM CLD as representative quality parameter.

2. Materials and methods

2.1. Method overview

The presented method to quickly generate MPC-ready ANN for the crystallization task at hand consists of five parts:

1. A PC running a rudimentary process control and data logging software. It is directly connected to all PAT and temperature control appliances of the crystallizer and can be used to create training data using several model-free controls.
2. An InfluxDB database server (InfluxDB) for storing raw process data and metadata.
3. A server running the training and optimization pipeline. This software includes fetching raw data from the database and preprocessing it, creating and training models by supervised learning, and automatically optimizing them based on a user-specified configuration.
4. An MLflow tracking server (MLflow), which enables the user to follow the training process, compare models and serve trained models to the MPC module (5).
5. An MPC module, which fetches a trained model from the tracking server (4) and forwards control inputs to the process control software (1).

A general overview is shown in Fig. 1.

For any empirical model, sufficient data is necessary for successfully training it to generalize system properties. To enable the user to generate such data, a simple process control and data logging software (1) was designed purely in Python 3.10. Features include:

- High measurement frequencies due to multithreading
- Interactive GUI
- Modular: New PAT or process controllers can be implemented by using pre-existing structures
- Heartbeat module to simultaneously get measurements from all connected PAT devices
- Data logger for buffering and writing data to file and / or database

The collected raw data is stored in an InfluxDB database server (2) to make it available to the training pipeline. This database type was chosen because it can handle timeseries data effectively, while also allowing for experiment metadata to be stored. This feature is necessary to guarantee that the correct data for e.g. one specific substance or process type can be retrieved for training later.

As central part of the method, the training and optimization pipeline server (3) is responsible for creating empirical models. For this, data is first fetched from the database. The following preprocessing steps are taken before using the data for training: First, the data is normalized to fixed ranges depending on the values recorded. Outliers outside the normalization interval are removed. Then, it is resampled by taking the average of values for fixed time intervals. This guarantees the identical number of datapoints for all variables. Lastly, the data is smoothed using a moving average with a window size of 50.

The preprocessed timeseries data of each experiment is then split: For each datapoint it contains, the $N_H$ preceding timesteps are extracted as a “history” interval, while an interval of $N$ subsequent timesteps forms the prediction interval of the ANN. The resulting subsets therefore each contain $N_H + N$ datapoints for each variable. Of these, 80% are randomly selected to form the training dataset, while the remaining 20% are used for validation.

Model accuracy depends on choosing the right features (variables to be used by the ANN). To be usable in the proposed MPC scheme, three sets of features must be selected from the data:

1. System state features containing all process variables to be used as main ANN input, providing sufficient data for the model to generalize system properties during training. Here, $N_H$ states will be considered by the ANN to enable it to extract information about the system kinetics. Data for these features will be taken from the “history” interval of each subset during training, and from the $N_H$ most
recent online measurements during MPC control, representing the estimated system states.
2. Predicted features containing process variables the ANN will make predictions for. During training, these will be taken from the prediction intervals of each subset and considered the desired model outputs (“labels”) for supervised learning. During MPC control, the control objective is defined by target values for these features, and predictions of their future values by the ANN are used for calculating optimal control inputs.
3. Manipulated features containing all process variables the MPC can manipulate to achieve its objective. During training, these are also taken from the prediction interval. While controlling the process, these are the variables for which the MPC will calculate optimal control inputs and apply them to the process.

Fig. 1. Method infrastructure overview.

ANN models are created using TensorFlow 2.10, and the Python module Optuna 3.3 (Akiba et al., 2019) is used for hyperparameter tuning, employing the Tree-Structured Parzen Estimator (TPE) method. The Mean Squared Error (MSE) validation loss after completing 50 epochs was set as minimization criterion. In this way, e.g. different model / hidden configurations, combinations of features, activation functions or optimizer algorithms can automatically be tested. For maximum flexibility, a configuration file lets the user define the models to be trained by setting parameters like model type, features, the number of timesteps to predict, etc. Alternatively, value ranges can be specified, in which the (hyper)parameters will then be varied by Optuna for optimization.

All stages of training and optimization are tracked by a dedicated server running MLflow (4) version 2.7. This open-source software is run with little customization and provides a comprehensive way to log the configuration files, parameters used, training progress, and resulting trained model properties (Zaharia et al., 2018). It is also used to store the models after training, compare their performance and make them accessible for process control.

The last part of the method is the MPC module (5). As part of the custom process control and logging software (1), it runs on the PC attached directly to the crystallizer peripherals. It provides functionality to select and automatically download trained models stored by the MLflow server via a GUI. Before starting the MPC module, the overall control objective must be chosen. For this, the user can define target values $w$ for each of the predicted features. They represent the static setpoint of the system state, which the process control will try to achieve. During control, the MPC module repeats the following operations at the sampling time $k$:

1. Collect the system states $x_j$ containing the corresponding features for the most recent timesteps $j = k - N_H + 1, k - N_H + 2, \dots, k$.
2. Calculate new reference trajectories $r_i$ for all predicted features leading from their most recent values $x_k$ to the corresponding setpoints $w$ across all timesteps $i$ of the prediction horizon $N$.
3. Iteratively minimize the MPC cost function $J$ for the timesteps $i$ of the prediction horizon, by varying hypothetical control inputs $u_i$ and getting ANN predictions $\hat{y}_i$ based on $x_j$ and $u_i$. An additional term adds a penalty for deviations of $u_k$ from the previous control input $u_{k-1}$:

$$ J = \frac{1}{N} \sum_{i=1}^{N} \left( r_i - \hat{y}_i(x_{k-N_H+1}, x_{k-N_H+2}, \dots, x_k, u_i) \right)^2 + (u_k - u_{k-1})^2 \tag{1} $$

4. Apply the first optimized control input $u_1$ to the process.

In this work, hyperbolic reference trajectories were used based on their good performance in preliminary tests. Alternatively, linear or exponential trajectories can be used in the MPC module.

2.2. Case study: AI driven MPC for batch cooling crystallization

To demonstrate the effectiveness of the method introduced in Section 2.1, all steps from training data generation to a finished MPC controlling a crystallization process were followed through using a model process: the batch cooling crystallization of two model substances, AA and KDP, from solution in deionized water. The substances were chosen because of their good solubility, their non-toxicity and the absence of polymorphic behavior in the applied temperature ranges. The target of the MPC was set to create crystals with a narrow FBRM CLD and a mean chord length in the range of 50–150 µm.

2.2.1. Experimental setup

For all experiments, a stirred 2 L jacketed glass vessel was used as crystallizer. It was stirred at 360 rpm by an overhead stirrer with paddle agitator. The temperature was controlled using a thermostat (Huber Ministat 230). A general overview is given in Fig. 2.

An FBRM probe (Mettler Toledo ParticleTrack G400) with a scanning speed of 2 m/s was used in-line to record the unweighted CLD. Additionally, the cube weighted CLD3 was calculated as shown in Eq. (2) to better judge crystallization results.
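As a minimal sketch of the cube weighting referred to above, the snippet below computes $w_i = n_i l_i^3 / \sum_j n_j l_j^3$ from raw channel counts and channel midpoints, together with a coefficient of variation CV = σ/μ of the weighted distribution. The function names and the example channel data are hypothetical, and treating μ and σ as the weighted mean and weighted standard deviation over the channel midpoints is an assumption; the software used in this work may compute them differently.

```python
import math


def cube_weighted_cld(counts, midpoints):
    """Return cube-weighted fractions w_i = n_i * l_i**3 / sum_j(n_j * l_j**3)."""
    if len(counts) != len(midpoints):
        raise ValueError("counts and midpoints must have the same length")
    weighted = [n * l ** 3 for n, l in zip(counts, midpoints)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("no counts recorded; weights are undefined")
    return [w / total for w in weighted]


def coefficient_of_variation(weights, midpoints):
    """Return CV = sigma / mu for normalized channel weights.

    Assumption: mu is the weighted mean chord length and sigma the weighted
    standard deviation, both taken over the channel midpoints.
    """
    mu = sum(w * l for w, l in zip(weights, midpoints))
    variance = sum(w * (l - mu) ** 2 for w, l in zip(weights, midpoints))
    return math.sqrt(variance) / mu


# Hypothetical FBRM channel data: counts per channel and channel midpoints in µm.
example_counts = [120, 80, 30, 5]
example_midpoints = [5.0, 30.0, 100.0, 225.0]
example_weights = cube_weighted_cld(example_counts, example_midpoints)
example_cv = coefficient_of_variation(example_weights, example_midpoints)
```

Because of the cubic factor, a few counts in the large channels dominate the weighted distribution, which is why the cube weighted CLD is more sensitive to coarse crystals than the unweighted one. A smaller CV indicates a narrower distribution.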


$$ w_i = \frac{n_i l_i^3}{\sum_{j=1}^{N} n_j l_j^3} \tag{2} $$

$w_i$ represents the weighted counts of FBRM channel $i$, with $n_i$ and $l_i$ the corresponding counts and channel midpoint, respectively. To compare the narrowness of CLDs, the Coefficient of Variation (CV) was calculated based on the standard deviation $\sigma$ and the mean chord length $\mu$ of the CLD3 distribution as shown in Eq. (3).

$$ CV = \frac{\sigma}{\mu} \tag{3} $$

For concentration measurement, an ATR-FTIR (Mettler Toledo ReactIR 15) was used. A method was developed to correlate its raw spectra with the concentration of the model substances: In several calibration experiments, a solution with a high AA or KDP concentration was prepared. It was then cooled from 60 °C to 5 °C in steps of 5 K, holding the temperature at each step for one hour to let the system equilibrate and reach saturation concentration. The mean raw spectra of the last two minutes per step were collected and a Savitzky-Golay filter (Savitzky and Golay, 1964) was applied subsequently. Using a Partial Least Squares Regression (PLSR), the filtered spectra and measured temperatures were fitted to the saturation concentration of the corresponding substance at each step. With this method, good accuracies ($R^2 = 0.998$ for AA, $R^2 = 0.997$ for KDP) were achieved, and the regression model was then applied for concentration calculation. The supersaturation could then be calculated based on Eq. (4) from the concentration $c$ and the saturation concentration $c^*(T)$ at temperature $T$.

$$ S(T) = \frac{c}{c^*(T)} \tag{4} $$

As an additional source of process information, a custom light backscattering sensor was used, whose signal intensity correlates with the total crystal surface present in the crystallizer (Schmitt et al., 2022).

All analytical devices were attached via USB cables to a PC (Intel i5-6600, 8 GB RAM) running the control software. For both FBRM and ATR-FTIR, the corresponding software (iC FBRM 4.4.33, ReactIR 7.1) and OPC UA Server 1.2 from Mettler Toledo were used to connect with the process control software via the asyncua module (version 1.0.3). The communication with the thermostat and the custom backscattering sensor was established by sending and receiving serial messages with the pyserial module (version 3.5).

Fig. 2. Flow chart and picture of the used experimental setup. Backscattering sensor, ATR-FTIR and FBRM probes were inserted through the lid of the 2 L crystallizer.

2.2.2. Generation of training data

For training data generation, several experiments with model-free process control were performed with both model substances. The goal was the creation of diverse datasets that include all relevant physical effects and kinetics, e.g. spontaneous nucleation, growth, breakage, and dissolution of crystals. To do so, static temperature ramps, temperature waves, DNC, and SC control schemes with varying parameters were executed. An overview is given in Table 1. The same number of experiments was done for each control type to prevent creating a bias towards one of them.

The ramp scheme featured linear cooling with the fixed cooling rate as the sole parameter varied. In the waves approach, the crystallizer was repeatedly cooled and heated between 50 °C and 20 °C with fixed cooling and heating rate, to simulate thermocycling. For the DNC approach, the surface sensor intensity was used to define a target which the controller tried to reach by changing the temperature in the vessel with a dynamic cooling / heating rate. SC was based on the concentration calculated from ATR-FTIR spectra. Every experiment was limited to 3 h.

Table 1
Model-free control schemes and corresponding parameters varied during training data generation.

Control scheme    Parameter (variation range)
ramp              cooling rate (0.1 – 0.5 K/min)
waves             cooling / heating rate (0.1 – 0.5 K/min)
DNC               target intensity (90 – 100)
SC                target supersaturation (1.25 – 3.0)

2.2.3. Model creation, training, and optimization

As described in 2.1, the MPC module expects ANN predictions to be based on the system state containing recent process data, and possible future control inputs. Consequently, ANN were designed to have two inputs, for which specific process variables were selected as features. The first one is for receiving system state data, and the following eight system state features were chosen:

- the process temperature measured by the thermostat,
- four FBRM CLD bins: < 10 µm, 10 - 50 µm, 50 - 150 µm and 150 - 300 µm,
- concentration and supersaturation calculated from ATR-FTIR measurements,
- signal intensity of the surface sensor.

For the second input, which receives control inputs, the temperature setpoint of the thermostat was selected as the single manipulated feature.

The two input layers were connected to a hidden layer depending on the model type. For the simple FFN, a fully connected dense layer with 64 units was used, while for the LSTM and GRU model types, the corresponding Keras layers with 64 cells were chosen. Models with more layers or more units / cells did not improve training results significantly in preceding test runs and were therefore omitted further on. The outputs of the hidden layers were concatenated and connected to a dense output layer.

The predicted features were the four FBRM CLD bins that were also used in the system state features. Fig. 3 shows an overview of the described model design.

When it comes to training ANN models, the tuning of hyperparameters to optimize the training process can be a time-consuming task. Therefore, within this method it was automated as much as possible. The (hyper)parameters varied in this case study are described in the following section and an overview is given in Table 2.

For each model, two versions were created for predicting either $N = 1$ or $N = 10$ timesteps of the target variables (one-step-prediction / multiple-step-prediction). In both cases, $N_H = 10$ timesteps of system state data were provided as input. The optimizer algorithm can have a significant impact on model performance. During training, it adjusts model variables based on a loss function to gradually increase prediction accuracy. As examples for common optimizers, Adam and
Adamax (Kingma and Ba, 2015), Nadam (Dozat, 2016), and Adagrad (Duchi et al., 2011) were tested in this study with a learning rate of 0.0001. Other algorithms like RMSprop and FTRL were tested in preceding experiments but performed significantly worse and are therefore not featured here. As activation function for the fully connected dense output layer at the end of each model, the sigmoid as well as the Rectified Linear Unit (ReLU) functions were tested as examples.

A common problem with training ANN is overfitting, meaning a model learns the training dataset, but fails to generalize and apply its knowledge to new data. The use of regularizers helps to mitigate this. In this study, bias and kernel regularizers L1 (Tibshirani, 1996) and L2 (Zou and Hastie, 2005) were tested. For training, a batch size of 20 was used, which had proven to be most successful in previous test trainings.

All models were trained using a GPU on a workstation (Intel Xeon W-2255, 32 GB RAM, nVidia RTX 2080 Super).

Fig. 3. General design of the ANN models used in this work.

Table 2
Model types and parameters varied during optimization.

Parameter              Values
model type             FFN, LSTM, GRU
timesteps predicted    1, 10
optimizer              Adam, Nadam, Adamax, Adagrad
activation function    Sigmoid, ReLU
kernel regularizer     L1, L2
bias regularizer       L1, L2

2.2.4. Crystallization experiments using trained models in an MPC

For testing the trained models in a lab-scale application, the setup described in Section 2.2.1 was used. Instead of the model-free control schemes used for training data generation, the MPC module (5) was used to control the crystallizer. For this, first the desired trained model was selected via GUI and then automatically fetched from the MLflow server (4). As target for the MPC controller, a high value of 300 was chosen for the FBRM CLD bin 50–150 µm, and 0 for all other bins. This led to the controller minimizing crystals with chord lengths outside the desired range.

To generate optimal control inputs, a Nadam optimizer with a learning rate of 0.01 was chosen based on preceding tests. During the experiment, FBRM CLD bins, ATR-FTIR concentration and supersaturation, surface sensor signal intensity and temperature were fed in-line into the process control as the system state features.

Every 10 s, the controller executed the steps described in more detail in 2.1, consisting of calculating new reference trajectories $r_i$ for all FBRM bins, minimizing the cost function $J$ based on ANN predictions $\hat{y}_i$ of the same FBRM bins, and applying the first of the optimized control inputs to the process by changing the temperature setpoint of the thermostat. The temperature range was constrained to 5 – 60 °C, and all experiments were limited to 2 h.

3. Results and discussion

3.1. Case study: batch cooling crystallization

To demonstrate the effectiveness of the method introduced, a case study was executed, showing the complete process of creating a functional MPC for batch cooling crystallizations of AA and KDP from aqueous solution.

3.1.1. Generation and preprocessing of training data

The data necessary to train ANN models was collected with the Python process control (1) by applying model-free approaches to the batch cooling process for both model substances.

Process temperature and FBRM count bins were measured and logged directly, while for the concentration and supersaturation a PLS regression was used to correlate ATR-FTIR raw spectra to concentration. The raw data was then preprocessed as described in Section 2.1, providing data with fewer fluctuations while retaining most of its information (see Fig. 4). In total, 51 valid experiments were executed for both AA and KDP, resulting in ca. 55 000 subsets of preprocessed data each.

3.1.2. Model training and optimization

The three model types tested in this study, FFN, LSTM and GRU, were created on the training and optimization server (3). In total, for each type 95 to 105 models were trained with varying hyperparameters for optimization. The configuration and performance of the best models based on validation loss are shown in Table 3. Training took ca. 4–5 min for FFN, and 7–8 min for RNN models in the setup used.

For AA, the validation loss ranged from 5.00e-5 to 1.77e-4.

Fig. 4. Example for preprocessed training data for a batch cooling crystallization of AA. The trends were generated using a linear cooling ramp (0.1 K/min).
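The preprocessing chain described in Section 2.1 (normalization to a fixed range with outlier removal, resampling by interval averaging, moving-average smoothing, splitting into history and prediction intervals, and a random 80/20 split) can be sketched in plain Python as follows. This is an illustrative sketch only: all function names, the trailing form of the moving average, the handling of empty resampling intervals, the fixed random seed and the example temperature trace are assumptions, not details of the pipeline used in this work.

```python
import random


def normalize(samples, lo, hi):
    """Scale (time, value) samples to [0, 1] using the fixed range [lo, hi].

    Samples whose value lies outside the range are treated as outliers and dropped.
    """
    return [(t, (v - lo) / (hi - lo)) for t, v in samples if lo <= v <= hi]


def resample(samples, interval):
    """Average all values that fall into the same fixed time interval.

    Assumption: intervals without any sample are skipped rather than filled.
    """
    buckets = {}
    for t, v in samples:
        buckets.setdefault(int(t // interval), []).append(v)
    return [sum(vs) / len(vs) for _, vs in sorted(buckets.items())]


def moving_average(values, window=50):
    """Trailing moving average; the first points use a shorter window (assumption)."""
    smoothed, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


def make_subsets(series, n_history, n_predict):
    """Cut a series into (history, prediction) pairs of n_history + n_predict points."""
    return [
        (series[k - n_history:k], series[k:k + n_predict])
        for k in range(n_history, len(series) - n_predict + 1)
    ]


def split_train_val(subsets, train_fraction=0.8, seed=0):
    """Randomly assign subsets to a training and a validation set."""
    shuffled = list(subsets)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical example: a temperature trace in degrees Celsius, one sample per second.
raw = [(float(t), 50.0 - 0.01 * t) for t in range(3000)]
series = moving_average(resample(normalize(raw, lo=0.0, hi=100.0), interval=10.0))
train, val = split_train_val(make_subsets(series, n_history=10, n_predict=10))
```

In a multi-variable setting each process variable would be passed through the same steps with its own normalization range, and the subsets would be cut at identical indices so that history and prediction intervals stay aligned across variables.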


Table 3
Performance comparison of best models. The models were selected based on their validation loss for each category.

Substance  Model type  Timesteps predicted  Optimizer  Kernel / bias regularizer  Activation function  Validation loss  Loss
AA         FFN         1                    Adam       L2/L2                      ReLU                 7.98e-5          1.30e-4
AA         GRU         1                    Nadam      L2/L1                      ReLU                 1.25e-4          1.69e-4
AA         LSTM        1                    Nadam      L2/L2                      sigmoid              5.00e-5          1.07e-4
AA         FFN         10                   Nadam      L2/L1                      ReLU                 1.62e-4          2.81e-4
AA         GRU         10                   Adam       L2/L1                      ReLU                 1.23e-4          3.35e-4
AA         LSTM        10                   Nadam      L2/L2                      sigmoid              1.77e-4          4.01e-4
KDP        FFN         1                    Nadam      L2/L2                      sigmoid              8.90e-5          7.65e-5
KDP        GRU         1                    Nadam      L2/L1                      ReLU                 9.81e-5          1.01e-4
KDP        LSTM        1                    Nadam      L2/L2                      ReLU                 2.86e-5          3.40e-5
KDP        FFN         10                   Nadam      L2/L1                      ReLU                 1.41e-4          1.30e-4
KDP        GRU         10                   Nadam      L2/L2                      ReLU                 6.92e-5          5.78e-5
KDP        LSTM        10                   Nadam      L2/L2                      ReLU                 5.87e-5          4.02e-5

Surprisingly, both extremes come from LSTM models. The performance of both GRU and FFN was significantly better for one timestep predictions than for ten timestep predictions. In the first case, the FFN model with 7.98e-5 came close to the performance of the LSTM, while GRU performed worse with 1.25e-4. Loss values generally were higher, ranging from 1.07e-4 to 1.69e-4 for one timestep and from 2.81e-4 to 4.01e-4 for ten timesteps predicted.

In case of KDP, LSTMs outperformed the others for both prediction ranges, with validation loss values of 2.86e-5 and 5.87e-5. As with AA, FFN models performed better for one timestep predictions with 8.90e-5 in contrast to 1.41e-4 for ten timesteps. The performance of GRU models was in between the others, with 9.81e-5 and 6.92e-5. The difference between validation loss and loss was not as pronounced as with AA models, but in contrast to them slightly higher than the validation loss.

In general, LSTM models trained with the Nadam optimizer and L2 as regularizer for both kernel and bias proved to be the most accurate for the process at hand. FFN and GRU networks performed slightly worse but could be trained much faster due to their lower complexity (see Fig. 5). Comparing models predicting one and ten timesteps, the former generally performed better, with validation losses between 2.86e-5 and 1.25e-4. This may be because the prediction uncertainty increases as longer time intervals are chosen.

Due to the many models automatically trained, the influence of the parameters could be determined and can now be used to further improve future training sessions of networks. The optimizer algorithm exerted by far the biggest impact on training success. Here, Nadam was generally the best option, closely followed by Adam. Adagrad and Adamax often performed multiple orders of magnitude worse. One caveat is that the optimizer hyperparameters were not varied. When comparing trained models, the number of hyperparameters used in the optimization can play a significant role (Choi et al., 2020).

The kernel and bias regularizers being set to either L1 or L2 impacted the results only slightly. Even though good models could be trained for every combination, models with L2 for both kernel and bias regularizer generally dominated the top performing models. The bias regularizer was observed to be less impactful, as some of the best models used L1. Validation loss and loss were very similar for all shown models, indicating that the regularization was sufficient and the network small enough to prevent overfitting, see Fig. 5. Like the regularizers, the activation function showed only little impact, but ReLU was used in most of the top models. While ReLU worked well with both bias regularizers, models using the sigmoid function strictly used L2 for that.

Fig. 6 visualizes model predictions. While the predictions for small chord lengths < 10 µm were extremely accurate, small discrepancies could be seen for bigger chord lengths.

3.1.3. MPC performance

Using the MPC module (5), crystallizations for this study could successfully be conducted. The goal was to generate crystals with a mean chord length in the range of 50 µm to 150 µm. For each model type (FFN, GRU, LSTM), the models with the best loss and validation loss were chosen for both predicting one and ten timesteps and both substances, resulting in a total of 12 models being tested in actual process control. The calculation of new reference trajectories and optimization of control inputs took 0.4 – 1.4 s.

The MPC controller handled all stages of the crystallization well: In the beginning, it cooled rapidly until reaching the saturation temperature, at which point primary nucleation occurred. Afterwards, the temperature setpoint was raised slightly, effectively preventing further nucleation from occurring. This can also be seen in the supersaturation trend: As expected, it drops when nucleation occurs, and the dissolved concentration is lowered due to the formation of crystals. Afterwards, the temperature is raised such that the supersaturation rests slightly above unity, favoring crystal growth over nucleation. When nucleation

Fig. 5. Training progress of the models listed in Table 3 for KDP. Solid lines indicate validation loss, dashed lines loss.

Fig. 6. Normalized FBRM count predictions made by the best LSTM model for one timestep predictions on the data of an AA crystallization experiment. The lines indicate the actual measurements, the corresponding darker dots the model outputs.


occurs, the FBRM count bins, especially for small chord lengths < 10 µm, can be seen rising rapidly first, and then falling again when growth takes over as main crystallization mechanism. Trends for one experiment are shown in Fig. 7; the resulting FBRM CLD, cube weighted CLD3 and CV for all MPC experiments are listed in Table 4.

For AA, the mean CLD were slightly below the target with 18.11 - 42.79 µm, except for the LSTM ten timestep model with 52.54 µm. In all experiments with KDP on the other hand, crystals within the target CLD range (80.58 to 102.52 µm) could be produced. One possible explanation for the differences between substances builds on the metastable zone width of AA being narrower than the one of KDP (ca. 5 K and 15 K in the setup used at maximum cooling rate of 0.5 K/min). Because of that, the cooling rate has a much bigger impact on induction time and kinetics, making it more challenging to precisely predict and control the process.

The differences between model types were small, but LSTM seemed to be most consistent in achieving the MPC goal, especially with one timestep models. One factor for this might be the prediction accuracy described in Section 3.1.2, which showed the same ranking of the models. For both substances, models with ten timestep predictions were better suited to create crystals of the desired size – even though the validation loss of one timestep models was lower during training.

Since CSDs are commonly compared in form of volume/mass distributions, the cube weighted CLD3 and its derived CV were used to compare the results in this study. With values from 247.51 µm to 310.12 µm, the CLD3 across all MPC experiments was above the target range. This is because the MPC target definition was solely based on the unweighted CLD.

Table 4
Performance comparison of data driven MPC experiments based on different ANN models, and model-free control approaches. The calculations are based on FBRM CLDs.

Substance  Model type / control approach  Timesteps predicted  CLD mean [µm]  CLD3 mean [µm]  CV [-/e-4]
AA         FFN                            1                    37.32          310.12          7.8
AA         GRU                            1                    34.01          261.30          9.7
AA         LSTM                           1                    42.79          251.50          9.4
AA         FFN                            10                   25.05          253.37          8.9
AA         GRU                            10                   52.54          254.16          7.4
AA         LSTM                           10                   18.11          250.86          8.7
KDP        FFN                            1                    80.58          268.89          7.9
KDP        GRU                            1                    100.57         269.23          7.9
KDP        LSTM                           1                    101.89         274.49          7.3
KDP        FFN                            10                   97.50          247.51          7.1
KDP        GRU                            10                   102.18         252.96          6.8
KDP        LSTM                           10                   102.52         252.86          7.0
AA         ramp 0.1–0.2 K/min             –                    104.51         292.36          8.7
AA         ramp 0.3–0.5 K/min             –                    76.98          275.03          8.5
AA         SC 1.25–2.0                    –                    51.68          259.65          6.6
AA         SC 2.5–3.0                     –                    60.21          268.11          7.5
KDP        ramp 0.1–0.2 K/min             –                    97.50          247.51          7.5
KDP        ramp 0.3–0.5 K/min             –                    100.57         269.23          9.4
KDP        SC 1.25–2.0                    –                    102.18         252.96          11.1
KDP        SC 2.5–3.0                     –                    101.89         274.49          8.9
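
To make the difference between the unweighted and the cube weighted statistics concrete, the following minimal Python sketch computes a mean chord length and a coefficient of variation from binned FBRM counts. It is an illustration under stated assumptions rather than the evaluation code used in this study: the function name `weighted_cld_stats`, the use of bin centers, the definition of the CV as standard deviation divided by mean, and the two-bin example data are all hypothetical, and the normalization behind the CV values reported in Table 4 is not reproduced here.

```python
import math


def weighted_cld_stats(bin_centers_um, counts, weight_power=0):
    """Return (mean, cv) of a binned chord length distribution.

    Each bin count is multiplied by chord_length ** weight_power before
    normalization: weight_power=0 gives the unweighted CLD and
    weight_power=3 the cube weighted CLD3.
    Assumption: CV is taken as standard deviation divided by mean.
    """
    if len(bin_centers_um) != len(counts):
        raise ValueError("bin_centers_um and counts must have the same length")
    weights = [n * (length ** weight_power)
               for length, n in zip(bin_centers_um, counts)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("distribution is empty")
    mean = sum(w * length for w, length in zip(weights, bin_centers_um)) / total
    variance = sum(w * (length - mean) ** 2
                   for w, length in zip(weights, bin_centers_um)) / total
    return mean, math.sqrt(variance) / mean


# Hypothetical two-bin distribution: many fines, a few large chords.
centers = [10.0, 100.0]   # bin centers in µm
counts = [90, 10]         # FBRM counts per bin

mean_cld, cv_cld = weighted_cld_stats(centers, counts)
mean_cld3, cv_cld3 = weighted_cld_stats(centers, counts, weight_power=3)
print(f"unweighted mean: {mean_cld:.2f} µm, cube weighted mean: {mean_cld3:.2f} µm")
```

Because each count is multiplied by the cubed chord length, a few large chords dominate CLD3. This is consistent with the CLD3 means lying above a target range that was defined on the unweighted CLD.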
The CV across all models used was small with values between 6.8e-4 and 9.7e-4, proving the existence of crystals within a narrow CLD/CSD (see Fig. 8). The models predicting ten timesteps led to slightly smaller coefficients, indicating the narrowest distributions.

Comparing the results with data from model-free approaches, static cooling ramps and SC approaches also resulted in CLDs in the target range. For AA, especially slow ramps produced crystals with a mean chord length of 104.51 µm, with the CV of 8.7e-4 being in the same range as in MPC experiments. The same goes for model-free SC experiments, which resulted in smaller mean CLDs but better CVs (6.6e-4 and 7.5e-4). For KDP, fast ramps and low supersaturation SC experiments resulted in significantly worse CV values (9.4e-4 and 11.1e-4) than their MPC counterparts.

Still, the model-free approaches took 2–3 h to complete, while the MPC experiments achieved the results after only 45–50 min of experiment time.

4. Conclusions

The goal of this work was to create a method enabling the user to rapidly create data-driven models and apply them directly in an MPC control scheme for crystallization processes. It includes all software, servers, and configurations necessary to generate training data, create and optimize models in a highly automated fashion, and then use them in an MPC controller to generate crystals meeting desired quality criteria.

The first part of this method is a custom process control software designed to apply various model-free control schemes to the crystallization process at hand, thereby generating training data, which is stored in an InfluxDB database. A custom server was created for preprocessing the raw data and using it for supervised training, during which models are automatically optimized by varying user-selected hyperparameters to minimize validation loss. At all times, the training process can be supervised using an MLflow tracking server, which additionally provides the functionality to compare models and make them accessible through the network. Finally, trained models are used in an MPC module to generate and apply optimal control inputs to the crystallization process online.

In a case study to demonstrate the method's effectiveness, it was applied to lab-scale batch cooling crystallizations of AA and KDP from aqueous solution. The target was set to creating crystals with a narrow FBRM CLD and chord lengths of 50–150 µm. For both substances, training data was generated employing static cooling ramps, SC, and DNC model-free process control approaches. Three different ANN model types were then trained: simple FFN, and more advanced LSTM and GRU
RNNs. Several hyperparameters (optimizer algorithm, prediction time interval, regularizers and activation functions) were varied automatically to create optimal networks. Generally, models trained with the Nadam optimizer, a prediction range of one timestep and ReLU activation functions resulted in the lowest loss values. The difference between the tested regularizers L1 and L2 was marginal; in both cases no overfitting occurred. Overall, LSTM models performed slightly better than their GRU and FFN counterparts.

The best models of each type were used for MPC control of batch crystallizations for AA and KDP. This resulted in crystals with narrow CLDs within the desired target range for both substances. Again, controllers using LSTM networks generally outperformed the other model types slightly, leading to bigger crystals with a smaller CV.
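
The receding-horizon idea behind such a controller can be sketched in a few lines of Python. This is a simplified illustration, not the MPC module used in this work: the function names `mpc_step` and `toy_model`, the linear toy model standing in for a trained ANN, the list of candidate cooling rates, and the use of the target band midpoint as setpoint are all assumptions made for the example. Candidate inputs are simply enumerated here, whereas a real controller would typically run a numerical optimizer.

```python
def mpc_step(predict, state, candidate_inputs, target_low, target_high, horizon):
    """Return the candidate input whose predicted end state is closest
    to the middle of the target band (assumed cost function)."""
    target_mid = 0.5 * (target_low + target_high)
    best_input, best_cost = None, float("inf")
    for u in candidate_inputs:
        x = state
        for _ in range(horizon):
            x = predict(x, u)  # input held constant over the horizon
        cost = (x - target_mid) ** 2
        if cost < best_cost:
            best_input, best_cost = u, cost
    return best_input


def toy_model(mean_chord_um, cooling_rate_k_per_min):
    # Hypothetical stand-in for a trained ANN: slow cooling lets the mean
    # chord length grow, fast cooling creates fines and lowers it.
    return mean_chord_um + 5.0 - 20.0 * cooling_rate_k_per_min


cooling_rates = [0.1, 0.2, 0.3, 0.4, 0.5]  # K/min, assumed candidates
state = 20.0                               # initial mean chord length in µm

# Receding horizon: optimize, apply only the first input, then repeat.
for _ in range(30):
    u = mpc_step(toy_model, state, cooling_rates, 50.0, 150.0, horizon=10)
    state = toy_model(state, u)

print(f"final mean chord length: {state:.1f} µm")
```

Only the first input of each optimization is applied before the problem is solved again with the updated state, which is what allows the controller to react to deviations between prediction and process.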
Fig. 7. Trends of selected process variables for an MPC-controlled AA cooling crystallization. An FFN model for one timestep predictions was used.

In conclusion, the method proved to be an efficient tool for flexibly creating data driven MPC controllers and applying them in


Fig. 8. Comparison of the cube weighted, normalized CLD3 from MPC controlled crystallizations of AA (left) and KDP (right). Results from experiments with models
predicting a range of ten timesteps are shown.

crystallization processes, even without extensive preparation or large amounts of pre-existing process data. While some process knowledge is still necessary to generate suitable training data, the elimination of a detailed, mathematical description of all relevant crystallization mechanisms and determination of kinetic parameters necessary in traditional, knowledge-based modelling represents the major advantage of the described approach. While the equivalent here - the training of models - is still time consuming, it can be done mostly automatically without the need for supervision.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

CRediT authorship contribution statement

Conrad Meyer: Visualization, Validation, Software, Methodology, Investigation, Conceptualization. Arjun Arora: Software. Stephan Scholl: Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D.G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., Zheng, X., 2016. TensorFlow: a system for large-scale machine learning. In: 12th USENIX symposium on operating systems design and implementation (OSDI 16).
Abu Bakar, M.R., Nagy, Z.K., Saleemi, A.N., Rielly, C.D., 2009. The impact of direct nucleation control on crystal size distribution in pharmaceutical crystallization processes. Cryst. Growth Des. 9 (3), 1378–1384. https://doi.org/10.1021/cg800595v.
Akiba, T., Sano, S., Yanase, T., Ohta, T., Koyama, M., 2019. Optuna: a next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. https://doi.org/10.1145/3292500.3330701.
Alvarez, A.J., Myerson, A.S., 2010. Continuous plug flow crystallization of pharmaceutical compounds. Cryst. Growth Des. 10 (5), 2219–2228. https://doi.org/10.1021/cg901496s.
Borsos, Á., Szilágyi, B., Agachi, P.Ş., Nagy, Z.K., 2017. Real-time image processing based online feedback control system for cooling batch crystallization. Org. Process Res. Dev. 21 (4), 511–519. https://doi.org/10.1021/acs.oprd.6b00242.
Braatz, R.D., 2002. Advanced control of crystallization processes. Annu. Rev. Control 26 (1), 87–99. https://doi.org/10.1016/S1367-5788(02)80016-5.
Chen, J., Sarma, B., Evans, J.M.B., Myerson, A.S., 2011. Pharmaceutical crystallization. Cryst. Growth Des. 11 (4), 887–895. https://doi.org/10.1021/cg101556s.
Cho, K., van Merrienboer, B., Bahdanau, D., Bengio, Y., 2014. On the properties of neural machine translation: encoder-decoder approaches. In: Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. https://doi.org/10.48550/arXiv.1409.1259.
Choi, D., Shallue, C.J., Nado, Z., Lee, J., Maddison, C.J., Dahl, G.E., 2020. On empirical comparisons of optimizers for deep learning. In: Eighth International Conference on Learning Representations. https://doi.org/10.48550/arXiv.1910.05446.
Damour, C., Benne, M., Grondin-Perez, B., Chabriat, J.-P., 2010. Nonlinear predictive control based on artificial neural network model for industrial crystallization. J. Food Eng. 99 (2), 225–231. https://doi.org/10.1016/j.jfoodeng.2010.02.027.
Daosud, W., Thampasato, J., Kittisupakorn, P., 2017. Neural network based modeling and control for a batch heating/cooling evaporative crystallization process. Eng. J. 21 (1), 127–144. https://doi.org/10.4186/ej.2017.21.1.127.
Dozat, T., 2016. Incorporating Nesterov Momentum into Adam. In: International Conference on Learning Representations.
Duchi, J., Hazan, E., Singer, Y., 2011. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12 (7).
Erdemir, D., Lee, A.Y., Myerson, A.S., 2009. Nucleation of crystals from solution: classical and two-step models. Acc. Chem. Res. 42 (5), 621–629. https://doi.org/10.1021/ar800217x.
Fujiwara, M., Nagy, Z.K., Chew, J.W., Braatz, R.D., 2005. First-principles and direct design approaches for the control of pharmaceutical crystallization. J. Process Control 15 (5), 493–504. https://doi.org/10.1016/j.jprocont.2004.08.003.
Gao, Z., Rohani, S., Gong, J., Wang, J., 2017. Recent developments in the crystallization process: toward the pharmaceutical industry. Engineering 3 (3), 343–353. https://doi.org/10.1016/J.ENG.2017.03.022.
García, C.E., Prett, D.M., Morari, M., 1989. Model predictive control: theory and practice - a survey. Automatica 25 (3), 335–348. https://doi.org/10.1016/0005-1098(89)90002-2A.
Gerlinger, W., Asua, J.M., Chaloupka, T., Faust, J.M.M., Gjertsen, F., Hamzehlou, S., Hauger, S.O., Jahns, E., Joy, P.J., Kosek, J., Lapkin, A., Leiza, J.R., Mhamdi, A., Mitsos, A., Naeem, O., Rajabalinia, N., Singstad, P., Suberu, J., 2019. Dynamic optimization and non-linear model predictive control to achieve targeted particle morphologies. Chem. Ing. Tech. 91 (3), 323–335. https://doi.org/10.1002/cite.201800118.
Griffin, D.J., Grover, M.A., Kawajiri, Y., Rousseau, R.W., 2015. Combining ATR-FTIR and FBRM for feedback on crystal size. In: 2015 American Control Conference (ACC), pp. 4308–4313. https://doi.org/10.1109/ACC.2015.7172006.
Griffin, D.J., Grover, M.A., Kawajiri, Y., Rousseau, R.W., 2016. Data-driven modeling and dynamic programming applied to batch cooling crystallization. Ind. Eng. Chem. Res. 55 (5), 1361–1372. https://doi.org/10.1021/acs.iecr.5b03635.
Heinrich, J., Ulrich, J., 2012. Application of laser-backscattering instruments for in situ monitoring of crystallization processes - a review. Chem. Eng. Technol. 35 (6), 967–979. https://doi.org/10.1002/ceat.201100344.
Hinton, G., Vinyals, O., Dean, J., 2015. Distilling the knowledge in a neural network. In: 29th Conference on Neural Information Processing Systems (NIPS 2015). https://doi.org/10.48550/arXiv.1503.02531.


Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput 9 (8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735.
Hochreiter, S., 1998. The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 06 (02), 107–116. https://doi.org/10.1142/S0218488598000094.
InfluxDB overview. Available online: https://www.influxdata.com/products/influxdb-overview/ (accessed on 22. November 2023).
Kim, S., Lotz, B., Lindrud, M., Girard, K., Moore, T., Nagarajan, K., Alvarez, M., Lee, T., Nikfar, F., Davidovich, M., Srivastava, S., Kiang, S., 2005. Control of the particle properties of a drug substance by crystallization engineering and the effect on drug product formulation. Org. Process Res. Dev. 9 (6), 894–901. https://doi.org/10.1021/op050091q.
Kingma, D.P., Ba, J., 2015. Adam: a method for stochastic optimization. In: 3rd International Conference for Learning Representations. https://doi.org/10.48550/arXiv.1412.6980.
Kutluay, S., Şahin, Ö., Ceyhan, A.A., İzgi, M.S., 2017. Design and optimization of production parameters for boric acid crystals with the crystallization process in an MSMPR crystallizer using FBRM® and PVM® technologies. J. Cryst. Growth 467, 172–180. https://doi.org/10.1016/j.jcrysgro.2017.03.027.
Lewiner, F., Klein, J.P., Puel, F., Févotte, G., 2001. On-line ATR FTIR measurement of supersaturation during solution crystallization processes. Calibration and applications on three solute/solvent systems. Chem. Eng. Sci. 56 (6), 2069–2084. https://doi.org/10.1016/S0009-2509(00)00508-X.
Li, M., Wilkinson, D., Patchigolla, K., Mougin, P., Roberts, K.J., Tweedie, R., 2004. On-line crystallization process parameter measurements using ultrasonic attenuation spectroscopy. Cryst. Growth Des. 4 (5), 955–963. https://doi.org/10.1021/cg030041h.
Liotta, V., Sabesan, V., 2004. Monitoring and feedback control of supersaturation using ATR-FTIR to produce an active pharmaceutical ingredient of a desired crystal size. Org. Process Res. Dev. 8 (3), 488–494. https://doi.org/10.1021/op049959n.
Mayrhofer, B., Nývlt, J., 1988. Programmed cooling of batch crystallizers. Chem. Eng. Process. 24 (4), 217–220. https://doi.org/10.1016/0255-2701(88)85005-0.
Mazzotti, M., Vetter, T., Ochsenbein, D.R., 2018. Crystallization process modeling. In: Hilfiker, R., von Raumer, M. (Eds.), Polymorphism in the Pharmaceutical Industry. Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, Germany, pp. 285–304. https://doi.org/10.1002/9783527697847.ch10.
MLflow – a platform for the machine learning life cycle. Available online: https://mlflow.org/ (accessed on 22. November 2023).
Nagy, Z.K., Braatz, R.D., 2012. Advances and new directions in crystallization control. Annu. Rev. Chem. Biomol. Eng. 3 (1), 55–75. https://doi.org/10.1146/annurev-chembioeng-062011-081043.
Nagy, Z.K., Fevotte, G., Kramer, H., Simon, L.L., 2013. Recent advances in the monitoring, modelling and control of crystallization systems. Chem. Eng. Res. Des. 91 (10), 1903–1922. https://doi.org/10.1016/j.cherd.2013.07.018.
Öner, M., Montes, F.C.C., Ståhlberg, T., Stocks, S.M., Bajtner, J.E., Sin, G., 2020. Comprehensive evaluation of a data driven control strategy: experimental application to a pharmaceutical crystallization process. Chem. Eng. Res. Des. 163, 248–261. https://doi.org/10.1016/j.cherd.2020.08.032.
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A., 2017. Automatic differentiation in PyTorch. In: 31st Conference on Neural Information Processing Systems (NIPS 2017).
Qu, H., Alatalo, H., Hatakka, H., Kohonen, J., Louhi-Kultanen, M., Reinikainen, S.-P., Kallas, J., 2009. Raman and ATR FTIR spectroscopy in reactive crystallization: simultaneous monitoring of solute concentration and polymorphic state of the crystals. J. Cryst. Growth 311 (13), 3466–3475. https://doi.org/10.1016/j.jcrysgro.2009.04.018.
Ramkrishna, D., 2000. Population balances: Theory and Applications to Particulate Systems in Engineering. Academic Press, San Diego, USA.
Ruf, A., Worlitschek, J., Mazzotti, M., 2000. Modeling and experimental analysis of PSD measurements through FBRM. Part. Part. Syst. Charact. 17 (4), 167–179. https://doi.org/10.1002/1521-4117(200012)17:4%3C167::AID-PPSC167%3E3.0.CO;2-T.
Savitzky, A., Golay, M.J., 1964. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 36 (8), 1627–1639. https://doi.org/10.1021/ac60214a047.
Schmitt, L., Meyer, C., Schorz, S., Manser, S., Scholl, S., Rädle, M., 2022. Use of a scattered light sensor for monitoring the dispersed surface in crystallization. Chem. Ing. Tech. 94 (8), 1177–1184. https://doi.org/10.1002/cite.202200076.
Simone, E., Saleemi, A.N., Nagy, Z.K., 2014. Raman, UV, NIR, and Mid-IR spectroscopy with focused beam reflectance measurement in monitoring polymorphic transformations. Chem. Eng. Technol. 37 (8), 1305–1313. https://doi.org/10.1002/ceat.201400203.
Tibshirani, R., 1996. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. 58 (1), 267–288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x.
Velásco-Mejía, A., Vallejo-Becerra, V., Chávez-Ramírez, A.U., Torres-González, J., Reyes-Vidal, Y., Castañeda-Zaldivar, F., 2016. Modeling and optimization of a pharmaceutical crystallization process by using neural networks and genetic algorithms. Powder Technol. 292, 122–128. https://doi.org/10.1016/j.powtec.2016.01.028.
Wu, H., Hussain, A.S., 2005. Use of pat for active pharmaceutical ingredient crystallization process control. IFAC Proceedings Volumes 38 (1), 147–152. https://doi.org/10.3182/20050703-6-CZ-1902.02228.
Wu, G., Yion, W.T.G., Dang, K.L.N.Q., Wu, Z., 2023. Physics-informed machine learning for MPC: application to a batch crystallization process. Chem. Eng. Res. Des. 192, 556–569. https://doi.org/10.1016/j.cherd.2023.02.048.
Yu, Z.Q., Chew, J.W., Chow, P.S., Tan, R.B.H., 2007. Recent advances in crystallization control. Chem. Eng. Res. Des. 85 (7), 893–905. https://doi.org/10.1205/cherd06234.
Zaharia, M., Chen, A., Davidson, A., Ghodsi, A., Hong, S.A., Konwinski, A., Murching, S., Nykodym, T., Ogilvie, P., Parkhe, M., Xie, F., Zumar, C., 2018. Accelerating the machine learning lifecycle with MLflow. IEEE Data Eng. Bull. 41 (4), 39–45.
Zheng, Y., Wang, X., Wu, Z., 2022. Machine learning modeling and predictive control of the batch crystallization process. Ind. Eng. Chem. Res. 61 (16), 5578–5592. https://doi.org/10.1021/acs.iecr.2c00026.
Zou, H., Hastie, T., 2005. Regularization and variable selection via the elastic net. J. R. Stat. Soc. 67 (2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x.
