Data Mining Applications For Finding Golden Batch Benchmarks and Optimizing Batch Process Control
Yuelong Su and Fengqin Yu
Yuelong Su is a senior engineer with China National Bluestar (Group) Co., Ltd, Beijing, China (email: [email protected]).
Fengqin Yu is a senior engineer with the Central Research Institute of China Chemical Science and Technology, Beijing, China (email: [email protected]).

Batch processes play an important role in the production of polymers, pharmaceuticals and biochemicals. Examples include polyether polyols (PPG) and chloroprene rubber, which are high-quality specialty materials and products. According to the batch control standard [1], a finite-duration batch process consists of one or more process stages, organized as an ordered set, and a specified recipe of materials. Each process stage consists of an ordered set of one or more process operations, which represent major processing activities such as preparing the reactor, charging and reacting. Each process operation can be subdivided into an ordered set of one or more process actions that carry out the processing required by the process operation. A range of quality indexes can be measured in the quality control laboratory for the final product of each batch.

In general, a batch process exhibits some batch-to-batch variation because of errors in charging the recipe materials, deviations of the process control between operating point and set point, and disturbances to the manipulated variables. To achieve consistent product quality from a batch process, minimizing batch-to-batch variability is important [2]; in particular, temperatures, pressures, agitation and feed rates should be kept under the best and most stable control conditions. From an engineering perspective, it is important for process and control engineers to mine multi-way data to find golden benchmarks [3] and optimize batch process control. The objective is to identify the best operating procedure and correct abnormal [...]

[...] compress the information into low-dimensional spaces and pinpoint the root causes of batch-to-batch differences [13-14], which extends the application of PCA and PLS techniques from continuous processes to batch processes. Batch process performance can therefore be analyzed and monitored easily in the reduced space, and quality predictions can also be made [15].

Regarding industrial experience with batch process modelling and monitoring, multivariate analysis was performed on 20 fermentation batches from The Dow Chemical Company San Diego facility [2]; batch scheduling with electric power constraints (ABB) and simultaneous scheduling and dynamic optimization of batch processes (Dow Chemical) have recently been researched at Carnegie Mellon University [16]; and multivariate SPC charts have been applied to monitor a DuPont batch polymerization reactor [17].

1.2 Challenges for Process or Control Engineers

From an engineering point of view, the challenge for process engineers and control engineers is to understand the underlying phenomena in a form that can be modelled rapidly to aid rational decision making [18]. Chemical industry process modelling, especially for batch processes, has been evolving into a specialized field. The in-depth knowledge required lies at the interface between chemical engineering, applied mathematics and computer science, together with specific model-based methods and tools. For process and control engineers, these comprehensive abilities are core competencies for dealing with the inherent complexity of chemical processes and the multi-objective nature of decision-making during the lifecycle of the manufacturing process of chemical products [19-20].

Combined with the nature of multi-way batch process data, which, like time-series data, are large in size, high in dimensionality and necessary to update [...]
[...] below (Table 2). Multi-way data for a total of 26 batches, with 3 tags in every batch, were exported from the PI System; the sampling interval for every tag is 1 minute.

Table 2. Data acquisition

Batch ID             Tag                                            Start Time   End Time
DEP N (N=1 to 26)    TICA0610.PV (Temperature measured value)       hh:mm:ss     hh:mm:ss
                     TICA0610.MV (Temperature controller output)
                     PIC0605.PV (Pressure measured value)

III. DATA PRE-TREATMENT

Why is data pre-treatment needed before using MPCA or MPLS to analyze a batch process? As shown in Fig. 3, the loading plot obtained by applying MPCA directly to the raw data of the 26 batches, without pre-processing, gives no valuable information about this process. Most multivariate statistical methods assume that the batch durations are all the same. On a real industrial production line, however, batch durations are not fixed because of changes and disturbances to operating conditions. For example, the durations of these 26 batches range from 16.4 h to 22.2 h (Fig. 3). In such situations, data pre-treatment, which is also called synchronization, is required.

• The time at which the reaction begins in each batch is the real start time for process analysis. We chose the first time PIC0605 changes from a negative to a positive value as the start time of every batch in this process.
• For end-time synchronization, we chose the smallest number of columns among these 26 batches as the number of columns of the data sample (n=820), because it marks the best duration time among these batches. The additional columns of the other batches are excluded from the data set.

Fig. 4. Batch trajectory synchronization method for MPCA

For example, the individual value plot of TICA0610.PV after batch trajectory synchronization is shown in Fig. 5.

• Through synchronization, more process details can be observed. For example, normal batches and abnormal batches can be separated clearly. At the same time, the quality records can be checked to confirm that the batches judged abnormal by data analysis really are abnormal. These abnormal batches can also be excluded from the next data pre-processing step.
• For normal batches, the peak values of TICA0610.PV differ from batch to batch. In the next step, by combining the quality indexes of these batches with the operating data, the process control deviations that result in different product quality can be analysed through MPLS.
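The two synchronization rules described above (start each batch at the first negative-to-positive change of PIC0605, then truncate every batch to the shortest remaining length) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the dictionary layout of a batch and the use of NumPy are assumptions made here for the example.

```python
import numpy as np

def reaction_start_index(pressure):
    """Index of the first positive sample that follows a negative sample.

    Hypothetical helper: mirrors the rule 'first time PIC0605 changes
    from a negative to a positive value'.
    """
    p = np.asarray(pressure, dtype=float)
    negative = np.flatnonzero(p < 0)
    if negative.size == 0:
        raise ValueError("signal never goes negative")
    positive = np.flatnonzero(p[negative[0]:] > 0)
    if positive.size == 0:
        raise ValueError("signal never becomes positive after being negative")
    return int(negative[0] + positive[0])

def synchronize(batches, pressure_tag="PIC0605.PV"):
    """Align batches at reaction start and cut them to the shortest length.

    batches: list of dicts mapping tag name -> 1-D array of equally spaced
             samples (assumed layout; all tags of a batch share one length).
    Returns (X, tags) where X has shape (n_batches, n_tags, n_samples).
    """
    tags = sorted(batches[0])
    aligned = []
    for batch in batches:
        start = reaction_start_index(batch[pressure_tag])
        rows = [np.asarray(batch[t], dtype=float)[start:] for t in tags]
        aligned.append(np.vstack(rows))
    # End-time synchronization: keep only the columns every batch has.
    n = min(a.shape[1] for a in aligned)
    return np.stack([a[:, :n] for a in aligned]), tags
```

With the 26 exported batches and three tags, the result would be a 26 x 3 x n array (n=820 in the paper), ready for unfolding before MPCA or MPLS.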
• To analyse the impact of every process variable on final product quality, the process variables of a batch are matched one-to-one with the quality indexes of that batch. Within one batch, the quality index value corresponding to every process variable is the same. For example, we choose “D-Value” (the most important quality index for this product) as “Y”: the three process variables of Batch 1 correspond to one “D-Value” (equal to 1.074) of Batch 1, and the “D-Value” corresponding to the three process variables of Batch 22 is 1.0692.
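The pairing of one “D-Value” per batch with all process variables of that batch can be illustrated as follows. The sketch assumes batch-wise unfolding in the style of [13], where each batch becomes one row of a two-dimensional matrix; the paper does not state which unfolding or software was used, so the function names and the plain NumPy SVD used for the MPCA scores are illustrative assumptions only.

```python
import numpy as np

def unfold_batchwise(X):
    """Reshape (I batches, J variables, K samples) into (I, J*K).

    Each row holds the K samples of variable 1, then variable 2, and so on.
    """
    I, J, K = X.shape
    return X.reshape(I, J * K)

def autoscale(M):
    """Mean-center and scale each column to unit variance across batches."""
    mean = M.mean(axis=0)
    std = M.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (M - mean) / std

def pair_with_quality(X, d_values):
    """Return the unfolded X and one quality value (Y) per batch."""
    y = np.asarray(d_values, dtype=float)
    if y.shape != (X.shape[0],):
        raise ValueError("need exactly one quality value per batch")
    return unfold_batchwise(X), y

def mpca_scores(X, n_components=2):
    """Scores and loadings of a PCA on the autoscaled, unfolded data."""
    Z = autoscale(unfold_batchwise(X))
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * s[:n_components], Vt[:n_components]
```

For the data set described here, X would have shape (26, 3, 820) and y would hold 26 “D-Value” entries, for example 1.074 for Batch 1 and 1.0692 for Batch 22.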
Heating    TICA0610   TIC.PV > 144 & PIC.PV < 190   TIC.MV = 0 %   Automatic
Reacting   PIC0605    PV = SV                                      Automatic

Based on the above, comparing the temperatures at time 34 for the minimum and maximum “D” values in this data set (Fig. 10) shows that only a small temperature difference separates the “best” quality (D=1.0693) from the “worst” quality (D=1.087) product in these samples.

Analysing the operating conditions of the D=1.0693 batch in detail reveals the golden batch benchmarks, as shown in Fig. 11. The operator started the heating stage when TICA0610.PV was above 144 °C and PIC0605.PV was below 144 kPa.

Fig. 12. Comparing manual and automatic reaction control

VI. “D” VALUE COMPARISON BEFORE-AND-AFTER OPTIMIZATION

Another 20 batches were chosen for comparison: 10 batches without optimization and 10 batches in which the reaction was triggered automatically according to the golden batch benchmarks. The distributions of the “D” values of these batches with and without optimization are shown in Fig. 13. The average “D” value decreased from 1.0805 to 1.073 after optimization, which means that final batch quality was improved considerably by mining multi-way data, finding golden batch benchmarks and optimizing process control.
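To put the reported averages in perspective, the relative change in the mean “D” value can be computed directly from the two figures given above. The helper name is hypothetical, and only the two averages stated in the text are used; the individual batch values are not available here.

```python
def relative_change(before, after):
    """Fractional change from 'before' to 'after' (negative means a decrease)."""
    return (after - before) / before

# Average "D" values reported for the 10 + 10 comparison batches.
change = relative_change(1.0805, 1.073)
print(f"{change:.2%}")  # -0.69%
```

The absolute decrease is 0.0075, which is a large share of the gap between the “best” (1.0693) and “worst” (1.087) values observed in the original data set.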
In addition to the production cycle time being reduced and stabilized, first pass yield (FPY) also improved continuously from September last year to May this year, as shown in Fig. 15.

Fig. 14. Production cycle time comparison

Fig. 15. Continuous improvement of FPY

VIII. CONCLUSIONS

This paper applies MPCA and MPLS methodology to optimize a batch process. Process engineers and control engineers can obtain valuable information on data analysis and control optimization for batch processes from this study.

(1) How to optimize a batch process: combining golden batch benchmarks with automatic control strategies in the DCS can reduce production batch time and improve quality.

(2) Economic benefits: a shorter batch time means that equipment production capacity is enlarged without new investment, and quality improvement means that the product can be sold at a higher price and with a higher level of customer confidence.

(3) Value of “big data”: if company management asks about the potential benefit of a real-time database (such as the PI System) or of “big data”, this paper offers an answer, because it integrates data analysis with engineering practice.

REFERENCES

[1] Instrument Society of America, ISA-88.01-1995 (R2006) Batch Control, Part 1: Models and Terminology, chapter 4 Batch processes and equipment. North Carolina: ISA, 1995, pp. 123–135.
[2] L. H. Chiang, L. Riccardo, J. P. Randy and B. S. Mary, “Industrial Experiences with Multivariate Statistical Analysis of Batch Process Data,” Chemometrics and Intelligent Laboratory Systems, vol. 81, no. 2, pp. 109–119, 2006.
[3] E. Keogh and S. Kasetty, “On the need for time series data mining benchmarks: a survey and empirical demonstration,” in Proc. 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Canada, 2002, pp. 102–110.
[4] I. Jolliffe, Principal component analysis (second ed.), Springer, 2002, pp. 130–135.
[5] J. Jackson, A user’s guide to principal components, New York: Wiley-Interscience, 1991, pp. 123–129.
[6] P. Geladi and B. R. Kowalski, “Partial least-squares regression: A tutorial,” Analytica Chimica Acta, vol. 185, pp. 1–17, 1986.
[7] A. Hoskuldsson, “PLS regression methods,” Journal of Chemometrics, vol. 2, no. 2, pp. 211–228, 1988.
[8] Y. L. Su, F. Q. Yu, J. Zhou and Q. X. Zhang, “Product Moisture Real-time Monitoring Based on Soft-sensing Technique,” in Proc. IEEE International Conference on Information and Automation, 2014, pp. 604–609.
[9] P. Kadlec, B. Gabrys, and S. Strandt, “Data-driven Soft Sensors in the Process Industry,” Computers & Chemical Engineering, vol. 33, no. 4, pp. 795–814, 2009.
[10] V. Venkatasubramanian, R. Raghunathan and N. K. Surya, “A Review of Process Fault Detection and Diagnosis,” Computers & Chemical Engineering, vol. 27, no. 3, pp. 293–326, 2003.
[11] B. M. Wise, N. B. Gallagher, S. Butler, D. White and G. Barna, “A comparison of principal component analysis, trilinear decomposition and parallel factor analysis for fault detection in a semiconductor etch process,” Journal of Chemometrics, vol. 13, no. 3, pp. 379–396, 1999.
[12] G. Cherry and S. J. Qin, “Multiblock principal component analysis based on a combined index for semiconductor fault detection and diagnosis,” IEEE Transactions on Semiconductor Manufacturing, vol. 19, no. 2, pp. 159–172, 2006.
[13] P. Nomikos and J. F. MacGregor, “Monitoring of batch process using multi-way principal component analysis,” AIChE Journal, vol. 40, no. 8, pp. 1361–1375, 1994.
[14] S. Wold, N. Kettaneh, H. Friden and A. Holmberg, “Modelling and Diagnostics of a Batch Process and Analogous Kinetic Experiments,” Chemometrics and Intelligent Laboratory Systems, vol. 44, no. 8, pp. 331–340, 1998.
[15] Y. Yao and F. R. Gao, “Survey on Multistage/multiphase Statistical Modeling Methods for Batch Processes,” Annual Reviews in Control, vol. 33, no. 2, pp. 172–183, 2009.
[16] I. E. Grossmann, “Advances in Mathematical Programming Models for Enterprise-wide Optimization,” Computers & Chemical Engineering, vol. 47, no. 2, pp. 2–8, 2012.
[17] P. Nomikos and J. F. MacGregor, “Multivariate SPC charts for monitoring batch processes,” Technometrics, vol. 37, no. 1, pp. 41–59, 1995.
[18] V. Venkatasubramanian, “DROWNING IN DATA: Informatics and Modeling Challenges in a Data-rich Networked World,” AIChE Journal, vol. 55, no. 1, pp. 2–8, 2009.
[19] I. E. Grossmann and A. W. Westerberg, “Research challenges in process systems engineering,” AIChE Journal, vol. 46, no. 9, pp. 1700–1703, 2000.
[20] K. U. Klatt, and M. Wolfgang, “Perspectives for Process Systems Engineering - Personal Views from Academia and Industry,” Computers & Chemical Engineering, vol. 33, no. 3, pp. 536–550, 2009.
[21] T. C. Fu, “A Review on Time Series Data Mining,” Engineering Applications of Artificial Intelligence, vol. 24, no. 1, pp. 64–81, 2011.
[22] S. J. Qin, “Process Data Analytics in the Era of Big Data,” AIChE Journal, vol. 60, no. 9, pp. 3092–3100, 2014.
[23] N. Blakemore and R. Aris, “Studies in optimization-V.: The bang-bang control of a batch reactor,” Chemical Engineering Science, vol. 17, no. 8, pp. 591–598, 1962.