A hybrid information model based on long short-term memory network for tool condition monitoring

Article in Journal of Intelligent Manufacturing · August 2020

DOI: 10.1007/s10845-019-01526-4


Weili Cai¹ · Wenjuan Zhang² · Xiaofeng Hu¹ · Yingchao Liu¹

Received: 24 August 2019 / Accepted: 12 December 2019


© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
Excessive tool wear leads to the damage and eventual breakage of the tool, workpiece, and machining center. Therefore, it is crucial to monitor the condition of tools during processing so that appropriate actions can be taken to prevent catastrophic tool failure. This paper presents a hybrid information system based on a long short-term memory network (LSTM) for tool wear prediction. First, a stacked LSTM is used to extract the abstract and deep features contained within the multi-sensor time series. Subsequently, the extracted temporal features are combined with process information to form a new input vector. Finally, a nonlinear regression model is designed to predict tool wear based on the new input vector. The proposed method is validated on both the NASA Ames milling data set and the 2010 PHM Data Challenge data set. Results show the outstanding performance of the hybrid information model in tool wear prediction, especially when the experiments are run under various operating conditions.

Keywords Tool condition monitoring · Tool wear · Maintenance · Deep learning · Long short-term memory network ·
Process information

* Xiaofeng Hu
[email protected]

¹ School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
² Warwick Business School, University of Warwick, Coventry, UK

Introduction

Tool condition monitoring (TCM) plays a crucial role in the manufacturing industry, especially for machining processes that use precious or expensive parts. The tool flank incurs wear during the machining process because of the contact forces that exist between the tool flank and the workpiece. These forces cause deterioration and damage to the surface finish, which eventually lead to the breakage of the tool, workpiece, and machining center (Siddhpura and Paurobally 2012; Aghazadeh et al. 2018). Jain and Lad (2017) emphasized the interaction between product quality and tool degradation. After the wear on the cutting tool exceeds a certain threshold, the machined parts no longer meet the machining requirements. Based on this threshold and the tool wear, the tool's remaining useful life (RUL) can be defined. There is a high demand for higher productivity and lower cost under the precondition that the machining requirements are met. Therefore, a reliable and accurate TCM system that can obtain the health status of a tool is imperative.

Tool condition can be classified into three categories: tool breakage, tool chipping, and tool wear. Tool breakage and tool chipping occur abruptly in an observable and random manner, whereas tool wear develops gradually and can be predicted (Cho et al. 2009). Therefore, TCM systems mostly focus on tool wear prediction. The methods used to measure tool wear fall into two categories: direct methods, which are mostly intermittent and offline; and indirect methods, which are mostly continuous and online (Zhang et al. 2016; Si et al. 2011). Direct methods measure the actual dimensions of the worn area with sensors such as proximity sensors, radioactive sensors, and vision sensors, which have the advantage of being able to acquire direct and accurate dimension changes. However, because these sensors are sensitive to cutting fluid, chips, vibrations and various other environmental disturbances, direct measurements are usually performed while the system is offline (Kurada and Bradley 1997; Zhang et al. 2016). Indirect methods obtain tool wear information through various sensor signals, such as cutting force, torque, vibration, acoustic emission (AE), spindle power, current, surface roughness


and temperature. Compared with direct methods, they are easier to install and implement while the system is online, but the tool wear predictions are not as accurate as direct measurements (Zhang et al. 2016).

In general, the indirect TCM system includes the following procedures: sensor data acquisition, signal preprocessing, feature extraction and selection, and model prediction. Feature extraction reduces the dimension of the original or preprocessed signal and makes sure that the extracted features are correlated with the tool condition. These extracted features can still be of a large volume, or can contain redundant information, both of which can negatively affect the performance of the prediction model. Feature selection techniques such as Pearson's correlation coefficient and principal component analysis are introduced after feature extraction to select the most valuable information and reduce dimensionality (Zhang et al. 2017; Wu et al. 2018b; Zhang et al. 2016; Shi et al. 2018). To build the relationship between selected features and tool wear or RUL, common machine learning methods are used, such as support vector machine (SVM), artificial neural networks (ANNs), neuro fuzzy inference systems, and hidden Markov models (HMMs).

However, shallow machine learning methods and intelligent system methods have two obvious deficiencies. (1) The performance of these methods largely depends on the specific feature extraction and selection methods. Extracting and selecting features from sensor signals requires prior knowledge and diagnostic expertise, which may limit the full utilization of the signals. (2) The shallow-architecture methods are not fully capable of learning the relationship between the input signal/features and tool condition/RUL, which is highly complex and non-linear (Jia et al. 2016; Shi et al. 2018). Deep learning (Hinton and Salakhutdinov 2006), which refers to a class of machine learning techniques where many layers of information processing stages in hierarchical architectures are exploited both for pattern classification and for feature or representation learning (Deng 2014), has the potential to overcome these deficiencies. The deep architectures and multiple layers enable these techniques to learn the complex relationship from low-processed or even raw signals (Aghazadeh et al. 2018; Jia et al. 2016). When applied to industrial machining, where massive data are obtained through various sensors to monitor tool condition, deep learning techniques have powerful characteristics, such as excellent representation and non-linear learning ability, which allow them to outperform other methods.

Long short-term memory networks (LSTMs) are known as a powerful deep learning architecture for handling sequential data, with various applications including natural language processing (Jimeno Yepes 2017), speech recognition (Brocki and Marasek 2015) and financial market predictions (Fischer and Krauss 2018). Researchers have also explored the applications of LSTMs in industrial and manufacturing engineering. Nguyen and Medjaher (2019) constructed an LSTM classifier to predict the probabilities of turbofan engine system failure during different time horizons. Zhang et al. (2019) assessed bearing degradation states and estimated the bearing RUL using LSTMs.

In this paper, we propose a hybrid information model, based on an LSTM, for tool wear prediction. For validation, both the milling data set from NASA Ames (Agogino and Goebel 2007) and the data set reported in the 2010 PHM Data Challenge (PHM Society Conference Data Challenge, provided at https://www.phmsociety.org/competition/phm/10) are tested. The main contributions of this paper include:

• A deep learning architecture is used to adaptively extract features from the sensor signal, instead of shallow machine learning procedures with manual feature extraction and selection, which overcomes the dependence on specific prior knowledge and diagnostic expertise.
• A hybrid information model is designed based on deep learning techniques, which are not commonly used in TCM. In addition, LSTMs are selected to model the monitoring sensor signal, owing to their recurrent structure for time series processing. The performance of the LSTM-based model was compared with LR, SVR, MLP, and CNN, and the results proved its capability in modeling sensor signals in TCM.
• Process information is combined with the temporal features extracted by LSTMs to form a new input vector, which significantly improves the prediction accuracy when the machining process runs under various operating conditions. The machining capability of a cutting tool depends on the properties of the tool, the workpiece material and the process parameters, whereas the monitoring signals merely reflect the degradation process and performance fluctuation of the cutting tool.

The remainder of this paper is organized as follows. "Previous work" section briefly reviews the related work on indirect TCM methods. "Methods" section proposes a hybrid information system for tool wear prediction based on an LSTM. "Experiments" section details the two data sets used in this study. "Results and discussion" section discusses the experimental results, including comparisons with some representative approaches. Finally, our concluding remarks and areas for future study are presented in "Conclusion" section.

Previous work

Tool wear can be obtained intuitively and accurately by using TCM methods based on direct measurement. García-Ordás et al. (2018) proposed a computer vision-based system


(based on a shape-features descriptor called ShapeFeat and a contour-features descriptor called BORCHIZ) that classifies the wear level of inserts during the intervals between each cutting process. However, in general, the interval between each cutting process is too short to measure tool wear. Direct measurement leads to a serious loss of processing efficiency. Under the premise of ensuring processing efficiency, monitoring tool condition based on indirect sensors has become an important research field.

Yesilyurt and Ozturk (2007) developed a TCM system in end milling. Time and frequency domain features were extracted from the vibration signal and utilized to indicate tool wear progression and tool breakage. Bhattacharyya et al. (2008) developed an online TCM system for face milling. Tool wear was estimated in real time using a multiple linear regression model, based on spindle current and voltage measurements. Özel and Karpat (2005) proposed a predictive modelling approach in finish hard turning. Surface roughness and tool wear were predicted by ANNs, the inputs of which include workpiece hardness, cutting speed, feed rate, axial cutting length and the mean value of three measured force components. Freyer et al. (2014) compared orthogonal cutting force and unidirectional strain component processing for tool condition monitoring. Wavelet packet analysis determined the wear-sensitive features, and time-delay neural networks were applied to estimate tool wear. Yang et al. (2014) developed particle swarm optimization-based ANNs to predict flank wear in drilling operations, based on thrust force, torque, and processing parameters. Sun et al. (2005) proposed a modified SVM method to evaluate tool condition, where signal features were extracted and selected from the AE sensor signal. Sun et al. (Sun 2007) developed an effective TCM system using SVM, where AE features were selected by an automatic relevance determination method to recognize tool status. Zhang et al. (2015) designed a TCM system based on AE and cutting sound signals, where tool wear was predicted by SVM.

Sharma et al. (2007) developed a tool wear condition monitoring system for the turning process. The tool wear rate was predicted using an adaptive neuro fuzzy inference system, based on cutting force, vibration and AE signals. Ren et al. (2014) developed a micro milling TCM system. Tool RUL was estimated using a type-2 fuzzy logic system, based on multiple AE signal features. Zhu et al. (2009) proposed a tool wear monitoring system in micro milling. Continuous HMMs were adapted to classify the tool flank wear state into multiple categories, based on cutting force features. Cho et al. (2009) designed a TCM system in end milling, based on multiple sensor fusion, including force, vibration, and AE. Tool condition classification was implemented by a machine ensemble technique, using a support vector machine, a multilayer perceptron and a radial basis function neural network. Wu et al. (2018a) combined cloud computing and random forest to predict tool wear in the milling process. Thousands of regression trees of the random forest were built to predict tool wear respectively, where the MapReduce technique was introduced to obtain the final prediction of tool wear. Liu et al. (2019) proposed an RUL prediction model for turbine cutting tools, based on health index similarity, which considered distance similarity and spatial direction similarity.

Moreover, deep learning methods have been applied to TCM in recent studies. Shi et al. (2018) proposed a deep learning modeling framework in ultra-precision manufacturing, using multiple stacked sparse auto-encoders to diagnose the state of tool wear, based on multiple feature spaces of the vibration signal. Aghazadeh et al. (2018) employed a convolutional neural network (CNN) to estimate tool wear. Wavelet transform was combined with a spectral subtraction algorithm to extract features from the current signal. To predict tool wear, R. Zhao et al. (2017) added a convolutional layer before the LSTM to extract local features of processed sensory data. Zhao et al. (2018) predicted tool wear under dry milling operation, using a local feature-based gated recurrent unit network, which is regarded as a simplified LSTM.

In summary, the related work presented in this section explored how to monitor tool condition with various indirect sensors as well as how tool wear or RUL can be accurately predicted. While earlier work focused on shallow machine learning techniques such as ANNs and SVMs, ensemble learning and deep learning methods have gradually been applied in TCM. To apply deep learning methods in TCM, it is essential to design a reasonable model and introduce enough information relevant to tool wear. This paper explores a hybrid information model based on LSTM, which combines process information and the monitoring sensor signal to predict tool wear.

Methods

Recurrent neural networks

Deep learning techniques for data analysis fall into three categories: (1) Methods without space–time processing, such as the fully connected neural network. Multi-layer neural networks commonly deal with the extracted features of the raw signal. The neural network possesses capabilities of nonlinear mapping and adaptive prediction (Kwon 2017). (2) Sliding window methods, such as the CNN. These methods focus on the information contained in spatially structured data, such as an image. (3) Sequence learning methods, such as the recurrent neural network (RNN). These methods place emphasis on the information contained in the time series (Zheng et al. 2017; Zhao et al. 2017; Goodfellow et al. 2016). Different network structures are suitable for different application scenarios. For instance, to forecast corporate


bankruptcy using textual disclosures, Mai et al. (2019) found that CNN is less effective than simpler models such as word embedding.

Naturally, the sensor signals are time series data, which are sampled and expressed in a sequential form. To process the sensor signals obtained during machining, it is important to consider the information embedded in time dependencies. The RNN (Elman 1990) is capable of using all the historical information, and it maintains a low model complexity by virtue of weight sharing. The RNN has become one of the most important subfields of deep learning, and has been widely used in domains such as text, music, speech and motion capture data (LeCun et al. 2015; Wu et al. 2018c).

RNNs are a specialized family of neural networks that process sequential values $x^{(1)}, \ldots, x^{(\tau)}$, whereas CNNs are specialized for processing spatial values $X$ such as images. When the recurrent structure is unfolded in time, the RNN can be seen as a very deep feedforward network, as illustrated in Fig. 1. The unfolded recurrence can be represented as (Goodfellow et al. 2016):

$$h^{(t)} = f\left(h^{(t-1)}, x^{(t)}; \boldsymbol{\omega}\right) = f\left(f\left(h^{(t-2)}, x^{(t-1)}; \boldsymbol{\omega}\right), x^{(t)}; \boldsymbol{\omega}\right) = \cdots = g^{(t)}\left(x^{(t)}, x^{(t-1)}, x^{(t-2)}, \ldots, x^{(2)}, x^{(1)}\right), \tag{1}$$

where $h^{(t)}$ are the values of the hidden units at the current time $t$, $x^{(t)}$ is the input of the RNN at the current time $t$, and $\boldsymbol{\omega}$ are the internal parameters of the hidden units. The function $g^{(t)}$ takes all past sequences as input to acquire the current state of the RNN units. The recurrent structure enables the RNN to capture information from all previous time steps.

Although the main purpose of the RNN is to learn long-term dependencies, gradients tend to vanish or explode when propagated over many stages (Bengio et al. 1994). To solve the vanishing and exploding gradients problem, some variants have been proposed, such as the echo state network and the LSTM. The echo state network randomly generates a large-scale sparse connection matrix called a reservoir, which reduces the computational difficulty because it acts as the information processing unit instead of the hidden layer of the RNN (Peng et al. 2012). The LSTM (Hochreiter and Schmidhuber 1997; Graves and Schmidhuber 2005) architecture is an explicit memory cell that contains non-linear gate units to control the information flow. These gate units, especially the forget gate, allow the LSTM to effectively solve the problem of long-term dependencies.

Long short-term memory networks

The LSTM cell has the same inputs and outputs as the ordinary RNN cell, but it has an effective system of gating units to control the information flow, as illustrated in Fig. 2. The forget gate controls the weight of the self-loop cell state to update the memory cell, based on the current input $x^{(t)}$ and the output of the previous moment $h^{(t-1)}$:

$$F_i^{(t)} = \sigma\left(b_i^F + \sum_j U_{i,j}^F x_j^{(t)} + \sum_j W_{i,j}^F h_j^{(t-1)}\right), \tag{2}$$

where $b^F$, $U^F$, $W^F$ denote the respective biases, input weights and recurrent weights of the forget gate; $\sigma$ is the sigmoid function, which sets the forget gate to a value between 0 and 1. The input gate controls the information fed into the memory cell:

$$I_i^{(t)} = \sigma\left(b_i^I + \sum_j U_{i,j}^I x_j^{(t)} + \sum_j W_{i,j}^I h_j^{(t-1)}\right), \tag{3}$$

where $b^I$, $U^I$, $W^I$ are the respective biases, input weights and recurrent weights of the input gate. After that, the internal state of the LSTM cell is updated as follows:

$$S_i^{(t)} = F_i^{(t)} S_i^{(t-1)} + I_i^{(t)} \tanh\left(b_i + \sum_j U_{i,j} x_j^{(t)} + \sum_j W_{i,j} h_j^{(t-1)}\right), \tag{4}$$

where $b$, $U$, $W$ denote the respective biases, input weights and recurrent weights into the LSTM cell. The output gate controls the weight of the cell output:

$$O_i^{(t)} = \sigma\left(b_i^O + \sum_j U_{i,j}^O x_j^{(t)} + \sum_j W_{i,j}^O h_j^{(t-1)}\right), \tag{5}$$

where $b^O$, $U^O$, $W^O$ denote the respective biases, input weights and recurrent weights of the output gate. Finally, the output of the LSTM cell is

Fig. 1 A recurrent network without output, and its unfolding structure


Fig. 2 A long short-term memory network cell

$$h_i^{(t)} = \tanh\left(S_i^{(t)}\right) \cdot O_i^{(t)}. \tag{6}$$

Based on these gating units, LSTMs are capable of learning long-term dependencies.

Adam optimizer

According to the LSTM structure detailed above, the gradients of the weights and biases of each cell can be calculated. Optimization algorithms based on gradient descent, such as stochastic gradient descent, AdaGrad, RMSprop, and Adam, are introduced to train the neural network and to find the optimal value of the cost function. Different optimization algorithms are suitable for different networks, and finding the appropriate algorithm needs theoretical analysis and trial and error in many cases. The Adam algorithm (Kingma and Ba 2014) can be seen as a variant combining RMSprop (Tieleman and Hinton 2012) and momentum with a few important modifications. Adam incorporates momentum directly into the estimation of the first-order moment of the gradient. Bias corrections are included in Adam to modify the estimates of the first-order and second-order moments to account for their initialization at the origin. Adam is generally considered to be fairly robust to the selection of hyperparameters (Goodfellow et al. 2016).

Proposed methodology

The framework of our proposed methodology is shown in Fig. 3, and is described in detail below. The raw sensor signals are collected from the processing site or laboratory environment, where one or more different sensors are used to monitor the tool condition during machining. Considering the differences in the dimension and numerical value of different sensors, data normalization is adopted for data preprocessing, to reduce computational complexity and improve model performance. The most commonly used methods for this are min–max normalization and Z-score normalization. We use Z-score normalization in this study:

$$x_i' = \frac{x_i - \mu_i}{\sigma_i}, \tag{7}$$

where $\mu_i$ is the mean of the $i$th sensor data $x_i$, and $\sigma_i$ is the corresponding standard deviation.

After data preprocessing, including data normalization and other necessary operations, the sensor signals are fed into a deep LSTM as a temporal encoder to extract information from the time sequences. The deep LSTM consists of three layers, where each layer is followed by a dropout layer to avoid overfitting as far as possible. By means of dropout, parts of the hidden units are randomly masked so they do not exert influence on the forward propagation during the training process (R. Zhao et al. 2017).

If differences exist between the process conditions and process parameters during monitoring, we combine the output of the deep LSTMs with the process information:

$$\text{InputVector} = \left[\text{TemporalFeatures}, \text{ProcessInfo}\right]. \tag{8}$$

Generally, process information includes the type of material, feed, depth of cut, number of different cases, among others. In an actual machining process, the machining capability of


Fig. 3 Tool monitoring framework based on long short-term memory network (blocks: raw sensor signal, data preprocessing, temporal encoder (deep LSTM), process information, nonlinear regression with input, hidden layer 1, hidden layer 2 and output, tool wear estimation)

cutting tool depends on the properties of the tool, the workpiece material and the process parameters, whereas the process monitoring sensor signals merely reflect the degradation process and performance fluctuation of the cutting tool. However, researchers often pay excessive attention to the process monitoring signal, or treat the sensor signal and the process information in the same way. Combining process information with the temporal features extracted by the LSTM helps to predict the tool wear. Finally, a nonlinear regression model is designed to predict tool wear, which consists of two fully-connected layers and a linear regression layer.

The loss function of the model is defined as the root mean square error (RMSE), which is widely used as a loss function or evaluation criterion:

$$\text{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2}, \tag{9}$$

where $y_i$ and $\hat{y}_i$ denote the target value and the predicted value, respectively, and $m$ is the number of samples. The Adam optimization algorithm is adopted here to train the models.

Experiments

In this section, we first introduce the two data sets selected for evaluating our proposed method: the 2010 PHM Data Challenge data set and the NASA Ames milling data set. Details of the experimental setup for method comparison are provided subsequently.

Description of 2010 PHM Data Challenge data set

To analyze the performance of our LSTM-based method, a set of experimental data measured from a high speed CNC machine under dry milling operations is employed, as illustrated in Fig. 4 (Wang et al. 2017). A three-flute ball nose tungsten carbide cutter (stainless steel, HRC52) was tested in a down milling operation. The operation parameters were set as follows: the running speed of the spindle was 10,400 rpm; the feed rate was 1555 mm/min in the x direction; the depth of cut (radial) was 0.125 mm in the y direction; the depth of cut (axial) was 0.2 mm in the z direction.

Fig. 4 Illustration of the experimental setup for 2010 PHM Data Challenge data set

A Kistler quartz 3-component platform dynamometer was mounted between the workpiece and the machining table to measure the cutting forces. Three Kistler piezo accelerometers were mounted on the workpiece to measure the machine tool vibrations of the cutting process in the X, Y and Z directions, respectively. A Kistler acoustic emission sensor was mounted on the workpiece to monitor the high frequency stress wave generated by the cutting process. A NI DAQ PCI 1200 board was used to capture the voltage signals after the charge amplifiers. A LEICA MZ 12 microscope was used to measure the flank wear of each individual flute after finishing each surface. Finally, seven channels of signals (force_x, force_y, force_z, vibration_x, vibration_y,


vibration_z, AE-rms) were captured, and the flank wear was set to be the target value.

Description of NASA Ames milling data set

The data in the NASA Ames milling data set represent experiments conducted on a milling machine under various operating conditions, including sixteen cases with a varying number of runs. The specific experimental conditions are described in Table 1.

The experiments were implemented with the Matsuura machining center MC-510V. A 70 mm face mill with six KC710 inserts was chosen as the tool. An acoustic emission sensor (model WD925, PHYSICAL ACOUSTIC GROUP) and a vibration sensor (model 7201-50, ENDEVCO) were mounted on both the table and the spindle of the machining center. The signals from all sensors were amplified and filtered, then passed through two root mean square (RMS) calculations before entering the computer for data acquisition through a high-speed data acquisition board with a maximal sampling rate of 100 kHz. An OMRON K3TB-A1015 current converter fed the signal from the spindle motor current phase into the cable connector, and a model CTA 213 current sensor was used for data acquisition. The signal from the spindle motor current sensor was fed into the computer without further processing. Figure 5 illustrates the experimental setup (Agogino and Goebel 2007).

The flank wear, VB, was measured as the distance from the cutting edge to the end of the abrasive wear on the flank face of the tool, which is a generally accepted parameter for evaluating tool wear. The insert was taken out of the tool and the wear was measured using a microscope. Measurements for VB were not taken after each run.

Experimental setup

The two data sets are used to evaluate the performance of our proposed method. The 2010 PHM Data Challenge data set was collected under invariant operating parameters, while the NASA Ames milling data set has sixteen cases. Considering the characteristics of the two data sets, different data preprocessing methods are implemented.

The 2010 PHM Data Challenge data set recorded six individual cutters, but the corresponding tool wear is only available for three of them. These three individual cutter records are selected as our data set, named c1, c4, and c6. Each run recorded a sensor time series of varying length, with more than 200 thousand time steps, and each cutter has 315 runs in total. Such a long time series is difficult for LSTMs to process, and downsampling the time series at a low rate, such as one percent, causes signal distortion.

Table 1 Experimental conditions of milling data set

| Case | Depth of cut | Feed | Material |
|------|--------------|------|----------|
| 1 | 1.5 | 0.5 | 1-cast iron |
| 2 | 0.75 | 0.5 | 1-cast iron |
| 3 | 0.75 | 0.25 | 1-cast iron |
| 4 | 1.5 | 0.25 | 1-cast iron |
| 5 | 1.5 | 0.5 | 2-steel |
| 6 | 1.5 | 0.25 | 2-steel |
| 7 | 0.75 | 0.25 | 2-steel |
| 8 | 0.75 | 0.5 | 2-steel |
| 9 | 1.5 | 0.5 | 1-cast iron |
| 10 | 1.5 | 0.25 | 1-cast iron |
| 11 | 0.75 | 0.25 | 1-cast iron |
| 12 | 0.75 | 0.5 | 1-cast iron |
| 13 | 0.75 | 0.25 | 2-steel |
| 14 | 0.75 | 0.5 | 2-steel |
| 15 | 1.5 | 0.25 | 2-steel |
| 16 | 1.5 | 0.5 | 2-steel |
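
The conditions in Table 1 are the kind of process information that Eq. (8) appends to the LSTM features. The following is a minimal sketch of that concatenation, not the authors' implementation. The names `PROCESS_TABLE`, `encode_process_info` and `build_input_vector` are hypothetical, and the encoding used here (raw depth of cut, raw feed, material code as a number) is an assumption, since the paper does not state how the process information is encoded.

```python
# Process information per case, copied from Table 1:
# case -> (depth of cut, feed, material code), with 1 = cast iron and 2 = steel.
PROCESS_TABLE = {
    1: (1.5, 0.5, 1), 2: (0.75, 0.5, 1), 3: (0.75, 0.25, 1), 4: (1.5, 0.25, 1),
    5: (1.5, 0.5, 2), 6: (1.5, 0.25, 2), 7: (0.75, 0.25, 2), 8: (0.75, 0.5, 2),
    9: (1.5, 0.5, 1), 10: (1.5, 0.25, 1), 11: (0.75, 0.25, 1), 12: (0.75, 0.5, 1),
    13: (0.75, 0.25, 2), 14: (0.75, 0.5, 2), 15: (1.5, 0.25, 2), 16: (1.5, 0.5, 2),
}


def encode_process_info(case):
    """Return the process information of a case as a flat list of floats.

    Assumption: values are used as-is; a real model might scale them or
    one-hot encode the material instead.
    """
    depth_of_cut, feed, material = PROCESS_TABLE[case]
    return [depth_of_cut, feed, float(material)]


def build_input_vector(temporal_features, case):
    """Eq. (8): concatenate temporal features with process information."""
    return list(temporal_features) + encode_process_info(case)


# Placeholder values standing in for the output of the deep LSTM encoder.
temporal_features = [0.12, -0.40, 0.87]
print(build_input_vector(temporal_features, case=5))
```

In this sketch the regression layers would receive a vector whose last three entries identify the operating condition, which is how runs from different cases could be told apart even when their sensor features look similar.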

Fig. 5 Illustration of the experimental setup for NASA Ames milling data set
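
Both data sets mix channels with very different scales (force, vibration, acoustic emission, current), which is what the Z-score normalization of Eq. (7) addresses. The sketch below applies Eq. (7) channel by channel using only the standard library. The function names and channel names are hypothetical, and two details are assumptions not stated in the paper: the population standard deviation is used, and a constant channel is mapped to zeros to avoid division by zero.

```python
import math


def zscore_channel(values):
    """Apply Eq. (7) to one sensor channel: (x - mean) / std."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # Assumption: a constant channel carries no information, so return zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def zscore_signals(signals):
    """Normalize every channel of a run independently.

    `signals` maps a channel name to its list of samples.
    """
    return {name: zscore_channel(samples) for name, samples in signals.items()}


# Toy run with two channels on very different scales (made-up numbers).
run = {
    "ae_spindle": [0.10, 0.12, 0.11, 0.15],
    "spindle_current": [6.0, 6.5, 7.0, 6.5],
}
for name, samples in zscore_signals(run).items():
    print(name, [round(v, 3) for v in samples])
```

In practice the mean and standard deviation would usually be estimated on the training runs and reused for the test runs; the paper does not say which convention was followed.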


Table 2  Configurations for training and test setting

Symbol   Training set   Test set
C1       c4, c6         c1
C4       c1, c6         c4
C6       c1, c4         c6

Table 3  List of extracted features

Domain           Features            Expression
Statistical      RMS                 $x_{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$
                 Variance            $x_{var} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$
                 Maximum             $x_{max} = \max(x)$
                 Skewness            $x_{skew} = E\left[\left(\frac{x-\mu}{\sigma}\right)^3\right]$
                 Kurtosis            $x_{kurt} = E\left[\left(\frac{x-\mu}{\sigma}\right)^4\right]$
                 Peak-to-Peak        $x_{p-p} = \max(x) - \min(x)$
Frequency        Spectral skewness   $f_{skew} = \sum_{i=1}^{k} \left(\frac{f_i - \bar{f}}{\sigma}\right)^3 S(f_i)$
                 Spectral kurtosis   $f_{kurt} = \sum_{i=1}^{k} \left(\frac{f_i - \bar{f}}{\sigma}\right)^4 S(f_i)$
Time–frequency   Wavelet energy      $E_{WT} = \sum_{i=1}^{N} wt_{\phi}^{2}(i) / N$

Therefore, the whole sequence was divided into 1000 sections, and the maximum and mean values of each section are extracted to form a new time sequence respectively. For training and test settings, a three-fold setting is adopted: two sets are selected as training sets and one set is used as test set, as illustrated in Table 2 (Zhao et al. 2017). To facilitate training, z-score normalization is used.

In the NASA Ames milling data set, measurements for VB were not taken after each run. Accordingly, those samples with unmeasured VB are excluded from the data set. Time series sensor data with a length of 9000 time steps was recorded in each run, which is downsampled at rate 1/2 to get a more appropriate data length. Z-score normalization is also used. In our experiment, the last two cases of material type 1 (case 11 and case 12) and the last two cases of material type 2 (case 15 and case 16) are selected as the test sets (Zheng et al. 2017).

Two types of methods are selected for comparison: methods with feature extraction and methods without. Methods with feature extraction include linear regression (LR), support vector regression (SVR), and multi-layer perceptron (MLP). As shown in Table 3, the same feature extractions as in Wang et al. (2017) are adopted, except that the specific wavelet energy used in our study contains the percentages of energy corresponding to the approximation and details. Methods without feature extraction include CNNs and LSTMs.

LR requires no hyperparameter setting. For SVR, the predefined set of two hyperparameters forms a grid, which are C in [0.001:0.001:0.01, 0.01:0.01:0.1, 0.1:0.1:1, 1:1:10], and γ in [0.001:0.001:0.01, 0.01:0.01:0.1, 0.1:0.1:1]. The best hyperparameters are determined through an exhaustive grid search. The other approaches, including MLP, CNN, and LSTM, are neural networks. MLP consists of a three-layer fully-connected network with 256 hidden units. For CNN, the convolutional and subsampling layers of LeNet-5 (Lecun et al. 1998) are adopted to replace the LSTM element in our proposed method, as a comparison. For our proposed hybrid information method based on an LSTM, a three-layer LSTM with 256 hidden units for each layer is designed. Subsequently, the output of the LSTM is combined with certain process information, forming a new input vector. Regarding the NASA Ames milling data set, the machining information includes the case number, depth of cut, feed, material, and flank wear of the last run. As for the 2010 PHM Data Challenge data set, since the process conditions and process parameters remain consistent, introducing this information into the model does not bring apparent improvement. Finally, the new input vector is fed into a nonlinear regression model that consists of two fully-connected layers with (32, 8) hidden units and an output layer, to obtain the tool wear prediction.

To quantitatively evaluate the performance of the selected algorithms, three criteria are adopted: mean absolute error (MAE), root mean square error (RMSE), and average accuracy ($\bar{A}$) (Aghazadeh et al. 2018). The respective calculations are as follows:

$$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m} \left|\hat{y}_i - y_i\right|, \tag{10}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m} \left(\hat{y}_i - y_i\right)^2}, \tag{11}$$

and

$$\bar{A} = \frac{1}{m}\sum_{i=1}^{m} \left|\hat{y}_i - y_i\right| / \bar{y}, \tag{12}$$

where $y_i$, $\hat{y}_i$, and $\bar{y}$ are the true wear, predicted wear, and average value of true wear, respectively.

Results and discussion

In this section, the experimental results of the two data sets are presented and discussed.
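Before turning to the results, the preprocessing described for the 2010 PHM Data Challenge data (dividing each sequence into 1000 sections, taking the maximum and mean of each section, then applying z-score normalization) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the even integer split of section boundaries, and the use of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev

def section_features(signal, n_sections=1000):
    """Split a 1-D signal into contiguous sections; return (max_seq, mean_seq)."""
    if len(signal) < n_sections:
        raise ValueError("signal is shorter than the number of sections")
    # ASSUMPTION: boundaries are spread evenly with integer arithmetic,
    # so section lengths differ by at most one sample.
    bounds = [k * len(signal) // n_sections for k in range(n_sections + 1)]
    sections = [signal[a:b] for a, b in zip(bounds, bounds[1:])]
    return [max(s) for s in sections], [mean(s) for s in sections]

def z_score(values):
    """Standardize a sequence to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)  # ASSUMPTION: population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Toy signal: 10 samples reduced to 5 sections of 2 samples each.
max_seq, mean_seq = section_features(list(range(10)), n_sections=5)
normalized = z_score(mean_seq)
```

In practice the normalization statistics would presumably be computed on the training folds and reused for the test fold; the source does not state this, so the sketch normalizes a single sequence only.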


PHM Data Challenge data set results

The 2010 PHM Data Challenge data set represents experiments from runs on a CNC machine under consistent operating conditions. The results of different methods on the three-fold data sets are shown in Table 4. Regression models with feature extraction, including LR, SVR and MLP, are analyzed first. LR performs worst because of the limitation of linear models. SVR with an RBF kernel function can capture the nonlinear relationship between the extracted features and tool wear; thus, it performs better than LR. MLP is also a nonlinear method, which performs better than LR but worse than SVR. In addition to the structure (i.e. linear or non-linear), the performance of different methods is also influenced by the expert feature extraction. The feature extraction methods shown in Table 3 might be more suitable for methods based on support vectors, rather than neural networks.

CNN and LSTM are two typical deep learning methods, in which CNN mainly focuses on the implied information in the spatial structure data and LSTM focuses on the time series. Based on the evaluation criteria, the performance of the CNN-based method is unsatisfactory when dealing with multidimensional time sequences without further signal processing, whereas the LSTM-based method achieves the best performance among all methods. To better demonstrate the effectiveness of the LSTM-based method, Fig. 6 illustrates the comparison between predicted tool wear and the true value under three different scenarios. Despite the prediction errors, particularly at the end of C6, the tool wear predictions follow the actual trends well.

As shown in Fig. 6, the trend of tool wear in these three cases is different. Due to the existence of individual differences, predicting the tool wear of one cutting tool based on the samples from another two does not yield satisfactory results, that is, it captures the global trend but does not perform well in local details. To further verify the capability of the LSTM-based method and improve the prediction accuracy of tool wear, a fitting experiment, a dynamic training mode and an experiment with randomized training data set selection are implemented as follows.

For the fitting experiment, all samples in the three different sets are fed into the same LSTM-based model as training examples. Subsequently, the trained model is used to test data sets c1, c4 and c6, respectively. As shown in Fig. 7, excellent fitting results are achieved with tiny fluctuations.

For the dynamic training mode, the training and test splitting is the same as the original model. The difference is that the prediction procedure is implemented along with the dynamic update of the trained model. The procedures are as follows:

(1) Use the original training set to train the model.
(2) Utilize the trained model to predict tool wear of the first 5 cuts in the test set.
(3) Add the 5 tested samples to the training set.

Table 4  Performance comparison among algorithms using 2010 PHM Data Challenge data set

Algorithms   MAE                       RMSE                      Average accuracy, %
             C1      C4      C6        C1      C4      C6        C1     C4     C6
LR           0.0337  0.0164  0.0668    0.0501  0.0189  0.0904    70.41  84.60  52.22
SVR          0.0098  0.0262  0.0273    0.0144  0.0293  0.0389    91.36  75.39  80.46
MLP          0.0251  0.0336  0.0268    0.0288  0.0398  0.0336    78.00  68.47  80.83
CNN          0.0254  0.0391  0.0383    0.0293  0.0436  0.0553    77.68  63.26  72.57
LSTM         0.0085  0.0085  0.0146    0.0114  0.0117  0.0212    92.54  92.04  89.56
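The three criteria reported in Table 4 are those of Eqs. (10)–(12), and a minimal sketch of their computation is given here. One point is an assumption rather than something the source states: the right-hand side of Eq. (12) is an error ratio (MAE divided by mean true wear), while the tabulated percentages look consistent with 100·(1 − ratio). Under that reading, the C1 values of all five algorithms imply nearly the same mean wear (about 0.114), so the sketch exposes both quantities. The function names are illustrative.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error, Eq. (10)."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error, Eq. (11)."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def error_ratio(y_true, y_pred):
    """Right-hand side of Eq. (12): MAE divided by the mean true wear."""
    y_bar = sum(y_true) / len(y_true)
    return mae(y_true, y_pred) / y_bar

def accuracy_percent(y_true, y_pred):
    """ASSUMPTION: tabulated average accuracy = 100 * (1 - error ratio)."""
    return 100.0 * (1.0 - error_ratio(y_true, y_pred))

# Toy wear values, not taken from the data set.
y_true = [0.10, 0.12, 0.14]
y_pred = [0.11, 0.12, 0.12]
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), accuracy_percent(y_true, y_pred))
```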

Fig. 6  Results of proposed method for three different test scenarios using the 2010 PHM Data Challenge data set (C1, C4 and C6)


(4) Train the model with the updated training set.
(5) Test the next 5 cuts, and repeat (3)–(5) until the end of the test set.

The results of the dynamic training mode are shown in Fig. 8, and the results of three different modes, including fitting, dynamic and original, are compared in Table 5. In addition to excellent fitting results, the dynamic training mode has improved the accuracy of tool wear prediction, compared with the original training mode.

For the experiment with randomized training data set selection, 90% of all samples in the three different sets are randomly selected as the training set, and the remainder is used for model test. Results of tool wear prediction are shown in Fig. 9. Wu et al. (2018a) achieved a very accurate prediction on this data set using MapReduce-based parallel random forests (PRFs), which consist of 10,000 regression trees. For comparison, the same mean square error (MSE) and coefficient of determination ($R^2$) as Wu et al. (2018a) are calculated, as shown in Table 6. Although the $R^2$ metrics of the two methods are close, the hybrid information model is still worse than MapReduce-based PRFs based on MSE. Different from the single LSTM-based model, the MapReduce-based PRFs ensemble integrated 10,000 submodels to predict tool wear, achieving more accurate predictions and faster calculations, which provides a feasible way for our future improvement.

In this case, the LSTM-based model proves its capability in modeling time series. The original model captured the trend of tool wear but the prediction accuracy of local

Fig. 7  Results of fitting experiment for three different scenarios using the 2010 PHM Data Challenge data set (C1, C4 and C6)

Fig. 8  Results of dynamic training mode for three different test scenarios using the 2010 PHM Data Challenge data set (C1, C4 and C6)
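The dynamic training mode whose results Fig. 8 reports (predict 5 cuts, add them to the training set, retrain, repeat) can be sketched as a simple loop. This is an illustration under stated assumptions, not the authors' implementation: `fit` and `predict` are hypothetical callables standing in for the LSTM-based model, and the sketch assumes the measured wear of the tested cuts is available when they are added to the training set. A trivial mean predictor is used only so the example runs.

```python
def dynamic_training(train_set, test_set, fit, predict, step=5):
    """Predict the test set in blocks of `step` cuts, retraining between blocks.

    train_set / test_set: lists of (features, wear) pairs.
    fit(samples) -> model and predict(model, features) -> wear are
    hypothetical callables standing in for the real model.
    """
    samples = list(train_set)
    model = fit(samples)                          # train on the original set
    predictions = []
    for start in range(0, len(test_set), step):
        block = test_set[start:start + step]
        # predict the next `step` cuts with the current model
        predictions.extend(predict(model, features) for features, _ in block)
        # ASSUMPTION: measured wear of the tested cuts is now known
        samples.extend(block)
        if start + step < len(test_set):
            model = fit(samples)                  # retrain on the enlarged set
    return predictions

# Stand-in model: always predicts the mean wear seen so far.
def fit_mean(samples):
    return sum(wear for _, wear in samples) / len(samples)

def predict_mean(model, features):
    return model

train = [(None, 1.0), (None, 3.0)]
test = [(None, 5.0)] * 5 + [(None, 7.0)] * 2
preds = dynamic_training(train, test, fit_mean, predict_mean)
```

Retraining from scratch after every block is the simplest reading of the procedure; whether the original work retrained fully or fine-tuned the existing weights is not stated in the source.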

Table 5  Comparison of performance among different training modes using 2010 PHM Data Challenge data set

Modes     MAE                       RMSE                      Average accuracy, %
          C1      C4      C6        C1      C4      C6        C1     C4     C6
Fitting   0.0025  0.0029  0.0027    0.0038  0.0046  0.0042    97.85  97.26  98.07
Dynamic   0.0069  0.0065  0.0092    0.0109  0.0105  0.0127    93.91  93.93  93.45
Original  0.0085  0.0085  0.0146    0.0114  0.0117  0.0212    92.54  92.04  89.56


details was insufficient. Although each tool contains 315 cutting samples, only three tools are available in this data set, which results in significant individual differences among tools. Accordingly, a dynamic training mode and an experiment with randomized training data set selection were implemented, where the prediction accuracy was improved to a certain extent.

Fig. 9  Results of the experiment with 90% data for training

Table 6  Performance comparison with cloud computing-based random forests

Methods                 MSE      R2      Average accuracy
LSTM-based model        22.704   0.989   0.976
MapReduce-based PRFs    8.295    0.992   –

NASA Ames milling data set results

The data in the NASA Ames milling data set represents experiments from runs on a milling machine under various operating conditions. The results of different methods on the data set are shown in Table 7. Among the regression models with feature extraction (i.e. LR, SVR and MLP), LR performs the worst due to the limitation of linear structures. However, the performances of all three models are unsatisfactory. The sliding window method (i.e. CNN) also performs poorly. For all four methods, the introduction of process information into the models does not bring a positive impact. Regarding the LSTM, this powerful sequence-learning method achieves slightly better results than the other four methods when process information is not involved in the model. However, its performance is still unsatisfactory.

The main causes of this unsatisfactory performance are twofold. First, the expert feature extraction for the regression model might not be well suited to this case, and CNNs are inappropriate for multidimensional time series modeling without further signal preprocessing. Second, sixteen cases are contained in this data set. A total of eight different combinations of process parameters makes it difficult for the models to learn the complex relationship between the sensor signal and the tool wear in all cases. Classification and clustering methods are also inappropriate under such circumstances.

Our proposed LSTM-based hybrid information model achieves the best performance. To further illustrate the effectiveness of our proposed method, a contrast between the model with and without process information is provided in Fig. 10, which shows that the proposed model is capable of predicting the trend of the tool wear, giving a reasonable prediction value within a small error. The results of four cases (case 11, case 12, case 15 and case 16, from left to right) are shown in Fig. 10, of which case 16 performs slightly worse than the other three cases. The trends of tool wear in the sixteen cases are compared in Fig. 11. The tool of case 16 only completed six cuts, three of which lacked tool wear records. Consequently, only three cut samples are available in this case. In addition, the tool wear sharply increased within a short time, resulting in a smaller prediction value than the actual wear.
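The small sample count for case 16 follows from the exclusion rule stated in the experimental setup: runs without a measured VB are dropped, and each remaining 9000-step signal is downsampled at rate 1/2. A minimal sketch of those two steps is given here. The record layout (a dict with `VB` and `signal` keys), the use of `None` for a missing measurement, and the VB numbers themselves are illustrative assumptions, not the data set's actual format or values.

```python
def usable_runs(runs):
    """Drop runs whose flank wear VB was not measured (None here, by assumption)."""
    return [run for run in runs if run["VB"] is not None]

def downsample(signal, step=2):
    """Keep every `step`-th sample; step=2 corresponds to a rate of 1/2."""
    return signal[::step]

# Hypothetical case with six cuts, three of which lack a wear record.
# The VB numbers are placeholders, not values from the data set.
vb_records = [None, 0.10, None, 0.25, None, 0.40]
case_runs = [{"VB": vb, "signal": [0.0] * 9000} for vb in vb_records]

# Each usable run becomes a (downsampled signal, wear label) pair.
samples = [(downsample(run["signal"]), run["VB"]) for run in usable_runs(case_runs)]
```

With only three labelled runs, a single poorly predicted cut has a large effect on the averaged error for that case.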

Table 7  Comparison of performance between algorithms with NASA Ames milling data set

Algorithms   With process information               Without process information
             MAE     RMSE    Average accuracy, %    MAE     RMSE    Average accuracy, %
LR           0.1879  0.2146  41.98                  0.1879  0.2146  41.98
SVR          0.1700  0.1945  47.53                  0.1668  0.1910  48.49
MLP          0.1830  0.2168  43.49                  0.1835  0.2137  42.34
CNN          0.1835  0.2143  43.36                  0.1829  0.2178  43.52
LSTM         0.0322  0.0456  90.06                  0.1558  0.1853  51.91
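The "with process information" columns of Table 7 correspond to the hybrid input described in the experimental setup: the LSTM output is concatenated with the case number, depth of cut, feed, material and flank wear of the last run, then passed through fully-connected layers with (32, 8) hidden units and an output layer. The data flow can be sketched in plain Python as follows. Several details are assumptions made only so the example runs: the 256-value stand-in for the LSTM output, the random untrained weights, the ReLU activation on the hidden layers, and feeding the process values unscaled.

```python
import random

def dense(x, weights, bias, relu=True):
    """One fully-connected layer; `weights` holds one row per output unit."""
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out

def init_layer(n_in, n_out, rng):
    """Random untrained weights, for illustration only."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def hybrid_vector(lstm_features, case, depth_of_cut, feed, material, last_wear):
    """Concatenate temporal features with the five process-information values."""
    return list(lstm_features) + [float(case), depth_of_cut, feed, float(material), last_wear]

def regression_head(x, layers):
    """Hidden layers use ReLU (an assumption); the output layer is linear."""
    for index, (weights, bias) in enumerate(layers):
        x = dense(x, weights, bias, relu=(index < len(layers) - 1))
    return x[0]

rng = random.Random(0)
lstm_features = [rng.gauss(0.0, 1.0) for _ in range(256)]  # stand-in for the LSTM output
x = hybrid_vector(lstm_features, case=11, depth_of_cut=0.75, feed=0.25,
                  material=1, last_wear=0.0)               # last_wear is hypothetical
sizes = [len(x), 32, 8, 1]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
wear_prediction = regression_head(x, layers)
```

Including the flank wear of the last run gives the regression head a direct anchor for the current wear level, which is one plausible reading of why the LSTM row improves so sharply once process information is added.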


Fig. 10  Results of proposed method using NASA Ames milling data set: (a) result with information; (b) result without information

Fig. 11  Tool wear for all sixteen cases

Conclusion

In this paper, a hybrid information model based on LSTM was proposed for tool condition monitoring purposes. A stacked LSTM was first designed to extract temporal features from raw sequential data. Then, a new input vector was formed by combining the temporal features and process information, and fed into a nonlinear regression model, which consisted of two fully-connected layers and a linear regression layer, to obtain the tool wear prediction. The performance of the proposed method was evaluated on two different data sets: the 2010 PHM Data Challenge data set, and the NASA Ames milling data set. Experimental results have shown that the LSTM-based model can predict tool wear more accurately than LR, SVR, MLP and CNN.

The main contributions of this paper are highlighted as follows: (1) Deep learning architecture was used to adaptively extract features from sensor signal, which overcomes the dependence on specific prior knowledge and diagnostic expertise. (2) LSTMs were selected to model monitoring sensor signal, which are not commonly used in TCM. The LSTM-based model outperformed LR, SVR, MLP and CNN, owing to its recurrent structure for time series processing. (3) Process information was combined with the temporal features extracted by LSTMs to form a new input vector, which significantly improved the prediction accuracy when the machining process runs under various operating conditions.

In the future, feature extractions and information should be explored to improve the prediction accuracy of the TCM system. In addition, our future work will focus on the algorithm design based on ensemble learning and parallel computing to improve the prediction accuracy and enhance training efficiency.

Acknowledgements  This work was supported by the Major Program of National Natural Science Foundation of China (Grant Number 51435009).

Compliance with ethical standards

Conflict of interest  The authors declare that they have no conflict of interest.

References

Aghazadeh, F., Tahan, A., & Thomas, M. (2018). Tool condition monitoring using spectral subtraction and convolutional neural networks in milling process. The International Journal of Advanced


Manufacturing Technology, 98(9–12), 3217–3227. https://doi.org/10.1007/s00170-018-2420-0.
Agogino, A., & Goebel, K. (2007). Milling data set. In U. B. BEST lab (Ed.), NASA Ames prognostics data repository. NASA Ames Research Center, Moffett Field, CA. (http://ti.arc.nasa.gov/project/prognostic-data-repository).
Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157–166. https://doi.org/10.1109/72.279181.
Bhattacharyya, P., Sengupta, D., Mukhopadhyay, S., & Chattopadhyay, A. B. (2008). On-line tool condition monitoring in face milling using current and power signals. International Journal of Production Research, 46(4), 1187–1201. https://doi.org/10.1080/00207540600940288.
Brocki, L., & Marasek, K. (2015). Deep belief neural networks and bidirectional long-short term memory hybrid for speech recognition. Archives of Acoustics, 40(2), 191–195. https://doi.org/10.1515/aoa-2015-0021.
Cho, S., Binsaeid, S., & Asfour, S. (2009). Design of multisensor fusion-based tool condition monitoring system in end milling. The International Journal of Advanced Manufacturing Technology, 46(5–8), 681–694. https://doi.org/10.1007/s00170-009-2110-z.
Deng, L. (2014). A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing. https://doi.org/10.1017/atsip.2013.9.
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179–211. https://doi.org/10.1016/0364-0213(90)90002-E.
Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks for financial market predictions. European Journal of Operational Research, 270(2), 654–669. https://doi.org/10.1016/j.ejor.2017.11.054.
Freyer, B. H., Heyns, P. S., & Theron, N. J. (2014). Comparing orthogonal force and unidirectional strain component processing for tool condition monitoring. Journal of Intelligent Manufacturing, 25(3), 473–487. https://doi.org/10.1007/s10845-012-0698-6.
García-Ordás, M. T., Alegre-Gutiérrez, E., González-Castro, V., & Alaiz-Rodríguez, R. (2018). Combining shape and contour features to improve tool wear monitoring in milling processes. International Journal of Production Research, 56(11), 3901–3913. https://doi.org/10.1080/00207543.2018.1435919.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning (Vol. 1). Cambridge: MIT Press.
Graves, A., & Schmidhuber, J. (2005). Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 18(5–6), 602–610. https://doi.org/10.1016/j.neunet.2005.06.042.
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504. https://doi.org/10.1126/science.1127647.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735.
Jain, A. K., & Lad, B. K. (2017). A novel integrated tool condition monitoring system. Journal of Intelligent Manufacturing, 30(3), 1423–1436. https://doi.org/10.1007/s10845-017-1334-2.
Jia, F., Lei, Y., Lin, J., Zhou, X., & Lu, N. (2016). Deep neural networks: A promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data. Mechanical Systems and Signal Processing, 72–73, 303–315. https://doi.org/10.1016/j.ymssp.2015.10.025.
Jimeno Yepes, A. (2017). Word embeddings and recurrent neural networks based on long-short term memory nodes in supervised biomedical word sense disambiguation. Journal of Biomedical Informatics, 73, 137–147. https://doi.org/10.1016/j.jbi.2017.08.001.
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Kurada, S., & Bradley, C. (1997). A review of machine vision sensors for tool condition monitoring. Computers in Industry, 34(1), 55–72. https://doi.org/10.1016/S0166-3615(96)00075-9.
Kwon, H.-B. (2017). Exploring the predictive potential of artificial neural networks in conjunction with DEA in railroad performance modeling. International Journal of Production Economics, 183, 159–170. https://doi.org/10.1016/j.ijpe.2016.10.022.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539.
Lecun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. https://doi.org/10.1109/5.726791.
Liu, Y., Hu, X., & Zhang, W. (2019). Remaining useful life prediction based on health index similarity. Reliability Engineering & System Safety, 185, 502–510. https://doi.org/10.1016/j.ress.2019.02.002.
Mai, F., Tian, S., Lee, C., & Ma, L. (2019). Deep learning models for bankruptcy prediction using textual disclosures. European Journal of Operational Research, 274(2), 743–758. https://doi.org/10.1016/j.ejor.2018.10.024.
Nguyen, K. T. P., & Medjaher, K. (2019). A new dynamic predictive maintenance framework using deep learning for failure prognostics. Reliability Engineering & System Safety, 188, 251–262. https://doi.org/10.1016/j.ress.2019.03.018.
Özel, T., & Karpat, Y. (2005). Predictive modeling of surface roughness and tool wear in hard turning using regression and neural networks. International Journal of Machine Tools and Manufacture, 45(4–5), 467–479. https://doi.org/10.1016/j.ijmachtools.2004.09.007.
Peng, Y., Wang, H., Wang, J., Liu, D., & Peng, X. (2012). A modified echo state network based remaining useful life estimation approach. In 2012 IEEE international conference on prognostics and health management: Enhancing safety, efficiency, availability, and effectiveness of systems through PHM technology and application, PHM 2012, June 18–21, 2012, Denver, CO, United States. IEEE Computer Society. https://doi.org/10.1109/icphm.2012.6299524.
Ren, Q., Balazinski, M., Baron, L., Jemielniak, K., Botez, R., & Achiche, S. (2014). Type-2 fuzzy tool condition monitoring system based on acoustic emission in micromilling. Information Sciences, 255, 121–134. https://doi.org/10.1016/j.ins.2013.06.010.
Sharma, V. S., Sharma, S. K., & Sharma, A. K. (2007). Cutting tool wear estimation for turning. Journal of Intelligent Manufacturing, 19(1), 99–108. https://doi.org/10.1007/s10845-007-0048-2.
Shi, C., Panoutsos, G., Luo, B., Liu, H., Li, B., & Lin, X. (2018). Using multiple feature spaces-based deep learning for tool condition monitoring in ultra-precision manufacturing. IEEE Transactions on Industrial Electronics. https://doi.org/10.1109/tie.2018.2856193.
Si, X.-S., Wang, W., Hu, C.-H., & Zhou, D.-H. (2011). Remaining useful life estimation – A review on the statistical data driven approaches. European Journal of Operational Research, 213(1), 1–14. https://doi.org/10.1016/j.ejor.2010.11.018.
Siddhpura, A., & Paurobally, R. (2012). A review of flank wear prediction methods for tool condition monitoring in a turning process. The International Journal of Advanced Manufacturing Technology, 65(1–4), 371–393. https://doi.org/10.1007/s00170-012-4177-1.
Sun, J., Hong, G. S., Rahman, M., & Wong, Y. S. (2005). Improved performance evaluation of tool condition identification by manufacturing loss consideration. International Journal of Production


Research, 43(6), 1185–1204. https://doi.org/10.1080/00207540412331299701.
Sun, J., Hong, G. S., Rahman, M., & Wong, Y. S. (2007). Identification of feature set for effective tool condition monitoring by acoustic emission sensing. International Journal of Production Research, 42(5), 901–918. https://doi.org/10.1080/00207540310001626652.
Tieleman, T., & Hinton, G. (2012). Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4(2), 26–31.
Wang, J., Xie, J., Zhao, R., Zhang, L., & Duan, L. (2017). Multisensory fusion based virtual tool wear sensing for ubiquitous manufacturing. Robotics and Computer-Integrated Manufacturing, 45, 47–58. https://doi.org/10.1016/j.rcim.2016.05.010.
Wu, D., Jennings, C., Terpenny, J., Kumara, S., & Gao, R. X. (2018a). Cloud-based parallel machine learning for tool wear prediction. Journal of Manufacturing Science and Engineering, Transactions of the ASME. https://doi.org/10.1115/1.4038002.
Wu, J., Su, Y., Cheng, Y., Shao, X., Deng, C., & Liu, C. (2018b). Multi-sensor information fusion for remaining useful life prediction of machining tools by adaptive network based fuzzy inference system. Applied Soft Computing, 68, 13–23. https://doi.org/10.1016/j.asoc.2018.03.043.
Wu, Y., Yuan, M., Dong, S., Lin, L., & Liu, Y. (2018c). Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing, 275, 167–179. https://doi.org/10.1016/j.neucom.2017.05.063.
Yang, W.-A., Zhou, W., Liao, W., & Guo, Y. (2014). Prediction of drill flank wear using ensemble of co-evolutionary particle swarm optimization based-selective neural network ensembles. Journal of Intelligent Manufacturing, 27(2), 343–361. https://doi.org/10.1007/s10845-013-0867-2.
Yesilyurt, I., & Ozturk, H. (2007). Tool condition monitoring in milling using vibration analysis. International Journal of Production Research, 45(4), 1013–1028. https://doi.org/10.1080/00207540600677781.
Zhang, C., Hong, G. S., Xu, H., Tan, K. C., Zhou, J. H., Chan, H. L., et al. (2017). A data-driven prognostics framework for tool remaining useful life estimation in tool condition monitoring. In 22nd IEEE international conference on emerging technologies and factory automation, ETFA 2017, September 12–15, 2017, Limassol, Cyprus (pp. 1–8). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/etfa.2017.8247659.
Zhang, C., Yao, X., Zhang, J., & Jin, H. (2016). Tool condition monitoring and remaining useful life prognostic based on a wireless sensor in dry milling operations. Sensors (Basel), 16(6), 795. https://doi.org/10.3390/s16060795.
Zhang, K.-F., Yuan, H.-Q., & Nie, P. (2015). A method for tool condition monitoring based on sensor fusion. Journal of Intelligent Manufacturing, 26(5), 1011–1026. https://doi.org/10.1007/s10845-015-1112-y.
Zhang, B., Zhang, S., & Li, W. (2019). Bearing performance degradation assessment using long short-term memory recurrent network. Computers in Industry, 106, 14–29. https://doi.org/10.1016/j.compind.2018.12.016.
Zhao, R., Wang, D., Yan, R., Mao, K., Shen, F., & Wang, J. (2018). Machine health monitoring using local feature-based gated recurrent unit networks. IEEE Transactions on Industrial Electronics, 65(2), 1539–1548. https://doi.org/10.1109/tie.2017.2733438.
Zhao, R., Yan, R., Wang, J., & Mao, K. (2017). Learning to monitor machine health with convolutional bi-directional LSTM networks. Sensors (Basel). https://doi.org/10.3390/s17020273.
Zheng, S., Ristovski, K., Farahat, A., & Gupta, C. (2017). Long short-term memory network for remaining useful life estimation. In 2017 IEEE international conference on prognostics and health management, ICPHM 2017, June 19–21, 2017, Dallas, TX, United States (pp. 88–95). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/icphm.2017.7998311.
Zhu, K., Wong, Y. S., & Hong, G. S. (2009). Multi-category micro-milling tool wear monitoring with continuous hidden Markov models. Mechanical Systems and Signal Processing, 23(2), 547–560. https://doi.org/10.1016/j.ymssp.2008.04.010.

Publisher's Note  Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
