Int J Adv Manuf Technol

DOI 10.1007/s00170-009-2110-z

ORIGINAL ARTICLE

Design of multisensor fusion-based tool condition monitoring system in end milling

Sohyung Cho & Sultan Binsaeid & Shihab Asfour

Received: 5 December 2008 / Accepted: 11 May 2009


© Springer-Verlag London Limited 2009

Abstract Recent advancement in signal processing and information technology has resulted in the use of multiple sensors for the effective monitoring of tool conditions, which is the most crucial feedback information to the process controller. Interestingly, the abundance of data collected from multiple sensors allows us to employ various techniques such as feature extraction, selection, and classification methods for generating such crucial information. While the use of multiple sensors has improved the accuracy in the classification of tool conditions, the design of tool condition monitoring (TCM) systems for reduced complexity and increased robustness has rarely been studied. Therefore, this paper studies the design of an effective multisensor-based TCM when machining 4340 steel by using a multilayer-coated and multiflute carbide end mill cutter. Multiple sensors tested in this paper include force, vibration, acoustic emission, and spindle power sensors for the time and frequency domain data. In addition, two feature selection methods and three classifiers with a machine ensemble technique are considered as design components. Importantly, different fusion methods are evaluated in this paper: (1) decision level fusion and (2) feature level fusion. The experimental results show that the design of TCM based on the feature level fusion can significantly improve the accuracy of the tool condition classification. It is also shown that the highest accuracy can be achieved by using force, vibration, and acoustic emission sensors together with the correlation-based feature selection method and the majority voting machine ensemble.

Keywords Multisensor fusion · Tool condition monitoring · Machine ensemble

S. Cho (*)
Industrial and Manufacturing Engineering, Southern Illinois University Edwardsville, Edwardsville, IL 62026, USA
e-mail: [email protected]

S. Binsaeid · S. Asfour
Department of Industrial Engineering, University of Miami, Coral Gables, FL 33146, USA

1 Introduction

In today’s fierce global competition, on-time delivery of highly diversified products with reduced manufacturing lead time has become a key determinant for the survival of manufacturing enterprises. Manufacturing lead time can be significantly reduced by effective control of disruptive events such as machine breakdown, material absence, and demand fluctuations. Among those disruptive events, machine breakdown is directly related to the increased manufacturing lead time and may result in reduced customer satisfaction. It should be emphasized that a considerable portion (7–20%) of machine downtime results from tool failure [1, 2]. Tool failure can be prevented by efficiently monitoring conditional changes in the tool, and hence, tool condition monitoring (TCM) has been of great interest to both academia and industry. It has been reported that successful implementation of TCM can save up to 40% of production costs [3]. In general, there are three categories of tool condition, particularly for end-milling cutters: (1) tool breakage, (2) tool chipping, and (3) tool wear. These categories differ in nature: tool breakage occurs abruptly in an observable and random manner; tool chipping has the same characteristics as tool breakage except that it can go undetected for a considerable duration; and tool wear develops gradually and can be predicted to a certain extent.

With recent advancement in signal processing technology and information technology, a wide range of online sensors has been employed to retrieve information relevant to tool conditions, which is the most crucial feedback information to the process controller. Specifically, force sensors [4, 5], vibration sensors [6, 7], acoustic emission sensors [8, 9], and spindle power sensors [10] have been used individually or as a group of sensors, referred to as a multiple sensor [11, 12]. The employment of multiple sensors has improved the accuracy in the classification of tool conditions because it is intended to fuse the informational power of the individual sensors, resulting in complementary and redundant information [13]. Interestingly, the abundance of data collected from multiple sensors allows us to make use of various techniques such as feature extraction, selection, and classification methods for generating the crucial feedback information [13, 14]. While many research works have focused on improving the accuracy in the classification of tool conditions by employing multiple sensors, the design of multisensor-based TCM for reduced complexity and increased robustness has rarely been studied.

The main goal of this paper is to study the design of an effective tool condition monitoring system in a more systematic manner when machining 4340 steel by using a multilayer-coated and multiflute carbide end mill cutter. Specifically, we study decision making in the design process, which includes determination of a multisensor combination, feature selection method, machine learning-based classifier, and machine ensemble technique. Importantly, two different fusion methods are evaluated in this paper: (1) decision level fusion and (2) feature level fusion. To achieve the aforementioned goal, this paper investigates the following three objectives as part of the analysis. The first one is to study the significance of reducing the input space dimension for the classification model and selecting the most significant subset of features with which a higher level of information related to the tool condition classification can be achieved. The second one is to study the significance of different information fusion strategies to the classification model, i.e., no fusion with the best single sensor model and feature level fusion with the best multiple sensors model. The third objective is to investigate the effectiveness of several decision-making methods, which are multilayer perceptron neural network (MLP), radial basis function neural network (RBF), and support vector machine (SVM). Furthermore, these three classifiers are studied with a machine ensemble approach, which is referred to as majority vote. The rest of the paper is organized as follows: Section 2 explains the design of multisensor fusion-based TCM and the components that are involved in the design. A detailed review is given of the data acquisition system, signal processing methods and their extracted features, and feature selection method. Then, machine learning (ML) algorithms and the machine ensemble approach are introduced. In addition, feature level and decision level fusion are introduced. Section 3 outlines the experimental setup, design of experiment, and definition of tool condition classes. Specifically, three classes are defined to describe three different states of flank wear progression, while two extra classes are assigned for tool chipping and breakage. The discussion and performance of the constructed TCM models are provided in Section 4.

2 Design of multisensor fusion-based TCM system

Fig. 1 Design of multisensor fusion-based TCM system

Design of multisensor fusion-based TCM system considered in this paper consists of four layers as illustrated in Fig. 1: (1) data acquisition through multiple sensors and
digital signal process, (2) feature extraction, (3) feature selection, and (4) ML and ensemble-based classification. Figure 1 also shows specific attributes associated with each layer. For example, selection of multiple sensors and signal processing techniques is the main attribute that is associated with the data acquisition layer.

2.1 Data acquisition using multisensor

The following multiple sensors are used to collect data required for the subsequent design and analysis of TCM systems: a dynamometer to measure three-directional forces, an accelerometer to measure three-directional vibrations, an acoustic emission sensor, and a spindle power sensor. Note that these four sensors are the most frequently used sensors in the literature [12] and account for eight sensory signals. In the subsequent design stages, data obtained from the individual sensors are fused at feature level or decision level depending on the design objectives. The advantage of fusing the outputs from one sensor with those from other independent sensors stems from redundancy being present in the information [13]. More specifically, if redundant sensors are employed, the overall uncertainty of the resulting measurement can be reduced, and thus, the performance of the system can be improved by averaging out the independent noise acting on the different sensors, because the noise inherent in an individual sensor measurement is largely uncorrelated with noise from other sensors. In addition, complementary sensors provide extended and independent information about the process, which is difficult to capture otherwise. On the other hand, the signal processing attribute includes the selection of band pass filters, sampling rate, and gain of the coupler to improve the quality of the data. In this research, all the signals are properly filtered and analyzed by commercially available software, LabVIEW.

2.2 Feature extraction

The main purpose of feature extraction is to significantly reduce the dimension of raw data in time and frequency domain and at the same time maintain the relevant information about tool conditions in the extracted features. Many research works have studied various feature extraction methods, and most of these extraction methods can be found in [14–19]. In this paper, a comprehensive set of feature extraction methods that have been previously studied is established. Note that different extraction methods have different capabilities in extracting key information about tool conditions from multisensor signals. In this research, a program code that can automatically extract different features from incoming signals in both time and frequency domain has been constructed using the LabVIEW software. Specifically, the following features are extracted from the multiple sensors for further analysis in the subsequent design stages. Note that amplitude values of a signal are expressed as [$x_1$, $x_2$, …, $x_n$]. Table 1 summarizes the features considered in this paper that are extracted from each sensor signal.

Table 2 provides the distribution of all extracted features in both time and frequency domain per sensor. Specifically, there are 135 extracted features from eight sensory signals, i.e., three force signals, one acoustic emission signal, three vibration signals, and one spindle power signal. For instance, there are 27 frequency domain features from force sensor signals in the table (nine features from each force sensor × 3 force sensors of Fx, Fy, and Fz). In addition to the extracted features, machining parameters are also considered as a part of the feature space, which are axial depth of cut, cutting speed, and feed rate. Therefore, the total number of features considered in this paper is 138.

Table 1 Features extracted from multiple sensors in time and frequency domain

Features | Description
Time domain
Arithmetic mean (M) | $M = \frac{1}{n}\sum_{i=1}^{n} x_i$
Root mean square (RMS) | $\mathrm{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$
Variance (V) | $V = \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n-1}$
Skewness (Sk) | $Sk = \frac{1}{n}\,\frac{\sum_{i=1}^{n}(x_i - \mu)^3}{\sigma^3}$
Kurtosis (Ku) | $Ku = \frac{1}{n}\,\frac{\sum_{i=1}^{n}(x_i - \mu)^4}{\sigma^4}$
Signal power (P) | $P = \frac{1}{n}\sum_{i=1}^{n} x_i^2$
Peak-to-peak amplitude (pp) | $pp = \max(x_i) - \min(x_i)$
Crest factor (CF) | $CF = \frac{\mathrm{Peak}}{\mathrm{RMS}}$
Burst rate (Br) | Number of times the signal exceeds preset thresholds per second. This feature is only applied to vibration and AE signals. The preset threshold is set to 300 μV
Frequency domain
Sum of total band power (STPB) | $\mathrm{STPB} = \int_{F_1}^{F_2} S(f)\,df$, where $S(f)$ is the power at a specific frequency component and ($F_1$, $F_2$) is the frequency band
Mean of band power spectrum (MBP) | $\mathrm{MBP} = \frac{1}{n}\sum_{i=1}^{n} S(f)_i$
Variance of band power spectrum (VBP) | $\mathrm{VBP} = \frac{\sum_{i=1}^{n}\left(S(f)_i - \mathrm{MBP}\right)^2}{n-1}$
Skewness of band power spectrum (SkBP) | $\mathrm{SkBP} = \frac{1}{n}\,\frac{\sum_{i=1}^{n}\left(S(f)_i - \mathrm{MBP}\right)^3}{\mathrm{VBP}^{3/2}}$
Kurtosis of band power spectrum (KuBP) | $\mathrm{KuBP} = \frac{1}{n}\,\frac{\sum_{i=1}^{n}\left(S(f)_i - \mathrm{MBP}\right)^4}{\mathrm{VBP}^{4/2}}$
Maximum (peak) of band power (PBP) | Peak of power spectrum in a specific frequency band that is expressed by the energy level (W/Hz)
Frequency of maximum peak of band power (FPBP) | Relative frequency that corresponds to the highest amplitude
Relative spectral peak per band (RSPBP) | Ratio of peak of band power (PBP) over the mean of band power (MBP)
Total harmonic band power (THBP)^a | $\mathrm{THBP} = \sum_{m=1}^{N} P(m),\; m = 1, 2, \ldots, N$, where $P(m)$ is the power at the fundamental tooth frequency, body cutter, and their harmonics, and N is the largest integer for which N is the cut-off frequency for the sensor
^a This feature is only applied to the three-directional force signals

Table 2 Distribution of time and frequency domain features

Sensor | Time domain | Freq. domain | Total
Force | 24 | 27 | 51
AE | 9 | 16 | 25
Vibration | 27 | 24 | 51
Spindle power | 8 | 0 | 8
Total | 68 | 67 | 135

2.3 Feature reduction method

Training ML classifiers using the maximum number of features obtainable is not always the best option, as irrelevant and redundant features can negatively influence the performance of ML algorithms. In order to improve the accuracy of the classification model and increase the computational efficiency of TCM systems, inclusion of an optimal number of significant features in the final model is desirable. This can be achieved by reducing the number of features using feature selection techniques. The correlation-based feature selection method (CFS) and the χ² statistics selection method are studied in this research to evaluate different feature subsets. Also, note that a greedy hill climbing search algorithm is employed to search for the optimal subset size [20].

2.3.1 Correlation-based feature selection method

CFS measures the goodness of feature subsets by taking the following into account:

• the level of correlation of individual features with the predicted class
• the level of inter-correlation among features

Importantly, high scores are assigned to subsets containing features that are highly correlated with the class, yet have low inter-correlation with each other. Entropy measures are utilized to obtain a measure of correlation between features and classes and also between features. All continuous features are discretized using the technique studied by Fayyad and Irani [21]. The entropy of a feature Y is given as follows:

$$H(Y) = -\sum_{y \in R_y} p(y)\,\log\left(p(y)\right) \tag{1}$$

where Y is a discrete random variable with respective range $R_y$. Then, the conditional entropy of any feature Y given the occurrence of feature X, which has range $R_x$, can be calculated as:

$$H(Y|X) = -\sum_{x \in R_x} p(x) \sum_{y \in R_y} p(y|x)\,\log\left(p(y|x)\right) \tag{2}$$

Therefore, a measure of correlation can be obtained between X and Y, either for two features or for a feature and a class, where the class of an instance is considered to be a feature.
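
As a minimal sketch of Eqs. 1 and 2, the snippet below computes the entropy and conditional entropy of discretized features, together with the normalized difference (H(Y) − H(Y|X))/H(Y) that the text goes on to use as its correlation measure. The function names, the base-2 logarithm, and the toy labels are assumptions made for illustration; the paper does not state a log base (the normalized ratio does not depend on it), and its own implementation relies on WEKA.

```python
import math
from collections import Counter


def entropy(values):
    """H(Y) = -sum_y p(y) log2 p(y) for a sequence of discrete values (Eq. 1)."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(y, x):
    """H(Y|X) = -sum_x p(x) sum_y p(y|x) log2 p(y|x) (Eq. 2)."""
    n = len(y)
    groups = {}
    # Group the y values by the x value they co-occur with.
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    # Weight the entropy of each group by p(x).
    return sum((len(g) / n) * entropy(g) for g in groups.values())


def uncertainty_coefficient(y, x):
    """(H(Y) - H(Y|X)) / H(Y); defined here as 0 when Y is constant (assumption)."""
    h_y = entropy(y)
    if h_y == 0.0:
        return 0.0
    return (h_y - conditional_entropy(y, x)) / h_y


if __name__ == "__main__":
    # Hypothetical discretized feature and tool-condition labels.
    feature = ["lo", "lo", "hi", "hi"]
    tool_state = ["sharp", "sharp", "worn", "worn"]
    print(uncertainty_coefficient(tool_state, feature))
```

In this toy case the feature determines the class exactly, so the coefficient evaluates to 1; for a feature that is statistically independent of the class it evaluates to 0.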
This measure is often called the uncertainty coefficient of Y and is calculated as follows:

$$C(Y|X) = \frac{H(Y) - H(Y|X)}{H(Y)} \tag{3}$$

Now, the scores of the CFS subsets are obtained using the following heuristic:

$$\mathrm{Merit}_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}} \tag{4}$$

where $\mathrm{Merit}_S$ is the heuristic of a feature subset S containing k features, and $\overline{r_{cf}}$ and $\overline{r_{ff}}$ are the average feature–class correlation and average feature–feature inter-correlation, respectively. In Eq. 4, the numerator is an indication of the predictive power of the feature set, while the denominator measures redundancy among features.

2.3.2 Greedy hill climbing search algorithm

Clearly, it is prohibitive to try all possible combinations of feature subsets using the evaluation function of CFS. A simple yet effective search algorithm such as greedy hill climbing has demonstrated its efficiency in searching the feature space in reasonable time and has provided good results [22]. Greedy search expands the current parent node and picks the child with the highest evaluation. Nodes are expanded by applying search space operators to them in which a single feature is added or deleted. A backward elimination strategy is employed where the search starts with the full set of features. Then, backward elimination continues to delete features as long as the child is not worse than its parent. This process is repeated until no more improvement can be achieved.

2.3.3 χ² statistics-based feature selection method

This method ranks the features based on their statistical dependency relative to the class. The main objective of using χ² statistics is to maximize feature relevancy. In statistics, the χ² test is applied to test the independence of two events, where two events, A and B, are defined to be independent if P(AB)=P(A)P(B) or, equivalently, P(A|B)=P(A) and P(B|A)=P(B). Similarly, the χ² coefficient in feature selection application is given by Hong et al. [23]:

$$\chi^2(f, C) = \sum_{i,j} \frac{\left[p(f = x_j, C_i) - p(f = x_j)\,p(C_i)\right]^2}{p(f = x_j)\,p(C_i)} \tag{5}$$

where p(·) represents probability, $C_i$ is class i, and $x_j$ is the j-th value of feature f. Note that increasing values of χ² indicate higher dependency between feature values and class labels. It should be pointed out here that in the CFS method, the optimal number of features for each individual sensor is selected by using the greedy hill climbing search algorithm. However, the χ² statistics method is a ranking method and thus requires setting a threshold value to include the specified number of features within the developed subset. Therefore, in this paper, to make a fair comparison between the two feature selection methods, the number of features of the χ² statistics subset is set to be equal to the one achieved by the CFS method.

2.4 Machine learning classifiers

In this paper, three ML classifiers are used to classify tool conditions, and then, the ensemble techniques are applied to them to further improve the accuracy of the classification. All classifiers and reduction methods have been implemented using the WEKA ML suite, which provides a freeware environment supported by many machine learning authorities [24]. All three ML algorithms employed in this study have proven to be effective in the pattern recognition communities: (1) SVM, (2) MLP, and (3) RBF.

2.4.1 Multiclass support vector machine

Since SVM, which is based on the statistical learning theory presented by [25], was proposed as a decision making method, it has received a lot of attention in the pattern recognition literature. While typical ML algorithms attempt to minimize the empirical risk, that is, the misclassification errors on the training set, the SVM attempts to minimize the structural risk, that is, the probability of misclassification of a previously unseen data point drawn randomly from a fixed but unknown distribution. The SVM generates an efficient means of classification by condensing the relevant information and selecting the most important samples, called support vectors. These support vectors achieve the maximal margin classification between classes. If linear separability of the data is not achieved, the training data are mapped into a higher dimensional feature space using a kernel function, which permits a higher level of linear separability.

In this paper, the SVM has been implemented using the sequential minimal optimization algorithm. The selection of the kernel function has influence on the decision boundary. In general, an RBF is favored over polynomial kernel functions because it is not sensitive to outliers and does not require inputs to have equal variances. Therefore, an RBF has been selected as the kernel function after preliminary analysis. The RBF kernel is defined as:

$$K(x_i, x_j) = \exp\left(-\gamma\,\|x_i - x_j\|^2\right), \quad \gamma > 0 \tag{6}$$

where $K(x_i, x_j)$ defines an inner product in the high dimensional space to which the input vector $x \in \mathbb{R}^d$ is mapped.

Moreover, in this research, a grid search has been performed on the training data in order to select the appropriate parameter for the width of the RBF function, γ, and the cost function parameter C. The grid search has resulted in optimal values of γ=0.25 and C=12.0.

2.4.2 Multilayer perceptron neural networks

MLP is the most widely used learning algorithm and is discussed at length in most neural network textbooks [26, 27]. The learning process of the MLP network is based on the data samples, composed of the N-dimensional vector x of the input layer and the M-dimensional desired output vector c of the output layer. By processing the input vector x, the MLP creates the output vector y(x, w), where w is the vector of modified weights. The error produced triggers a control mechanism of the learning algorithm. The corrective adjustments are designed to bring the output signal $y_k$ (k=1, 2,…, l) closer to the desired response $c_k$ in an iterative manner, where l is the number of classes in the output layer.

The learning algorithm of MLP is based on the minimization of the error function defined on the learning set ($x_i$, $c_i$) for i=1,2,…,p using a Euclidean norm, where p is the number of samples in the learning set:

$$E(w) = \frac{1}{2}\sum_{i=1}^{p} \left\| y(x_i, w) - c_i \right\|^2 \tag{7}$$

The MLP used in this research consists of an input layer, a hidden layer, and an output layer. The input layer has nodes representing the normalized features calculated from the sensory signals. There are various methods, both heuristic and systematic, to select the neural network structure and activation functions. In this paper, a heuristic that varies the number of input nodes depending on the number of features applied to the network has been established. Specifically, since the number of inputs changes depending on the number of sensors and features selected, the main search engine tries to set the size of the hidden layer to half of the total number of nodes in the input and output layers [24]. In addition, the number of output nodes represents the number of tool condition classes (five different tool condition classes in this paper). Each output node produces a confidence level that represents the classification probability of each class. In this research, sigmoid activation functions have been used in both the hidden layer and the output layer. Moreover, the MLP has been trained and implemented using the back propagation (BPN) algorithm. Back propagation parameters include momentum and learning rate, which affect the way the network is trained and, possibly, the performance of the learned classifier. Using all available features, a grid search has been performed to select the parameters of the BPN algorithm. As a result, a learning rate of 0.1 and momentum rate of 0.2 have been selected as optimal values. The MLP has been trained iteratively to minimize the performance function of mean squared error (MSE) between the network outputs and the corresponding target values. Specifically, the gradient of the performance function (MSE) has been used at each iteration to adjust the network weights and biases. In this study, a mean square error of $10^{-6}$, a minimum gradient of $10^{-10}$, and a maximum iteration number (epoch) of 500 have been used. The training process terminates if any of these conditions is satisfied.

2.5 Radial basis function neural network

An RBF network generally consists of three layers: the input layer, the hidden layer, and the output layer. The input layer is the same as in an MLP. The hidden layer consists of radial function neurons. A radial function has the form $g(x) = g\left(\|x - c_j\|\right)$, which is symmetric with respect to c. The c can also be called the center of the function. When a vector is fed into an RBF network, each hidden neuron generates a value according to how close the input vector is to the center of the RBF. If the input space vector is close to the center, the hidden neuron generates a value that is close to 1. Then, the outputs of the hidden neurons are combined linearly to generate the output based on the following equation:

$$y_k = w_0 + \sum_{j=1}^{m} w_{jk}\,\varphi\left(\|x - c_j\|\right) \tag{8}$$

where φ(·) is the radial basis function, $w_{jk}$, j=(1,2,…,m) and k=(1,2,…,l), are the output weights, $w_0$ is the bias, x are the inputs to the network, $c_j$ are the centers associated with the basis functions, m is the number of hidden neurons, and l is the number of classes. In this paper, the activation function φ(·) is defined as follows:

$$\varphi\left(\|x - c_j\|\right) = \exp\left[-\frac{(x - c_j)^T (x - c_j)}{2\sigma_j^2}\right] \tag{9}$$

where $\sigma_j^2$ is the dispersion or smoothing parameter of the jth basis function. The K-means clustering algorithm is employed to provide the basis functions. The number of clusters that K-means should generate has been selected experimentally by a grid search using all available features.

2.6 Machine learning ensemble

In this research, a machine ensemble technique is introduced as another part of the information fusion approach.

In the approach considered in this paper, a multiple classifier model forms an ensemble of generally weak and/or diverse classifiers. Then, a pool of opinions is combined using a meta-classification decision that is superior to the decision of any individual classifier. The diversity of classifiers allows different decision boundaries to be created. The intuition is that each classifier makes a different error, and strategically combining these classifiers can reduce the total error or at least reduce the variance of the classification error [28, 29]. In order to improve the accuracy of classification, the base classifiers must have high disagreement between one another in mapping the solution space [30]. Otherwise, if all classifiers map the solution space in a similar manner, only little improvement can be achieved over simply using one of the base classifiers. The machine ensemble technique introduced in this paper is referred to as majority vote ensemble. This ensemble technique combines the classification power of three ML algorithms, which are SVM, MLP, and RBF. Figure 2 illustrates the general architecture of a TCM model using the machine ensemble approach, which utilizes all extracted sensory features that have been selected by a feature selection method (CFS in this illustration).

More specifically, the majority vote ensemble is achieved by combining the aforementioned base classifiers: SVM, MLP, and RBF. Under this ensemble scheme, each classifier is trained with the same data set. When the testing set is applied to all base classifiers involved in the ensemble, the class with the largest number of predictions is voted to be the final prediction. In general, a majority vote classifier is defined as:

$$C_{meta}(X) = \underset{i \in \{1, \ldots, L\}}{\arg\max} \sum_{j=1}^{B} I\left(C_j(X) = i\right) \tag{10}$$

where I(·) is an indicator function, $C_j$ are the classifiers where j=(1,..., B), and L is the number of target classes. It has to be noted that the confidence level of each prediction provided by each base classifier is not considered. Therefore, the resulting vote is unweighted, with all base classifiers having equal input to the vote.

Fig. 2 Example of a multisensor fusion TCM system utilizing correlation-based feature selection method and a machine learning ensemble technique

3 Experimentation

3.1 Experimental setup

Figure 3 shows the experimental setup of this paper to study the design of the TCM system. The experiment was conducted by using an OKUMA ES 3016 CNC vertical machining center for machining AISI 4340 steel using a Kennametal (type: HEC500S2; 12.7 mm diameter) general purpose solid carbide two-flute end mill coated with a ground physical vapor deposited multilayer coating of titanium nitride/titanium carbo-nitride/titanium nitride (TiN/TiCN/TiN).

An acoustic emission (AE) sensor, manufactured by Physical Acoustic Corporation (PAC-Wsα), was used to capture the AE signal generated during the machining operation. The AE signal was divided into two frequency bins. The first one was created by a band pass filter of 100–300 kHz using linear filtering with third-order Butterworth filters. The second frequency bin also has a third-order Butterworth band pass filter of 300–600 kHz. In addition, to avoid aliasing in the AE signal, the sampling rate was set to 1.5 MHz, which is a little over the Nyquist sampling rate of 1.2 MHz. Gain was set to 40 dB. Also, a triaxial accelerometer manufactured by Kistler (Type 8692C50) that simultaneously measures vibration in three mutually perpendicular axes (x, y, and z) was mounted on the spindle to measure the vibration during the cutting operation. The sensor was connected to a Kistler (type 5134) coupler, which provides a DC power and a signal processing by
adjustable gains and cut-off frequencies. The gain of the coupler was selected to be 10× for the (X, Z) signals and 5× for the (Y) signal. The filtering was digitally accomplished by using LabVIEW software. An IIR filter with an order of 29 and a cut-off frequency of 3,000 kHz was selected for all x, y, and z vibration signals. In addition, a quartz three-component dynamometer manufactured by Kistler (type: 9257B) was connected to a charge amplifier (Kistler, type: 5010B) and mounted on the machining table under the job to measure the three orthogonal components of force. A band pass filter of 30–3,000 Hz was applied to each of the axial force signals. The last sensor used in this experiment was a true power measuring transducer MU3 manufactured by Artis Systems, which was used in combination with two hall sensors (model: LT-100S) to measure the true power of the spindle motor. All sensors were connected through BNC cables to a National Instruments noise rejecting shielded BNC connection box that acts as a gateway for all eight signals. The connection box then sends all eight signals to a National Instruments data acquisition card (model: NI PCI-6133), which has the ability to convert the signals from analog to digital with a high sampling rate of 3 MB/s/channel. Finally, all the digital signals are properly filtered and analyzed by LabVIEW software. The software extracts predefined features in the time and frequency domains as defined in Section 2.2. These features have been set as the input features (predictors) to the TCM classification model. An identical sampling time of 200 ms was used for all the sensor signals. The flank wear was measured by a microscope (Carl Zeiss Axioskop 2 Mat), which has a high-resolution digital camera (Axiocam MRC™). The combination of microscope, digital camera, and Axiovision software was used to acquire, edit, measure, and store images in order to measure the flank wear of the tool and any abnormalities such as edge chipping.

Fig. 3 Schematic diagram of experimental setup

3.2 Design of experiment

It is desirable that the TCM model reflects the conditional changes in cutting tools under diverse cutting conditions such as different levels of cutting speed, feed rate, and depth of cut. Therefore, in this study a $2^3$ full-factorial design with three replications was selected in order to effectively capture the relationship between the milling process parameters (independent variables) and the calculated signal features (dependent variables). Specifically, the three factors (independent variables) used for the design of experiment in this study were surface speed, chip load, and axial depth of cut. Each factor has two levels, i.e., High (H) and Low (L). These levels are provided in Table 3, and in total, there are eight cutting conditions or treatments per replication. In Table 3, different cutting conditions of an experiment are represented using three-letter notations such that the first letter refers to the depth of cut, the second letter defines the cutting speed, and the third letter refers to the feed rate. The radial depth of cut, also referred to as immersion, was kept constant at 11.1 mm throughout the experiments.

experiments. This radial depth of cut is about 80% of tool diameter (12.7 mm), and this ensures that all teeth of the milling cutter (two-flute tool cutter in this experiment) are in contact with the workpiece during end milling.

Table 3 Machining parameters and their levels

Depth of cut   Cutting speed   Feed rate (mm/tooth)
(mm)           (m/min)         0.08      0.13
2.54           122             LLL       LLH
2.54           152             LHL       LHH
3.56           122             HLL       HLH
3.56           152             HHL       HHH

3.3 Tool condition classes

In this research, a tool condition is classified based on the following three criteria: (1) if the tool is worn due to flank wear, (2) if the tool is worn due to chipping, and (3) if the tool breaks. For each criterion, a measurement was established to verify the level of failure. The reason for differentiating between chipping and breakage is that chipping cannot be detected for a considerable amount of time and is hardly noticed by the operator. Measurements of wear and chipping were collected as the tool condition progressed during cutting. However, breakage was obtained artificially through the grinding of the tool since it is hard to observe breakage for each treatment within the design of experiment. Therefore, experiments were conducted for five different tool conditions, namely breakage (B), chipping (C), and three states of wear, which are defined as slight (LW), medium (MW), and severe (SW) wear. Table 4 summarizes the specification of tool condition classes according to the level of maximum flank wear, i.e., VBmax, which is illustrated in Fig. 4.

The flank wear (VBmax) is the average of the two flank wear readings recorded for each flute. A reading of the flank wear was taken at the end of every third cut until the tool reached its wear criterion of 0.6 mm for each treatment. Chipping of the tool edge was considered valid if there was a chipped area of over 0.04 mm². Otherwise, the tool was classified by its flank wear level. The tool breakage class was defined for a tool that has a breakage area over 0.36 mm². Figure 5 provides the distribution of classes per cutting condition after the experiment was conducted. This distribution clearly shows that at the LHL condition, tool life increased and no chipping occurred, so that various data points could be collected for further analysis.

Table 4 Definition of the tool condition classes

Tool condition class   LW (mm)           MW (mm)             SW (mm)            C (mm²)                 B (mm²)
Tool features          0 < wear < 0.25   0.25 < wear < 0.4   0.4 < wear < 0.6   0.04 < C. area < 0.36   Brk. area > 0.36

4 Result and discussion

4.1 Classifier training and evaluation

The notion of training and testing a dataset is fundamental to the performance of ML algorithms. In this paper, the training set contains examples of signal features (135 features) and/or machining parameter features, i.e., speed, feed, and depth of cut, from different classes (tool conditions), and this is used to build the classification model. The testing set represents the unknown sensory information that can be classified. Both testing and training sets are labeled with the appropriate class a priori. To improve the training process of a classifier, normalization was applied to all training features as a preprocessing step. To test and compare algorithms, we used ten times repeated 10-fold stratified cross-validation (CV), where accuracy results were averaged across replications to minimize type I error [31]. This means that each classification model was trained on nine tenths of the total data and tested on the remaining tenth. This process is repeated ten times, each with a different partitioning seed, in order to account for the variance between partitions. Also, stratification was applied to every testing set in order to account for the non-uniform distribution of tool condition classes within the collected dataset as seen in Fig. 5. The accuracy of classification results in this research stands for the percentage of correctly classified instances over the total number of instances. For various accuracy measures and the relationship between the test conditions and the TCM accuracy, refer to the literature [32]. Each result represents the average of 100 runs (10×10-fold CV). A paired t test was applied for pairwise comparisons of classification algorithms [33].

4.2 Classification under no sensor fusion

In this section, we compare the performance of base classifiers that are based on a single-sensor model for which no fusion is applied. With such restriction, a comparison of features per sensor, which are defined in Section 2.2, is investigated, and the significance of each individual sensor on tool condition classification is determined. First, 28 features were extracted from AE sensor,

and only 13 features were selected by CFS and χ² statistics. These features were used to compare the classification accuracy of different TCM architectures using different ML classifiers. The classification results are shown in Fig. 6, where AE_All, AE_CFS, and AE_Chi2 represent accuracy with all features selected and with features reduced by CFS and χ² statistics, respectively. It is observed that SVM outperforms both types of neural networks (MLP, RBF). Figure 6 also shows that CFS can increase the accuracy significantly. Interestingly, the accuracy is not improved by using machine ensemble, and we conjecture the following reason: to improve the accuracy by using machine ensemble, individual classifiers must show high disagreement, i.e., maintain diversity. The result shown in Fig. 6 implies that in the case of features obtained from the AE sensor, the decision boundaries generated by different classifiers are similar and thus produce insignificant improvement.

Fig. 4 Flank wear boundaries at the tool-cutting edge (labels in figure: C, B, N, VBave, VBmax, VC, VN)

Fig. 5 Distribution of classes per cutting condition for 758 experimentally collected instances (first stack on bottom, broken; second stack, chipped; third stack, slight wear; fourth stack, medium wear; fifth stack, severe wear)

Next, Fig. 7 shows the classification accuracy of different designs of TCM systems when only force sensors are used. It is observed that when the force sensor is used, 17 features were selected out of 54 features extracted and SVM outperforms MLP and RBF. It is also observed that, as in the case of the AE sensor, CFS improves the accuracy. In addition, Fig. 7 shows that machine ensemble for feature subset F_CFS results in the highest accuracy, 91.98%. This result implies that base classifiers for the F_CFS ensemble have high diversity in representing the solution space, which allows improved accuracy and stability in the classification. Improved stability is reflected in the standard deviation shown in the right-hand graph. It should be pointed out here that force sensor-based TCM can provide higher accuracy than AE sensor-based TCM.

Figure 8 shows the classification accuracy of different designs of TCM system when only vibration sensors are used. In this case, 54 features were extracted, and they were reduced to 16 by selection methods. As with the AE and force sensors, the application of SVM and CFS improves the accuracy, to 88.84%. While the application of machine ensemble does not improve the accuracy, it has improved the stability as shown in the right-hand graph.

When spindle power is measured, 11 features were extracted, and they were further reduced to six features by selection methods. Figure 9 shows that SVM outperforms MLP and RBF. It also shows that the application of feature selection methods and machine ensemble does not improve accuracy. We conjecture that the dimension of extracted features is not sufficiently large in this case.

4.3 Classification using multisensor fusion

In this section, the no-fusion constraint of Section 4.2 is relaxed so that sensor fusion is applied to the information provided by each sensor. Specifically, in this paper, we test two different designs for multisensor fusion: (1) decision level fusion and (2) feature level fusion.

4.3.1 Decision level fusion

In this fusion method, each single-sensor TCM model explained in the previous section acts as an expert within its own feature space. Then, the opinions are pooled by using a majority vote rule to provide a meta-classification decision that would be superior to individual classifiers. In this analysis, we need to find sensor combinations that can give us the best accuracy under a specific classifier, as all the analysis required has been conducted under the no-fusion assumption in the previous section. The rank of single sensors is force (F), vibration

Fig. 6 Classification with different feature selection and machine learning methods when using AE sensor (left column AE_All, middle column AE_CFS, right column AE_Chi2): accuracy (left), standard deviation (right)

Fig. 7 Classification with different feature selection and machine learning methods when using force sensor (left column F_All, middle column F_CFS, right column F_Chi2): accuracy (left), standard deviation (right)

Fig. 8 Classification with different feature selection and machine learning methods when using vibration sensor (left column V_All, middle column V_CFS, right column V_Chi2): accuracy (left), standard deviation (right)

Fig. 9 Classification with different feature selection and machine learning methods when using spindle power sensor (left column P_All, middle column P_CFS, right column P_Chi2): accuracy (left), standard deviation (right)

(V), acoustic emission (AE), and spindle power (P) in terms of their order of accuracy. Therefore, we test the following sensor combinations: (F+V+AE+P), (F+V+AE), (F+V), (V+AE+P), and (V+AE) with different ML classifiers (SVM, RBF, and MLP) and selection methods (CFS and χ² statistics). Figure 10 shows the classification accuracy of these sensor combinations with two different feature selection methods (CFS and χ² statistics). It is shown in the figure that none of those sensor combinations based on decision level fusion outperforms the single-sensor TCM model with the force sensor. Nonetheless, we can observe that SVM outperforms the neural networks (MLP and RBF) as the base classifier.

Fig. 10 Classification accuracy of different sensor combinations and feature selection methods (left column SVM, middle column MLP, right column RBF)

Table 5 Percent accuracy of tool condition classification obtained by using two-sensor combined TCM models with feature level fusion (All: all features; CFS: reduced by CFS; Chi2: reduced by χ² statistics)

            All          CFS          Chi2
AE + V
SVM         89.82±3.11   90.01±2.98   87.13±2.66
MLP         88.39±2.93   86.46±2.73   83.83±3.51
RBF         86.02±2.83   86.52±2.48   84.67±3.03
Ensemble    90.92±2.59   89.57±2.43   86.21±2.81
AE + F
SVM         91.94±2.97   91.63±2.67   90.18±2.79
MLP         89.74±3.03   89.56±2.91   87.29±2.93
RBF         88.32±3.01   86.94±2.34   83.16±2.89
Ensemble    92.10±2.68   92.81±2.70   91.39±2.83
F + P
SVM         90.11±3.97   89.79±3.59   89.52±3.50
MLP         87.12±3.83   87.73±3.42   86.19±3.83
RBF         85.05±4.24   85.97±3.90   84.15±4.71
Ensemble    89.46±2.99   88.85±2.70   86.57±3.18
AE + P
SVM         71.96±3.16   71.46±3.72   70.63±3.57
MLP         70.51±3.97   69.47±3.50   66.31±4.20
RBF         69.44±3.92   64.31±3.61   64.19±3.80
Ensemble    71.47±2.91   71.08±2.34   70.37±2.97
F + V
SVM         94.48±2.24   94.63±2.24   93.26±2.21
MLP         92.66±2.21   91.41±3.01   90.41±2.74
RBF         91.61±2.91   90.97±2.71   88.17±3.06
Ensemble    95.71±2.03   96.22±2.23   94.16±2.41
V + P
SVM         88.20±3.73   89.17±3.50   88.56±3.61
MLP         87.98±3.32   88.15±4.50   85.61±3.97
RBF         86.19±3.83   86.03±3.59   84.92±4.50
Ensemble    90.15±2.41   89.63±2.27   87.21±2.57

Table 6 Percent accuracy of tool condition classification obtained by using three-sensor combined TCM models with feature level fusion (All: all features; CFS: reduced by CFS; Chi2: reduced by χ² statistics)

            All          CFS          Chi2
AE + F + V
SVM         96.23±2.21   95.89±2.03   93.99±2.49
MLP         94.83±2.36   93.98±2.29   92.42±2.88
RBF         93.21±2.20   93.72±2.27   91.38±2.63
Ensemble    96.97±1.73   97.67±1.39   94.51±2.73
AE + V + P
SVM         91.08±2.94   89.78±2.56   87.46±2.67
MLP         89.96±2.87   88.79±2.97   86.03±3.35
RBF         87.01±3.27   86.15±3.01   84.73±3.61
Ensemble    93.55±2.60   91.58±2.53   90.09±2.94
AE + F + P
SVM         92.53±2.67   91.10±2.97   89.13±3.05
MLP         89.91±2.53   89.04±2.80   87.62±3.01
RBF         87.85±2.41   87.19±2.37   85.54±3.49
Ensemble    93.51±2.33   94.27±2.19   92.94±2.69
F + V + P
SVM         94.99±2.19   94.44±2.19   93.01±2.46
MLP         92.43±2.91   92.19±2.79   90.52±3.12
RBF         91.50±3.53   90.53±2.91   89.09±3.88
Ensemble    95.57±2.04   95.82±1.91   94.38±2.59

Table 7 Percent accuracy of tool condition classification obtained by using four-sensor combined TCM models with feature level fusion (All: all features; CFS: reduced by CFS; Chi2: reduced by χ² statistics)

            All          CFS          Chi2
AE + F + V + P
SVM         95.91±2.34   95.56±2.40   93.91±2.67
MLP         94.23±2.31   93.59±2.35   92.34±3.10
RBF         93.04±2.14   93.49±2.19   91.49±2.59
Ensemble    97.32±1.98   97.28±1.70   94.43±2.21
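
As an illustration only, the majority vote rule behind the Ensemble rows above and behind the decision level fusion of Section 4.3.1 can be sketched in Python as follows. The function name majority_vote, the example labels, and the tie-breaking rule (the earliest, i.e., highest-ranked, expert wins a tie) are assumptions of this sketch and are not taken from the experiments.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the class label predicted by most experts.

    `predictions` is an ordered list of labels, one per expert, with the
    most trusted expert first. Ties are broken in favour of the earliest
    expert; this tie rule is an assumption of the sketch, since no tie
    rule is stated in the text.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    # Walk the experts in rank order and return the first label that
    # reached the highest vote count.
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical single-sensor decisions for one instance, using the class
# codes of Table 4 (LW, MW, SW, C, B) and the sensor rank F, V, AE.
votes = {"F": "MW", "V": "MW", "AE": "SW"}
fused = majority_vote([votes[sensor] for sensor in ("F", "V", "AE")])
```

In this example two of the three experts agree on medium wear, so the fused decision is MW. The same function could combine the SVM, MLP, and RBF outputs of one feature set to form a machine ensemble.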

4.3.2 Feature level fusion

In this fusion method, features from multiple sensors are combined into a single set, and their combined information is then fed to the classifiers for training. This implies that each individual classifier experiences a larger input space and a higher volume of information, which may increase the accuracy and robustness of the classification. It should be pointed out that when features from different sensors are fused, the performance of ML algorithms may not be as predictable as in decision level fusion, because in decision level fusion the performance of the base classifiers is known a priori from the no-fusion analysis. Therefore, in feature level fusion, a total of 11 combinations of sensors are tested, which are one four-sensor combination, four three-sensor combinations, and six two-sensor combinations. The number of features is again reduced by CFS and χ² statistics. Tables 5, 6, and 7 summarize the classification accuracy achieved by two-, three-, and four-sensor combinations for the feature fusion-based TCM system. It is observed from these tables that the best performance achieved with a two-sensor combination is 96.22%, obtained by the force and vibration sensor combination with the CFS selection method and machine ensemble. It is also observed that with a three-sensor combination, the best performance achieved is 97.67%, by the force, vibration, and acoustic emission sensor combination with the CFS selection method, and that the best performance is 97.32% when all sensors are combined without feature reduction. From the accuracy and robustness perspectives, we suggest using the force, vibration, and acoustic emission sensor combination with the CFS feature selection method and machine ensemble technique for a multisensor fusion TCM system. If this combination of sensors is not available, the alternative is to use a force and vibration sensor combination with CFS and machine ensemble technique. It should be emphasized here that overall, SVM outperforms the other machine learning classifiers (MLP, RBF neural network) in all sensor combinations tested.

5 Conclusion

This paper studied the design of a multisensor fusion-based TCM system in the end-milling process. Specifically, we focused on the accuracy and robustness of tool condition classification when different design components are considered for a multisensor fusion-based TCM system, for example, different sensor combinations, ML algorithms (SVM, MLP, and RBF), and fusion methods (feature and decision level). The results from the experiment have shown that the SVM outperforms the neural network-based algorithms (MLP and RBF) due to its nature of structural risk minimization. It is also shown that the introduction of machine ensemble (majority vote) has relatively improved the accuracy of tool condition classification. This is true especially:

• where there exist more complementary features from each sensor within the fused feature set
• where there exists a high degree of diversity (in mapping of the solution space) within the ensemble base classifiers

It has also been shown from the experiment that CFS can improve the accuracy and robustness of the classification. This is important because CFS can reduce the feature space considerably in general. It should be pointed out here that most existing research on TCM systems for end-milling operations focuses on a specific condition of the cutting tools, such as estimation of wear level or breakage detection (normal or broken), and thus chipping is mostly ignored as a critical tool condition in the literature. In conclusion, this study can classify all of the tool conditions considered (wear, chipping, and breakage) under one TCM system. The results from the experiments show that our method can classify the condition of a multiflute end-milling tool with high accuracy using a reduced feature set and a specific sensor combination. In addition, irrelevant sensors and features have been identified. This facilitates more efficient modeling of the TCM because a selected subset of 25 features proves to be more accurate and robust than the entire set of 138 explanatory features, which can significantly reduce the computational effort of the TCM system.

References

1. Kegg RL (1984) On-line machine and process diagnostics. Ann CIRP 32(2):469–473
2. Kurada S, Bradley C (1997) A review of machine vision sensors for tool condition monitoring. Comput Ind 34:55–72. doi:10.1016/S0166-3615(96)00075-9
3. Elbestawi MA, Papazafiriou TA, Du RX (1991) Process monitoring of tool wear in milling using cutting force signature. Int J Mach Tools Manuf 31(1):55–73. doi:10.1016/0890-6955(91)90051-4
4. Lee BY, Tarng YS (1999) Milling cutter breakage detection by discrete wavelet transform. Mechatronics 9:225–234. doi:10.1016/S0957-4158(98)00049-X
5. Bhattacharyya P, Senupta D, Mukhopadhaya S (2007) Cutting force based real-time estimation of tool wear in face milling using a combination of signal processing techniques. Mech Syst Signal Process 21(6):2665–2683. doi:10.1016/j.ymssp.2007.01.004
6. Chen JC, Chen W (1999) Tool breakage detection system using accelerometer sensor. J Intell Manuf 10:187–197. doi:10.1023/A:1008980821787
7. Yesilyurt I, Ozturk H (2007) Tool condition monitoring in milling using vibration analysis. Int J Prod Res 45(4):1013–1028. doi:10.1080/00207540600677781
8. Atlas L, Ostendorf M, Bernard GD (2000) Hidden Markov models for monitoring machining tool wear. IEEE International Conference on Acoustics, Speech, and Signal Processing 6:3887–3890

9. Tansel IN, Trujillo ME, Bao WY (2001) Acoustic emission-based tool breakage detector for micro-end milling operations. Int J Model Simul 21(1):10–16
10. Ghosh N, Ravi YB, Patra A, Mukhopadhyay S, Paul S, Mohanty AR, Chattopadhyay AB (2007) Estimation of tool wear during CNC milling using neural network based sensor fusion. Mech Syst Signal Process 21:466–479. doi:10.1016/j.ymssp.2005.10.010
11. Cho S, Asfour S, Onar A, Kaundinya N (2005) Tool breakage detection using support vector machine learning in a milling process. Int J Mach Tools Manuf 45(3):241–249. doi:10.1016/j.ijmachtools.2004.08.016
12. Norman P, Kaplan A, Rantatalo M, Svenningsson I (2007) Study of a sensor platform for monitoring machining of aluminum and steel. Meas Sci Technol 18:1155–1166. doi:10.1088/0957-0233/18/5/001
13. Reddy YB (1992) Multisensor data fusion: state of the art. J Inf Sci Technol 2(1):91–103
14. Rehorn AG, Jiang J, Orban PE (2005) State-of-the-art methods and results in tool condition monitoring: a review. Int J Adv Manuf Technol 26:693–710. doi:10.1007/s00170-004-2038-2
15. Elbestawi MA, Marks J, Papazafiriou T (1989) Process monitoring in milling by pattern recognition. Mech Syst Signal Process 3(3):305–315. doi:10.1016/0888-3270(89)90055-1
16. Silva RG, Reuben RL, Wilcox SJ (1998) Tool wear monitoring of turning operation by neural network and expert system classification of a feature set generated from multiple sensors. Mech Syst Signal Process 12(2):319–332. doi:10.1006/mssp.1997.0123
17. Sick B (2002) On-line and indirect tool wear monitoring in turning with artificial neural networks: a review of more than a decade of research. Mech Syst Signal Process 16(4):487–546. doi:10.1006/mssp.2001.1460
18. Brezak D, Udiljak T, Majetic D, Novakovic B, Kasac J (2004) Tool wear monitoring using radial basis function neural network. Proc IEEE Int Jt Conf Neural Netw 3:1859–1862
19. Yuan S, Chu F (2006) Support vector machines-based fault diagnosis for turbo-pump rotor. Mech Syst Signal Process 20:939–952. doi:10.1016/j.ymssp.2005.09.006
20. Hall MA (1999) Correlation-based feature selection for machine learning. PhD dissertation, The University of Waikato, New Zealand
21. Fayyad UM, Irani KB (1993) Multi-interval discretization of continuous-valued attributes for classification learning. IJCA 93:1022–1027
22. Kohavi R, John GH (1997) Wrappers for feature subset selection. Artif Intell 97:273–324. doi:10.1016/S0004-3702(97)00043-X
23. Hong SJ, Raman M, Wong YS (2002) Feature extraction and selection in tool condition monitoring system. Lecture notes in computer science. Springer, Berlin, pp 487–497
24. Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
25. Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10(5):988–999. doi:10.1109/72.788640
26. Rumelhart DE, McClelland JL (1986) Parallel distributed processing: explorations in the microstructure of cognition. MIT Press, Boston
27. Bishop CM (2006) Pattern recognition and machine learning. Springer Science and Business Media, LLC., New York
28. Kittler J, Hatef M, Duin R, Matas J (1998) On combining classifiers. IEEE Trans Pattern Anal Mach Intell 20(3):226–239. doi:10.1109/34.667881
29. Hansen LK, Salamon P (1990) Neural network ensembles. IEEE Trans Pattern Anal Mach Intell 12(10):993–1001. doi:10.1109/34.58871
30. Dietterich TG (2000) An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Mach Learn 40(2):139–157. doi:10.1023/A:1007607513941
31. Bouckaert RR (2003) Choosing between two learning algorithms based on calibrated test. Proceedings of 20th International Conference on Machine Learning, Morgan Kaufmann
32. Palanisamy P, Rajendran I, Shanmugasundaram S (2008) Prediction of tool wear using regression and ANN models in end-milling operation. Int J Adv Manuf Technol 37:29–41. doi:10.1007/s00170-007-0948-5
33. Wolpert DH (1992) Stacked generalization. Neural Netw 5:241–259. doi:10.1016/S0893-6080(05)80023-1
