LSTM Deep Learning Approach For Bearing Fault Diagnosis
Online ISSN 1841-9844, ISSN-L 1841-9836, Volume: 15, Issue: 1, Month: February, Year: 2020
Article Number: 1010, https://doi.org/10.15837/ijccc.2020.1.3780
CCC Publications
Ping Zou
1. School of Economics and Management,
Beijing Jiaotong University, China
No.3 Shangyuancun Haidian District Beijing 100044 P. R. China
[email protected]
2. Beijing Aerospace Smart Manufacturing Technology Development CO., Ltd, China
49 Badachu Road, Shijingshan Dist. Beijing 100144 P. R. China
[email protected]
Baocun Hou
Beijing Aerospace Smart Manufacturing Technology Development CO., Ltd, China
49 Badachu Road, Shijingshan Dist. Beijing 100144 P. R. China
[email protected]
Lei Jiang
Railway Information Center, China
China National Railway Corporation,
Beijing 100844, China
[email protected]
Zhenji Zhang*
School of Economics and Management,
Beijing Jiaotong University, China
No.3 Shangyuancun Haidian District Beijing 100044 P. R. China
*Corresponding author: [email protected]
Abstract
The condition monitoring and fault detection of rolling bearings are of great significance for ensuring the safe and reliable operation of rotating machinery systems. In the past few years, the deep neural network (DNN) has been recognized as an effective tool for detecting rolling bearing faults. However, feeding the original vibration signal directly into a DNN makes the problem too complex, and the resulting fault identification accuracy is not high. With signal preprocessing, the original signal can be effectively denoised and preprocessed without losing the key diagnostic information. In this paper, a new EEMD-LSTM bearing fault diagnosis method is proposed. It combines signal preprocessing based on the EEMD method, which yields clear fault feature signals, with LSTM, which extracts fault features automatically and thereby improves the efficiency of fault feature extraction. In the case of small sample size, this method can significantly improve the accuracy of fault diagnosis.
Keywords: fault diagnosis, EEMD, LSTM, motor bearing.
1 Introduction
Rotating machinery is widely used in modern manufacturing, aerospace, vehicles, wind turbines and other fields. Fault diagnosis of key components of rotating machinery, especially bearings, plays an important role in the reliability and safety of modern industrial systems [19]. Research on bearing fault diagnosis methods is of great significance for maintaining the technical status of equipment and extending its service time. In the past few decades, the major fault diagnosis technologies based on vibration signal analysis can be divided into signal processing methods [21], intelligent diagnosis methods [22] and remaining service life prediction methods [24]. Signal preprocessing is very important for bearing fault feature extraction and fault recognition accuracy. On the one hand, the bearing vibration signal is often disturbed by external environmental noise, which lowers the signal-to-noise ratio. On the other hand, a bearing contains many kinds of parts, so the measured vibration signal shows obvious nonlinearity, and the vibration caused by large-scale transient load fluctuations is strongly non-stationary. Therefore, it is necessary to preprocess the acquired original signal to enhance the feature information, so as to obtain better diagnosis results.
Empirical mode decomposition (EMD) is a time-frequency signal decomposition tool proposed by N.E. Huang for analyzing non-stationary and non-linear signals [15]. It is a typical signal processing method and is widely used in the field of mechanical fault diagnosis [13]. EMD decomposes a complex signal into the sum of a series of intrinsic mode functions (IMFs). Each IMF component represents an inherent vibration mode of the signal at a different characteristic time scale, so that the signal characteristics are displayed at different resolutions. Because the frequency content of each IMF is related not only to the sampling frequency but also changes with the signal itself, EMD is an adaptive signal decomposition method with a high signal-to-noise ratio, which is very suitable for processing time-varying and non-stationary signals. However, mode mixing is the most significant disadvantage of EMD [25]. Ensemble empirical mode decomposition (EEMD), proposed by Z. Wu and N.E. Huang, is an improved method [28] that solves the mode mixing problem by adding white noise to the original signal. The spectrum obtained by the EEMD method reflects the bearing fault characteristics more accurately. Based on the EEMD method, scholars have made some achievements in the field of fault diagnosis. Yaguo Lei proposed a fault feature extraction method based on EEMD [17], which was applied to the early rub-impact fault diagnosis of a heavy oil catalytic cracking unit. To address the problem of EMD mode mixing, Tong Wang applied the noise-aided data analysis method known as ensemble empirical mode decomposition [26], and studied the influence of EMD and EEMD on time-frequency analysis. The pattern recognition methods used in these studies are mainly based on "shallow learning" algorithms, which place high requirements on signal acquisition and processing, have limited ability to express complex functions, and generalize poorly.
The deep neural network (DNN) originated from the artificial neural network (ANN) [5] and is widely used in vision, speech recognition and natural language processing. Various deep learning algorithms, such as the auto-encoder, the stacked auto-encoder [6], the deep Boltzmann machine (DBM) and the deep belief network (DBN) [14], have also been successfully applied to fault diagnosis. Deep learning has a powerful feature learning ability, which can meet the requirements of adaptive feature extraction in mechanical fault diagnosis. Its application in the field of fault diagnosis can reduce the dependence on expert diagnosis experience and signal processing technology, and reduce the uncertainty that traditional fault diagnosis methods introduce through manual feature design and extraction. In addition, deep learning can learn the complex mapping between monitoring data and fault state by building a deep model with multiple hidden layers and multi-level abstraction through nonlinear mapping, which is very suitable for fault diagnosis in the context of complex data.
The convolutional neural network (CNN) and the recurrent neural network (RNN) are the typical DNN methods. Most fault vibrations, such as flaking vibration, are periodic due to repeated impact. The amplitude and period depend on the bearing specifications and operating conditions. It is difficult for a CNN to identify the continuous changes of these vibration characteristics. Therefore, a CNN alone is not enough to diagnose continuous and periodic faults. The recurrent neural network (RNN) architecture, especially the long short-term memory (LSTM) neural network and its variants, has been successfully applied in the fields of image captioning, speech recognition, genome analysis and natural language processing [3]. LSTM is a special kind of RNN that can efficiently extract the characteristic information contained in massive data, can adaptively learn the dynamic changes of a time series, and can solve the problems of vanishing and exploding gradients when training on long sequences. Experiments show that its accuracy is better than that of the conventional RNN model and that it can effectively improve the diagnosis rate [16]. Zhao, R. uses the LSTM model to capture the long-term dependence of the vibration signal, model the sequence data and monitor the health status of the machine [32]. Xiang Xu uses an LSTM-based method to improve the accuracy of fault identification in power systems [31].
This paper focuses on the intelligent fault diagnosis of the bearings of rotating electrical machines, drawing on research progress in deep learning. In view of the non-stationary characteristics of the bearing vibration signal, the EEMD method is combined with LSTM. EEMD is used as the preprocessor of the vibration signal, and the IMF components with obvious fault feature information are selected according to the kurtosis of each component. Then, the information of the IMF components is adaptively fused by the LSTM, which extracts features from it to complete the intelligent classification and recognition of the bearing state. Finally, the method is compared with other fault diagnosis methods on test data. The results show that the performance of the proposed method is far better than that of SVM and BPNN. Compared with EMD-CNN, the method has higher accuracy and better stability.
(1) Add N groups of different Gaussian white noise $s_i(t)$ to the original signal $x(t)$, so that the signal and noise form N populations:
$$y_i(t) = x(t) + s_i(t), \quad i = 1, \dots, N$$
(2) EMD decomposition of the N populations with white noise is carried out to obtain the corresponding intrinsic mode function (IMF) components $c_{ij}(t)$ and remainder terms $r_i(t)$, where $c_{ij}(t)$ represents the j-th IMF obtained by decomposition after the i-th white noise is added.
(3) The final decomposition result of EEMD is obtained by averaging the corresponding IMFs from step (2):
$$c_j(t) = \frac{1}{N} \sum_{i=1}^{N} c_{ij}(t)$$
where $c_j(t)$ represents the j-th component obtained after EEMD decomposition of the original signal. Compared with a single EMD decomposition, EEMD can eliminate mode mixing, making the physical meaning of each IMF clearer.
(4) IMF component selection: each decomposed modal function component $IMF_i$ is a narrow-band signal, ordered from high frequency to low frequency. When EEMD is used for noise reduction, however, the IMF components should be properly selected so as to retain those with the most obvious fault feature information. Kurtosis is independent of bearing speed, size and load, and is very sensitive to impact signals. It is particularly suitable for diagnosing surface damage faults, especially early faults. Through kurtosis analysis of the $IMF_i$ components, the IMF components with obvious fault feature information are selected. The signal kurtosis formula is as follows:
$$C_q = \frac{\frac{1}{N} \sum_{i=1}^{N} (|x_i| - \bar{x})^4}{X_{rms}^4}$$
where $x_i$ is the i-th sample of the signal, $\bar{x}$ is its mean, and $X_{rms}$ is the root mean square value of the signal, an important index used to determine whether the operation state is normal in mechanical fault diagnosis. With kurtosis-based selection [18], the fault feature signal in the retained $IMF_i$ components is more significant.
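Steps (1) to (4) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the implementation used in this paper: `toy_decompose` is a hypothetical stand-in that splits a signal into a fast and a slow part and is not a real EMD, and the names `eemd_average`, `kurtosis_index`, `select_by_kurtosis` and `noise_ratio` are chosen here for illustration. The mean in the kurtosis formula is taken as the mean of the signal, and every trial is assumed to return the same number of components, which a real EMD does not guarantee.

```python
import numpy as np

def toy_decompose(y):
    """Hypothetical stand-in for EMD (NOT real EMD): a fast part and a slow part."""
    slow = np.convolve(y, np.ones(9) / 9.0, mode="same")
    return np.stack([y - slow, slow])

def eemd_average(x, decompose, n_trials=50, noise_ratio=0.2, seed=0):
    """Steps (1)-(3): add white noise N times, decompose, average component-wise."""
    rng = np.random.default_rng(seed)
    scale = noise_ratio * np.std(x)  # noise amplitude relative to the signal (assumed)
    total = None
    for _ in range(n_trials):
        y_i = x + rng.normal(0.0, scale, size=x.shape)  # step (1): y_i = x + s_i
        c_i = decompose(y_i)                            # step (2): components, shape (J, T)
        total = c_i if total is None else total + c_i
    return total / n_trials                             # step (3): c_j = (1/N) sum_i c_ij

def kurtosis_index(x):
    """C_q = mean((|x_i| - mean(x))^4) / X_rms^4, following the formula above."""
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    return np.mean((np.abs(x) - np.mean(x)) ** 4) / rms ** 4

def select_by_kurtosis(imfs, k):
    """Step (4): keep the k components with the largest kurtosis, largest first."""
    scores = np.array([kurtosis_index(c) for c in imfs])
    order = np.argsort(-scores)[:k]
    return imfs[order], scores[order]

# Synthetic test signal: a sine wave with periodic impacts (illustrative only).
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
x = np.sin(2 * np.pi * 5 * t)
x[::100] += 3.0
imfs = eemd_average(x, toy_decompose)
selected, scores = select_by_kurtosis(imfs, k=1)
```

In practice `toy_decompose` would be replaced by a real EMD routine, and `k` would be the number of IMF components kept per sample.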
The forget gate, input gate and output gate, which are composed of multiple feature maps, are the core components of LSTM. The forget gate determines how much of the previous information passes through, and its output is as follows:
$$f_t = \sigma(w_{zf} z_t + w_{hf} h_{t-1} + b_f)$$
where $\sigma$ represents the sigmoid activation function, $w$ represents the weights, $z_t$ represents the current input, $h_{t-1}$ is the output of the previous time step, and $b_f$ represents the bias.
The input gate determines which information can be retained. The calculation formulas are as follows:
$$i_t = \sigma(w_{zi} z_t + w_{hi} h_{t-1} + b_i)$$
$$\tilde{c}_t = \tanh(w_{zc} z_t + w_{hc} h_{t-1} + b_c)$$
The cell state is then updated, and the output gate determines which information can be output by the unit. The calculation formulas are as follows:
$$c_t = c_{t-1} \odot f_t + i_t \odot \tilde{c}_t$$
$$o_t = \sigma(w_{zo} z_t + w_{ho} h_{t-1} + b_o)$$
$$h_t = o_t \odot \tanh(c_t)$$
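The gate equations can be illustrated with a single LSTM time step written in NumPy. This is a minimal sketch, not the network used in this paper: the parameter names (`w_zf`, `w_hf`, `b_f` and so on), the layer sizes and the random initialisation are assumptions made here for illustration, and the candidate state is given its own bias `b_c`.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(z_t, h_prev, c_prev, p):
    """One LSTM time step; p maps illustrative names such as 'w_zf' to arrays."""
    f_t = sigmoid(p["w_zf"] @ z_t + p["w_hf"] @ h_prev + p["b_f"])      # forget gate
    i_t = sigmoid(p["w_zi"] @ z_t + p["w_hi"] @ h_prev + p["b_i"])      # input gate
    c_tilde = np.tanh(p["w_zc"] @ z_t + p["w_hc"] @ h_prev + p["b_c"])  # candidate state
    c_t = c_prev * f_t + i_t * c_tilde                                  # cell state update
    o_t = sigmoid(p["w_zo"] @ z_t + p["w_ho"] @ h_prev + p["b_o"])      # output gate
    h_t = o_t * np.tanh(c_t)                                            # unit output
    return h_t, c_t

def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases (an arbitrary choice for the sketch)."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "fico":
        p[f"w_z{g}"] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        p[f"w_h{g}"] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p[f"b_{g}"] = np.zeros(n_hidden)
    return p

# Run a random sequence of 20 steps with 6 input channels through the cell.
n_in, n_hidden, n_steps = 6, 8, 20
params = init_params(n_in, n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
sequence = np.random.default_rng(1).normal(size=(n_steps, n_in))
for z_t in sequence:
    h, c = lstm_step(z_t, h, c, params)
```

Because the output gate lies in (0, 1) and tanh lies in (-1, 1), every entry of `h` stays strictly inside (-1, 1), whatever the input.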
The LSTM layer is followed by the classification layer, and the calculation formula is as follows:
$$\mathrm{softmax}(y_i) = \frac{e^{u_i}}{\sum_j e^{u_j}}$$
Here $p(x)$ represents the true classification result, and $q(x)$ is the classification result output by softmax.
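As a small numerical illustration, the softmax output can be compared with the true label as follows. The use of cross-entropy between $p(x)$ and $q(x)$ as the comparison is an assumption made here, since the text only names the two distributions; the function names and the example scores are likewise illustrative.

```python
import numpy as np

def softmax(u):
    """softmax(u)_i = exp(u_i) / sum_j exp(u_j), shifted by max(u) for stability."""
    u = np.asarray(u, dtype=float)
    e = np.exp(u - np.max(u))
    return e / np.sum(e)

def cross_entropy(p, q, eps=1e-12):
    """Assumed loss: H(p, q) = -sum_x p(x) log q(x)."""
    return -np.sum(np.asarray(p) * np.log(np.asarray(q) + eps))

u = np.array([2.0, 1.0, 0.1])   # scores for three classes (made-up values)
q = softmax(u)                  # q(x): predicted class probabilities
p = np.array([1.0, 0.0, 0.0])   # p(x): one-hot true label
loss = cross_entropy(p, q)
```

With a one-hot label the loss reduces to the negative log of the probability assigned to the true class, so it approaches zero as the prediction becomes confident and correct.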
Adam, a method for stochastic optimization, is used to train the LSTM and determine the network parameters, that is, the weights and biases. Adam dynamically adjusts the learning rate of each parameter by using the first-order moment estimate (mean) and second-order moment estimate (uncentered variance) of the gradient, and has been successful in optimizing the learning rate of LSTM.
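The moment estimates can be made concrete with one Adam update written in NumPy. This is a sketch of the standard Adam rule with its usual default coefficients, applied to a toy quadratic objective; the objective, the step count and the learning rate of 0.1 are assumptions for illustration and are not the settings used in this paper.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad        # first-order moment estimate (mean)
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-order moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy problem: minimise (w - 3)^2 starting from w = 0.
w = np.array([0.0])
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 501):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.1)
```

Dividing by the square root of the second moment makes the effective step size per parameter roughly independent of the gradient scale, which is the sense in which the learning rate is adjusted dynamically.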
The performance of LSTM is related to the number of network layers and the size of the output layer. First, the learning ability of LSTM is positively related to the number of network layers. To a certain extent, the deeper the network structure, the stronger its expression ability. A multi-layer LSTM is a stack of LSTM layers. Its advantage is that it can express features more abstractly at a high level, reduce the number of neurons, increase recognition accuracy and reduce training time. If the structure of the LSTM is too simple and its learning ability is poor, the information contained in the input cannot be extracted effectively. A deep network structure, on the other hand, means that more training data is needed to train a large number of parameters; otherwise, overfitting and local optimum problems may occur. At the same time, as the complexity of the network increases, training the model becomes very time-consuming, which makes it harder to build a suitable and efficient model. Secondly, the output size of the LSTM needs to be tested and determined according to some basic design principles. Generally speaking, the longer the time step and the higher the output dimension of the LSTM, the more information it remembers [7], so that more data can be obtained and more information can be provided to the deeper levels. More importantly, for mechanical fault diagnosis, it can better suppress high-frequency noise and improve the accuracy of fault identification. In addition, the output dimension of the LSTM in the first layer is set to a large size to deepen the network and improve its expression ability, so as to learn a good feature representation of the input signal. Although some basic principles for setting the hyperparameters are mentioned above, a lot of tuning is still needed in practice to obtain appropriate parameter values [30].
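The trade-off between depth, output size and the amount of training data can be illustrated by counting trainable parameters. The sketch below assumes the standard LSTM layout with four weight blocks (forget gate, input gate, candidate state, output gate) and a single bias vector per block; some libraries use two bias vectors per block, which changes the count slightly. The layer sizes are hypothetical and are not the configuration used in this paper.

```python
def lstm_param_count(n_in, n_hidden):
    """Four blocks, each with input weights, recurrent weights and one bias vector."""
    return 4 * (n_in * n_hidden + n_hidden * n_hidden + n_hidden)

def stacked_param_count(n_in, hidden_sizes):
    """Total for stacked layers, where each layer feeds the next."""
    total, prev = 0, n_in
    for n_hidden in hidden_sizes:
        total += lstm_param_count(prev, n_hidden)
        prev = n_hidden
    return total

# Hypothetical sizes: 6 input channels, one wide layer versus a three-layer stack.
one_layer = stacked_param_count(6, [256])
three_layers = stacked_param_count(6, [256, 128, 64])
```

The recurrent term grows with the square of the output size, which is why a large first layer dominates the parameter count and why wider or deeper networks need more training data.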
merge and create a data set, form a sample data set of [n, k, s] dimension, and divide the data set into
training set and test set.
Step 2: LSTM neural network training
(1) LSTM design and training: design a feasible LSTM in accordance with the design principles described in Section 3.3 and train it on the training set. Tune the hyperparameters, reduce the network computation through the pooling layer, increase the generalization ability of the model, classify the training results with the fully connected layer, and obtain a better-performing LSTM model through iterative network training.
(2) Qualitative diagnosis of bearing fault: verify the validity of the bearing fault diagnosis model based on EEMD and LSTM on the test set.
5 Test
5.1 Test description
In this paper, PHM 2009 challenge data of Case Western Reserve University [33] is used as experimental data to analyze and verify the proposed EEMD + LSTM method. As shown in Figure 5 and Figure 6 below, the gearbox test bench includes a 2 horsepower motor (left), a torque sensor / encoder (middle), a dynamometer (right) and control electronics. The test bearing supports the motor shaft. Each pair of meshing gears contains a spur gear and a helical gear. Acceleration data are measured both near to and far from the motor bearing. Single-point faults are introduced into the test bearing by electrical discharge machining (EDM).
The experimental environment is shown in Table 1 below. Faults ranging from 0.007 inch to 0.040 inch in diameter are introduced on the inner raceway, rolling element (i.e. ball) and outer raceway respectively, and vibration data are recorded for motor loads of 0-3 horsepower (motor speeds of 1797-1720 RPM).
the effective IMF, which contains more vibration characteristic information generated by bearing fault. Then, the six IMF components with obvious fault characteristics are stacked into a multi-channel sample in descending order of kurtosis, so that the sample data dimension changes from 3000 × 400 to 3000 × 6 × 400. The above operations are carried out on all sample signals to create the data set, which is then divided into a training set and a test set. The final data set is shown in Table 2.
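Table 2: Fault diagnosis training set and test set of rotating motor

| Bearing state | Fault classification | Fault diameter (inch) | Training set | Test set | Label |
|---------------|----------------------|-----------------------|--------------|----------|-------|
| normal        | normal               | -                     | 210          | 90       | 0     |
| Spur1         | Ball fault           | 0.007                 | 210          | 90       | 1     |
| Spur2         | Ball fault           | 0.014                 | 210          | 90       | 2     |
| Spur3         | Ball fault           | 0.021                 | 210          | 90       | 3     |
| Spur4         | Inner ring fault     | 0.007                 | 210          | 90       | 4     |
| Spur5         | Inner ring fault     | 0.014                 | 210          | 90       | 5     |
| Spur6         | Inner ring fault     | 0.021                 | 210          | 90       | 6     |
| Spur7         | Outer ring fault     | 0.007                 | 210          | 90       | 7     |
| Spur8         | Outer ring fault     | 0.014                 | 210          | 90       | 8     |
| Spur9         | Outer ring fault     | 0.021                 | 210          | 90       | 9     |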
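The split in Table 2 (210 training and 90 test samples for each of the 10 labels) can be reproduced with a per-class shuffle. The sketch below uses an all-zero placeholder array in place of the real 3000 × 6 × 400 data set, and the function name `stratified_split` and the fixed random seed are assumptions for illustration; how the samples were actually assigned in this paper is not stated.

```python
import numpy as np

def stratified_split(labels, n_train_per_class, seed=0):
    """Shuffle the indices of each class and cut them into train and test parts."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        train_idx.extend(idx[:n_train_per_class])
        test_idx.extend(idx[n_train_per_class:])
    return np.array(train_idx), np.array(test_idx)

# Placeholder data with the shape given in the text: 3000 samples, 6 IMFs, 400 points.
n_classes, per_class, n_channels, n_points = 10, 300, 6, 400
X = np.zeros((n_classes * per_class, n_channels, n_points), dtype=np.float32)
y = np.repeat(np.arange(n_classes), per_class)

train_idx, test_idx = stratified_split(y, n_train_per_class=210)
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
```

Splitting class by class keeps the 7:3 ratio identical for every label, which a single global shuffle would only achieve approximately.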
certain extent, and improve the generalization performance of LSTM. The designed LSTM network consists of three LSTM feature extraction units and a fully connected layer for feature integration, followed by the classification layer. The output of the LSTM in the first-level feature extraction unit is 3 × 256, and it is followed by two further feature extraction units, which realize the deep structure of the network and improve its expression ability.
NU = number of units; OC = output channel; TS = time steps.
One of the important tasks of deep learning is to adjust the hyperparameters, as shown in Table 4. In this paper, the hyperparameters considered are the batch size and the learning rate. The mini-batch gradient method is used to calculate the loss function. The batch size is the number of samples processed in each batch, and the learning rate is the update speed of the weights and biases in back propagation.
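The role of the batch size can be sketched with a simple mini-batch iterator. The 2100 training samples follow from Table 2 (10 labels × 210 samples); the batch size of 80 is one of the values examined in this paper, while the function name `iterate_minibatches` and the shuffling with a fixed seed are assumptions for illustration.

```python
import numpy as np

def iterate_minibatches(n_samples, batch_size, seed=0):
    """Yield shuffled index arrays; the last batch may be smaller than batch_size."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

# One pass (epoch) over the training set in mini-batches.
batches = list(iterate_minibatches(n_samples=2100, batch_size=80))
```

Each index array selects the samples used for one gradient computation and one parameter update, so a smaller batch size gives more updates per epoch, each based on a noisier gradient.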
The accuracy rate reflects the validity of the model. It can be seen from Table 4 that a batch size of 80 with a learning rate of 0.004 gives the highest accuracy rate, which can reach 100%, and the average accuracy rate is 90%.
Figure 9: Change of fault diagnosis accuracy tested by different deep learning methods
fusion of LSTM, the weight of IMF component to the output is adaptively obtained.
6 Conclusion
The first chapter of this paper reviews machine learning methods in the field of fault diagnosis and points out the shortcomings of the DNN method in this field. A fault diagnosis method based on ensemble empirical mode decomposition (EEMD) and the long short-term memory (LSTM) neural network is proposed to address the problem that the fault vibration signal of a rotating machinery motor is usually non-stationary and seriously polluted by noise. The method realizes adaptive processing of the data and simplifies the fault diagnosis operation. The second and third chapters introduce the EEMD and LSTM methods in detail, the fourth chapter puts forward the fault diagnosis process based on EEMD-LSTM, and the fifth chapter presents the experimental verification, in which the proposed method is compared with 1-DCNN and EMD-CNN. The experimental results show that:
(1) By combining the advantages of EEMD in analyzing and processing non-linear and non-stationary random signals with the powerful automatic feature extraction ability of LSTM, intelligent bearing fault diagnosis is realized.
(2) By using kurtosis to determine the effective IMF components and build a data set in which the vibration information is more concentrated, the classification performance of the LSTM network is improved.
(3) On the test data, the proposed fault diagnosis method achieves 99.98% accuracy, which is superior to the current advanced diagnosis methods 1-DCNN and EMD-CNN, with higher accuracy and more stable diagnosis results.
Deep learning has a strong learning ability. It updates its weights through iterative learning on the signal and extracts fault features automatically, which gives it advantages over traditional artificial intelligence methods. What is learned from the vibration signal in each layer of a deep network, and what is its physical meaning? Although much of the literature has studied and explained this question in the field of image recognition, the relevant research in the field of fault diagnosis is not yet mature. The defects of long training time and poor generalization ability of LSTM have not been objectively described, and the characteristics of LSTM in diagnosis need further study. Using the characteristics of the vibration signal to design a more "suitable" deep learning model will be the direction of future research [20] [23]. In addition to its application in the field of industrial robots, fault diagnosis can also be extended to the fields of transportation [11], finance [11] and supply chain [9] [8], where it has broad application prospects.
Funding
The research was funded by the National Key R&D Program of China (No. 2018YFB1702700).
Author contributions
The authors contributed equally to this work.
Conflict of interest
The authors declare no conflict of interest.
References
[1] Alangari, H.; Kimura, Y. (2017). A Hybrid EMD-Kurtosis Method for Estimating Fetal Heart
Rate from Continuous Doppler Signals, physiology, 8, 2017.
[2] Andrychowicz, M.; Denil, M. (2016). Learning to learn by gradient descent by gradient descent,
Proc. NIPS, 2016.
[3] Auli, M.; Galley, M.; Quirk, C.; Zweig, G. (2013). Joint Language and Translation Modeling with
Recurrent Neural Networks, Proc. of EMNLP, 2013.
[4] Bahdanau, D.; Cho, K.; Bengio, Y. (2014). Neural machine translation by jointly learning to
align and translate, arXiv preprint arXiv:1409.0473, 2014.
[5] Bengio, Y. (2009). Learning deep architectures for AI, Found. Trends Mach. Learn, 2(1), 1-127,
2009.
[6] Bengio, Y.; Courville, A.; Vincent, P. (2013). Representation learning: a review and new
perspectives, IEEE Trans. Pattern Anal. Mach. Intell, 35(8), 1798–1828, 2013.
[7] Blitzer, J.; McDonald, R.; Pereira, F. (2006). Domain adaptation with structural correspondence
learning, Proceedings of EMNLP, 120-128, 2006.
[8] Chu, X.; Liu, J.; Gong, D.; Wang, R. (2019). Preserving Location Privacy in Spatial Crowdsourc-
ing under Quality Control, IEEE Access, 7, 155851-155859, 2019.
[9] Chu, X.; Zhong, Q.; Li, X. (2018). Reverse channel selection decisions with a joint third-party
recycler, International Journal of Production Research, 56(18), 5969-5981, 2018.
[10] Gong, D.; Tang, M.; Liu, S.; Xue, G.; Wang, L. (2019). Achieving sustainable transport through
resource scheduling: A case study for electric vehicle charging stations, Advances in Production
Engineering & Management, 14(1), 65-79, 2019.
[11] Gong, D.; Liu, S.; Liu, J.; Ren, L. (2019). Who benefits from online financing? A sharing
economy E-tailing platform perspective, International Journal of Production Economics, DOI:
10.1016/j.ijpe.2019.09.011.
[12] Gu, N.L.; Pan, H. (2017). Bearing Fault Diagnosis Method Based on EMD-CNNs, CSMA, 2017.
[13] He, Q.; Li, P.; Kong, F. (2012). Rolling bearing localized defect evaluation by multiscale signature
via empirical mode decomposition, J. Vib. Acoust, 134, 061013, 2012.
[14] Hinton, G.E.; Salakhutdinov, R.R. (2006). Reducing the dimensionality of data with neural
networks, Science, 313(5786), 504–507, 2006.
[15] Huang, N.E.; Shen, Z.; Long, S.; Wu, M.; Shih, H.; Zheng, Q.; Yen, N.-C.; Tung, C.; Liu,
H. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-
stationary time series analysis, Proc. R. Soc. A Math. Phys. Eng. Sci, 454, 903–995, 1998.
[16] Karpathy, A.; Fei-Fei, L. (2015). Deep visual-semantic alignments for generating image de-
scriptions, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
3128–3137, 2015.
[17] Lei, Y.; He, Z. (2009). Application of the EEMD method to rotor fault diagnosis of rotating
machinery, Mechanical Systems and Signal Processing, 23(4), 1327-1338, 2009.
[18] Lei, Y.; He, Z. (2011). EEMD method and WNN for fault diagnosis of locomotive roller bearings,
Expert Systems with Applications, 38(6), 7334-7341, 2011.
[19] Liu, R.; Yang, B. (2018). Artificial intelligence for fault diagnosis of rotating machinery, Mech.
Syst. Signal Process, 135, 33–47, 2018.
[20] Nazifa, T.S.; Mohaed, S.F.; Amin, A.B. (2019). A Brief Discussion on Supply Chain Management
in Construction Industry, Journal of System and Management Science, 9(1), 69-86, 2019.
[21] Randall, R.B.; Antoni, J. (2011). Rolling element bearing diagnostics-A tutorial, Mech. Syst.
Signal Process, 25, 485–520, 2011.
[22] Song, L.; Wang, H.; Chen, P. (2018). Vibration-Based Intelligent Fault Diagnosis for Roller
Bearings in Low-Speed Rotating Machinery, IEEE Trans. Instrum. Meas, 67, 1887–1899, 2018.
[23] Tabsh, Y.; Davidaviciene, V. (2016). Information and Communication Technologies in Energy
Management, Journal of System and Management Science, 6(4), 67-81, 2016.
[24] Wang, D.; Tsui, K. (2018). Brownian motion with adaptive drift for remaining useful life predic-
tion: revisited, Mech. Syst. Signal Process, 99, 691–701, 2018.
[25] Wang, J.; Du, G. (2020). Fault diagnosis of rotating machines based on the EMD manifold, Mech.
Syst. Signal Process, 135, 106443, 2020.
[26] Wang, T.; Zhang, M. (2012). Comparing the applications of EMD and EEMD on time–frequency
analysis of seismic signal, Journal of Applied Geophysics, 83, 29-34, 2012.
[27] Wu, C.; Jiang, P. (2019). Intelligent fault diagnosis of rotating machinery based on one-
dimensional convolutional neural network, Computers in Industry, 108, 53–61, 2019.
[28] Wu, Z.; Huang, N.E. (2009). Ensemble empirical mode decomposition: a noise-assisted data
analysis method, Advances in Adaptive Data Analysis, 1(1), 1-41, 2009.
[29] Xu, X.; Chen, R. (2007). Recurrent Neural Network Based On-line Fault Diagnosis Approach for
Power Electronic Devices, ICNC, 24-27, 2007.
[30] Yin, W.; Kann, K. (2017). Comparative Study of CNN and RNN for Natural Language
Processing, arXiv preprint arXiv:1702.01923, 2017.
[31] Zhao, H.; Sun, S. (2016). Sequential Fault Diagnosis based on LSTM Neural Network, IEEE
Access, 6, 12929-12939, 2018.
[32] Zhao, R.; Wang, J.; Yan, R.; Mao, K. (2016). Machine health monitoring with LSTM networks,
Proceedings of the 2016 10th International Conference on Sensing Technology (ICST), 1-6, 2016.