Machine learning for anomaly detection and condition monitoring
A step-by-step tutorial from data import to model output
Vegard Flovik · Apr 23, 2019 · 10 min read

My previous article on anomaly detection and condition monitoring has received a lot of feedback. Many of the questions I receive concern the technical aspects and how to set up the models used to obtain the results. For an introduction to anomaly detection and condition monitoring, I recommend first reading my original article on the topic. It provides the necessary background information on how machine learning and data-driven analytics can be utilized to extract valuable information from sensor data.

The current article focuses mostly on the technical aspects, and includes all the code needed to set up anomaly detection models based on multivariate statistical analysis and autoencoder neural networks.

Download the dataset:
To replicate the results in the original article, you first need to download the dataset from the NASA Acoustics and Vibration Database. See the downloaded Readme Document for IMS Bearing Data for further information on the experiment and the available data.

Each data set consists of individual files that are 1-second vibration signal snapshots recorded at specific intervals. Each file consists of 20,480 points with the sampling rate set at 20 kHz. The file name indicates when the data was collected. Each record (row) in a data file is a data point. Larger intervals between the time stamps (shown in the file names) indicate resumption of the experiment on the next working day.

Import packages and libraries:
The first step is to import some useful packages and libraries for the analysis:

# Common imports
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
import seaborn as sns
sns.set(color_codes=True)

from numpy.random import seed
from tensorflow import set_random_seed

from keras.layers import Input, Dropout
from keras.layers.core import Dense
from keras.models import Model, Sequential
from keras import regularizers
from keras.models import model_from_json, load_model
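Before aggregating all the snapshot files, it can be useful to sanity-check a single raw file against the format described above. The short sketch below uses the packages imported above and assumes the extracted "2nd_test" folder (used later in the tutorial) sits in the working directory and contains only the tab-separated snapshot files, which have no header row:

data_dir = '2nd_test'
sample_file = sorted(os.listdir(data_dir))[0]   # first snapshot in the folder

# header=None, since the raw files contain only tab-separated numbers
snapshot = pd.read_csv(os.path.join(data_dir, sample_file), sep='\t', header=None)

print(sample_file)      # the file name encodes the recording timestamp
print(snapshot.shape)   # expect (20480, 4): 20,480 samples x 4 bearing channels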
Data loading and pre-processing:
An assumption is that gear degradation occurs gradually over time, so we use one datapoint every 10 minutes in the following analysis. Each 10-minute datapoint is aggregated by taking the mean absolute value of the vibration recordings over the 20,480 datapoints in each file. We then merge everything together in a single dataframe.

In the following example, I use the data from the 2nd gear failure test (see the readme document for further information on that experiment).

data_dir = '2nd_test'
merged_data = pd.DataFrame()

for filename in os.listdir(data_dir):
    print(filename)
    dataset = pd.read_csv(os.path.join(data_dir, filename), sep='\t')
    dataset_mean_abs = np.array(dataset.abs().mean())
    dataset_mean_abs = pd.DataFrame(dataset_mean_abs.reshape(1, 4))
    dataset_mean_abs.index = [filename]
    merged_data = merged_data.append(dataset_mean_abs)

merged_data.columns = ['Bearing 1', 'Bearing 2', 'Bearing 3', 'Bearing 4']

After loading the vibration data, we transform the index to datetime format (using the file-name convention of the dataset), and then sort the data by index in chronological order before saving the merged dataset as a .csv file:

merged_data.index = pd.to_datetime(merged_data.index, format='%Y.%m.%d.%H.%M.%S')
merged_data = merged_data.sort_index()
merged_data.to_csv('merged_dataset.csv')   # file name is arbitrary
merged_data.head()

(Figure: resulting "merged_data" dataframe)

Define train/test data:
Before setting up the models, we need to define train/test data. To do this, we perform a simple split where we train on the first part of the dataset (which should represent normal operating conditions) and test on the remaining parts of the dataset leading up to the bearing failure.

# Data up to this point in time is assumed to represent normal operating conditions
dataset_train = merged_data[:'2004-02-13 23:52:39']
dataset_test = merged_data['2004-02-13 23:52:39':]

dataset_train.plot(figsize=(10, 6))

Normalize data:
I then use preprocessing tools from Scikit-learn to scale the input variables of the model. The "MinMaxScaler" simply re-scales the data to be in the range [0, 1].

scaler = preprocessing.MinMaxScaler()

X_train = pd.DataFrame(scaler.fit_transform(dataset_train),
                       columns=dataset_train.columns,
                       index=dataset_train.index)

# Random shuffle training data
# (note: sample() returns a copy, so as written this does not reorder X_train in place)
X_train.sample(frac=1)

X_test = pd.DataFrame(scaler.transform(dataset_test),
                      columns=dataset_test.columns,
                      index=dataset_test.index)

PCA type model for anomaly detection:
As dealing with high-dimensional sensor data is often challenging, there are several techniques to reduce the number of variables (dimensionality reduction). One of the main techniques is principal component analysis (PCA). For a more detailed introduction, I refer to my original article on the topic. As an initial attempt, let us compress the sensor readings down to the two main principal components.

from sklearn.decomposition import PCA

pca = PCA(n_components=2, svd_solver='full')

X_train_PCA = pca.fit_transform(X_train)
X_train_PCA = pd.DataFrame(X_train_PCA)
X_train_PCA.index = X_train.index

X_test_PCA = pca.transform(X_test)
X_test_PCA = pd.DataFrame(X_test_PCA)
X_test_PCA.index = X_test.index
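Since the model below relies on only two principal components, it can be worth checking how much of the variance in the scaled sensor readings they actually retain. A quick diagnostic using the fitted "pca" object above (the exact numbers depend on the data):

# Fraction of variance captured by each retained component, and their sum
print(pca.explained_variance_ratio_)
print(pca.explained_variance_ratio_.sum())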
To identify anomalies in this compressed representation, we use the Mahalanobis distance, a metric widely used in cluster analysis and classification techniques. In order to use the Mahalanobis distance to classify a test point as belonging to one of N classes, one first estimates the covariance matrix of each class, usually based on samples known to belong to each class. In our case, as we are only interested in classifying "normal" vs. "anomaly", we use training data that only contains normal operating conditions to calculate the covariance matrix. Then, given a test sample, we compute the Mahalanobis distance to the "normal" class, and classify the test point as an "anomaly" if the distance is above a certain threshold.

For a more detailed introduction to these technical aspects, you can have a look at my previous article, which covers these topics in more detail.

Define functions used in the PCA model:

Calculate the covariance matrix:

def cov_matrix(data, verbose=False):
    covariance_matrix = np.cov(data, rowvar=False)
    if is_pos_def(covariance_matrix):
        inv_covariance_matrix = np.linalg.inv(covariance_matrix)
        if is_pos_def(inv_covariance_matrix):
            return covariance_matrix, inv_covariance_matrix
        else:
            print("Error: Inverse of Covariance Matrix is not positive definite!")
    else:
        print("Error: Covariance Matrix is not positive definite!")

Calculate the Mahalanobis distance:

def MahalanobisDist(inv_cov_matrix, mean_distr, data, verbose=False):
    inv_covariance_matrix = inv_cov_matrix
    vars_mean = mean_distr
    diff = data - vars_mean
    md = []
    for i in range(len(diff)):
        md.append(np.sqrt(diff[i].dot(inv_covariance_matrix).dot(diff[i])))
    return md

Detecting outliers:

def MD_detectOutliers(dist, extreme=False, verbose=False):
    k = 3. if extreme else 2.
    threshold = np.mean(dist) * k
    outliers = []
    for i in range(len(dist)):
        if dist[i] > threshold:
            outliers.append(i)  # index of the outlier
    return np.array(outliers)

Calculate the threshold value for classifying a datapoint as an anomaly:

def MD_threshold(dist, extreme=False, verbose=False):
    k = 3. if extreme else 2.
    threshold = np.mean(dist) * k
    return threshold

Check if the matrix is positive definite:

def is_pos_def(A):
    if np.allclose(A, A.T):
        try:
            np.linalg.cholesky(A)
            return True
        except np.linalg.LinAlgError:
            return False
    else:
        return False

Set up the PCA model:
Define the train/test set from the two main principal components:

data_train = np.array(X_train_PCA.values)
data_test = np.array(X_test_PCA.values)

Calculate the covariance matrix and its inverse, based on the data in the training set:

cov_matrix_train, inv_cov_matrix = cov_matrix(data_train)   # only the inverse is used below

We also calculate the mean value for the input variables in the training set, as this is used later to calculate the Mahalanobis distance to datapoints in the test set:

mean_distr = data_train.mean(axis=0)

Using the covariance matrix and its inverse, we can calculate the Mahalanobis distance for the training data defining "normal conditions", and find the threshold value for flagging datapoints as anomalies. One can then calculate the Mahalanobis distance for the datapoints in the test set and compare that with the anomaly threshold.

dist_test = MahalanobisDist(inv_cov_matrix, mean_distr, data_test, verbose=False)
dist_train = MahalanobisDist(inv_cov_matrix, mean_distr, data_train, verbose=False)
threshold = MD_threshold(dist_train, extreme=True)

Threshold value for flagging an anomaly:
The square of the Mahalanobis distance to the centroid of the distribution should follow a χ2 distribution if the assumption of normally distributed input variables is fulfilled. This is also the assumption behind the above calculation of the "threshold value" for flagging an anomaly. As this assumption is not necessarily fulfilled in our case, it is beneficial to visualize the distribution of the Mahalanobis distance in order to set a good threshold value for flagging anomalies. Again, I refer to my previous article for a more detailed introduction to these technical aspects.
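In addition to the visual check that follows, one can quantify how well the squared training distances match a χ2 distribution with 2 degrees of freedom (one per retained principal component). A small optional sketch using SciPy, which is not among the imports above:

from scipy import stats

# Kolmogorov-Smirnov test of the squared Mahalanobis distances against a
# chi-squared distribution with 2 degrees of freedom; a very small p-value
# suggests the normality assumption is questionable, which supports choosing
# the threshold from the visualized distribution instead.
ks_stat, p_value = stats.kstest(np.square(dist_train), 'chi2', args=(2,))
print(ks_stat, p_value)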
We start by visualizing the square of the Mahalanobis distance, which should then ideally follow a χ2 distribution:

plt.figure()
sns.distplot(np.square(dist_train), bins=10, kde=False)

(Figure: square of the Mahalanobis distance)

Then visualize the Mahalanobis distance itself:

plt.figure()
sns.distplot(dist_train, bins=10, kde=True, color='green')
plt.xlim([0.0, 5])
plt.xlabel('Mahalanobis dist')

From the above distributions, the calculated threshold value of 3.8 for flagging an anomaly seems reasonable (defined as 3 standard deviations from the center of the distribution).

We can then save the Mahalanobis distance, as well as the threshold value and "anomaly flag" variable, for both train and test data in a dataframe:

anomaly_train = pd.DataFrame()
anomaly_train['Mob dist'] = dist_train
anomaly_train['Thresh'] = threshold
# If Mob dist above threshold: flag as anomaly
anomaly_train['Anomaly'] = anomaly_train['Mob dist'] > anomaly_train['Thresh']
anomaly_train.index = X_train_PCA.index

anomaly = pd.DataFrame()
anomaly['Mob dist'] = dist_test
anomaly['Thresh'] = threshold
# If Mob dist above threshold: flag as anomaly
anomaly['Anomaly'] = anomaly['Mob dist'] > anomaly['Thresh']
anomaly.index = X_test_PCA.index

anomaly.head()

(Figure: resulting dataframe for the test data)

Based on the calculated statistics, any distance above the threshold value will be flagged as an anomaly.

We can now merge the data in a single dataframe and save it as a .csv file:

anomaly_alldata = pd.concat([anomaly_train, anomaly])
anomaly_alldata.to_csv('anomaly_distance.csv')   # file name is arbitrary

Verifying the PCA model on test data:
We can now plot the calculated anomaly metric ("Mob dist") and check when it crosses the anomaly threshold (note the logarithmic y-axis):

anomaly_alldata.plot(logy=True, figsize=(10, 6), ylim=[1e-1, 1e3], color=['green', 'red'])

From the above figure, we see that the model is able to detect the anomaly approximately 3 days ahead of the actual bearing failure.
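To read this lead time off programmatically rather than from the plot, one can look up the first timestamp at which the test data crosses the threshold (a small sketch using the "anomaly" dataframe built above):

# Earliest timestamp in the test period where the metric exceeds the threshold.
# Note: this picks up the first single crossing; in practice one may want to
# require a sustained exceedance before raising an alarm.
first_flag = anomaly[anomaly['Anomaly']].index.min()
print(first_flag)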
Other approach: Autoencoder model for anomaly detection
The basic idea here is to use an autoencoder neural network to "compress" the sensor readings to a low-dimensional representation which captures the correlations and interactions between the various variables. (Essentially the same principle as the PCA model, but here we also allow for non-linearities among the input variables.)

Defining the Autoencoder network:
We use a 3-layer neural network: the first layer has 10 nodes, the middle layer has 2 nodes, and the third layer has 10 nodes. We use the mean square error as the loss function and train the model using the "Adam" optimizer.

seed(10)
set_random_seed(10)
act_func = 'elu'

# Input layer:
model = Sequential()

# First hidden layer, connected to input vector X
model.add(Dense(10, activation=act_func,
                kernel_initializer='glorot_uniform',
                kernel_regularizer=regularizers.l2(0.0),
                input_shape=(X_train.shape[1],)))

model.add(Dense(2, activation=act_func,
                kernel_initializer='glorot_uniform'))

model.add(Dense(10, activation=act_func,
                kernel_initializer='glorot_uniform'))

model.add(Dense(X_train.shape[1],
                kernel_initializer='glorot_uniform'))

model.compile(loss='mse', optimizer='adam')

# Train model for 100 epochs, batch size of 10:
NUM_EPOCHS = 100
BATCH_SIZE = 10

Fitting the model:
To keep track of the accuracy during training, we use 5% of the training data for validation after each epoch (validation_split = 0.05):

history = model.fit(np.array(X_train), np.array(X_train),
                    batch_size=BATCH_SIZE,
                    epochs=NUM_EPOCHS,
                    validation_split=0.05)

(Figure: training process)

Visualize training/validation loss:

plt.plot(history.history['loss'], 'b', label='Training loss')
plt.plot(history.history['val_loss'], label='Validation loss')
plt.legend(loc='upper right')
plt.xlabel('Epochs')
plt.ylabel('Loss, [mse]')
plt.ylim([0, .1])
plt.show()

(Figure: train/validation loss)

Distribution of the loss function in the training set:
By plotting the distribution of the calculated loss in the training set, one can use this to identify a suitable threshold value for identifying an anomaly. In doing this, one can make sure that the threshold is set above the "noise level", so that any flagged anomalies are statistically significant above the noise background.

X_pred = model.predict(np.array(X_train))
X_pred = pd.DataFrame(X_pred, columns=X_train.columns)
X_pred.index = X_train.index

scored = pd.DataFrame(index=X_train.index)
scored['Loss_mae'] = np.mean(np.abs(X_pred - X_train), axis=1)

plt.figure()
sns.distplot(scored['Loss_mae'], bins=10, kde=True, color='blue')
plt.xlim([0.0, .5])

(Figure: loss distribution, training set)
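The next step uses a fixed threshold value read off this histogram. If one prefers to derive it from the training-loss statistics instead, in the same spirit as the "3 standard deviations" rule used for the PCA model, a possible sketch is:

# One data-driven alternative to reading the threshold off the histogram:
# place it a few standard deviations above the mean training loss.
loss_threshold = scored['Loss_mae'].mean() + 3 * scored['Loss_mae'].std()
print(loss_threshold)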
From the above loss distribution, we then define a threshold value of 0.3 for flagging an anomaly. We can calculate the loss in the test set, to check when the output crosses the anomaly threshold:

X_pred = model.predict(np.array(X_test))
X_pred = pd.DataFrame(X_pred, columns=X_test.columns)
X_pred.index = X_test.index

scored = pd.DataFrame(index=X_test.index)
scored['Loss_mae'] = np.mean(np.abs(X_pred - X_test), axis=1)
scored['Threshold'] = 0.3
scored['Anomaly'] = scored['Loss_mae'] > scored['Threshold']
scored.head()

We then calculate the same metrics also for the training set, and merge all data in a single dataframe:

X_pred_train = model.predict(np.array(X_train))
X_pred_train = pd.DataFrame(X_pred_train, columns=X_train.columns)
X_pred_train.index = X_train.index

scored_train = pd.DataFrame(index=X_train.index)
scored_train['Loss_mae'] = np.mean(np.abs(X_pred_train - X_train), axis=1)
scored_train['Threshold'] = 0.3
scored_train['Anomaly'] = scored_train['Loss_mae'] > scored_train['Threshold']

scored = pd.concat([scored_train, scored])

Results from the Autoencoder model:
Having calculated the loss metric and the anomaly threshold, we can visualize the model output in the time leading up to the bearing failure:

scored.plot(logy=True, figsize=(10, 6), ylim=[1e-2, 1e2], color=['blue', 'red'])

Summary:
Both modeling approaches give similar results: they are able to flag the upcoming bearing malfunction well in advance of the actual failure. The main difference is essentially how to define a suitable threshold value for flagging anomalies, so as to avoid too many false positives during normal operating conditions.

I hope this tutorial gave you inspiration to try out these anomaly detection models yourselves. Once you have successfully set up the models, it is time to start experimenting with model parameters etc. and to test the same approach on new datasets. If you come across some interesting use cases, please let me know in the comments below. Have fun!
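Since load_model (and model_from_json) is among the imports at the top, a natural next step is to persist the trained autoencoder so it can be reused to score new sensor data, for example in a scheduled condition-monitoring job. A minimal sketch (the file name is arbitrary, and saving to HDF5 requires the h5py package):

# Save the trained autoencoder (architecture + weights + optimizer state) ...
model.save('autoencoder_bearing_model.h5')

# ... and reload it later to score new, already MinMax-scaled sensor readings
from keras.models import load_model
reloaded_model = load_model('autoencoder_bearing_model.h5')
X_new_pred = reloaded_model.predict(np.array(X_test))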
