Anomaly-Detection 112940

Anomaly detection

Uploaded by

Assem Mahmoud
Introduction to Anomaly Detection

Linghao Chen

HOMEPAGE: https://lhchen.top
[email protected]
School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, P.R. China
Anomaly Detection

What is it?

[1]: Liu, F.T., Ting, K.M. and Zhou, Z.H., 2008, December. Isolation forest. In International Conference on Data Mining (ICDM), pp. 413-422. IEEE.
[2]: Eswaran, D., et al., 2018. Spotlight: Detecting anomalies in streaming graphs. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
Problems

Why is anomaly detection so hard?

✓ It is an unsupervised learning problem in most cases;
✓ The data is extremely imbalanced;
✓ It often involves density estimation, which requires a large number of
distance or similarity calculations and is computationally expensive;
✓ Detection often has to run in real time;
✓ The methods need to be interpretable.
Methods

Classic Methods:

➢ kNN(K-Nearest Neighbor)
➢ LOF(Local Outlier Factor)
➢ PCA(Principal Component Analysis)
➢ HBOS(Histogram-based Outlier Score)
➢ Isolation Forest
➢ AE(Auto Encoder)
kNN(K-Nearest Neighbor)

$$\mathrm{Dis}(x, y) = \left( \sum_{i=1}^{N} |x_i - y_i|^p \right)^{1/p}$$

Score each point by the distance to its K-th nearest neighbor; the points with the largest distances are the anomalies.

Simple but expensive!
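
The distance and the K-th-neighbor score above can be sketched in a few lines of plain Python. This is a minimal brute-force illustration, not the optimized algorithm from the cited paper; the function names `minkowski` and `knn_scores` and the toy data are assumptions made for this example.

```python
def minkowski(x, y, p=2):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def knn_scores(points, k, p=2):
    """Anomaly score of each point: the distance to its k-th nearest neighbor."""
    scores = []
    for i, x in enumerate(points):
        # Distances from x to every other point, sorted ascending.
        dists = sorted(minkowski(x, y, p) for j, y in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Toy data (hypothetical): four points on a unit square and one far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = knn_scores(points, k=2)
most_anomalous = scores.index(max(scores))  # index of the highest-scoring point
```

Every point is compared with every other point, so the cost grows quadratically with the number of samples, which is why the method is simple but expensive.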

[1]: Ramaswamy, S., Rastogi, R. and Shim, K., 2000, May. Efficient algorithms for mining outliers from large data sets. ACM Sigmod
Record, 29(2), pp. 427-438.
LOF(Local Outlier Factor)

k-distance of an object P: the distance $d_k(P)$ from P to its k-th nearest neighbor

[Figure: the 5-distance of an object (k = 5)]

[1]: Breunig, M.M., Kriegel, H.P., Ng, R.T. and Sander, J., 2000, May. LOF: identifying density-based local outliers. ACM Sigmod Record,
29(2), pp. 93-104.
LOF(Local Outlier Factor)

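k-distance neighborhood of an object O

$$N_k(O) = \{ P' \in D \setminus \{O\} \mid d(O, P') \le d_k(O) \}$$

Example (k = 5): $N_5(O) = \{P_1, P_2, P_3, P_4, P_5, P_6\}$ (the neighborhood can hold more than k objects when several lie at the same k-distance)

Reachability distance of an object P w.r.t. object O

$$d_k(P, O) = \max\{ d_k(O),\ d(P, O) \}$$

Local reachability density of P

$$\rho_k(P) = \frac{|N_k(P)|}{\sum_{O \in N_k(P)} d_k(P, O)}$$

Local outlier factor of P

$$LOF_k(P) = \frac{\sum_{O \in N_k(P)} \frac{\rho_k(O)}{\rho_k(P)}}{|N_k(P)|}$$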
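
The LOF quantities on this slide can be computed step by step as follows. This is a brute-force sketch in plain Python under a few assumptions: Euclidean distance, no duplicate points (duplicates would make a density infinite), and the reachability distance max{d_k(O), d(P, O)} from the cited LOF paper. The names `euclid` and `lof_scores` and the toy data are hypothetical.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def lof_scores(points, k):
    """LOF_k of every point; assumes no duplicate points."""
    n = len(points)
    dist = [[euclid(points[i], points[j]) for j in range(n)] for i in range(n)]
    # k-distance d_k(O): distance from O to its k-th nearest neighbor.
    k_dist = [sorted(dist[i][j] for j in range(n) if j != i)[k - 1]
              for i in range(n)]
    # N_k(O): every other point within the k-distance (ties can give more than k).
    neigh = [[j for j in range(n) if j != i and dist[i][j] <= k_dist[i]]
             for i in range(n)]

    def reach(p, o):
        # Reachability distance of p w.r.t. o.
        return max(k_dist[o], dist[p][o])

    # Local reachability density rho_k(P).
    lrd = [len(neigh[p]) / sum(reach(p, o) for o in neigh[p]) for p in range(n)]
    # LOF_k(P): average ratio of the neighbors' density to P's own density.
    return [sum(lrd[o] for o in neigh[p]) / (lrd[p] * len(neigh[p]))
            for p in range(n)]


# Toy data (hypothetical): a tight square of four points plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
lof = lof_scores(points, k=2)
```

Points inside a uniformly dense cluster get a score close to 1, while a point whose neighbors are much denser than itself gets a score well above 1.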

[1]: Breunig, M.M., Kriegel, H.P., Ng, R.T. and Sander, J., 2000, May. LOF: identifying density-based local outliers. ACM Sigmod Record,
29(2), pp. 93-104.
PCA(Principal Component Analysis)

Algorithm
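Input: $X \in \mathbb{R}^{n \times m}$ with $n$ samples (rows $x_i$) and $m$ features
Output: $Y = XW \in \mathbb{R}^{n \times m'}$, where the columns of $W \in \mathbb{R}^{m \times m'}$ are the top $m'$ eigenvectors of $C$

Normalization (centering): $x_i \leftarrow x_i - \frac{1}{n} \sum_{j=1}^{n} x_j$

Covariance matrix: $C = \frac{1}{n} X^T X \in \mathbb{R}^{m \times m}$

Calculate the eigenvectors of $C$
Anomaly score: the distance between a sample and the eigenvectors (principal directions); a large distance marks the sample as anomalous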
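
One way to turn the steps above into code is sketched below in plain Python. Several choices here are assumptions of the sketch, not taken from the slide: samples are stored as rows, only the first principal direction is kept (m' = 1), power iteration stands in for a full eigendecomposition, and the anomaly score is the distance from a centered sample to the line spanned by that direction. The function name `pca_scores` and the toy data are hypothetical.

```python
import math


def pca_scores(X, iters=200):
    """Distance of each centered sample to the first principal direction."""
    n, m = len(X), len(X[0])
    # Centering: subtract the per-feature mean from every sample.
    mean = [sum(row[j] for row in X) / n for j in range(m)]
    Xc = [[row[j] - mean[j] for j in range(m)] for row in X]
    # Covariance matrix (m x m) of the centered data.
    C = [[sum(r[a] * r[b] for r in Xc) / n for b in range(m)] for a in range(m)]
    # Power iteration: converges to the eigenvector with the largest eigenvalue
    # (assumes C is non-zero and the start vector is not orthogonal to it).
    w = [1.0] * m
    for _ in range(iters):
        w = [sum(C[a][b] * w[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = []
    for r in Xc:
        y = sum(a * b for a, b in zip(r, w))       # coordinate along w
        resid = [a - y * b for a, b in zip(r, w)]  # part not explained by w
        scores.append(math.sqrt(sum(v * v for v in resid)))
    return scores


# Toy data (hypothetical): five points on the line y = x and one point off it.
X = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (5.0, 5.0), (3.0, 6.0)]
scores = pca_scores(X)
most_anomalous = scores.index(max(scores))
```

Other scoring rules based on the principal components are possible; the residual distance is used here only because it is short to write down.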

[1]: Shyu, Mei-Ling, et al. A novel anomaly detection scheme based on principal component classifier. MIAMI UNIV CORAL GABLES
FL DEPT OF ELECTRICAL AND COMPUTER ENGINEERING, 2003.
HBOS(Histogram-based Outlier Score)

Methods

[Figure: histogram with the low density area marked]

[1]: Goldstein, M. and Dengel, A., 2012. Histogram-based outlier score (hbos): A fast unsupervised anomaly detection algorithm. In KI-2012:
Poster and Demo Track, pp.59-63.
HBOS(Histogram-based Outlier Score)

Assumption
The dimensions (features) of the data are assumed to be independent of each other.

Algorithm
➢ Build a histogram of the data for each dimension
➢ Divide the value range into K buckets of equal width (the width can also be
chosen dynamically), and use the frequency of the values falling into each
bucket as an estimate of density.

Anomaly Score
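$$HBOS(p) = \sum_{i=0}^{d} \log\left( \frac{1}{hist_i(p)} \right)$$

where the sum runs over the dimensions $i$ and $hist_i(p)$ is the height of the bucket that contains $p$ in dimension $i$.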
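
The histogram construction and the score above can be sketched as follows in plain Python. The sketch assumes equal-width buckets and normalizes each histogram so that its tallest bucket has height 1, so that every dimension carries the same weight; the function name `hbos_scores`, the parameter `n_bins`, and the toy data are hypothetical.

```python
import math


def hbos_scores(X, n_bins=5):
    """Sum over dimensions of log(1 / normalized bucket height)."""
    n, d = len(X), len(X[0])
    scores = [0.0] * n
    for j in range(d):
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        # Equal-width buckets; fall back to width 1 for a constant column.
        width = (hi - lo) / n_bins or 1.0
        # Bucket index of every value (the maximum goes into the last bucket).
        idx = [min(int((v - lo) / width), n_bins - 1) for v in col]
        counts = [0] * n_bins
        for b in idx:
            counts[b] += 1
        peak = max(counts)
        for i, b in enumerate(idx):
            # Normalized height is counts[b] / peak, so log(1 / height)
            # equals log(peak / counts[b]).
            scores[i] += math.log(peak / counts[b])
    return scores


# Toy data (hypothetical): five points near (1, 1) and one point at (9, 9).
X = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (1.2, 1.0), (9.0, 9.0)]
scores = hbos_scores(X)
```

Each dimension is processed independently with a single pass over the data, which is what makes the method fast compared with the distance-based approaches.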
[1]: Goldstein, M. and Dengel, A., 2012. Histogram-based outlier score (hbos): A fast unsupervised anomaly detection algorithm. In KI-2012:
Poster and Demo Track, pp.59-63.
AE(Auto Encoder)

[Figure label: Latent Representation]

[1]: Ramaswamy, S., Rastogi, R. and Shim, K., 2000, May. Efficient algorithms for mining outliers from large data sets. ACM Sigmod
Record, 29(2), pp. 427-438.
Isolation Forest

ICDM '08

[1]: Liu, F.T., Ting, K.M. and Zhou, Z.H., 2008, December. Isolation forest. In International Conference on Data Mining (ICDM), pp. 413-
422. IEEE
Isolation Forest

Anomaly Detection

[1]: Liu, F.T., Ting, K.M. and Zhou, Z.H., 2008, December. Isolation forest. In International Conference on Data Mining (ICDM), pp. 413-
422. IEEE
REFERENCE

[1]: Liu, F.T., Ting, K.M. and Zhou, Z.H., 2008, December. Isolation forest. In International Conference on Data Mining (ICDM), pp. 413-422. IEEE.
[2]: Eswaran, D., et al., 2018. Spotlight: Detecting anomalies in streaming graphs. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
[3]: Ramaswamy, S., Rastogi, R. and Shim, K., 2000, May. Efficient algorithms for mining outliers from large data sets. ACM Sigmod Record, 29(2), pp. 427-438.
[4]: Breunig, M.M., Kriegel, H.P., Ng, R.T. and Sander, J., 2000, May. LOF: identifying density-based local outliers. ACM Sigmod Record, 29(2), pp. 93-104.
[5]: Shyu, M.L., et al., 2003. A novel anomaly detection scheme based on principal component classifier. Miami Univ Coral Gables FL Dept of Electrical and Computer Engineering.
[6]: Goldstein, M. and Dengel, A., 2012. Histogram-based outlier score (HBOS): A fast unsupervised anomaly detection algorithm. In KI-2012: Poster and Demo Track, pp. 59-63.
[7]: Ramaswamy, S., Rastogi, R. and Shim, K., 2000, May. Efficient algorithms for mining outliers from large data sets. ACM Sigmod Record, 29(2), pp. 427-438.
[8]: Liu, F.T., Ting, K.M. and Zhou, Z.H., 2008, December. Isolation forest. In International Conference on Data Mining (ICDM), pp. 413-422. IEEE.
Q&A
