The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams

Siddharth Bhatia,1 Bryan Hooi,1 Minji Yoon,2 Kijung Shin,3 Christos Faloutsos2
1 National University of Singapore, 2 Carnegie Mellon University, 3 KAIST
{siddharth, bhooi}@comp.nus.edu.sg, {minjiy, christos}@cs.cmu.edu, [email protected]

Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? Existing approaches aim to detect individually surprising edges. In this work, we propose MIDAS, which focuses on detecting microcluster anomalies, or suddenly arriving groups of suspiciously similar edges, such as lockstep behavior, including denial of service attacks in network traffic data. MIDAS has the following properties: (a) it detects microcluster anomalies while providing theoretical guarantees about its false positive probability; (b) it is online, thus processing each edge in constant time and constant memory, and also processes the data 108-505 times faster than state-of-the-art approaches; (c) it provides 46%-52% higher accuracy (in terms of AUC) than state-of-the-art approaches.

Introduction

Anomaly detection in graphs is a critical problem for finding suspicious behavior in innumerable systems, such as intrusion detection, fake ratings, and financial fraud. This is a well-researched problem, with the majority of the proposed approaches (Akoglu, McGlohon, and Faloutsos 2010; Chakrabarti 2004; Hooi et al. 2017; Jiang et al. 2016; Kleinberg 1999; Shin, Eliassi-Rad, and Faloutsos 2018) focusing on static graphs. However, many real-world graphs are dynamic in nature, and methods based on static connections may miss temporal characteristics of the graphs and anomalies.

Among the methods focusing on dynamic graphs, most aggregate edges into graph snapshots (Eswaran et al. 2018; Sun, Tao, and Faloutsos 2006; Sun et al. 2007; Koutra, Vogelstein, and Faloutsos 2013; Sricharan and Das 2014; Gupta et al. 2012). However, to minimize the effect of malicious activities and start recovery as soon as possible, we need to detect anomalies in real-time or near real-time, i.e. to identify whether an incoming edge is anomalous or not as soon as we receive it. In addition, since the number of vertices can increase as we process the stream of edges, we need an algorithm whose memory usage is constant in the graph size.

Moreover, fraudulent or anomalous events in many applications occur in microclusters, or suddenly arriving groups of suspiciously similar edges, e.g. denial of service attacks in network traffic data and lockstep behavior. However, existing methods which process edge streams in an online manner, including (Eswaran and Faloutsos 2018; Ranshous et al. 2016), aim to detect individually surprising edges, not microclusters, and can thus miss large amounts of suspicious activity.

In this work, we propose MIDAS, which detects microcluster anomalies, or suddenly arriving groups of suspiciously similar edges, in edge streams, using constant time and memory. In addition, by using a principled hypothesis testing framework, MIDAS provides theoretical bounds on the false positive probability, which these methods do not provide.

Our main contributions are as follows:
1. Streaming Microcluster Detection: We propose a novel streaming approach for detecting microcluster anomalies, requiring constant time and memory.
2. Theoretical Guarantees: In Theorem 1, we show guarantees on the false positive probability of MIDAS.
3. Effectiveness: Our experimental results show that MIDAS outperforms baseline approaches by 46%-52% accuracy (in terms of AUC), and processes the data 108-505 times faster than baseline approaches.

Reproducibility: Our code and datasets are publicly available at https://github.com/bhatiasiddharth/MIDAS.

Related Work

In this section, we review previous approaches to detect anomalous signs on static and dynamic graphs. See (Akoglu, Tong, and Koutra 2015) for an extensive survey on graph-based anomaly detection.

Anomaly detection in static graphs can be classified by which anomalous entities (nodes, edges, subgraphs, etc.) are spotted.
• Anomalous node detection: (Akoglu, McGlohon, and Faloutsos 2010) extracts egonet-based features and finds empirical patterns with respect to the features. Then, it identifies nodes whose egonets deviate from the patterns, including the count of triangles, total weight, and principal eigenvalues. (Jiang et al. 2016) computes node features, including degree and authoritativeness (Kleinberg 1999), then spots nodes whose neighbors are notably close in the feature space.
• Anomalous subgraph detection: (Hooi et al. 2017) and (Shin, Eliassi-Rad, and Faloutsos 2018) measure the anomalousness of nodes and edges, detecting a dense subgraph consisting of many anomalous nodes and edges.
• Anomalous edge detection: (Chakrabarti 2004) encodes an input graph based on similar connectivity among nodes, then spots edges whose removal reduces the total encoding cost significantly. (Tong and Lin 2011) factorize the adjacency matrix and flag edges with high reconstruction error as outliers.

Anomaly detection in graph streams uses as input a series of graph snapshots over time. We categorize these methods similarly, according to the type of anomaly detected:
• Anomalous node detection: (Sun, Tao, and Faloutsos 2006) approximates the adjacency matrix of the current snapshot based on incremental matrix factorization, then spots nodes corresponding to rows with high reconstruction error.
• Anomalous subgraph detection: Given a graph with timestamps on edges, (Beutel et al. 2013) spots near-bipartite cores where each node is connected to others in the same core densely within a short time. (Jiang et al. 2016) detects groups of nodes who form dense subgraphs in a temporally synchronized manner.
• Anomalous event detection: (Eswaran et al. 2018) detects sudden appearance of many unexpected edges, and (Yoon et al. 2019) spots sudden changes in 1st and 2nd derivatives of PageRank.

Anomaly detection in edge streams uses as input a stream of edges over time. We categorize these methods according to the type of anomaly detected:
• Anomalous node detection: Given an edge stream, (Yu et al. 2013) detects nodes whose egonets suddenly and significantly change.
• Anomalous subgraph detection: Given an edge stream, (Shin et al. 2017) identifies dense subtensors created within a short time.
• Anomalous edge detection: (Ranshous et al. 2016) focuses on sparsely-connected parts of a graph, while (Eswaran and Faloutsos 2018) identifies edge anomalies based on edge occurrence, preferential attachment, and mutual neighbors.

Only the two methods in the last category are applicable to our task, as they operate on edge streams and output a score per edge. However, as shown in Table 1, neither method aims to detect microclusters, or provides guarantees on false positive probability.

Table 1: Comparison of relevant edge stream anomaly detection approaches. Columns: SedanSpot (2018), RHSS (2016), MIDAS. Rows: Microcluster Detection, Guarantee on False Positive Probability, Constant Memory, Constant Update Time.

Problem

Let $E = \{e_1, e_2, \cdots\}$ be a stream of edges from a time-evolving graph $G$. Each arriving edge is a tuple $e_i = (u_i, v_i, t_i)$ consisting of a source node $u_i \in V$, a destination node $v_i \in V$, and a time of occurrence $t_i$, which is the time at which the edge was added to the graph. For example, in a network traffic stream, an edge $e_i$ could represent a connection made from a source IP address $u_i$ to a destination IP address $v_i$ at time $t_i$. We do not assume that the set of vertices $V$ is known a priori: for example, new IP addresses or user IDs may be created over the course of the stream.

We model $G$ as a directed graph. Undirected graphs can simply be handled by treating an incoming undirected edge $e_i = (u_i, v_i, t_i)$ as two simultaneous directed edges, one in either direction.

We also allow $G$ to be a multigraph: edges can be created multiple times between the same pair of nodes. Edges are allowed to arrive simultaneously, i.e. $t_{i+1} \geq t_i$, since in many applications the $t_i$ are given in the form of discrete time ticks.

The desired properties of our algorithm are as follows:
• Microcluster Detection: It should detect suddenly appearing bursts of activity which share many repeated nodes or edges, which we refer to as microclusters.
• Guarantees on False Positive Probability: Given any user-specified probability level $\epsilon$ (e.g. 1%), the algorithm should be adjustable so as to provide a false positive probability of at most $\epsilon$ (e.g. by adjusting a threshold that depends on $\epsilon$). Moreover, while guarantees on the false positive probability rely on assumptions about the data distribution, we aim to make our assumptions as weak as possible.
• Constant Memory and Update Time: For scalability in the streaming setting, the algorithm should run in constant memory and constant update time per newly arriving edge. Thus, its memory usage and update time should not grow with the length of the stream, or the number of nodes in the graph.
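The edge-stream model of the Problem section can be made concrete with a small sketch. The names `Edge` and `directed_stream` and the example IP addresses are illustrative assumptions, not part of the paper; the code only encodes the three stated conventions: edges are (source, destination, tick) tuples, ticks never decrease, and an undirected edge is expanded into two simultaneous directed edges.

```python
from typing import Iterable, Iterator, NamedTuple


class Edge(NamedTuple):
    src: str  # source node u_i (e.g. a source IP address)
    dst: str  # destination node v_i
    t: int    # discrete time tick t_i


def directed_stream(edges: Iterable[Edge], undirected: bool = False) -> Iterator[Edge]:
    """Yield directed edges in arrival order, checking that ticks never decrease."""
    last_t = None
    for e in edges:
        if last_t is not None and e.t < last_t:
            raise ValueError(f"time ticks must be non-decreasing, got {e.t} after {last_t}")
        last_t = e.t
        yield e
        if undirected:
            # An undirected edge is treated as two simultaneous directed edges.
            yield Edge(e.dst, e.src, e.t)


# Hypothetical stream: a repeated edge (multigraph) followed by a new source node.
stream = [
    Edge("10.0.0.1", "10.0.0.9", 1),
    Edge("10.0.0.1", "10.0.0.9", 1),
    Edge("10.0.0.2", "10.0.0.9", 2),
]
expanded = list(directed_stream(stream, undirected=True))
```

Because the generator never stores the node set, new nodes can appear at any point in the stream, which matches the assumption that V is not known a priori.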
Proposed Algorithm

Overview

Next, we describe our MIDAS and MIDAS-R approaches. The following provides an overview:
1. Streaming Hypothesis Testing Approach: We describe our MIDAS algorithm, which uses streaming data structures within a hypothesis testing-based framework, allowing us to obtain guarantees on false positive probability.
2. Detection and Guarantees: We describe our decision procedure for determining whether a point is anomalous, and our guarantees on false positive probability.
3. Incorporating Relations: We extend our approach to the MIDAS-R algorithm, which incorporates relationships between edges temporally and spatially.1

1 We use 'spatially' in a graph sense, i.e. connecting nearby nodes, not to refer to any other continuous spatial dimension.

MIDAS: Streaming Hypothesis Testing Approach

Figure 1: Time series of a single source-destination pair (u, v), with a large burst of activity at time tick 10. (Axes: time tick, 1 to 10; occurrences of edge (u, v), 0 to 1000.)

Consider the example in Figure 1 of a single source-destination pair (u, v), which shows a large burst of activity at time 10. This burst is the simplest example of a microcluster, as it consists of a large group of edges which are very similar to one another (in fact identical), both spatially (i.e. in terms of the nodes they connect) and temporally.

Streaming Data Structures: In an offline setting, there are many time-series methods which could detect such bursts of activity. However, in an online setting, recall that we want memory usage to be bounded, so we cannot keep track of even a single such time series. Moreover, there are many such source-destination pairs, and the set of sources and destinations is not fixed a priori.

To circumvent these problems, we maintain two types of Count-Min-Sketch (CMS) (Cormode and Muthukrishnan 2005) data structures. Assume we are at a particular fixed time tick $t$ in the stream; we treat time as a discrete variable for simplicity. Let $s_{uv}$ be the total number of edges from $u$ to $v$ up to the current time. Then, we use a single CMS data structure to approximately maintain all such counts $s_{uv}$ (for all edges $uv$) in constant memory: at any time, we can query the data structure to obtain an approximate count $\hat{s}_{uv}$.

Secondly, let $a_{uv}$ be the number of edges from $u$ to $v$ in the current time tick (but not including past time ticks). We keep track of $a_{uv}$ using a similar CMS data structure, the only difference being that we reset this CMS data structure every time we transition to the next time tick. Hence, this CMS data structure provides approximate counts $\hat{a}_{uv}$ for the number of edges from $u$ to $v$ in the current time tick $t$.

Hypothesis Testing Framework: Given approximate counts $\hat{s}_{uv}$ and $\hat{a}_{uv}$, how can we detect microclusters? Moreover, how can we do this in a principled framework that allows for theoretical guarantees?

Fix a particular source and destination pair of nodes, (u, v), as in Figure 1. One approach would be to assume that the time series in Figure 1 follows a particular generative model: for example, a Gaussian distribution. We could then find the mean and standard deviation of this Gaussian distribution. Then, at time $t$, we could compute the Gaussian likelihood of the number of edge occurrences in the current time tick, and declare an anomaly if this likelihood is below a specified threshold.

However, this requires a restrictive Gaussian assumption, which can lead to excessive false positives or negatives if the data follows a very different distribution. Instead, we use a weaker assumption: that the mean level (i.e. the average rate at which edges appear) in the current time tick (e.g. $t = 10$) is the same as the mean level before the current time tick ($t < 10$). Note that this avoids assuming any particular distribution for each time tick, and also avoids a strict assumption of stationarity over time.

Hence, we can divide the past edges into two classes: the current time tick ($t = 10$) and all past time ticks ($t < 10$). Recalling our previous notation, the number of events at $t = 10$ is $a_{uv}$, while the number of edges in past time ticks ($t < 10$) is $s_{uv} - a_{uv}$.

Under the chi-squared goodness-of-fit test, the chi-squared statistic is defined as the sum over categories of $\frac{(\text{observed} - \text{expected})^2}{\text{expected}}$. In this case, our categories are $t = 10$ and $t < 10$. Under our mean level assumption, since we have $s_{uv}$ total edges (for this source-destination pair), the expected number at $t = 10$ is $\frac{s_{uv}}{t}$, and the expected number for $t < 10$ is the remainder, i.e. $\frac{t-1}{t} s_{uv}$. Thus the chi-squared statistic is:

$$
\begin{aligned}
X^2 &= \frac{(\text{observed}_{(t=10)} - \text{expected}_{(t=10)})^2}{\text{expected}_{(t=10)}} + \frac{(\text{observed}_{(t<10)} - \text{expected}_{(t<10)})^2}{\text{expected}_{(t<10)}} \\
&= \frac{\left(a_{uv} - \frac{s_{uv}}{t}\right)^2}{\frac{s_{uv}}{t}} + \frac{\left((s_{uv} - a_{uv}) - \frac{t-1}{t} s_{uv}\right)^2}{\frac{t-1}{t} s_{uv}} \\
&= \frac{\left(a_{uv} - \frac{s_{uv}}{t}\right)^2}{\frac{s_{uv}}{t}} + \frac{\left(a_{uv} - \frac{s_{uv}}{t}\right)^2}{\frac{t-1}{t} s_{uv}} \\
&= \left(a_{uv} - \frac{s_{uv}}{t}\right)^2 \frac{t^2}{s_{uv}(t-1)}
\end{aligned}
$$
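As a numerical sanity check on the simplification above, the following sketch evaluates both the two-category sum and the closed form, and compares the result with a chi-squared critical value. The counts are hypothetical values loosely modelled on Figure 1 (a burst of 1000 edges at tick 10 after 90 earlier edges), and the 0.99 quantile is an arbitrary illustrative choice, not the threshold used by the paper's decision procedure. The quantile helper relies on the fact that the square of a standard normal variable is chi-squared with 1 degree of freedom.

```python
from statistics import NormalDist


def chi2_two_category(a_uv: float, s_uv: float, t: int) -> float:
    """Sum of (observed - expected)^2 / expected over 'current tick' and 'past ticks'."""
    expected_now = s_uv / t
    expected_past = (t - 1) / t * s_uv
    observed_now = a_uv
    observed_past = s_uv - a_uv
    return ((observed_now - expected_now) ** 2 / expected_now
            + (observed_past - expected_past) ** 2 / expected_past)


def chi2_closed_form(a_uv: float, s_uv: float, t: int) -> float:
    """Simplified statistic (a_uv - s_uv/t)^2 * t^2 / (s_uv * (t - 1))."""
    return (a_uv - s_uv / t) ** 2 * t ** 2 / (s_uv * (t - 1))


def chi2_quantile_1dof(p: float) -> float:
    """p-quantile of a chi-squared variable with 1 degree of freedom, via Z^2 with Z ~ N(0, 1)."""
    return NormalDist().inv_cdf((1 + p) / 2) ** 2


# Hypothetical counts: 1000 edges in the current tick, 1090 in total, at tick t = 10.
a_uv, s_uv, t = 1000, 1090, 10
long_form = chi2_two_category(a_uv, s_uv, t)
short_form = chi2_closed_form(a_uv, s_uv, t)
critical = chi2_quantile_1dof(0.99)
is_burst = short_form > critical
```

The two forms agree up to floating-point rounding, and for a burst of this size the statistic exceeds the critical value by several orders of magnitude.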

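A minimal Count-Min Sketch in the spirit of the streaming data structures described above might look as follows. This is a sketch under stated assumptions, not the authors' C++ implementation: the class name, the salted use of Python's built-in `hash`, and the `clear` method are illustrative choices. The sizing helper uses the standard CMS width of ⌈e/ν⌉ buckets from the Count-Min Sketch literature, which agrees with the 2719 buckets reported for ν = 0.001 in the Experimental Setup.

```python
import math
import random


class CountMinSketch:
    """Fixed-size table of counters: num_hashes rows, num_buckets counters per row."""

    def __init__(self, num_hashes: int = 2, num_buckets: int = 2719, seed: int = 0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        # One random salt per row emulates independent hash functions (an assumption).
        self.salts = [rng.getrandbits(64) for _ in range(num_hashes)]
        self.rows = [[0.0] * num_buckets for _ in range(num_hashes)]

    def _bucket(self, salt: int, key) -> int:
        return hash((salt, key)) % self.num_buckets

    def add(self, key, amount: float = 1.0) -> None:
        for salt, row in zip(self.salts, self.rows):
            row[self._bucket(salt, key)] += amount

    def query(self, key) -> float:
        # Collisions only ever add to a counter, so the minimum never underestimates.
        return min(row[self._bucket(salt, key)] for salt, row in zip(self.salts, self.rows))

    def clear(self) -> None:
        # Used for the current-tick sketch when moving to the next time tick.
        for row in self.rows:
            for i in range(len(row)):
                row[i] = 0.0


def buckets_for_error(nu: float) -> int:
    """Standard CMS width for additive error nu * N: ceil(e / nu)."""
    return math.ceil(math.e / nu)


total = CountMinSketch(num_buckets=buckets_for_error(0.001))
for _ in range(3):
    total.add(("u", "v"))
total.add(("x", "y"))
estimate = total.query(("u", "v"))  # at least 3; larger only if a collision occurred
```

Memory is fixed at `num_hashes * num_buckets` counters regardless of how many distinct edges or nodes appear, which is what allows the total and current-tick counts to be kept in constant memory.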
Note that both $a_{uv}$ and $s_{uv}$ can be estimated by our CMS data structures, obtaining approximations $\hat{a}_{uv}$ and $\hat{s}_{uv}$ respectively. This leads to the following anomaly score, with which we can evaluate a newly arriving edge with source-destination pair (u, v):

Definition 1 (Anomaly Score). Given a newly arriving edge (u, v, t), our anomaly score is computed as:

$$\text{score}((u, v, t)) = \left(\hat{a}_{uv} - \frac{\hat{s}_{uv}}{t}\right)^2 \frac{t^2}{\hat{s}_{uv}(t-1)} \tag{1}$$

Algorithm 1 summarizes our MIDAS algorithm.

Algorithm 1: MIDAS: Streaming Anomaly Scoring
Input: Stream of graph edges over time
Output: Anomaly scores per edge
Initialize CMS data structures:
    Initialize CMS for total count $s_{uv}$ and current count $a_{uv}$
while new edge e = (u, v, t) is received do
    Update Counts:
        Update CMS data structures for the new edge uv
    Query Counts:
        Retrieve updated counts $\hat{s}_{uv}$ and $\hat{a}_{uv}$
    Anomaly Score:
        output $\text{score}((u, v, t)) = \left(\hat{a}_{uv} - \frac{\hat{s}_{uv}}{t}\right)^2 \frac{t^2}{\hat{s}_{uv}(t-1)}$

Detection and Guarantees

While Algorithm 1 computes an anomaly score for each edge, it does not provide a binary decision for whether an edge is anomalous or not. We want a decision procedure that provides binary decisions and a guarantee on the false positive probability: i.e. given a user-defined threshold $\epsilon$, the probability of a false positive should be at most $\epsilon$. Intuitively, the key idea is to combine the approximation guarantees of CMS data structures with properties of a chi-squared random variable.

The key property of CMS data structures we use is that given any $\epsilon$ and $\nu$, for appropriately chosen CMS data structure sizes, with probability at least $1 - \frac{\epsilon}{2}$, the estimates $\hat{a}_{uv}$ satisfy:

$$\hat{a}_{uv} \leq a_{uv} + \nu \cdot N_t \tag{2}$$

where $N_t$ is the total number of edges at time $t$. Since CMS data structures can only overestimate the true counts, we additionally have

$$s_{uv} \leq \hat{s}_{uv} \tag{3}$$

Define an adjusted version of our earlier count estimate:

$$\tilde{a}_{uv} = \hat{a}_{uv} - \nu N_t \tag{4}$$

To obtain its probabilistic guarantee, our decision procedure computes $\tilde{a}_{uv}$, and uses it to compute an adjusted version of our earlier statistic:

$$\tilde{X}^2 = \left(\tilde{a}_{uv} - \frac{\hat{s}_{uv}}{t}\right)^2 \frac{t^2}{\hat{s}_{uv}(t-1)} \tag{5}$$

Then our main guarantee is as follows:

Theorem 1 (False Positive Probability Bound). Let $\chi^2_{1-\epsilon/2}(1)$ be the $1 - \epsilon/2$ quantile of a chi-squared random variable with 1 degree of freedom. Then:

$$P\left(\tilde{X}^2 > \chi^2_{1-\epsilon/2}(1)\right) < \epsilon \tag{6}$$

In other words, using $\tilde{X}^2$ as our test statistic and threshold $\chi^2_{1-\epsilon/2}(1)$ results in a false positive probability of at most $\epsilon$.

Proof. Recall that

$$X^2 = \left(a_{uv} - \frac{s_{uv}}{t}\right)^2 \frac{t^2}{s_{uv}(t-1)} \tag{7}$$

was defined so that it has a chi-squared distribution. Thus:

$$P\left(X^2 \leq \chi^2_{1-\epsilon/2}(1)\right) \geq 1 - \epsilon/2 \tag{8}$$

At the same time, by the CMS guarantees we have:

$$P\left(\hat{a}_{uv} \leq a_{uv} + \nu \cdot N_t\right) \geq 1 - \epsilon/2 \tag{9}$$

By the union bound, with probability at least $1 - \epsilon$, both events (8) and (9) hold, in which case:

$$
\begin{aligned}
\tilde{X}^2 &= \left(\tilde{a}_{uv} - \frac{\hat{s}_{uv}}{t}\right)^2 \frac{t^2}{\hat{s}_{uv}(t-1)} \\
&= \left(\hat{a}_{uv} - \nu \cdot N_t - \frac{\hat{s}_{uv}}{t}\right)^2 \frac{t^2}{\hat{s}_{uv}(t-1)} \\
&\leq \left(a_{uv} - \frac{s_{uv}}{t}\right)^2 \frac{t^2}{s_{uv}(t-1)} \\
&= X^2 \leq \chi^2_{1-\epsilon/2}(1)
\end{aligned}
$$

Finally, we conclude that

$$P\left(\tilde{X}^2 > \chi^2_{1-\epsilon/2}(1)\right) < \epsilon. \tag{10}$$

Incorporating Relations

In this section, we describe our MIDAS-R approach, which considers edges in a relational manner: that is, it aims to group together edges which are nearby, either temporally or spatially.

Temporal Relations: Rather than just counting edges in the same time tick (as we do in MIDAS), we want to allow for some temporal flexibility: i.e. edges in the recent past should also count toward the current time tick, but with a reduced weight. A simple and efficient way to do this using our CMS data structures is as follows: at the end of every time tick, rather than resetting the CMS data structure for $a_{uv}$, we reduce all its counts by a fixed fraction $\alpha \in (0, 1)$. This allows past edges to count toward the current time tick, with a diminishing weight.
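The temporal-decay idea can be sketched on top of the anomaly score of Definition 1. Several choices here are assumptions made for illustration: exact dictionaries stand in for the two CMS data structures, "reducing by a fixed fraction α" is read as multiplying every current count by α once per elapsed tick, time ticks are taken to start at 1, and the score is defined as 0 where the formula would divide by zero. Setting `alpha=0.0` recovers the reset behaviour of plain MIDAS.

```python
from collections import defaultdict


def midas_score(a_hat: float, s_hat: float, t: int) -> float:
    """Anomaly score of Definition 1; returns 0 where the formula is undefined."""
    if s_hat == 0 or t <= 1:
        return 0.0
    return (a_hat - s_hat / t) ** 2 * t ** 2 / (s_hat * (t - 1))


class DecayedEdgeScorer:
    """Per-edge scoring with temporal decay; dicts stand in for the CMS structures."""

    def __init__(self, alpha: float = 0.5):
        self.alpha = alpha
        self.total = defaultdict(float)    # stand-in for the sketch of s_uv
        self.current = defaultdict(float)  # stand-in for the sketch of a_uv
        self.tick = 1

    def update(self, u: str, v: str, t: int) -> float:
        if t > self.tick:
            # Decay instead of resetting: old edges keep a diminishing weight.
            factor = self.alpha ** (t - self.tick)
            for key in self.current:
                self.current[key] *= factor
            self.tick = t
        self.total[(u, v)] += 1
        self.current[(u, v)] += 1
        return midas_score(self.current[(u, v)], self.total[(u, v)], t)


scorer = DecayedEdgeScorer(alpha=0.5)
scores = [scorer.update("u", "v", 1) for _ in range(4)]  # four edges in tick 1
scores.append(scorer.update("u", "v", 2))                # one more edge in tick 2
```

In this toy run the current count at tick 2 is 4 × 0.5 + 1 = 3 against a total of 5, so the last score is (3 − 2.5)² · 4 / 5 = 0.2; with a hard reset the current count would be 1 instead.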
Spatial Relations: We would like to catch large groups of spatially nearby edges: e.g. a single source IP address suddenly creating a large number of edges to many destinations, or a small group of nodes suddenly creating an abnormally large number of edges between them. A simple intuition we use is that in either of these two cases, we expect to observe nodes with a sudden appearance of a large number of edges. Hence, we can use CMS data structures to keep track of edge counts like before, except counting all edges adjacent to any node $u$. Specifically, we create CMS counters $\hat{a}_u$ and $\hat{s}_u$ to approximate the current and total edge counts adjacent to node $u$. Given each incoming edge (u, v), we can then compute three anomalousness scores: one for edge (u, v), as in our previous algorithm; one for node $u$; and one for node $v$. Finally, we combine the three scores by taking their maximum value. Another possibility for aggregating the three scores is to take their sum. Algorithm 2 summarizes the resulting MIDAS-R algorithm.

Algorithm 2: MIDAS-R: Incorporating Relations
Input: Stream of graph edges over time
Output: Anomaly scores per edge
Initialize CMS data structures:
    Initialize CMS for total count $s_{uv}$ and current count $a_{uv}$
    Initialize CMS for total count $s_u$ and current count $a_u$
while new edge e = (u, v, t) is received do
    Update Counts:
        Update CMS data structures for the new edge uv
    Query Counts:
        Retrieve updated counts $\hat{s}_{uv}$ and $\hat{a}_{uv}$
        Retrieve updated counts $\hat{s}_u$, $\hat{s}_v$, $\hat{a}_u$, $\hat{a}_v$
    Compute Edge Scores:
        $\text{score}(u, v, t) = \left(\hat{a}_{uv} - \frac{\hat{s}_{uv}}{t}\right)^2 \frac{t^2}{\hat{s}_{uv}(t-1)}$
    Compute Node Scores:
        $\text{score}(u, t) = \left(\hat{a}_u - \frac{\hat{s}_u}{t}\right)^2 \frac{t^2}{\hat{s}_u(t-1)}$
        $\text{score}(v, t) = \left(\hat{a}_v - \frac{\hat{s}_v}{t}\right)^2 \frac{t^2}{\hat{s}_v(t-1)}$
    Final Node Scores:
        output $\max\{\text{score}(u, v, t), \text{score}(u, t), \text{score}(v, t)\}$

Time and Memory Complexity

In terms of memory, both MIDAS and MIDAS-R only need to maintain the CMS data structures over time, which require $O(wb)$ memory, where $w$ and $b$ are the number of hash functions and the number of buckets in the CMS data structures; this is bounded with respect to the data size.

For time complexity, the only relevant steps in Algorithms 1 and 2 are those that either update or query the CMS data structures, which take $O(w)$ (all other operations run in constant time). Thus, time complexity per update step is $O(w)$.

Experiments

In this section, we evaluate the performance of MIDAS and MIDAS-R compared to SedanSpot on dynamic graphs. We aim to answer the following questions:

Q1. Accuracy: How accurately does MIDAS detect real-world anomalies compared to baselines, as evaluated using the ground truth labels?
Q2. Scalability: How does it scale with input stream length? How does the time needed to process each input compare to baseline approaches?
Q3. Real-World Effectiveness: Does it detect meaningful anomalies in case studies on Twitter graphs?

Datasets: DARPA (Lippmann et al. 1999) has 4.5M IP-IP communications between 9.4K source IPs and 2.3K destination IPs over 87.7K minutes. Each communication is a directed edge (srcIP, dstIP, timestamp, attack) where the ground truth attack label indicates whether the communication is an attack or not (anomalies are 23.8% of total).

TwitterSecurity (Rayana and Akoglu 2015; 2016) has 2.6M tweet samples for four months (May-Aug 2014) containing Department of Homeland Security keywords related to terrorism or domestic security. Entity-entity co-mention temporal graphs are built on a daily basis (80 time ticks).

TwitterWorldCup (Rayana and Akoglu 2015; 2016) has 1.7M tweet samples for the World Cup 2014 season (June 12-July 13). The tweets are filtered by popular/official World Cup hashtags, such as #worldcup, #fifa, #brazil, etc. Similar to TwitterSecurity, entity-entity co-mention temporal graphs are constructed at a 5-minute sample rate (8640 time points).

Baseline: As described in our Related Work, only RHSS and SedanSpot operate on edge streams and provide a score for each edge. SedanSpot uses personalised PageRank to detect anomalies in sublinear space and constant time per edge. RHSS, however, was evaluated in (Eswaran and Faloutsos 2018) on the DARPA dataset and found to have an AUC of 0.17 (lower than chance). Hence, we only compare with SedanSpot.

Evaluation Metrics: All the methods output an anomaly score per edge (higher is more anomalous). We calculate the True Positive Rate (TPR) and False Positive Rate (FPR) and plot the ROC curve (TPR vs FPR). We also report the Area under the ROC curve (AUC) and Average Precision Score.

Experimental Setup

All experiments are carried out on a 2.7GHz Intel Core i5 processor, 16GB RAM, running OS X 10.14.6. We implement MIDAS and MIDAS-R in C++. We use 2 hash functions for the CMS data structures, and we set the number of CMS buckets to 2719 to result in an approximation error of $\nu = 0.001$. For MIDAS-R, we set the temporal decay factor $\alpha$ to 0.5. We used an open-sourced implementation of SedanSpot, provided by the authors, following parameter settings as suggested in the original paper (sample size 500).

Q1. Accuracy

Figure 2 plots the ROC curve for MIDAS-R, MIDAS and SedanSpot. Figure 3 (top) plots accuracy (AUC) vs. running time (log scale, in seconds, excluding I/O). We see that MIDAS achieves a much higher accuracy (= 0.94) compared to the baseline (= 0.64), while also running significantly faster (0.31s vs. 156s). This is a 46% accuracy improvement at 505× faster speed. MIDAS-R achieves the highest accuracy (= 0.977), which is a 52% accuracy improvement compared to the baseline at 163× faster speed.

Figure 3 (bottom) plots the average precision score vs. running time. We see that MIDAS is more precise (= 0.969) compared to the baseline (= 0.751). This is a 29% precision improvement. MIDAS-R achieves the highest average precision score (= 0.987), which is 31% more precise than SedanSpot.

We see that MIDAS and MIDAS-R greatly outperform SedanSpot on both accuracy and precision metrics.

Figure 2: ROC for DARPA dataset

Figure 3: (top) Accuracy (AUC) vs time, (bottom) Average Precision Score vs time

Q2. Scalability

Figure 4 shows the scalability of MIDAS and MIDAS-R. We plot the wall-clock time needed to run on the (chronologically) first $2^{12}, 2^{13}, 2^{14}, \ldots, 2^{22}$ edges of the DARPA dataset. This confirms the linear scalability of MIDAS and MIDAS-R with respect to the number of edges in the input dynamic graph, due to their constant processing time per edge. Note that both MIDAS and MIDAS-R process 4M edges within 1 second, allowing real-time anomaly detection.

Figure 4: MIDAS and MIDAS-R scale linearly with the number of edges in the input dynamic graph. (Axes: number of edges vs. elapsed time in seconds, both on log scale; series: MIDAS-R, MIDAS, and a reference line of slope 1.)

Figure 5 plots the number of edges (in millions) and time to process each edge for the DARPA dataset. MIDAS processes 2.38M edges within 1μs each and 1.77M edges within 2μs each. MIDAS-R processes 1.04M edges within 1μs each and 2.24M edges within 2μs each.

Table 2 shows the time it takes SedanSpot, MIDAS and MIDAS-R to run on the TwitterWorldCup, TwitterSecurity and DARPA datasets. For the TwitterWorldCup dataset, we see that MIDAS-R is 108× faster than SedanSpot (0.48s vs. 52.17s) and MIDAS is 306× faster than SedanSpot (0.17s vs. 52.17s). For the TwitterSecurity dataset, we see that MIDAS-R is 112× faster than SedanSpot (0.68s vs. 76.6s) and MIDAS is 283× faster than SedanSpot (0.27s vs. 76.6s). For the DARPA dataset, we see that MIDAS-R is 163× faster than SedanSpot (0.96s vs. 156s) and MIDAS is 505× faster than SedanSpot (0.31s vs. 156s).

SedanSpot requires several subprocesses (hashing, random-walking, reordering, sampling, etc.), resulting in the large computation time. MIDAS and MIDAS-R are both scalable and fast.

Table 2: Running time for different datasets in seconds

                   SedanSpot   MIDAS    MIDAS-R
TwitterWorldCup    52.17s      0.17s    0.48s
TwitterSecurity    76.60s      0.27s    0.68s
DARPA              156.66s     0.31s    0.96s
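The speedup factors quoted for Table 2 can be reproduced directly from its running times. The short sketch below does this arithmetic; the dictionary layout and the helper name `speedup` are illustrative, and the observation that the reported factors match the ratios truncated to whole numbers (rather than rounded) is an inference from the figures, not something the paper states.

```python
# Running times in seconds, copied from Table 2.
running_time_s = {
    "TwitterWorldCup": {"SedanSpot": 52.17, "MIDAS": 0.17, "MIDAS-R": 0.48},
    "TwitterSecurity": {"SedanSpot": 76.60, "MIDAS": 0.27, "MIDAS-R": 0.68},
    "DARPA": {"SedanSpot": 156.66, "MIDAS": 0.31, "MIDAS-R": 0.96},
}


def speedup(dataset: str, method: str) -> float:
    """Ratio of the SedanSpot running time to the given method's running time."""
    row = running_time_s[dataset]
    return row["SedanSpot"] / row[method]


# Truncate each ratio to a whole number, as the quoted factors appear to be.
speedups = {
    (dataset, method): int(speedup(dataset, method))
    for dataset in running_time_s
    for method in ("MIDAS", "MIDAS-R")
}
slowest_gain = min(speedups.values())
largest_gain = max(speedups.values())
```

The smallest and largest truncated ratios correspond to the "108-505 times faster" range quoted in the abstract and contributions.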
Figure 5: Distribution of processing times for ∼4.5M edges of DARPA dataset. (Axes: time per edge in microseconds, bins 1, 2, 10, >10; frequency in millions of edges; series: MIDAS-R, MIDAS.)

Q3. Real-World Effectiveness

We measure anomaly scores using MIDAS, MIDAS-R and SedanSpot on the TwitterSecurity dataset. Figure 6 plots anomaly scores vs. day (during the four months of 2014). To visualise, we aggregate edges occurring in each day by taking the max anomalousness score per day, for a total of 90 days. Anomalies correspond to major world news such as the Mpeketoni attack (Event 6) or the Soma Mine explosion (Event 1). MIDAS and MIDAS-R show similar trends, whereas SedanSpot misses some anomalous events (Events 2, 7), and outputs many high scores unrelated to any true events. This is also reflected in the low accuracy and precision of SedanSpot in Figure 3. The anomalies detected by MIDAS and MIDAS-R coincide with major events in the TwitterSecurity timeline as follows:

1. 13-05-2014. Turkey Mine Accident, Hundreds Dead.
2. 24-05-2014. Raid.
3. 30-05-2014. Attack/Ambush.
   03-06-2014. Suicide bombing.
4. 09-06-2014. Suicide/Truck bombings.
5. 10-06-2014. Iraqi Militants Seized Large Regions.
   11-06-2014. Kidnapping.
6. 15-06-2014. Attack.
7. 26-06-2014. Suicide Bombing/Shootout/Raid.
8. 03-07-2014. Israel Conflicts with Hamas in Gaza.
9. 18-07-2014. Airplane with 298 Onboard was Shot Down over Ukraine.
10. 30-07-2014. Ebola Virus Outbreak.

This shows the effectiveness of MIDAS and MIDAS-R for catching real-world anomalies.

Figure 6: Anomalies detected by MIDAS and MIDAS-R correspond to major security-related events in TwitterSecurity.

Microcluster anomalies: Figure 7 corresponds to Event 7 in the TwitterSecurity dataset. In the figure, each single edge represents 444 edges and each double edge represents 888 edges between the nodes. This suddenly arriving (within 1 day) group of suspiciously similar edges is an example of a microcluster anomaly which MIDAS-R detects, but SedanSpot misses.

Figure 7: Microcluster Anomaly in TwitterSecurity

Conclusion

In this paper, we proposed MIDAS and MIDAS-R for microcluster based detection of anomalies in edge streams. Future work could consider more general types of data, including heterogeneous graphs or tensors. Our contributions are as follows:

1. Streaming Microcluster Detection: We propose a novel streaming approach for detecting microcluster anomalies, requiring constant time and memory.
2. Theoretical Guarantees: In Theorem 1, we show guarantees on the false positive probability of MIDAS.
3. Effectiveness: Our experimental results show that M IDAS Rayana, S., and Akoglu, L. 2015. Less is more: Building se-
outperforms baseline approaches by 46%-52% accuracy lective anomaly ensembles with application to event detec-
(in terms of AUC), and processes the data 108−505 times tion in temporal graphs. In Proceedings of the 2015 SIAM
faster than baseline approaches. International Conference on Data Mining, 622–630. SIAM.
Rayana, S., and Akoglu, L. 2016. Less is more: Building
Acknowledgments selective anomaly ensembles. ACM Transactions on Knowl-
This work was supported in part by NUS ODPRT Grant R- edge Discovery from Data (TKDD) 10(4):42.
252-000-A81-133. Shin, K.; Hooi, B.; Kim, J.; and Faloutsos, C. 2017. Denseal-
ert: Incremental dense-subtensor detection in tensor streams. KDD.

References

Akoglu, L.; McGlohon, M.; and Faloutsos, C. 2010. Oddball: Spotting anomalies in weighted graphs. In PAKDD.
Akoglu, L.; Tong, H.; and Koutra, D. 2015. Graph based anomaly detection and description: a survey. Data Mining and Knowledge Discovery 29(3):626–688.
Beutel, A.; Xu, W.; Guruswami, V.; Palow, C.; and Faloutsos, C. 2013. Copycatch: stopping group attacks by spotting lockstep behavior in social networks. In WWW.
Chakrabarti, D. 2004. Autopart: Parameter-free graph partitioning and outlier detection. In PKDD.
Cormode, G., and Muthukrishnan, S. 2005. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms 55(1):58–75.
Eswaran, D., and Faloutsos, C. 2018. Sedanspot: Detecting anomalies in edge streams. In 2018 IEEE International Conference on Data Mining (ICDM), 953–958. IEEE.
Eswaran, D.; Faloutsos, C.; Guha, S.; and Mishra, N. 2018. Spotlight: Detecting anomalies in streaming graphs. In KDD.
Gupta, M.; Gao, J.; Sun, Y.; and Han, J. 2012. Integrating community matching and outlier detection for mining evolutionary community outliers. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’12, 859–867. New York, NY, USA: ACM.
Hooi, B.; Shin, K.; Song, H. A.; Beutel, A.; Shah, N.; and Faloutsos, C. 2017. Graph-based fraud detection in the face of camouflage. TKDD 11(4):44.
Jiang, M.; Cui, P.; Beutel, A.; Faloutsos, C.; and Yang, S. 2016. Catching synchronized behaviors in large networks: A graph mining approach. TKDD 10(4):35.
Kleinberg, J. M. 1999. Authoritative sources in a hyperlinked environment. JACM 46(5):604–632.
Koutra, D.; Vogelstein, J. T.; and Faloutsos, C. 2013. Deltacon: A principled massive-graph similarity function. arXiv preprint arXiv:1304.4657.
Lippmann, R.; Cunningham, R. K.; Fried, D. J.; Graf, I.; Kendall, K. R.; Webster, S. E.; and Zissman, M. A. 1999. Results of the DARPA 1998 offline intrusion detection evaluation. In Recent advances in intrusion detection, volume 99, 829–835.
Ranshous, S.; Harenberg, S.; Sharma, K.; and Samatova, N. F. 2016. A scalable approach for outlier detection in edge streams using sketch-based approximations. In Proceedings of the 2016 SIAM International Conference on Data Mining, 189–197. SIAM.
Shin, K.; Eliassi-Rad, T.; and Faloutsos, C. 2018. Patterns and anomalies in k-cores of real-world graphs with applications. KAIS 54(3):677–710.
Sricharan, K., and Das, K. 2014. Localizing anomalous changes in time-evolving graphs. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD ’14, 1347–1358. New York, NY, USA: ACM.
Sun, J.; Faloutsos, C.; Papadimitriou, S.; and Yu, P. 2007. Graphscope: parameter-free mining of large time-evolving graphs. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 687–696.
Sun, J.; Tao, D.; and Faloutsos, C. 2006. Beyond streams and graphs: dynamic tensor analysis. In KDD.
Tong, H., and Lin, C.-Y. 2011. Non-negative residual matrix factorization with application to graph anomaly detection. In SDM.
Yoon, M.; Hooi, B.; Shin, K.; and Faloutsos, C. 2019. Fast and accurate anomaly detection in dynamic graphs with a two-pronged approach. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 647–657. ACM.
Yu, W.; Aggarwal, C. C.; Ma, S.; and Wang, H. 2013. On anomalous hotspot discovery in graph streams. In ICDM.


