
Spell: Streaming Parsing of System Event Logs

Min Du, Feifei Li


School of Computing, University of Utah
[email protected], [email protected]

Abstract—System event logs have been frequently used as a valuable resource in data-driven approaches to enhance system health and stability. A typical procedure in system log analytics is to first parse unstructured logs, and then apply data analysis on the resulting structured data. Previous work on parsing system event logs has focused on offline, batch processing of raw log files. But increasingly, applications demand online monitoring and processing. We propose an online streaming method, Spell, which utilizes a longest common subsequence based approach, to parse system event logs. We show how to dynamically extract log patterns from incoming logs and how to maintain a set of discovered message types in streaming fashion. Evaluation results on large real system logs demonstrate that even compared with the offline alternatives, Spell shows its superiority in terms of both efficiency and effectiveness.

I. INTRODUCTION

The increasing complexity of modern computer systems has become a significant limiting factor in deploying and managing them. Being alerted to, and able to mitigate, a problem right away has become a fundamental requirement in many systems. As a result, automatically detecting anomalies as they happen, in an online fashion, is an appealing solution. Data-driven methods are heavily employed to understand complex system behaviors, for example, exploring machine data for automatic pattern discovery and anomaly detection [1]. System logs, as a universal data source that contains important information such as usage patterns, execution paths, and program running status, are valuable assets in assisting these data-driven system analytics, in order to gain insights that are useful to enhance system health, stability, and usability.

The effectiveness of system log mining has been validated by recent literature. Logs can be used to detect execution anomalies [2], [3], [4], monitor network failures [5], or even find software bugs [6]. Researchers have also used system logs to discover and diagnose performance problems [7]. Recently, untangling the interleaved event logs from concurrent systems has also become a hot topic of research [8].

To alleviate the pain of diving into massive unstructured log data, in most prior work the first and foremost step is to automatically parse the unstructured system logs into structured data [2], [3], [4], [6]. There has been substantial study on how to achieve this, for example, using regular expressions [8], leveraging the source code [6], or parsing purely based on system log characteristics using data mining approaches such as clustering and iterative partitioning [2], [9], [10], [11]. Nevertheless, except for the approach that uses regular expressions, which requires domain-specific expert knowledge [8] and hence does not work for general-purpose system log parsing, or the approach that leverages the source code [12], which is often unavailable, none of the previous methods could achieve online parsing in a streaming fashion. Some work claimed "online" processing, but with the requirement of doing extensive offline processing first, and only then matching log entries against the data structures and patterns identified through the offline, batched process [13].

There is also an increasing demand to properly manage and store system logs [14]. A log management system typically has a log shipper installed on each node to forward log entries to a centralized server, which often contains a log parser, a log indexer, a storage engine, and a user interface. In such systems the default log parser only parses simple schema information such as timestamp and hostname; the log entry itself is treated as an unstructured text value. An online structured approach that could parse the event logs into structured data would make the logs much easier to query, summarize, and aggregate.

Log entries are produced by the "print" statements in a system program's source code. As such, we can view a log entry as a collection of ("message type", "parameter value") pairs. For example, a log printing statement printf("File %d finished.", id); contains a constant message type File finished and a variable parameter value, which is the file id. Hence, the goal of a structured log parser is to identify the message type File * finished, where * stands for the placeholder for variables (parameter values).

Contributions. In this paper, we propose Spell, a structured Streaming Parser for Event Logs using an LCS (longest common subsequence) based approach. Spell parses unstructured log messages into structured message types and parameters in an online streaming fashion. The time complexity to process each log entry e is close to linear (in the size of e).

With the streaming, real-time message type and parameter extraction produced by Spell, not only does it provide a concise, intuitive summary for end users, but the logs are also represented as clean structured data that can be processed and analyzed further using advanced data analytics methods by downstream analysts. Using two state-of-the-art offline methods that automatically extract message types and parameters from raw log files as the competing baseline, our study shows that even compared with the offline methods, Spell still outperforms them in terms of both efficiency and effectiveness.

The rest of this paper is organized as follows. Section II provides the problem formulation and a literature survey. Section III presents our streaming Spell algorithm and a number of optimizations. Section IV evaluates our method using large real system logs. Finally, section V concludes the paper and section VI is our acknowledgement.
II. PRELIMINARY AND BACKGROUND

A. Problem formulation

System event logs are a universal resource that exists practically in any system. We use system event logs to denote the free-text audit trace generated by the system execution (typically in the /var/log folder). A log message or a log record/entry refers to one line in the log file, which is produced by a log printing statement in the source code of a user or kernel program running on or inside the system.

Our goal is to parse each log entry e into a collection of message types (and the corresponding parameter values). Here each message type in e has a one-to-one mapping with a log printing statement in the source code producing the raw log entry e. For example, a log printing statement:

printf("Temperature %s exceeds warning threshold\n", tmp);

may produce several log entries such as:

Temperature (41C) exceeds warning threshold

where the parameter value is 41C, and the message type is:

Temperature * exceeds warning threshold.

Formally, a structured log parser is defined as follows:

Definition 1 (Structured Log Parser): Given an ordered set of log entries (ordered by timestamps), log = {e1, e2, ..., eL}, containing m distinct message types produced by m different log printing statements from p different programs, where the values of m and p (and the printing statements and the program source code) are unknown, a structured log parser is to parse log and produce all message types from those m statements.

A structured log parser is the first and foremost step for most automatic and smart log mining and data-driven log analytics solutions, and also a useful and critical step for managing logs in a log management system. Our objective is to design a streaming structured log parser such that it makes only one pass over the log and processes each log entry in an online, streaming fashion continuously. Without loss of generality, we assume that the size of each log entry is O(n) words.
B. Related work

Mining interesting patterns from raw system logs has been an active research field for over a decade. Two major efforts in this area include generating features from raw logs to apply various data analytics, e.g. [3], [4], [6], and building execution models from system logs followed by comparing them with future system executions, e.g. [2]. There are also efforts in identifying dependencies from concurrent logs [3], [4], [8].

To achieve effective data-driven log analytics, the first and foremost process is to turn unstructured logs into structured data. Xu et al. [6] used the schema from log printing statements in the original programs' source code to extract message types. In [8], the raw logs are parsed using pre-defined, domain-specific regular expressions. There are efforts to make this process more automatic and more accurate. Fu et al. [2] proposed a method to first cluster log entries using pairwise weighted edit distance, and then recursively split the clusters. IPLoM [9], [15] explored several heuristics to iteratively partition system logs, such as log size and the bipartite relationship between words in the same log message. LogTree [10] utilized the format information of raw logs and applied a tree structure to extract system events from raw logs. LogSig [11] generates system events from textual log messages by searching for the most representative message signatures. HELO [13] extracts constants and variables from message bodies, by first using an offline classification step and then performing online clustering based on the template set produced by the first step. HLAer [16] is a heterogeneous log analysis system which utilizes a hierarchical clustering approach with pairwise log similarity measures to assist log formatting. All previous structured log parsing methods focus on offline batched processing, or on matching new log entries against previously offline-extracted message types or regular expressions (e.g., from source code).

There are also commercial and open source software packages for log management and analysis. Splunk is a leading log management system that offers a suite of solutions to find useful information from machine data. Elastic Stack offers a rich set of open-sourced tools that can gather logs from distributed nodes, and then index and store them for users to query and visualize. All these tools provide interfaces to parse logs upon their arrival. However, their parsers are based on regular expressions defined by end users. The system itself can only parse very simple and basic structured schema such as timestamp and hostname, while log messages are treated as unstructured text values.

III. Spell: STREAMING STRUCTURED LOG PARSER

We now present Spell, a streaming structured log parser for system event logs. Since a basic building block for Spell is a longest common subsequence (LCS) algorithm, Spell stands for Streaming structured Parser for Event Logs using LCS. In what follows, we first review the LCS problem.

A. The LCS problem

Suppose Σ is a universe of alphabets (e.g., a-z, 0-9). Given any sequence α = {a1, a2, ..., am}, such that ai ∈ Σ for 1 ≤ i ≤ m, a subsequence of α is defined as {ax1, ax2, ..., axk}, where ∀xi, xi ∈ Z+, and 1 ≤ x1 < x2 < ··· < xk ≤ m. Let β = {b1, b2, ..., bn} be another sequence such that bj ∈ Σ for j ∈ [1, n]. A subsequence γ is called a common subsequence of α and β iff it is a subsequence of each. The longest common subsequence (LCS) problem for input sequences α and β is to find the longest such γ. For instance, the sequences {1, 3, 5, 7, 9} and {1, 5, 7, 10} yield an LCS of {1, 5, 7}.
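To make the dynamic programming formulation concrete, here is a minimal Python sketch (our illustration, not code from the paper) that computes an LCS of two token sequences; all names are ours:

    def lcs(a, b):
        # Standard O(|a|*|b|) dynamic program: dp[i][j] is the LCS
        # length of the prefixes a[:i] and b[:j].
        dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i, x in enumerate(a, 1):
            for j, y in enumerate(b, 1):
                dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
        # Backtrack through the table to recover one LCS.
        out, i, j = [], len(a), len(b)
        while i > 0 and j > 0:
            if a[i - 1] == b[j - 1]:
                out.append(a[i - 1]); i -= 1; j -= 1
            elif dp[i - 1][j] >= dp[i][j - 1]:
                i -= 1
            else:
                j -= 1
        return out[::-1]

    assert lcs([1, 3, 5, 7, 9], [1, 5, 7, 10]) == [1, 5, 7]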
We observe that an LCS-based method can be developed to efficiently and effectively extract message types from raw system logs. This is a seemingly natural idea, yet it has not been explored by existing literature. Our key observation is that, if we view the output of a log printing statement (which is a log entry) as a sequence, then in most log printing statements the constant that represents a message type takes a majority of the sequence, while the parameter values take only a small portion. If two log entries are produced by the same log printing statement stat, and differ only by having different parameter values, the LCS of the two sequences is very likely to be the constant in the code stat, implying a message type.
[Fig. 1. Basic workflow of Spell. As new log entries arrive ("Temperature (41C) exceeds warning threshold", "Temperature (43C) exceeds warning threshold", "Command has completed successfully", "no recent update", ...), the LCSMap evolves: the first entry creates an LCSObject with LCSseq "Temperature (41C) exceeds warning threshold" and lineIds {0}; the second merges it into LCSseq "Temperature * exceeds warning threshold" with lineIds {0, 1}; the third creates a new LCSObject with LCSseq "Command has completed successfully" and lineIds {2}.]
The merit of using the LCS formulation to parse system event logs, as compared with the previously mentioned clustering and iterative partitioning methods, is that the LCS of two log messages is naturally a message type, which makes streaming log parsing possible.

B. Basic notations and data structure

In a log entry e, we call each word a token. A log entry e is parsed into a set of tokens using system-defined (or user-input) delimiters, according to the format of the log. In general, common delimiters such as space and equal sign are sufficient to cover most cases. After tokenization of a log, each log entry is translated into a "token" sequence, which we will use to compute the longest common subsequence, i.e., Σ = {tokens from e1} ∪ {tokens from e2} ∪ ··· ∪ {tokens from eL}. Each log entry is assigned a unique line id, which is initialized to 0 and auto-incremented upon the arrival of a new log entry.

We create a data structure called LCSObject to hold currently parsed LCS sequences and the related metadata information. We use LCSseq to denote a sequence that is the LCS of multiple log messages, which, in our setting, is a candidate for the message type of those log entries. That said, each LCSObject contains an LCSseq and a list of line indices called lineIds that stores the line ids of the corresponding log entries that led to this LCSseq. Finally, we store all currently parsed LCSObjects in a list called LCSMap. When a new log entry ei arrives, we first compare it with all LCSseq's in the existing LCSObjects in LCSMap, then based on the results either insert the line id i into the lineIds of an existing LCSObject, or compute a new LCSObject and insert it into LCSMap.
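In code, this bookkeeping is small. A minimal Python sketch (ours; only the names LCSObject, LCSseq, lineIds, and LCSMap come from the paper):

    from dataclasses import dataclass, field

    @dataclass
    class LCSObject:
        # LCSseq: the LCS of all log entries in this group so far; in our
        # setting, the candidate message type ('*' marks parameter slots).
        lcsseq: list
        # lineIds: line ids of the log entries that led to this LCSseq.
        line_ids: list = field(default_factory=list)

    # LCSMap: the list of all currently parsed LCSObjects. When entry e_i
    # arrives, we either append i to an existing object's line_ids or
    # create a new LCSObject for it.
    lcs_map = []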

C. Basic workflow

Our algorithm runs in a streaming fashion, as shown in Figure 1. Initially, the LCSMap list is empty. When a new log entry ei arrives, it is first parsed into a token sequence si using a set of delimiters. After that, we compare si with the LCSseq's from all LCSObjects in the current LCSMap, to see if si "matches" one of the existing LCSseq's (in which case line id i is added to the lineIds of the corresponding LCSObject), or whether we need to create a new LCSObject for LCSMap.

Get new LCS. Given a new log sequence s produced by the tokenization of a new log entry e, we search through LCSMap. For the ith LCSObject, suppose its LCSseq is qi; we compute the value ℓi, which is the length of LCS(qi, s). While searching through the LCSMap, we keep the largest ℓi value and the index of the corresponding LCSObject. In the end, if ℓj = max(ℓi's) is greater than a threshold τ (by default, τ = |s|/2, where |s| denotes the length of a sequence s, i.e., the number of tokens in a log entry e), we consider the LCSseq qj and the new log sequence s to have the same message type. The intuition is that LCS(qj, s) is the maximum LCS among all LCSObjects in the LCSMap, and its length is at least half the length of s; hence, unless the total length of the parameter values in e is more than half of its size, which is very unlikely in practice, the length of LCS(qj, s) is a good indicator of whether the log entries in the jth LCSObject (which share the LCSseq qj) share the same message type with e or not (which would be LCS(qj, s)).

If there are multiple LCSObjects having the same max ℓ value, we choose the one with the smallest |qj| value, since it has a higher set similarity with s. Then we use backtracking to generate a new LCS sequence to represent the message type for all log entries in the jth LCSObject and e. Note that when using backtracking to get the new LCSseq of qj and s, we mark '*' at the places where the two sequences disagree, as the placeholders for parameters, and consecutive adjacent '*'s are merged into one '*'. For instance, consider the following two sequences: s = Command Failed on: node-127 and qj = Command Failed on: node-235 node-236; the new LCSseq of the two would be: Command Failed on: *. Once this is done, we update the LCSseq of the jth LCSObject from qj to LCS(qj, s), and add e's line id to the lineIds of the jth LCSObject.
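A compact sketch of this merge step (our reading of the description above; it assumes the lcs helper sketched in section III-A, and all names are ours):

    def merge_template(q, s):
        # Merge candidate message type q with new token sequence s: keep
        # the tokens on their LCS, mark disagreements with '*', and
        # collapse consecutive '*'s into one.
        common = lcs(q, s)
        out, i, j = [], 0, 0
        for tok in common:
            gap = False
            while q[i] != tok:       # skip tokens of q not on the LCS
                i += 1; gap = True
            while s[j] != tok:       # skip tokens of s not on the LCS
                j += 1; gap = True
            if gap and (not out or out[-1] != '*'):
                out.append('*')      # one '*' per run of disagreements
            out.append(tok)
            i += 1; j += 1
        if (i < len(q) or j < len(s)) and (not out or out[-1] != '*'):
            out.append('*')          # trailing tokens are parameters too
        return out

    print(' '.join(merge_template(
        "Command Failed on: node-235 node-236".split(),
        "Command Failed on: node-127".split())))
    # -> Command Failed on: *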
If none of the existing qi's shares an LCS with s that is at least |s|/2 in length, we create a new LCSObject for e in LCSMap, and set its LCSseq to s itself.

This completes the basic procedure of Spell, and most standard logs can be successfully parsed using this method. We next describe how to improve its efficiency.
D. Improvement on efficiency

In this section we show how to achieve nearly optimal time complexity for most incoming log entries (i.e., linear in |s|, the number of tokens of a log entry e). In our basic method, when a new log entry arrives, we need to compute the length of its LCS with each existing message type. Suppose each log entry is of size O(n) for some small constant n (i.e., n = |s|); it takes O(n²) time to compute the LCS of a log entry and an existing message type (using a standard dynamic programming (DP) formulation). Let m′ be the number of currently parsed message types in LCSMap. The method in section III-C then leads to a time complexity of O(m′ · n²) for each new log entry.

Note that since the number of possible tokens in a complex system can be large, we cannot apply techniques that compute LCS or MLCS efficiently by assuming a limited set of alphabets [17], [18], i.e., by assuming small |Σ| values.

A key observation is that, for a vast majority of new log entries (over 99.9% in our evaluation), their message types are often already present among the currently parsed message types (stored in LCSMap). Hence, instead of computing the LCS between a new log entry and each existing message type, we adopt a pre-filtering step to find out whether its message type already exists, which reduces to the following problem: for a new string σ and a set of current strings strs = {str1, str2, ..., strm}, find the longest stri such that LCS(σ, stri) = stri, and return true if |stri| ≥ ½|σ|.

In our problem setting, each string is a sequence of tokens and we simply view each token as a character.
1) Simple loop approach. A naive method is to simply loop through strs. For each stri, maintain a pointer pi pointing to the head of stri, and another pointer pt pointing to the head of σ. If the characters (or tokens in our case) pointed to by pi and pt match, advance both pointers; otherwise only advance pointer pt. When pt has reached the end of σ, check whether pi has also reached the end of stri. A pruning step can be applied: skip stri if its length is less than ½|σ|. The worst-case time complexity of this approach is O(m · n).
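A sketch of this lookup (our illustration; names are ours), using the two-pointer scan to test whether stri is a subsequence of σ:

    def matches(str_i, sigma):
        # Two-pointer scan: True iff str_i is a subsequence of sigma,
        # i.e., LCS(sigma, str_i) = str_i.
        p_i = 0
        for tok in sigma:                 # pointer p_t walks over sigma
            if p_i < len(str_i) and tok == str_i[p_i]:
                p_i += 1                  # advance p_i only on a match
        return p_i == len(str_i)

    def simple_loop(strs, sigma):
        # Scan all candidates, pruning those shorter than half of sigma;
        # return the longest message type that is a subsequence of sigma.
        best = None
        for str_i in strs:
            if 2 * len(str_i) >= len(sigma) and matches(str_i, sigma):
                if best is None or len(str_i) > len(best):
                    best = str_i
        return best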
2) Prefix tree approach. To avoid going through the entire strs set, we can index the stri's in strs using a prefix tree, and prune away many candidates.

[Fig. 2. Finding the subsequence of σ using a prefix tree: strs = {ABC, ACD, AD, EF} indexed by a tree T; for σ = ABPC, the walk matches A, B, and C, and marks P as a parameter.]

An example is shown in Figure 2, where strs = {ABC, ACD, AD, EF}, and they are indexed by a prefix tree T. Instead of checking σ against every stri in strs, we first check tree T to see if there is an existing stri that is a subsequence of σ. If such a stri is identified, we check whether |stri| ≥ ½|σ|. As shown in Figure 2, suppose σ = ABPC. Then by comparing each character of σ with the nodes of T, we can efficiently prune most branches in T, and mark the characters in σ that do not match any node in T as parameters. In this case, we identify ABC as the message type for σ, and P as its parameter.
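A minimal prefix-tree sketch (our illustration of the described lookup, not the authors' implementation; a full version would presumably also skip '*' placeholders when indexing, which we note as an assumption):

    class TrieNode:
        def __init__(self):
            self.children = {}    # token -> child TrieNode
            self.terminal = None  # message type ending at this node, if any

    def insert(root, msg_type):
        # Index a message type (a token sequence) as a path in the tree.
        node = root
        for tok in msg_type:
            node = node.children.setdefault(tok, TrieNode())
        node.terminal = msg_type

    def lookup(root, sigma):
        # Greedy walk over sigma: descend on a matching child, otherwise
        # mark the token as a parameter. Note the greedy descent may miss
        # the longest indexed subsequence (the sigma = DAPBC example
        # discussed in the text below).
        node, params, found = root, [], None
        for tok in sigma:
            child = node.children.get(tok)
            if child is not None:
                node = child
                if node.terminal is not None:
                    found = node.terminal   # deepest terminal reached so far
            else:
                params.append(tok)          # token not in the tree: a parameter
        if found is not None and 2 * len(found) >= len(sigma):
            return found, params
        return None

    root = TrieNode()
    for t in (['A','B','C'], ['A','C','D'], ['A','D'], ['E','F']):
        insert(root, t)
    print(lookup(root, list('ABPC')))   # -> (['A', 'B', 'C'], ['P'])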
For most log entries, it is highly likely that their message type already exists in tree T, so Spell will stop here, and the time complexity is only O(n). This is optimal, since we have to go through every token of a new log entry at least once.

However, this approach only guarantees to return a stri if some stri = LCS(σ, stri) exists. It does not guarantee that the returned stri is the longest one among all stri's that satisfy stri = LCS(σ, stri). For example, if σ = DAPBC while strs = {DA, ABC}, the prefix tree returns DA instead of ABC.

In practice, we find that although the prefix tree approach does not guarantee to find the longest message type, its returned message type is almost identical to the result of the simple loop method. That is because parameters in each log record tend to appear near the end. In fact, one of the state-of-the-art offline methods [2] finds message types by using weighted edit distance and assigning more weight to tokens closer to the end as parameter positions. In particular, the evaluation results show that for the Los Alamos HPC log with 433,490 log records, for each new log entry, the message type returned by the prefix tree approach (if found) is 100% equal to the result returned by the simple loop method. But there also exist cases where the message type returned by the prefix tree is shorter than half the number of tokens (½|s|) of a new log entry e, while e's message type already exists in LCSMap.

That said, the complete pre-filtering step in Spell is: for each new log entry e, first find its message type using the prefix tree, and if not found, apply the simple loop lookup. In the evaluation section we will show that Spell with the pre-filtering step produces almost equally good results for all logs, at much lower cost.

For log entries (less than 0.1% in our evaluation) that do not find message types using the pre-filtering step, we compare the new log entry e with all existing message types to see if a new message type can be generated. However, instead of computing the LCS between each message type q and e, we first compute their set similarity score using Jaccard similarity. Only for those message types that have more than half of their elements (i.e., tokens) in common with e do we compute their LCS. Then, if their LCS length exceeds ½|s|, we adjust that message type and the prefix tree T accordingly. Otherwise we simply add e to T and LCSMap as a new message type.
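A sketch of this fallback (our illustration, reusing the LCSObject and lcs sketches above; reading "more than half common elements" as a Jaccard score above 0.5 is our interpretation):

    def jaccard(a, b):
        # Jaccard similarity of the token sets of two sequences.
        sa, sb = set(a), set(b)
        return len(sa & sb) / len(sa | sb)

    def fallback(lcs_map, s):
        # Compute the (quadratic) LCS only for message types that pass
        # the cheap set-based pre-check; return the best candidate.
        best, best_len = None, 0
        for obj in lcs_map:                   # LCSObject sketch, section III-B
            if jaccard(obj.lcsseq, s) > 0.5:  # cheap Jaccard filter
                common = lcs(obj.lcsseq, s)   # lcs() sketch, section III-A
                if 2 * len(common) > len(s) and len(common) > best_len:
                    best, best_len = obj, len(common)
        return best   # None -> add s to T and LCSMap as a new message type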
E. Time complexity analysis

Spell ensures that the size of LCSMap increases by one only when a new message type is identified; otherwise a new log entry id is added to an existing LCSObject, possibly with an updated message type. This guarantees that the LCSMap size is at most the total number of message types (which is m) that could be produced by the corresponding source code, which is a constant. In section III-D, we showed that for the basic Spell, the time complexity for each new log entry is O(m · n²), since the naive DP method to compute the LCS is O(n²) for log entries of size O(n), whereas our backtracking method is often cheaper and we only run it against the one target message type in LCSMap that has the longest LCS with respect to the new log entry, and only if that length exceeds the threshold.

With the pre-filtering step, for each log entry we first try to find its message type in the prefix tree, then apply the simple loop approach, and only for the small portion that is still not located does the LCSMap need to be compared against. For L log records, suppose the number of log records that fail to find message types in the pre-filtering step is F, and the number of log records that are returned by the simple loop step is I. The amortized cost for each log record is only O(n + ((I+F)/L) · m · n + (F/L) · m · n²), where m is the number of message types and n is the log record length. In our evaluation, (I+F)/L < 0.01 and F/L < 0.001, thus the cost for each log record to find its message type in Spell is approximately only O(n) in practice.

F. Remarks

Parsing each log message to extract its message type, though a vital step for much further data analysis, is not an easy task. It should be noted that no automatic approach is perfect for all possible logs. For example, even the approach that extracts the log schema from the source code [6] that produces the corresponding log cannot achieve 100% accuracy.
IV. EVALUATION

In this section, we evaluate the efficiency and effectiveness of Spell by comparing it with two popular offline log parsing algorithms on 2 real log datasets with different formats. All experiments were executed on a Linux machine with an 8-core Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz. We will show that Spell not only is able to parse logs in an online streaming fashion, but also outperforms the competing offline methods in terms of both efficiency and effectiveness.

The two offline algorithms to be compared are IPLoM [9], [15] and a clustering-based log parser [2], which we refer to as CLP. The idea of IPLoM is to partition the entire log into multiple clusters, where each cluster represents a set of log entries printed by the same print statement. The partition is done using a simple 3-step heuristic: i) partition by log record length; ii) partition each cluster by the token position having the fewest distinct tokens; iii) partition by the bipartite mapping between tokens in each cluster. It is so far the most lightweight automatic log parsing algorithm. CLP, on the other hand, is an algorithm frequently used by multiple log mining efforts [2], [3], [4]. It also partitions the log into clusters, but by first clustering using weighted edit distance, and then repeatedly partitioning until all clusters satisfy the heuristic that each position either has the same token, or is a parameter position.
TABLE I
PARAMETERS FOR ALL THREE ALGORITHMS

  Spell:  message type threshold τ = 0.5
  CLP:    edit distance weight ν = 10; cluster threshold ς = 5;
          private contents threshold = 4
  IPLoM:  file support threshold = 0.01; partition support threshold = 0;
          lower bound = 0.1; upper bound = 0.9;
          cluster goodness threshold = 0.34

Table I shows the default values of the key parameters used for each algorithm. For parameters with recommended values that were clearly stated in the original papers, such as all parameters for IPLoM [9], we simply adopt those values. For others that were not clearly specified, we tested the corresponding method with different values until we got the best result (for the same log data) as in the original paper.

We use the supercomputer logs that were commonly used for evaluation by previous work [9], [13], [15], [16], shown in Table II (count is the total number of log entries).

TABLE II
LOG DATASETS

  Log type              Count      Message type ground truth
  Los Alamos HPC log¹   433,490    available online²
  BlueGene/L log¹       4,747,963  available online³

¹ CFDR Data, https://www.usenix.org/cfdr-data
² Los Alamos National Lab HPC Log message types, https://web.cs.dal.ca/~makanju/iplom/hpc-clusters.txt
³ BlueGene/L message types, https://web.cs.dal.ca/~makanju/iplom/bgl-clusters.txt
A. Efficiency of Spell

Figure 3 shows the total runtime of the different methods as the log size (the number of log records) grows. Note that we tested different alternatives of the Spell method:
• Spell (naive LCS): compute the LCS using DP between the new log entry and every existing message type.
• Spell: Spell with the pre-filtering step.

[Fig. 3. Efficiency comparison of different methods: runtime (seconds, logarithmic scale) vs. log size (×10⁵) on the Los Alamos HPC log (left), and runtime (seconds ×10³) vs. log size (×10⁵) on the BlueGene/L log (right), for Spell, Spell (naive LCS), CLP (auto threshold), CLP (fixed threshold), and IPLoM.]

Figure 3 left shows the results on the Los Alamos HPC log. Note that runtime is measured on a logarithmic scale. To parse the entire log with 433,490 entries, Spell with naive LCS takes about 75 seconds, while it takes only 9 seconds with pre-filtering. IPLoM shows the best efficiency, whereas Spell (with pre-filtering) is only slightly slower (within seconds). The CLP method has the worst efficiency (2-4 orders of magnitude slower than IPLoM and Spell). We tested two variants of CLP: 1) CLP (auto threshold): it automatically sets the cluster threshold ς by k-means clustering. When the log size is bigger than 100,000, it is already too slow to run to completion. 2) CLP (fixed threshold): it uses a fixed threshold 5 calculated from a smaller log file, which significantly improves the runtime. However, it is still much slower than the other methods. In later experiments we only use CLP with fixed threshold, if applicable.

Figure 3 right shows the results on the BlueGene/L log. The runtime in this figure is measured on a normal decimal scale. We did not include CLP in this experiment: even CLP with fixed threshold is too slow to finish, as the BlueGene/L log has nearly 5 million entries. Here the advantage of our pre-filtering step is clearly demonstrated. In particular, Spell with pre-filtering outperforms IPLoM in terms of efficiency. With the prefix tree, when the log size grows much faster than the number of message types, most log entries find a match in the prefix tree and return immediately. Then, for the majority of the rest, the message types can be found using the simple loop approach. Only for the small number of log records that are not matched in the pre-filtering step do we compare against each existing message type. Noticeably, the runtime of Spell (naive LCS) increases much faster than linearly: as the log grows bigger, more message types also show up, and each new log entry may need to be compared with a larger number of message types. This result clearly shows the importance of the pre-filtering step and how effectively it mitigates the efficiency issues in the basic Spell method.
The amortized cost for each log entry to find its message type using the different lookup methods in the pre-filtering step is shown in Table III (in milliseconds). Recall that for each log entry, Spell first tries to find its message type in the prefix tree, then via the simple loop, and finally uses naive LCS if it is not found in the previous two steps.

TABLE III
AMORTIZED COST OF EACH MESSAGE TYPE LOOKUP STEP IN Spell

                     Los Alamos HPC log   BlueGene/L log
  prefix tree (ms)   0.006                0.011
  simple loop (ms)   0.020                0.087
  naive LCS (ms)     0.175                0.580

TABLE IV
NUMBER (PERCENTAGE) OF LOG ENTRIES RETURNED BY EACH STEP

                Los Alamos HPC log   BlueGene/L log
  prefix tree   397,412 (91.68%)     4,457,719 (93.89%)
  simple loop   35,691 (8.23%)       288,254 (6.07%)
  naive LCS     387 (0.09%)          1,990 (0.042%)

Table IV shows the number (percentage) of log entries that are returned by each step, showing that over 91% can be processed by the prefix tree in O(n) time, and over 99.9% in total can be processed by the prefix tree and the simple loop combined. The expensive naive LCS computation is applied to less than 0.1% of log entries. Hence much overhead is avoided by the pre-filtering step. We will show next that it provides almost identical results to the costly naive LCS method.
B. Effectiveness of Spell

In this section we evaluate the effectiveness of Spell. After parsing, the log file is processed into multiple clusters, where each cluster represents one message type with the associated log records (as produced by the corresponding log parsing method). A parsed message type is considered correct if all, and only, the log records printed by that message type (as identified through the ground truth) are clustered together. We run each method, compare the results with the ground truth generated by matching each log entry with its true message type from Table II, and calculate the accuracy, which is the number of log entries that are parsed into correct message types over the total number of processed log records.
[Fig. 4. Effectiveness comparison of different methods: parsing accuracy vs. log size (×10⁵) on the Los Alamos HPC log (left) and the BlueGene/L log (right), for Spell, CLP (fixed threshold), and IPLoM.]
Figure 4 shows the comparison on the supercomputer logs. With more log entries, the number of message types also increases, and they do not necessarily show up uniformly over time. Hence, the effectiveness of a method does not necessarily show a steady trend as the log size grows. We can see that in both charts, Spell achieves much better accuracy than the other methods. IPLoM's accuracy is acceptable in Figure 4 left for the Los Alamos log, but becomes terrible in Figure 4 right for the BlueGene/L log.
Note that the pre-filtering step in Spell may miss an existing message type t for a new log entry e if LCS(t, s) ≠ t but |LCS(t, s)| > |LCS(t′, s)|, when there is another existing message type t′ that satisfies t′ = LCS(t′, s), where s is the token sequence of e. To evaluate this potential degradation of effectiveness due to the pre-filtering step, we show a comparison in Table V. The result shows that Spell with pre-filtering achieves an accuracy nearly the same as that of using only naive LCS. This means the pre-filtering step has almost no downgrading effect on the parsing results, though it greatly reduces the parsing overhead.

TABLE V
COMPARISON OF Spell WITH AND WITHOUT PRE-FILTER

  With pre-    Los Alamos HPC log          BlueGene/L log
  filtering    True message    Accuracy    True message    Accuracy
               types found                 types found
  False        55              0.822786    165             0.811798
  True         55              0.822786    164             0.811791

V. CONCLUSIONS

We present a streaming structured log parser, Spell, for parsing large system event logs in a streaming fashion. Spell works perfectly for online system log mining and monitoring. It is also a great addition to modern log management systems, providing end users a concise, real-time understanding of system states. We propose pre-filtering to improve Spell's efficiency. Experiments over real system logs have clearly demonstrated that Spell outperforms the state-of-the-art offline methods in terms of both efficiency and effectiveness.

VI. ACKNOWLEDGMENT

Min Du and Feifei Li were supported in part by grants NSF CNS-1314945 and NSF IIS-1251019. We wish to thank all members of the TCloud project and the Flux group for helpful discussion and valuable feedback.

REFERENCES

[1] M. Du and F. Li, "ATOM: Automated tracking, orchestration and monitoring of resource usage in infrastructure as a service systems," in IEEE BigData, 2015.
[2] Q. Fu, J.-G. Lou, Y. Wang, and J. Li, "Execution anomaly detection in distributed systems through unstructured log analysis," in ICDM, 2009.
[3] J.-G. Lou, Q. Fu, S. Yang, Y. Xu, and J. Li, "Mining invariants from console logs for system problem detection," in USENIX ATC, 2010.
[4] J.-G. Lou, Q. Fu, S. Yang, J. Li, and B. Wu, "Mining program workflow from interleaved traces," in SIGKDD, 2010.
[5] K. Yamanishi and Y. Maruyama, "Dynamic syslog mining for network failure monitoring," in SIGKDD, 2005.
[6] W. Xu, L. Huang, A. Fox, D. Patterson, and M. I. Jordan, "Detecting large-scale system problems by mining console logs," in SOSP, 2009.
[7] K. Nagaraj, C. Killian, and J. Neville, "Structured comparative analysis of systems logs to diagnose performance problems," in NSDI, 2012.
[8] I. Beschastnikh, Y. Brun, M. D. Ernst, and A. Krishnamurthy, "Inferring models of concurrent systems from logs of their behavior with CSight," in ICSE, 2014.
[9] A. A. Makanju, A. N. Zincir-Heywood, and E. E. Milios, "Clustering event logs using iterative partitioning," in SIGKDD, 2009.
[10] L. Tang and T. Li, "LogTree: A framework for generating system events from raw textual logs," in ICDM, 2010.
[11] L. Tang, T. Li, and C.-S. Perng, "LogSig: Generating system events from raw textual logs," in CIKM, 2011.
[12] W. Xu, L. Huang, A. Fox, D. Patterson, and M. Jordan, "Online system problem detection by mining patterns of console logs," in ICDM, 2009.
[13] A. Gainaru, F. Cappello, S. Trausan-Matu, and B. Kramer, "Event log mining tool for large scale HPC systems," in Euro-Par, 2011.
[14] Z. Cao, S. Chen, F. Li, M. Wang, and X. S. Wang, "LogKV: Exploiting key-value stores for event log processing," in CIDR, 2013.
[15] A. Makanju, A. N. Zincir-Heywood, and E. E. Milios, "A lightweight algorithm for message type extraction in system application logs," TKDE, 2012.
[16] X. Ning, G. Jiang, H. Chen, and K. Yoshihira, "HLAer: A system for heterogeneous log analysis," in SDM Workshop on Heterogeneous Learning, 2014.
[17] Y. Li, H. Li, T. Duan, S. Wang, Z. Wang, and Y. Cheng, "A real linear and parallel multiple longest common subsequences (MLCS) algorithm," in SIGKDD, 2016.
[18] Y. Li, Y. Wang, Z. Zhang, Y. Wang, D. Ma, and J. Huang, "A novel fast and memory efficient parallel MLCS algorithm for longer and large-scale sequences alignments," in ICDE, 2016.
