Analytics in Incident Management: A Clustering Approach
Rahul Pant Kalyana Chakravarthy Bedhu
getting insight like major areas of concern and the trend of tickets over time.
The current method of identifying major areas of concern is based on human judgement, which is prone to error as well as bias. The incidents are analyzed manually by a team every month for reporting purposes and to derive insights. Certain features of each incident, such as Location, Type of Incident, Reporter and Date/Time, are usually added for reporting purposes, but the analysis does not utilize the unstructured text which contains the issue details.

Looking at the text manually, the first-level support assigns the ticket to a specific queue for action. In most scenarios it is the skill and experience of the individual or team which ensures that a ticket is acted on in time or assigned to the right queue. The focus to date has been on improving the skill of individuals and teams rather than on understanding what clusters of issues are being seen, so that appropriate focus can be put on certain areas. In summary, a reactive approach was in place, in which an issue is solved as soon as it arrives, instead of a proactive one, i.e. understanding which area/system/service should be improved to reduce the tickets.
With a data-driven focus within the organization, and also due to cost optimization constraints, the management is asking the right questions about improvement scope both in terms of processes and people. Their primary ask is:
1. Can we identify major groups of issues which need management focus?
In addition to the incident information and site information above, the authors were given additional "master data" sets related to tools and systems available within their organization. This dataset was not exhaustive and had to be modified and appended during the project lifecycle, based on business input at different stages of analysis.
The authors were advised by Subject Matter Experts (SME) from business to ignore the 'Product Reported'
4. The abbreviations which were used were mainly related to system names, and hence the rare occurrence of terms like RBS did not necessitate another metadata creation for our analysis.
Our conclusion from the qualitative analysis was that the incident text looks rich enough to deep dive and use existing clustering methods to find similarities between tickets. As a
first step of understanding the data we looked at the word frequency cloud, as shown below.

Figure I: Text as Wordcloud

"Backup" looked like a clear issue; however, after SME discussion it was understood that all global systems where a ticket is raised by support always mention whether we have a "Backup Available" or not. Thus we excluded this, and other similar inputs provided by SMEs, from our data.
Other understandings from looking at word clouds and similar visualizations were:
• Months are also being mentioned within tickets, maybe by automated alerting tickets
• Telecom company names, our customers, are also mentioned in the tickets, and it was decided to extract such information too, like Airtel, MTN etc.
• Lemmatization and stemming might help, as we see terms like (available, availability), (perform, performance), (alert, alerting, alerts)
• City names are also frequent, and identifying the city and country information may provide further insights
We decided to extract the following information from each ticket text wherever possible:
• City Names, Country Names
• SAP System, non-SAP System and Tool Names
• Company Names
This information should help in better cluster creation and in exploring the created clusters later from different dimensions. Some information required matching with available metadata (like Company Names), creating or improving metadata (like City, Country) and using patterns to extract data (like System Names).

A. Data Preparation for Clustering
The following steps were performed to prepare the data for clustering.
• All IPs and commonly known extensions (.in, .com, .companyname.se) were removed beforehand, as most tools or systems will have that extension
• Keyword replacement was done. A few examples: replaced City and Country with keyword 'Location', Tools and system names with 'InternalSystem', Company names with keyword 'Ourcustomer'.
• All identified sites were replaced by keyword 'Location' too.
• All special characters were removed
• SME-provided similar words were combined, e.g. (Collab, Collaboration, Collaborative), (connect, connectivity, connection).
• Careful removal of stopwords was done, as some stopwords were valid and important names within the organization, like 'ONE'.
• Removed mentions of months in the data
• Lowercased all words and removed the numeric values
• The first word was usually found to be significant for understanding the ticket, and it was extracted as another feature for analysis purposes

At this point we had the textual data available in a clean format to initiate clustering. We referred to the literature around Text Clustering [1] and papers related to text clustering in different domains [6],[3]. We also referred to multiple applications of hierarchical clustering in text [2][5]. After careful reading of the papers, and keeping simplicity of implementation in mind, we restricted ourselves to Partitional (K-Means) and Hierarchical Clustering options. The limitation of K-Means, although it is much quicker, is its requirement to know the number of clusters beforehand. However, we did not have an idea about the expected clusters.

B. Modeling Technique
Hierarchical Clustering was preferred over partitioning methods, as document hierarchical clustering builds a tree of clusters. It can further be classified into agglomerative and divisive approaches, which work in a bottom-up and top-down fashion, respectively. In the authors' case, they went with the agglomerative clustering approach, which iteratively merges the two most similar clusters.
As a first step, a term frequency-inverse document frequency (TF-IDF) matrix for the prepared incident text was created. This step prepares the text as numerical input for relative comparison. Multiple parameter tuning steps and iterations were carried out (in Python sklearn) and we finally went with the configurable items as shown in Table I:

TABLE I. Parameters for TFIDF Matrix Creation in Python
Parameter                      Value
Token Frequency Upper Limit    90%
Token Frequency Lower Limit    0.25%
Inverse Document Frequency     Used
Tokenization                   User Defined Function
N-Gram                         1

We then computed and stored the cosine similarity of the documents in the above matrix. There were multiple distance measurement options for hierarchical clustering, like single linkage, complete linkage, Ward linkage etc. Ward's method, or the minimal increase of sum-of-squares method, defines proximity between two clusters as the magnitude by which the summed square in their joint cluster will be greater than the combined summed square in these two clusters. The clusters were found to be relatively homogeneous when using the Ward distance criterion. The last step in the clustering process was creating a dendrogram and finding the best number of clusters and sub-clusters.
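
The TF-IDF and Ward steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the ticket texts, the `tokenize` function and the choice of three clusters are hypothetical, and the vectorizer arguments are only our reading of Table I (90% upper limit, 0.25% lower limit, IDF used, unigrams). The sketch also applies Ward linkage directly to the TF-IDF rows instead of to a stored cosine similarity matrix; since `TfidfVectorizer` L2-normalizes each row by default, the Euclidean distance between two rows is a monotonic function of their cosine distance.

```python
import re

from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical, already cleaned ticket texts (not taken from the paper's data).
tickets = [
    "internalsystem access slow",
    "access slow again today",
    "connect timeout at location",
    "connect timeout for ourcustomer",
    "disk full alert",
    "disk full alert repeated",
]


def tokenize(text):
    """Hypothetical stand-in for the user-defined tokenizer: alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


# Assumed mapping of Table I onto TfidfVectorizer arguments.
vectorizer = TfidfVectorizer(
    max_df=0.90,         # drop tokens present in more than 90% of tickets
    min_df=0.0025,       # drop tokens present in fewer than 0.25% of tickets
    use_idf=True,        # inverse document frequency weighting
    tokenizer=tokenize,  # user-defined tokenization function
    token_pattern=None,  # unused when a tokenizer is supplied
    ngram_range=(1, 1),  # unigrams only
)
tfidf = vectorizer.fit_transform(tickets)

# Agglomerative clustering with Ward linkage on the dense TF-IDF rows.
Z = linkage(tfidf.toarray(), method="ward")

# Cut the tree into at most three clusters (an arbitrary choice for this toy data).
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels)
```

On a real ticket set the number of clusters would not be fixed in advance; it would be chosen after inspecting the tree built in `Z`.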
While doing the clustering, we also tested the results from K-means to understand the types of clusters getting created. The results gave credence to our theory that there are sub-clusters within major clusters, and that considering them as separate entities/groups would not be correct. An example of such groups is [Cluster 1: Access Slow, Cluster 2: Connect Timeout, Cluster 3: Connect Failed]. The issues in these 3 clusters belong to the same major category, Connection Issues. Hence the hierarchical clustering approach was deemed fit for the problem at hand.
Visual inspection of the dendrogram and quality analysis of the clusters and sub-clusters for homogeneity were done to identify the best 2 cut points for creating the clusters and sub-clusters.
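
The idea of two cut points on one tree can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the 2-D points stand in for ticket vectors, the group names in the comments are borrowed from the example above, and the two cut heights (5.0 and 1.0) are hypothetical values that would in practice come from inspecting the dendrogram.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical 2-D stand-ins for ticket vectors.
points = np.array([
    [0.0, 0.0], [0.1, 0.0],      # e.g. "Access Slow"
    [1.0, 0.0], [1.1, 0.0],      # e.g. "Connect Timeout"
    [0.5, 1.0], [0.6, 1.0],      # e.g. "Connect Failed"
    [10.0, 10.0], [10.1, 10.0],  # an unrelated sub-group
    [12.0, 10.0], [12.1, 10.0],  # another unrelated sub-group
])

Z = linkage(points, method="ward")

# Two cuts of the same tree: a high cut for major clusters, a low cut for sub-clusters.
major = fcluster(Z, t=5.0, criterion="distance")
sub = fcluster(Z, t=1.0, criterion="distance")

# Because both labelings come from one tree, each sub-cluster sits inside one major cluster.
sub_to_major = {}
for s, m in zip(sub, major):
    sub_to_major.setdefault(int(s), set()).add(int(m))

print(sub_to_major)
```

Cutting one hierarchy twice keeps the sub-clusters nested inside the major clusters, which separate K-means runs with different numbers of clusters would not guarantee.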
VI. ACKNOWLEDGEMENTS
The authors would like to extend thanks to all SMEs within the IT Operations team for their data and business knowledge, in particular Mr. Kunjan Sharma and Mr. Akhilesh Kr. Sinha. Without their support and inputs this use-case would never have started off, let alone been completed.
VII. REFERENCES
[3] Zainol, Zuraini & Marzukhi, Syahaneim & Nohuddin, Puteri &
Noormanshah, Wan & Zakaria, Omar. (2017). Document Clustering
in Military Explicit Knowledge: A Study on Peacekeeping
Documents. 175-184. 10.1007/978-3-319-70010-6_17.
[4] Reddy, Srikanth & Kinnicutt, Patrick & Lee, Roger (2016). Text
Document Clustering: The Application of Cluster Analysis to Textual
Document. 10.1109/CSCI.2016.0222
[6] Intel White Paper: Reducing Client Incidents through Big Data
Predictive Analytics