Article-Towards A Framework To Detect Multi-Stage
Abstract: Detecting and defending against multi-stage Advanced Persistent Threat (APT) attacks is a challenge for mechanisms that are static in nature and based on blacklisting and malware signature techniques. Blacklists and malware signatures are designed to detect known attacks, but multi-stage attacks are dynamic, are conducted in parallel over several attack paths, and can run as multi-year campaigns in order to reach the desired effect.
This paper presents the design principles of a framework that models multi-stage attacks in a way that describes both the attack methods and the anticipated effects of attacks. The foundation for modeling behaviors is the combination of the Intrusion Kill-Chain attack model and defense patterns (i.e., a hypothesis-based approach of known patterns). The framework is implemented using Apache Hadoop with a logic layer that supports the evaluation of a hypothesis.

Keywords: APT; Multi-stage Attack; Hadoop; Intrusion Kill Chain

I. INTRODUCTION
Currently, cyber systems are being attacked by complex, persistent and stealthy attacks, also referred to as APTs (Advanced Persistent Threats) [1][2]. An APT usually has multiple stages [13]. At each stage the attacker gains more privileges, information and resources to penetrate deeper within the organization. Persistence means that the attacker will patiently persist for a long time in attempts to reach the desired goal. The attacker does not give up easily or deviate from the targets: the attacker has a well-defined goal and will persist until it is achieved. The attackers are presumed to be supported by organizations or nations with the capabilities and resources to support their aims.

To deal with this new scenario, the defense must adopt a proactive behavior. That is, an attack must be perceived and treated before it causes significant impact to the business. The current approach to security, with static models of risk, compliance with standards and regulations, and incident handling after impact, is no longer acceptable. Security must be dynamic, with risks assessed continuously and proactively, and with treatment actions performed before significant impacts are realized.

A current weakness in dealing with this new scenario is the difficulty of constructing, operating and maintaining an appropriate defense system. Even larger organizations with sophisticated defenses are targets of attacks. So, there is a demand for frameworks that support the implementation of effective solutions, so that even organizations with fewer resources and less knowledge can reasonably handle complex and persistent attacks. In this paper we present a research framework to handle complex attacks. The central basis of the framework consists of an Intrusion Management System and a multi-stage attack model. The multi-stage attack model is used to identify prevention and detection controls that provide logs used by the Intrusion Management System, and it is also used as a guide to log correlation activities.

In the next section we present characteristics of APTs and the difficulties of current approaches to treating them. In Section III we present the framework with the underlying models and architectural principles. In Section IV we present a process with correlation patterns to detect an APT. In Section V we present related work. Finally, in Section VI we present conclusions and suggestions for future work.

II. APT – ADVANCED PERSISTENT THREATS
A. APT Attack scenario
A complex attack can overwhelm the defenses of a system through a well-planned operation to exploit existing weaknesses. The attacker first identifies potential targets in the organization. The selected targets are increasingly no longer services or applications, as these are generally better protected and monitored. A common target is a user within the organization with closer access to the assets desired by the attacker. He or she may be the target of a focused phishing attack, may receive a gadget at a conference or exhibition, or may be convinced to bring a malicious device inside the organization. Once inside the supposedly secure network, the malware establishes a stealthy communication channel with the attacker and, by exploiting other weaknesses, the attack advances over other users and resources to achieve its final goal.
B. Explored Weaknesses
This scenario is possible because even well-designed defenses have blind spots. An anti-virus is unable to detect malware that is not registered in its signature database. An IDS (Intrusion Detection System) is only effective if an attack triggers a registered detection rule, and it frequently generates many false positives and false negatives, so it is often overlooked by security administrators (if the organization has a fully qualified one). Alerts from different security sensors are rarely correlated. Vulnerabilities that have already been fixed remain present for a long time. Vulnerable application settings are used by many users. Even users with access to sensitive assets do not have adequate training and awareness. These different vulnerabilities can be discovered by an adversary through a combination of social engineering and network reconnaissance attacks.

III. PROPOSED FRAMEWORK
A research framework is being developed to support the detection and analysis of multi-stage cyber-attacks. The framework has the following main components:
A Multi-stage Attack Model.
A Layered Security Architecture.
A Security Event Collection and Analysis System.

A. Multi-stage Attack Model
The treatment of a cyber attack requires the use of an appropriate attack model. Using an attack model, it is possible to recognize the current state of an attack and its possible future states. An attack model serves as a model of hypotheses that is used to infer possible actions of attackers. We adopted the Intrusion Kill Chain (IKC) [3] model as the central basis of our attack model. IKC is a model of seven phases that an attacker inescapably follows to plan and carry out an intrusion. The IKC phases are as follows:
Information Gathering - Selecting targets and collecting information about the target, the technologies the target uses, potential vulnerabilities, etc.
Weaponization - Developing malicious code to exploit identified vulnerabilities and coupling the developed code with unsuspicious deliverable payloads such as PDFs, DOCs, and PPTs.
Delivery - Transferring the weaponized payload to the target environment.
Exploitation - Using a vulnerability of a target system to execute malicious code.
Installation - Remote Access Trojans (RATs) are generally installed, which allows the adversary to maintain persistence in the targeted environment.
Command and Control (C2) - The adversary requires a communication channel to control its malware and continue its actions. Therefore, the malware needs to be connected to a C2 server.
Actions - The last phase of the kill chain, in which the adversary achieves its objectives by performing actions such as data exfiltration. Defenders can be confident that the adversary reaches this phase only after passing through the previous phases.

Figure 1: Intrusion Kill Chain (IKC)

To defeat more sophisticated defense systems, attackers may need to execute one or more IKCs to circumvent different defensive controls. So, an adequate representation of a complex attack is a multi-stage model, with each stage represented by an IKC divided into its seven phases.

B. Layered Security Architecture
The detection of a complex attack in its earlier stages is possible if we increase the difficulty for the attacker of accessing the valuable assets. The attacker will need to invest more resources and time to reach the targets. The likelihood that one or more sensors are activated and the attack is detected increases with the number of interactions of the attacker with the targeted system. A pattern that facilitates detection of a complex attack is to protect assets by using a layered model. The most valuable assets should be in the inner layers. The logic is to force the attacker to execute an attack with multiple stages. For each layer, the seven phases of an IKC will need to be executed at least once. So, there will be at least seven opportunities for detecting an attack on a layer.
To be effective, the layered model should meet the following requirements:
Access to a layer will only be possible through processes and applications of the immediately outer layer. The attacker will first have to gain access to the outermost layer.
To circumvent the controls and gain access to a layer, the attacker will have to execute a kill chain from the outermost layer.
The probability of finding common vulnerabilities in the controls used to defend the different layers must be very low. The idea is to minimize the reuse of knowledge about vulnerabilities of one layer to attack another layer. The defense can thus hinder the attack, forcing the adversaries to collect more information and to develop new weapons to bypass each different layer.

C. A Security Event Collection and Analysis System
Effective detection is possible only with appropriate sensors that detect different facets of an attack. One possible approach is to provide each layer with sensors to detect the different phases of an IKC. The sensors are triggered by rules established in accordance with patterns of malicious behavior. Each layer must have its own set of sensors configured to detect an IKC inside that layer. Alerts and logs collected by the sensors should be stored and correlated to identify stages and phases of attacks in progress.

The process of collection and correlation requires an infrastructure that can become difficult to operate and maintain properly. A small network (about one hundred hosts) can generate around 100 GB of logs and alarms daily [4,10]. Considering that an APT attack can last months or even years, a large organization may require a significant investment to establish a system for collecting and analyzing logs.

To meet this need, a data collection model based on Big Data technology was designed. This model was implemented using Hadoop. Apache Hadoop [5] is an open source framework that allows distributed processing of large collections of data using clusters of computers, each having local computation and storage. Hadoop provides high availability, fault tolerance and faster processing of large (structured, semi-structured or unstructured) data sets, even with cheap commodity hardware.

Our framework using Hadoop is divided into five modules, namely the Logging Module, Log Management Module, Malware Analysis Module, Intelligence Module and Control Module.

Logging Module
This module consists of sensors from the security architecture. It typically comprises HIDS (host intrusion detection systems) and NIDS (network intrusion detection systems), firewall logs, web server logs, mail server logs, etc. The rules and configuration for log generation can be set by the administrator using the Control Module. This module executes a normalization task [6] to enable uniformity in the analysis process.

Log Management Module
All the logs generated in the Logging Module are moved to this module, where they are stored and pre-processed in the Hadoop Distributed File System (HDFS) [7]. The logs are accessed using Hive queries, and a MySQL database is used for point queries on a small amount of logs.

Intelligence Module
The Intelligence Module contains the algorithms for log correlation and is responsible for the automatic IKC search based on potentially malicious events detected. Trigger events are the events on which the Intelligence Module can initiate an IKC reconstruction. Trigger events can be rule-based or a system administrator input. Generally, a trigger is a high-risk NIDS or HIDS alert. A multi-stage attack may persist for a long period of time. To enable this type of analysis, the Intelligence Module has a campaign analysis component. With campaign analysis, data from previous attacks are collected and correlated in order to identify a potential multi-stage attack.
The Intelligence Module activities are explained in Section IV.

Malware Analysis Module
The Malware Analysis Module consists of a virtualized malware analysis lab environment with detection tools. Explaining malware analysis in detail is out of the scope of this paper. The primary approaches to malware analysis are code analysis and behavioral analysis. There are several tools that help to perform such analysis of executables. The Malware Analysis Module provides a more detailed understanding of the possible actions and effects of a piece of malware.
Control Module
Using the Control Module, the administrator governs the framework. The administrator can set new rules for the Logging Module, manage the cluster of the Log Management Module, or test hypotheses with the Intelligence Module.
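
As an illustration of the kind of hypothesis an administrator might test with the Intelligence Module, the following minimal Python sketch shows a trigger-driven IKC reconstruction over normalized events. It is not the framework's implementation, which runs on Hadoop and Hive. The event schema (`Event`), the risk threshold in `is_trigger`, the time window, and the function names `reconstruct_ikc` and `missing_phases` are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from enum import IntEnum


class Phase(IntEnum):
    """The seven IKC phases, in the order given in Section III.A."""
    INFORMATION_GATHERING = 1
    WEAPONIZATION = 2
    DELIVERY = 3
    EXPLOITATION = 4
    INSTALLATION = 5
    COMMAND_AND_CONTROL = 6
    ACTIONS = 7


@dataclass(frozen=True)
class Event:
    """A normalized log record (hypothetical schema, not taken from the paper)."""
    timestamp: int   # seconds since an arbitrary epoch
    host: str        # asset the event refers to
    sensor: str      # e.g. "NIDS", "HIDS", "firewall", "mail"
    phase: Phase     # IKC phase that the sensor rule is mapped to
    risk: int        # 0 (informational) to 10 (critical); scale is assumed


def is_trigger(event, min_risk=8):
    """Rule-based trigger: a high-risk NIDS or HIDS alert (threshold assumed)."""
    return event.sensor in {"NIDS", "HIDS"} and event.risk >= min_risk


def reconstruct_ikc(trigger, events, window):
    """Group events on the trigger's host by IKC phase.

    Only events that occur at most `window` seconds before the trigger
    (the trigger itself included) and that belong to the trigger's phase
    or an earlier one are kept.
    """
    chain = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        same_host = ev.host == trigger.host
        in_window = trigger.timestamp - window <= ev.timestamp <= trigger.timestamp
        if same_host and in_window and ev.phase <= trigger.phase:
            chain.setdefault(ev.phase, []).append(ev)
    return chain


def missing_phases(chain, up_to):
    """Phases up to `up_to` without supporting events: gaps in the hypothesis."""
    return [p for p in Phase if p <= up_to and p not in chain]


if __name__ == "__main__":
    events = [
        Event(100, "ws-17", "mail", Phase.DELIVERY, 3),
        Event(160, "ws-17", "HIDS", Phase.EXPLOITATION, 6),
        Event(900, "ws-17", "NIDS", Phase.COMMAND_AND_CONTROL, 9),
        Event(950, "db-01", "firewall", Phase.DELIVERY, 2),
    ]
    for trig in filter(is_trigger, events):
        chain = reconstruct_ikc(trig, events, window=3600)
        print(trig.host, [p.name for p in chain],
              [p.name for p in missing_phases(chain, trig.phase)])
```

In this sketch a longer `window` plays the role of campaign analysis, since it pulls older events into the same hypothesis. Phases reported by `missing_phases` are not evidence against an attack: they only indicate where no sensor produced a matching event.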