An Approach to Understand the End User Behaviour through Log Analysis
International Journal of Computer Applications (0975 – 8887)
Volume 5– No.11, August 2010
audit logs that track user authentication attempts and security device logs that record possible attacks.

With the worldwide deployment of network servers, workstations, and other computing devices, the number of threats against networks and systems has greatly increased, and so have the number, volume, and variety of computer security logs. This growth has made computer security log management a necessity. Log management is essential to ensure that computer security records are stored in sufficient detail for an appropriate period of time. Log management is the process for generating, transmitting, storing, analyzing, and disposing of computer security log data. The fundamental problem with log management is effectively balancing a limited quantity of log management resources with a continuous supply of log data. Log generation and storage can be complicated by several factors, including a high number of log sources; inconsistent log content, formats, and timestamps among sources; and increasingly large volumes of log data [5, 6]. Log management also involves protecting the confidentiality, integrity, and availability of logs. Another problem with log management is ensuring that security, system, and network administrators regularly perform effective analysis of log data.

3. TROUBLES IN LOG MANAGEMENT
In an organization, many operating systems, security software, and other applications generate and maintain their own log files. This complicates log management in the following ways [5, 7].

3.1 Multiple Log Sources
Logs are located on many hosts throughout the organization, so log management must be conducted throughout the organization. In addition, a single log source can generate multiple logs; for example, an application may store authentication attempts in one log and network activity in another.

3.2 Heterogeneous Log Content
Each log file captures certain pieces of information in each entry, such as client and server IP addresses, ports, and date and time. For efficiency, log sources often record only the pieces of information that they consider most important. This makes it difficult to relate event records from different log sources, because they may not share any common attribute (e.g., source 1 records the source IP address but not the username, and source 2 records the username but not the source IP address). Even the representation of a logged value varies with the log source; these differences may be slight, such as one date being in YYYYMMDD format and another being in MMDDYYYY format, or they may be much more complex.

3.3 Inconsistent Timestamps
Usually, every application that generates logs uses local timestamps, i.e., the timestamps of the host's internal clock. If the host's clock is unsynchronized or inaccurate, log file analysis becomes more difficult, especially when the environment has multiple hosts. For example, timestamps may indicate that event "X" happened 2 minutes after event "Y", whereas event "X" actually happened 55 seconds before event "Y".

3.4 Multiple Log Formats
Each application that creates logs may use its own format, e.g., XML format, SNMP format, comma-separated or tab-separated format, or binary formats. Some logs are designed to be human-readable, whereas others are not; some use standard formats, whereas others use proprietary formats. Some logs are not stored on a single host but are transmitted to other hosts for processing; a common example is SNMP traps.

3.5 Log Confidentiality and Integrity
Protecting log records to maintain their integrity and confidentiality is both essential and challenging. For example, logs might intentionally or unintentionally capture sensitive information such as users' passwords and the content of e-mails. This raises security and privacy concerns involving both the individuals who examine the logs and others who might be able to access the logs through authorized or unauthorized means. Logs that are secured improperly in storage or in transit might also be susceptible to intentional or inadvertent alteration and destruction. This could cause a variety of impacts, including allowing malicious activities to go unnoticed and manipulating evidence to conceal the identity of a malicious party.

Protecting the availability of logs is also a major issue. Many logs have a size limit; when this limit is reached, the log might overwrite old data with new data or stop logging altogether, both of which would cause a loss of log data availability. To meet data retention requirements, it is necessary to establish log archival, i.e., keeping copies of log files for a longer period of time than the original log sources can support. Because of the volume of logs, it might be appropriate in some cases to reduce the logs by filtering out log entries that do not need to be archived. The confidentiality and integrity of the archived logs also need to be protected.

4. ROLE OF EVENT LOG DATA IN EVIDENCE GATHERING
Logs are composed of log entries; each entry contains information related to a specific event that has occurred within a system or network. If a suspicious end user exploits a web form as an access point for input attacks such as cross-site scripting, SQL injection, or buffer overflow on a web application, the attack may be detected using the log file [5]. An interesting question is why event data should be logged on a given system. Essentially, there are four categories of reasons.

4.1 Accountability
Log file data can be used to identify which types of accounts are associated with certain events, and that information can be used to highlight where training and/or disciplinary actions are needed.

4.2 Rebuilding
What was happening before and during an event can be reviewed chronologically by using log file data. For this, clocks should be regularly synchronized to a central source so that date/time stamps are consistent.

4.3 Intrusion Detection
Log data can be reviewed to detect unusual or unauthorized events, assuming that the correct data is being logged and reviewed. The main difficulty is the variety of unusual activities, e.g., login attempts outside of designated schedules, failed login attempts, port sweeps, locked accounts, network activity levels, memory utilization, key file/data access, etc.
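
Two of the unusual activities listed under Intrusion Detection, repeated failed login attempts and logins outside a designated schedule, can be illustrated with a minimal Python sketch. It is not the paper's method: the entry fields (timestamp, user, event), the event names, the failure threshold, and the working hours are all hypothetical assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime, time

# Hypothetical, already-parsed log entries; field and event names are assumptions.
SAMPLE_ENTRIES = [
    {"timestamp": "2010-08-02 09:15:10", "user": "alice", "event": "login_failed"},
    {"timestamp": "2010-08-02 09:15:25", "user": "alice", "event": "login_failed"},
    {"timestamp": "2010-08-02 09:15:41", "user": "alice", "event": "login_failed"},
    {"timestamp": "2010-08-02 02:40:03", "user": "bob", "event": "login_ok"},
    {"timestamp": "2010-08-02 10:05:00", "user": "carol", "event": "login_ok"},
]


def flag_suspicious(entries, max_failures=3,
                    work_start=time(8, 0), work_end=time(18, 0)):
    """Return (users with too many failed logins, users active off-schedule)."""
    failures = Counter()
    off_schedule = set()
    for entry in entries:
        stamp = datetime.strptime(entry["timestamp"], "%Y-%m-%d %H:%M:%S")
        if entry["event"] == "login_failed":
            failures[entry["user"]] += 1
        # Any login attempt outside the assumed working hours is flagged.
        if not (work_start <= stamp.time() <= work_end):
            off_schedule.add(entry["user"])
    too_many = {user for user, n in failures.items() if n >= max_failures}
    return too_many, off_schedule


repeated_failures, off_hours = flag_suspicious(SAMPLE_ENTRIES)
```

With the sample entries, alice is flagged for three failed attempts and bob for a login at 02:40. Such a check is only meaningful if the timestamps come from synchronized clocks, as noted under Rebuilding.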
G Aggregate Function
Where pi is the probability that an arbitrary sample belongs to the class Ci. Let attribute X have m distinct values, {x1, x2, x3, ..., xm}. Attribute X can be used to partition D into m subsets, {D1, D2, D3, ..., Dm}, where Dj contains those samples in D that have value xj of X. The entropy based on the partition into subsets by X is given by

client (NU: normal user, SU: suspicious user, SUW: suspicious user for web, AT: attacker).
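
The formulas that accompany the entropy discussion are not present in this text, so the following Python sketch rests on an assumption: that the paper uses the standard Shannon entropy over the class probabilities pi, and the usual weighted sum of subset entropies for a partition of D by attribute X. The class labels NU, SU, and AT are taken from the text; the attribute name "method" and the sample rows are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy in bits of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def partition_entropy(samples, attribute):
    """Entropy of each subset Dj, weighted by |Dj| / |D|, for a split on `attribute`."""
    subsets = {}
    for row in samples:
        subsets.setdefault(row[attribute], []).append(row["label"])
    total = len(samples)
    return sum(len(s) / total * entropy(s) for s in subsets.values())


# Hypothetical samples: one attribute ("method") and a class label per client.
SAMPLES = [
    {"method": "GET", "label": "NU"},
    {"method": "GET", "label": "NU"},
    {"method": "POST", "label": "SU"},
    {"method": "POST", "label": "AT"},
]

base = entropy([row["label"] for row in SAMPLES])
after_split = partition_entropy(SAMPLES, "method")
gain = base - after_split  # information gained by splitting on "method"
```

Under these assumptions, an attribute whose split leaves a lower weighted entropy separates the user classes better, which is the criterion decision-tree learners such as ID3 use to choose a split.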