Chapter 1
About your instructor
• Undergrad in Computer Science from the University of Wisconsin
• Working as the President at Technuf LLC – a high tech Cybersecurity company in Rockville, MD
• Email: [email protected]
About you!
Poll on Data Science
• How much background knowledge and experience do you have in Data Science, Artificial Intelligence,
Machine Learning, Cybersecurity, and/or similar topics? Please pick the option that best describes
you.
• B: I have done some independent reading but have not taken a course on any of these topics
• C: I have taken a course on at least one of these topics (including university courses, Coursera, etc.)
• D: I have taken multiple courses on these topics (including university courses, Coursera, etc.)
How is this all going to work?
• This is going to be a somewhat busy semester. Let’s support each other as best as we can.
• In-class instruction
• In-class exercises (Poll)
• Q&A, announcements (Blackboard)
• Submissions and grades (Blackboard)
Participating in Class
• If you have a question, please “raise hand” and then speak when called on
Data analytics for
Cyber security
-Introduction-
Vandana P. Janeja
04/16/2024 Data Analytics for Cybersecurity, ©2022 Janeja All rights reserved.
What is Cybersecurity?
Assets Affected
• Personal assets
• Public assets
• Corporate assets at risk
Data Analytics
Aims of Cybersecurity: prevent, detect, respond to, and recover from threats
Prevent cyberattacks against critical assets
Detect threats
Respond to threats in the event that they gain access to critical assets
Recover and restore the normal state of the system in the event that an attack is successful
Assets Affected
Personal
• Phones (home and mobile)
• Tablets
• Personal computers (desktop and laptops)
• External physical hard drive
• Cloud drive
• Email accounts
• Fitness trackers
• Smart watches
• Smart glasses
• Media devices (TiVo, Apple TV, cable box)
• Bank accounts
• Credit cards
• Personal gaming systems
Public
• Smart meters
• Power grid
• Sewage controls
• Nuclear power plant
• Rail lines
• Airplanes and air traffic
• Traffic lights
• Citizen databases
• Websites (county, state and federal)
• Space travel programs
• Satellites
Corporate
• Customer database
• Websites
• Business applications
• Business network
• Emails
• Off the shelf software
• Intellectual property
Motivation behind Cyber Threats
1. Stealing intellectual property
2. Gaining access to customer data
3. Making a political statement
4. Performing cyber espionage
5. Damaging reputation, making a splash for fun, impeding access to data and applications
Why do we have security risks?
• Applications with several dependencies
• Logical errors in software code (such as Heartbleed)
• Organizational risks (multiple partners, such as in the cyber-attacks at Target and the Pacific Northwest National Laboratory [PNNL])
Summary of Motivation, Risks and Security
Motivation
• To steal intellectual property
• To damage reputation
• To gain access to data, which can then be sold
• To gain access to information, which is not generally available
• To make a political statement
• To impede access to critical data and applications
• To make a splash for fun
Risks
• Internet protocol which is inherently not secure
• Applications with several dependencies
• Logical errors in software code (ex. Heartbleed)
• Organizational risks (multiple partners, ex. Target, PNNL)
• Lack of user awareness of cybersecurity risks (ex. social engineering, phishing)
• Personality traits of individuals using the systems
Attaining Security
• Protecting resources
• Hardening defenses
• Capturing data logs
• Monitoring systems
• Tracing the attacks
• Predicting risks
• Predicting attacks
• Identifying vulnerabilities
What is the level of damage that can occur?
• According to a McAfee report, cybercrime costs about $600 billion, which is about 0.8% of the world Gross Domestic Product (GDP) (McAfee-Cybercrime Impact 2018), with malicious actors becoming more and more sophisticated.
• The loss due to cyber-attacks is not simply based on direct financial loss but also on several indirect factors, which may lead to a major financial impact.
• Example: Target cyber-attack (Skariachan and Finkle, Reuters-Target 2014)
  • Target reported $61 million in expenses related to the cyber-attack, out of which $44 million were covered by insurance.
  • The direct financial impact to Target was $17 million.
  • A 46% drop in net profit in the holiday quarter,
  • 5.5% drop in transactions during the quarter,
  • Share price fluctuations led to further losses,
  • Cards had to be reissued for several customers, and
  • Target had to offer identity protection to affected customers.
• All these losses amount to much more than the total $61 million loss. In addition, the trust of the customers was lost, which is not a quantifiable loss and has long-term impacts.
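The direct-impact figure in the Target example is the reported expenses minus the portion covered by insurance. A minimal sketch of that arithmetic, using only the two figures quoted on this slide (the variable names are illustrative):

```python
# Figures quoted on the slide, in millions of US dollars.
reported_expenses = 61
covered_by_insurance = 44

# Direct financial impact = expenses that insurance did not cover.
direct_impact = reported_expenses - covered_by_insurance
print(f"Direct financial impact: ${direct_impact} million")
```

The indirect losses listed on the slide (profit drop, fewer transactions, reissued cards) are not part of this number, which is why the total damage is larger.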
Overall Areas of Cybersecurity
Network Security
Cyberphysical Security
Application Security
Sub areas of Cybersecurity
• Application security: incorporating security in the software development process.
• Data and information security: securing data from the risk of unauthorized access and misuse.
• Network security: securing traditional computer networks, including the measures adopted to prevent unauthorized access to and misuse of either the public or the private network.
Sub areas of Cybersecurity
Cyber physical security
• Emerging challenges due to the coupling of the cyber systems with the physical systems.
• The power plants being controlled by a cyber system,
  • risk of disruption of the cyber component or
  • risk of unauthorized control of the cyber system,
  • gaining unauthorized control of the physical systems.
Sub areas of Cybersecurity
Data analytics
• Cross cutting across areas to learn from existing threats and develop solutions for novel and unknown threats towards networks, infrastructure, data, and information
• Example: Threat hunting proactively looks for malicious players across the myriad data sources in an organization
  • Does not necessarily have to be a completely machine-driven process and should account for user behaviors
  • Must look at the operational context.
  • Provides security analysts a much more focused field of vision to zero in on solutions for potential threats
Hardware and Network Landscape
• Multiple types of networks and devices
  • Computer networks, Cyber Physical Systems (CPS), Internet of Things (IoT), sensor networks, smart grids, and wired or wireless networks.
• Computer networks - Traditional type of networks
  • Groups of computers are connected in pre-specified configurations. These configurations can be designed using security policy deciding who has access to what areas of networks. Another way networks form is by determining patterns of use over a period of time. In both cases, zones can be created for access and connectivity where each computer in the network and sub-networks can be monitored.
• Cyber Physical Systems - an amalgamation of two interacting sub-systems, cyber and physical
  • Used to monitor and perform the day-to-day functions of the many automated systems that we rely on, including power stations, chemical factories, and nuclear power plants, to name a few.
• Ubiquitous connected technology - “smart” things - Internet of Things
Data Analytics
• Data analytics deals with analyzing large amounts of data from disparate sources to discover actionable information leading to gains for an organization.
• Includes techniques from data mining, statistics, and business management, among other fields.
• Big data
  • Massive datasets (volume)
  • Generated at a rapid rate (velocity)
  • Heterogeneous nature (variety)
  • Can provide valid findings or patterns in this complex environment (veracity)
  • Changing by location (venue)
• Every device, action, transaction, and event generates data. Cyber threats leave a series of such data pieces in different environments and domains. Sifting through these data can lead to novel insight into why a certain event occurred, potentially allow the identification of the responsible parties, and lead to knowledge for preventing such attacks in the future.
Anatomy of an
attack
[Figure labels: vulnerability in one of the lab's public-facing web servers; compromised workstations; shared network resources; spear phishing attack; business partners]
Multi-dimensional view of Threats
Events become relevant when they occur together
These events become relevant with proximities rather than causation
The two items are in close proximity, based on
• Source proximity
• Destination proximity
• Temporal proximity or delay
[Figure: nodes N1 through N12 plotted by spatial distance against time (0 to 16)]
Goal: to identify potential “collusions” among the entities responsible for these two events
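The three proximity types on this slide can be sketched as a pairwise comparison of events. This is a minimal illustration, not the method from the book: the event tuples, the `MAX_DELAY` threshold, and the rule of requiring two shared proximities are all assumptions, and source and destination proximity are simplified to exact matches rather than a spatial distance.

```python
from itertools import combinations

# Hypothetical events: (event_id, source, destination, time). Values are made up.
events = [
    ("e1", "N1", "N4", 0),
    ("e2", "N2", "N4", 1),
    ("e3", "N1", "N9", 2),
    ("e4", "N8", "N3", 12),
]

MAX_DELAY = 4  # assumed threshold for temporal proximity


def proximities(a, b, max_delay=MAX_DELAY):
    """Return the set of proximity types that two events share."""
    shared = set()
    if a[1] == b[1]:
        shared.add("source")
    if a[2] == b[2]:
        shared.add("destination")
    if abs(a[3] - b[3]) <= max_delay:
        shared.add("temporal")
    return shared


def candidate_pairs(events, min_shared=2):
    """List event pairs that are close in at least `min_shared` ways."""
    pairs = []
    for a, b in combinations(events, 2):
        shared = proximities(a, b)
        if len(shared) >= min_shared:
            pairs.append((a[0], b[0], sorted(shared)))
    return pairs


pairs = candidate_pairs(events)
for pair in pairs:
    print(pair)
```

Pairs that come out of this filter are only candidates for closer inspection. Proximity suggests a relationship between the entities involved but does not establish causation.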
Why Data Analytics is important for cybersecurity: A case study of understanding the anatomy of an attack
• Looking at one dimension of the data is not enough in such prolonged attack scenarios.
• For such multipronged attacks, we need a multilevel framework
  • Brings together data from several different databases.
  • Events of interest can be identified using a combination of factors such as proximity of events in time, in terms of series of communications, and even in terms of the geographic origin or destination of the communication.
Understanding feature combinations
• Intrusion Detection System (IDS) logs such as SNORT
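One way to look at feature combinations in alert data is to count how often pairs of field values occur together. The sketch below assumes simplified, hypothetical alert records. The field names `src`, `dst_port`, and `label` are made up for illustration and are not the layout of a real SNORT log.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, simplified alert records (not a real IDS log format).
alerts = [
    {"src": "10.0.0.5", "dst_port": 22, "label": "scan"},
    {"src": "10.0.0.5", "dst_port": 22, "label": "scan"},
    {"src": "10.0.0.7", "dst_port": 80, "label": "web"},
    {"src": "10.0.0.5", "dst_port": 80, "label": "web"},
]


def count_feature_combinations(records, size=2):
    """Count how often each combination of (field, value) items occurs."""
    counts = Counter()
    for record in records:
        # Sort by field name so the same combination always has the same key.
        items = sorted(record.items())
        for combo in combinations(items, size):
            counts[combo] += 1
    return counts


counts = count_feature_combinations(alerts)
for combo, n in counts.most_common(3):
    print(n, combo)
```

Combinations with high counts are a starting point for an analyst, for example a single source repeatedly hitting the same port.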
Understanding collusions and associations
• Extract associations to identify potentially repeated or targeted communications
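A simple way to extract associations from communication records is to compute support and confidence for each source-to-destination pair. This is a sketch under assumptions: the node names, the thresholds, and the `association_rules` helper are hypothetical and not taken from the course material.

```python
from collections import Counter

# Hypothetical (source, destination) pairs taken from communication logs.
communications = [
    ("N1", "N4"), ("N1", "N4"), ("N1", "N4"), ("N1", "N9"),
    ("N2", "N4"), ("N3", "N5"),
]


def association_rules(pairs, min_support=0.3, min_confidence=0.6):
    """Return (source, destination, support, confidence) for frequent pairs."""
    total = len(pairs)
    pair_counts = Counter(pairs)
    source_counts = Counter(src for src, _ in pairs)
    rules = []
    for (src, dst), n in pair_counts.items():
        support = n / total                 # share of all communications
        confidence = n / source_counts[src]  # share of this source's traffic
        if support >= min_support and confidence >= min_confidence:
            rules.append((src, dst, support, confidence))
    return rules


rules = association_rules(communications)
for src, dst, support, confidence in rules:
    print(f"{src} -> {dst}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence means that a source directs most of its traffic to one destination, which may indicate a repeated or targeted communication worth reviewing.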
Understanding network evolution
• Time intervals account for time proximity
• Allows mining the data in proximity of time
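Mining data in proximity of time usually starts by grouping events into fixed time intervals, so each interval becomes a snapshot of the network. The sketch below is illustrative only: the events are made up, and the interval width of 4 time units is an assumption.

```python
from collections import defaultdict

# Hypothetical events: (time, source, destination).
events = [
    (0, "N1", "N4"),
    (3, "N2", "N4"),
    (5, "N3", "N5"),
    (9, "N8", "N3"),
    (13, "N1", "N9"),
]

WIDTH = 4  # assumed interval width, in the same units as the event times


def bin_by_interval(events, width=WIDTH):
    """Group communications by the start of the time interval they fall in."""
    bins = defaultdict(list)
    for time, src, dst in events:
        start = (time // width) * width
        bins[start].append((src, dst))
    return dict(bins)


snapshots = bin_by_interval(events)
for start in sorted(snapshots):
    print(f"[{start}, {start + WIDTH}):", snapshots[start])
```

Comparing consecutive snapshots shows how the set of communicating nodes changes over time.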
How Can Data Analytics Help?
Creating robust
access control rules
by evaluating prior
usage and security
policies.
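Evaluating prior usage against a security policy can be sketched as a set comparison between what each user is allowed to access and what each user actually accessed. The policy, the usage log, and the `review_access` helper below are hypothetical and only illustrate the idea.

```python
# Hypothetical policy: resources each user is currently allowed to access.
granted = {
    "alice": {"db", "mail", "hr"},
    "bob": {"db", "mail"},
}

# Hypothetical usage log: (user, resource) accesses observed in the past.
usage_log = [("alice", "db"), ("alice", "mail"), ("bob", "db"), ("bob", "hr")]


def review_access(granted, usage_log):
    """Compare granted permissions with observed usage for each user."""
    used = {}
    for user, resource in usage_log:
        used.setdefault(user, set()).add(resource)

    report = {}
    for user in sorted(set(granted) | set(used)):
        allowed = granted.get(user, set())
        seen = used.get(user, set())
        report[user] = {
            # Permissions never exercised: candidates for tightening a rule.
            "unused_grants": sorted(allowed - seen),
            # Observed accesses that the policy does not cover.
            "outside_policy": sorted(seen - allowed),
        }
    return report


report = review_access(granted, usage_log)
for user, findings in report.items():
    print(user, findings)
```

Unused grants suggest rules that could be narrowed, while accesses outside the policy point to either a gap in the rules or suspicious behavior.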
Focus of this Course
What this course is not about: This course does not address the traditional views of security configurations and shoring up the defenses, including setting up computer networks, setting up firewalls, web server management, and patching of vulnerabilities.
What this course is about: This course addresses the challenges in cybersecurity that data analytics can help address, including analytics for threat hunting or threat detection, discovering knowledge for attack prevention or mitigation, discovering knowledge about vulnerabilities, and performing retrospective and prospective analysis for understanding the mechanics of attacks to help prevent them in the future.
References
• Digital Attack Map: A Global Threat Visualization, https://www.netscout.com/global-threat-intelligence , last accessed Nov 2020
• Alexandra Whitney Samuel, Hacktivism and the Future of Political Participation, Sept 2004, http://www.alexandrasamuel.com/dissertation/pdfs/Samuel-Hacktivism-entire.pdf
• CSMonitor-Estonia: Arthur Bright, Estonia accuses Russia of 'cyberattack', 2007, http://www.csmonitor.com/2007/0517/p99s01-duts.html , last accessed March 2020
• James A. Lewis, Computer Espionage, Titan Rain and China, 2005, http://csis.org/files/media/csis/pubs/051214_china_titan_rain.pdf , last accessed March 2020
• Reuters-Solarwinds: Jack Stubbs, Raphael Satter, Joseph Menn, U.S. Homeland Security, thousands of businesses scramble after suspected Russian hack, 2020, https://www.reuters.com/article/global-cyber/global-security-teams-assess-impact-of-suspected-russian-cyber-attack-idUKKBN28O1KN
• FireEye: Pascal Geenens, FireEye Hack Turns into a Global Supply Chain Attack, 2020, https://securityboulevard.com/2020/12/fireeye-hack-turns-into-a-global-supply-chain-attack/
• Bloomberg-Target: Matt Townsend, Lindsey Rupp and Jeff Green, Target CEO Ouster, 2014, http://www.bloomberg.com/news/2014-05-05/target-ceo-ouster-shows-new-board-focus-on-cyber-attacks.html
• Google Dorking: Amy Gesenhues, 2014, http://searchengineland.com/google-dorking-fun-games-hackers-show-202191
• Risk Based Security-Sony: A Breakdown and Analysis of the December, 2014 Sony Hack, 2014, https://www.riskbasedsecurity.com/2014/12/05/a-breakdown-and-analysis-of-the-december-2014-sony-hack/
• McAfee: The Economic Impact of Cybercrime: No Slowing Down, McAfee, Center for Strategic and International Studies (CSIS), 2018, https://www.mcafee.com/enterprise/en-us/solutions/lp/economics-cybercrime.html
• Reuters-Target: Target shares recover after reassurance on data breach impact, 2014, https://www.reuters.com/article/us-target-results/target-shares-recover-after-reassurance-on-data-breach-impact-idUSBREA1P0WC20140226 , last accessed March 2020
• M. Shashanka, M. Shen and J. Wang, "User and entity behavior analytics for enterprise security," 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA, 2016, pp. 1867-1874, doi: 10.1109/BigData.2016.7840805.
• Manyika, James, et al. "Big data: The next frontier for innovation, competition, and productivity." (2011).
• Chen, Min, Shiwen Mao, and Yunhao Liu. "Big data: a survey." Mobile Networks and Applications 19.2 (2014): 171-209.
• PNNL Attack: 7 Lessons: Surviving A Zero-Day Attack, 2011, http://www.darkreading.com/attacks-and-breaches/7-lessons-surviving-a-zero-day-attack/d/d-id/1100226 , last accessed March 2020