Using Machine-Learning to
Investigate Web Campaigns
at Large
2nd Nov 18
JD-HITB PEK
Beijing, China
Dr. Marco Balduzzi
@embyte, madlab.it
Who am I?
● HITB aficionado
● Computer security geek (since 2002)
● Love research, Ph.D. (in 2011)
● “Community centric”
● Work for Trend Micro
Web Defacement = Website Compromise + Homepage Hijacking
Evolution
90s: “Just-for-fun” era
2000: Digital activism
2010: Geopolitical factors kick in
Now: Dark propaganda
“Just-for-fun” Era
Digital Activism
Geopolitical Factors
Coordinated campaigns:
From one to many targets
● Death statement is 2nd of May 2011
● Campaign is 6th-12th of May
● Targets:
The Israeli-Palestinian conflict
Data Collection
Public Repositories
Others
THE site :)
Collected Data
Source Name    Website URL            Acquired Records
Zone-H         www.zone-h.org         12,303,240
Hack-CN        www.hack-cn.com           386,705
Mirror Zone    www.mirror-zone.org       195,398
Hack Mirror    www.hack-mirror.com        68,980
MyDeface       www.mydeface.com           37,843
TOTAL                                 12,992,166
Timeline Evolution
Up to September
Data Format
Metadata
Raw content
Data Trustworthiness
Type         Attribute                     Example                 Trustworthiness
Metadata     URL                           http://target.gov       High
             Timestamp                     2010-01-02 15:00        Medium
             Nickname                      Neo Hacker              Medium-Low
             Webserver; Reason; Hack Mode  Nginx; Political; SQLi  Low
Raw content  Main page                     HTML or TXT file        High
             Embedded resources            Various formats         High
             External resources            Various formats         Medium-High
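A minimal sketch of how one collected record and the trust levels above might be represented. The field names, the numeric ranks, and the `trusted_attributes` helper are illustrative assumptions and are not taken from DefPloreX-NG.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Trust levels copied from the table above. The numeric ranks are an
# assumption, introduced only so that levels can be compared.
TRUST_RANK = {"Low": 1, "Medium-Low": 2, "Medium": 3, "Medium-High": 4, "High": 5}

# Hypothetical attribute names mapped to the trust level from the table.
ATTRIBUTE_TRUST: Dict[str, str] = {
    "url": "High",
    "timestamp": "Medium",
    "nickname": "Medium-Low",
    "webserver": "Low",
    "reason": "Low",
    "hack_mode": "Low",
    "main_page": "High",
}


@dataclass
class DefacementRecord:
    """One archived defacement (hypothetical schema)."""
    url: str
    timestamp: Optional[str] = None
    nickname: Optional[str] = None
    webserver: Optional[str] = None
    reason: Optional[str] = None
    hack_mode: Optional[str] = None
    main_page: Optional[str] = None


def trusted_attributes(record: DefacementRecord, minimum: str = "Medium") -> Dict[str, str]:
    """Return the populated attributes whose trust level is at least `minimum`."""
    threshold = TRUST_RANK[minimum]
    return {
        name: value
        for name, value in vars(record).items()
        if value is not None and TRUST_RANK[ATTRIBUTE_TRUST[name]] >= threshold
    }
```

With the example values from the table, a `minimum` of "Medium" keeps only the URL and the timestamp. The self-reported webserver, reason, and hack mode are dropped.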
General Trends
Topics Over The Years
Security Problems
Real World Events
Adoption of Malicious Content
Adoption of email & Twitter handles
Detection Engineering
Key Observations
Template Customization
Key Observations
1. Actors cooperate in teams
   Especially if driven by strong ideologies
2. Defacements are organized around campaigns
3. When a team prepares and runs a campaign, it tends to re-use a common template that each member can customize
Next Generation* Defacement Explorer (DefPloreX-NG)
(*) 1st generation presented at BH Arsenal
Deface Page Analysis
Campaign Detection
Labeling & Visualization
Pre-Filtering Search
Campaigns Search
The Indo-Pakistani conflict
Using Machine-Learning to Investigate Web Campaigns at Large - HITB 2018
Implementation Details
Feature Engineering
Visual Features (images)
Social Features
Structural Features
Format of Title
Visual Features (colors)
Feature Engineering
Multimedia URLs
Email Addresses
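A minimal sketch of how a few of the feature families named above (format of title, structural features, multimedia URLs, email addresses) might be pulled out of a deface page using only the Python standard library. The feature names, the set of media tags, and the email pattern are assumptions made for illustration. Visual features are left out, and this is not the DefPloreX-NG implementation.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Deliberately simple email pattern (an assumption, not a full RFC 5322 parser).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Tags whose `src` attribute is treated as a multimedia URL (assumed set).
MEDIA_TAGS = {"img", "audio", "video", "embed", "source", "iframe"}


class _PageParser(HTMLParser):
    """Collects tag counts, media URLs, the title, and visible text."""

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()
        self.media_urls = []
        self.title_parts = []
        self.text_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        if tag == "title":
            self._in_title = True
        if tag in MEDIA_TAGS:
            src = dict(attrs).get("src")
            if src:
                self.media_urls.append(src)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        self.text_parts.append(data)


def extract_features(html: str) -> dict:
    """Turn one deface page into a small dictionary of features."""
    parser = _PageParser()
    parser.feed(html)
    parser.close()
    title = "".join(parser.title_parts).strip()
    text = " ".join(parser.text_parts)
    return {
        "title": title,
        "title_length": len(title),
        "n_tags": sum(parser.tag_counts.values()),
        "n_images": parser.tag_counts["img"],
        "media_urls": sorted(set(parser.media_urls)),
        "emails": sorted({m.lower() for m in EMAIL_RE.findall(text)}),
    }
```

Numeric fields such as `title_length` and `n_tags` can feed a clustering step directly. Set-valued fields such as `emails` would first need an encoding, which is not shown here.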
Clustering
● BIRCH
– Balanced Iterative Reducing and Clustering Hierarchies
● Does not materialize the entire distance matrix
– Statistical values are efficient to compute
– Quickly finds the closest cluster for each new data point
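To illustrate why no distance matrix is needed, here is a simplified sketch of the "statistical values" BIRCH keeps per cluster: a clustering feature made of the point count, the linear sum, and the squared sum. This is an assumption-laden toy: it keeps a flat list of clustering features instead of the CF-tree that real BIRCH builds, and the class name, function name, and `threshold` parameter are hypothetical.

```python
import math
from typing import List, Sequence


class ClusteringFeature:
    """CF = (n, linear_sum, squared_sum) summarising a group of points."""

    def __init__(self, dim: int):
        self.n = 0
        self.linear_sum = [0.0] * dim
        self.squared_sum = 0.0

    def add(self, point: Sequence[float]) -> None:
        # Absorbing a point only updates three running statistics.
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x
        self.squared_sum += sum(x * x for x in point)

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius_if_added(self, point: Sequence[float]) -> float:
        """Radius the cluster would have after absorbing `point`."""
        n = self.n + 1
        ls = [s + x for s, x in zip(self.linear_sum, point)]
        ss = self.squared_sum + sum(x * x for x in point)
        centroid_norm_sq = sum((s / n) ** 2 for s in ls)
        # max() guards against tiny negative values from rounding.
        return math.sqrt(max(ss / n - centroid_norm_sq, 0.0))


def incremental_cluster(points, threshold: float) -> List[ClusteringFeature]:
    """Assign each point to the closest CF, or open a new one."""
    clusters: List[ClusteringFeature] = []
    for p in points:
        closest = min(
            clusters, key=lambda c: math.dist(c.centroid(), p), default=None
        )
        if closest is not None and closest.radius_if_added(p) <= threshold:
            closest.add(p)
        else:
            cf = ClusteringFeature(len(p))
            cf.add(p)
            clusters.append(cf)
    return clusters
```

Each new point is compared only against the cluster centroids, so the cost grows with the number of clusters and not with the number of pairs of points. The flat list makes the closest-cluster search linear. The tree in full BIRCH exists to speed that search up.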
Clustering
● Scalability of BIRCH vs. DBSCAN (10 runs)
Labeling
● Each cluster is represented by a succinct report
– Time span
– Screenshot thumbnails (by perceptual hash)
– Names of actors and teams
– Keywords used in campaigns (e.g., #opfrance)
– Category of targets (e.g., news, governmental sites)
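The report fields listed above could be assembled along these lines. This is a sketch under stated assumptions: each record is a dictionary with hypothetical keys (`timestamp`, `actor`, `team`, `text`, `category`, `phash`), the perceptual hash is assumed to be precomputed elsewhere, and keywords are approximated by hashtags found in the page text.

```python
import re
from collections import Counter
from datetime import datetime

HASHTAG_RE = re.compile(r"#(\w+)")


def summarize_cluster(records, top_n=3):
    """Build a succinct report for one cluster of defacement records."""
    times = [datetime.strptime(r["timestamp"], "%Y-%m-%d %H:%M") for r in records]
    # Count hashtags case-insensitively across all pages in the cluster.
    keywords = Counter(
        tag.lower() for r in records for tag in HASHTAG_RE.findall(r.get("text", ""))
    )
    categories = Counter(r["category"] for r in records if r.get("category"))
    return {
        "time_span": (min(times), max(times)),
        "actors": sorted({r["actor"] for r in records if r.get("actor")}),
        "teams": sorted({r["team"] for r in records if r.get("team")}),
        "keywords": [k for k, _ in keywords.most_common(top_n)],
        "categories": [c for c, _ in categories.most_common(top_n)],
        # One entry per distinct perceptual hash, so that pages sharing the
        # same hash contribute a single thumbnail.
        "thumbnails": sorted({r["phash"] for r in records if r.get("phash")}),
    }
```

Deduplicating by perceptual hash keeps the report short when many pages in a cluster share the same template.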
Findings
Organization of Actors
43% of actors join at least one team
70% of campaigns involve collaboration
Half of joint campaigns are larger than 3
Geopolitical Real-World Events
● Successfully detected
The “Charlie Hebdo” case
Campaigns
Teams
Actors
Lone wolves
Affiliated actors
The Israeli-Palestinian conflict
Example of large-scale joint campaigns. While the entire Israeli-Palestinian conflict involves 12 campaigns, opisreal and opsavealaqsa represent the most aggressive and active ones.
Anonymous Operations
Joint campaigns conducted by Anonymous-affiliated groups against governmental sites.
Long-Term vs. Aggressive Campaigns
Campaign savegaza reacted to war events in the Gaza Strip
Example of a long-running campaign named h4ck3rsbr. The horizontal bars represent the different sub-campaigns (60) with their most targeted TLDs and teams (next slide). This is a very generic campaign, with different affiliates.
Conclusions
● Dark propaganda
● Prevailing phenomenon
● Driven by geopolitical motivations
● Key targets, influencing sites
● Contribute to making the Internet a better place!
Conclusions
● DefPloreX-NG
● GitHub (old code, new ask): https://github.com/trendmicro/defplorex
● Paper*: https://documents.trendmicro.com/assets/wp/wp-web-defacement-campaigns-uncovered-gaining-insights-from-deface-pages-using-defplorex-ng.pdf
(*) Joint work with Federico Maggi, Ryan Flores, Lion Gu and Vincenzo Ciancaglini
Thanks! Questions?
Dr. Marco Balduzzi
@embyte, madlab.it
