01 Intro Kevin
Introduction
Fall 2023
Objectives
• Basic principles of network measurements
– Data collection methods
– Data analysis methods
• Target audience
– Students interested in the practice of network
operations and management
– Students interested in empirical research in
networks and networked systems/applications
(possibility of internships to solve network related
research problems)
Instructors
• Timur Friedman
– Professor at Sorbonne University
• Kevin Vermeulen
– Researcher at CNRS
• Hugo Rimliger
– Doctoral student at Sorbonne University
Classes (to be updated)
Lecture  Instructor  Topic
1        Kevin       Introduction; Infrastructures: sources of information
2        Timur       Infrastructures: latency, loss, and geolocation
3        Timur       Infrastructures: bandwidth
4        Timur       Infrastructures: topology
5        Kevin       Topology: past, present and future of the Internet
6        Kevin       Traffic: sources of information, traffic matrix
7        Kevin       Traffic: characterization over time, security
Mid-term exam
8        Kevin       Traffic: security, privacy
9        Kevin       Traffic: censorship
10       Kevin       Traffic: CDNs and redirection
Class schedule (cont.)
Lecture  Instructor  Topic
11       Timur       Infrastructures: geolocation
Break
12       Timur       Infrastructures: tomography
13       Timur       Applications: DNS, P2P, online games
14       Timur       Conclusion
Final exam
WHY MEASURE NETWORKS?
“In God we trust; all others must bring data”
William Edwards Deming
Network monitoring is essential for network operators
• Monitor service-level agreements
– Performance of traffic through network
• Fault diagnosis
– Quick detection of faults
– Root cause analysis
• Security
– Anomaly detection
– Intrusion detection
Network monitoring is essential for users
[Figure: network map of an Enterprise, a Residential ISP, a Mobile ISP, and a Campus connected through ISP1 and ISP2, with inter-AS routing via the Border Gateway Protocol]
• Other common technologies:
– Cable
– Fiber
– Wireless
– Satellite
Internet technologies: Inter-domain links
• Internet exchange point (IXP)
• Private peering
– Direct interconnection
[Figure: network map of Enterprise, Residential ISP, Mobile ISP, Campus, ISP1, and ISP2, showing an IXP and a private peering link]
Class scope
• IP-layer and above
• (Mostly) public Internet
[Figure: network map of Enterprise, Residential ISP, ISP1, and ISP2, annotated with question marks]
– Delay, jitter
– Capacity, throughput
– Loss
– Topology
– End-to-end paths
– AS, or path segment traversing an AS
– Link; router
– Measuring below IP is tricky
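The delay, jitter, and loss metrics listed above can be summarized from a series of probe results. The sketch below is illustrative only: the function name and input format are hypothetical, and jitter is taken as the mean absolute difference between consecutive replies, which is one common definition among several.

```python
from statistics import mean

def summarize_probes(rtts_ms):
    """Summarize a probe series; None marks a probe that got no reply."""
    if not rtts_ms:
        raise ValueError("need at least one probe")
    replies = [r for r in rtts_ms if r is not None]
    loss_rate = 1 - len(replies) / len(rtts_ms)
    # Assumed jitter definition: mean absolute difference between
    # consecutive replies (lost probes are skipped).
    diffs = [abs(b - a) for a, b in zip(replies, replies[1:])]
    return {
        "mean_delay_ms": mean(replies) if replies else None,
        "jitter_ms": mean(diffs) if diffs else None,
        "loss_rate": loss_rate,
    }

summary = summarize_probes([10.0, 12.0, None, 11.0])
print(summary)
```

With four probes and one lost reply, the loss rate is one in four.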
Measuring Internet traffic
• Goal: infer usage from network traffic
– Link utilization
– Applications used
– Typical traffic patterns
– Misbehaving hosts, apps
– IP, TCP/UDP packets
– Per flow or connection statistics
– Per-interface counters
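As an illustration of per-flow statistics, the sketch below groups packets by their 5-tuple and counts packets and bytes per flow. The packet tuple layout is an assumption made for this example; real traces would first be parsed from a capture format.

```python
from collections import defaultdict

def flow_stats(packets):
    """Aggregate packets into per-flow packet and byte counts.

    Each packet is a hypothetical tuple:
    (src_ip, dst_ip, proto, src_port, dst_port, size_bytes).
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, dst, proto, sport, dport, size in packets:
        key = (src, dst, proto, sport, dport)  # unidirectional 5-tuple
        flows[key]["packets"] += 1
        flows[key]["bytes"] += size
    return dict(flows)

packets = [
    ("192.0.2.1", "198.51.100.7", "tcp", 40000, 443, 1500),
    ("192.0.2.1", "198.51.100.7", "tcp", 40000, 443, 500),
    ("192.0.2.2", "198.51.100.7", "udp", 5353, 53, 80),
]
stats = flow_stats(packets)
```

Each direction of a connection is a separate flow here; a bidirectional key would need the endpoints sorted before grouping.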
Measuring Internet applications
• Goal: infer application performance and usage
from network traffic or application
– Inspect payload of IP packets
– Instrument the application
– Crawl application
– Web page load time
– Video buffering rates
– Popularity of social network members
WHAT TYPES OF MEASUREMENTS EXIST?
Measurement techniques
• Active
– Based on issuing probes, analyzing response
• Passive
– Observe existing traffic
• E.g., IP packets, routing messages
Example active RTT measurement: ping
probe m d
ICMP d
echo request t0
probe
round-trip time
reply (RTT) t1 reply
m ICMP
echo reply
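The timing logic of an active RTT measurement can be sketched as follows. This is not an implementation of ping: sending real ICMP echo requests usually needs raw sockets and elevated privileges, so `send_probe` is a hypothetical callable that sends a probe and blocks until the reply arrives. The clock is injectable so the logic can be checked without a network.

```python
import time

def measure_rtt(send_probe, clock=time.perf_counter):
    """Return the RTT in seconds of a single probe/reply exchange."""
    t0 = clock()   # just before the probe leaves m
    send_probe()   # hypothetical: returns once the reply is received
    t1 = clock()   # just after the reply arrives at m
    return t1 - t0

# Example with a stand-in probe that only sleeps for about 10 ms.
rtt = measure_rtt(lambda: time.sleep(0.01))
```

The measured value includes any processing time at the destination and on the measurement host, not only the network delay.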
Example passive RTT inference: tcptrace
[Figure: on the connection between m and d, data 1 is observed at time t0 and ack 1 at time t1; the RTT is inferred as t1 - t0]
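Passive RTT inference pairs an observed data segment with the acknowledgment that covers it. The sketch below is a simplified illustration and does not reproduce how tcptrace works internally. The trace format is hypothetical, only exact-match ACKs are handled (no cumulative ACKs), and retransmitted segments are skipped because their ACKs are ambiguous.

```python
def rtt_samples(trace):
    """Infer RTT samples from a one-sided packet trace.

    trace: list of (timestamp, kind, number) tuples in capture order.
    kind is "data" (number = ACK number expected for that segment)
    or "ack" (number = acknowledgment number seen).
    """
    pending = {}   # expected ack number -> send time, None if ambiguous
    samples = []
    for ts, kind, number in trace:
        if kind == "data":
            # A second copy of the same segment is a retransmission:
            # mark it so that no sample is taken from its ACK.
            pending[number] = None if number in pending else ts
        elif kind == "ack" and number in pending:
            sent = pending.pop(number)
            if sent is not None:
                samples.append(ts - sent)
    return samples

trace = [
    (0.0, "data", 1001), (0.05, "ack", 1001),
    (0.06, "data", 2001), (0.5, "data", 2001), (0.6, "ack", 2001),
]
samples = rtt_samples(trace)
```

The inferred value depends on where the trace was captured: close to the sender it approximates the full path RTT, elsewhere it covers only the segment between the capture point and the receiver.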
Comparison
Passive
• Only way to measure traffic
• Measures user experience, behavior
• Measures protocol exchanges
• Raises privacy concerns
Active
• Measurements even when tapping traffic is not possible
• Measures network, application performance
• Probing adds extra load
– Can overload the network
– Can bias inferences
Measurement vantage point
• Point where measurement host connects to network
– Observations often depend on vantage point
[Figure: network map with Campus2, ISP1, Residential ISP, and ISP2; a plot annotated with "spikes" and "gaps"]
Know the measurement tool
• Study precision and accuracy
• Examine outliers and spikes
• Monitor confounding factors
– Monitor’s CPU, memory, traffic
• Evaluate synthetic data, controlled settings
• Compare multiple methods
• Re-calibrate as needed
– E.g., changing environments
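One way to examine outliers and spikes is to flag samples that lie far from the median. The sketch below uses the modified z-score based on the median absolute deviation (MAD). The 3.5 threshold is a commonly used default, not a value from this course, and flagged samples should be inspected rather than silently discarded.

```python
from statistics import median

def flag_outliers(samples, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds threshold."""
    med = median(samples)
    mad = median([abs(x - med) for x in samples])
    if mad == 0:
        return []  # no spread to compare against
    return [
        (i, x) for i, x in enumerate(samples)
        if 0.6745 * abs(x - med) / mad > threshold
    ]

rtts_ms = [10, 11, 10, 12, 11, 95, 10]
spikes = flag_outliers(rtts_ms)
```

A flagged spike may be a real network event or an artifact of the tool (for example, load on the monitor), which is why confounding factors are monitored alongside.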
Know where data comes from
• Log meta-data with traces
– Any information required to fully understand
measurements
– Remember data often used for unexpected
purposes
• Examples of meta-data
– Version of measurement tool and parameters
– When, where trace was recorded
– Clock precision
– Drops, missing data
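A minimal sketch of logging meta-data with a trace, assuming a JSON record stored next to the trace file. All field names and example values are hypothetical; the point is to record the tool version, parameters, time, place, clock resolution, and drops together with the data.

```python
import json
import platform
import time
from datetime import datetime, timezone

def build_metadata(tool, version, parameters, vantage_point, dropped):
    """Build a meta-data record to store alongside a trace."""
    return {
        "tool": tool,
        "tool_version": version,
        "parameters": parameters,
        "vantage_point": vantage_point,                      # where
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(),  # when
        "hostname": platform.node(),
        "clock_resolution_s": time.get_clock_info("time").resolution,
        "dropped_packets": dropped,                          # drops, missing data
    }

meta = build_metadata("ping", "example-1.0", {"count": 10, "interval_s": 1},
                      "campus-probe-1", dropped=0)
record = json.dumps(meta, indent=2)
```

The reported clock resolution is what the operating system advertises; the actual precision and accuracy of timestamps still need to be checked separately.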
ETHICAL ISSUES
Avoid disruption
• Active probing can overload network/hosts
– “Denial of Service” attack
• Good practices
– Embed contact info in probes
– Throttle probing
– Spread load
– Keep blacklists of networks/hosts
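The good practices above (throttle, spread load, honor blacklists) can be combined in a probing schedule. The sketch below is one possible arrangement under stated assumptions: targets are IP address strings, the blacklist is a list of prefixes, and the function name and parameters are hypothetical. It only plans send times and does not send anything.

```python
import ipaddress
import random

def plan_probes(targets, blocklist, rate_pps, seed=None):
    """Return (send_offset_seconds, target) pairs for a probing run."""
    blocked = [ipaddress.ip_network(n) for n in blocklist]
    allowed = [
        t for t in targets
        if not any(ipaddress.ip_address(t) in net for net in blocked)
    ]
    # Shuffle so that consecutive probes do not hit the same network.
    random.Random(seed).shuffle(allowed)
    interval = 1.0 / rate_pps  # throttle: at most rate_pps probes per second
    return [(i * interval, t) for i, t in enumerate(allowed)]

plan = plan_probes(
    ["192.0.2.1", "198.51.100.7", "203.0.113.9"],
    blocklist=["198.51.100.0/24"],
    rate_pps=2,
    seed=1,
)
```

Embedding contact information in the probes themselves depends on the probe format and is not shown here.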
Respect privacy
• Passive measurements can get personal info
• Good practices
– Get informed consent from users when possible
– Comply with local data protection laws
– Anonymize data when possible
• Caveat: anonymization is not bullet-proof
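Two simple ways to anonymize IP addresses in a trace are sketched below: truncating the address to a prefix, and replacing it with a keyed hash. Both are illustrations with hypothetical function names, and in line with the caveat above neither is bullet-proof: truncated prefixes still reveal the network, and pseudonyms can be re-identified from traffic patterns or if the key leaks.

```python
import hashlib
import hmac
import ipaddress

def truncate_ip(addr, keep_bits=24):
    """Zero the host bits of an address (default suits IPv4 /24)."""
    ip = ipaddress.ip_address(addr)
    net = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(net.network_address)

def pseudonymize_ip(addr, key):
    """Map an address to a stable pseudonym using HMAC-SHA256."""
    return hmac.new(key, addr.encode(), hashlib.sha256).hexdigest()[:16]

print(truncate_ip("192.0.2.123"))
print(pseudonymize_ip("192.0.2.123", b"example-secret-key"))
```

The key must be kept secret and, ideally, destroyed once the dataset is final; the example key above is a placeholder.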
Do no harm
• Measurement studies can harm
– Individual/organization privacy, reputation, well-being
• Good practices
– Identify potential harms/risks
– Maximize benefits and minimize risks
– Menlo report
• https://www.caida.org/publications/papers/2012/menlo_report_actual_formatted/menlo_report_actual_formatted.pdf
INFRASTRUCTURES: SOURCES OF INFORMATION
Types of data about infrastructure
• Routing monitors
– BGP
– OSPF/IS-IS
• Active measurements
– Path topology
– Performance (delay, throughput, etc.)
• Passive measurements
– SNMP counters
– Wireless metrics (PHY rate, RSSI, etc.)
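As an example of using SNMP counters, link utilization can be estimated from two readings of an interface octet counter. The sketch below assumes a 32-bit counter that wraps at most once between readings; the function name and parameters are hypothetical.

```python
def utilization(octets_t0, octets_t1, interval_s, link_bps, counter_bits=32):
    """Estimate average link utilization between two counter readings.

    Handles a single counter wrap; more than one wrap within the
    polling interval cannot be detected from two readings.
    """
    delta = (octets_t1 - octets_t0) % (2 ** counter_bits)
    return delta * 8 / interval_s / link_bps

# 3,750,000 octets in 300 s on a 1 Mbit/s link
u = utilization(1_000, 3_751_000, interval_s=300, link_bps=1_000_000)
```

On fast links a 32-bit octet counter can wrap within a polling interval, so shorter intervals or 64-bit counters are preferable where available.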
Public sources of data
• BGP data
– RouteViews, RIPE RIS
• Topology
– CAIDA’s Ark, RIPE Atlas
• Access, path performance
– M-Lab, Ookla, FCC/SamKnows
• Net neutrality
– Wehe
• Wireless data
– Crawdad
Things to keep in mind when using public data
• Location of vantage points
– Where are they connected?
• Configuration of measurements
– Who are BGP peers?
– Which routers are being probed?
– Which destinations are probed?
– What is the probing frequency?
• Details about the data collection
– Which versions of the tools were used to collect the data?
– Were there any errors during collection?
Measurement platforms: closer to core
• Looking glass servers
– Connected to major ISPs, IXPs
– Allow interactive queries
– BGP, ping, traceroute
• Distributed servers
– E.g. PlanetLab, M-Lab
– Deployed in university campus, data centers
– Well-connected, powerful machines
– Support running measurement scripts
Measurement platforms: at the edge
• Low cost monitors
– E.g., RIPE Atlas (Plug computers),
SamKnows/Bismark (access points)
– Deployed close to users (homes, offices)
– More diverse connectivity, constrained machines
• Software platforms
– E.g., Dasu (BitTorrent), Fathom (browser/Firefox)
– Easier to deploy, large number of users
– Not always on
Things to keep in mind when using measurement platforms
• Platforms require credits to conduct measurements
– Ensure resource consumption is within limits
• Probing may trigger security alerts
– Use blacklists/whitelists for probing
• Monitors’ load may bias inferences
– Monitor load on measurement nodes
• Timing issues in different platforms
– Clocks across monitors often not synchronized
– Check precision/accuracy of timestamps
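When clocks across monitors are not synchronized, the offset between two clocks can be estimated from a request/reply exchange, as NTP does. The sketch below applies the standard four-timestamp formulas; it assumes the forward and reverse paths have equal delay, which often does not hold on the Internet, so the result is an estimate only.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Estimate remote clock offset and round-trip delay.

    t1: request sent (local clock)    t2: request received (remote clock)
    t3: reply sent (remote clock)     t4: reply received (local clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # remote minus local
    delay = (t4 - t1) - (t3 - t2)          # excludes remote processing time
    return offset, delay

# Remote clock 5 s ahead, 20 ms each way, 10 ms processing at the remote
offset, delay = offset_and_delay(100.0, 105.02, 105.03, 100.05)
```

Any asymmetry between the two directions shows up as an error in the offset of up to half the round-trip delay.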
Summary
• Focus of this class: Internet measurements
– Infrastructure
– Traffic
– Applications
• Measurement techniques
– Active probing
– Passive observation
• Guidelines for sound measurements
• Measurements raise ethical issues
– Evaluate risk versus benefit
• Sources of information on infrastructure
Recommended reading
• V. Paxson, “Strategies for Sound Internet Measurement”, IMC’04.
– http://www.icir.org/vern/papers/meas-strategies-imc04.pdf
References
• CAIDA’s DatCat
– http://www.datcat.org/
• BGP datasets
– RouteViews: http://www.routeviews.org/
– RIPE-RIS: http://www.ripe.net/data-tools/stats/ris/routing-information-service
– Cyclops: http://cyclops.cs.ucla.edu/
References
• Topology datasets
– CAIDA’s Ark: http://www.caida.org/projects/ark/
– Dimes: http://www.netdimes.org
– Northwestern’s EdgeScope: http://aqualab.cs.northwestern.edu/projects/86-edgescope-sharing-the-view-from-a-distributed-internet-telescope
• Path performance/topology datasets
– iPlane: http://iplane.cs.washington.edu/
– M-Lab: http://www.measurementlab.net/
– FCC data: https://www.fcc.gov/measuring-broadband-america/2012/raw-data-2012
References
• Platforms
– RIPE Atlas: https://atlas.ripe.net/
– PlanetLab: https://www.planet-lab.org/
– Bismark: http://projectbismark.net/