Measuring The Evolution of Internet Peering Agreements
Amogh Dhamdhere¹, Himalatha Cherukuru², Constantine Dovrolis², and Kc Claffy¹
¹ CAIDA
{amogh,kc}@caida.org
² Georgia Tech
[email protected]
Abstract. There is much interest in studying the structure and evolution of the
Internet at the Autonomous System (AS) level. However, limitations of public
data sources in detecting settlement-free peering links meant that prior work focused almost exclusively on transit links. In this work, we explore the possibility
of studying the full connectivity of a small set of ASes, which we call usable
monitors. Usable monitors, while a subset of the ASes that provide BGP feeds to
Routeviews/RIPE collectors, are better suited to an evolutionary study than other
ASes. We propose CMON, an algorithm to classify the links of usable monitors as
transit or non-transit. We classify usable monitors as transit providers (large and
small), content producers, content consumers and education/research networks.
We highlight key differences in the evolution of connectivity of usable monitors,
and measure transitions between different relationships for the same pair of ASes.
1 Introduction
The Internet consists of thousands of autonomous systems (ASes) connected together
to provide end-to-end reachability. Connections between ASes are typically bilateral in
nature, with an underlying business relationship. At the two ends of the spectrum of
AS relationships, we have transit and settlement-free peering¹. There has been great
interest recently in studying the evolution and dynamics of the Internet topology at
the AS-level. Unfortunately, existing publicly available data can reliably capture only
transit links, while most settlement-free peering links are invisible, especially those that
are topologically lower in the Internet hierarchy than the route monitors [17]. As a
result, evolutionary studies have focused on transit links [9, 18].
In this work, rather than studying only a subset of the connectivity of all ASes in
the Internet, we take the approach of focusing on the complete connectivity of a subset
of ASes. We use a subset of the ASes that provide routing feeds to Routeviews/RIPE
collectors, which we call usable BGP route monitors (or usable monitors for short);
we believe usable monitors are good candidates for an evolutionary study, as more of
their AS links are visible from the local view than from remote ASes. We propose a new
¹ In a transit relationship, the customer pays the provider for carrying traffic, while no money is
exchanged in a settlement-free peering relationship.
We do not use BGP updates, as these reveal backup and transient links which we want to filter.
[Fig. 1: Number of usable monitors (y-axis) as a function of the slack factor (x-axis), for the 2001.01, 2005.01, and 2010.01 snapshots.]
To choose an appropriate value for the slack factor, we measure the number of
Routeviews/RIPE monitors that we classify as usable monitors for different values
of the slack factor. Figure 1 shows that when the slack factor exceeds 0.1, there is a
plateau effect, where the number of usable monitors does not increase sharply until
the slack factor reaches around 0.7, a trend that is seen in all snapshots (the figure
shows three snapshots from 2001, 2005 and 2010). For the purposes of this study, we
choose to be conservative in identifying usable monitors, and set the slack factor to 0.1.
This yields fewer usable monitors, but increases the confidence that we observe their
complete connectivity. For an evolutionary study, we need to balance the tradeoff between
a long enough duration and the number of ASes that are usable monitors for that
entire duration. We use 17 consecutive snapshots from 2006-2010 for our study, which
gives us 58 ASes that were usable monitors for that entire duration³. We use PeeringDB [2]
and organization webpages to classify the 58 usable monitor ASes according
to their business type. The 58 usable monitors consist of 11 transit providers that advertise
global presence and large traffic volumes (Large Transit Providers or LTPs), 14
transit providers that have regional presence (Small Transit Providers or STPs), 12
Content Consumers (CC), 6 Content Providers (CP), 2 Enterprise Customers (EC), and
11 Education/Research networks (ER).⁴
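Selecting the ASes that remain usable monitors for the whole study period amounts to a set intersection over snapshots. The following is a minimal sketch, assuming each snapshot has already been reduced to the set of AS numbers judged usable at the chosen slack factor. The function name, the input layout, and the AS numbers are hypothetical and are not taken from our datasets.

```python
from functools import reduce


def persistent_monitors(usable_by_snapshot):
    """Return the ASes that are usable monitors in every snapshot.

    usable_by_snapshot: dict mapping a snapshot label to the set of AS
    numbers classified as usable monitors in that snapshot (hypothetical
    input layout).
    """
    sets = list(usable_by_snapshot.values())
    if not sets:
        return set()
    # An AS survives only if it appears in each snapshot's set.
    return reduce(set.intersection, sets)


# Toy example with made-up AS numbers (illustration only).
snapshots = {
    "2006.01": {64500, 64501, 64502},
    "2008.01": {64500, 64502, 64503},
    "2010.01": {64500, 64502},
}
stable = persistent_monitors(snapshots)
```

Lengthening the study period adds more sets to the intersection, so the result can only shrink; this is the tradeoff between duration and the number of monitors described above.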
2.3 Visibility of links of monitor ASes
Figure 2 shows, for the topology snapshot in July 2009⁵, the number of links of each
of the 58 usable monitors observed from the monitor itself, and the number of links
seen from remote route monitors. We find that all LTPs lie close to the diagonal:
most of their links are visible from remote monitors. On the other hand, we find that
a large number of links of CPs and CCs are visible only from the local monitor. We
compute the invisibility fraction for each type of monitor AS, i.e., the fraction of links
of that type of monitor AS that are invisible from remote monitors. LTPs have the
smallest invisibility fraction (0.5%): nearly all of their links are visible from remote
monitors. The invisibility fraction for small transit providers (STPs) is 40%. On the
other hand, 75% of the links of CP monitors are invisible from remote monitors, while
the invisibility fractions for CCs and ERs are 55% and 60%, respectively. Our analysis
thus confirms the findings of Oliveira et al. [17], who relied on case studies to show that
most tier-1 network links are visible from remote monitors, but many, possibly most,
links of content providers are not. The partial visibility of the complete AS topology is
thus mostly due to the low visibility of the connectivity of CPs and CCs, which further confirms
the limitations of existing public BGP snapshots, each of which is only a partial view
of global interdomain connectivity.
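The invisibility fraction can be illustrated with a short sketch. It assumes that the links of each monitor have already been extracted as sets of neighbor ASes, once from the monitor's own feed and once from all remote feeds, and that the fraction is taken over locally observed links pooled per monitor type. The function name, data layout, and AS numbers are hypothetical.

```python
from collections import defaultdict


def invisibility_by_type(local_links, remote_links, monitor_type):
    """Fraction of locally observed links not seen by any remote monitor,
    pooled over all monitors of the same type.

    local_links:  dict monitor AS -> set of neighbors seen in its own feed
    remote_links: dict monitor AS -> set of neighbors seen from remote feeds
    monitor_type: dict monitor AS -> type label such as "LTP" or "CP"
    (All three layouts are assumptions made for this illustration.)
    """
    total = defaultdict(int)
    hidden = defaultdict(int)
    for monitor, neighbors in local_links.items():
        kind = monitor_type[monitor]
        seen_remotely = remote_links.get(monitor, set())
        total[kind] += len(neighbors)
        hidden[kind] += len(neighbors - seen_remotely)
    return {kind: hidden[kind] / total[kind] for kind in total if total[kind] > 0}


# Toy input with made-up AS numbers (illustration only).
local = {64500: {1, 2, 3, 4}, 64501: {5, 6}}
remote = {64500: {1, 2, 3, 4}, 64501: {5}}
types = {64500: "LTP", 64501: "CP"}
fractions = invisibility_by_type(local, remote, types)
```

In this toy input every link of the LTP monitor is seen remotely, while half of the CP monitor's links are seen only locally.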
³ The set of ASes that are usable monitors changes over time, hence we identify 58 ASes that
were full monitors throughout the study duration.
⁴ Our datasets are available at www.caida.org/amogh/monitors/datasets.html
⁵ We observed qualitatively similar trends in other snapshots.
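[Fig. 2: Number of links of each usable monitor visible locally (x-axis) versus the number visible from remote monitors (y-axis), on log-log axes, for monitor types LTP, STP, EC, CP, CC, and ER.]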
Fig. 3: Intuition behind the heuristics in CMON. [Panels (a) and (b); links labeled L1, L2, L3.]
$n_p(l) \geq 2\,n_c(l)$, then we classify $l$ as a provider link; if $n_c(l) \geq 2\,n_p(l)$, we classify $l$ as
a customer link. If neither is true, then we classify $l$ as UNK-TRANSIT, i.e., we are
certain that the link is a transit link, but cannot determine if it is a provider or customer link.
Again, if X is itself a tier-1 network (or is in H), then networks in H would see the
customer form for customers of X (X would not have transit providers in this case).
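The factor-of-two rule for transit links can be sketched as follows. This is an illustration under stated assumptions rather than the CMON implementation: it assumes that n_p and n_c are the counts of observations supporting the provider form and the customer form of a link, that the comparison is "at least twice", and it adds a guard for the case where both counts are zero. The function name is hypothetical.

```python
def classify_transit_link(n_p, n_c):
    """Label a link already known to be transit.

    n_p: number of observations supporting the provider form (assumed meaning)
    n_c: number of observations supporting the customer form (assumed meaning)
    """
    if n_p < 0 or n_c < 0:
        raise ValueError("counts must be non-negative")
    if n_p == 0 and n_c == 0:
        # Guard added for this sketch: no evidence in either direction.
        return "UNK-TRANSIT"
    if n_p >= 2 * n_c:
        return "provider"
    if n_c >= 2 * n_p:
        return "customer"
    # Neither form dominates by a factor of two.
    return "UNK-TRANSIT"


labels = [classify_transit_link(4, 2),
          classify_transit_link(2, 4),
          classify_transit_link(3, 2)]
```

Requiring a two-to-one margin leaves links with conflicting evidence in the UNK-TRANSIT category instead of forcing them into a direction.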
3.3 Validation
To validate CMON, we used ground truth information from a set of 6 networks⁶. As
CMON is designed specifically to classify the links of monitor ASes, our ground truth
must also be from monitor ASes. We had access to the full routing tables for 3 ground
truth networks, while a fourth is a usable monitor at Routeviews/RIPE.
We had access to partial ground truth information for two Routeviews/RIPE usable
monitors (ESNET and IIJ). We find that the accuracy of CMON is 90% for the set of
4 ground truth networks (26 errors out of 260 total links). In 18 of the 26
errors, the ground truth was a customer link, while CMON classified it as a non-transit or
UNK link. Evaluation results for CMON on the two partial ground truth networks are
promising. One of these networks indicated that it had one provider, which CMON
identified correctly. The second partial ground truth network indicated that it has
a single customer, which CMON again identified correctly. We also performed some
sanity checks on the relationship inferences from CMON. We tested CMON on Routeviews/RIPE
usable monitors that are well-known tier-1 networks: AT&T (AS7018),
Cogent (AS174), Level3 (AS3356), and Hurricane Electric (AS6939). CMON produces
only a few (or no) providers, and a large number of customers, which is the result we
expect for tier-1 networks. We plan to extend the validation of CMON using ground
truth we are collecting as part of the CAIDA AS-rank project [1].
CMON classifies a number of links of transit providers as non-transit, although
we expect large tier-1 networks to have only a few settlement-free peering links. As
discussed in Section 3, the non-transit category includes backup transit links, or transit
links used for only one direction of traffic. For each non-transit link M-X of a monitor
M, we determine if X advertises at least one prepended route over this link. Prepending
indicates that X does not prefer to receive traffic from M over this link, and X uses the
link M-X either only for outbound traffic, or as a backup transit link. We find that for
6 out of 11 LTP monitors, more than 50% of non-transit links are prepended. These
fractions are quite high for some networks: 85% for AT&T (AS7018), 64% for Savvis
(AS3561), 62% for Level3 (AS3356), and 51% for Cogent (AS174). These fractions
are lower for STPs, with Telstra (AS1221) the largest at 50%. For all CPs and CCs,
this fraction is less than 12%, indicating that most non-transit links of these ASes are
settlement-free or paid-peering links.
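The prepending check can be sketched as below. It assumes one particular reading of "X advertises a prepended route": the AS path received by the monitor M begins with X's AS number repeated at least twice. The function names, the input layout, and the AS numbers are hypothetical.

```python
def neighbor_prepends(as_path, neighbor):
    """True if the path, as received by the monitor, starts with the
    neighbor's AS number repeated at least twice (assumed definition)."""
    return len(as_path) >= 2 and as_path[0] == neighbor and as_path[1] == neighbor


def prepended_fraction(paths_by_neighbor):
    """Fraction of links M-X over which X sends at least one prepended route.

    paths_by_neighbor: dict mapping neighbor X to the list of AS paths that
    the monitor learned over the link M-X (hypothetical layout).
    """
    if not paths_by_neighbor:
        return 0.0
    prepended = sum(
        1
        for neighbor, paths in paths_by_neighbor.items()
        if any(neighbor_prepends(path, neighbor) for path in paths)
    )
    return prepended / len(paths_by_neighbor)


# Toy input with made-up AS numbers: only the first neighbor prepends.
non_transit_links = {
    64501: [[64501, 64501, 64510], [64501, 64511]],
    64502: [[64502, 64512]],
}
fraction = prepended_fraction(non_transit_links)
```

A single prepended route is enough to flag a link, which matches the "at least one" criterion stated above.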
Finally, we compared CMON with the algorithm by Gao [11] (GAO) using the same
ground truth data. We find that GAO has an accuracy of 80% on the ground truth, compared
to 90% for CMON. In 45 out of 50 errors by GAO, the ground truth is a peering
link, while GAO classifies it as either a provider or customer link. The agreement between
CMON and GAO is 72%. Note that GAO classifies certain links as siblings, a
relationship type that CMON would classify as non-transit. We treat such cases as
disagreements between GAO and CMON. In 5589 out of 6750 disagreements between
CMON and GAO (out of a total of 24758 compared links), CMON classified the link as
non-transit, while GAO classified it as either provider or customer.
⁶ Georgia Tech, SOX, SWITCH, Media Network Services, ESNET, and IIJ
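The accuracy and agreement figures quoted above follow from simple ratios, reproduced in the sketch below. One assumption is labeled in the code: GAO's error count is taken over the same 260 ground truth links as CMON's, which the text implies ("the same ground truth data") but does not state as a number.

```python
def accuracy(errors, total):
    """Fraction of links classified correctly."""
    return 1 - errors / total


# CMON: 26 errors out of 260 ground truth links.
cmon_accuracy = accuracy(26, 260)

# GAO: 50 errors; the total of 260 links is an assumption (same ground truth set).
gao_accuracy = accuracy(50, 260)

# Agreement: 6750 disagreements out of 24758 compared links.
agreement = 1 - 6750 / 24758

# Share of disagreements where CMON says non-transit and GAO says provider/customer.
non_transit_share = 5589 / 6750
```

These evaluate to roughly 0.90, 0.81, 0.73, and 0.83 respectively, in line with the percentages reported in the text.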
Fig. 4: Change in the number of customer and non-transit links of monitor ASes from 2006-2010.
The trend line shows the growth rate in the complete graph for the same duration.
[Panels: customers (2006) versus customers (2010), and non-transit (2006) versus non-transit (2010), on log-log axes, for monitor types STP, LTP, CP, CC, and ER.]
Fig. 5: Link duration of transit and non-transit links for monitor ASes.
[Panels: CDF of provider link duration (months) and CDF of non-transit link duration (months), by monitor type.]
Fig. 6: Comparing link durations of link types for STP and LTP monitors
[Panels: STP and LTP; each shows the CDF of link duration (months) for prov, cust, transit, and non-transit links.]
Fig. 7: Transition probabilities for links of STP, LTP, and CP/CC/ER monitors
[Panels: (a) STP, (b) LTP, (c) CP/CC/ER; each is a state-transition diagram over the states cust, prov, non-transit, and none.]
We envision a top-down model that uses these state transition probabilities to predict
evolution dynamics in the AS topology at a macroscopic level. While bottom-up models
have been favored in the literature (see [5, 10] and references therein), those models are
highly complex, and must be parameterized with precise data about interdomain traffic,
economics, and geography. An interesting question is whether an evolution model using
only the state transition probabilities for different AS types, agnostic to underlying
factors, can still accurately model topology dynamics and evolution.
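As a rough illustration of what such a top-down model could look like, the sketch below treats the relationship between a monitor and a neighbor as a Markov chain over the states cust, prov, non-transit, and none, and projects a distribution over states forward by a number of snapshots. The transition probabilities in the code are placeholders chosen for illustration; they are not the measured values of Fig. 7, and the function names are hypothetical.

```python
STATES = ["cust", "prov", "non-transit", "none"]

# Placeholder transition probabilities (NOT the measured values from Fig. 7).
# Each row gives the probability of moving from one state to each state
# between two consecutive snapshots; each row sums to 1.
P = {
    "cust":        {"cust": 0.90, "prov": 0.00, "non-transit": 0.05, "none": 0.05},
    "prov":        {"cust": 0.00, "prov": 0.90, "non-transit": 0.05, "none": 0.05},
    "non-transit": {"cust": 0.05, "prov": 0.05, "non-transit": 0.85, "none": 0.05},
    "none":        {"cust": 0.02, "prov": 0.02, "non-transit": 0.06, "none": 0.90},
}


def step(dist, transitions):
    """Advance a distribution over states by one snapshot interval."""
    nxt = {state: 0.0 for state in STATES}
    for src, mass in dist.items():
        for dst, prob in transitions[src].items():
            nxt[dst] += mass * prob
    return nxt


def project(dist, transitions, n_steps):
    """Advance a distribution by n_steps snapshot intervals."""
    for _ in range(n_steps):
        dist = step(dist, transitions)
    return dist


# Start with every link in the customer state and project 4 intervals ahead.
start = {"cust": 1.0, "prov": 0.0, "non-transit": 0.0, "none": 0.0}
after_four = project(start, P, 4)
```

A separate matrix per AS type (STP, LTP, CP/CC/ER) would mirror the per-type measurements; whether such a model reproduces observed topology dynamics is exactly the open question raised above.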
6 Related Work
Several measurement studies highlighted the incompleteness of AS topologies derived
from publicly available BGP data, particularly the limited visibility of settlement-free
peering links [6,8,13,16,20]. Given that the inferred topologies are incomplete, previous
work proposed methods to capture as much of the Internet topology as possible [13,20].
Due to the incompleteness problem, prior measurement studies of topology evolution
had to either focus on transit links, or on macroscopic properties of the AS graph. A recent study measured the average degree and effective diameter of the Internet AS graph
and concluded that the AS graph is densifying [14]. Siganos et al. [19] observed exponential growth and preferential attachment in the Internet from 1997-2001. Magoni et
al. [15] found exponential growth in the number of ASes and links during that same time
period. Oliveira et al. [18] tackled the problems of topology liveness and completeness,
i.e., how to differentiate genuine link births and deaths from routing transients. Dhamdhere et al. [9] studied the evolution of the Internet ecosystem (focusing mostly on transit
links) over the last decade. A previous study by Zhang et al. [21] investigated the effect of route monitor placement on topology inference and AS path prediction, without
using these route monitors to study topology evolution and dynamics. Our work differs
from previous work in two significant ways. First, our study is the first to use BGP route
monitors to study the evolution of the AS topology, including settlement-free peering
links. Second, we focus on the dynamics of the connectivity of individual ASes, and
not on macroscopic topological properties.
8 Acknowledgements
A. Dhamdhere, C. Dovrolis and K. Claffy were supported in this work by the National
Science Foundation (grant CNS-1017139).
References
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.