Unit II - 01 - Random Networks
OF NETWORK FORMATION
Introduction
SOURCE: “Chains of Affection: The Structure of Adolescent Romantic and Sexual Networks”, Peter S. Bearman, James Moody, and Katherine Stovel.
SOURCE: “The political blogosphere and the 2004 U.S. election: divided they blog”, Lada A. Adamic and N. Glance.
The existence of such different networks suggests the need to explore network models that can explain why and how different networks are created.
• Summaries such as the degree or the average path length are useful for gaining knowledge of a network. However, since networks differ significantly in their numbers of nodes and links, quantities such as the average degree will vary across data sets.
• It is better to have an overall description of how the degree values are distributed. This is the degree distribution of the network.
• In the discrete formulation, we may think of the degree distribution as returning the fraction of nodes with degree k.
• In general, we treat it as a probability distribution: the probability that a randomly selected node in the network has degree k.
• When we talk about random networks or scale-free networks, we refer to the degree distribution that describes them best, i.e., the network models correspond to the different processes that give rise to the different probability distributions.
The degree distribution for the example on the left, where each node has degree 2, is
$$P(k) = \begin{cases} 1 & \text{if } k = 2 \\ 0 & \text{otherwise} \end{cases}$$
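As an illustration of the definition above, the following minimal Python sketch computes the degree distribution as the fraction of nodes with each degree. The function name and the 6-node ring are illustrative assumptions, chosen so that every node has degree 2 as in the example.

```python
from collections import Counter

def degree_distribution(n_nodes, edges):
    """Return {k: fraction of nodes with degree k} for an undirected graph."""
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree)
    return {k: c / n_nodes for k, c in sorted(counts.items())}

# Hypothetical example: a ring of 6 nodes, where every node has two neighbours.
ring_edges = [(i, (i + 1) % 6) for i in range(6)]
ring_pk = degree_distribution(6, ring_edges)
print(ring_pk)  # {2: 1.0}
```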
• These models are at the opposite extreme of the order-disorder axis, as they represent disorder.
• There are two equivalent models due to Erdős and Rényi. We will only consider the G(n,p) model, in which there are N nodes and each link between a pair of nodes is present with probability p, giving rise to a binomial degree distribution
$$P(k) = \binom{k_{max}}{k} p^{k} (1-p)^{k_{max}-k} = \binom{N-1}{k} p^{k} (1-p)^{N-k-1}$$
• Note that the number of nodes is fixed, and hence this is a static model.
• It is the model used as the reference in all comparisons made to detect structure: degree correlation function, modularity, assortativity, …
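A minimal Python sketch of the G(n,p) construction described above, comparing the empirical degree distribution of one sample with the binomial formula. The function names and the seed are illustrative assumptions; N = 30 and p = 0.1 are the values of one of the Erdős-Rényi examples in these slides.

```python
import math
import random
from itertools import combinations

def gnp_edges(n, p, rng):
    """Sample G(n, p): each of the n(n-1)/2 possible links is kept with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

def binomial_degree_pmf(k, n, p):
    """P(k) = C(n-1, k) p^k (1-p)^(n-1-k)."""
    return math.comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)

rng = random.Random(0)  # arbitrary seed, only for reproducibility
n, p = 30, 0.1
edges = gnp_edges(n, p, rng)

degree = [0] * n
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# One small sample is noisy, so only rough agreement should be expected.
for k in range(max(degree) + 1):
    empirical = degree.count(k) / n
    print(k, round(empirical, 3), round(binomial_degree_pmf(k, n, p), 3))
```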
[Figures: Erdős-Rényi model with N=30 and p=0.03; Erdős-Rényi model with N=30 and p=0.1]
$$P(k) = A k^{-\gamma}$$
• This type of model has a fat-tailed degree distribution, i.e. there are more nodes with low degree and more nodes with high degree than there would be if the links were created at random.
• Anomalies may also appear depending on the value of the degree exponent, $\gamma$.
• The degree exponent can be estimated with a linear regression (not recommended) or with a maximum likelihood method (better).
• These networks have a creation mechanism that adds nodes at each iteration; hence they are dynamic processes.
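The maximum likelihood estimate mentioned above can be sketched as follows. This is only a sketch under stated assumptions: it uses the continuous power-law estimator $\hat{\gamma} = 1 + n / \sum_i \ln(x_i / x_{min})$, which for integer degrees is an approximation, and the slides do not say which estimator is intended. The function names, the synthetic data and the seed are illustrative.

```python
import math
import random

def sample_power_law(n_samples, gamma, x_min, rng):
    """Inverse-transform samples from p(x) proportional to x^(-gamma), x >= x_min."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0))
            for _ in range(n_samples)]

def mle_exponent(samples, x_min):
    """Continuous maximum likelihood estimate of the exponent for x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

rng = random.Random(42)  # arbitrary seed
synthetic = sample_power_law(20000, gamma=2.5, x_min=1.0, rng=rng)
print(round(mle_exponent(synthetic, x_min=1.0), 2))  # should be close to 2.5
```

The estimate depends on the choice of x_min, which here is known because the data are synthetic; with real degree sequences it has to be chosen or estimated as well.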
Random Networks
STATIC MODELS
• Properties
• Maximum and Minimum degrees
• Diameter
• Clustering Coefficient
• Average Path Length
• Connectedness
• Giant Components
• A well-known limit of the binomial distribution is the Poisson distribution, which arises when the sample size n is large and the probability of success p is small, with the product np kept fixed and small compared with n.
• This limit, known as the rare events limit, corresponds to sparse networks with a large number of nodes.
$$\bar{k} = k_{max}\, p = (N-1)p$$
Then the degree distribution will be
$$P(k) = e^{-(N-1)p}\, \frac{[(N-1)p]^{k}}{k!}$$
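To see how close the two distributions are in the sparse regime, the following Python sketch compares the binomial degree distribution with its Poisson limit of mean (N-1)p. The values N = 1000 and p = 0.005 are illustrative assumptions, not taken from the slides.

```python
import math

def binomial_pmf(k, n, p):
    """Binomial degree distribution of G(n, p): C(n-1, k) p^k (1-p)^(n-1-k)."""
    return math.comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)

def poisson_pmf(k, mean):
    """Poisson distribution with the given mean degree."""
    return math.exp(-mean) * mean**k / math.factorial(k)

N, p = 1000, 0.005           # large N, small p: a sparse network
mean_degree = (N - 1) * p    # average degree, here 4.995

# Largest pointwise difference between the two distributions for k = 0..30.
max_gap = max(abs(binomial_pmf(k, N, p) - poisson_pmf(k, mean_degree))
              for k in range(31))
print(mean_degree, max_gap)
```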
• The clustering coefficient is the probability that two neighbours of a node are connected. In a random network, the probability that any two nodes are connected is p, the same for every pair of nodes, so we can immediately write the average clustering coefficient as
$$C = p = \frac{\bar{k}}{N-1}$$
• The value obtained from this computation will almost always differ from that measured in a real network. Real networks generally have a larger clustering coefficient, because the neighbours of a node are more likely to be linked to each other than random chance would predict.
• For random networks this value is typically very small, in contrast with real networks.
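The claim that C is close to p in a random network can be checked numerically. The sketch below measures the average clustering coefficient of one G(n,p) sample; the function name, the sizes and the seed are illustrative assumptions, and nodes with fewer than two neighbours are counted as having clustering 0, which is one common convention.

```python
import random
from itertools import combinations

def average_clustering(n, edges):
    """Average over nodes of (links among neighbours) / (possible links among neighbours)."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = 0.0
    for node in range(n):
        nbrs = adj[node]
        k = len(nbrs)
        if k < 2:
            continue  # convention assumed here: such nodes contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / n

rng = random.Random(7)  # arbitrary seed
n, p = 200, 0.05
er_edges = [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]
print(p, round(average_clustering(n, er_edges), 3))  # the two values should be similar
```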
• In the paper by A. Fronczak, P. Fronczak, and J.A. Holyst, “Average path length in random networks” (2018), it is found, using a mean field approximation, that the average path length of an Erdős-Rényi random network takes the general form
$$\bar{d} = \frac{\log N - \gamma}{\log(\bar{k})} + \frac{1}{2}$$
where $\gamma \approx 0.5772$ is the Euler-Mascheroni constant (not the degree exponent).
• In general, random networks describe this quantity well, giving a value that is close to that of a real network.
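As a worked example of the mean field expression, the sketch below evaluates it in the form $(\ln N - \gamma)/\ln \bar{k} + 1/2$, with $\gamma$ the Euler-Mascheroni constant. The function name and the values N = 1000 and average degree 10 are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def mean_field_path_length(n, mean_degree):
    """Mean field average path length: (ln n - gamma) / ln(mean_degree) + 1/2."""
    return (math.log(n) - EULER_GAMMA) / math.log(mean_degree) + 0.5

# Hypothetical network: 1000 nodes with average degree 10.
print(round(mean_field_path_length(1000, 10), 3))  # 3.249
```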
• A small world is a network structure that is both highly clustered locally and has a short average path length, two characteristics that normally do not occur together (as we have seen, random networks have short paths but low clustering).
• The small world phenomenon is that the distance between two randomly chosen nodes in a network is short. The number of nodes that can be reached from a random node grows geometrically with the distance, roughly as $\bar{k}^{d}$ at distance d, so when the average degree is large enough, averaging over all pairs of nodes gives the average path length
$$\bar{d} \approx \frac{\log(N)}{\log(\bar{k})}$$
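The estimate log(N)/log(k̄) can be compared with a direct measurement. The sketch below computes the average shortest-path length by breadth-first search from every node and prints it next to the estimate for one G(n,p) sample. The function name, the sizes and the seed are illustrative assumptions, and unreachable pairs are simply left out of the average.

```python
import math
import random
from collections import deque
from itertools import combinations

def average_path_length(n, edges):
    """Mean shortest-path distance over all connected pairs (BFS from every node)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total, pairs = 0, 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes other than the source
    return total / pairs if pairs else float("nan")

rng = random.Random(3)  # arbitrary seed
n, target_degree = 300, 8
p = target_degree / (n - 1)
er_edges = [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]
mean_degree = 2 * len(er_edges) / n
print(round(average_path_length(n, er_edges), 2),
      round(math.log(n) / math.log(mean_degree), 2))
```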
• In Watts's constructions, small worlds are networks that lie between order and randomness; they are the type of network that may appear when we interpolate between a regular network and an ER network.
The Watts α-model captures the nature of social engagement through a parameter $\alpha \in [0, \infty)$, which sets the balance between the constraints of the social structure and the freedom of individual agency.
• Cavemen: highly connected groups without connections to the outside. This world is made of isolated cliques. The propensity of two unconnected individuals to become connected is very small, but as soon as they share one friend it becomes very high.
• Solarian: no one has a propensity to connect to anyone in particular. A fragmented world of isolated individuals, in which the propensity to connect stays the same until two individuals share most of their connections.
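$$P_{ij} = \begin{cases} 1, & m_{ij} \ge k \\ \left(\dfrac{m_{ij}}{k}\right)^{\alpha} (1-p) + p, & k > m_{ij} > 0 \\ p, & m_{ij} = 0 \end{cases}$$
Here $m_{ij}$ is the number of friends that i and j share, k is the average degree, and p is a small baseline propensity.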
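The piecewise propensity above translates directly into a small function. In this Python sketch the function name and the numerical values are illustrative assumptions; it shows how a small α gives caveman-like behaviour (one shared friend is enough for a high propensity) and a large α gives Solarian behaviour (the propensity stays near p until most friends are shared).

```python
def connection_propensity(m_ij, k, alpha, p):
    """Propensity of i and j to connect, given m_ij shared friends."""
    if m_ij >= k:
        return 1.0
    if m_ij == 0:
        return p
    return (m_ij / k) ** alpha * (1.0 - p) + p

k, p = 10, 0.01  # illustrative average degree and baseline propensity
for alpha in (0.1, 1.0, 10.0):
    row = [round(connection_propensity(m, k, alpha, p), 3) for m in (0, 1, 5, 9, 10)]
    print(alpha, row)
```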
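In the Watts-Strogatz β-model we introduce the probability β of rewiring links to other nodes. The degree distribution, for $k \ge \bar{k}/2$, is
$$P(k) = \sum_{n=0}^{\min(k-\bar{k}/2,\ \bar{k}/2)} \binom{\bar{k}/2}{n} (1-\beta)^{n} \beta^{\bar{k}/2-n}\, \frac{(\beta \bar{k}/2)^{k-\bar{k}/2-n}}{(k-\bar{k}/2-n)!}\, e^{-\beta \bar{k}/2}$$
• Start with a regular network and choose each node in order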
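The degree distribution of the β-model is a sum of a binomial term (links that were not rewired) times a Poisson term (links received through rewiring), and it can be evaluated numerically. The following Python sketch assumes an even average degree; the function name and the values k̄ = 6 and β = 0.3 are illustrative assumptions.

```python
import math

def ws_degree_pmf(k, mean_degree, beta):
    """Degree distribution of the Watts-Strogatz beta-model for an even mean degree."""
    half = mean_degree // 2  # assumption: mean_degree is even
    if k < half:
        return 0.0           # every node keeps at least mean_degree / 2 links
    total = 0.0
    for n in range(min(k - half, half) + 1):
        # n of the node's own half links were left in place (not rewired)
        kept = math.comb(half, n) * (1 - beta) ** n * beta ** (half - n)
        # the remaining m links arrived through rewiring (Poisson term)
        m = k - half - n
        gained = (beta * half) ** m / math.factorial(m) * math.exp(-beta * half)
        total += kept * gained
    return total

pmf = [ws_degree_pmf(k, 6, 0.3) for k in range(40)]
print(round(sum(pmf), 6))                               # normalisation check
print(round(sum(k * q for k, q in enumerate(pmf)), 6))  # mean degree check
```

With β = 0 the function returns 1 at k = k̄ and 0 elsewhere, which recovers the regular network.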
• Random networks undergo several phase transitions. These are points in their evolution, with respect to one of their parameters, at which the network structure changes as the formation process is modified.
• To capture the dynamics of a random network, we can begin with N isolated nodes and add links progressively according to the probability p. This leads to the study of the evolution of a random network, which depends almost exclusively on the value of the average degree at each stage.
Remember that in random networks, changing the link probability changes the average degree. Then:
• In studying the properties of random networks, Erdős and Rényi found that if the probability of a link is larger than
$$\frac{\log N}{N}$$
then the network is connected with probability tending to 1, i.e. the fraction of nodes with no links tends to 0.
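The role of the threshold log(N)/N can be illustrated with a short calculation. A node is isolated when all of its N-1 possible links are absent, so the expected number of isolated nodes in G(n,p) is $N(1-p)^{N-1}$. The Python sketch below evaluates this on both sides of the threshold; the function name and the choice N = 10000 are illustrative assumptions.

```python
import math

def expected_isolated_nodes(n, p):
    """Expected number of nodes with no links in G(n, p): n (1 - p)^(n - 1)."""
    return n * (1 - p) ** (n - 1)

N = 10_000
threshold = math.log(N) / N

below = expected_isolated_nodes(N, 0.5 * threshold)  # roughly sqrt(N) isolated nodes
above = expected_isolated_nodes(N, 2.0 * threshold)  # roughly 1/N isolated nodes
print(threshold, below, above)
```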
• It is a convention to say that a component is small if it has fewer than $N^{2/3}/2$ nodes, and large otherwise.
• The term Giant Component refers to the unique largest component (if there is one), whose size grows in proportion to N. The formation of a Giant Component is a phase transition: as the average degree passes 1, the fraction of nodes in the largest component changes suddenly from 0 to a nonzero value, and then grows towards 1.
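The emergence of the giant component can be observed in a small simulation. The sketch below samples G(n,p) for several average degrees and reports the fraction of nodes in the largest component, found with a union-find structure. The function name, N = 1000, the list of average degrees and the seed are illustrative assumptions; a single sample per value is noisy near the transition.

```python
import random
from itertools import combinations

def largest_component_fraction(n, edges):
    """Fraction of nodes in the largest connected component (union-find)."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
    return max(size[find(x)] for x in range(n)) / n

rng = random.Random(11)  # arbitrary seed
N = 1000
fractions = {}
for mean_degree in (0.5, 1.0, 1.5, 3.0):
    p = mean_degree / (N - 1)
    edges = [(u, v) for u, v in combinations(range(N), 2) if rng.random() < p]
    fractions[mean_degree] = largest_component_fraction(N, edges)
    print(mean_degree, fractions[mean_degree])
```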
Probability | $p < \frac{1}{N-1}$ | $p = \frac{1}{N-1}$ | $p > \frac{1}{N-1}$ | $p > \frac{\log(N)}{N}$
Structure | Trees | May have cycles | GC with trees and cycles | GC with cycles
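The regimes in the table can be written as a small lookup. This Python sketch is an illustration only: the function name is hypothetical, the critical point is taken to be p = 1/(N-1), and N is assumed large enough (N ≥ 4) that log(N)/N lies above 1/(N-1).

```python
import math

def er_regime(n, p):
    """Return the expected structure of G(n, p) according to the table of regimes."""
    critical = 1.0 / (n - 1)      # average degree equal to 1
    connected = math.log(n) / n   # connectivity threshold
    if p > connected:
        return "GC with cycles"
    if math.isclose(p, critical):
        return "May have cycles"
    if p < critical:
        return "Trees"
    return "GC with trees and cycles"

for p in (0.0005, 1 / 999, 0.003, 0.01):
    print(p, er_regime(1000, p))
```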