Data Mining and BI: Social Network Analytics: Random Graphs

This document discusses random graph models and how they can be used to model real-world networks. It introduces the Erdős-Rényi random graph model and describes its key properties, such as the binomial degree distribution and the emergence of a giant component. It then discusses more realistic models such as preferential attachment models, which incorporate growth over time and preferential attachment to high-degree nodes, resulting in the power-law degree distributions often seen in real networks.

Uploaded by

marouli90

Data Mining and BI: Social Network Analytics
Random Graphs

Credits: Lada Adamic

Source: https://github.com/ladamalina/coursera-sna/tree/master/Week%202.%20Random%20Graph%20Models
Outline
● Introduction to random graphs
● Degree Distribution
● Giant Component
● Average Shortest Path
Network models
● Why model?
○ simple representation of complex network
○ can derive properties mathematically
○ predict properties and outcomes
● Also: to have a strawman
○ In what ways is your real-world network different from hypothesized model?
○ What insights can be gleaned from this?
Erdős and Rényi
Erdős-Rényi: simplest network model
● Assumptions
○ nodes connect at random
○ network is undirected
● Key parameter (besides number of nodes N) : p or M
○ p = probability that any two nodes share an edge
○ M = total number of edges in the graph
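The two parameterizations above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch: the function names `gnp_random_graph` and `gnm_random_graph` and the adjacency-dict representation are assumptions made for this example, not something the slides prescribe.

```python
import random
from itertools import combinations

def gnp_random_graph(n, p, seed=None):
    """Undirected graph on n nodes; each pair is linked with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # One biased coin flip per unordered pair of nodes.
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def gnm_random_graph(n, m, seed=None):
    """Undirected graph on n nodes with exactly m edges chosen uniformly."""
    rng = random.Random(seed)
    all_pairs = list(combinations(range(n), 2))
    adj = {v: set() for v in range(n)}
    # Sample m distinct pairs without replacement.
    for u, v in rng.sample(all_pairs, m):
        adj[u].add(v)
        adj[v].add(u)
    return adj

def edge_count(adj):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(neighbors) for neighbors in adj.values()) // 2

g = gnp_random_graph(100, 0.05, seed=1)
print("edges in G(N=100, p=0.05):", edge_count(g))
```

With p fixed, the number of edges varies from run to run; with M fixed, it is exact. Both versions enumerate all pairs, so they are only practical for small N.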
what they look like
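Binomial degree distribution
● G(N, p) model: for each potential edge we flip a biased coin
○ with probability p we add the edge
○ with probability (1-p) we don’t
● Each node has N-1 potential neighbors, so its degree follows a Binomial(N-1, p) distribution
● For large N and small p, this can be approximated by a Poisson distribution with mean z = (N-1)p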
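As a rough numerical check of the Poisson approximation, the sketch below evaluates the binomial degree probabilities P(k) = C(N-1, k) p^k (1-p)^(N-1-k) next to the Poisson probabilities z^k e^(-z) / k! with z = (N-1)p. The function names and the values N = 1000, p = 0.005 are illustrative assumptions; both formulas are evaluated in log space to avoid overflow.

```python
import math

def binomial_degree_pmf(k, n, p):
    """P(degree = k) for a node in G(n, p): Binomial(n - 1, p), with 0 < p < 1."""
    trials = n - 1
    if k < 0 or k > trials:
        return 0.0
    log_pmf = (math.lgamma(trials + 1) - math.lgamma(k + 1)
               - math.lgamma(trials - k + 1)
               + k * math.log(p) + (trials - k) * math.log(1 - p))
    return math.exp(log_pmf)

def poisson_degree_pmf(k, z):
    """Poisson approximation with mean degree z > 0."""
    return math.exp(k * math.log(z) - z - math.lgamma(k + 1))

n, p = 1000, 0.005          # assumed example values
z = (n - 1) * p             # average degree
for k in range(11):
    print(k, round(binomial_degree_pmf(k, n, p), 4),
          round(poisson_degree_pmf(k, z), 4))
```

The two columns should agree closely when N is large and p is small; the agreement degrades as p grows.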
Degree Distribution
● What is the probability that a node has 0,1,2,3, … edges?
● Probabilities sum to 1
Quiz
● The maximum degree of a node in a simple (no multiple edges between the
same two nodes) N node graph is
○ N
○ N-1
○ N-2
Fact
● In an Erdős-Rényi random graph the maximal degree does not vary much
from the average
○ The degrees of the nodes tend to be similar
Fact
● Random networks do not have large hubs
Giant Component
● As N increases, a giant component emerges
○ i.e. a connected component that contains a finite fraction of all the nodes in the graph
● What is the average degree z at which the giant component starts to emerge?
○ 0
○ 1
○ 3/2
○ 3
Percolation threshold
● Percolation threshold: how many edges need to be added before the giant component appears?
● As the average degree increases to z = 1, a giant component suddenly appears
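The threshold can be observed in a small simulation. The sketch below is an approximation under stated assumptions: it adds z·N/2 edges between uniformly random endpoints (so rare self-loops and repeated edges are tolerated rather than excluded) and tracks components with a union-find structure. The function name `largest_component_fraction` is hypothetical.

```python
import random

def largest_component_fraction(n, z, seed=None):
    """Fraction of nodes in the largest component at average degree about z."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    num_edges = int(z * n / 2)   # z = 2M / N
    for _ in range(num_edges):
        ru, rv = find(rng.randrange(n)), find(rng.randrange(n))
        if ru != rv:
            # Union by size: attach the smaller tree under the larger one.
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
    return max(size[find(v)] for v in range(n)) / n

for z in (0.5, 1.0, 1.5, 3.0):
    print(z, round(largest_component_fraction(2000, z, seed=42), 3))
```

Below z = 1 the largest component stays a tiny fraction of the graph; above it, the fraction grows quickly toward 1.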
Giant component: Another angle
● How many other friends besides you does each of your friends have?
○ By property of degree distribution the average degree of your friends, you excluded, is z
○ so at z = 1, each of your friends is expected to have another friend, who in turn has another friend, etc.
○ the giant component emerges
Why just one giant component?
● What if you had 2, how long could they be sustained as the network
densifies?
Average Shortest Path
● How many hops on average between each pair of nodes?
● again, each of your friends has z = avg. degree friends besides you
● ignoring loops, the number of people you have at distance l is $z^l$
Friends at distance l
$N_l = z^l$

Scaling: Average shortest path $l_{av}$

$l_{av} \sim \log N / \log z$
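The scaling follows from asking at what distance l the z^l friends-of-friends cover the whole network: setting z^l ≈ N gives l ≈ log N / log z. A minimal sketch of that arithmetic, with a hypothetical helper name and an assumed average degree of 10:

```python
import math

def estimated_avg_shortest_path(n, z):
    """Estimate l_av ~ log(N) / log(z) for a random graph with mean degree z > 1."""
    if n < 2 or z <= 1:
        raise ValueError("need n >= 2 and z > 1")
    return math.log(n) / math.log(z)

# Growing the network a thousandfold adds only about 3 hops when z = 10.
for n in (10**3, 10**6, 10**9):
    print(n, round(estimated_avg_shortest_path(n, 10), 2))
```

This is a scaling estimate, not an exact average: it ignores loops and the saturation that occurs once most nodes have been reached.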
What does it mean in practice
● Erdős-Rényi networks can grow to be very large but nodes will be just a few
hops apart
Logarithmic axes
● powers of a number will be uniformly spaced ($2^0, 2^1, 2^2, 2^3, 2^4, \dots$)
Erdős-Rényi avg. shortest path in log-log
Realism
● Consider alternative mechanisms of constructing a network that are also fairly
“random”.
● How do they stack up against Erdős-Rényi?
Other models
Introduction model
● Prob-link is the p (probability of any two nodes sharing an edge) that we are
used to
● But, with probability prob-intro the other node is selected among one of our
friends’ friends and not completely at random
Static Geographical model
● Each node connects to num-neighbors of its closest neighbors
Random encounter
● People move around randomly and connect to people they bump into
Growth model
● Instead of starting out with a fixed number of nodes, nodes are added over
time
Conclusion
● in some instances the ER model is plausible
● if dynamics are different, ER model may be a poor fit
Growth and preferential attachment models
Example online Q&A site
Uneven participation
● Many people have replied only a few times vs. a few people have replied many times
Real-world degree distributions
● Sexual networks
● Great variation in contact numbers
● Many people with a small number of partners vs. a few people with a high number of partners
Power-law distribution
● High skew (asymmetry)
● Straight line on a log-log plot (right), in contrast to the linear plot (left)
Poisson distribution
● Little skew (asymmetry)
● Curved on a log-log plot (right), in contrast to the linear plot (left)
Power law distribution
● Straight line on a log-log plot

$\ln p(k) = c - \alpha \ln k$

● Exponentiate both sides to get p(k), the probability of observing a node of degree k:

$p(k) = C k^{-\alpha}$

● C: normalization constant (probabilities over all k must sum to 1)

● α: power law exponent
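To make the roles of C and α concrete, the sketch below builds a discrete power law truncated at an assumed maximum degree, picks C so the probabilities sum to 1, and checks that the slope between two points on log-log axes is -α. The truncation at `k_max` and the function names are assumptions made for this example.

```python
import math

def power_law_pmf(alpha, k_min=1, k_max=10_000):
    """Return ({k: p(k)}, C) with p(k) = C * k**(-alpha) on k_min..k_max."""
    ks = range(k_min, k_max + 1)
    weights = [k ** -alpha for k in ks]
    c = 1.0 / sum(weights)          # normalization constant C
    return {k: c * w for k, w in zip(ks, weights)}, c

def loglog_slope(pmf, k1, k2):
    """Slope of ln p(k) against ln k between two degrees."""
    return ((math.log(pmf[k2]) - math.log(pmf[k1]))
            / (math.log(k2) - math.log(k1)))

pmf, c = power_law_pmf(alpha=2.5)
print("C =", round(c, 4))
print("slope =", round(loglog_slope(pmf, 10, 1000), 4))
```

Because ln p(k) = ln C - α ln k, the slope is the same between any pair of degrees, which is why the distribution appears as a straight line on a log-log plot.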
2 ingredients in generating power-law networks
● nodes appear over time (growth)
● nodes prefer to attach to nodes with many connections (preferential
attachment, cumulative advantage)
Ingredient # 1: growth over time
nodes appear one by one, each selecting m other nodes at random to connect to

m=2
Random network growth
● one node is born at each time tick
● at time t there are t nodes
● change in degree $k_i$ of node i (born at time i, with 0 < i < t)

$\frac{dk_i}{dt} = \frac{m}{t}$

● There are m new edges being added per unit time (with 1 new node)
● The m edges are being distributed among t nodes
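The rate m/t can be integrated to obtain the expected degree of node i at time t. The derivation below is a sketch that assumes each node starts with the m edges it creates at birth, so that $k_i(i) = m$, and treats time as continuous.

```latex
\begin{align}
\frac{dk_i}{dt} &= \frac{m}{t}, \qquad k_i(i) = m \\
k_i(t) &= m + \int_i^t \frac{m}{t'}\,dt' = m\left(1 + \ln\frac{t}{i}\right)
\end{align}
```

Since $\ln(t/i)$ shrinks as the birth time i grows, nodes born earlier have a larger expected degree, but the advantage is only logarithmic in age.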
Age and degree
● On average $k_i(t) > k_j(t)$ for i < j
● Older nodes on average have higher degrees
Ingredient #2: preferential attachment
● Preferential attachment
○ new nodes prefer to attach to well-connected nodes over less-well connected nodes
● Process also known as:
○ Cumulative advantage
○ Rich-get-richer
○ Matthew effect
Price's preferential attachment model for citation networks

● [Price 65]
○ each new paper is generated with m citations (mean)
○ new papers cite previous papers with probability proportional to their indegree (citations)
○ what about papers without any citations?
■ each paper is considered to have a “default” citation
■ probability of citing a paper with degree k, proportional to k+1
● Power law with exponent α = 2+1/m
Cumulative advantage: how?
● Copying mechanism
● Visibility
Barabasi-Albert model
● First used to describe skewed degree distribution of the World Wide Web
● Each node connects to other nodes with probability proportional to their
degree
○ the process starts with some initial subgraph
○ each new node comes in with m edges
○ probability of connecting to node i: $\Pi(k_i) = k_i / \sum_j k_j$
● Results in power-law with exponent α = 3
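The growth process can be sketched with a list that holds every node once per edge endpoint, so that a uniform draw from the list selects a node with probability proportional to its degree. This is a minimal sketch, not the authors' implementation: the initial subgraph (the first new node links to m isolated seed nodes) and the function name are assumptions.

```python
import random
from collections import Counter

def preferential_attachment_edges(n, m, seed=None):
    """Grow a graph to n nodes; each new node adds m edges to distinct
    existing nodes chosen with probability proportional to their degree."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    endpoints = []              # node i appears here k_i times
    targets = list(range(m))    # assumed seed: nodes 0..m-1, linked by node m
    for new_node in range(m, n):
        for t in targets:
            edges.append((new_node, t))
            endpoints.extend((new_node, t))
        # Draw m distinct targets for the next node, degree-proportionally.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return edges

edges = preferential_attachment_edges(2000, 2, seed=7)
degree = Counter(node for edge in edges for node in edge)
print("max degree:", max(degree.values()))
print("mean degree:", round(2 * len(edges) / len(degree), 2))
```

In a typical run the maximum degree is far above the mean, in contrast to the Erdős-Rényi case where the two stay close.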
Random Vs Preferential
Properties of the BA graph
● The distribution is scale free with exponent α = 3

$P(k) = 2m^2 / k^3$

● The graph is connected

○ Every new vertex is born with a link or several links (depending on whether m = 1 or m > 1)
○ It then connects to an “older” vertex, which itself connected to another vertex when it was
introduced
○ And we started from a connected core
● The older are richer
○ Nodes accumulate links as time goes on, which gives older nodes an advantage since newer
nodes are going to attach preferentially – and older nodes have a higher degree to tempt them
with than some new kid on the block
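As a consistency check on the degree distribution, reading the formula above as $P(k) = 2m^2/k^3$ and treating k as continuous with minimum degree m (both assumptions of this sketch), the density integrates to 1 and gives a simple expression for the fraction of nodes with degree at least k.

```latex
\begin{align}
\int_m^\infty \frac{2m^2}{k^3}\,dk &= 2m^2\left[-\frac{1}{2k^2}\right]_m^\infty = 1 \\
P(K \ge k) &= \int_k^\infty \frac{2m^2}{k'^3}\,dk' = \frac{m^2}{k^2}, \qquad k \ge m
\end{align}
```

For example, with m = 2 the continuum approximation puts about 1% of nodes at degree 20 or higher, since $2^2 / 20^2 = 0.01$.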
Visualization
Summary: growth models
● Most networks aren't 'born', they are made
● Nodes being added over time means that older nodes can have more time to
accumulate edges
● Preference for attaching to 'popular' nodes further skews the degree
distribution toward a power-law
