
Social Network Analysis

Eytan Adar
590AI

Some content from Lada Adamic


Vocabulary Lesson

• Actor: Person, Group, Event, …
• Relational Tie: parentOf, supervisorOf, reallyHates (+/-), …
• Dyad: a pair of actors and the tie between them
• Relation: collection of ties of a specific type (every parentOf tie)
Vocabulary Lesson
If A likes B and B likes C then A likes C (transitivity)
If A likes B and C likes B then A likes C

Triad
Vocabulary Lesson

Social Network: One mode
Vocabulary Lesson

Social Network: Two mode
Vocabulary Lesson

Ego-Centered Network (egonet), built around one actor, the ego
Describing Networks
• Graph theoretic
– Nodes/edges, what you’d expect
• Sociometric
– Sociomatrix (2D matrix representation, i.e. the adjacency matrix)
– Sociogram (the node-and-line drawing of the graph)
• Algebraic
– $n_i \to n_j$
– Also what you’d expect
• Basically complementary
Describing Networks
[Figure: example networks labeled MIT and Stanford]
Describing Networks
• Geodesic
– shortest_path(n,m)
• Diameter
– max(geodesic(n,m)) over all pairs of actors n, m in the graph
• Density
– Number of existing edges / All possible edges
• Degree distribution
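The descriptive measures above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming an undirected, connected graph stored as a dict of neighbor sets; the function names and the toy graph are hypothetical, not from the slides.

```python
from collections import deque


def geodesic_lengths(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest geodesic over all pairs (assumes the graph is connected)."""
    return max(max(geodesic_lengths(adj, n).values()) for n in adj)


def density(adj):
    """Existing edges / all possible edges, for an undirected simple graph."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return edges / (n * (n - 1) / 2)


def degree_distribution(adj):
    """Map each degree value to the number of nodes that have it."""
    counts = {}
    for nbrs in adj.values():
        counts[len(nbrs)] = counts.get(len(nbrs), 0) + 1
    return counts


# Hypothetical toy graph: the path a-b-c-d plus the extra edge b-d.
toy = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b", "d"}, "d": {"b", "c"}}
```

On the toy graph, `geodesic_lengths(toy, "a")` gives the shortest-path length from a to each actor, and `density(toy)` compares its 4 edges against the 6 possible ones.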
Types of Networks/Models
• A few quick examples
– Erdős–Rényi
• G(n,M): randomly draw M edges between n nodes
• Does not really model the real world
– Average connectivity on nodes conserved
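A rough sketch of the G(n,M) draw, assuming an undirected simple graph (no self-edges, no repeated edges). The function name and the `seed` parameter are illustrative choices, not part of the slides.

```python
import random


def gnm_random_graph(n, m, seed=None):
    """Draw m distinct undirected edges uniformly at random among n nodes."""
    rng = random.Random(seed)
    # Every unordered pair (i, j) with i < j is a candidate edge.
    possible = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if m > len(possible):
        raise ValueError("m exceeds the number of possible edges")
    return rng.sample(possible, m)
```

Because exactly M edges are drawn, the average degree is fixed at 2M/n regardless of which edges come up.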
Types of Networks/Models
• A few quick examples
– Erdős–Rényi
– Small World
• Watts-Strogatz
• Kleinberg lattice model
Small world experiments then

[Figure: map with MA and NE marked]

Milgram’s experiment (1960’s):

Given a target individual and a particular property, pass the message to a person
you correspond with who is “closest” to the target.
Watts-Strogatz Ring Lattice Rewiring
• Select a fraction p of edges and reposition one of their endpoints
• Variant: add a fraction p of additional edges, leaving the underlying lattice intact
• As in many network generating algorithms
– Disallow self-edges
– Disallow multiple edges
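The rewiring variant could look roughly like this. It is a sketch under stated assumptions: nodes are integers 0..n-1, edges are stored as (small, large) pairs, and each edge is independently picked with probability p (so the rewired fraction is p in expectation). The function names are hypothetical.

```python
import random


def ring_lattice(n, k):
    """Each node links to its k nearest ring neighbors (k even), k/2 per side."""
    edges = set()
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            edges.add((min(i, j), max(i, j)))
    return edges


def rewire(edges, n, p, seed=None):
    """Reposition one endpoint of each edge with probability p."""
    rng = random.Random(seed)
    result = set(edges)
    for u, v in sorted(edges):
        if rng.random() >= p:
            continue
        # Disallow self-edges (w != u) and multiple edges (pair already present).
        candidates = [
            w for w in range(n)
            if w != u and (min(u, w), max(u, w)) not in result
        ]
        if not candidates:
            continue
        w = rng.choice(candidates)
        result.remove((u, v))
        result.add((min(u, w), max(u, w)))
    return result
```

The number of edges is conserved: every rewired edge is removed once and added back once with a new endpoint.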
Geographical search

[Figure: map with MA and NE marked]
Kleinberg Lattice Model

Nodes are placed on a lattice and connect to their nearest neighbors

Additional links are placed with probability $p_{uv} \sim d_{uv}^{-r}$

Kleinberg, ‘The Small World Phenomenon, An Algorithmic Perspective’ (Nature 2000)
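One way to picture the long-range links: for a node u, weight every other lattice node v by its distance raised to the power -r and sample from those weights. The sketch below assumes a square grid with Manhattan distance; the grid size, function names, and the use of `random.choices` are illustrative assumptions.

```python
import random


def lattice_distance(u, v):
    """Manhattan distance between two grid positions (x, y)."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def long_range_contact(u, side, r, rng):
    """Sample one long-range contact for u with probability ~ d(u, v) ** -r."""
    others = [(x, y) for x in range(side) for y in range(side) if (x, y) != u]
    weights = [lattice_distance(u, v) ** -r for v in others]
    return rng.choices(others, weights=weights, k=1)[0]
```

With r = 0 every node is equally likely; larger r concentrates the extra links on nearby nodes.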
A little more on degree distribution
• Power laws, Zipf, etc.
[Figure: distribution of users among web sites; CDF of users to sites, with sites ranked by popularity]


A little more on degree distribution
• Pareto/Power-law
– Pareto: CDF $P[X > x] \sim x^{-k}$
– Power-law: PDF $P[X = x] \sim x^{-(k+1)} = x^{-a}$
– Some recent debate (Aaron Clauset)
• http://arxiv.org/abs/0706.1062
• Zipf
– Frequency versus rank $y \sim r^{-b}$ (small b)
• More info:
– Zipf, Power-laws, and Pareto – a ranking tutorial (http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html)
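A short derivation of how the three exponents relate, written in the slide's symbols (k, a, b). It assumes a continuous variable and ignores normalizing constants.

```latex
% Pareto (cumulative) to power-law (density): differentiate the tail.
P[X > x] \sim x^{-k}
\quad\Longrightarrow\quad
P[X = x] = -\frac{d}{dx} P[X > x] \sim k\, x^{-(k+1)} = k\, x^{-a},
\qquad a = k + 1 .

% Pareto to Zipf: the item of rank r has size y such that
% r items are at least as large, so r is proportional to P[X > y].
r \sim y^{-k}
\quad\Longrightarrow\quad
y \sim r^{-1/k} = r^{-b},
\qquad b = \frac{1}{k} .
```

So the Zipf plot is the Pareto plot with its axes swapped, and the power-law exponent is one more than the Pareto exponent.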
Types of Networks/Models
• A few quick examples
– Erdős–Rényi
– Small World
• Watts-Strogatz
• Kleinberg lattice model
– Preferential Attachment
• Generally attributed to Barabási & Albert
Basic BA-model
• Very simple algorithm to implement
– Start with an initial set of m0 fully connected nodes
• e.g. m0 = 3
– Now add new vertices one by one, each one with exactly m edges
– Each new edge connects to an existing vertex with probability proportional to the number of edges that vertex already has → preferential attachment
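The steps above can be sketched as follows. The trick of keeping a list with one entry per edge endpoint is a common implementation shortcut, assumed here rather than taken from the slides; it makes a uniform draw from the list degree-proportional. The sketch also assumes m ≤ m0 and m0 ≥ 2, and avoids multiple edges by drawing m distinct targets.

```python
import random


def barabasi_albert(n, m, m0=3, seed=None):
    """Grow a graph to n nodes; each new node attaches m edges preferentially."""
    if not (2 <= m0 and 1 <= m <= m0 <= n):
        raise ValueError("need 2 <= m0, 1 <= m <= m0 <= n")
    rng = random.Random(seed)
    # Fully connected core of m0 nodes.
    edges = [(i, j) for i in range(m0) for j in range(i + 1, m0)]
    # A node appears in `endpoints` once per edge it has, so a uniform
    # choice from this list picks a node in proportion to its degree.
    endpoints = [node for edge in edges for node in edge]
    for new in range(m0, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        for old in chosen:
            edges.append((old, new))
            endpoints.extend((old, new))
    return edges
```

A graph grown this way has m0(m0-1)/2 core edges plus m edges for each added node.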
Properties of the BA graph
• The degree distribution is scale-free with exponent a = 3: $P(k) = 2m^2/k^3$
• The graph is connected
– Every new vertex is born with a link or several links (depending on
whether m = 1 or m > 1)
– It then connects to an ‘older’ vertex, which itself connected to another
vertex when it was introduced
– And we started from a connected core
• The older are richer
– Nodes accumulate links as time goes on, which gives older nodes an
advantage since newer nodes are going to attach preferentially – and
older nodes have a higher degree to tempt them with than some new
kid on the block
Common Tasks
• Measuring “importance”
– Centrality, prestige
• Link prediction
• Diffusion modeling
– Epidemiological
• Clustering
– Blockmodeling, Girvan-Newman
• Structure analysis
– Motifs, Isomorphisms, etc.
• Visualization/Privacy/etc.
Past
• Data Collection / Cleaning
– Small datasets
– Pretty explicit connections
• Analysis
– Understand the properties
Present
• Data Collection / Cleaning
– Large datasets
– Entity resolution
– Implicit connections
• Analysis
– Understand the properties
Common Tasks
• Measuring “importance”
– Centrality, prestige (incoming links)
• Link prediction
• Diffusion modeling
– Epidemiological
• Clustering
– Blockmodeling, Girvan-Newman
• Structure analysis
– Motifs, Isomorphisms, etc.
• Visualization/Privacy/etc.
Centrality Measures
• Degree centrality
– Edges per node (the more, the more important the node)
• Closeness centrality
– How close the node is to every other node
• Betweenness centrality
– How many shortest paths go through the node (communication metaphor)
• Information centrality
– All paths to other nodes weighted by path length
• Bibliometric + Internet style
– PageRank
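Two of these measures, degree and closeness, are easy to sketch directly. The version below assumes an undirected, connected graph stored as a dict of neighbor sets and uses the usual normalizations (divide by n-1; closeness as n-1 over the sum of distances), which are assumptions since the slides do not fix a formula.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_centrality(adj):
    """Edges per node, scaled by the maximum possible degree n - 1."""
    n = len(adj)
    return {v: len(adj[v]) / (n - 1) for v in adj}


def closeness_centrality(adj):
    """(n - 1) divided by the total distance to all other nodes."""
    n = len(adj)
    result = {}
    for v in adj:
        total = sum(bfs_distances(adj, v).values())
        result[v] = (n - 1) / total if total else 0.0
    return result


# Hypothetical example: a star with center 0 and leaves 1..4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
```

In the star, the center scores highest on both measures, which matches the intuition that it is the most "important" actor.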
Tie Strength
• Strength of Weak Ties (Granovetter)
– Granovetter: How often did you see the contact who helped you find the job, prior to the job search?
• 16.7% often (at least once a week)
• 55.6% occasionally (more than once a year but less than twice a week)
• 27.8% rarely (once a year or less)
– Weak ties will tend to have different information than we and our close contacts do
– Weak ties will tend to have high betweenness and low transitivity
Link Prediction

[Figure: network with a candidate link marked “?”]
Link Prediction in Social Net Data
• We know things about structure
– Homophily = like likes like, or birds of a feather flock together, or similar people group together
– Mutuality
– Triad closure

• Various measures that try to use this


Link Prediction
• Simple metrics
– Only take into account graph properties

$$\text{score}(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log |\Gamma(z)|}$$

Γ(x) = neighbors of x
Originally: 1 / log(frequency(z))

Liben-Nowell, Kleinberg (CIKM’03)
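This score (the Adamic/Adar measure in Liben-Nowell and Kleinberg's survey) sums over the common neighbors of x and y, down-weighting neighbors that are connected to many others. A minimal sketch, assuming an undirected graph as a dict of neighbor sets; the toy graph is hypothetical.

```python
import math


def adamic_adar(adj, x, y):
    """Sum of 1 / log(degree) over the common neighbors of x and y."""
    common = adj[x] & adj[y]
    # A common neighbor of two distinct nodes has degree >= 2, so log > 0.
    return sum(1.0 / math.log(len(adj[z])) for z in common)


# Hypothetical toy graph: x and y share neighbors a (degree 2) and b (degree 3).
toy_links = {
    "x": {"a", "b"},
    "y": {"a", "b"},
    "a": {"x", "y"},
    "b": {"x", "y", "c"},
    "c": {"b"},
}
```

Here `adamic_adar(toy_links, "x", "y")` adds 1/log 2 for a and 1/log 3 for b, so the rarer shared neighbor counts for more.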


Link Prediction
• Simple metrics
– Only take into account graph properties

$$\sum_{l=1}^{\infty} \beta^{l} \, \left| \text{paths}^{\langle l \rangle}_{x,y} \right|$$

$\text{paths}^{\langle l \rangle}_{x,y}$ = paths of length l (generally 1) from x to y
Weighted variant is the number of times the two collaborated
Liben-Nowell, Kleinberg (CIKM’03)
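Liben-Nowell and Kleinberg attribute this path-weighted sum to Katz. A rough way to compute it is to truncate the sum at a maximum length and count paths by repeated neighbor expansion. Note the assumptions: the sketch counts walks (nodes may repeat), stops at `max_len`, and uses an arbitrary default `beta`; none of these values come from the slides.

```python
def katz_score(adj, x, y, beta=0.05, max_len=5):
    """Truncated Katz score: sum over l of beta**l * (#walks of length l)."""
    # walks[v] = number of walks of the current length from x to v.
    walks = {v: 0 for v in adj}
    walks[x] = 1
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {v: 0 for v in adj}
        for u, count in walks.items():
            if count:
                for v in adj[u]:
                    nxt[v] += count
        walks = nxt
        score += (beta ** length) * walks[y]
    return score


# Hypothetical toy graph: the path a-b-c.
path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

Small beta makes short paths dominate, so the score behaves much like a common-neighbors count.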
Link Prediction in Relational Data
• We know things about structure
– Homophily = like likes like, or birds of a feather flock together, or similar people group together
– Mutuality
– Triad closure

• Slightly more interesting problem if we have relational data on actors and ties
– Move beyond structure
Relationship & Link Prediction

[Figure: a candidate “advisorOf?” tie between actors with attributes such as employee/contractor, salary, and time at company]

Link/Label Prediction in Relational Data
• Koller and co.
– Relational Bayesian Networks
– Relational Markov Networks
• Structure (subgraph templates/cliques)
– Similar context
– Transitivity

• Getoor and co.


– Relationship Identification for Social Network Discovery
• Diehl/Namata/Getoor AAAI’07
– Enron data
• Traffic statistics and content to find supervisory relationships?
– Traffic/Text based
– Not really identification, more like ranking
Epidemiological
• Viruses
– Biological, computational
– STDs, needle sharing, etc.
– Mark Handcock at UW
• Blog networks
– Applying SIR models (Info Diffusion Through Blogspace, Gruhl et al.)
• Induce transmission graph, cascade models, simulation
– Link prediction (Tracking Information Epidemics in Blogspace, Adar et
al.)
• Find repeated “likely” infections
– Outbreak detection (Cost-effective Outbreak Detection in Networks,
Leskovec et al.)
• Submodularity
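For intuition about SIR models on a network, here is a small discrete-time simulation. It is a generic sketch, not the method of any of the papers listed: the per-contact infection probability, the per-step recovery probability, and the synchronous update are all simplifying assumptions.

```python
import random


def simulate_sir(adj, seeds, p_infect, p_recover, steps, seed=None):
    """Run a discrete-time SIR process; return final states and S/I/R counts."""
    rng = random.Random(seed)
    state = {v: "S" for v in adj}
    for v in seeds:
        state[v] = "I"
    history = []
    for _ in range(steps):
        infected = [v for v in adj if state[v] == "I"]
        newly_infected = set()
        for u in infected:
            # Sorted so that a fixed seed gives a reproducible run.
            for v in sorted(adj[u]):
                if state[v] == "S" and rng.random() < p_infect:
                    newly_infected.add(v)
        for u in infected:
            if rng.random() < p_recover:
                state[u] = "R"
        for v in newly_infected:
            state[v] = "I"
        history.append(
            {s: sum(1 for x in state.values() if x == s) for s in "SIR"}
        )
    return state, history


# Hypothetical toy network: the path 0-1-2-3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

Recording who infected whom inside the inner loop would give the induced transmission graph (cascade) mentioned above.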
[Figure: sociogram with actors Domingo, Carlos, Alejandro, Eduardo, Frank, Hal, Karl, Bob, Ike, Gill, Lanny, Mike, John, Xavier, Utrecht, Norm, Russ, Quint, Wendle, Ozzie, Ted, Sam, Vern, Paul]
Blockmodels
• Actors are partitioned into positions
– Rearrange rows/columns
• The sociomatrix is then reduced to a smaller
image
• Hierarchical clustering
– Various distance metrics
• Euclidean, CONvergence of CORrelation (CONCOR)
• Various “fit” metrics
[Figure: image matrices for cohesion, center-periphery, and ranking]
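To make the reduction to an image matrix concrete: once actors are assigned to positions, each block of the sociomatrix can be summarized by its density and thresholded to 0 or 1. The sketch below assumes the partition is already given (it does not perform CONCOR or any clustering), and the 0.5 threshold is an arbitrary illustrative choice.

```python
def block_densities(sociomatrix, positions):
    """Density of ties from each position to each position (diagonal ignored)."""
    labels = sorted(set(positions))
    members = {
        p: [i for i, q in enumerate(positions) if q == p] for p in labels
    }
    dens = {}
    for p in labels:
        for q in labels:
            cells = [(i, j) for i in members[p] for j in members[q] if i != j]
            ties = sum(sociomatrix[i][j] for i, j in cells)
            dens[(p, q)] = ties / len(cells) if cells else 0.0
    return dens


def image_matrix(sociomatrix, positions, threshold=0.5):
    """Reduce the sociomatrix to one 0/1 cell per pair of positions."""
    labels = sorted(set(positions))
    dens = block_densities(sociomatrix, positions)
    return [
        [1 if dens[(p, q)] >= threshold else 0 for q in labels]
        for p in labels
    ]


# Hypothetical example: two cohesive pairs with no ties between them.
ties = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
```

With positions `[0, 0, 1, 1]` the image has ones only on the diagonal, the cohesion pattern.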
Girvan-Newman Algorithm
• Split on shortest paths (“weak ties”)

1. Calculate betweenness on all edges
2. Remove the highest-betweenness edge
3. Recalculate betweenness on the remaining edges
4. Repeat from step 2
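The loop above can be sketched in plain Python. Edge betweenness is computed with a Brandes-style accumulation, which is one standard way to do it and an assumption here. The stopping rule (stop once a target number of communities is reached) is also an illustrative choice, since the slide does not give one. The graph is assumed undirected with no self-edges.

```python
from collections import deque


def edge_betweenness(adj):
    """Number of shortest paths crossing each edge (fractional when tied)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, order = {s: 0}, []
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in adj}
        for v in reversed(order):     # farthest nodes first
            for u in preds[v]:
                share = sigma[u] / sigma[v] * (1.0 + delta[v])
                score[frozenset((u, v))] += share
                delta[u] += share
    # Every pair was counted from both of its endpoints.
    return {e: b / 2.0 for e, b in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman(adj, target_communities=2):
    """Remove highest-betweenness edges until the graph splits."""
    adj = {u: set(vs) for u, vs in adj.items()}   # work on a copy
    while len(components(adj)) < target_communities:
        scores = edge_betweenness(adj)
        if not scores:
            break
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Hypothetical example: two triangles joined by the bridge 2-3.
barbell = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
```

In the example, all nine cross-triangle shortest paths use the bridge, so it is removed first and the two triangles fall out as communities.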
Other solutions
• Min-cut based
• “Voltage” based
• Hierarchical schemes
Network motif detection
• How many more motifs of a certain type exist than in a random network
• Started in biological networks
– http://www.weizmann.ac.il/mcb/UriAlon/
Basic idea
• construct many random graphs with the same
number of nodes and edges (same node
degree distribution?)
• count the number of motifs in those graphs
• calculate the Z score: how many standard deviations the motif count in the real-world network lies from the mean count in the random graphs, i.e. how unlikely it is to have occurred by chance
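A compact sketch of the idea, using the triangle as the motif. Important assumption: the random graphs here only match the number of nodes and edges (a G(n,M) draw); they do not preserve the degree distribution, which the slide flags as an open question. Function names and the number of trials are illustrative.

```python
import random
import statistics


def count_triangles(edges):
    """Count triangles in an undirected simple graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is found once from each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def random_edges(n, m, rng):
    """Random graph with the same number of nodes and edges."""
    possible = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return rng.sample(possible, m)


def motif_z_score(n, edges, trials=200, seed=None):
    """(observed count - mean random count) / std of random counts."""
    rng = random.Random(seed)
    observed = count_triangles(edges)
    counts = [
        count_triangles(random_edges(n, len(edges), rng))
        for _ in range(trials)
    ]
    mean = statistics.mean(counts)
    std = statistics.pstdev(counts)
    return (observed - mean) / std if std > 0 else float("nan")
```

A large positive Z score suggests the motif is over-represented relative to this particular random model; a different null model can give a different answer.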
Generating random graphs
• Many models don’t preserve the desired
features
• Have to be careful how we generate
Other Structural Analysis

[Figure: pattern of sisterOf, daughterOf, and nieceOf ties, captioned “Find all”]
Privacy
• Emerging interest in anonymizing networks
– Lars Backstrom (WWW’07) demonstrated one of
the first attacks
• How to remove labels while preserving graph
properties?
– While ensuring that labels cannot be reapplied
Network attacks
• Terrorist networks
– How to attack them
– How they might attack us
• Carley at CMU
Software
• Pajek
– http://vlado.fmf.uni-lj.si/pub/networks/pajek/
• UCINET
– http://www.analytictech.com/
• KrackPlot
– http://www.andrew.cmu.edu/user/krack/krackplot.shtml
• GUESS
– http://www.graphexploration.org
• Etc.
Books/Journals/Conferences
• Social Networks/Phys. Rev.
• Social Network Analysis (Wasserman + Faust)
• The Development of Social Network Analysis (Freeman)
• Linked (Barabási)
• Six Degrees (Watts)
• Sunbelt/ICWSM/KDD/CIKM/NIPS
Questions?
Assortativity
• Social networks are assortative:
– the gregarious people associate with other gregarious people
– the loners associate with other loners
• The Internet is disassortative
[Figure: three panels. Disassortative: hubs are in the periphery. Random. Assortative: hubs connect to hubs.]
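One common way to quantify this is the Pearson correlation between the degrees at the two ends of each edge: positive for assortative networks, negative for disassortative ones. The slides do not name a formula, so treat this as an assumed, illustrative definition.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of each edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count each undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var if var > 0 else float("nan")


# Hypothetical example: a star, where the hub only connects to low-degree leaves.
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
```

The star is the extreme disassortative case: its single hub is tied only to degree-1 nodes.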
