Social
Eytan Adar
590AI
Actor: Person, Group, Event, …
Relational Tie: parentOf, supervisorOf, reallyHates (+/-), …
Dyad
Relation: collection of ties of a specific type
(every parentOf tie)
Vocabulary Lesson
If A likes B and B likes C then A likes C (transitivity)
If A likes B and C likes B then A likes C
…
Triad
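The transitivity rule above can be checked mechanically on a set of directed ties. This is a minimal sketch, not a standard library routine: the function name `transitive_triples` and the example `likes` ties are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def transitive_triples(likes):
    """Split ordered triples (a, b, c) with a->b and b->c by whether a->c holds."""
    nodes = {n for tie in likes for n in tie}
    closed, open_ = [], []
    for a, b, c in permutations(nodes, 3):
        if (a, b) in likes and (b, c) in likes:
            # The triple is transitive only if the "shortcut" tie a->c exists.
            (closed if (a, c) in likes else open_).append((a, b, c))
    return sorted(closed), sorted(open_)


# Hypothetical directed "likes" ties (not from the slides).
likes = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")}
closed, open_ = transitive_triples(likes)
print("transitive:", closed)
print("not yet transitive:", open_)
```

Triples in the second list are candidates for closure: if the rule held everywhere, the missing tie would eventually appear.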
Vocabulary Lesson
Social Network: one mode
Vocabulary Lesson
Social Network: two mode
Vocabulary Lesson
Ego-Centered Network (egonet)
ego
Describing Networks
• Graph theoretic
– Nodes/edges, what you’d expect
• Sociometric
– Sociomatrix (2D matrix representation, i.e. the adjacency matrix)
– Sociogram (the drawing of the graph)
• Algebraic
– n_i → n_j
– Also what you’d expect
• Basically complementary
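To show how the graph and matrix descriptions carry the same information, here is a small sketch that builds the 2D matrix representation from a list of ties. The function name `sociomatrix` and the actor labels `n1`, `n2`, `n3` are assumptions made for this example.

```python
def sociomatrix(actors, ties, directed=True):
    """Return a 0/1 matrix x where x[i][j] == 1 means actor i has a tie to actor j."""
    index = {actor: i for i, actor in enumerate(actors)}
    x = [[0] * len(actors) for _ in actors]
    for sender, receiver in ties:
        i, j = index[sender], index[receiver]
        x[i][j] = 1
        if not directed:
            x[j][i] = 1  # undirected ties make the matrix symmetric
    return x


# Hypothetical actors and ties.
actors = ["n1", "n2", "n3"]
ties = [("n1", "n2"), ("n2", "n3")]
X = sociomatrix(actors, ties)
for row in X:
    print(row)
```

Each row is a sender and each column a receiver, so the edge list, the matrix, and the algebraic notation are interchangeable.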
Describing Networks
[Figure labels: MIT, Stanford]
Describing Networks
• Geodesic
– shortest_path(n,m)
• Diameter
– max(geodesic(n,m)) over all pairs of actors n, m in the graph
• Density
– Number of existing edges / All possible edges
• Degree distribution
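The four descriptors above can be computed directly on a small undirected graph. This is a sketch under stated assumptions: the graph is connected, unweighted, and stored as a dict of neighbor sets; the function names and the example graph are illustrative only.

```python
from collections import Counter, deque


def geodesic_lengths(adj, source):
    """Breadth-first search: shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest geodesic over all pairs (assumes the graph is connected)."""
    return max(max(geodesic_lengths(adj, n).values()) for n in adj)


def density(adj):
    """Existing edges divided by all possible edges in an undirected graph."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    return edges / (n * (n - 1) / 2)


def degree_distribution(adj):
    """Map each degree value to the number of nodes with that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


# Hypothetical example: a path a - b - c - d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(geodesic_lengths(adj, "a")["d"], diameter(adj), density(adj))
print(dict(degree_distribution(adj)))
```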
Types of Networks/Models
• A few quick examples
– Erdős–Rényi
• G(n,M): randomly draw M edges between n nodes
• Does not really model the real world
– Average connectivity on nodes conserved
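A minimal sketch of the G(n,M) model described above: draw M distinct edges uniformly at random from all possible node pairs. The function name `gnm_random_graph` and the `seed` parameter are assumptions for this example.

```python
import random
from itertools import combinations


def gnm_random_graph(n, m, seed=None):
    """Sample m distinct undirected edges uniformly among n nodes labeled 0..n-1."""
    rng = random.Random(seed)
    possible = list(combinations(range(n), 2))
    if m > len(possible):
        raise ValueError("m exceeds the number of possible edges")
    return rng.sample(possible, m)


edges = gnm_random_graph(10, 15, seed=0)
# With M fixed, the mean degree is always 2M/n regardless of which edges are drawn.
mean_degree = 2 * len(edges) / 10
print(len(edges), mean_degree)
```

Enumerating all pairs is fine for small n; for large graphs one would sample pairs directly and reject repeats.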
Types of Networks/Models
• A few quick examples
– Erdős–Rényi
– Small World
• Watts-Strogatz
• Kleinberg lattice model
Small world experiments then
[Figure labels: MA, NE]
Given a target individual and a particular property, pass the message to a person
you correspond with who is “closest” to the target.
Watts-Strogatz Ring Lattice Rewiring
Select a fraction p of edges and reposition one of their endpoints
or
Add a fraction p of additional edges, leaving the underlying lattice intact
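The rewiring variant can be sketched as follows. Assumptions to note: each edge is selected independently with probability p (so the selected fraction is p only in expectation), the moved endpoint is chosen uniformly among nodes that would not create a self-loop or duplicate edge, and the names `ring_lattice` and `rewire` are hypothetical.

```python
import random


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors (k even, k < n)."""
    edges = set()
    for u in range(n):
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            edges.add((min(u, v), max(u, v)))
    return edges


def rewire(edges, n, p, seed=None):
    """With probability p per edge, move one endpoint to a random new node."""
    rng = random.Random(seed)
    result = set(edges)
    for u, v in sorted(edges):
        if rng.random() < p:
            # Keep endpoint u; pick a new partner that avoids loops and duplicates.
            candidates = [w for w in range(n)
                          if w != u and (min(u, w), max(u, w)) not in result]
            if candidates:
                w = rng.choice(candidates)
                result.remove((u, v))
                result.add((min(u, w), max(u, w)))
    return result


lattice = ring_lattice(20, 4)
small_world = rewire(lattice, 20, p=0.1, seed=1)
print(len(lattice), len(small_world), len(small_world - lattice))
```

Rewiring keeps the number of edges fixed; the alternative of adding shortcut edges would instead grow the edge count while leaving the lattice untouched.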
[Figure labels: MA, NE]
Kleinberg Lattice Model
additional links placed with $p_{uv} \sim d_{uv}^{-r}$
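A hedged sketch of how one long-range link might be drawn in this model, assuming the rule is that the probability of linking u to v is proportional to d_uv raised to the power -r, with d the Manhattan distance on a square grid. The function names and the grid size are assumptions for illustration.

```python
import random


def lattice_distance(u, v):
    """Manhattan distance between two grid cells given as (row, col)."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def long_range_contact(u, side, r, rng):
    """Pick one long-range contact for u with probability proportional to d(u, v) ** -r."""
    others = [(i, j) for i in range(side) for j in range(side) if (i, j) != u]
    weights = [lattice_distance(u, v) ** -r for v in others]
    return rng.choices(others, weights=weights, k=1)[0]


rng = random.Random(0)
u = (2, 2)
contacts = [long_range_contact(u, side=5, r=2, rng=rng) for _ in range(5)]
print(contacts)
```

With r = 0 every other node is equally likely; larger r concentrates the extra links on nearby nodes.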
– now add new vertices one by one, each one with exactly m
edges
– each new edge connects to an existing vertex in proportion
to the number of edges that vertex already has →
preferential attachment
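The growth step above can be sketched in a few lines. Assumptions: the connected core is a complete graph on m + 1 nodes, and each new vertex picks m distinct targets; the function name `preferential_attachment` is hypothetical.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches m edges, favoring high-degree nodes."""
    rng = random.Random(seed)
    # Assumed connected core: a complete graph on the first m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Each node appears once per incident edge, so a uniform draw from this
    # list selects a node with probability proportional to its degree.
    endpoints = [x for edge in edges for x in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


edges = preferential_attachment(200, 2, seed=0)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print(len(edges), max(degree.values()), min(degree.values()))
```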
Properties of the BA graph
• The distribution is scale free with exponent α = 3: $P(k) = 2m^2 / k^3$
• The graph is connected
– Every new vertex is born with a link or several links (depending on
whether m = 1 or m > 1)
– It then connects to an ‘older’ vertex, which itself connected to another
vertex when it was introduced
– And we started from a connected core
• The older are richer
– Nodes accumulate links as time goes on, which gives older nodes an
advantage since newer nodes are going to attach preferentially – and
older nodes have a higher degree to tempt them with than some new
kid on the block
Common Tasks
• Measuring “importance”
– Centrality, prestige
• Link prediction
• Diffusion modeling
– Epidemiological
• Clustering
– Blockmodeling, Girvan-Newman
• Structure analysis
– Motifs, Isomorphisms, etc.
• Visualization/Privacy/etc.
Past
Data Collection / Cleaning
• Small datasets
• Pretty explicit connections
Analysis
• Understand the properties
Present
Data Collection / Cleaning
• Large datasets
• Entity resolution
• Implicit connections
Analysis
Common Tasks
• Measuring “importance”
– Centrality, prestige (incoming links)
• Link prediction
• Diffusion modeling
– Epidemiological
• Clustering
– Blockmodeling, Girvan-Newman
• Structure analysis
– Motifs, Isomorphisms, etc.
• Visualization/Privacy/etc.
Centrality Measures
• Degree centrality
– Edges per node (the more, the more important the node)
• Closeness centrality
– How close the node is to every other node
• Betweenness centrality
– How many shortest paths go through the node (or edge)
(communication metaphor)
• Information centrality
– All paths to other nodes weighted by path length
• Bibliometric + Internet style
– PageRank
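Three of the measures above are short enough to sketch directly; betweenness and information centrality are left out to keep the example brief. Assumptions: the graph is connected and undirected, every node has at least one neighbor, the normalizations are one common choice among several, and PageRank uses a damping factor of 0.85 with a fixed number of iterations.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly tied to."""
    n = len(adj)
    return {v: len(adj[v]) / (n - 1) for v in adj}


def closeness_centrality(adj):
    """(n - 1) divided by the sum of geodesic distances to all other nodes."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[s] = (len(adj) - 1) / sum(dist.values())
    return result


def pagerank(adj, damping=0.85, iterations=100):
    """Power iteration; assumes every node has at least one neighbor."""
    n = len(adj)
    rank = {v: 1 / n for v in adj}
    for _ in range(iterations):
        new = {v: (1 - damping) / n for v in adj}
        for u in adj:
            share = damping * rank[u] / len(adj[u])
            for v in adj[u]:
                new[v] += share
        rank = new
    return rank


# Hypothetical star graph: one hub tied to three leaves.
adj = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
deg, clo, pr = degree_centrality(adj), closeness_centrality(adj), pagerank(adj)
print(deg["hub"], clo["hub"], clo["a"], pr["hub"] > pr["a"])
```

On the star graph all three measures agree that the hub is the most important node; on real networks they can rank nodes quite differently.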
Tie Strength
• Strength of Weak Ties (Granovetter)
– Granovetter: How often did you see the contact that helped you find the job prior
to the job search
• 16.7% often (at least twice a week)
• 55.6% occasionally (more than once a year but less than twice a week)
• 27.8% rarely – once a year or less
– Weak ties will tend to have different information than we and our close contacts
do
Link Prediction in Social Net Data
• We know things about structure
– Homophily = like likes like, or birds of a feather flock
together, or similar people group together
– Mutuality
– Triad closure
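$\mathrm{score}(x, y) = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log |\Gamma(z)|}$
Γ(x) = neighbors of x
Originally: 1 / log(frequency(z))
$\mathrm{score}(x, y) = \sum_{l=1}^{\infty} \beta^{l} \, |\mathrm{paths}^{\langle l \rangle}_{x,y}|$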
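Two structural scores of this kind can be sketched as follows: a common-neighbor score that weights each shared neighbor z by 1 / log |Γ(z)|, and a path-counting score that sums the number of length-l paths between x and y, damped by a factor β per step. Assumptions: the graph is undirected and stored as neighbor sets, the path sum is truncated at `max_len` rather than run to infinity, "paths" are counted as walks (nodes may repeat), and the function names and default `beta` are illustrative.

```python
import math


def neighbor_score(adj, x, y):
    """Sum of 1 / log(degree) over neighbors shared by x and y."""
    # A shared neighbor is tied to both x and y, so its degree is at least 2
    # and the logarithm is strictly positive.
    return sum(1 / math.log(len(adj[z])) for z in adj[x] & adj[y])


def path_score(adj, x, y, beta=0.05, max_len=5):
    """Truncated sum over l of beta**l times the number of length-l walks x -> y."""
    walks = {v: 0 for v in adj}
    walks[x] = 1  # one walk of length 0, from x to itself
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {v: 0 for v in adj}
        for u, count in walks.items():
            if count:
                for v in adj[u]:
                    nxt[v] += count
        walks = nxt
        score += beta ** length * walks[y]
    return score


# Hypothetical graph: x and y share neighbors a and b; a also knows c.
adj = {
    "x": {"a", "b"}, "y": {"a", "b"},
    "a": {"x", "y", "c"}, "b": {"x", "y"}, "c": {"a"},
}
print(neighbor_score(adj, "x", "y"))
print(path_score(adj, "x", "y", beta=0.5, max_len=2))
```

In the first score, the low-degree neighbor b contributes more than the better-connected a, which matches the intuition that a rare shared contact is stronger evidence of a future tie.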
advisorOf?
Employee/contractor
Salary
Time at company
…
Link/Label Prediction in Relational Data
• Koller and co.
– Relational Bayesian Networks
– Relational Markov Networks
• Structure (subgraph templates/cliques)
– Similar context
– Transitivity
[Figure labels: sisterOf, daughterOf, nieceOf]
Find all
Privacy
• Emerging interest in anonymizing networks
– Lars Backstrom (WWW’07) demonstrated one of
the first attacks
• How to remove labels while preserving graph
properties?
– While ensuring that labels cannot be reapplied
Network attacks
• Terrorist networks
– How to attack them
– How they might attack us
• Carley at CMU
Software
• Pajek
– http://vlado.fmf.uni-lj.si/pub/networks/pajek/
• UCINET
– http://www.analytictech.com/
• KrackPlot
– http://www.andrew.cmu.edu/user/krack/krackplot.shtml
• GUESS
– http://www.graphexploration.org
• Etc.
Books/Journals/Conferences
• Social Networks/Phys. Rev.
• Social Network Analysis (Wasserman + Faust)
• The Development of Social Network Analysis
(Freeman)
• Linked (Barabási)
• Six Degrees (Watts)
• Sunbelt/ICWSM/KDD/CIKM/NIPS
Questions?
Assortativity
• Social networks are assortative:
– the gregarious people associate with other gregarious people
– the loners associate with other loners
• The Internet is disassortative:
Disassortative: hubs are in the periphery
Random
Assortative: hubs connect to hubs
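One common way to quantify this is the Pearson correlation between the degrees at the two ends of each edge: positive when hubs connect to hubs, negative when hubs connect mostly to low-degree nodes. The sketch below assumes an undirected edge list; the function name `degree_assortativity` and both example graphs are hypothetical.

```python
def degree_assortativity(edges):
    """Pearson correlation of end-point degrees over all undirected edges."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)
    return cov / var  # undefined (division by zero) if all degrees are equal


# Star: the hub touches only leaves, so the graph is disassortative.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
# Four mutually connected hubs plus a separate pair of loners: assortative.
cliques = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), ("e", "f")]
r_star = degree_assortativity(star)
r_cliques = degree_assortativity(cliques)
print(r_star, r_cliques)
```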