Visual Network Analysis
Tommaso Venturini
tommaso.venturini@sciences-po.org
Today’s menu
1. The complexity of complex networks
2. Beheading the complexity of networks
3. Visual network analysis with Gephi
Deploying
innovation
networks
Part I:
The complexity of complex networks
The bad news:
networks are complex
The power law
(Pareto’s law)
characteristic
scale distribution
scale-free
distribution
The power law Barabási, Albert-László (2002)
Linked: The New Science of Networks
Networks of scientific
papers D. J. de Solla Price, 1965
Science, 149(3683) : 510-515
The Kevin Bacon number
http://oracleofbacon.org/
The Paul Erdős
number http://www.ams.org/mathscinet/collaborationDistance.html
The Erdős-Bacon
number http://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93Bacon_number
The Erdős-Bacon
number
5 6 1
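A Bacon or Erdős number is the length of the shortest chain of collaborations between a person and Kevin Bacon or Paul Erdős, and the Erdős-Bacon number is the sum of the two. The sketch below computes such distances with a breadth-first search. The function name and the toy edge list are illustrative assumptions: the names are placeholders, not real collaborations.

```python
from collections import deque

def collaboration_distance(edges, source, target):
    """Return the number of hops between source and target, or None."""
    # Build an undirected adjacency map from (a, b) collaboration pairs.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if source not in neighbors or target not in neighbors:
        return None
    # Breadth-first search reaches nodes in order of increasing distance.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distance[node]
        for nxt in neighbors[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return None

# Hypothetical toy data (placeholder names, not real collaborations).
toy_edges = [("Bacon", "A"), ("A", "B"), ("B", "C"), ("Erdos", "C")]

bacon_number = collaboration_distance(toy_edges, "C", "Bacon")
erdos_number = collaboration_distance(toy_edges, "C", "Erdos")
erdos_bacon_number = bacon_number + erdos_number
```

In this toy graph, node "C" is three hops from "Bacon" and one hop from "Erdos".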
The complexity of complex
networks
Complex network as
rhizomes
“Unlike trees or their roots, the rhizome connects any point to any other point”
Gilles Deleuze & Felix Guattari “A Thousand Plateaus”, 1980
“The main feature of a net is that every point can be connected with every other
point, and where the connections are not yet designed, they are, however,
conceivable and designable.
A net is an unlimited territory”
Umberto Eco, “Semiotics and the Philosophy of Language”, 1986
Part II:
Beheading network
complexity
Carving more than assembling
For example:
two types of network-maps
the good ones
the (pseudo-) exhaustive ones
A (pseudo-)exhaustive map of
the Web http://internet-map.net
A good
map of the Web politicosphere.blog.lemonde.fr
Making networks
readable
A small complex network
Exploiting the power law to make
things readable
The layers of the Web
The layers of complex networks
(visibility)
impossible to miss
more or less visible
higher layer
lower layer
middle layer
almost invisible
The layers of complex networks
(connectivity)
Highly linked
locally and globally
Highly linked locally
Scarcely linked globally
higher layer
lower layer
middle layer
Scarcely linked
locally and globally
The reverse gravity of networks
(many ascending links)
higher layer
lower layer
middle layer
The reverse gravity of networks
(few descending links)
higher layer
lower layer
middle layer
Cutting above
and below
cutting
arbitrary choice
easy separation Everywhere
Nowhere
Somewhere
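The cut above and below can be sketched as a filter on in-degree: nodes linked by almost everyone ("everywhere") and nodes linked by no one ("nowhere") are dropped, and the band in between ("somewhere") is kept. The thresholds `low` and `high`, the function names and the toy edge list are assumptions for illustration. As the slide says, where to cut is an arbitrary choice that depends on the research question.

```python
from collections import Counter

def in_degrees(edges):
    """Count incoming links per node from (source, target) pairs."""
    counts = Counter()
    for source, target in edges:
        counts[target] += 1
        counts[source] += 0  # register pure sources with in-degree 0
    return counts

def cut_above_and_below(edges, low, high):
    """Keep only the nodes whose in-degree lies in [low, high]."""
    degrees = in_degrees(edges)
    return {node for node, d in degrees.items() if low <= d <= high}

# Hypothetical toy web corpus: (linking site, linked site).
toy_links = [
    ("a", "portal"), ("b", "portal"), ("c", "portal"),
    ("d", "portal"), ("e", "portal"),
    ("a", "blog"), ("b", "blog"),
    ("c", "forum"),
]

somewhere = cut_above_and_below(toy_links, low=1, high=3)
```

Here "portal" (five incoming links) is cut from above, the sites with no incoming links are cut from below, and "blog" and "forum" remain.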
Ripping the sides
Ripping
constrained choice
difficult separation
Anatomy
of a corpus
Core
Tendrils
Nebula
higher layer
lower layer
middle layer
Part III:
Visual network analysis with Gephi
Learn how
to use Gephi http://gephi.org/users/
Overview window
Gephi.org
Data laboratory window
Gephi.org
Preview window
Gephi.org
3 visual variables of analysis
1. node position – layout
2. node size – ranking
3. node color – partitions
Gephi.org
Network analysis in 6 questions
Applying a force-vector spatialization
1. What are the debates/discursive communities?
(identifying clusters of nodes)
2. Which sites are at the center of the debates/communities?
(identifying the central nodes of the network and of the clusters)
3. Which sites connect the debates/communities?
(identifying bridges between the different clusters)
Applying a ranking by in-degree/out-degree
4. Which sites are the opinion leaders of the online debate?
(identifying the authorities of the graph)
5. Which sites bring the online debate together?
(identifying the hubs of the graph)
Applying a coloring by partition
6. How are the different categories of sites distributed?
(assessing the consistency between topology and categorization)
Applying a force-vector spatialization
(ForceAtlas 2)
• LinLog mode
(maximizes the legibility of clusters)
• Prevent overlap
(enhances legibility, but distorts spatialization)
• Scaling
(increases/decreases all distances proportionally)
• Gravity
(pulls everything towards the center, prevents
dispersion, but distorts spatialization)
• Approximate repulsion
(accelerates spatialization on large graphs, but
distorts spatialization)
What are the debates/discursive communities?
(identifying clusters of nodes)
HeatGraph
(use Google Chrome) http://tools.medialab.sciences-po.fr/heatgraph/
Which sites are at the center of the debates/communities?
(identifying the central nodes of the network and of the clusters)
Which sites connect the debates/communities?
(identifying bridges between the different clusters)
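In Gephi, bridges are read visually from the layout. As a rough complement, the sketch below lists the nodes that have at least one direct link into a cluster other than their own. It assumes that a cluster label per node is already available (for example, assigned by hand after looking at the layout). The function name and the toy data are illustrative assumptions, not part of the Gephi workflow.

```python
def bridging_nodes(edges, cluster_of):
    """Map each bridging node to the set of other clusters it links to."""
    foreign = {}
    for a, b in edges:
        if cluster_of[a] != cluster_of[b]:
            # The link crosses a cluster boundary: both ends act as bridges.
            foreign.setdefault(a, set()).add(cluster_of[b])
            foreign.setdefault(b, set()).add(cluster_of[a])
    return foreign

# Hypothetical toy data: two triangles joined by a single link c-d.
toy_clusters = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y", "f": "Y"}
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]

bridges = bridging_nodes(toy_edges, toy_clusters)
```

Only "c" and "d" carry the connection between the two clusters, so they are the only keys of `bridges`.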
Applying a ranking by in-degree/out-degree
Which sites are the opinion leaders of the online debate?
(identifying the authorities of the graph)
Which sites bring the online debate together?
(identifying the hubs of the graph)
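The ranking step can be sketched outside Gephi as two plain counts over a directed edge list: incoming links for authorities and outgoing links for hubs. This is a simple degree count, as used in these slides, not an iterative score such as HITS. The function name and the toy sites are illustrative assumptions.

```python
from collections import Counter

def degree_rankings(edges):
    """Return (in_degree, out_degree) counters for (source, target) pairs."""
    in_degree, out_degree = Counter(), Counter()
    for source, target in edges:
        out_degree[source] += 1  # the source cites one more site
        in_degree[target] += 1   # the target receives one more citation
    return in_degree, out_degree

# Hypothetical toy data: a directory links to three sites, which all cite "news".
toy_links = [
    ("directory", "a"), ("directory", "b"), ("directory", "c"),
    ("a", "news"), ("b", "news"), ("c", "news"),
]

in_degree, out_degree = degree_rankings(toy_links)
top_authority = in_degree.most_common(1)[0]
top_hub = out_degree.most_common(1)[0]
```

In this toy graph "news" receives the most links and "directory" sends the most, which is the pattern that sizing nodes by in-degree or out-degree makes visible.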
Applying a coloring
by partitions
How are the different categories of sites distributed?
(assessing the consistency between topology and categorization)
tommaso.venturini@sciences-po.org


Editor's notes

  • #2: 23/03/12
  • #3: 27/08/12
  • #13: And even from both of them.
  • #19: Because of the monstrous size of the Web, there are two types of maps of it. The maps that try to be exhaustive and to trace the entire Web or most of it (and fail)…
  • #20: … and the good ones.
  • #22: A good map of the Web is always limited in its ambition: it tries to represent a limited portion of the Web, and the better this portion is delimited, the better the map. The example is a very interesting map of the French political blogosphere, produced by Linkfluence (a research partner of the médialab).
  • #23: Indeed, the carving process that we just described is precisely what allows going from a pseudo-exhaustive (in fact, poorly delimited) network to a legible one.
  • #24: This is a model of a tiny web corpus. It has only some 80 nodes, and yet it looks like a plate of spaghetti (or a hairball).
  • #25: Now that we know about the power law, however, we can try to de-spaghetticize this graph. To do so, we first change the size of the nodes according to their in-degree (the number of hyperlinks that they receive).
  • #26: Secondly, we re-order the nodes on the Y axis, again according to their in-degree.
  • #27: Focusing on visibility, the higher layer contains the websites that are highly visible, appear on the first page of search engines’ results and can be easily found by anyone; the middle layer contains the websites that are less visible, appear in the following pages of search engines’ results and can only be found by experts; the lower layer contains the websites that are almost invisible, do not show up in search engines and are almost impossible to find.
  • #28: Focusing on connectivity, the higher layer contains the websites that are highly connected both locally and globally; the middle layer contains the websites that are highly connected locally but poorly connected globally; the lower layer contains the websites that are poorly connected both locally and globally.
  • #29: The three layers can also be distinguished by looking at the direction of the links. The World Wide Web is characterized by a very peculiar reverse gravity, in which the less visible websites point toward the more visible ones (thereby making them even more visible)…
  • #30: … but not the other way around.
  • #31: We want to exclude these websites because it is impossible to define where they are located. The websites that are too high in the in-degree hierarchy are connected to everyone and are therefore everywhere. The websites that are too low in the in-degree hierarchy are connected to no one and are therefore nowhere. Only the websites in the middle are somewhere, because they are connected only to some. Of course, where exactly this first cut is made depends entirely on the level of specificity of your research. This is why this cut is arbitrary and relatively easy.
  • #32: This cut is more difficult because thematic separation on the Web is, as we said, a question of density and rarefaction, so separating one theme from another is more a matter of ripping than of cutting.
  • #33: Through these two operations it is possible to delimit a thematic corpus. This corpus is composed of websites of the intermediary layer, but also of the upper and lower layers. In particular, the websites of the higher layer constitute the core of the corpus, which is surrounded by a nebula of websites of the middle layer and several tendrils in the lower layer.
  • #34: Now that we have extracted our scientometrics network from Scopus, we can analyse it with Gephi.
  • #35: Gephi is a very complex piece of software and here I will only have time for a quick introduction. However, if you want to know more about Gephi and its usage, I strongly encourage you to have a look at the documentation on Gephi’s website (http://gephi.org/users/), which is extremely well done.
  • #36: Very quickly, Gephi has three main windows. The first is the ‘Overview’, where you can manipulate and analyze your graph (and where you will spend most of your time).
  • #37: The second window is the ‘Data Laboratory’ where you have a table view of the nodes and the edges of your graph and their attributes.
  • #38: Finally, the ‘Preview’ window allows you to tweak the visualization parameters of your graph and export the result of your work as a static image (pdf, png, svg).
  • #39: Back in the ‘Overview’ window, there are three main palettes that we will employ in the analysis: 1. The ‘Layout’ palette, to change the position of the nodes 2. The ‘Ranking’ palette, to change the size of the nodes 3. The ‘Partitions’ palette, to change the color of the nodes
  • #40: As you can see, the cells of the table are colored with four different colors that indicate the four steps of the analysis: 1. Identification of clusters (layout) 2. Characterization of clusters (layout) 3. Remarkable nodes (layout & ranking) 4. Categories projection (partitions)
  • #41: To identify the clusters, the first thing to do is to spatialize the network with a force-vector algorithm, here the ForceAtlas 2 layout. This algorithm can be tweaked by changing several parameters, the most important of which are: - LinLog mode (maximizes the legibility of clusters) - Prevent overlap (enhances legibility, but distorts spatialization) - Scaling (increases/decreases all distances proportionally) - Gravity (pulls everything towards the center, prevents dispersion, but distorts spatialization) - Approximate repulsion (reduces the time required to spatialize large graphs, but distorts spatialization)
  • #42: … it is easy to identify the areas that contain few or no nodes, also called structural holes …
  • #45: - Central clusters (located in the middle of the network), because centrality in a spatialized graph is a sign of high and highly diverse connectivity. - Bridging clusters (located in between two clusters), because these clusters play a crucial role in allowing the circulation of things in the network.
  • #48: - The in-degree, corresponding to the number of incoming edges (the number of connections pointing toward the node). The in-degree of a node is also called its ‘authority score’, because receiving many connections is generally correlated with the node being considered ‘important’ or ‘remarkable’ by the other nodes of the network.
  • #49: - The out-degree, corresponding to the number of outgoing edges (the number of connections starting from the node). The out-degree of a node is also called its ‘hub score’. Hubs are important in networks because they play a crucial role in the circulation of information. Of course, in-degree and out-degree can only be computed in directed graphs (graphs in which the connections have a direction). In non-directed graphs (such as a graph of friendship, if we assume that friendship is always mutual), it is however possible to compute the degree of nodes (the number of edges connected to each node).
  • #51: But it is also interesting to observe if topology and classification are consistent (if most of the nodes of a given type are located within the same clusters and, conversely, if clusters are formed by nodes of the same type).
  • #52: If topology and classification are consistent, it is then interesting to zoom in on the exceptions and have a closer look at the nodes that have an unusual position compared to the other nodes of the same type.
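
The consistency check described in the last two notes can be sketched numerically. Assuming each node already carries a cluster label (read from the layout) and a category label (from the partition), the code below reports the dominant category and its share for each cluster, then lists the exceptions worth zooming in on. The function names, the purity measure and the toy data are assumptions for illustration. In Gephi itself this check is done visually, by coloring nodes by partition.

```python
from collections import Counter, defaultdict

def cluster_purity(cluster_of, category_of):
    """Map each cluster to (dominant category, share of its nodes)."""
    members = defaultdict(list)
    for node, cluster in cluster_of.items():
        members[cluster].append(category_of[node])
    purity = {}
    for cluster, categories in members.items():
        top_category, count = Counter(categories).most_common(1)[0]
        purity[cluster] = (top_category, count / len(categories))
    return purity

def exceptions(cluster_of, category_of):
    """List nodes whose category differs from their cluster's dominant one."""
    purity = cluster_purity(cluster_of, category_of)
    return sorted(
        node for node, cluster in cluster_of.items()
        if category_of[node] != purity[cluster][0]
    )

# Hypothetical toy data: cluster labels and site categories per node.
toy_clusters = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 2, "f": 2}
toy_categories = {
    "a": "blog", "b": "blog", "c": "blog", "d": "media",
    "e": "media", "f": "media",
}

purity = cluster_purity(toy_clusters, toy_categories)
odd_nodes = exceptions(toy_clusters, toy_categories)
```

In this toy corpus, cluster 1 is three-quarters blogs and node "d" is the exception: a media site sitting among blogs.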