What Is Network Science
© Cambridge University Press 2013
doi:10.1017/nws.2013.2
EDITORIAL
What is network science?
ULRIK BRANDES
Department of Computer and Information Science, University of Konstanz, Germany
GARRY ROBINS
Melbourne School of Psychological Sciences, University of Melbourne, Australia
ANN McCRANIE
Department of Sociology, Indiana University Bloomington, USA
STANLEY WASSERMAN
Departments of Psychological and Brain Science and Statistics, Indiana University Bloomington, USA
Abstract
This is the beginning of Network Science. The journal has been created because network
science is exploding. As is typical for a field in formation, the discussions about its scope,
contents, and foundations are intense. On these first few pages of the first issue of our new
journal, we would like to share our own vision of the emerging science of networks.
Downloaded from https://www.cambridge.org/core. Pontificia universidad catolica de valparaiso, on 22 Mar 2018 at 18:28:55, subject to the Cambridge Core
terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/nws.2013.2
[Figure: phenomenon --abstraction--> network concept --representation--> network data; overall label: network model]
1. A specification of how the phenomenon (in general, i.e., more generally than
this particular instantiation) is abstracted to a network.
2. A specification of how this conceptual network is represented in data (e.g.,
measured or observed).
Claim 1
Network science is the study of network models.
We emphasize that, in our view, network models are unlikely to generalize across
domains. We hence remain open to, but rather sceptical about, any Grand Unified
Network Theory (GrUNT) that ignores research contexts. On a related note, we find
it very unfortunate that many network studies are referred to as “social network
analysis” just because the methods applied are commonly used for social network
models. If the network under scrutiny models, say, gene regulation, the term is
hardly appropriate.
Network theory builds on the assumption that a cause, an effect, or an association
between aspects involves something that can be conceptualized as a network. Testing
of hypotheses derived from network theory requires instantiation of the network
model. While the abstraction yields the format in which a phenomenon will be
represented, its actual representation is in terms of data that is typically obtained
via empirical observation.
According to our framework there are actually two aspects to network theory.
On the one hand, network theories can suggest and explicate, for given research
domains, how to abstract phenomena into networks. This includes, for example,
what constitutes an individual entity or a relationship, how to conceptualize the
strength of a tie, etc. In such applied network science, the corresponding theories
are epistemological—network theories bound to specific classes of phenomena. On
the other hand, network theories can deal with formalized aspects of network
representations such as degree distributions, closure, communities, etc., and how
they relate to each other. In such pure network science, the corresponding theories
are mathematical—theories of networks.
Claim 2
There are theories about network representations and network theories about
phenomena: both constitute network theory.
3 Network data
We have argued that networks are abstractions represented in data, but we have yet
to discriminate them from other conceptualizations. We are now going to do so by
first looking at characteristics of standard types of data to be able to then highlight
the defining features of network data.
The input to data analysis consists of values of variables. Variables are generic
placeholders characterizing the essential features of an abstract concept, thus
allowing analytical steps to be formulated generically as well. The instantiating values
are usually obtained via some form of observation such as measurement. Note,
however, that different original phenomena may yield the same representation in
data.
Our definition of what constitutes network data hinges entirely on how the
involved variables are related. It is thus independent of the phenomena being
represented, the ranges of values that can be assumed, and the techniques used to
analyze the data. The significance of the following claim will be established in three
steps below.
Claim 4
What sets network data apart is the incidence structure of its domain.
The third step will also unveil the essential correspondence between the signature
of network variables and the defining interest in network science.
Claim 5
At the heart of network science is dependence, both between and within variables.
    A    x
    a    1
    b    4
    c    1
    d    2
    e    3
    f    6

(a) standard table: variables in columns indexed with unrelated entities

    D         x
    (a, f)    1
    (d, e)    5
    (b, c)    3

(b) dyadic: variables in columns indexed with unrelated pairs of entities

    D         x
    (a, b)    0
    (a, c)    1
    (a, d)    0
    (a, e)    1
    (a, f)    0
    (b, c)    1
    (b, d)    2
    (c, d)    0
    ...

    x(D)   a   b   c   d   e   f
    a      ·   0   1   0   1   0
    b      0   ·   1   2   ·   ·
    c      1   1   ·   0   2   0
    d      0   2   0   ·   1   4
    e      1   ·   2   1   ·   2
    f      0   ·   0   4   2   ·

(c) network: variables in columns indexed with incident pairs of entities, or in matrices
In the above example, variables gender, education, and income are of different
types: While all are defined on the common domain A, the range of values they may
assume is different. Even more importantly, these ranges exhibit a different level of
structuring. The range of gender, for instance, is binary on a nominal scale, i.e.,
the only defined relation is an equality predicate. In other words, the comparison of
two values yields either equality or inequality, and this is the only information we
can get out of comparison. For instance, we cannot add or rank values of gender
variables.
Assume now that education refers to the highest degree obtained by an individual.
It may then be valid to compare two values and conclude that one indicates a higher
level of education than the other, and this relation could be transitive. In this case,
the range is ordered and the variable is on an ordinal scale of education. Finally,
we may compare income by amount, but it may also be meaningful to compute
differences and ratios. The range of the variable income can therefore be considered
to be a continuous ratio scale. If, however, 0 is not meaningful for a continuous
range as in, e.g., measuring IQ, it is not appropriate to calculate ratios and the scale
is called interval.
The interesting thing to observe is that a range is usually not just a set of possible
values but a set with additional relations such as an ordering or operations, i.e.,
structured. The structure of a range is crucial to know about because it determines
the kinds of analyses and interpretations that are justified.
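The way a range's structure limits the justified operations can be sketched in code. This is only an illustration under our own assumptions: the names `Scale`, `PERMITTED`, `VARIABLE_SCALE`, and `is_justified`, as well as the operation labels, are hypothetical and do not come from the editorial.

```python
from enum import Enum


class Scale(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


# Operations regarded as meaningful on each scale (illustrative labels only).
PERMITTED = {
    Scale.NOMINAL: {"equality"},
    Scale.ORDINAL: {"equality", "order"},
    Scale.INTERVAL: {"equality", "order", "difference"},
    Scale.RATIO: {"equality", "order", "difference", "ratio"},
}

# Scales of the example variables as described in the surrounding text.
VARIABLE_SCALE = {
    "gender": Scale.NOMINAL,
    "education": Scale.ORDINAL,
    "IQ": Scale.INTERVAL,
    "income": Scale.RATIO,
}


def is_justified(variable: str, operation: str) -> bool:
    """Return True if `operation` is meaningful on the scale of `variable`."""
    return operation in PERMITTED[VARIABLE_SCALE[variable]]
```

For instance, `is_justified("income", "ratio")` holds, whereas `is_justified("gender", "order")` and `is_justified("IQ", "ratio")` do not.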
While the range of attributes is structured, in much of science, the domain on
which variables are defined is assumed to have no structure, i.e., to be simply a set. This
may be for good reason. If we are interested in associations between, say, education
and income controlled for age, we actually do not want there to be relations
between individuals that also moderate the association. Much of statistics is in fact
concerned with detecting and eliminating such relations.
This is the single most important difference from network science, where the
domains of at least some variables are explicitly set up to have structure. The
potentially resulting dependencies are not a nuisance but more often than not they
constitute the actual research interest.
A domain representing couples is structured but only minimally so. The only
relation is a pairing of individuals in dyads. For the statistical analysis to work, it is
usually desirable that these dyads be disjoint and independent. For example, having
several individuals appear in two or more distinct dyads may invalidate findings
about associations between, say, age differences and the number of common children.
In this respect dyadic data is not all that different from standard data tables.
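Whether a domain of pairs is merely dyadic or has incidence structure can be checked mechanically. The following is a minimal sketch; the function names `shared_entities` and `are_disjoint` are our own, the first domain is taken from panel (b) of the figure, and the second is a hypothetical domain of all pairs over four entities.

```python
from collections import Counter
from itertools import combinations


def shared_entities(dyads):
    """Return the entities that appear in more than one dyad."""
    counts = Counter(entity for dyad in dyads for entity in set(dyad))
    return {entity for entity, n in counts.items() if n > 1}


def are_disjoint(dyads):
    """True if no entity belongs to two or more dyads."""
    return not shared_entities(dyads)


# Unrelated pairs, as in panel (b) of the figure.
dyadic_domain = [("a", "f"), ("d", "e"), ("b", "c")]

# Hypothetical network domain: every pair over four entities, so pairs are incident.
network_domain = list(combinations("abcd", 2))

print(are_disjoint(dyadic_domain))   # disjoint dyads
print(are_disjoint(network_domain))  # incident dyads
```

In the second domain every entity is shared by several dyads, which is exactly the overlap that standard dyadic analysis tries to avoid.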
because they are good friends who share information relevant for their salary
negotiations. In other words, rather than being a dependence between two different
types of variables, now we have a dependence within the values for one type
of variable. This is a complex dependence because it cannot be aggregated or
averaged in distributional terms. It corresponds to the kind of dependency analyzed
in spatial statistics (with proximity rather than friendship as the underlying
mechanism).
Yet, network dependence goes further because dependence does not just stop
at actor attribute variables. It may apply within the set of network variables as
well. Any network variable is defined on a domain of pairs of individuals (i.e.,
the dyads), and the incidence structure of the domain captures the potential for
within-variable dependencies. A network tie variable takes a value, often binary,
sometimes valued, indicating whether or not there is a tie between its two individuals.
The crucial point is that the presence of one tie may influence the presence of
another. In other words, ties are not necessarily moderating variables, but there
may be dependencies within the tie variables themselves. While this will appear an
unfamiliar point of view to some, it is merely a statement that networks may be systematically
patterned. Without dependence among ties, there is no emergent network
structure.
In the explicit form of stochastic models, these ideas entered network analysis
from spatial statistics. They lie at the heart of network theory, even if seldom
overtly addressed. Entire sets of methodological approaches, such as exponential-
family random graph models, depend on modeling tie dependence appropriately.
With independence among network tie variables, we would be left only with
the simple random networks known as Bernoulli graphs, Erdős-Rényi graphs, or
the G(n, p) model. It should be noted that this view does not require a statistical
perspective; combinatorial invariants of graphs that represent networks are of
interest exactly for the same reason as descriptors of structural features.
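As an illustration of what independence among tie variables amounts to, a G(n, p) network can be sampled by deciding every dyad separately. This is a minimal sketch using only the Python standard library; the function name `bernoulli_graph` and its `seed` parameter are our own choices.

```python
import random
from itertools import combinations


def bernoulli_graph(n, p, seed=None):
    """Sample G(n, p): each of the n*(n-1)/2 dyads receives a tie
    independently of all others with probability p."""
    rng = random.Random(seed)
    return {
        (i, j): int(rng.random() < p)
        for i, j in combinations(range(n), 2)
    }


ties = bernoulli_graph(n=6, p=0.3, seed=1)
density = sum(ties.values()) / len(ties)  # fraction of dyads with a tie
```

Because no tie variable is consulted when another is drawn, any structure in such a network arises by chance alone.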
Because almost all the networks that we observe bear little resemblance to simple
random graphs, tie dependence is empirically very common. For instance, a familiar
network process is that of preferential attachment, whereby actors “prefer” to be
attached to popular actors so that the rich get richer. The presence of many ties
centered on one popular individual may attract the presence of additional ties to
that same individual.
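A simplified growth process in the spirit of preferential attachment makes the contrast with independent ties concrete. The sketch below is one possible formalization and not a model specified in this editorial: each new actor forms a single tie, and the target is drawn with probability proportional to its current degree. The names `preferential_attachment` and `degrees` are hypothetical.

```python
import random
from collections import Counter


def preferential_attachment(n, seed=None):
    """Grow a network on n >= 2 actors. Each new actor attaches to one
    existing actor chosen with probability proportional to its degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]    # start from a single tie
    endpoints = [0, 1]  # every actor is listed once per incident tie
    for new in range(2, n):
        # Uniform choice from the endpoint list is degree-proportional.
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges


def degrees(edges):
    """Count the number of ties incident to each actor."""
    return Counter(actor for edge in edges for actor in edge)


edges = preferential_attachment(50, seed=0)
most_popular = degrees(edges).most_common(3)
```

Here the placement of each new tie depends on the ties already present, which is the kind of within-variable dependence described above.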
Dependence among ties is thus the means whereby network structure self-organizes
and evolves, or emerges, but it is not simple. This is why network science is often
referred to as the study of complex networks. It remains a research question to
establish plausible types of tie dependence. Theories or methods that wish away these
dependencies are ignorant of the structure of the domain, and thus contradictory to
a network model.
While the choice of representation is indeed a matter of convenience and hence
relative to any given scenario, we think that the data-oriented perspective of a
structured domain is leaner than, for instance, the common strategy of defining
networks in terms of graphs. Graphs may be one of the most common forms of
representation but they do not make the most distinguishing feature of network data
apparent. The edges display the structure of an observation, not of the conceptual
setup that led to it.
3.4 Discussion
Our characterization of network data focuses on the structure of the domain of
variables irrespective of their range. We have built our exposition on this definition
because it carves out the essential distinction from standard data and allows for
a more uniform description of different network types. A further advantage is the
clear distinction between dyads for which no data is available (because they are not
in the domain) and dyads for which the data indicates the absence or nullity of a
relationship.
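The distinction between dyads outside the domain and dyads with a null value can be made explicit in a data structure. The sketch below stores the values of panel (c) of the figure in a dictionary keyed by dyads. Reading the dots in that panel as dyads outside the domain is our interpretation, and the names `x` and `tie_value` are chosen for illustration.

```python
# Values of x on the domain D, taken from panel (c) of the figure.
# Dyads missing from the dictionary, such as (b, e), are assumed to lie
# outside the domain; a stored 0 records the absence of a relationship.
x = {
    ("a", "b"): 0, ("a", "c"): 1, ("a", "d"): 0, ("a", "e"): 1, ("a", "f"): 0,
    ("b", "c"): 1, ("b", "d"): 2,
    ("c", "d"): 0, ("c", "e"): 2, ("c", "f"): 0,
    ("d", "e"): 1, ("d", "f"): 4,
    ("e", "f"): 2,
}


def tie_value(u, v):
    """Return the value of the unordered dyad {u, v}, or None if the
    dyad is not in the domain (no data available)."""
    key = (u, v) if u <= v else (v, u)
    return x.get(key)


print(tie_value("a", "b"))  # in the domain, null relationship
print(tie_value("b", "e"))  # not in the domain
```

A plain adjacency matrix filled with zeros would conflate the two cases, whereas the domain-based representation keeps them apart.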
Statistics is often defined as the study of data, involving anything from its collection,
preparation, and management to its exploration, analysis, and presentation.
In this view, our definition of network science delineates a subarea of statistics
concerned with data of a peculiar format. This implies that, like general statistics,
network science is not tied to any particular substantive area. The disciplines for
which area editors have assumed responsibility should therefore not be viewed as
limiting the scope of submissions.
Areas such as machine learning or data mining are distinguished from other areas
of statistics by the tasks addressed. Operations research is distinguished mostly
by the subject matter and its consequences for data and tasks. In contrast, it is
the special data format that causes network science to be markedly different from
other areas in statistics. First, the abstraction to a different category of concepts
introduces a combinatorial structure on the domain of variables. This structure
leads to alternative representations that call for approaches more combinatorial than
distributions, graph theory in particular. Second, there is an inherent focus on
interdependence and on within-variable associations that, in contrast to those in
time series analysis, are non-linear.
Claim 6
Network science is evolving into a mathematical science in its own right.
Claim 7
Network science is itself more of an evolving network than a paradigm expanding
from a big bang.
It is our intention to help the field of network science develop a canon of research
moving forward by publishing the most promising and widest-reaching work in the
field.
Acknowledgments
Part of the research that led to this editorial was funded by Deutsche Forschungsgemeinschaft
under grant Br 2158/6-1 and the Social and Cognitive Networks
Academic Research Center of the US Army Research Laboratory.
Engineering. Across various engineering disciplines there has been a growing interest
in understanding phenomena from a network perspective. These include contexts
as diverse as transportation networks, production networks, supply chain and
logistic networks, telecommunication networks, traffic networks, data networks,
mobile ad hoc networks, content distribution networks, peer-to-peer networks, sensor
networks, neural networks, nano networks, and regulatory networks. In all of these
diverse contexts, we seek theoretical, methodological, computational, and empirical
contributions that examine engineering issues such as network architecture, flows,
protocols, reliability, performance, optimization, routing, and congestion.
and interaction of individual attitudes, traits and behaviors, and social network
ties, including network-based social influence. Finally, we are also interested in the
perception of social networks, network structures typical of different age groups,
or of other social categories; network-based social support and mental health; and
social networks and culture.