Multilevel Longitudinal Analysis of Social Network
Johan Koskinen
University of Melbourne, Melbourne, Australia.
E-mail: [email protected]
Tom A.B. Snijders
University of Oxford, Oxford, United Kingdom; University of Groningen, Groningen, The Netherlands.
E-mail: [email protected]
arXiv:2201.12713v1 [stat.ME] 30 Jan 2022
1. Introduction
Social network research deals with analysing the dependencies among people or other social
units, dependencies induced by the relational ties that bind them together (Wasserman and Faust,
1994; Brandes et al., 2013; Robins, 2015). These dependencies can best be studied in a dynamic
approach, where the existence of a given configuration of ties leads to the creation, or supports
the maintenance, of other ties. While many of the endogenous network dependencies, like
triadic closure and balance, are of interest in their own right, there is a growing interest in the
dynamic interdependence of networks with other structures, such as actor variables (Veenstra
et al., 2013), other networks for the same actor set (Huitsing et al., 2014; Elmer et al., 2017), or
two-mode networks (Lomi and Stadtfeld, 2014).
Dynamic network data can be of various kinds. A frequently followed design is the collection
of network panel data, i.e., the observation of all relational ties (in one or more networks) and
other relevant variables, within a given group of social actors (such as individuals, firms, countries,
etc.), at two or more moments in time, the ‘panel waves’. For modelling panel data for a
single network, represented by a digraph, the Stochastic Actor-oriented Model (‘SAOM’) was
proposed by Snijders (2001). This was extended to a joint model for changing actor variables
(vertex attributes) and tie-variables by Steglich et al. (2010) and to a model for the interdependent
dynamics of multiple networks, potentially combinations of one-mode and two-mode
networks, by Snijders et al. (2013). These joint dynamic models can be combined under the
heading of ‘co-evolution’, as summarized in Snijders (2017).
Collecting longitudinal network data is very time-intensive and demands great care, but data
sets of longitudinal networks in many ‘parallel’ groups are becoming increasingly common; examples
(among many others) are the study ‘Networks and actor attributes in early adolescence’
executed by Chris Baerveldt and Andrea Knecht, which will be used in this paper (Knecht, 2006;
Knecht et al., 2010), and CILS4EU (Kalter et al., 2013).
While the SAOM has proved useful in analysing networks in single groups, the methodology has
been limited in studying the extent to which network dynamics generalise to different contexts
and what might differ systematically across groups of actors. The investigation of heterogeneity
across groups more generally, in the way multilevel models have proven useful (e.g., Goldstein,
2011; Snijders and Bosker, 2012), has not been possible. However, to find scientific regularities
it is more attractive to study multiple groups that may be regarded as a sample from a population
and to generalize to populations of networks (Snijders and Baerveldt, 2003; Entwisle et al.,
2007). For the Exponential Random Graph Model a multilevel methodology was proposed by
Slaughter and Koehly (2016) (see also Schweinberger et al., 2020).
This paper proposes a multilevel extension of the SAOM for data sets composed of disjoint
groups of actors, for which only networks within each group are considered. The actors are
nested within the groups. Since ties combine pairs of actors, the combined structure of actors
and ties cannot be regarded as being nested. This extension employs random coefficients like
the multilevel models mentioned above and draws on the likelihood-based estimation frameworks
of Koskinen and Snijders (2007) and Snijders et al. (2010). It also permits the investigation
of observable group-level variables, such as compositional and contextual factors, as
in standard multilevel modelling. Our example is a co-evolution of friendship networks and
delinquent behaviour represented by two-mode networks; the elaboration therefore focuses on
the co-evolution model of Snijders et al. (2013).
Combinations of networks are occasionally referred to as ‘multilayer networks’ (Kivelä et al.,
2014; Magnani and Wasserman, 2017) or ‘multilevel networks’ (Snijders, 2016), but in this paper
we use the term ‘multilevel networks’ to express the link to the random coefficient multilevel
models in the sense mentioned above.
As the motivating example, we consider the dynamic relation between friendship and delinquent
behaviour, using the study ‘Networks and actor attributes in early adolescence’. The data set
was collected by Andrea Knecht, supervised by Chris Baerveldt (Knecht, 2006). The data were
collected in 126 first-grade classrooms in 14 secondary schools in The Netherlands in 2003-2004,
using written questionnaires. The entire data set contains four waves with about three
months in between.
We focus on the friendship network and on the four questions about delinquency: stealing, vandalism,
graffiti, and fighting, for each of which self-reported frequencies were given with five
categories. Written self-reports provide reliable measurements of delinquency for adolescents
(Köllisch and Oberwittler, 2004). The dynamic relation between a network such as friendship
and a changing actor variable such as the tendency to commit delinquent behaviour has two
sides: selection, changes of friendships as a function of the delinquent behaviour of the two individuals
concerned; and influence, changes in delinquent behaviour of an actor as a function of
the network position of this actor and the delinquent behaviour of the others, especially those to
whom this actor has a friendship tie. A methodology to distinguish between selection and influence,
using network and behaviour panel data, based on the SAOM, was proposed by Steglich
et al. (2010). The conclusions are not causal in the counterfactual sense, as demonstrated by
Shalizi and Thomas (2011), but in a temporal sense: does a change in behaviour follow on
some network configuration (‘influence’), or does a change in friendship follow on a behaviour
configuration (‘selection’)? A further discussion of causality in network-behaviour systems was
given by Lomi et al. (2011).
The association between friendship and the tendency to delinquent behaviour was studied by
Knecht et al. (2010). This publication used the data set mentioned above, constructing an actor
variable representing delinquent behaviour as a sum score of the four items for the frequencies
of stealing, vandalism, graffiti, and fighting. It used the two-step multilevel method of Snijders
and Baerveldt (2003), in which first the SAOM is estimated for each classroom separately, after
which the results for the classrooms are combined. Since most of the classrooms were too small
for the satisfactory application of this — rather complicated — model, only 21 classrooms could
be used.
In the current paper we present an extension of this study, replacing the simplistic two-step
multilevel approach by an integrated random coefficient approach, which does not depend on
the condition of a convergent estimation algorithm for each classroom separately and therefore
can use a much larger part of the data set. Furthermore, we replace the model where delinquent
behaviour is represented by an actor variable with a model representing the four delinquency
items by a two-mode network. This allows a more detailed study of social influence. The actors
are supposed to be influenced by their friends, that is, those whom they mention as friends
(friendship ties from the actor to the friends). In the former study, the tendency toward delinquent
behaviour was regarded as a one-dimensional trait, measured by the sum score of the four
delinquent items; social influence was represented by the effect of the average of this trait over
the actor’s friends. The current study considers this together with another type of influence: the
effect of the friends’ behaviour for some specific delinquent behaviour on the same behaviour
of the actor.
The stochastic actor-oriented model (Snijders, 2017) is a family of longitudinal network models
for network panel data. While networks are only observed at discrete time points, the model
assumes that the networks evolve in continuous time. This is necessary for representing the
feedback between the tie variables that can occur in the time elapsing between the observation
moments. Some history of continuous-time models for social network panel data is presented
in Snijders (2001). Continuous-time models for discrete-time panel data are well known (e.g.,
Bergstrom, 1988; Hamerle et al., 1993; Singer, 1996). Their use for network panel data in
sociology is argued also by Block et al. (2018).
and the probability that actor $i \in N$ is selected for changing a tie variable in $V \in \{X, Z\}$ is
\[
\frac{\lambda_i^V(\theta, y)}{\lambda_+^+(\theta, y)} .
\]
Given that i is selected for making a change in network V , the option set consists of all outgoing
tie variables in network V , together with the option ‘no change’. The set of outcomes reachable
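in a mini-step by actor $i$ in network $V$ is denoted $A_i^V(y)$, with
\[
A_i^X(x, z) \subseteq \{(x', z) \in \mathcal{Y} : \|x - x'\| \le 1,\; x'_{kj} = x_{kj} \ \forall j \text{ and } \forall k \ne i\}
\]
and
\[
A_i^Z(x, z) \subseteq \{(x, z') \in \mathcal{Y} : \|z - z'\| \le 1,\; z'_{kh} = z_{kh} \ \forall h \text{ and } \forall k \ne i\} .
\]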
Here ||B − C|| denotes the Hamming distance between adjacency matrices B and C . Usually
the subset “⊆” will be implemented as equality “=”, but the subset symbol is used because
there could be constraints on the state space, such as in the case of changing composition or
absorbing states.
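The option sets can be made concrete with a small enumeration. The following Python sketch is only an illustration under stated assumptions: the subset relation is taken as equality (no constraints on the state space), the one-mode network has no self-ties, and the function names `option_set_one_mode`, `option_set_two_mode` and `hamming` are hypothetical helpers, not part of any package.

```python
from copy import deepcopy


def hamming(b, c):
    """Hamming distance between two equally sized 0/1 matrices."""
    return sum(bv != cv for b_row, c_row in zip(b, c)
               for bv, cv in zip(b_row, c_row))


def option_set_one_mode(x, i):
    """All digraphs x' reachable by actor i in one mini-step in network X.

    Contains x itself ('no change') and every matrix that differs from x
    in exactly one outgoing tie variable x[i][j] with j != i.
    """
    options = [deepcopy(x)]                # the 'no change' option
    for j in range(len(x)):
        if j == i:
            continue                       # assumed: no self-ties
        x_new = deepcopy(x)
        x_new[i][j] = 1 - x_new[i][j]      # toggle one outgoing tie
        options.append(x_new)
    return options


def option_set_two_mode(z, i):
    """All two-mode networks z' reachable by actor i in one mini-step in Z."""
    options = [deepcopy(z)]                # the 'no change' option
    for h in range(len(z[i])):
        z_new = deepcopy(z)
        z_new[i][h] = 1 - z_new[i][h]      # toggle one two-mode tie
        options.append(z_new)
    return options


# Small illustrative data: 3 actors, 4 second-mode nodes.
x = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
z = [[1, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
opts_x = option_set_one_mode(x, 0)
opts_z = option_set_two_mode(z, 0)
```

With $n$ actors and $H$ second-mode nodes the two sets have $n$ and $H + 1$ elements respectively, each at Hamming distance at most 1 from the current state, and rows other than row $i$ are never altered.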
Conditionally on $y$, and on $i$ being selected to make a change in network $V$, the probability that
the outcome of the choice is $y'$ is
if $y' \in A_i^V(y)$, and 0 if $y' \notin A_i^V(y)$. Note that since $y \in A_i^V(y)$, the probability of no change,
i.e., $y' = y$, is positive.
$f_i^Z\big(\theta, (x, z)\big)$ does not depend on $x$, the dynamics of the one-mode and two-mode networks
are independent. In our example the interest is in the interdependence between friendship and
delinquent behaviour, which is reflected by statistics that depend on both networks jointly.
The model can be interpreted as a sequential discrete-choice model where actors make choices
about their outgoing ties, using random utilities (Maddala, 1983), under the restriction that they
can change no more than one outgoing tie variable. From that perspective the model can be
interpreted as a process whereby actors choose to change their network ties or their behaviour to
what they deem most preferable, allowing for a random element in their decisions. The model
does not strictly require this interpretation and Snijders (2017) treats a wide variety of different
model specifications, including differential treatments of creating and terminating ties, more
elaborate specifications of the rate functions, and options for non-directed networks.
Of particular importance are cross-network effects $s_{ki}^X(x, z)$ and $s_{ki}^Z(x, z)$ depending on $x$ as
well as $z$, reflecting the mutual dependence between the one-mode and the two-mode network.
In our application, where the networks are friendship and delinquent behaviours, the following
cross-network effects are used. As mnemonic indicators, we use ‘o’ for outgoing friendship ties,
‘i’ for incoming friendship ties, and ‘d’ for ties in the delinquency network. The subgraphs used
are illustrated in the pictograms, where nodes of the first mode are denoted by circles, nodes
of the second mode by squares, one-mode ties by straight arrows, and two-mode ties by curly
arrows. The superscript $V$ indicates that the effect applies to $V = X$ as well as $V = Z$. Note
that effects $s_{ki}^V$ refer to actors $i$, who consider changing some outgoing tie in network $V$. In the
pictograms, the parts with a tie i → j have the role of dependent variables for friendship, and
the parts with a tie i ⇝ h have the role of dependent variables for delinquency.
(a) od: the product of the number of outgoing friendships and the number of delinquent behaviours of $i$,
\[
s_{\mathrm{od},i}^V(x, z) = \sum_j x_{ij} \sum_h z_{ih} .
\]
[Pictogram with nodes h, i, j]
(b) id: the product of the number of incoming friendships and the number of delinquent behaviours
(note the exchange of $i$ and $j$),
\[
s_{\mathrm{id},i}^X(x, z) = \sum_j x_{ij} \sum_h z_{jh} ,
\]
[Pictogram with nodes h, j, i]
\[
s_{\mathrm{id},i}^Z(x, z) = \sum_j x_{ji} \sum_h z_{ih} .
\]
[Pictogram with nodes h, i, j]
(c) odd: a mixed triadic effect: the number of friendships of $i$ weighted by the number of
delinquent behaviours $i$ and $j$ have in common,
\[
s_{\mathrm{odd},i}^V(x, z) = \sum_{j,h} x_{ij} z_{ih} z_{jh} .
\]
[Pictogram with nodes h, i, j]
(d) od_av: a mixed four-node effect that is not a subgraph count: the total number of delinquent
behaviours reported by $i$ multiplied by the average number of delinquent behaviours,
centered, reported by all $i$’s friends,
\[
s_{\mathrm{od\_av},i}^Z(x, z) = \sum_h z_{ih} \; \frac{\sum_j x_{ij} \left( \sum_\ell z_{j\ell} - \bar{z} \right)}{\sum_j x_{ij}} ,
\]
[Pictogram with nodes h, i, j, ℓ]
where z̄ is the average observed outdegree for Z in the group. Here 0/0 is defined as 0.
Effect ‘od_av’ is used only for explaining the dynamics of the Z network, the other three are
used for explaining the dynamics of both networks. Brief interpretations of these effects, for
positive parameter values, are the following.
For explaining the friendship dynamics (‘selection’):
(a) The ‘od’ effect indicates that those who engage in more delinquent behaviours will be
more active in nominating friends.
(b) The ‘id’ effect indicates that those who engage in more delinquent behaviours will be more
popular as friends.
(c) The ‘odd’ effect indicates that actors will tend to be friends with those who engage in the
same delinquent behaviours.
For explaining the delinquency dynamics (‘influence’):
(a) The ‘od’ effect indicates that those who nominate more friends will tend to engage in more
delinquent behaviours.
(b) The ‘id’ effect indicates that those who are more popular as friends will tend to engage in
more delinquent behaviours.
(c) The ‘odd’ effect indicates that actors will tend to engage in the same delinquent behaviours
as their friends.
(d) The ‘od_av’ effect indicates that those whose friends on average are more delinquent will
also themselves tend to engage in more delinquent behaviours.
The last two effects (‘odd’ and ‘od_av’) are the most clear expressions of the idea of social
influence, both implying that the probability distribution of changes in delinquent behaviour of
the actor is a function of the delinquent behaviour of the actor’s friends. Effect ‘odd’ is social
influence operating for specific acts of delinquent behaviour, while ‘od_av’ is a generalized
influence at the level of the sum scores of delinquency.
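To make the four cross-network statistics concrete, the sketch below evaluates them for one actor on a toy data set. It is an illustration only: `cross_network_effects` is a hypothetical helper, the networks are plain 0/1 lists of lists, and the centering constant `z_bar` is passed in by the caller because its computation from the observed data is not spelled out here.

```python
def cross_network_effects(x, z, i, z_bar):
    """Cross-network statistics od, id, odd and od_av for actor i.

    x     : n-by-n 0/1 friendship adjacency matrix (x[i][j] = 1 if i -> j)
    z     : n-by-H 0/1 two-mode matrix (z[i][h] = 1 if i reports behaviour h)
    z_bar : centering constant (average two-mode outdegree in the group)
    """
    n, n_items = len(x), len(z[0])
    out_deg = sum(x[i][j] for j in range(n))       # friends nominated by i
    in_deg = sum(x[j][i] for j in range(n))        # nominations received by i
    delinq = [sum(row) for row in z]               # two-mode outdegrees

    od = out_deg * delinq[i]
    id_x = sum(x[i][j] * delinq[j] for j in range(n))   # version for V = X
    id_z = in_deg * delinq[i]                           # version for V = Z
    odd = sum(x[i][j] * z[i][h] * z[j][h]
              for j in range(n) for h in range(n_items))
    if out_deg == 0:
        od_av = 0.0                                # 0/0 is defined as 0
    else:
        friends_avg = sum(x[i][j] * (delinq[j] - z_bar)
                          for j in range(n)) / out_deg
        od_av = delinq[i] * friends_avg
    return {"od": od, "id_X": id_x, "id_Z": id_z, "odd": odd, "od_av": od_av}


# Toy example: 3 actors, 4 delinquency items.
x = [[0, 1, 1], [1, 0, 0], [0, 0, 0]]
z = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 1, 0]]
effects_actor0 = cross_network_effects(x, z, i=0, z_bar=1.0)
effects_actor2 = cross_network_effects(x, z, i=2, z_bar=1.0)
```

For actor 0 in this toy example the values are od = 4, id_X = 3, id_Z = 2, odd = 2 and od_av = 1.0; actor 2 has no outgoing friendships, so its od_av is 0 by the 0/0 convention.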
It is more efficient to work with the marginal model $p_{\mathrm{AUG}}\big(v \mid y(t_m)\big)$, which is $p^*_{\mathrm{AUG}}\big(v, s \mid y(t_m)\big)$
marginalised over holding times $s$. In the sequel we will assume constant rates $\lambda_i^V = \lambda^V$ for
both networks $V = X, Z$, in which case the augmented likelihood is
\[
p_{\mathrm{AUG}}\big((v^r) \mid y^0, \theta\big) = \exp\big(-\lambda_+^+ (t_{m+1} - t_m)\big)
\times \frac{\big(\lambda_+^+ (t_{m+1} - t_m)\big)^R}{R!} \prod_{r=1}^{R} \frac{\lambda^{V_r}}{\lambda^X + \lambda^Z}\, p_{i_r}^{V_r}(\theta, y^{r-1}, y^r) ; \tag{3}
\]
see Snijders et al. (2010) where also an approximation for non-constant rates is given.
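In practice expression (3) is evaluated on the log scale. The following sketch shows one way to do so for a single period; it is not the implementation used by the authors. The function name `log_p_aug` is hypothetical, the total rate $\lambda_+^+$ is supplied by the caller, and the mini-step choice probabilities are assumed to have been computed elsewhere.

```python
import math


def log_p_aug(networks, choice_probs, rate_x, rate_z, rate_total, dt):
    """Log of the augmented likelihood (3) for one period, constant rates.

    networks     : list of 'X' or 'Z', the network V_r chosen at mini-step r
    choice_probs : list of choice probabilities p(theta, y^{r-1}, y^r) > 0
    rate_x/z     : constant rates lambda^X and lambda^Z
    rate_total   : total rate lambda_+^+ (supplied by the caller)
    dt           : period length t_{m+1} - t_m
    """
    n_steps = len(networks)                       # R, the number of mini-steps
    rate = {"X": rate_x, "Z": rate_z}
    # Poisson part: exp(-lambda dt) (lambda dt)^R / R!
    log_poisson = (-rate_total * dt
                   + n_steps * math.log(rate_total * dt)
                   - math.lgamma(n_steps + 1))
    # Path part: product over mini-steps of network and choice probabilities
    log_path = sum(math.log(rate[v] / (rate_x + rate_z)) + math.log(p)
                   for v, p in zip(networks, choice_probs))
    return log_poisson + log_path


# Illustrative values only (not taken from the data analysed in this paper).
value = log_p_aug(["X", "Z"], [0.5, 0.25],
                  rate_x=2.0, rate_z=1.0, rate_total=6.0, dt=1.0)
empty = log_p_aug([], [], rate_x=2.0, rate_z=1.0, rate_total=6.0, dt=1.0)
```

Working with logarithms avoids underflow when the number of mini-steps $R$ is large, which is the typical situation for a period between two panel waves.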
The Markov assumption implies that the likelihood for a sequence of augmented data $v =
\big(v(t_1), \ldots, v(t_M)\big)$, given observation $y = \big(y(t_0), y(t_1), \ldots, y(t_m)\big)$ is
\[
p_{\mathrm{AUG}}(v \mid y, \theta) = \prod_{m=1}^{M-1} p_{\mathrm{AUG}}\big(v(t_{m+1}) \mid y(t_m), \theta\big) . \tag{4}
\]
The model $p_{\mathrm{AUG}}\big(v \mid y(t_{m-1}), \theta\big)$ when marginalised over all paths $v$ that start in $y^0 = y(t_{m-1})$
4. Hierarchical model
We assume that each group g follows the same specification, i.e., has the same expressions for
the rate and evaluation functions, although the number of actors ng = |Ng | may be different.
Each group g has associated with it a group-specific parameter θ[g] . Heterogeneity across groups
typically takes the form of contextual and compositional effects.
While comparing structure across networks is a natural thing to do and has attracted some
attention (e.g., Faust and Skvoretz, 2002), it is clear that comparing structure across different-
sized networks is non-trivial (Anderson et al., 1999). One key problem is the way the average
degree scales with network size, something that has been studied for cross-sectional networks
(Erdős and Rényi, 1960; Krivitsky et al., 2011; Shalizi and Rinaldo, 2013). We assume that the
variation in group sizes $n_g$ as well as in average degrees is limited. Based on a combination
of Krivitsky et al. (2011) and Snijders (2005, p. 243) we suggest that including an effect of
$\log(n_g) \sum_j x_{ij}$ will make the other parameters comparable, and that for this effect a parameter
of −1/2 would be expected if none of the other parameters reflects differential group sizes.
Components of θ[g] that are variable across g are similar to random slopes in regular multilevel
modeling (Goldstein, 2011; Snijders and Bosker, 2012). The question of whether to allow all
group-level parameters to vary across groups needs to be guided by specific case considerations
as well as computational aspects just as in multilevel models in general. We partition the parameter
vector $\theta^{[g]}$ for group $g$ into subvectors $\gamma^{[g]}$, of dimension $p_1$, containing the variable
parameters, and $\eta$, of dimension $p_2$, containing the constant parameters. We write the groupwise
parameters as the partitioned vector
\[
\theta^{[g]} = \begin{pmatrix} \gamma^{[g]} \\ \eta \end{pmatrix} .
\]
When p1 = 0 we have the so-called multi-group model (Ripley et al., 2021, Section 11.2). In
classical multilevel modeling it is usual to apply models with only a few random slopes. However,
it seems that Bayesian estimation allows entertaining models with more random slopes
(Eager and Roy, 2017). For group-level covariates, such as interventions or indicators of group
composition, it is natural that their effects are fixed.
We draw on standard hierarchical modelling approaches and assume that the group-level parameters
have a multivariate normal distribution $\gamma^{[g]} \overset{\mathrm{iid}}{\sim} N_{p_1}(\mu, \Sigma)$. We assume that $(\mu, \Sigma)$ and
$\eta$ are a priori independent with priors $(\mu, \Sigma) \sim \pi(\mu, \Sigma \mid \Gamma)$ and $\eta \sim \pi(\eta \mid \mu_{0,\eta}, \Sigma_{0,\eta})$.
An exception to this should be made for the rate parameters λ, which are necessarily positive.
They reflect particular circumstances of groups and issues of study design, and will always be
included among the variable parameters γ [g] . The multivariate normal distribution is assumed
to be truncated to positive values for these parameters. The values of µ and Σ will in practice
be such that the non-truncated distribution has an extremely small probability for negative rate
parameters. An alternative is to employ a transformed normal or a Gamma distribution, which
is conjugate for the Poisson counts (Koskinen and Snijders, 2007). However, the multivariate
normal gives a simple unified treatment for all varying parameters.
With this hierarchical specification, denoting the multivariate normal density by $\phi$, the joint
probability density function for data $y^{[1]}, \ldots, y^{[G]}$, parameters $\gamma^{[1]}, \ldots, \gamma^{[G]}$, and $\mu, \Sigma, \eta$ is given
by
\[
\pi(\mu, \Sigma \mid \Gamma)\, \pi(\eta \mid \mu_{0,\eta}, \Sigma_{0,\eta}) \prod_{g=1}^{G} \phi(\gamma^{[g]} \mid \mu, \Sigma)\, p_{\mathrm{SAOM}}(y^{[g]} \mid \gamma^{[g]}, \eta) . \tag{5}
\]
5. Prior specifications
We present the inference scheme for a specific choice of priors. Other prior specifications may
be considered (see Appendix B) but the MCMC scheme largely remains unchanged.
This is treated, e.g., in Gelman et al. (2014), Section 3.6, and O’Hagan and Forster (2004),
Chapter 14. Thus, the hyper-parameters of the prior are $\Lambda_0, \nu_0, \kappa_0$. The expected value for the
inverse Wishart($\Lambda, \nu$) distribution is
\[
E\,\Sigma = \frac{1}{\nu - p - 1}\, \Lambda
\]
provided $\nu > p + 1$, and the mode is $(\nu + p + 1)^{-1} \Lambda$ (O’Hagan and Forster, 2004). Thus,
the central tendency of the inverse Wishart($\Lambda, \nu$) distribution may be taken to be about $\nu^{-1} \Lambda$.
Parameter $\Lambda$ is on the scale of the sum of squares of a sample of size $\nu$ from a distribution with
variance-covariance matrix $\Sigma$. The number of degrees of freedom $\nu_0$ can be regarded as the
effective sample size that has led to the prior information. The value of $\kappa_0$ can be interpreted
as the proportionality between $\Sigma$, the uncertainty about the groupwise parameters $\gamma^{[g]}$ given the
average population value $\mu$, and the prior uncertainty about $\mu$. Having the same proportionality
of this kind for all parameters is rather restrictive, but as a first approach we prefer to use a
conjugate prior which leads to relatively simple procedures for this already complicated model.
Fig. 1. Dependence structure of hierarchical SAOM, representing only the first and last groups
$g = 1, G$. [Diagram nodes: $\Lambda_0$, $\nu_0$, $\mu_0$, $\kappa_0$, $\Sigma$, $\mu$, $\Sigma_{0,\eta}$, $\mu_{0,\eta}$, $\eta$, $\gamma^{[1]}$, $\gamma^{[G]}$.]
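The three summaries of the inverse Wishart distribution mentioned above (expected value, mode, and the rough central tendency $\nu^{-1}\Lambda$) are easy to compare numerically. The helper below, `inverse_wishart_summaries`, is a hypothetical name and the numbers are purely illustrative; it only rescales $\Lambda$ by the three constants.

```python
def inverse_wishart_summaries(lam, nu):
    """Mean, mode and rough centre of an inverse Wishart(lam, nu) distribution.

    lam : p-by-p scale matrix as a list of lists
    nu  : degrees of freedom; the mean exists only if nu > p + 1
    """
    p = len(lam)
    if nu <= p + 1:
        raise ValueError("the expected value requires nu > p + 1")

    def scaled(c):
        return [[c * value for value in row] for row in lam]

    return {
        "mean": scaled(1.0 / (nu - p - 1)),    # Lambda / (nu - p - 1)
        "mode": scaled(1.0 / (nu + p + 1)),    # Lambda / (nu + p + 1)
        "rough": scaled(1.0 / nu),             # Lambda / nu
    }


# Illustrative prior: p = 2, nu_0 = 10, Lambda_0 = nu_0 times a guess for Sigma.
sigma_guess = [[1.0, 0.0], [0.0, 2.0]]
nu0 = 10
lambda0 = [[nu0 * v for v in row] for row in sigma_guess]
summaries = inverse_wishart_summaries(lambda0, nu0)
```

Choosing $\Lambda_0$ as $\nu_0$ times a prior guess of $\Sigma$ places the guess between the mode and the mean, which is the sense in which $\nu^{-1}\Lambda$ describes the central tendency.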
6. Estimation
The dependence structure amongst all variables is given in Figure 1. Parameters can be estimated
by an MCMC procedure, sampling the random variables indicated by the circles in
Figure 1, going up in the figure. The parameters in rectangular boxes are given hyperparameters.
Mini-steps
For all groups $g$ independently, sequences $v^{[g]}$ of outcomes of mini-steps $(i_r, V_r, y^r)$ are sampled
by an extension of the Metropolis-Hastings procedures of Koskinen and Snijders (2007)
and Snijders et al. (2010). The extension consists of the insertion of the determination of $V_r$.
The target probability function is (4) for given $y = y^{[g]}$ and $\theta = (\gamma^{[g]}, \eta)$.
Groupwise varying parameters
Groupwise varying parameters $\gamma^{[g]}$ are sampled for given $v^{[g]}$ and $\eta, \mu, \Sigma$, again for all groups
$g$ independently, by Metropolis-Hastings steps with target density
\[
\phi(\gamma^{[g]} \mid \mu, \Sigma)\, p_{\mathrm{AUG}}(v^{[g]} \mid y^{[g]}, \gamma^{[g]}, \eta) .
\]
Here $\phi$ is the multivariate normal density and $p_{\mathrm{AUG}}$ was given in (4). A random walk proposal
distribution is used, as in Schweinberger (2007, Ch. 5.4) and Koskinen and Snijders (2007,
Section 4.4). The covariance matrix for the proposals is $C^{[g]}$ as defined below in the section on
initial values, scaled to obtain approximately 25% acceptance rates (Gelman et al., 1996).
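A random walk Metropolis-Hastings update of this kind can be sketched generically. The code below is not the RSiena implementation: `cholesky` and `rw_metropolis_step` are hypothetical helpers, the proposal covariance and scale are arbitrary, and a standard bivariate normal log-density stands in for the actual target, which would combine the multivariate normal density with the augmented likelihood.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = sum(low[r][k] * low[c][k] for k in range(c))
            if r == c:
                low[r][c] = math.sqrt(a[r][r] - s)
            else:
                low[r][c] = (a[r][c] - s) / low[c][c]
    return low


def rw_metropolis_step(gamma, log_target, chol_cov, scale, rng):
    """One random walk Metropolis step; returns (new value, accepted flag)."""
    d = len(gamma)
    e = [rng.gauss(0.0, 1.0) for _ in range(d)]
    # Perturbation ~ N(0, scale^2 * C), with C = chol_cov chol_cov'
    proposal = [gamma[r] + scale * sum(chol_cov[r][k] * e[k] for k in range(r + 1))
                for r in range(d)]
    log_ratio = log_target(proposal) - log_target(gamma)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal, True
    return gamma, False


def log_target(g):
    """Stand-in target: standard bivariate normal log-density up to a constant."""
    return -0.5 * sum(v * v for v in g)


rng = random.Random(1)
chol = cholesky([[1.0, 0.3], [0.3, 1.0]])   # illustrative proposal covariance
gamma, accepted, draws = [0.0, 0.0], 0, []
n_iter = 5000
for _ in range(n_iter):
    gamma, ok = rw_metropolis_step(gamma, log_target, chol, 1.5, rng)
    accepted += ok
    draws.append(gamma)
accept_rate = accepted / n_iter
```

In an adaptive phase the `scale` argument would be increased when `accept_rate` is well above the desired level and decreased when it is below, which is one common way to reach a target acceptance rate.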
Constant group-level parameters
The constant parameter $\eta$ with prior density $\pi(\eta \mid \mu_{0,\eta}, \Sigma_{0,\eta})$ can be sampled in two ways, both
using Metropolis-Hastings steps analogous to the sampling of the groupwise varying parameters.
The first way draws random walk proposals for $\eta$ with additive perturbations from the multivariate
normal distribution with mean 0 and covariance matrix $C_\eta^{[0]}$ given below, scaled to obtain
approximately 25% acceptance rates. The target distribution is
\[
\pi(\eta \mid \mu_{0,\eta}, \Sigma_{0,\eta}) \prod_{g=1}^{G} p_{\mathrm{AUG}}(v^{[g]} \mid y^{[g]}, \gamma^{[g]}, \eta) .
\]
The second way draws random walk proposals for additive changes in the entire vectors $(\gamma^{[g]}, \eta)$,
excluding the basic rate parameters. Now the perturbations come from the multivariate normal
distribution with mean 0 and covariance matrix $C^{[0]}$ given below, again scaled to obtain approximately
25% acceptance rates. The proposal is to add this perturbation identically to the vectors
$\theta^{[g]}$ for all groups $g$. The target distribution is
\[
\pi(\eta \mid \mu_{0,\eta}, \Sigma_{0,\eta}) \prod_{g=1}^{G} \phi(\gamma^{[g]} \mid \mu, \Sigma)\, p_{\mathrm{AUG}}(v^{[g]} \mid y^{[g]}, \gamma^{[g]}, \eta) .
\]
Global parameters
Given realisations of the varying group-level parameters $\gamma^{[1]}, \ldots, \gamma^{[G]}$, global parameters $\mu$ and
$\Sigma$ can be updated using Gibbs-sampling steps from the full conditional posteriors, as explained
in Gelman et al. (2014), Section 3.6, and O’Hagan and Forster (2004), Chapter 14. The conditional
distribution of $\mu$ given $\gamma^{[1]}, \ldots, \gamma^{[G]}, \Sigma$ is given by
\[
\mu \mid \Sigma, \gamma^{[1]}, \ldots, \gamma^{[G]} \sim N_p\!\left( \frac{G}{\kappa_0 + G}\, \bar{\gamma} + \frac{\kappa_0}{\kappa_0 + G}\, \mu_0 ,\; \frac{1}{\kappa_0 + G}\, \Sigma \right)
\]
with $\bar{\gamma} = (1/G) \sum_g \gamma^{[g]}$, in which we recognize the posterior mean as a weighted sum of the
group-level parameters and the prior mean.
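For the posterior variance-covariance matrix of $\gamma^{[g]}$ we have
\[
\Sigma \mid \gamma^{[1]}, \ldots, \gamma^{[G]} \sim \text{inverse Wishart}(\Lambda_1, \nu_0 + G) ,
\]
where
\[
\Lambda_1 = \Lambda_0 + Q + \frac{\kappa_0 G}{\kappa_0 + G}\, (\bar{\gamma} - \mu_0)(\bar{\gamma} - \mu_0)' ,
\]
\[
Q = \sum_{g=1}^{G} (\gamma^{[g]} - \bar{\gamma})(\gamma^{[g]} - \bar{\gamma})' .
\]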
The influence of the prior is mainly carried by $\Lambda_0$ and the last term of $\Lambda_1$, which involves $\kappa_0$
and $\mu_0$. Since the central tendency of the inverse Wishart($\Lambda, \nu$) distribution is about $\nu^{-1} \Lambda$, this
shows that the posterior distribution of $\Sigma$ for large values of $G$ will be close to the variance-covariance
matrix of the $\gamma^{[g]}$.
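The quantities entering the update of the global parameters can be computed directly from the current group-level values. The sketch below computes the conditional mean of $\mu$, the matrix $\Lambda_1$, and updated degrees of freedom; the value $\nu_0 + G$ is the standard result for the conjugate normal-inverse-Wishart prior and is stated here as an assumption. The function name `conjugate_update` and the numbers are illustrative, and the actual random draws of $\mu$ and $\Sigma$ are omitted.

```python
def conjugate_update(gammas, mu0, kappa0, lambda0, nu0):
    """Posterior quantities for (mu, Sigma) given group-level parameters.

    gammas  : list of G parameter vectors gamma^[g], each of length p
    mu0, kappa0, lambda0, nu0 : hyperparameters of the conjugate prior
    Returns (conditional mean of mu, Lambda_1, nu_0 + G).
    """
    n_groups, p = len(gammas), len(mu0)
    gamma_bar = [sum(g[k] for g in gammas) / n_groups for k in range(p)]

    # Weighted combination of the group average and the prior mean
    post_mean = [(n_groups * gamma_bar[k] + kappa0 * mu0[k]) / (kappa0 + n_groups)
                 for k in range(p)]

    # Q = sum over groups of (gamma - gamma_bar)(gamma - gamma_bar)'
    q = [[sum((g[a] - gamma_bar[a]) * (g[b] - gamma_bar[b]) for g in gammas)
          for b in range(p)] for a in range(p)]

    # Lambda_1 = Lambda_0 + Q + kappa0 G / (kappa0 + G) * (gbar - mu0)(gbar - mu0)'
    weight = kappa0 * n_groups / (kappa0 + n_groups)
    diff = [gamma_bar[k] - mu0[k] for k in range(p)]
    lambda1 = [[lambda0[a][b] + q[a][b] + weight * diff[a] * diff[b]
                for b in range(p)] for a in range(p)]
    return post_mean, lambda1, nu0 + n_groups


# Illustrative values: G = 2 groups, p = 2 varying parameters.
post_mean, lambda1, nu1 = conjugate_update(
    gammas=[[1.0, 0.0], [3.0, 2.0]],
    mu0=[0.0, 0.0], kappa0=2.0,
    lambda0=[[1.0, 0.0], [0.0, 1.0]], nu0=4,
)
```

With these values the group average (2, 1) is shrunk halfway towards the prior mean because $\kappa_0 = G = 2$, giving a conditional mean of (1.0, 0.5).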
Combining the updates
Sequentially the within-group ministeps $v$, the group-level parameters $\gamma$, and the global parameters
$\eta, \mu, \Sigma$ are updated. To achieve good mixing, more updates are required for $v$ than for the
other parameters.
Initial values
Initial values are obtained in a procedure consisting of two stages. First, parameters are estimated
for the model where all parameters in $\theta^{[g]}$ that are coefficients in the linear predictor are
assumed to be constant across groups, but the basic rate parameters are allowed to be group-dependent,
i.e., a multi-group model. This estimation uses the Robbins-Monro algorithm proposed
for obtaining method-of-moments estimates in Snijders (2001), in a brief version because
great precision is not necessary here. This yields an estimated value $\hat{\theta}^{(0)}$, with estimated covariance
matrix $C^{[0]}$. The components of this vector and matrix corresponding to $\eta$ are denoted $\hat{\eta}^{(0)}$
and $C_\eta^{[0]}$.
Second, for each of the groups $g$ separately, starting from the provisional estimate $\hat{\theta}^{(0)}$, and
keeping the components $\hat{\eta}^{(0)}$ constant, a small number of Robbins-Monro steps again following
Snijders (2001) are taken to improve the estimate of $\theta^{[g]}$. The result is used as initial value for
$\hat{\theta}^{[g]}$. The covariance matrix $C_g^{(1)}$ for the proposal distribution for $\theta^{[g]}$ is a weighted combination
of the covariance matrix for this estimate and $C^{[0]}$.
Data were collected in the first year of secondary school in 14 schools in the Netherlands in
2003-2004, with students being on average slightly older than 12 years at the first wave. There
were four waves, with three months in between. Allowing for the social processes to be unstable
at the very start of the school year, we used the last three waves. These will be called waves 1-3
from now on, which yields period 1 as the period from wave 1 to wave 2 and period 2 as the
period from wave 2 to 3. Network X was the friendship network, Z the two-mode network of
delinquent behaviours with four second-mode nodes: stealing, vandalism, graffiti, and fighting.
Covariates used were sex (female=1, male=2), language spoken at home, and advice. The Dutch
secondary school system is tiered and ‘advice’ here is defined as the recommended secondary
school level according to the advice given in the last grade of primary school. It is ordered from
low to high with range 1–9.
A basic measure for network stability in a period is the Jaccard coefficient (Batagelj and Bren,
1995), defined for network $X$ and period $m$ as
\[
\frac{\sum_{ij} \min\{x_{ij}(t_m), x_{ij}(t_{m+1})\}}{\sum_{ij} \max\{x_{ij}(t_m), x_{ij}(t_{m+1})\}} .
\]
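The Jaccard coefficient is straightforward to compute from two consecutive observations of a network. The following sketch assumes 0/1 matrices stored as lists of lists; the function name `jaccard` and the example matrices are illustrative, and returning NaN for two empty networks is a choice made here, not a convention taken from the text.

```python
def jaccard(x_prev, x_next):
    """Jaccard coefficient between two 0/1 matrices of equal dimensions.

    Numerator: ties present at both waves; denominator: ties present at
    either wave. Works for one-mode (n-by-n) and two-mode (n-by-H) networks.
    """
    both = sum(min(a, b) for row_p, row_n in zip(x_prev, x_next)
               for a, b in zip(row_p, row_n))
    either = sum(max(a, b) for row_p, row_n in zip(x_prev, x_next)
                 for a, b in zip(row_p, row_n))
    return both / either if either > 0 else float("nan")


# Toy example: two of the four ties observed at either wave are stable.
wave1 = [[0, 1, 1], [0, 0, 0], [1, 0, 0]]
wave2 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
stability = jaccard(wave1, wave2)
```

In the toy example two ties are present at both waves and four at either wave, so the coefficient is 0.5.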
8. Results
A measure for delinquency is the outdegree in the two-mode network, i.e., the number of delinquent
behaviours reported by a student. For this variable and for the covariates the means,
within-classroom and between-classroom standard deviations ($\hat{\sigma}$ and $\hat{\tau}$), and the intraclass correlation
coefficients (icc) (calculated according to Snijders and Bosker, 2012, Chapter 3) are
reported in Table 2. From the intraclass correlation coefficients we see that the classrooms are
quite homogeneous with respect to advice, not assortative with respect to sex, while for the level
of delinquency and whether the Dutch language is spoken at home assortativity is positive but
low.
Some descriptive statistics for the set of 81 friendship networks are presented in Table 3. These
include reciprocity, defined as the proportion of ties i → j that is reciprocated by j → i,
and transitivity, defined as the proportion of two-paths i → j → h that is closed by i → h.
Average degrees are about 4, average reciprocity is about 0.60, and average transitivity is about
0.56. These are quite usual figures for friendship networks. The Jaccard measure for network
stability ranges for friendship from 0.28 to 0.75, with a mean of 0.51. This indicates that a good
proportion of ties remains in place from one wave to the next.
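The descriptive measures defined above can be computed as follows. This is a minimal sketch with hypothetical helper names; it assumes a 0/1 adjacency matrix without self-ties and counts two-paths over distinct actors i, j, h only, which is one reading of the definition of transitivity given in the text.

```python
def average_degree(x):
    """Average outdegree of a digraph given as a 0/1 adjacency matrix."""
    return sum(sum(row) for row in x) / len(x)


def reciprocity(x):
    """Proportion of ties i -> j that are reciprocated by j -> i."""
    n = len(x)
    ties = sum(x[i][j] for i in range(n) for j in range(n) if i != j)
    mutual = sum(x[i][j] * x[j][i] for i in range(n) for j in range(n) if i != j)
    return mutual / ties if ties > 0 else float("nan")


def transitivity(x):
    """Proportion of two-paths i -> j -> h that are closed by i -> h."""
    n = len(x)
    two_paths = closed = 0
    for i in range(n):
        for j in range(n):
            if j == i or not x[i][j]:
                continue
            for h in range(n):
                if h in (i, j) or not x[j][h]:
                    continue
                two_paths += 1          # found a two-path i -> j -> h
                closed += x[i][h]       # closed if the tie i -> h exists
    return closed / two_paths if two_paths > 0 else float("nan")


# Toy friendship network with 3 actors.
x = [[0, 1, 1], [1, 0, 1], [0, 0, 0]]
summary = (average_degree(x), reciprocity(x), transitivity(x))
```

For the toy network the average degree is 4/3, half of the ties are reciprocated, and both two-paths are closed.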
Some descriptive statistics for the two-mode delinquency networks are given in Table 4. The
students report on average slightly less than one out of the four delinquent acts. The Jaccard
measure for stability ranges from 0.21 to 0.70, with a mean of 0.41. Here also there is some
change from one wave to the next, but not too much.
The combination means that, for a given delinquent act h, if more of i’s friends practise it then i
will have a higher probability of also starting to practise it and a lower probability of stopping
with it; by contrast, if i’s friends are more delinquent on average but none of them practises act h
then i will have a lower probability of starting to practise h, and a higher probability of stopping.
However, the former, positive influence effect is stronger than the latter, negative influence
effect, because its parameter is higher in absolute value and the range of the explanatory variable
corresponding to ‘odd’ — which is the number of friends practising act h — is equal to 9, which
is larger than the within-group range of the explanatory variable corresponding to ‘od_av’, equal
to 4.
Concluding, there is a weak social selection effect, where those who are more delinquent tend
to nominate fewer friends, and a rather strong social influence effect in the sense of practising
the same delinquent behaviours as one’s friends, but a weak effect of avoiding the delinquent
behaviours not practised by one’s friends if the friends are more delinquent otherwise. This
contrasts with the results of Knecht et al. (2010), who used the same data set but found no evidence
of social influence. That publication used a simpler two-stage multilevel network method
which allowed the inclusion of only 21 classrooms of this data set. Another difference is that the
earlier publication did not distinguish the four separate delinquent acts in a two-mode network,
but used an aggregate measure of delinquency. This leads to incomparability between the data
used in the analysis, because the two-mode network representation required dichotomization
of the four delinquency variables, while they were added, without dichotomization, in Knecht
et al. (2010).
9. Conclusions
Network analysis has typically been concerned with describing and modelling network processes
for individual networks only. We have proposed a modelling framework for generalising
network inference beyond the specifics of individual groups to a population of networks. The
model is a hierarchical extension of the Stochastic Actor-oriented Model (Snijders, 2017) for
longitudinal network panel data, using random coefficients to represent differences between
groups. This allows taking into consideration group-level effects, e.g., interventions or compositional
characteristics, and their cross-level interactions with within-group effects. A further
possibility is to investigate the network dynamics in many small groups, e.g., of size 5 to 10, for
which an analysis per group does not give meaningful results; an example is Dolgova (2019).
The methods are implemented in the R package RSiena (Ripley et al., 2021). They have been
available in beta versions for a few years, which has already led to applications, e.g., Boda (2018).
The MCMC algorithm proposed in this paper is a straightforward procedure, and future work
will be devoted to making it more efficient.
Acknowledgements
This work was supported in part by award R01HD052887 from the US Eunice Kennedy Shriver
National Institute of Child Health and Human Development (John M. Light, Principal Investigator).
We are grateful to Ruth Ripley for her programming and support in the foundational
stages of this project at Nuffield College and the Department of Statistics at Oxford.
10. References
Aldous, D. (1983) Minimization algorithms and random walk on the d-cube. The Annals of
Probability, 11, 403–413.
Anderson, B. S., Butts, C. and Carley, K. (1999) The interaction of size and density with graph-
level indices. Social Networks, 21, 239–267.
Batagelj, V. and Bren, M. (1995) Comparing resemblance measures. Journal of Classification,
12, 73–90.
Bergstrom, A. R. (1988) The history of continuous-time econometric models. Econometric
Theory, 4, 365–383.
Block, P. (2015) Reciprocity, transitivity, and the mysterious three-cycle. Social Networks, 40,
163–173.
Block, P., Koskinen, J., Hollway, J., Steglich, C. and Stadtfeld, C. (2018) Change we can believe
in: comparing longitudinal network models on consistency, interpretability and predictive
power. Social Networks, 52, 180–191.
Boda, Z. (2018) Social influence on observed race. Sociological Science, 5, 29–57.
Brandes, U., Robins, G., McCranie, A. and Wasserman, S. (2013) What is network science?
Network Science, 1, 1–15.
Dolgova, E. (2019) On Getting Along and Getting Ahead: How Personality Affects Social Network
Dynamics. Ph.D. thesis, Erasmus University, Rotterdam. URL:
https://repub.eur.nl/pub/119150/Dissertation_Dolgova_final_printer2.pdf.
Eager, C. and Roy, J. (2017) Mixed effects models are sometimes terrible.
Elmer, T., Boda, Z. and Stadtfeld, C. (2017) The co-evolution of emotional well-being with
weak and strong friendship ties. Network Science, 5, 278–307.
Entwisle, B., Faust, K., Rindfuss, R. R. and Kaneda, T. (2007) Networks and contexts: Variation
in the structure of social ties. American Journal of Sociology, 112, 1495–1533.
Erdős, P. and Rényi, A. (1960) On the evolution of random graphs. A Matematikai Kutató
Intézet Közleményei, 5, 17–61.
Faust, K. and Skvoretz, J. (2002) Comparing networks across space and time, size and species.
Sociological Methodology, 32, 267–299.
Fujimoto, K., Snijders, T. and Valente, T. W. (2018) Multivariate dynamics of one-mode and
two-mode networks: Explaining similarity in sports participation among friends. Network
Science, 6, 370–395.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A. and Rubin, D. B. (2014)
Bayesian Data Analysis. Boca Raton, FL: Chapman & Hall / CRC, 3rd edn.
Gelman, A., Roberts, G. O. and Gilks, W. R. (1996) Efficient Metropolis jumping rules. In
Bayesian Statistics 5 (eds. J. Berardo, J. Berger, A. Dawid and A. Smith), 599–607. Oxford:
Clarendon Press.
Goldstein, H. (2011) Multilevel Statistical Models. London: Edward Arnold, 4th edn.
Greenan, C. C. (2015) Diffusion of innovations in dynamic networks. Journal of the Royal
Statistical Society, Series A, 178, 147–166.
Hamerle, A., Singer, H. and Nagl, W. (1993) Identification and estimation of continuous time
dynamic systems with exogenous variables using panel data. Econometric Theory, 9, 283–
295.
Holland, P. W. and Leinhardt, S. (1977) A dynamic model for social networks. Journal of
Mathematical Sociology, 5, 5–20.
Huitsing, G., Snijders, T. A., Van Duijn, M. A. and Veenstra, R. (2014) Victims, bullies, and
their defenders: A longitudinal study of the coevolution of positive and negative networks.
Development and Psychopathology, 26, 645–659.
Jeffreys, H. (1998) The Theory of Probability. Oxford: Oxford University Press.
Kalter, F., Heath, A. F., Hewstone, M., Jonsson, J. O., Kalmijn, M., Kogan, I. and van Tubergen,
F. (2013) Children of immigrants longitudinal survey in four European countries (CILS4EU):
Motivation, aims, and design. Tech. rep., GESIS: GESIS Data Archive, Cologne.
Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J. P., Moreno, Y. and Porter, M. A. (2014)
Multilayer networks. Journal of Complex Networks, 2, 203–271.
Knecht, A. (2006) Networks and actor attributes in early adolescence [2003/04]. Ics-codebook
no. 61, The Netherlands Research School ICS, Department of Sociology, University of
Utrecht, Utrecht. Persistent data set identifier urn:nbn:nl:ui:13-ehzl-c6.
Knecht, A., Snijders, T. A. B., Baerveldt, C., Steglich, C. and Raub, W. (2010) Friendship and
delinquency: Selection and influence processes in early adolescence. Social Development,
19, 494–514.
Köllisch, T. and Oberwittler, D. (2004) Wie ehrlich berichten männliche jugendliche über ihr
delinquentes verhalten? KZfSS Kölner Zeitschrift für Soziologie und Sozialpsychologie, 56,
708–735.
Koskinen, J. H. and Snijders, T. A. B. (2007) Bayesian inference for dynamic social network
data. Journal of Statistical Planning and Inference, 137, 3930–3938.
Krivitsky, P. N., Handcock, M. S. and Morris, M. (2011) Adjusting for network size and compo-
sition effects in exponential-family random graph models. Statistical Methodology, 8, 319–
339.
Lomi, A., Snijders, T. A. B., Steglich, C. and Torlò, V. J. (2011) Why are some more peer
than others? Evidence from a longitudinal study of social networks and individual academic
performance. Social Science Research, 40, 1506–1520.
Lomi, A. and Stadtfeld, C. (2014) Social networks and social settings: Developing a coevolu-
tionary view. KZfSS Kölner Zeitschrift für Soziologie und Sozialpsychologie, 66, 395–415.
Maddala, G. (1983) Limited-dependent and Qualitative Variables in Econometrics. Cambridge:
Cambridge University Press, 3rd edn.
Magnani, M. and Wasserman, S. (2017) Introduction to the special issue on multilayer networks.
Network Science, 5, 141–143.
O’Hagan, A. and Forster, J. (2004) Bayesian Inference, vol. 2B of Kendall’s Advanced Theory
of Statistics. London: Arnold.
Ripley, R. M., Snijders, T. A. B., Bóda, Z., Vörös, A. and Preciado, P. (2021) Manual for Siena
version 4.0. Tech. rep., Oxford: University of Oxford, Department of Statistics; Nuffield
College. URL: http://www.stats.ox.ac.uk/siena/.
Robins, G. (2015) Doing social network research: Network-based research design for social
scientists. London etc.: Sage.
Schweinberger, M. (2007) Statistical Methods for Studying the Evolution of Networks and Be-
havior. Ph.D. thesis, University of Groningen, Groningen.
Schweinberger, M., Krivitsky, P. N., Butts, C. T. and Stewart, J. (2020) Foundations of finite-,
super-, and infinite-population random graph inference. Statistical Science, 35, 627–662.
Shalizi, C. R. and Rinaldo, A. (2013) Consistency under sampling of exponential random graph
models. Annals of Statistics, 41, 508–535.
Shalizi, C. R. and Thomas, A. C. (2011) Homophily and contagion are generically confounded
in observational social network studies. Sociological Methods & Research, 40, 211–239.
Singer, H. (1996) Continuous-time dynamic models for panel data. In Analysis of Change (eds.
U. Engel and J. Reinecke), chap. 6, 113–134. De Gruyter.
Slaughter, A. J. and Koehly, L. M. (2016) Multilevel models for social networks: Hierarchical
Bayesian approaches to exponential random graph modeling. Social Networks, 44, 334–345.
Snijders, T. A. B. (2001) The statistical evaluation of social network dynamics. In Sociological
Methodology – 2001 (eds. M. E. Sobel and M. P. Becker), vol. 31, 361–395. Boston and
London: Basil Blackwell.
— (2005) Models for longitudinal network data. In Models and Methods in Social Network
Analysis (eds. P. Carrington, J. Scott and S. Wasserman), chap. 11, 215–247. New York:
Cambridge University Press.
— (2016) The multiple flavours of multilevel issues for networks. In Multilevel Network Anal-
ysis for the Social Sciences (eds. E. Lazega and T. A. B. Snijders), 15–46. Cham: Springer.
— (2017) Stochastic actor-oriented models for network dynamics. Annual Review of Statistics
and Its Application, 4, 343–363.
Snijders, T. A. B. and Baerveldt, C. (2003) A multilevel network study of the effects of delin-
quent behavior on friendship evolution. Journal of Mathematical Sociology, 27, 123–151.
Snijders, T. A. B. and Bosker, R. J. (2012) Multilevel Analysis: An Introduction to Basic and
Advanced Multilevel Modeling. London: Sage, 2nd edn.
Snijders, T. A. B., Koskinen, J. H. and Schweinberger, M. (2010) Maximum likelihood estima-
tion for social network dynamics. Annals of Applied Statistics, 4, 567–588.
Snijders, T. A. B., Lomi, A. and Torlò, V. (2013) A model for the multiplex dynamics of two-
mode and one-mode networks, with an application to employment preference, friendship, and
advice. Social Networks, 35, 265–276.
Steglich, C. E. G., Snijders, T. A. B. and Pearson, M. A. (2010) Dynamic networks and behavior:
Separating selection from influence. Sociological Methodology, 40, 329–393.
Veenstra, R., Dijkstra, J. K., Steglich, C. and Van Zalk, M. H. (2013) Network–behavior dy-
namics. Journal of Research on Adolescence, 23, 399–412.
Wasserman, S. and Faust, K. (1994) Social Network Analysis: Methods and Applications. New
York and Cambridge: Cambridge University Press.
Appendix
A. Statistics
A comprehensive list and definition of all effects currently employed in SAOMs is provided in
Ripley et al. (2021, Chapter 12). The effects used in this paper are defined as follows.
The effects for network Z are similar. The cross-network effects were defined in Section 3.2.1.
The effects of covariates in Table 5 are ego effects, unless indicated as ‘same’ or ‘similarity’.
B. Prior sensitivity
[Figure 2: ten panels, one per effect: basic rate parameter friends, outdegree (density),
reciprocity, transitive triplets, transitive recipr. triplets, 3-cycles, sex similarity,
delinq alter, delinq ego, delinq ego x delinq alter. Vertical axis: $\mu$; horizontal axis:
$\log(\sigma_0^2)$, with ticks from −1 to 4.]
Fig. 2. Credibility intervals (95%, dark; 99%, light) for $\mu$ with default prior and
$\Lambda_0 = \sigma_0^2 I$ for different values of $\log(\sigma_0^2)$
the Normal-Inverse-Wishart prior with $\mu_0 = 0$, $\nu_0 = 12$, $\kappa_0 = 1$,
$\Lambda_0 = \sigma_0^2 I$, for different values of $\sigma_0^2$ (a very small number of
draws, 300, has been used here).
For small values of $\sigma_0^2$, the credibility intervals are noticeably tighter than for
large values of $\sigma_0^2$, where the prior variance overwhelms the data. The central
tendencies (posterior means) are remarkably constant as a function of $\sigma_0^2$, and are
hardly pulled towards the prior mean of zero, even for values of $\sigma_0^2$ as small as 0.25
(the smallest value in the plots).
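The role of $\sigma_0^2$ can be illustrated with a much simplified sketch in which the group-level parameters are treated as if they were observed directly, so that the conjugate Normal-Inverse-Wishart update is available in closed form. This is an assumption made purely for illustration: in the hierarchical SAOM the group-level parameters are latent and are sampled by MCMC. The function name `niw_posterior_mu_summary`, the simulated array `gamma`, and the sizes G = 21 and p = 10 are hypothetical choices, not taken from the paper's code.

```python
import numpy as np


def niw_posterior_mu_summary(gamma, sigma0_sq, mu0=None, kappa0=1.0, nu0=12.0):
    """Posterior mean and marginal sd of mu under a Normal-Inverse-Wishart prior.

    gamma is a (G, p) array of group-level parameters, treated here as observed
    (a simplifying assumption; in the hierarchical SAOM they are latent).
    The prior scale matrix is Lambda0 = sigma0_sq * I.
    """
    G, p = gamma.shape
    mu0 = np.zeros(p) if mu0 is None else np.asarray(mu0, dtype=float)
    lambda0 = sigma0_sq * np.eye(p)

    gbar = gamma.mean(axis=0)
    centred = gamma - gbar
    S = centred.T @ centred  # sum of squares about the sample mean

    # Standard conjugate updates for the Normal-Inverse-Wishart family.
    kappa_n = kappa0 + G
    nu_n = nu0 + G
    mu_n = (kappa0 * mu0 + G * gbar) / kappa_n
    diff = (gbar - mu0)[:, None]
    lambda_n = lambda0 + S + (kappa0 * G / kappa_n) * (diff @ diff.T)

    # The marginal posterior of mu is multivariate t with df degrees of freedom.
    df = nu_n - p + 1
    scale = lambda_n / (kappa_n * df)
    sd = np.sqrt(np.diag(scale) * df / (df - 2))
    return mu_n, sd


# Hypothetical data: 21 groups, 10 effects, simulated rather than estimated.
rng = np.random.default_rng(0)
gamma = rng.normal(loc=0.5, scale=1.0, size=(21, 10))

# Grid spanning the range of prior scales mentioned in the text (0.25 to 113).
for sigma0_sq in np.exp(np.linspace(np.log(0.25), np.log(113.0), 6)):
    mu_n, sd = niw_posterior_mu_summary(gamma, sigma0_sq)
    print(f"sigma0^2 = {sigma0_sq:8.2f}  mean[0] = {mu_n[0]: .3f}  sd[0] = {sd[0]:.3f}")
```

In this simplified setting the posterior mean of $\mu$ does not involve $\Lambda_0$ at all, while the posterior spread grows with $\sigma_0^2$, which is in line with the pattern described above.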
The influence on the group parameters $\gamma_k^{[g]}$ of the same set of priors is illustrated
in Figure 3. Note the difference in vertical scale. The inference on these parameters is
remarkably robust to the prior variance. Only for extreme values of $\sigma_0^2$ and for a few
classrooms do we see a big change in group-level parameters. The very wide intervals are due to
two specific classrooms. More specifically, in one classroom (number 20) ‘transitive
reciprocated triplets’ and ‘3-cycles’ were collinear, which manifests itself in extremely large
intervals for these parameters when large $\sigma_0^2$ prevents this classroom from borrowing
information from the other classrooms. Another classroom (number 11) had a similar issue with
structural parameters and in addition a ‘sex similarity’ effect that is difficult to estimate
because of the very skewed sex distribution in this classroom. The issues with these two
classrooms also manifest themselves in increasingly poor mixing for $\gamma^{[g]}$ for large
values of $\sigma_0^2$ (results available upon request from the authors).
[Figure 3: ten panels, one per effect: basic rate parameter friends, outdegree (density),
reciprocity, transitive triplets, transitive recipr. triplets, 3-cycles, sex similarity,
delinq alter, delinq ego, delinq ego x delinq alter. Vertical axis: $\theta$; horizontal axis:
groups, with ticks from 5 to 20.]
Fig. 3. Equal-tailed 95% prediction intervals for $\gamma^{[g]}$ for different values of
$\sigma_0^2$, from $\sigma_0^2 = 0.25$ (light grey) to $\sigma_0^2 = 113$ (dark grey). Groups
ordered according to posterior predictive mean
B.2. Reference prior
We may consider the influence of the prior for $\mu$ and $\Sigma$ on the predictive
distributions for $\gamma^{[g]}$ by comparing these to posteriors from group-level parameters
estimated independently. The a priori dependence of $\mu$ on $\Sigma$ can be decoupled by
setting $\pi(\mu, \Sigma) = \pi(\mu)\pi(\Sigma)$. When $G > p_1 + 1$, we may choose an improper
prior for $(\mu, \Sigma)$ for reference.
Jeffreys' rule (Jeffreys, 1998) is a principled choice for a reference prior. For the
multivariate normal distribution, it is given by
$$\pi(\mu, \Sigma) \propto |\Sigma|^{-(p_1 + 2)/2}.$$
For the conjugate model this corresponds to $\kappa_0 \to 0$, $\nu_0 \to 0$, and letting the
determinant of $\Lambda_0$ tend to 0. Jeffreys' prior is still conjugate for $\mu$ and
$\Sigma$, and as such does not alter the updating scheme outlined in Section 6.
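As a sketch of this limiting argument, assume that the Normal-Inverse-Wishart density is parameterised as in Gelman et al. (2014) and write $p_1$ for the dimension of $\mu$. Both are assumptions about notation, since the parameterisation of Section 6 is not repeated here. Letting $\Lambda_0$ shrink to the zero matrix (so that its determinant tends to 0) removes the exponential factor in the limit:

```latex
\pi(\mu, \Sigma \mid \mu_0, \kappa_0, \nu_0, \Lambda_0)
  \propto |\Sigma|^{-\left(\frac{\nu_0 + p_1}{2} + 1\right)}
  \exp\left\{ -\tfrac{1}{2} \operatorname{tr}\left(\Lambda_0 \Sigma^{-1}\right)
  - \tfrac{\kappa_0}{2} (\mu - \mu_0)^\top \Sigma^{-1} (\mu - \mu_0) \right\}
  \;\longrightarrow\; |\Sigma|^{-(p_1 + 2)/2}
  \quad \text{as } \kappa_0 \to 0,\ \nu_0 \to 0,\ \Lambda_0 \to 0 .
```

Under this parameterisation the limit is free of $\mu$, so the prior is flat in $\mu$ and depends on $\Sigma$ only through its determinant.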
Figure 4 compares the inference obtained from fitting the model separately to each classroom,
assuming a constant prior (horizontal axis), with the predictive distributions obtained from
the hierarchical SAOM with Jeffreys' prior (vertical axis). The two previously mentioned
classrooms 11 and 20 are omitted for the reasons given above; both analyses are based on the
other 19 classrooms.
Figure 4 demonstrates a negligible influence of the prior on the distributions for
$\gamma^{[g]}$. Figure 5 presents the posterior densities for $\mu_k$ and shows that these are
also centered on the raw, unweighted means $\bar{\gamma}_k$ of $\gamma_k^{[g]}$ from the
separate estimations. This shows that imposing the multivariate normal model for the
group-level parameters does not yield results that differ strongly from the individual
group-level inference. To make use of all classrooms, however, we would require additional
classrooms to borrow strength across groups, and would need to impose a more informative prior
for $\mu$ and $\Sigma$.
[Figure 4: ten panels, one per effect: basic rate parameter friends, outdegree (density),
reciprocity, transitive triplets, transitive recipr. triplets, 3-cycles, sex similarity,
delinq alter, delinq ego, delinq ego x delinq alter. Horizontal axis: independent fit; vertical
axis: hierarchical SAOM.]
Fig. 4. Prediction intervals for $\gamma^{[g]}$ fitted independently (horizontal axis) against
predictions from the hierarchical SAOM using Jeffreys' prior (vertical axis), excluding
classrooms $g = 11, 20$
[Figure 5: ten density panels, one per effect, in the same order as in Figure 4.]
Fig. 5. Posteriors for $\mu$ fitted using Jeffreys' prior, with the vertical line representing
$\bar{\gamma}_k$ and the horizontal bar $\bar{\gamma}_k \pm 2\,\mathrm{sd}(\gamma_k^{[g]})$
from independently fitting each group (excluding the rogue classrooms $g = 11, 20$)
C. Posteriors
For the estimated model of Table 5, Figure 6 presents the posterior distributions of the
population mean $\mu_k$ for the delinquency outdegree - activity effect and of the constant
parameter $\eta_k$ for the effect (‘odd’) of same delinquency acts as friends. For both
parameters it is evident that they are positive with a high posterior probability. There is
less posterior uncertainty about $\mu_k$ than about $\eta_k$, but the former is a population
mean of varying group-wise parameters $\gamma_k^{[g]}$ whereas the latter is a constant
parameter. This variability across groups is illustrated in Figure 7 (right panel), which shows
boxplots (without outliers) of the posterior distributions of $\gamma_k^{[g]}$ for groups $g$
ordered according to the posterior means, with horizontal credibility bands in grey for
$\mu_k$. The variability in group-level means is greater than the variability in $\mu_k$, but
$\gamma^{[g]}$ is clearly positive with high posterior probability for all groups $g$. The
length of the 95% CI for delinquency outdegree - activity is 0.072 and the length of the 95% CI
for same language is 0.132, but both parameters are positive with high posterior probability.
There is a greater variation in the group-level parameters for same language, however
(Figure 7, left panel), and $\Pr(\gamma^{[g]} > 0 \mid y)$ ranges from 0.30 to 1.00 with a
median of 0.88.
[Figure 6: two posterior density curves, labelled $\hat{\mu}_k$ and $\hat{\eta}_k$. Horizontal
axis: Parameter.]
Fig. 6. Posterior densities for $\mu_k$ for delinquency outdegree - activity and $\eta_k$ for
same delinquency acts as friends (‘odd’) with 95% credibility intervals in dark and light grey,
respectively.
[Figure 7: two boxplot panels. Horizontal axis: group; vertical axis: parameter.]
Fig. 7. Boxplots (without outliers) for posteriors of $\gamma_k^{[g]}$ for same language and
delinquency outdegree - activity, ordered by posterior means. The horizontal dark and light grey
bands indicate the 90% and 99% credibility intervals for $\mu_k$.
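Group-wise posterior probabilities such as $\Pr(\gamma^{[g]} > 0 \mid y)$, reported in Appendix C, are straightforward to compute once posterior draws are available. The following is a minimal sketch, assuming the draws are stored as an array with one row per MCMC iteration and one column per group. The draws below are simulated for illustration only, and the names `draws` and `prob_positive` as well as the number of groups are hypothetical.

```python
import numpy as np


def prob_positive(draws):
    """Monte Carlo estimate of Pr(gamma^[g] > 0 | y) for each group.

    draws is an (n_draws, n_groups) array of posterior draws; the estimate
    is the proportion of draws above zero in each column.
    """
    return (np.asarray(draws) > 0).mean(axis=0)


# Hypothetical posterior draws: simulated here, not output of the fitted model.
rng = np.random.default_rng(1)
n_draws, n_groups = 2000, 21
group_centres = rng.normal(0.2, 0.15, size=n_groups)
draws = rng.normal(group_centres, 0.2, size=(n_draws, n_groups))

p_pos = prob_positive(draws)
print("min:", p_pos.min(), "median:", np.median(p_pos), "max:", p_pos.max())
```

Summarising the resulting vector by its minimum, median, and maximum gives the kind of range reported in the text for the same language effect.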