7) Link Prediction in Dynamic Networks Using Time Aware Network Embedding and Time Series Forecasting
https://doi.org/10.1007/s12652-020-02289-0
ORIGINAL RESEARCH
Abstract
As most real-world networks evolve over time, link prediction over such dynamic networks has become a challenging issue. Recent research focuses on network embedding to improve the performance of the link prediction task. Most network embedding methods are only applicable to static networks and therefore cannot capture the temporal variations of dynamic networks. In this work, we propose a time-aware network embedding method which generates node embeddings by capturing the temporal dynamics of evolving networks. Unlike existing works which use deep architectures, we design an evolving skip-gram architecture to create dynamic node embeddings. We use the node embedding similarities between consecutive snapshots to construct a univariate time series of node similarities. Further, we use time series forecasting with an autoregressive integrated moving average (ARIMA) model to predict the future links. We conduct experiments using dynamic network snapshot datasets from various domains and demonstrate the advantages of our system compared to other state-of-the-art methods. We show that combining network embedding with time series forecasting methods can be an efficient solution to improve the quality of link prediction in dynamic networks.
Recent research has moved towards using representation learning methods for network embedding. These methods are highly scalable and can handle the complex non-linear structure that is an inherent property of networked data. DeepWalk (Perozzi et al. 2014), Node2vec (Grover and Leskovec 2016) and SDNE (Wang et al. 2016) are some popular works which use representation learning for generating network embeddings.

Most real-world networks, like social networks and citation networks, evolve with time. A social network evolves by adding new people and relationships, whereas new papers and citations lead to the growth of a citation network. Such networks can be mathematically modelled either as a dynamic graph, which is represented as a series of snapshots, or as a temporal network with the duration of interaction or the interval timestamped on the edges. Even though some studies (Casteigts et al. 2012; Blonder et al. 2012) focused on learning network dynamics, identifying the mechanism that leads to a network's evolution remains a challenging issue. In this work, we investigate how to incorporate the evolution mechanism into network embedding so as to generate node embeddings that evolve over time. Further, we create a similarity-based time series using the node embeddings generated from dynamic network snapshots and use time series forecasting for predicting future links in the successive snapshots.

Generating an embedding from a dynamic network is not straightforward. Embedding snapshots independently and aligning them across consecutive time steps is computationally expensive. Nodes and edges may vary across consecutive time steps, which demands new techniques to be incorporated into static embedding methods. Successive snapshots may not differ much, and the embeddings generated must be stable across the various snapshots. The size of many real-world networks may grow exponentially over a period of time, which demands that the embedding method be highly scalable. Some researchers have already attempted to generate node embeddings from dynamic networks. All these works used either matrix factorization based models (Zhu et al. 2016) or deep neural architectures (Goyal et al. 2018), which are computationally complex.

In this paper, we present a word2vec based model for generating time-aware embeddings from dynamic networks. Word2vec (Mikolov et al. 2013a, b) is a very successful model for generating word representations which showed tremendous improvements in solving many tasks related to natural language processing and text mining. Word2vec defines a shallow three-layer neural network architecture whose objective is to place words that tend to co-occur close to each other in a low dimensional vector space. Word2vec later provided inspiration for many network embedding works (Perozzi et al. 2014; Grover and Leskovec 2016) which used neural network architectures to generate node embeddings. These works perform random walks over the network to generate the analogue of word2vec's sentences, and use a skip-gram architecture to maximize the probability that nodes which tend to co-occur on a truncated walk lie close in the embedding space. A general architecture of the skip-gram based procedure for network embedding is shown in Fig. 1.

In all existing works, the embeddings generated after training on the complete set of snapshots are used to predict the future links. But non-linear fluctuations in connectivity patterns may occur within a series of consecutive snapshots, which may reduce the efficiency of such predictions. To avoid such problems, we construct a time series based on node embedding similarities between consecutive snapshots and further use time series forecasting with the ARIMA model (Brockwell and Davis 2016) to perform link prediction.

The main contributions of our work are as follows.

i) We propose a representation learning architecture, a modification of the skip-gram architecture, that can generate embeddings from dynamic networks. The proposed method can learn time-aware embeddings from network snapshots that are captured over a period of time.
ii) We utilize the time-aware embeddings to construct a univariate time series based on the node embedding similarities and perform time series forecasting using the ARIMA model to predict future links.
iii) To the best of our knowledge, this is the first attempt which combines both dynamic network embedding and time series prediction based approaches for link prediction in dynamic networks.
iv) We conduct experiments on real-world dynamic networks and show that the proposed method can provide better link prediction accuracy compared to state-of-the-art methods.

2 Background

In this section, we define the terminologies used in this work and the problem definitions.

Definition 1 A network is a graph G = (V, E), where V = {v1, v2, ..., vn} is the set of vertices and e ∈ E is an edge between any two vertices. An adjacency matrix A defines the connectivity of G: Aij = 1 if vi and vj are connected, else Aij = 0.

Definition 2 A dynamic network can be modeled as contact sequences, snapshots, or interval graphs. In this work, we model a dynamic network as a series of snapshots G = {G1, G2, ..., Gn}, where Gi = (Vi, Ei) and n is the number of snapshots. New nodes and edges may be added to and removed from the network over time, and we consider that the network snapshots at different time steps capture this information.

Problem 1. Network embedding: Given a network G = (V, E), the task is to learn a transformation function f : vi → Ki ∈ R^d, where d << |V|, such that f preserves the proximity information of the network. d defines the number of dimensions of the real-valued vector. Informally, if two nodes are connected or share similar neighborhoods in the network, their corresponding vector representations will occupy nearby positions in the vector space.

Problem 2. Time-aware network embedding (TANE): This is an extension of network embedding which can better represent dynamic networks in vector space. Given a series of network snapshots at n time steps, G = {G1, G2, ..., Gn}, time-aware network embedding tries to learn a mapping function f : vi → Ki ∈ R^d, where d << |V|, from the network snapshots which captures both the network proximities and the temporal dynamics of the network. In the embedding space thus generated, two vectors being close implies that the corresponding nodes preserve similar neighborhoods, and their interactions have constantly evolved.
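To make the snapshot model concrete, a minimal sketch follows (our illustration, not code from the paper); a dynamic network is simply a list of per-timestep graphs, here built with networkx, and the specific nodes and edges are made up.

```python
import networkx as nx

# A dynamic network G = {G_1, G_2, ..., G_n}: one snapshot per time step.
# Nodes and edges may appear or disappear between consecutive snapshots.
snapshots = []

G1 = nx.Graph()
G1.add_edges_from([(1, 2), (2, 3)])      # interactions observed at t = 1
snapshots.append(G1)

G2 = G1.copy()
G2.add_edges_from([(3, 4), (1, 3)])      # node 4 and new links appear at t = 2
snapshots.append(G2)

# A network embedding f: V -> R^d is then a mapping {node: d-dim vector},
# learned per snapshot (Sect. 4); link prediction asks for G_{t+1}.
```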
Problem 3. Link prediction in dynamic networks: Given a series of network snapshots at t discrete time steps, G = {G1, G2, ..., Gt}, the task is to predict the future graph at time t + 1, i.e. Gt+1.

3 Related works

Apart from using graph algorithms, machine learning on graphs is an interesting direction in network mining research. Network embedding is one such domain which has gained attention in recent years. The basic purpose of network embedding is to encode the network structure into a low dimensional vector space while preserving the topological properties, and thereby to perform downstream mining tasks like node classification and link prediction with improved accuracy. Network embedding research started by applying non-linear dimensionality reduction methods to generate node embeddings, followed by specialized graph factorization based methods. As representation learning emerged as a successful method to solve many problems related to computer vision and text mining, the domain of network embedding was also influenced by these methods. Among them, word2vec (Mikolov et al. 2013a, b) is one popular architecture which uses a skip-gram model to generate word representations by training a neural network on a text corpus. Many network embedding methods like DeepWalk (Perozzi et al. 2014), Node2vec (Grover and Leskovec 2016) and Struc2vec (Ribeiro et al. 2017) are inspired by word2vec. Another line of work includes methods based on deep architectures like deep belief networks, deep CNNs and GANs. Nodes and edges of real-world networks may be attributed and heterogeneous. TADW (Yang et al. 2015), TriDNR (Pan et al. 2016) and DeepGL (Rossi et al. 2017) are some works on attributed network embedding, and Metapath2vec (Dong et al. 2017), HNE (Chang et al. 2015) and Hin2vec (Fu et al. 2017) are some works on heterogeneous network embedding. All these methods are applicable only to static networks.

A dynamic network provides a more precise way of representing complex interactions compared to a static network, and the analysis of dynamic networks has wide applications in various domains. Embedding dynamic networks demands developing new techniques or adapting existing methods to incorporate dynamic network properties. The authors of (Zhu et al. 2016) attempt to incorporate a temporal regularizer into the matrix factorization framework so as to generate a temporal latent space from snapshot networks. DynGEM (Goyal et al. 2018) uses a deep belief network which generates node embeddings from a series of network snapshots. DynamicTriad (Zhou et al. 2018) considers triadic closure as a basic mechanism which leads to network evolution and uses it to generate node embeddings. DDNE (Li et al. 2018) is another work which uses a GRU encoder and a deep neural network decoder for node embedding generation. To capture the network dynamics effectively, the system needs to be trained using a large number of network snapshots. Such a task will be computationally complex if we are using
matrix factorization and deep architecture based methods. In this paper, we propose a modified skip-gram architecture, a shallow neural network that takes as input node sequences generated using random walks over a sequence of network snapshots, and generates node embeddings that constantly evolve over time.

Various approaches (Ma et al. 2017; Ahmed et al. 2018; Yasami and Safaei 2018; Wu et al. 2018) exist in the literature to perform link prediction on dynamic networks. As a dynamic network is one which evolves over time, time series modeling and prediction is a good approach towards studying link prediction in dynamic networks. Time series forecasting has been successfully used in many applications where we aim to predict the future values of a variable using past observations of the same variable. ARIMA (Brockwell and Davis 2016) is a successful model for capturing the non-linearities in time series data. Recent works on time series forecasting use soft computing based methods which include fuzzy neural networks (Soto et al. 2018), fuzzy aggregation models with modular neural networks (Soto et al. 2019), and evolutionary computing approaches (Gupta et al. 2018). Many researchers have approached problems related to dynamic networks from the perspective of time series forecasting. (Wu et al. 2018) studied the dynamics of a time-varying network as the problem of tracking a time series of finite dimensional vectors. The evolution of a temporal network can be well described using a time series of event sequences (Jo and Hiraoka 2019) which follow non-linear patterns. A detailed study (Zou et al. 2019) of complex network approaches to nonlinear time series analysis also exists in the literature. Recent studies (Güneş et al. 2016; Özcan and Öğüdücü 2016, 2017) show that time series forecasting based methods outperform other traditional methods for performing link prediction in dynamic networks. Most of these methods use node similarities and centrality measures for time series construction, and linear models for prediction, which fail to capture the non-linear temporal variations of connectivity patterns in dynamic networks. The proposed system generates time-aware embeddings from a dynamic network, performs time series construction using node embedding similarity scores and predicts future similarity scores using the ARIMA model.

4 System design

4.1 Random walk

A random walk is a discrete-time stochastic process which is widely used in the graph theory domain, particularly in applications like graph partitioning (Spielman and Teng 2004), community detection (Pons and Latapy 2005) and PageRank (Chung and Zhao 2010). In this work, we consider the random walk as a sampling technique to capture the local community structure of the network. A random walk on a graph G can be represented as a sequence of vertices v1, v2, v3, ..., vk such that adjacent nodes in the sequence are connected in G. A random walk of length k starts from any node, randomly selects a neighbor, visits that neighbor, and the process continues until k edges are covered. The next vertex is selected with uniform probability from the set of neighbors. The same can be modeled as a Markov chain, where the states of the chain are the vertices of the graph.

Let A be the adjacency matrix and d(i) = \sum_j A_{ij} the degree of node i. The transition probability between two connected vertices i and j can be represented as

$$T_{ij} = \frac{A_{ij}}{d(i)}$$

This can also be written as T = D^{-1}A, where D is the diagonal degree matrix with D_{ii} = d(i) and D_{ij} = 0 if i ≠ j.

Let P^0 denote the matrix of one-hot row vectors whose ith row has its ith entry equal to 1 and all others 0, indicating that the random walk can start at any vertex, and let the stationary transition probability between two connected vertices be T_{ij}. Assuming a random walk without restart, the transition probability after a single step can be computed as

$$P^{1} = P^{0}\,T$$

The transition probability after k steps of random walk (a walk of length k) can be computed as

$$P^{k} = P^{0}\,T^{k}$$

The ith row of this matrix represents the transition probabilities from vertex i to all other vertices after a random walk of length k.

If the graph is strongly connected, it can be observed that the probability distribution reaches a steady state. For a random walk to capture the local structure, it should be long enough to gather the topological information, but not so long that it approaches the stationary distribution. So in this work we use a truncated random walk where the length of the walk is fixed. The k-step transition probability between i and j for a random walk with fixed length depends on the node degrees d(i) and d(j).
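As a concrete illustration of these quantities (our own sketch, not the authors' code; the example graph is arbitrary), the following computes T = D^{-1}A and the k-step distribution P^k = P^0 T^k, and samples one truncated walk:

```python
import random
import numpy as np
import networkx as nx

G = nx.karate_club_graph()               # stand-in example graph (assumption)
A = nx.to_numpy_array(G)                 # adjacency matrix A
D_inv = np.diag(1.0 / A.sum(axis=1))     # D^-1, the inverse degree matrix
T = D_inv @ A                            # transition matrix T = D^-1 A

k = 5
P0 = np.eye(len(G))                      # one-hot start distributions P^0
Pk = P0 @ np.linalg.matrix_power(T, k)   # k-step transition probabilities P^k

def truncated_walk(G, start, length):
    """Fixed-length walk; the next vertex is uniform over the neighbors."""
    walk = [start]
    for _ in range(length):
        walk.append(random.choice(list(G.neighbors(walk[-1]))))
    return walk

print(truncated_walk(G, 0, k))           # e.g. [0, 5, 16, 6, 4, 10]
```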
For vertices which are closer in the graph, the value of P^k_{ij} will be high. Formally, we can say that two vertices i and j are similar if

$$\forall n \quad P^{k}_{in} \approx P^{k}_{jn}$$

i.e. i and j have similar transition probability distributions w.r.t. all other nodes. The distance between i and j can be represented as
$$d_{ij} = \sqrt{\sum_{m=1}^{n} \frac{(P^{k}_{im} - P^{k}_{jm})^{2}}{d(m)}}$$

The error value d_{ij} can also be interpreted as the distance between the two probability distributions P^k_{im} and P^k_{jm}. When we have a truncated random walk of length k (k << n), and when m is in the local neighborhood of i and j, the error value drops to a minimum. This shows that the truncated random walk can capture the local community structure of the graph.

4.2 Skip-gram architecture

A skip-gram model is a shallow neural network architecture used by word2vec to generate word embeddings. Skip-gram for network embedding (Perozzi et al. 2014) deploys the same three-layer neural network architecture used by word2vec. Here the objective is to maximize the probability of the neighboring nodes in the random walk, given the representation of a node. Skip-gram generates a d-dimensional representation 𝜙(v_i) ∈ R^d for each node v_i by maximizing the co-occurrence probability of its neighbors in the walk. The optimization function can be represented as

$$\max \; \log P(v_{i-w}, \ldots, v_{i+w} \mid \phi(v_i)) \tag{1}$$

where v_{i-w}, ..., v_{i+w} denotes the neighbors of v_i in the node sequence, and w is the context size. The output layer of the skip-gram neural architecture is a softmax function which computes the co-occurrence probability of the output and input words, w_o and w_i, as

$$p(w_o \mid w_i) = \frac{\exp({v'_{w_o}}^{\top} v_{w_i})}{\sum_{w=1}^{N} \exp({v'_{w}}^{\top} v_{w_i})} \tag{2}$$

where v'_w and v_w are the vector representations of the context and input word respectively, and N is the size of the vocabulary. Estimating the softmax function is computationally expensive, and word2vec approximates the softmax using two strategies, hierarchical softmax (Morin and Bengio 2005) and negative sampling (Goldberg and Levy 2014). These strategies reduce the time complexity of the skip-gram model and speed up the training process.
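The cost that motivates these approximations is visible in a direct evaluation of Eq. (2); the numpy sketch below (ours, with made-up dimensions) normalises over the entire vocabulary, which is exactly what hierarchical softmax and negative sampling avoid.

```python
import numpy as np

N, d = 1000, 128                        # vocabulary size, embedding dimension
V_in = 0.01 * np.random.randn(N, d)     # input vectors v_w (embedding matrix)
V_out = 0.01 * np.random.randn(N, d)    # output vectors v'_w (context matrix)

def cooccurrence_prob(wo, wi):
    """p(w_o | w_i) from Eq. (2): softmax of v'_wo . v_wi over all N words."""
    scores = V_out @ V_in[wi]           # one dot product per vocabulary entry
    scores -= scores.max()              # shift for numerical stability
    expd = np.exp(scores)
    return expd[wo] / expd.sum()        # O(N) normalisation per prediction
```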
4.3 Proposed system

The workflow of the proposed system is shown in Fig. 2. The input to the TANE algorithm is a sequence of consecutive network snapshots, and the algorithm generates node embeddings from each snapshot by capturing the temporal dynamics of the evolving network. These embeddings are fed to a time series construction phase which generates a univariate time series using the node embedding similarity between each disconnected node pair. The time series thus constructed is passed to the ARIMA model to predict the future similarity scores between disconnected vertices, which are then used to predict the future links.
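The time series construction step can be sketched as follows; this is our illustration, and cosine similarity is an assumption, since the text does not fix the similarity function here. Each disconnected node pair yields one univariate series over the snapshots.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def similarity_series(embeddings, u, v):
    """embeddings: one {node: vector} dict per snapshot (from TANE).
    Returns the univariate series s_1, ..., s_{t-1} for the pair (u, v)."""
    return [cosine(E[u], E[v]) for E in embeddings if u in E and v in E]

# One such series is built per disconnected pair; each series is then fed
# to an ARIMA model to forecast the pair's similarity in snapshot t.
```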
4.3.1 TANE algorithm

In the case of dynamic networks, the network structure may vary across snapshots, and the basic skip-gram architecture cannot capture these changes. Even if we train the skip-gram model for one snapshot, for the next snapshot we need to retrain the entire model, even if there is only a small change in connectivity between the two consecutive snapshots. This is a tedious task when we have a large number of snapshots. Moreover, training the model independently on different snapshots may also lead to sub-optimal results, because the embeddings generated for the same node with similar connectivity patterns across different snapshots may differ, as they are not aligned to the same vector space. Our proposed architecture (TANE) is designed such that the neural network can be trained snapshot by snapshot,
by preserving the weights learned at each snapshot. At each time step, TANE expands the neural architecture of skip-gram while preserving the input-to-hidden and hidden-to-output layer weights of the previous time step, thereby generating time-aware network embeddings. The procedure is described in Algorithm 1. The algorithm takes as input n network snapshots G = {G1, G2, ..., Gn}, the context window size w, the embedding dimension d, the number of walks per vertex 𝛾 and the walk length t. The algorithm generates a v × d embedding matrix as output, where v is the total number of vertices across all network snapshots. The procedure involves a two-step process: a random walk on the network snapshots to incrementally build the vertex vocabulary, and a skip-gram objective optimization which incrementally learns the network embeddings from the snapshots. Vocabulary building starts from the first snapshot, where the embedding and context matrices are initialized to zero. The skip-gram objective is optimized by training a three-layer neural network using stochastic gradient descent (SGD) (Bottou 2010). For successive snapshots, the skip-gram architecture is expanded by initializing the embedding and context vectors from the corresponding matrices of the previous snapshot and by initializing the vectors of newly observed vertices to zero. Further, the vocabulary is built incrementally by adding the newly observed vertices and by maintaining the occurrence count of each vertex. At each snapshot, the skip-gram objective is optimized to ensure that the embeddings generated over successive snapshots capture the temporal dynamics of the evolving network.
Algorithm 1 TANE

 6: else
 7:     Load Voc, Ev(Si−1), Cv(Si−1)
 8:     for u ∈ walks(Si) do
 9:         if u ∈ Voc then
10:             Eu(Si) = Eu(Si−1)
11:             Cu(Si) = Cu(Si−1)
12:         else
13:             Eu(Si) = Cu(Si) = 0
14:     Voc = BuildVocabulary(walks(Si))
15:     SkipGram(Ev, walks, w)
16:     Save Voc, Ev(Si), Cv(Si)

 1: procedure BuildVocabulary(walks)
 2:     for u ∈ walks do
 3:         if u ∈ Voc then
 4:             count(u)++
 5:         else
 6:             addvocab(Voc, u)
 7:             size(Voc)++

 1: procedure SkipGram(Ev, walks, w)
 2:     for vk ∈ walks do
 3:         for v ∈ walks(k − w : k + w) do
 4:             L(E) = − log P(v | E(vk))
 5:             stochasticgradientdescent(L(E))
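The authors modified the C implementation of word2vec (Sect. 5). As a rough functional sketch under our own assumptions, gensim's Word2Vec can imitate this snapshot-by-snapshot scheme, since build_vocab(..., update=True) grows the vocabulary while keeping the previously learned embedding and context weights and initializing only newly seen vertices; this is our approximation, not the authors' implementation.

```python
from gensim.models import Word2Vec

def tane_like(snapshot_walks, d=128, w=10, epochs=5):
    """snapshot_walks: one list of random walks per snapshot; each walk is
    a list of node ids (strings). Returns a {node: vector} dict per snapshot."""
    model, per_snapshot = None, []
    for walks in snapshot_walks:
        if model is None:                          # first snapshot: fresh model
            model = Word2Vec(walks, vector_size=d, window=w,
                             sg=1, min_count=0, epochs=epochs)
        else:                                      # later snapshots: expand
            model.build_vocab(walks, update=True)  # add newly observed vertices
            model.train(walks, total_examples=len(walks), epochs=epochs)
        per_snapshot.append({n: model.wv[n].copy()
                             for n in model.wv.index_to_key})
    return per_snapshot
```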
Figure 3 shows the skip-gram architecture for TANE corresponding to two consecutive network snapshots Si−1 and Si at times i − 1 and i respectively. For those vertices in Si which are already present in the previous snapshot Si−1, the weight vectors are loaded from the matrices of snapshot Si−1, and the vectors of newly seen vertices are initialized to zero. The objective is optimized for each snapshot while preserving the embedding generated at each time step. This approximates the optimization of a global objective function which would learn the parameters of the network embedding across all snapshots, and thereby generates stable, time-aware network embeddings.

5 Experiments

We conduct experiments with four real-world networks to evaluate the quality of the embeddings generated by the TANE algorithm. MAP and AUC are the measures used for evaluation, and the results are compared with those of the baseline methods. All experiments were conducted on a machine with the Ubuntu 16.04 operating system, 16 GB RAM and a hexa-core processor with 3.2 GHz speed. We modified the C implementation of Google's word2vec to develop the TANE algorithm and used Python packages including networkx and scikit-learn for graph processing, time series prediction and link prediction evaluation.
emails. The dataset covers a period from January 2010 to October 2010 and consists of 167 vertices and 82,927 edges. A summary of the datasets used is shown in Table 1.

Table 1 Statistics of the datasets used

Dataset     # of nodes   # of edges   # of timestamps
Enron       150          2609         26
Haggle      274          28,244       8
Hep-ph      28,093       4,596,803    10
Radoslaw    167          82,927       10

Area under curve (AUC): AUC is a widely used evaluation metric for link prediction. Given the ranking of all non-observed links, this metric can be interpreted as the probability that a randomly chosen missing link is given a higher score than a randomly chosen non-existent link. The value of this metric is bounded between 0 and 1, and a higher value implies a better model. Among n independent comparisons, if the missing link has the higher score n′ times and the two scores are equal n′′ times, the AUC score (Liu et al. 2011; Lü and Zhou 2011) is calculated as:

$$AUC = \frac{n' + 0.5\,n''}{n} \tag{4}$$

Mean average precision (MAP): This metric is an extension of average precision (AP), where the average of all APs is calculated to get the MAP score. It estimates the precision of every node and computes the average over all nodes. It is calculated as:

$$AP(i) = \frac{\sum_j \mathrm{Precision@}j(i) \cdot \Delta_i(j)}{|\{\Delta_i(j) = 1\}|}, \qquad MAP = \frac{\sum_{i \in Q} AP(i)}{|Q|} \tag{5}$$

The performance of the proposed system is evaluated with the network at a future time, as well as by taking only the new links that occur at the future time. A link (u, v) is said to be a new link if it is not present in the last snapshot.
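Equation (4) can be estimated by sampling score pairs, as in the short sketch below (our illustration, not the authors' evaluation code):

```python
import random

def auc_score(missing_scores, nonexistent_scores, n=10000):
    """AUC by n independent comparisons (Eq. 4): (n' + 0.5 n'') / n."""
    wins = ties = 0
    for _ in range(n):
        m = random.choice(missing_scores)      # score of a missing link
        f = random.choice(nonexistent_scores)  # score of a non-existent link
        if m > f:
            wins += 1
        elif m == f:
            ties += 1
    return (wins + 0.5 * ties) / n
```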
We conducted link prediction experiments with time-aware embeddings (TANE) independently and also by combining time-aware embeddings with time series forecasting (TANE-TS). To test link prediction performance with static embedding methods, we hide 10–15% of the links to form the training set, generate node embeddings from the training set, use the Hadamard product of the node embeddings to form the edge embeddings, and build a classifier based on positive and negative edges. The hidden edges are used to test the accuracy of the classifier. To test link prediction performance with dynamic networks, we hide 10–15% of the links of each network snapshot from time 1 to t − 1, generate embeddings and predict the links at time t. To test link prediction performance with TANE-TS, we build the time series using the similarity of the dynamic embeddings from snapshot 1 to t − 1 and forecast the predicted scores for possible links in snapshot t. As TANE maps networks to a low dimensional space, the dimensionality of the nodes is an important factor which affects the performance of the link prediction task. We conducted experiments using different values of node dimensionality for each dataset and found 128 to be the optimum value, which is used for the performance comparison. Below, we present the baseline methods along with the analysis and comparison of the results obtained from the link prediction experiments on the four real-world networks.

[Fig. 4: bar chart of AUC for Node2vec and TANE-TS on the four datasets]
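The static-embedding protocol above can be sketched as follows (our illustration; the text does not name the classifier, so logistic regression is an assumption):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def edge_features(emb, pairs):
    """Hadamard product of the two endpoint embeddings for each node pair."""
    return np.array([emb[u] * emb[v] for u, v in pairs])

def train_link_classifier(emb, pos_pairs, neg_pairs):
    X = np.vstack([edge_features(emb, pos_pairs),
                   edge_features(emb, neg_pairs)])
    y = np.array([1] * len(pos_pairs) + [0] * len(neg_pairs))
    return LogisticRegression(max_iter=1000).fit(X, y)

# The hidden edges are then scored with
# clf.predict_proba(edge_features(emb, hidden_pairs))[:, 1].
```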
6.1 Baseline methods

We selected baselines from different categories of works. These include link prediction using static embedding and dynamic network embedding methods, and link prediction using node similarity based time series forecasting methods. A short introduction to each baseline is given below.

Node2vec (Grover and Leskovec 2016): Node2vec performs random walks on a static network to generate node sequences and uses skip-gram with negative sampling to generate node representations. Unlike DeepWalk, Node2vec performs a biased random walk which provides more
flexibility in exploring node neighborhoods. The learned representations can be used for link prediction using vector based similarity measures.

SDNE (Wang et al. 2016): SDNE is designed to preserve non-linear connectivity patterns in networks while generating embeddings. It attempts to capture the first order and second order proximity between nodes and deploys a deep belief network for generating representations.

DynGEM (Goyal et al. 2018): This is an extension of SDNE to dynamic networks. It dynamically expands the deep neural architecture and learns stable embeddings from a series of network snapshots by preserving the embeddings generated at each time step. The embeddings generated from the complete training are used for link prediction.

Time series of node similarities (TS-Sim) (Güneş et al. 2016): This work first computes node similarities using common neighbors, Adamic-Adar and the Jaccard coefficient. A time series is constructed using the node similarities, and the similarity of future links is predicted using the ARIMA model.

[Fig. 5: AUC comparison of TANE-TS with time series prediction methods on Enron, Haggle, Hep-ph and Radoslaw]

Table 4 MAP comparison of the proposed system with time series prediction methods

Method     Enron   Haggle   Hep-ph   Radoslaw
TS-CN      0.049   0.121    0.061    0.057
TS-AA      0.073   0.194    0.065    0.064
TANE-TS    0.095   0.245    0.078    0.075

6.2 Performance

The experiments conducted to evaluate the performance of the proposed system are threefold. We first present the performance improvement of TANE-TS over static network embedding methods, followed by dynamic embedding methods. Further, we compare the system with time series prediction based approaches. Links with the top 20% of predicted scores are considered as predicted links. We conduct experiments both by considering only the new links in the final snapshot and by considering all links in the final snapshot. A link (u, v) is said to be a new link in the current snapshot if it is not present in the previous snapshot.

Figure 4 compares the AUC scores of TANE-TS against the static network embedding methods (Node2vec and SDNE) on the four datasets. The results show that the proposed method gives an AUC performance improvement of 8.1%, 10.3%, 14.4% and 18.1% over the baseline method (SDNE) on Enron, Haggle, Hep-ph, and Radoslaw respectively. A similar improvement can also be observed in the MAP scores for the given datasets, which is presented in Table 2. As the link prediction problem is directly related to the evolution of the networks, we can observe that embedding the network by considering its evolution (the proposed system) has a clear advantage over static network embedding methods.

Table 3 presents the AUC and MAP comparison of TANE-TS with the dynamic network embedding method (DynGEM). The results show that the TANE algorithm, when combined with time series forecasting, can achieve a performance improvement of 1.3%, 2.2%, 1.4% and 2.7% on AUC and 10.4%, 21.8%, 23.8% and 7.1% on MAP compared to DynGEM w.r.t. Enron, Haggle, Hep-ph and Radoslaw respectively. In DynGEM, the final embeddings generated after training
the deep model with all network snapshots are used for link prediction. In the proposed system, we are trying to learn

[Fig. 6: AUC for different node dimensions (16, 32, 64, 128) on Enron, Haggle, Hep-ph and Radoslaw]

[Fig. 7: MAP for different node dimensions (16, 32, 64, 128) on Enron, Haggle, Hep-ph and Radoslaw]
6.3 Parameter sensitivity

The AUC and MAP scores obtained for the final snapshot for different dimensions on the different datasets are shown in Figs. 6 and 7 respectively. The results show that each vertex, when represented as a higher dimensional vector (either 64 or 128), provides better AUC and MAP scores compared to low dimensional vector representations (16 and 32). The AUC scores for the Enron, Hep-ph and Radoslaw datasets remain relatively close to 0.80, 0.79 and 0.78 respectively when d = 64 and d = 128. For the Haggle dataset, the AUC and MAP scores show some variation over the various node dimensions and reach an AUC of 0.85 and a MAP of 0.245 when nodes are represented as 128-dimensional vectors.

The walk length of the random walk is one among the parameters that affect the quality of the generated embeddings.
The effect of the number of snapshots used for training is shown in Fig. 9. It was found that for Enron and Hep-ph, the AUC increases up to 14 and 8 snapshots respectively and then reaches an almost steady state. As the number of snapshots becomes large, the networks become dense, which may lead to the saturation of the AUC value. For Haggle and Radoslaw, optimum AUC values of 0.85 and 0.78 respectively are obtained when the embeddings are generated by training on 8 and 10 snapshots respectively.

The dynamics of real-world networks may range from small time scales (fine-grained dynamics) to long time periods (coarse-grained dynamics). We conducted experiments on the Enron dataset by generating snapshots at various levels of granularity, and the results are shown in Table 6. The results show that the system gives better prediction accuracy when small time scales are used for generating the network representations.

Finally, we present the influence of the ARIMA model parameters on the performance of the proposed system. The parameters we consider here are the number of past time periods (p), the number of non-seasonal difference operations (d), and the number of lagged forecast errors (q). The AUC values for different combinations of (p, d, q) values for Hep-ph and Haggle are shown in Fig. 10a and b respectively. For Hep-ph, 2 or 3 past values are required to build a robust model, whereas for the Haggle dataset the system can achieve good performance with 1 or 2 past values. We may conclude that a more powerful time series model is required to get accurate predictions when we deal with more complex datasets like Hep-ph.

[Fig. 10: AUC for ARIMA parameter combinations (p, d, q) in {(1,0,1), (1,1,2), (1,2,2), (2,0,3), (2,2,3), (3,1,3), (3,2,3)}, for (a) Hep-ph and (b) Haggle]
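As a sketch of the forecasting step (ours, not the authors' code; the order (2, 0, 3) is just one of the combinations examined above), statsmodels can fit an ARIMA(p, d, q) model per similarity series and forecast the next value:

```python
from statsmodels.tsa.arima.model import ARIMA

def forecast_next(series, order=(2, 0, 3)):
    """Fit ARIMA(p, d, q) to one node pair's similarity series and
    predict its similarity score for the next snapshot."""
    fitted = ARIMA(series, order=order).fit()
    return float(fitted.forecast(steps=1)[0])

# Pairs whose forecasts fall in the top 20% of predicted scores are
# reported as predicted links (Sect. 6.2).
```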
7 Conclusion and future works

Predicting the future links of a network by considering its evolving nature is an exciting direction of research in network mining. As network embedding has emerged as an important technique to improve the performance of many network mining tasks, we investigate the effect of network embedding on link prediction in dynamic networks. We propose a method which combines time-aware network embedding and time series forecasting to perform link prediction on dynamic networks. The method uses a modified skip-gram architecture to generate time-aware node embeddings, constructs a time series of node embedding similarities and uses the ARIMA model to predict the similarity scores of future links. We observe that the dynamic network embedding can well capture the temporal dynamics of the network, and that the ARIMA model can predict future links with good accuracy. The benefits of the proposed system are justified by conducting experiments with various real-world network datasets.

The current study can provide various directions for future work. Network topology will not change much between two consecutive snapshots, and implementing better transfer learning techniques for neural network training can improve the scalability of the proposed system. The evolution of a network can be modelled as a temporal graph with time-stamped edges rather than a series of snapshots, and embedding such networks can further address the scalability issues. Combining node embedding similarity features with neighbourhood based similarity features to form a multivariate time series may further improve the prediction accuracy. System performance may be further improved by using advanced neural models like long short-term memory (LSTM) or gated recurrent units (GRU) for time series forecasting. In future, we also plan to perform link prediction in dynamic networks using graph neural network models like graph convolutional networks and graph attention networks.

Funding Not applicable.

Compliance with ethical standards

Conflict of interest The authors declare that there are no known conflicts of interest associated with this work and there has been no significant financial support or funding for this work that could have influenced its outcome.

References

Ahmed NM, Chen L, Wang Y, Li B, Li Y, Liu W (2018) DeepEye: link prediction in dynamic networks based on non-negative matrix factorization. Big Data Min Anal 1(1):19–33
Al Hasan M, Zaki MJ (2011) A survey of link prediction in social networks. In: Social network data analytics, Springer, pp 243–275
Belkin M, Niyogi P (2002) Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv Neural Inf Process Syst 14:585–591
Blonder B, Wey TW, Dornhaus A, James R, Sih A (2012) Temporal dynamics and network analysis. Methods Ecol Evol 3(6):958–972
Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: Proceedings of COMPSTAT'2010, Springer, pp 177–186
Brockwell PJ, Davis RA (2016) Introduction to time series and forecasting. Springer, Berlin
Cai H, Zheng VW, Chang K (2018) A comprehensive survey of graph embedding: problems, techniques and applications. IEEE Trans Knowl Data Eng 30(9):1616–1637
Cao S, Lu W, Xu Q (2015) GraRep: learning graph representations with global structural information. In: Proceedings of the 24th ACM international conference on information and knowledge management, pp 891–900
Casteigts A, Flocchini P, Quattrociocchi W, Santoro N (2012) Time-varying graphs and dynamic networks. Int J Parallel Emerg Distrib Syst 27(5):387–408
Chaintreau A, Hui P, Crowcroft J, Diot C, Gass R, Scott J (2007) Impact of human mobility on opportunistic forwarding algorithms. IEEE Trans Mobile Comput 6:606–620
Chang S, Han W, Tang J, Qi G-J, Aggarwal CC, Huang TS (2015) Heterogeneous network embedding via deep architectures. In:
Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining, pp 119–128
Chung F, Zhao W (2010) PageRank and random walks on graphs. In: Fete of combinatorics and computer science, Springer, pp 43–62
Dong Y, Chawla NV, Swami A (2017) metapath2vec: scalable representation learning for heterogeneous networks. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 135–144
Fu T-y, Lee W-C, Lei Z (2017) HIN2Vec: explore meta-paths in heterogeneous information networks for representation learning. In: Proceedings of the 2017 ACM conference on information and knowledge management, pp 1797–1806
Gehrke J, Ginsparg P, Kleinberg J (2003) Overview of the 2003 KDD cup. ACM SIGKDD Explor Newslett 5(2):149–151
Goldberg Y, Levy O (2014) word2vec explained: deriving Mikolov et al.'s negative-sampling word-embedding method. arXiv preprint arXiv:1402.3722
Goyal P, Ferrara E (2017) Graph embedding techniques, applications, and performance: a survey. Knowl-Based Syst 151:78–94
Goyal P, Kamra N, He X, Liu Y (2018) DynGEM: deep embedding method for dynamic graphs. arXiv preprint arXiv:1805.11273
Grover A, Leskovec J (2016) node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 855–864
Güneş İ, Gündüz-Öğüdücü Ş, Çataltepe Z (2016) Link prediction using time series of neighborhood-based node similarity scores. Data Min Knowl Discov 30(1):147–180
Gupta C, Jain A, Tayal DK, Castillo O (2018) ClusFuDE: forecasting low dimensional numerical data using an improved method based on automatic clustering, fuzzy relationships and differential evolution. Eng Appl Artif Intell 71:175–189
Jo H-H, Hiraoka T (2019) Bursty time series analysis for temporal networks. In: Temporal network theory, Springer, pp 161–179
Klimt B, Yang Y (2004) The Enron corpus: a new dataset for email classification research. In: European conference on machine learning, Springer, pp 217–226
Li T, Zhang J, Yu PS, Zhang Y, Yan Y (2018) Deep dynamic network embedding for link prediction. IEEE Access 6:29219–29230
Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. J Am Soc Inf Sci Technol 58(7):1019–1031
Liu Z, Zhang Q-M, Lü L, Zhou T (2011) Link prediction in complex networks: a local naïve Bayes model. EPL Europhys Lett 96(4):48007
Lü L, Zhou T (2011) Link prediction in complex networks: a survey. Phys A Stat Mech Appl 390(6):1150–1170
Ma X, Sun P, Qin G (2017) Nonnegative matrix factorization algorithms for link prediction in temporal networks using graph communicability. Pattern Recognit 71:361–374
Martínez V, Berzal F, Cubero J-C (2017) A survey of link prediction in complex networks. ACM Comput Surv 49(4):69
Michalski R, Palus S, Kazienko P (2011) Matching organizational structure and social network extracted from email communication. In: International conference on business information systems, Springer, pp 197–206
Mikolov T, Chen K, Corrado G, Dean J (2013a) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781
Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013b) Distributed representations of words and phrases and their compositionality. Adv Neural Inf Process Syst 26:3111–3119
Morin F, Bengio Y (2005) Hierarchical probabilistic neural network language model. In: AISTATS, vol 5, pp 246–252
Ou M, Cui P, Pei J, Zhang Z, Zhu W (2016) Asymmetric transitivity preserving graph embedding. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1105–1114
Özcan A, Öğüdücü ŞG (2016) Temporal link prediction using time series of quasi-local node similarity measures. In: 2016 15th IEEE international conference on machine learning and applications (ICMLA), pp 381–386
Özcan A, Öğüdücü ŞG (2017) Supervised temporal link prediction using time series of similarity measures. In: 2017 Ninth international conference on ubiquitous and future networks (ICUFN), pp 519–521
Pan S, Jia W, Zhu X, Zhang C, Wang Y (2016) Tri-party deep network representation. Network 11(9):12
Perozzi B, Al-Rfou R, Skiena S (2014) DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 701–710
Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: International symposium on computer and information sciences, Springer, pp 284–293
Ribeiro LFR, Saverese PHP, Figueiredo DR (2017) struc2vec: learning node representations from structural identity. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 385–394
Rossi RA, Zhou R, Ahmed NK (2017) Deep feature learning for graphs. arXiv preprint arXiv:1704.08829
Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
Soto J, Melin P, Castillo O (2018) A new approach for time series prediction using ensembles of IT2FNN models with optimization of fuzzy integrators. Int J Fuzzy Syst 20(3):701–728
Soto J, Castillo O, Melin P, Pedrycz W (2019) A new approach to multiple time series prediction using MIMO fuzzy aggregation models with modular neural networks. Int J Fuzzy Syst 21(5):1629–1648
Spielman DA, Teng S-H (2004) Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In: Proceedings of the thirty-sixth annual ACM symposium on theory of computing, pp 81–90
Wang D, Cui P, Zhu W (2016) Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1225–1234
Wu T, Chang C-S, Liao W (2018) Tracking network evolution and their applications in structural network analysis. IEEE Trans Netw Sci Eng 6(3):562–575
Yang C, Liu Z, Zhao D, Sun M, Chang EY (2015) Network representation learning with rich text information. In: IJCAI, pp 2111–2117
Yasami Y, Safaei F (2018) A novel multilayer model for missing link prediction and future link forecasting in dynamic complex networks. Phys A Stat Mech Appl 492:2166–2197
Zhou L, Yang Y, Ren X, Wu F, Zhuang Y (2018) Dynamic network embedding by modeling triadic closure process. In: Proceedings of the 32nd AAAI conference on artificial intelligence, pp 571–578
Zhu L, Guo D, Yin J, Steeg GV, Galstyan A (2016) Scalable temporal latent space inference for link prediction in dynamic social networks. IEEE Trans Knowl Data Eng 28(10):2765–2777
Zou Y, Donner RV, Marwan N, Donges JF, Kurths J (2019) Complex network approaches to nonlinear time series analysis. Phys Rep 787:1–97

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.