
UIUC-CS514 “Advanced Topics in Network Science” (Fall 2024)

Midterm Exam

(Friday, Oct. 25, 2024, 100 marks)

IMPORTANT Notes
• Please write your solutions directly under each problem, with brief explanations where
necessary.

• The exam starts at 9:30 am and ends at 10:45 am. Feel free to skip some ‘hard’ problems and
distribute your time wisely.

• Please have your i-card or ID ready when we collect your papers.

• Feel free to use the last blank page as scratch paper.

Name: __________    NetID: __________    Score: __________

Problem:    1    2    3    4    5    6    Total

Problem 1. (22 pts) Random Walk with Restart. Consider the undirected graph shown in
Figure 1. We want to apply the power-iteration (“OnTheFly”) method of random walk with restart
to calculate proximity. Power iteration can be written as r ← (1 − c) · W̃r + c · e, where c ∈ (0, 1)
is the restart probability and e is the starting vector.
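The update above can be sketched directly in numpy. This is a minimal illustration, not part of the exam: the toy 3-node path graph, the column normalization, and the restart node are all assumptions chosen only to make the iteration concrete.

```python
import numpy as np

def rwr(W, e, c=0.15, tol=1e-10, max_iter=1000):
    """Power iteration for random walk with restart:
    r <- (1 - c) * W r + c * e, iterated until the change is tiny."""
    r = e.copy()
    for _ in range(max_iter):
        r_new = (1 - c) * W @ r + c * e
        if np.linalg.norm(r_new - r, 1) < tol:
            return r_new
        r = r_new
    return r

# Hypothetical 3-node path graph; column-normalized adjacency is one
# possible choice of W-tilde (the exam's depends on the normalization).
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
W = A / A.sum(axis=0)          # column-stochastic normalization
e = np.array([1., 0., 0.])     # restart vector concentrated on node 0
r = rwr(W, e)                  # proximity of every node to node 0
```

With a column-stochastic W and e summing to one, the converged ranking vector also sums to one, and the restart node scores higher than nodes farther away.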

Figure 1: An undirected graph

1. (9 pts) What is its adjacency matrix? What is its degree matrix? What is its normalized
adjacency matrix with symmetric normalization?

2. (5 pts) If the random walk probability from one node to another is evenly split based on the
degree of the source node, how should we appropriately normalize the adjacency matrix to
describe such a case? Show the normalized adjacency matrix.

3. (3 pts) If we want to measure the proximity between node 1 and other nodes, how should we
initialize the ranking vector r?

4. (5 pts) If we set the restart probability to 0 (c = 0), is convergence of power iteration
guaranteed for any row-normalized adjacency matrix? If so, prove the convergence. If not,
give a row-normalized adjacency matrix and an initialization of r for which power iteration
with c = 0 does not converge.
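As a quick numerical experiment (a sketch, not the graded answer): with c = 0 the update reduces to pure power iteration on W̃, and on a 2-node cycle, whose row-normalized adjacency is already a permutation matrix, an initialization concentrated on one node oscillates forever.

```python
import numpy as np

# Row-normalized adjacency of a 2-node cycle (already row-stochastic).
W = np.array([[0., 1.],
              [1., 0.]])
r = np.array([1., 0.])     # initialization concentrated on node 0

history = [r.copy()]
for _ in range(4):
    r = W @ r              # c = 0: pure power iteration, no restart
    history.append(r.copy())
# The iterates alternate between [1, 0] and [0, 1] with period 2,
# so the sequence never converges.
```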

Problem 2. (18 pts) Matrix Low-Rank Factorization.

1. (2 pts) Among SVD, CMD, and Colibri-S, which one gives the best approximation in terms
of squared error?

2. (6 pts) Given the following matrix

       A = | 1  2  2  0 |
           | 2  4  4  0 |
           | 2  4  4  0 |
           | 0  0  0  1 |,

   what is the singular value decomposition of A?
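A hand-computed SVD of this matrix can be checked numerically; the sketch below uses `numpy.linalg.svd`. Since the upper-left 3×3 block is the rank-one outer product of [1, 2, 2] with itself, the nonzero singular values come out to 9 and 1.

```python
import numpy as np

A = np.array([[1., 2., 2., 0.],
              [2., 4., 4., 0.],
              [2., 4., 4., 0.],
              [0., 0., 0., 1.]])

U, s, Vt = np.linalg.svd(A)
# s holds the singular values in descending order; A has rank 2,
# so only the first two entries are (numerically) nonzero.
A_rec = (U * s) @ Vt      # reconstruct A from its factors
```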

3. (10 pts) For the same matrix in Problem 2.2, if we apply Colibri-S instead of SVD with
sampled indices I = {0, 1, 2} and threshold ϵ = 0.5, what are the matrices L, M, and R such
that A ≈ LMR is the low-rank decomposition of A? What is the squared reconstruction
error?

Problem 3. (10 pts) Tensor Tools. Given the following tensor A.

1. (3 pts) What is the mode-3 matricization of A?

2. (3 pts) What is the vectorization of A?

3. (4 pts) Given a vector x = [1, 1], what is A ×3 x?
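The exam's tensor A is not reproduced in this copy, so the sketch below uses a hypothetical 2×2×2 tensor T purely to illustrate each of the three operations; note that the column ordering of a mode-n matricization depends on the convention used (one common Kolda–Bader-style choice is shown).

```python
import numpy as np

# Hypothetical 2x2x2 tensor standing in for the exam's tensor A.
T = np.arange(8, dtype=float).reshape(2, 2, 2)

# Mode-3 matricization: the mode-3 fibers become columns.
T3 = np.moveaxis(T, 2, 0).reshape(T.shape[2], -1)

# Vectorization: stack all entries into one long vector
# (column-major / Fortran order here).
vec_T = T.flatten(order='F')

# Mode-3 product with x = [1, 1]: contract the third mode. For this
# particular x it simply sums the two frontal slices.
x = np.array([1., 1.])
prod = np.tensordot(T, x, axes=([2], [0]))
```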

Problem 4. (16 pts) Large-scale Information Network Embedding (LINE).
1. (6 pts) LINE defines the objective function of first-order proximity as

       O1 = − Σ_{(i,j)∈E} w_ij · log p1(i, j),

       p1(i, j) = 1 / (1 + exp(−u_i · u_j)),

   where u_i and u_j are the embeddings of nodes i and j, respectively. What is the trivial
   solution for the 1-dimensional embeddings u that minimizes O1? If we want to avoid this
   trivial solution, what technique can we apply (give one)?
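To see the degeneracy numerically: since p1(i, j) → 1 as u_i · u_j → ∞, pushing every embedding to one shared large value drives O1 toward 0. The sketch below evaluates O1 on a hypothetical unit-weight triangle (an assumption; the exam gives no specific graph here).

```python
import numpy as np

def O1(u, edges, w):
    """First-order LINE objective for 1-d embeddings u (a sketch)."""
    total = 0.0
    for (i, j), wij in zip(edges, w):
        p1 = 1.0 / (1.0 + np.exp(-u[i] * u[j]))
        total -= wij * np.log(p1)
    return total

# Hypothetical toy graph: a triangle with unit edge weights.
edges = [(0, 1), (1, 2), (0, 2)]
w = [1.0, 1.0, 1.0]

# Setting every embedding to the same growing constant drives O1 -> 0:
# the trivial solution that techniques such as negative sampling avoid.
vals = [O1(np.full(3, t), edges, w) for t in (1.0, 5.0, 50.0)]
```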

2. (10 pts) Given the directed graph shown in Figure 2, suppose we adopt the model distribution
   of second-order proximity

       p2(v_j | v_i) = exp(u′_j · u_i) / Σ_{k=1}^{|V|} exp(u′_k · u_i),

   with 1-dimensional context embeddings [0, 1, 1, 0]. For the following 1-dimensional node
   embeddings, (1) [0, 0, 0, 0], (2) [1, 1, 1, 1], (3) [0, 0, 0, 1], (4) [1, 0, 0, 1], rank them from
   lowest to highest in terms of the value of O2 = − Σ_{(i,j)∈E} log2 p2(v_j | v_i).

Figure 2: A directed graph
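The ranking can be computed mechanically once the edge list is known. Figure 2's edges are not reproduced in this copy, so the directed 4-cycle below is a hypothetical stand-in used only to show the computation.

```python
import numpy as np

def O2(u, u_ctx, edges):
    """Second-order LINE objective with base-2 log (a sketch)."""
    total = 0.0
    for i, j in edges:
        logits = u_ctx * u[i]                 # u'_k . u_i for all k (1-d case)
        p = np.exp(logits[j]) / np.exp(logits).sum()
        total -= np.log2(p)
    return total

u_ctx = np.array([0., 1., 1., 0.])            # given context embeddings
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]      # hypothetical edge list

candidates = {
    "(1)": np.array([0., 0., 0., 0.]),
    "(2)": np.array([1., 1., 1., 1.]),
    "(3)": np.array([0., 0., 0., 1.]),
    "(4)": np.array([1., 0., 0., 1.]),
}
scores = {k: O2(u, u_ctx, edges) for k, u in candidates.items()}
ranking = sorted(scores, key=scores.get)      # lowest O2 first
```

One sanity check: the all-zero embedding makes every softmax uniform (p = 1/4), so over four edges O2 = 4 · log2 4 = 8 regardless of the context embeddings.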

Problem 5. (16 pts) Graph Convolutional Network (GCN). Consider the undirected graph
shown in Figure 3.

Figure 3: An undirected graph

1. (6 pts) What is its normalized adjacency matrix for GCN propagation?

2. (10 pts) The given graph has one-dimensional node features X = [0, 1, 0, 1]. A one-layer
   GCN with no activation is defined as

       Z = D̃^(−1/2) Ã D̃^(−1/2) X θ,

   where Z is the output. For node labels Y = [0, 1, 0, 1], what is the optimal θ that minimizes
   the mean squared error?
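Because θ is a single scalar here, the MSE is a quadratic in θ and the minimizer has the closed form θ* = (H · Y) / (H · H), where H is the propagated feature vector. Figure 3's graph is not reproduced in this copy, so the 4-node path graph below is a hypothetical stand-in used only to show the computation.

```python
import numpy as np

# Hypothetical 4-node path graph standing in for Figure 3.
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
X = np.array([0., 1., 0., 1.])
Y = np.array([0., 1., 0., 1.])

A_t = A + np.eye(4)                      # A-tilde: adjacency with self-loops
d = A_t.sum(axis=1)
D_inv_sqrt = np.diag(d ** -0.5)
H = D_inv_sqrt @ A_t @ D_inv_sqrt @ X    # propagated features (theta factored out)

# Closed-form 1-d least squares: theta* = (H . Y) / (H . H).
theta = (H @ Y) / (H @ H)
Z = H * theta                            # GCN output at the optimum
```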

Problem 6. (18 pts) OddBall. According to the paper Akoglu, Leman, Mary McGlohon, and
Christos Faloutsos. “Oddball: Spotting anomalies in weighted graphs.” PAKDD 2010, answer the
following questions.

1. (3 pts) For an undirected, weighted graph with the following adjacency matrix, how many
nodes and edges are there respectively in the ego-net of node 1 (index starts from 0)?
       | 0  0  1  2 |
       | 0  0  1  2 |
       | 1  1  0  2 |
       | 2  2  2  0 |
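The ego-net of a node is the subgraph induced by the node and its neighbors. Counting its nodes and (undirected) edges from the adjacency matrix above can be sketched as:

```python
import numpy as np

A = np.array([[0, 0, 1, 2],
              [0, 0, 1, 2],
              [1, 1, 0, 2],
              [2, 2, 2, 0]])

ego = 1
members = [ego] + [j for j in range(4) if A[ego, j] > 0]  # node + its neighbors
sub = A[np.ix_(members, members)]                          # induced ego-net
n_nodes = len(members)
n_edges = int(np.count_nonzero(np.triu(sub)))              # each undirected edge once
```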

2. (10 pts) For a node’s undirected, unweighted ego-net, assume there are no connections among
the node’s neighbors. What anomaly type is this? Find the relationship between λ and
N for this ego-net, where λ is the principal eigenvalue of the weighted adjacency matrix of the
ego-net and N is the number of neighbors of the node.
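The λ–N relationship can be checked numerically (a sketch, not the graded derivation): an ego-net with no neighbor-neighbor edges is a star with one center and N leaves, and its principal eigenvalue can be computed for several N.

```python
import numpy as np

def star_lambda(N):
    """Principal eigenvalue of the adjacency matrix of a star:
    one center connected to N leaves, no leaf-leaf edges."""
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    return max(abs(np.linalg.eigvalsh(A)))

# For each N the principal eigenvalue matches sqrt(N).
lams = {N: star_lambda(N) for N in (2, 4, 9, 16)}
```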

3. (5 pts) What is the anomaly type of the graph with the following adjacency matrix (the center
node’s index is 0)? Explain why.
       |    0    1  1000    1 |
       |    1    0     1    0 |
       | 1000    1     0    1 |
       |    1    0     1    0 |
