
Community-Affiliation Graph Model For Overlapping Network Community Detection

The document presents a study of overlapping communities in networks. Traditional community detection methods assume that community overlaps are less densely connected than the communities themselves, but the study finds that overlaps are actually more densely connected. It then proposes the Community-Affiliation Graph Model, a community detection method that detects overlapping, nested, and non-overlapping communities more accurately.


2012 IEEE 12th International Conference on Data Mining

Community-Affiliation Graph Model for Overlapping Network Community Detection


Jaewon Yang, Stanford University, [email protected]
Jure Leskovec, Stanford University, [email protected]

Abstract: One of the main organizing principles in real-world networks is that of network communities, where sets of nodes organize into densely linked clusters. Communities in networks often overlap as nodes can belong to multiple communities at once. Identifying such overlapping communities is crucial for understanding the structure as well as the function of real-world networks. Even though community structure in networks has been widely studied in the past, practically all research makes an implicit assumption that overlaps between communities are less densely connected than the non-overlapping parts themselves. Here we validate this assumption on 6 large-scale social, collaboration and information networks where nodes explicitly state their community memberships. By examining such ground-truth communities we find that the community overlaps are more densely connected than the non-overlapping parts, which is in sharp contrast to the conventional wisdom that community overlaps are more sparsely connected than the communities themselves. Practically all existing community detection methods fail to detect communities with dense overlaps. We propose the Community-Affiliation Graph Model, a model-based community detection method that builds on bipartite node-community affiliation networks. Our method successfully captures overlapping, non-overlapping as well as hierarchically nested communities, and identifies relevant communities more accurately than the state-of-the-art methods in networks ranging from biological to social and information networks.

Figure 1. Conventional view of (a) two non-overlapping and (b) two overlapping communities. Notice that the nodes in the overlap are less connected. (c) Our findings suggest densely connected community overlaps. Top: network; Bottom: corresponding adjacency matrix.

1550-4786/12 $26.00 © 2012 IEEE. DOI 10.1109/ICDM.2012.139

I. INTRODUCTION

Nodes in networks organize into densely linked groups that are commonly referred to as network communities, clusters or modules [8], [27]. There are many reasons why networks organize into communities. For example, in social networks communities emerge since society organizes into groups, families, friendship circles, villages and associations [6], [28]. In the graph of the World Wide Web, topically related pages link more densely among themselves and communities naturally emerge [7]. And in biological networks communities emerge since proteins belonging to a common functional module are more likely to interact with each other [9], [13]. Communities in networks are thought of as groups of nodes that share a common functional property or role, and the goal of network community detection is to identify such sets of functionally related nodes from the unlabeled network alone [8].

The understanding and models of network communities have evolved over time [8], [17], [30]. Early works on network community detection were heavily influenced by the research on the strength of weak ties [11]. This led researchers to think of networks as consisting of dense clusters that are linked by a small number of long-range ties (Figure 1(a)) [10]. Graph partitioning [27], modularity [21], as well as betweenness centrality [10] based methods all assume such a view of network communities and thus search for edges that can be cut in order to separate the clusters.

Later it was realized that such a definition of network communities does not allow for community overlaps. In many networks a node may belong to multiple communities simultaneously, which leads to overlapping community structure [1], [2], [23]. The overlapping nature of communities can lead to communities that have more external than internal connections. To deal with this, community detection algorithms based on identifying overlapping cliques [23], articulation points, as well as hierarchical clustering of the edges [1] have been proposed.

However, practically all present overlapping community detection approaches have a hidden underlying assumption that was left unnoticed. In particular, present overlapping community detection methods assume that community overlaps are less densely connected than non-overlapping parts of communities (Figure 1(b)). In other words, this assumption means that the more communities a pair of nodes shares, the less likely it is they are connected. One possible reason that this assumption went unnoticed and untested could simply be due to the challenges

of evaluating community detection: the lack of reliable ground-truth makes the evaluation extremely difficult.

Present work: Empirical observations. Here we validate the above assumption by studying the connectivity structure of ground-truth communities [30]. Recently we identified a set of 6 different large social, collaboration, and information networks where we can reliably define the notion of ground-truth communities [30]. The networks we study come from a number of different domains and research areas. In all these networks nodes explicitly state their ground-truth community memberships [30]. The availability of reliable ground-truth communities has two important consequences. It allows us to empirically study the structure of true communities and validate present assumptions. Moreover, the ground-truth also allows us to move from qualitative to quantitative evaluation of network community detection methods [30].

In this paper we study the overlaps of ground-truth communities and discover that the probability of nodes sharing an edge increases as a function of the number of communities they have in common. We find an increasing relationship between the number of shared communities of a pair of nodes and the probability of them being connected by an edge. A direct consequence of this is that parts of the network where communities overlap tend to be more densely connected than the non-overlapping parts of communities (Figure 1(c)). This observation stands in sharp contrast to present structural definitions of network communities and also means that present methods [1], [2], [23] are not able to correctly identify such community overlaps. Present community detection algorithms would either mistakenly identify the overlap as a separate cluster or merge two overlapping communities into a single cluster.

Present work: Model-based community detection. We then proceed and ask the following question: What underlying process causes community overlaps to be denser than the communities themselves? To answer this question, we build on models of affiliation networks [4], [15] and develop the Community-Affiliation Graph Model (AGM), which reliably reproduces the organization of networks into communities and the overlapping community structure [31]. In our model communities arise due to shared group affiliations [4], [28], [6]. The central idea of generating social networks based on the affiliation network is that links among people stem from common group affiliations [4]. We model the probability of an edge between a pair of nodes as a function of the communities that the two nodes share. Community assignments in our model are probabilistic, which allows for flexibility in the structure of community overlaps: the AGM can model overlapping, non-overlapping, as well as hierarchically nested communities in networks.

Based on the AGM we then develop a community detection method that successfully detects overlapping, non-overlapping, as well as nested communities in networks.

We achieve this by fitting AGM (i.e., discovering the node-community affiliation graph) to an unlabeled undirected network. Using the Markov Chain Monte Carlo method and convex optimization, we develop a fitting algorithm for identifying node community affiliations. We also present a method that automatically determines the number of communities in a given network.

Experiments on social, collaboration, information and biological networks reveal that AGM discovers overlapping as well as non-overlapping community structure more accurately than present state-of-the-art methods [1], [23], [2], [26]. The success of our approach relies on the flexibility of the AGM, which allows for modeling overlapping, non-overlapping as well as hierarchically nested communities in networks.

In summary, our work has three main contributions:
• The observation that community overlaps are more densely connected than the non-overlapping parts.
• The Community-Affiliation Graph Model, which explains the emergence of dense community overlaps and accurately models network community structure.
• A model-based community detection method that detects overlapping, non-overlapping, as well as nested communities in networks.

II. NETWORKS WITH GROUND-TRUTH COMMUNITIES

We examine a collection of 6 large social, collaboration and information networks where nodes explicitly state their community memberships [30]. Members of these ground-truth communities share properties or attributes, common purpose or function. We did our best to identify networks in which such ground-truth communities can be reliably defined and identified. Table I gives the dataset statistics.

First we consider 4 online social networks: the LiveJournal blogging community [3], the Friendster online network [19], the Orkut social network [19], and the Youtube social network [19]. In each of these networks users create explicit groups which other users then join. Such groups serve as organizing principles of nodes in social networks and are focused on specific interests, hobbies, affiliations, and geographical regions. For example, LiveJournal categorizes communities into the following types: culture, entertainment, expression, fandom, life/style, life/support, gaming, sports, student life and technology. There are over 100 communities with Stanford in their name, and they range from communities based around different classes to student ethnic communities, departments, activity and interest based groups, varsity teams, etc.

Figure 2 gives the distribution (Complementary CDF) of ground-truth community sizes and of the number of community memberships of nodes in LiveJournal. First, notice that the community size distribution clearly follows a power law. The exponent of the cumulative distribution is 1.3, which is slightly higher than what has been reported in the past [5]


[Figure 2: two log-log plots for LiveJournal. Y-axis: Complementary CDF. X-axes: (a) Group Size, (b) Memberships.]

Figure 2. LiveJournal ground-truth communities: (a) Community size distribution, (b) Distribution of the number of communities a node belongs to.

Dataset        N        E          C        S        A
LiveJournal    4.0 M    34.9 M     310 k    40.06    3.09
Friendster     120 M    2,600 M    1.5 M    26.72    0.33
Orkut          3.1 M    120 M      8.5 M    34.86    95.93
Youtube        1.1 M    3.0 M      30 k     9.75     0.26
DBLP           0.43 M   1.3 M      2.5 k    429.79   2.57
Amazon         0.34 M   0.93 M     49 k     99.86    14.83

Table I. Dataset statistics. N: number of nodes, E: number of edges, C: number of communities, S: average community size, A: community memberships per node. M denotes a million and k denotes one thousand.

(based on detected rather than ground-truth communities). On the other hand, the distribution of the number of community memberships of a node seems to follow a lognormal distribution with mean 3.09. Overall, there are over three hundred thousand explicitly defined communities in the LiveJournal network.

Friendster, Youtube and Orkut online social networks define topic-based communities in the same way as LiveJournal. Users create explicit groups that others then join. Each user can join zero, one or more such groups. We consider each such group as a ground-truth community. Friendster is the largest network we consider in this study. It contains 120 million nodes, 2.6 billion edges and 1.5 million ground-truth communities.

The second type of network data we consider is the Amazon product co-purchasing network [16], where the notion of community is quite different from that in the social networks. Here the nodes of the network represent products and edges link commonly co-purchased products. Each product (i.e., node) belongs to one or more hierarchically organized product categories, and products from the same category define a group which we view as a ground-truth community. This means members of the same community share a common function or role, and each level of the product hierarchy defines a set of hierarchically nested and overlapping communities.

Finally, we also consider the collaboration network of DBLP [3] where nodes represent authors/actors and edges connect nodes that have co-authored a paper. In DBLP we use publication venues as ground-truth communities, which serve as proxies for highly overlapping scientific communities around which the network then organizes. In this network communities heavily overlap and tend to be larger than in the other networks we consider here (Table I).

The size of the networks we consider here ranges from hundreds of thousands to hundreds of millions of nodes and edges (Table I). The number of ground-truth communities varies from hundreds to millions, and there is also a wide range in group sizes and in the node membership distribution. Overall, the networks range from those with modular to highly overlapping community structure and represent a wide range of edge densities, numbers of explicit communities, as well as amounts of community overlap (Table I).

We refer the reader to [30] for further discussion on the choice of definition of ground-truth for each network. We were very careful to define ground-truth communities based on common functions or roles around which networks organize into communities [6], [11]. Note that this is fundamentally different from Ahn et al. [1], who evaluated communities based on attribute similarity of the members. The problem with this approach is that it folds all social dimensions (family, school, interests) around which separate communities form into a single similarity metric. In contrast, we harness explicitly labeled functional groups as labels of ground-truth communities [30].

All the networks we use are complete and publicly available at http://snap.stanford.edu. Even though our networks come from very different domains and have very different motivations for the formation of communities, the results we will present are consistent and robust across all of them. Our work is consistent with the premise that is implicit in all network community literature: members of real communities share some (latent/unobserved) functional property that serves as an organizing principle of the nodes and gives them a distinct structural connectivity pattern in the network. We use these groups around which communities organize to explicitly define ground-truth [30].

Data preprocessing. To represent all networks in a consistent way we drop edge directions and consider each network as an unweighted undirected static graph. Because members of a particular group may be disconnected in the network, we consider each connected component of the group as a separate ground-truth community. However, we allow ground-truth communities to be nested and to overlap (i.e., a node can be a member of multiple groups at once).

III. EMPIRICAL OBSERVATIONS

The availability of reliable ground-truth communities [30] allows us to empirically study the structure of communities and community overlaps. Based on empirical findings, we will then develop a new method for detecting overlapping communities.

We study the structure of community overlaps by asking what the probability is that a pair of nodes is connected if they share k common community memberships, i.e., the nodes belong to the overlap of the same k communities. Figure 3 plots this probability for all six datasets. We discover an increasing relationship for all datasets. This means that the more communities a pair of nodes

[Figure 3: six panels, (a) LiveJournal, (b) Friendster, (c) Orkut, (d) Youtube, (e) DBLP, (f) Amazon. X-axis: Common memberships. Y-axis: Edge probability. Curves: Data, AGM.]

Figure 3. Edge probability between two nodes given the number of communities that two nodes share. We observe that the edge probability is an increasing function of the number of common communities in all the networks.

[Figure 4: two panels, (a) Community Affiliation Network with communities A, B, C and edge probabilities pA, pB, pC; (b) Overlap.]

Figure 4. (a) Bipartite community affiliation graph. Circles: communities, Squares: nodes in (b) (one square is shown for two squares in (b)). Edges indicate node community memberships. (b) Network generated by AGM.

has in common, the higher the probability of them being connected. In LiveJournal, for example, if a pair of nodes has 8 groups in common, the probability of friendship is nearly 80%. To appreciate how strong the effect of shared communities is on the edge probability, note that all of our networks are extremely sparse. The background probability of a random pair of nodes being connected is $10^{-5}$, while as soon as a pair of nodes shares two communities, their probability of linking increases from $10^{-5}$ to $10^{-1}$. That is by 4 orders of magnitude! We note that all other data sets exhibit similar behavior: the probability of a pair of nodes being connected approaches 1 as the number of common communities increases. While in online social networks the edge probability exhibits a diminishing-returns-like growth, in DBLP it appears to follow a threshold-like behavior.

Discussion. The above result is very intuitive. While nodes belong to multiple communities (people have friends, families and co-workers), links often exist as a result of one dominant reason (people are in the same family, work together, or share common hobbies and interests). Thus, the more communities people have in common, the more opportunities there are to create links. So, people sharing multiple interests have a higher chance of becoming friends [18], researchers with many common interests are more likely to work together [24], and proteins belonging to multiple common functional modules are more likely to interact [9], [13]. Communities thus serve as organizing principles of nodes in social networks and are created based on shared affiliation, role, activity, social circle, interest or function.
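The measurement behind Figure 3 can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `edge_probability_by_shared`, the argument layout, and the toy data are assumptions made for illustration. It enumerates all node pairs, which is only practical for small graphs; for networks of the size studied here one would presumably enumerate only pairs that share at least one community.

```python
from collections import defaultdict
from itertools import combinations


def edge_probability_by_shared(nodes, edges, memberships):
    """Map k -> fraction of node pairs sharing exactly k communities that are linked.

    nodes: iterable of node ids
    edges: iterable of undirected (u, v) pairs
    memberships: dict node -> set of community ids (hypothetical layout)
    """
    edge_set = {frozenset(e) for e in edges}
    pairs = defaultdict(int)   # number of pairs with k shared communities
    linked = defaultdict(int)  # number of those pairs that are connected
    for u, v in combinations(nodes, 2):
        k = len(memberships.get(u, set()) & memberships.get(v, set()))
        pairs[k] += 1
        if frozenset((u, v)) in edge_set:
            linked[k] += 1
    return {k: linked[k] / pairs[k] for k in sorted(pairs)}


# Toy data (invented): communities A and B overlap in nodes 2 and 3.
toy_memberships = {1: {"A"}, 2: {"A", "B"}, 3: {"A", "B"}, 4: {"B"}, 5: set()}
toy_edges = [(1, 2), (2, 3), (3, 4)]
toy_result = edge_probability_by_shared([1, 2, 3, 4, 5], toy_edges, toy_memberships)
print(toy_result)  # {0: 0.0, 1: 0.5, 2: 1.0}
```

On this toy graph the estimated edge probability rises with the number of shared communities, which is the qualitative pattern the section reports for the real datasets.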

Our finding suggests communities overlap as illustrated in Figure 1(c). Since the probability of an edge increases as a function of the number of shared communities, nodes in the overlap of two (or more) communities are more likely to be connected. This view of network formation is consistent with works that predate the strength of weak ties literature. In particular, dense community overlaps are consistent with the works of Simmel [28] on the web of affiliations, and Feld [6] on the focused organization of social ties. In both of these views networks consist of overlapping tiles or social circles that serve as organizing principles of nodes in networks.

We also point out the contrast between our finding and the currently predominant view of network communities. Current understanding of network communities is based on two fundamental social network theories: triadic closure and strength of weak ties [11], which leads to the picture of network communities as illustrated in Figure 1(a). It also suggests that homophily in networks operates in small pockets where nodes gather in dense non-overlapping clusters (Figure 1(a)). Moreover, in some networks communities tend to overlap by nodes belonging to multiple communities at once (and thus residing in the overlap) [23]. Applying the conventional view in this case leads to the structure of community overlaps as illustrated in Figure 1(b): community overlaps are less densely connected than the groups themselves. Our results show the contrary is true.

Last, as a consequence this also means that present overlapping community detection methods [1], [2], [23] are not able to correctly identify such overlaps. They would either mistakenly identify the overlap as a separate cluster or merge two overlapping communities into a single cluster.

IV. COMMUNITY-AFFILIATION GRAPH MODEL

We proceed by formulating a simple conceptual model of networks that naturally leads to densely overlapping


communities. We then design a model fitting procedure that detects communities from a given unlabeled network.

We present the Community-Affiliation Graph Model (AGM), a probabilistic generative model for graphs that reliably reproduces the organization of networks into overlapping communities. Our model is based on two main ingredients. The first ingredient is based on Breiger's foundational work [4], which recognized that communities arise due to shared group affiliations [4], [28], [6]. We represent node community memberships with a bipartite affiliation network that links nodes of the social network to communities that they belong to. The second ingredient of our model is based on the fact that people belong to multiple communities (people have friends, families and co-workers) but the links between them often exist as a result of one dominant reason. We can model this by having each community also carry a single parameter that captures the probability that nodes belonging to that community share a link. This means every community that a pair of nodes shares gets an independent chance of connecting the nodes. Thus, naturally, the more communities a pair of nodes shares, the higher the probability of linking.

Figure 4(a) illustrates the essence of our model. We start with a bipartite graph where the nodes at the bottom represent the nodes of the social network, the nodes on the top represent communities, and the edges indicate node community memberships. We denote the bipartite affiliation network as $B(V, C, M)$, where $V$ is the set of nodes of the underlying network $G$, $C$ is the set of communities, and $M$ is the edge set.

Now, given the affiliation network $B(V, C, M)$, we want to generate a social network $G(V, E)$. To achieve this we need to specify the process that generates the edges $E$ of $G$ given the affiliation network $B$. We consider a simple parameterization where we assign a parameter $p_c$ to every community $c \in C$.
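The generative process of this section can be sketched in a few lines of Python: every community independently links each pair of its members with its own probability, so a pair sharing several communities ends up connected with probability one minus the product of (1 - p_c) over the shared communities. This is a minimal illustration rather than the authors' implementation; the names `edge_probability`, `sample_agm`, `memberships` and `p` are hypothetical, and the sketch gives pairs with no shared community no chance of linking.

```python
import random
from itertools import combinations


def edge_probability(u, v, memberships, p):
    """Probability that u and v are linked: 1 - prod over shared c of (1 - p[c])."""
    miss = 1.0
    for c in memberships[u] & memberships[v]:
        miss *= 1.0 - p[c]  # community c fails to create the edge
    return 1.0 - miss


def sample_agm(memberships, p, seed=None):
    """Sample an undirected edge set from an affiliation graph.

    memberships: dict node -> set of community ids (the bipartite graph B)
    p: dict community id -> edge probability p_c
    """
    rng = random.Random(seed)
    members = {}
    for node, comms in memberships.items():
        for c in comms:
            members.setdefault(c, []).append(node)
    edges = set()
    for c, nodes in members.items():
        # Each community creates edges independently of the others.
        for u, v in combinations(sorted(nodes), 2):
            if rng.random() < p[c]:
                edges.add((u, v))  # the set drops duplicate edges
    return edges


# Toy affiliation graph (invented): nodes 2 and 3 sit in the overlap of A and B.
toy_memberships = {1: {"A"}, 2: {"A", "B"}, 3: {"A", "B"}, 4: {"B"}}
toy_p = {"A": 0.3, "B": 0.5}
print(edge_probability(2, 3, toy_memberships, toy_p))  # overlap pair, about 0.65
print(sample_agm(toy_memberships, toy_p, seed=0))
```

Sampling per community and discarding duplicates is equivalent to drawing each pair once with the combined probability, because the per-community trials are independent.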
The parameter $p_c$ models the probability of an edge forming between two members of the community $c$. In other words, we simply generate an edge between a pair of nodes that belongs to community $c$ with probability $p_c$. Each community $c$ creates edges independently. However, if the two nodes are connected by more than one community, the duplicate edges are not included in the graph $G(V, E)$.

Definition 1: Let $B(V, C, M)$ be a bipartite graph where $V$ is a set of nodes, $C$ is a set of communities, and an edge $(u, c) \in M$ means that node $u \in V$ belongs to community $c \in C$. Let also $\{p_c\}$ be a set of probabilities for all $c \in C$. Given $B(V, C, M)$ and $\{p_c\}$, the Community-Affiliation Graph Model generates a graph $G(V, E)$ by creating edge $(u, v)$ between a pair of nodes $u, v \in V$ with probability $p(u, v)$:

$$p(u, v) = 1 - \prod_{k \in C_{uv}} (1 - p_k), \qquad (1)$$

where $C_{uv} \subset C$ is the set of communities that $u$ and $v$ share ($C_{uv} = \{c \mid (u, c), (v, c) \in M\}$).

Note that this simple process already ensures that pairs of nodes that belong to multiple common communities are more likely to link. This is due to the fact that nodes that share multiple community memberships receive multiple chances to create a link. For example, pairs of purple nodes in the overlap of communities A and B in Figure 4(a) get two chances to create an edge. First they can be connected with probability $p_A$ (due to their membership in community A) and then also with probability $p_B$ (due to membership in B). While pairs of nodes residing in the non-overlapping region of A link with probability $p_A$, nodes in the overlap link with probability $1 - (1 - p_A)(1 - p_B)$, which is greater than either $p_A$ or $p_B$.

We also point out that the Community-Affiliation Graph Model is very similar to the model of Lattanzi and Sivakumar [15]. However, there are two crucial differences. First, [15] posed a model where edge creation probability decreases with community size. AGM relaxes this and allows communities to have arbitrary edge probabilities, in order to flexibly model the community structure of real-world networks. Second, while [15] focuses on generating synthetic networks with desirable properties, our work aims to detect the community structure by developing an efficient fitting algorithm for AGM.

$\varepsilon$-community. In the formulation of Equation 1, AGM does not allow for edges between nodes that do not share any common communities. To allow for such edges, we assume an additional community, called the $\varepsilon$-community, which connects any pair of nodes with a very small probability $\varepsilon$. We find that setting $\varepsilon$ to the background probability of a pair of nodes being connected by an edge ($\varepsilon = 2|E| / (|V|(|V| - 1))$) works well in practice. In the case of our datasets, $\varepsilon \approx 10^{-8}$.

Flexibility of the AGM. Last, we also point out the flexible nature of the Community-Affiliation Graph Model, which allows for modeling a wide range of network community structures. Figure 5 illustrates the structure of the affiliation network for three possible community structures. Figure 5(a) shows an affiliation graph of a network with two non-overlapping communities. (Note the presence of the $\varepsilon$-community, which allows for edges between communities A and B.) Figure 5(b) shows an example of hierarchical community structure where communities A and C are nested inside community B. Finally, Figure 5(c) illustrates an affiliation network corresponding to a pair of overlapping communities. This means that the flexibility of the affiliation network structure allows the AGM to simultaneously model non-overlapping, hierarchically nested as well as overlapping communities in networks.

In [31] we further evaluate the ability of AGM to generate


[Figure 5: three panels, (a) Non-overlapping, (b) Nested, (c) Overlapping.]

Figure 5. AGM allows for rich modeling of network communities: (a) non-overlapping, (b) nested, (c) overlapping. In (a) we assume that nodes in two communities connect with small probability $\varepsilon$ (refer to the discussion in the main text).

realistic networks with realistic community structure. Our results show that AGM is able to generate networks with heavy-tailed degree distributions, high clustering as well as realistic overlapping, non-overlapping and hierarchical community structures.

V. COMMUNITY DETECTION WITH COMMUNITY-AFFILIATION GRAPH MODEL

Due to space constraints we now summarize the remainder of our work. For details we refer the reader to the full version of the paper [29].

We further build on the AGM and develop a novel community detection algorithm. We achieve this by fitting AGM to a given undirected unlabeled network $G$ by maximizing the likelihood $L(B, \{p_c\}) = P(G \mid B, \{p_c\})$ [29]. This means that given $G$ we find the affiliation graph $B$ and parameters $\{p_c\}$ that best match the structure of $G$. By finding $B$ we identify network communities, as $B$ models the affiliations of nodes to communities.

In the evaluation based on ground-truth community membership and node metadata [1], our algorithm outperforms three different state-of-the-art community detection methods [1], [23], [2] by 40% on average. Due to space constraints we refer the reader to [29] for further details.

Acknowledgements. This research has been supported in part by NSF IIS-1016909, CNS-1010921, CAREER IIS-1149837, IIS-1159679, DARPA XDATA, DARPA GRAPHS, Albert Yu & Mary Bechmann Foundation, Boeing, Allyes, Samsung, Intel, Alfred P. Sloan Fellowship and the Microsoft Faculty Fellowship.

REFERENCES
[1] Y.-Y. Ahn, J. P. Bagrow, and S. Lehmann. Link communities reveal multi-scale complexity in networks. Nature, 2010.
[2] E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing. Mixed membership stochastic blockmodels. JMLR, 2007.
[3] L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: membership, growth, and evolution. In KDD '06, 2006.
[4] R. L. Breiger. The duality of persons and groups. Social Forces, 1974.
[5] A. Clauset, M. Newman, and C. Moore. Finding community structure in very large networks. Phys. Rev. E, 2004.
[6] S. L. Feld. The focused organization of social ties. American Journal of Sociology, 1981.
[7] G. Flake, S. Lawrence, and C. Giles. Efficient identification of web communities. In KDD '00, 2000.
[8] S. Fortunato. Community detection in graphs. Physics Reports, 2010.
[9] A. Gavin et al. Proteome survey reveals modularity of the yeast cell machinery. Nature, 2006.
[10] M. Girvan and M. Newman. Community structure in social and biological networks. PNAS, 2002.
[11] M. S. Granovetter. The strength of weak ties. American Journal of Sociology, 1973.
[12] S. Gregory. Fuzzy overlapping communities in networks. J. of Stat. Mech., 2011.
[13] N. Krogan et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature, 2006.
[14] A. Lancichinetti and S. Fortunato. Community detection algorithms: A comparative analysis. Phys. Rev. E, 2009.
[15] S. Lattanzi and D. Sivakumar. Affiliation networks. In STOC '09, 2009.
[16] J. Leskovec, L. Adamic, and B. Huberman. The dynamics of viral marketing. ACM TWeb, 2007.
[17] J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 2009.
[18] M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 2001.
[19] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and analysis of online social networks. In IMC '07, 2007.
[20] M. Molloy and B. Reed. A critical point for random graphs with a given degree sequence. Random Structures and Algorithms, 1995.
[21] M. Newman. Modularity and community structure in networks. PNAS, 2006.
[22] M. Newman and G. Barkema. Monte Carlo Methods in Statistical Physics. Oxford University Press, 1999.
[23] G. Palla, I. Derényi, I. Farkas, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 2005.
[24] W. W. Powell, D. R. White, K. W. Koput, and J. Owen-Smith. Network dynamics and field evolution: The growth of interorganizational collaboration in the life sciences. Am. J. of Sociology, 2005.
[25] F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, and D. Parisi. Defining and identifying communities in networks. PNAS, 2004.
[26] M. Rosvall and C. T. Bergstrom. Maps of random walks on complex networks reveal community structure. PNAS, 2008.
[27] S. Schaeffer. Graph clustering. Computer Science Rev., 2007.
[28] G. Simmel. Conflict and the web of group affiliations. Simon and Schuster, 1964.
[29] J. Yang and J. Leskovec. Community-Affiliation Graph Model for Overlapping Network Community Detection. Extended version.
[30] J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM '12, 2012.
[31] J. Yang and J. Leskovec. Structure and Overlaps of Communities in Networks. In SNAKDD '12, 2012.
[32] E. Zheleva, H. Sharara, and L. Getoor. Co-evolution of social and affiliation networks. In KDD '09, 2009.
