Best Paper_ Credit Scoring with Social Network Data

This document analyzes the impact of using social network data in credit scoring, focusing on the accuracy of customer scores and the formation of social ties among consumers. The study finds that while network-based scoring can improve accuracy, it may also lead to social fragmentation and discrimination among consumers, particularly affecting low-income individuals. The implications for management and public policy are discussed, highlighting the potential for both benefits and drawbacks of social credit scoring practices.

Uploaded by

Chrysanthos Dellarocas



Vol. 35, No. 2, March–April 2016, pp. 234–258
ISSN 0732-2399 (print) ISSN 1526-548X (online) http://dx.doi.org/10.1287/mksc.2015.0949
© 2016 INFORMS

Credit Scoring with Social Network Data

Yanhao Wei
Department of Economics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, [email protected]

Pinar Yildirim, Christophe Van den Bulte

Marketing Department, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104
{[email protected], [email protected]}

Chrysanthos Dellarocas
Information Systems Department, Questrom School of Business, Boston University, Boston, Massachusetts 02215, [email protected]

Motivated by the growing practice of using social network data in credit scoring, we analyze the impact
of using network-based measures on customer score accuracy and on tie formation among customers. We
develop a series of models to compare the accuracy of customer scores obtained with and without network data.
We also investigate how the accuracy of social network-based scores changes when consumers can strategically
construct their social networks to attain higher scores. We find that those who are motivated to improve their
scores may form fewer ties and focus more on similar partners. The impact of such endogenous tie formation on
the accuracy of consumer scores is ambiguous. Scores can become more accurate as a result of modifications in
social networks, but this accuracy improvement may come with greater network fragmentation. The threat of
social exclusion in such endogenously formed networks provides incentives to low-type members to exert effort
that improves everyone’s creditworthiness. We discuss implications for managers and public policy.
Keywords: social networks; credit score; customer scoring; social status; social discrimination; endogenous tie
formation
History: Received: July 18, 2014; accepted: June 21, 2015; K. Sudhir served as the senior editor and Yuxin Chen
served as associate editor for this article. Published online in Articles in Advance October 26, 2015.

1. Introduction
When a consumer applies for credit, attempts to refinance a loan or wants to rent a house, potential lenders often seek information about the applicant’s financial background in the form of a credit score provided by a credit bureau or other analysts. A consumer’s score can influence the lender’s decision to extend credit and the terms of the credit. In general, consumers with high scores are more likely to obtain credit, and to obtain it with better terms, including the annual percentage rate (APR), the grace period, and other contractual loan obligations (Rusli 2013). Given that consumers use credit for a range of undertakings that affect social and financial mobility, such as purchasing a house, starting a business or obtaining higher education, credit scores have a considerable impact on access to opportunities and hence on social inequality among citizens.

Until recently, assessing consumers’ creditworthiness relied solely on their financial history. The financial credit score popularized by the Fair Isaac Corporation (FICO), for example, relies on three key data to determine access to credit: consumers’ debt level, length of credit history, and regular and on-time payments. Together, these elements account for about 80% of the FICO score. In the past few years, however, the credit scoring industry has witnessed a dramatic change in data sources (Chui 2013, Jenkins 2014, Lohr 2015). An increasing number of firms rely on network-based data to assess consumer creditworthiness. One such company, Lenddo, reportedly assigns credit scores based on information in users’ social networking profiles, such as education and employment history, how many followers they have, who they are friends with, and information about those friends (Rusli 2013).1 Similar to Lenddo, a growing number of start-ups specialize in using data from social networks. Such firms claim that their social network-based credit scoring and financing practices broaden opportunities for a larger portion of the population and may benefit low-income consumers who would otherwise find it hard to obtain credit.

1 Network data can be collected from a variety of sources. Lenddo, for instance, obtains applicants’ consent to scan a variety of their online social accounts (Facebook, Gmail, Twitter, LinkedIn, Yahoo, Microsoft Live) and sometimes also their phone activity.

Our study is motivated by the growing use of such practices and investigates whether a move to network-based credit scoring affects financing inequality. In particular, we address the following questions. First,

Wei et al.: Credit Scoring with Social Network Data
Marketing Science 35(2), pp. 234–258, © 2016 INFORMS 235

from the perspective of lenders, is there an advantage to using network-based measures rather than measures based only on an individual’s data? Second, as use of social network data becomes common practice, how may consumers’ endogenous network formation influence the accuracy of credit scores? Third, how does peer pressure operate in network-based credit scoring? Finally, and most important for public policy, how do these scores influence inequality in access to financing?

1.1. Main Insights
Access to financing is correlated with one’s credit score. Following Demirgüç-Kunt and Levine (2009), we assume that credit scores can influence access to financing at the extensive and intensive margins, i.e., by increasing the number of those who are considered eligible for financing as well as by providing access to credit at better terms. Although network-based scoring can affect access to financing at the extensive and intensive margin, the impact on each might be uneven for different segments of society.

We first develop a model with continuous risk types incorporating network-based data (§2). Under the assumption of homophily, the notion that people are more likely to form social ties with others who are similar to them, we show that network data provide additional information about consumers and reduce the uncertainty about their creditworthiness. We find that the accuracy of network-based scores depends primarily on information from the direct ties, i.e., the assessed consumers’ ego-network. This implies that credit-scoring firms can efficiently assess an individual’s creditworthiness using data from a subset of the overall network.

In §3, we extend our model to allow consumers in a network to form ties strategically to improve their credit scores. We find that they may then choose not to connect to people with lower scores. This can result in social fragmentation within a network: Those with better access to financing opportunities choose to segregate themselves from those with worse financing opportunities. As a result, consumers self-select into highly homogeneous yet smaller subnetworks. The impact of such social fragmentation on credit scoring accuracy is ambiguous. On the one hand, scores may more accurately reflect borrowers’ risk as each agent is situated in a more homogeneous ego-network. On the other hand, scores may become less accurate because smaller ego-networks provide fewer data points and hence less information on each person. How important financial scores are relative to social relationships determines whether strategic tie formation improves or harms credit score accuracy. When accuracy declines, network-based scoring could put deserving consumers with poor financing opportunities in further hardship. This result supports concerns about social credit scoring from consumer advocates and regulators such as the Consumer Financial Protection Bureau (CFPB) and the Federal Trade Commission (FTC) (Armour 2014).

In §§2 and 3, we study environments wherein all consumers, independent of their type, have similar needs for financing. We relax this assumption in §4 and introduce a formulation with discrete risk types that may vary in their needs for financing. When studying this environment, we pay particular attention to the strategic formation of social ties. An important result is the emergence of social exclusion or discrimination among low-type consumers. They avoid associating with one another because such associations signal even more strongly to lending institutions that their type is low. Such within-group discrimination is different from between-group discrimination studied commonly in the literature (e.g., Arrow 1998, Becker 1971, Phelps 1972).

In §5, again within a discrete setting, we allow consumers to exert effort to improve their true creditworthiness or type. When social ties motivate effort, social credit scoring may benefit those with poor financial health in two ways, i.e., not only by letting them benefit from a positive signal from social ties with others having a stronger financial footing but also by motivating them to invest more in their own financial health. We consider environments with explicit discrimination and with homophily. We find that when there are complementarities between the effort exerted by individuals, the between-group connections can motivate effort and thus lead to increased social mobility in both environments. The within-group connections also improve effort in a discriminatory environment. By contrast, when homophily is the only factor determining tie formation, a high number of low-type friends who exert low effort will reduce an individual’s desire to exert effort. In §6, we analyze another way consumers can exert effort to improve their financial outcomes, i.e., by actively networking to endogenously alter the probability of meeting people with high creditworthiness. Our analysis demonstrates that low types exert effort to meet others more aggressively than high types only when they are in dire need of improving credit access. Otherwise, high types exert greater effort.

1.2. Related Literature
Though motivated by and couched in terms of social credit scoring, the insights we develop go beyond that realm. Our models involve a relatively abstract notion of customer attractiveness or “type” that has two properties: (1) Social relationships are homophilic with respect to types; and (2) A third party such as a firm or society at large values higher types more and bestows some rewards (external to social relationships) that are monotonically increasing with one’s type. The notion of homophily in customer value, i.e., the notion

that attractive prospects or customers are more likely to be connected to one another than to the unattractive, and vice versa, underlies social customer scoring in predictive analytics (e.g., Benoit and Van den Poel 2012, Goel and Goldstein 2013, Haenlein 2011). It is also the basis for targeting friends and other network connections of valuable customers in new product launch (e.g., Haenlein and Libai 2013, Hill et al. 2006), in targeted online advertising (Bagherjeiran et al. 2010, Bakshy et al. 2012, Liu and Tang 2011), and in customer referral programs (e.g., Kornish and Li 2010, Schmitt et al. 2011). The basic insights also apply to employment settings, where firms have long used employee referral programs to attract better applicants (e.g., Castilla 2005) and many have started to use social network data to gain more information about applicants’ character and work ethic (e.g., Roth et al. 2016).

The model construct that we label “social credit score” captures a customer’s attractiveness or type as perceived by a firm based on social network information, in which the firm bestows some benefits that are monotonically increasing with type. Hence, our insights about social credit scoring can also be interpreted as pertaining to consumers’ social status more broadly, i.e., their “position in a social structure based on esteem that is bestowed by others” (Hu and Van den Bulte 2014, p. 510). As such, our analysis involving endogenous tie formation contributes not only to research traditions in economics and sociology (e.g., Ball et al. 2001, Podolny 2008) but also to the recent marketing research on how status considerations affect consumers’ networking behavior (Lu et al. 2013, Toubia and Stephen 2013), their acceptance of new products (Iyengar et al. 2015), and their appeal as customers (Hu and Van den Bulte 2014).

Even when limited to the realm of financial credit scoring, our analysis relates to several streams of recent work. First is the large and growing amount of work on microfinance and, more specifically, how group lending helps improve access to capital by reducing the negative consequences of information asymmetries between creditor and debtor (e.g., Ambrus et al. 2014; Bramoullé and Kranton 2007a, b; Stiglitz 1990; Townsend 1994). Our analysis focuses on individual rather than group loans, and on a priori customer scoring rather than a posteriori compliance through group monitoring and social pressure. Hence, our result that social credit scoring can lead people to form their network ties differently and to exert more effort in improving their financial health is different from, yet dovetails with, the evidence by Feigenberg et al. (2013) that group lending tends to trigger changes in network structure that in turn reduce loan defaults. The two different kinds of “social financing” practices acting at two different stages of the loan (customer selection and terms definition versus compliance) can lead to improved outcomes mediated through endogenous changes in network structure.

Second, we provide new insights on the risk of discrimination and exclusion triggered by social financing (Ambrus et al. 2014, Armour 2014). Our model allows for the possibility of discrimination against less creditworthy consumers. There are two ways through which such discrimination can come about. The first is that consumers may be subject to discrimination based on type. In an endogenous network, borrowers will be more selective in forming relationships, and may prefer to form relationships with higher-type consumers to protect their credit score. Formation of networks to attain a high credit score can be an indirect way of discrimination because some consumers are systematically excluded from others’ networks. The second is that consumers may observe each other’s effort to improve their score and may discriminate based on personal effort. Any low-type consumer who does not exert effort may face disengagement by fellow low-type contacts who exert effort and who want to disassociate their own credit score from hers.

Third, our work is relevant to ongoing debates on the impact of new social technologies on social integration versus balkanization. Rosenblat and Mobius (2004) find that a reduction in communication costs decreases the separation between individuals but increases the separation between groups. Along similar lines, van Alstyne and Brynjolfsson (2005) find that the Internet can lead to segregation among different types of individuals. In this study, we identify conditions under which network-based credit scoring (and customer scoring in general) may foster or harm integration within versus between groups.

Finally, our work will be of topical interest to the growing number of scholars seeking to better understand consumers’ financial behaviors, especially the role of homophily (Galak et al. 2011) and trust signaling (e.g., Herzenstein et al. 2011, Lin et al. 2013) in gaining access to credit. It will also be of interest to researchers focusing on the practices in emerging economies where consumer finance and access to credit are particularly important yet the traditional credit scoring apparatus is found lacking. Creditors in these markets often seek to enrich scores based on an individual’s history with additional information (e.g., Guseva and Rona-Tas 2001, Sudhir et al. 2015, Rona-Tas and Guseva 2014).

The rest of the article develops as follows. In §2, we present a benchmark model with data collection from networks to assess creditworthiness, and then provide justification for the emergence of this industry. In §3, we investigate the possibility of networks forming endogenously to the social credit scoring practice. We extend our model to allow consumers to vary in their financing needs in §4. We consider the possibility of social mobility through effort in §5. We extend

the model in several directions in §6 and conclude with implications for public policy and marketing practice in §7.

2. Model with Exogenous Network
Consider a society with a large population $S$. Each person $i$ is denoted by a type $x_i$, and $x_i$ follows $N(0, q^{-1})$ across individuals, with precision $q > 0$. We assume that each agent knows her own type and discovers that of fellow consumers upon meeting them.

The process of forming friendships is specified as follows. Each pair of consumers meet with a very small independent probability of $\nu > 0$. Between $i$ and $j$ there is an independent match value $m_{ij} \sim \chi^2_2$. A friendship between $i$ and $j$ creates utility $m_{ij} - |x_i - x_j|^2$ for either individual. So our model features homophily based on preference rather than opportunity (Zeng and Xie 2008): Individuals enjoy the company of others like them more than that of others unlike them. Person $i$ accepts the formation of a friendship tie with $j$, iff, they have met and

$$m_{ij} > |x_i - x_j|^2. \tag{1}$$

On mutual consent of both parties, a friendship tie is created. The assumption of a $\chi^2_2$ distribution implies that the probability $i$ and $j$ become friends upon meeting is

$$\Pr(m_{ij} > |x_i - x_j|^2) = e^{-|x_i - x_j|^2/2}. \tag{2}$$

Let $G$ denote the set of friendships (ties) in society and $n_i$ denote the number of friends of $i$, or, the degree of $i$ under $G$. The expected number of friends for $i$ is $\mathbb{E}(n_i \mid x_i) = S\nu\sqrt{q/(q+1)}\,e^{-(q/(1+q))x_i^2/2}$.2 To represent an environment with sufficient uncertainty about the creditworthiness of consumers, we make three assumptions: (i) the society is large ($S \to +\infty$); (ii) the probability that any pair of individuals meet is very small ($\nu \to 0$); and (iii) types are diffuse ($q \to 0$). These three properties characterize a society with sufficient uncertainty about individuals. They also allow us to assume that the product term $S\nu\sqrt{q/(q+1)}$ holds a constant, which we denote by $N$.3

Suppose that friendships in the society have been formed. The lender is interested in updating its information about the types of consumers using signals collected from the network. For any individual $i$, the lender may observe a noisy signal $y_i$ about her type

$$y_i = x_i + \varepsilon_i, \tag{3}$$

where $\varepsilon_i \sim N(0, c^{-1})$ and is independent across individuals. The firm observes the signals of a finite set of consumers $\mathbf{y}$, which we refer to as the vector of signals as well. For these consumers, the firm may observe the presence or absence of a tie. We use $g \equiv (g^1, g^0)$ to denote such information. Specifically, $g^1$ is the set of the dyads that the lender knows are friends, and $g^0$ is the set of the dyads that the lender knows are not friends. Furthermore, for each person in $\mathbf{y}$, we allow $g^0$ to include all of the dyads that involve her and someone outside $\mathbf{y}$.4

First, we present some properties about the firm’s posterior on the types of consumers in a network. Together with the nodes in $\mathbf{y}$, the ties in $g^1$ define a subnetwork involving only nodes on which a signal is observed. In this subnetwork, let $d_i$ be the degree of $i$,5 and $r(i, j)$ be the length of the shortest path (i.e., geodesic distance) between $i$ and $j$.

Proposition 1. Let vector $\mathbf{x}$ indicate the types of consumers in vector $\mathbf{y}$. $\Pr(\mathbf{x} \mid g, \mathbf{y})$ is a multivariate normal density with precision matrix $\Sigma^{-1}$

$$(\Sigma^{-1})_{ii} = c + d_i, \qquad (\Sigma^{-1})_{ij} = -\mathbf{1}\{ij \in g^1\},$$

and mean vector

$$\boldsymbol{\mu} = c\Sigma\mathbf{y}. \tag{4}$$

Proposition 1 states that the lender’s beliefs about the types of consumers in the network follow a multivariate normal distribution the parameters of which depend on the network structure. So two consumers with identical individual signals (such as personal financial history) may obtain different network-based scores because of social connections. These consumers would obtain similar financing opportunities if credit scores relied solely on individual history. In the new regime, despite identical individual financial histories, it is possible that they will have unequal access to financing because of score gains and losses from the social network.

Equation (4) shows that the weight that contact $j$’s signal receives depends on her location in the network. Proposition 2 states an upper bound on the weight of connection $j$’s signal on $i$’s posterior mean. When all else is equal, the upper bound on the weight of $j$ decreases in the distance $r(i, j)$. If $i$ and $j$ are not connected in the subnetwork, the weight is zero.

2 $\mathbb{E}(n_i \mid x_i) = S\nu \int_{-\infty}^{+\infty} e^{-(t - x_i)^2/2}\sqrt{q/(2\pi)}\,e^{-qt^2/2}\,dt = S\nu\sqrt{q/(q+1)}\,e^{-(q/(1+q))x_i^2/2}$.

3 In a small society where everyone is likely to be friends with others, or in a society where each type is organized in perfectly homogeneous and mutually disconnected subgraphs (i.e., components), there is little to no uncertainty about an individual’s type. This implies that network-based scores are less useful.

4 This type of information arises when the lender observes all of $i$’s friends and their signals, which implies that $i$ is not friends with the rest of the society. Corollary 1 demonstrates an example of such a situation.

5 Note that $d_i$, the observed degree of $i$ need not be the same as her true degree, $n_i$, as here we allow for observing any subnetwork of friends, $d_i \le n_i$.
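
To make Proposition 1 concrete, the following Python sketch builds the precision matrix from a list of known ties and returns the weight matrix $c\Sigma$ and the posterior mean $c\Sigma\mathbf{y}$. It is only an illustration under stated assumptions: it relies on NumPy, and the function names `posterior_weights` and `posterior_mean`, the 0-based node labels, and the four-consumer network with its signal values are hypothetical and do not come from the article. The example shows two consumers with identical individual signals receiving different network-based posterior means.

```python
import numpy as np

def posterior_weights(n_nodes, ties, c):
    """Weight matrix c * Sigma of Proposition 1 (diffuse prior on types).

    ties: unique (i, j) pairs, 0-based, that the lender knows are friends (g^1).
    c:    precision of each individual signal.
    """
    precision = c * np.eye(n_nodes)       # start the diagonal at c
    for i, j in ties:
        precision[i, i] += 1.0            # diagonal ends up as c + d_i
        precision[j, j] += 1.0
        precision[i, j] -= 1.0            # off-diagonal is -1 for each known tie
        precision[j, i] -= 1.0
    return c * np.linalg.inv(precision)

def posterior_mean(n_nodes, ties, c, signals):
    """Posterior mean vector c * Sigma * y."""
    return posterior_weights(n_nodes, ties, c) @ np.asarray(signals, dtype=float)

# Hypothetical data: consumers 0 and 1 have the same individual signal (0.0).
# Consumer 0 is tied to a high-signal contact, consumer 1 to a low-signal one.
ties = [(0, 2), (1, 3)]
signals = [0.0, 0.0, 2.0, -2.0]
mu = posterior_mean(4, ties, c=1.0, signals=signals)
print(np.round(mu, 3))
```

With these made-up numbers, consumer 0 ends up with a higher posterior mean than consumer 1 even though their own signals are identical, which is the inequality mechanism described above.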

Proposition 2. For all $i \ne j$ and $r(i, j) < +\infty$, the weight matrix of Proposition 1 satisfies

$$c\Sigma_{ij} < \frac{c}{c + d_i}\,\frac{\lambda^{r(i,j)}}{1 - \lambda},$$

where

$$\lambda \equiv \frac{\max_{k \in \mathbf{y}}\{d_k\}}{c + \max_{k \in \mathbf{y}}\{d_k\}}.$$

To generate further insights about how the weight of a connection’s signal changes with distance, we follow with two examples:

Example 1. For a simple example, consider a star network $g^1$ that is centered at 1.

[Figure: star network with node 1 at the center, tied to nodes 2, 3, and 4.]

With $c = 1$, $c\Sigma$ equals

$$\begin{pmatrix} 0.4 & 0.2 & 0.2 & 0.2 \\ 0.2 & 0.6 & 0.1 & 0.1 \\ 0.2 & 0.1 & 0.6 & 0.1 \\ 0.2 & 0.1 & 0.1 & 0.6 \end{pmatrix}.$$

By Proposition 1, this is a “weight” matrix, suggesting that to calculate the posterior mean of $x_1$, for example, the firm should weigh the signals $(y_1, y_2, y_3, y_4)$ by $(0.4, 0.2, 0.2, 0.2)$. Note further that direct neighbors (friends) for nodes 2, 3, and 4 receive more weight than indirect neighbors (friends of friends).

Example 2. Consider the following $g^1$.

[Figure: line network 1-2-3-4.]

With $c = 1$, the weight matrix is

$$\begin{pmatrix} 0.62 & 0.24 & 0.10 & 0.05 \\ 0.24 & 0.48 & 0.19 & 0.10 \\ 0.10 & 0.19 & 0.48 & 0.24 \\ 0.05 & 0.10 & 0.24 & 0.62 \end{pmatrix}.$$

Note that direct neighbors are weighed more heavily than indirect neighbors, and that direct neighbors need not receive equal weight. For instance, the updating of $x_2$ weighs the signal from node 1 more heavily than that from node 3.

The above examples convey the intuition that distant signals on average receive lower weight in a firm’s updating of beliefs about a consumer’s type. In Examples 1 and 2, the weight of the signal of an individual who is two links away is always lower than the weight of the individual who is only one link away. In the second example, although individual 2 is at an equal distance to persons 1 and 3, their signals receive different weights: Individual 3’s signal is diluted as she is linked to individual 4.

Propositions 1 and 2 together imply that agents who have lower distances to high-type consumers can receive a more favorable posterior in credit score assessment. Conversely, proximity to those with low signals may hurt an individual’s assessment. Consumers cannot choose their distance as we have not yet considered active selection of friendship ties to attain such benefits (see §3). When the weight of a friend $j$’s signal (on updating the beliefs about the type of $i$) is zero, this implies that either it is unknown whether there is a friendship between the ego and $j$, or that $j \in g^0$ and they are not friends. When two people are not friends, the interpretation is that they have not met due to the low meeting probability.

In the remainder of the paper, we assume that when evaluating a particular $i$, the firm observes the complete ego-network of $i$, i.e., all of the ties $ij \in G$, and receives a signal on each of $i$’s friends. We collect the signals in the vector $\mathbf{y}_i$, which we will refer to as the set of $i$’s friends. Note that this imposes an additional assumption on the previous analysis: We now require that $g^1$ equals the complete set of $i$’s direct ties. The posterior belief of the firm about an individual’s type can then be stated as a special case of Proposition 1.

Corollary 1. For the risk assessment of type $i$, $\Pr(x_i \mid \mathbf{y}_i)$ is normal with precision

$$\rho_i = c + \frac{c}{c + 1}\,n_i, \tag{5}$$

and mean

$$\mu_i = \frac{1}{\rho_i}\left(c\,y_i + \frac{c}{c + 1}\sum_{ij \in G} y_j\right).$$

Corollary 1 states that when an individual has a higher number of connections, the posterior about her type will be more precise. The assessment of an individual with a higher degree is likely to be closer to this true type, $x_i$.6 More important, (5) implies that the precision of a lender’s beliefs is higher than the precision of the individual signal of $i$, even with data only from the direct relationships of $i$. The corollary thus states useful information about the efficiency of risk assessment based on network data. If gathering data on the whole network is impossible or costly, efficiency gains can still be attained by using data from the focal consumer’s immediate neighbors. Remember from Proposition 2 that first degree contacts of $i$ receive a greater weight, and that data from longer paths in the network are expected to receive gradually lower weights in the beliefs about one’s creditworthiness.

6 Note that $\rho_i = 1/\mathbb{E}((\mu_i - x_i)^2 \mid \mathbf{y}_i)$, which is the inverse of the conditional mean squared error. Because in (5) $\rho_i$ is increasing in $n_i$, the conditional mean squared error is decreasing with $n_i$.
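
The two example weight matrices and Corollary 1 can be checked numerically. The Python sketch below is an illustration under stated assumptions rather than the authors' code: it uses NumPy, the helper name `weight_matrix` and the 0-based node labels are hypothetical, and Example 2 is read as the line network 1-2-3-4, which is the reading consistent with the printed matrix and with the remark that individual 3 is linked to individual 4. The symbol `lam` stands for the ratio of the maximum degree to $c$ plus the maximum degree in the Proposition 2 bound.

```python
import numpy as np

def weight_matrix(n_nodes, ties, c):
    """c * Sigma, where Sigma^{-1} has c + d_i on the diagonal and -1 per tie."""
    precision = c * np.eye(n_nodes)
    for i, j in ties:
        precision[i, i] += 1.0
        precision[j, j] += 1.0
        precision[i, j] -= 1.0
        precision[j, i] -= 1.0
    return c * np.linalg.inv(precision)

c = 1.0
# Example 1: star centered at node 1 (index 0).
star = weight_matrix(4, [(0, 1), (0, 2), (0, 3)], c)
# Example 2, read as the line network 1-2-3-4 (indices 0..3).
path = weight_matrix(4, [(0, 1), (1, 2), (2, 3)], c)
print(np.round(star, 2))
print(np.round(path, 2))

# Corollary 1 for the center of the star, who has n_i = 3 friends.
n_i = 3
rho_i = c + c / (c + 1.0) * n_i               # posterior precision, Equation (5)
own_weight = c / rho_i                        # weight on the ego's own signal
friend_weight = (c / (c + 1.0)) / rho_i       # weight on each friend's signal

# Proposition 2 bound on the line network, where r(i, j) = |i - j|.
degrees = np.array([1, 2, 2, 1])
lam = degrees.max() / (c + degrees.max())
bound_holds = all(
    path[i, j] < c / (c + degrees[i]) * lam ** abs(i - j) / (1.0 - lam)
    for i in range(4) for j in range(4) if i != j
)
```

In this sketch the closed-form weights from Corollary 1 coincide with the first row of the star matrix, and the Proposition 2 bound holds for every off-diagonal entry of the line network, although it is far from tight in such a small graph.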

3. Endogenous Tie Formation
We next study consumers’ incentives to form network ties to improve their scores. This suggests that the probability that two agents will become friends depends on their type, $x_i$, and the expected utility from improving their credit score.

Facing network-based scoring, a consumer has an incentive not to form ties with low types to achieve a more favorable score. Such endogenous tie formation involves a trade-off between utility from friendship ties with people one likes and utility from a high score. To formally express this, we assume that the posterior mean $\mu_i$ enters the utility additively. The utility of individual $i$ is

$$U_i = \sum_{j:\, ij \in G} \left(m_{ij} - |x_i - x_j|^2\right) + \alpha\mu_i, \tag{6}$$

where the first part of the utility, $(m_{ij} - |x_i - x_j|^2)$, indicates a social utility taking into consideration homophily. The second part, $\alpha\mu_i$, indicates how much $i$ enjoys having a high posterior mean. Here, $\alpha$ calibrates the relative importance an individual places on receiving a high credit score versus the utility from friendship ties with people she likes. All consumers gain utility from their posterior credit score at rate $\alpha$.7 If $\alpha = 0$, the individual cares only about forming friendships for social utility. If $\alpha \to +\infty$, then the agent cares little about social utility but cares greatly about improving her score.

Parameter $\alpha$ can also be interpreted as a measure of the desire for status. How much people care about how highly others evaluate them (i.e., generate a posterior about their type based on characteristics of their network) captures the importance people place on their position in a social structure based on esteem that is bestowed by others, i.e., their status. Let each consumer $i$ adopt a tie formation rule a priori (i.e., before meeting $j$) which states that she will accept friendship with $j$, iff,

$$\begin{cases} m_{ij} > \gamma_i\,|x_i - x_j|^2 & \text{for } x_j \ge x_i, \\ m_{ij} > \beta_i\,|x_i - x_j|^2 & \text{for } x_j < x_i. \end{cases}$$

The parameters $\beta_i$ and $\gamma_i$ represent the degree to which $i$ is willing to accept a lower and a higher-type individual as a friend. These parameters are not exogenous but will be chosen simultaneously8 and optimally by consumers. Although individual $i$ would prefer to be friends with others similar to her, which was expressed in (1), she may have additional utility from adding high type or removing low type friends due to the improvement in her credit assessment. This suggests that consumers will form relationships with others who have lower types only if the match value $m_{ij}$ yields sufficiently high utility.

Comparing (6) with (1), a greater (lesser) desire to link to individuals with higher (lower) types would indicate that an agent should pick $\gamma_i \le 1$ and $\beta_i \ge 1$.9 Remember that forming a friendship tie requires mutual consent: For $i$ and $j$ to become friends, $i$ should want to connect with $j$ and $j$ should want to connect with $i$.10 Thus $\gamma_i$ becomes irrelevant and $\beta_i$ becomes the parameter that sets the level of mixing with others. In the rest of the paper we omit any further references to $\gamma_i$.

Consider the symmetric case where $\beta_i = \beta$ for all $i$. If everyone applies the same rule with common $\beta$, a friendship is established after meeting, iff, $m_{ij} > \beta|x_i - x_j|^2$. With the common rule in place, the probability of becoming friends after meeting becomes

$$\Pr(|x_i - x_j|, \beta) = e^{-\beta|x_i - x_j|^2/2}.$$

Compared with the tie formation probability in an exogenous setting (given by Equation (2)), consumers will be more selective in linking to others. Fewer ties will be formed in the endogenous case.

7 To allow for the possibility that some agents may have no interest in improving their scores when they meet others with similar types, §4 presents a discrete formulation of our matching model and we provide a special case wherein the high types have zero utility from credit scores.

8 Note that in this model consumers form ties simultaneously. A model with sequential friendship formation would need to consider, in addition to tie formation rules, rules about the order in which consumers form ties, and would need to assume that individual beliefs about firms’ financial assessment are consistent with equilibrium outcomes.

9 The benefits of network-based scoring are measured by the difference between one’s expected posterior mean and one’s individual signal. This difference increases in $\beta_i$ (i.e., the rate at which the individual rejects ties with low-type friends) and decreases in $\gamma_i$ (i.e., the rate at which the individual adds high-type friends). Choosing $\gamma_i > 1$ is worse than $\gamma_i = 1$ because it decreases both the expected score benefit and the social utility of a tie. Similarly, choosing $\beta_i < 1$ rather than $\beta_i = 1$ would decrease the utility from a higher credit score and the social utility of a well matching tie. Together, these two arguments imply that: (i) any symmetric equilibrium derived with restrictions is still an equilibrium even if we allow $\gamma_i > 1$ or $\beta_i < 1$; and more important, (ii) there is no symmetric equilibrium where $\gamma > 1$ or $\beta < 1$.

10 If we allowed consumers to form friendships without mutual consent, then everyone could link to anyone to improve her own score. The benefits of network-based scoring would be limited since a connection to a high type would not be informative of one’s type.

3.1. Credit Scoring with Endogenous Tie Formation
In this section we complete the analysis of endogenous relationship formation using an equilibrium concept. We use $(\beta, \beta_i)$ to denote the common rule with the possible deviation of $i$. The expected utility of $i$ becomes

$$\mathbb{E}(U_i \mid x_i, \beta, \beta_i) = \mathbb{E}\left[\sum_{j:\, ij \in G} \left(m_{ij} - |x_i - x_j|^2\right) \,\Big|\, x_i, \beta, \beta_i\right] + \alpha\,\mathbb{E}\left[\mu_i(\beta) \mid x_i, \beta, \beta_i\right], \tag{7}$$

where $\mu_i(\gamma) = \mathbb{E}(x_i \mid \mathbf{y}_i, \gamma)$ is the lender's posterior. Each person calculates her expected utility from being in a friendship network before the network is formed, implying that expected utility will depend on the friendship rule $(\gamma, \gamma_i)$ adopted. The expectation $\mathbb{E}(\cdot)$ is taken before meeting others. We first display a version of Corollary 1 under a symmetric rule. In the following, when $\gamma_i$ conforms with the common rule, we omit $\gamma_i$ in the expectation conditionals.

Lemma 1. Under a common relationship formation rule $\gamma$, the posterior $\Pr(x_i \mid \mathbf{y}_i, \gamma)$ is normal with precision

$$\rho_i(\gamma) = c + \frac{c\gamma}{c + \gamma}\, n_i, \tag{8}$$

and mean

$$\mu_i(\gamma) = \frac{1}{\rho_i(\gamma)} \Big( c\, y_i + \frac{c\gamma}{c + \gamma} \sum_{j:\, ij \in G} y_j \Big).$$

Compared to Corollary 1, in Lemma 1, $\rho_i$ and $\mu_i$ are scaled by the selection rule $\gamma$. When borrowers are more selective in forming friendships with lower types (when $\gamma$ is higher), a financial institution will put more weight on friends' signals to update beliefs about the type of an individual (i.e., to calculate the posterior). In broad terms, this selectivity addresses our second main research question: When consumers react to an environment with network-based scoring, will scores be less or more precise? In other words, will assessments based on network data yield a better assessment? Our answer to this question is a qualified yes. We explain the mechanism through which this improvement can be achieved via a lemma and a proposition.

Lemma 2. The expected degree under a symmetric rule $\gamma$ satisfies

$$\mathbb{E}(n_i) = \frac{N}{\sqrt{\gamma}}. \tag{9}$$

A lower rate of mixing between types (a higher $\gamma$) results in a smaller number of ties per person. Ties are formed only between those who are highly similar to each other in type. Such self-selection reduces the expected number of connections among consumers but increases the information value of any single link and the signal it conveys. The net effect on the formation of ties is not yet clear. We address it next.

Proposition 3 shows that, under the limits of $S$, $\nu$, and $q$, there is a symmetric equilibrium $\gamma^*$ where $\gamma_i = \gamma^*$, which maximizes (7) for any individual $i$, given that $\gamma = \gamma^*$ is the common rule adopted by everyone else. In other words, there exists a common tie formation rule from which no individual wants to deviate, and with which the lender's posterior is consistent.

Proposition 3. For $0 < \alpha < N$, there exists at least one symmetric equilibrium, and any symmetric equilibrium $\gamma^*$ must satisfy

$$1 < \gamma^* < \Big(1 - \frac{\alpha}{N}\Big)^{-2}. \tag{10}$$

In words, when networks are created endogenously, consumers are more selective in accepting friendships in equilibrium; the upper bound on selectivity is determined by how much importance consumers put on a high credit score and the expected degree in society.

Corollary 2. If $c \ge N/(N - \alpha)$, then

$$\mathbb{E}[\rho_i(\gamma^*) \mid \gamma^*] > \mathbb{E}[\rho_i(1) \mid \gamma = 1],$$

where $\rho_i \equiv \mathrm{Precision}(x_i \mid \mathbf{y}_i, \gamma)$. On average, the network-based score becomes more accurate when consumers are averse to connecting with lower type peers. Otherwise, if $c \le 1$, then $\mathbb{E}[\rho_i(\gamma^*) \mid \gamma^*] < \mathbb{E}[\rho_i(1) \mid \gamma = 1]$. On average, the network-based scores are less accurate.

Social credit scoring changes consumer incentives to form relationships in two directions. Compared to the exogenous setting ($\gamma = 1$), in the endogenous setting with $\gamma = \gamma^* > 1$, relationships are formed more selectively. This has several consequences. First, relationships are more strongly homophilous, that is, consumers form relationships with others who are closer to their own type. For lenders, this first effect has a positive impact on network scores: The accuracy of their assessment will improve as a result of obtaining signals from closer types. Network-based scores will be even more precise due to data from others who are expected to be more similar in type.

Second, consumers will reject friendship ties with others who have lower types. This implies that ego-networks will shrink (Lemma 2). This second effect has a negative impact on network scoring accuracy. The two forces, i.e., homogenization and the shrinking of ego-networks, work against each other. The net effect is ambiguous.

Corollary 2 identifies a further condition to characterize situations in which the net effect is positive and network score accuracy improves with endogenous tie formation. For some sufficiently small $\alpha$, lenders may benefit from using network-based credit scoring as it becomes even more precise with self-selection of consumers to form networks to improve their credit scores. The improvement in precision is conditional on consumers placing sufficiently low weight on financial outcomes relative to the utility derived from social connections. Paradoxically, when consumers care greatly about their score or status, they may reduce the size of their social networks so much that network-based scoring becomes less reliable in equilibrium.

Can societal tissue make network-based scoring more effective in some societies than others? Corollary 2

states that the parameter range under which network-based scores are more precise is larger when the average number of friends is higher. If everything else remains the same, the benefits of network-based scoring may be greater in societies where people maintain a large number of connections, which are likely to be societies with collectivist cultures (Hofstede 2001). Interestingly, several start-ups turning to social scoring have been growing in countries known to have collectivist cultures where the density of relationships is generally higher. Lenddo, for instance, operates in Mexico, Colombia, and the Philippines, and reports that Mexico is its fastest growing market.¹¹

¹¹ http://techonomy.com/2014/02/lenddos-borrowers-mexico-philippines-get-credit-via-facebook/.

3.2. Lending Rates with Endogenous Network Formation
We now relate our scoring formulation to lending rates, i.e., access to finance at the intensive margin. The discussion in this section implies that network-based scoring affects the rates at which consumers can borrow, even if they would qualify to receive credit using the individual score system. For simplicity and concreteness of discussion, we specify the perceived probability of repayment of credit by consumer $i$, $P_i$, as

$$P_i = \frac{1}{1 + e^{-\mu_i}},$$

which increases from 0 to 1 as the lender's assessment of the borrower's posterior mean, $\mu_i$, increases from $-\infty$ to $+\infty$. Consider a risk-neutral lender who earns a rate of $r_o$ from a non risky investment. Let $r_i$ be the lending rate to be charged to consumer $i$ with type $x_i$. The firm determines the rate by solving

$$P_i (1 + r_i) + (1 - P_i) \cdot 0 = 1 + r_o.$$

This formulation takes into account not only the expected creditworthiness of a consumer, $\mu_i$, but also the outside options of the lender, $r_o$. For $r_o = 0$, the borrowing rate for $i$ equals the odds of default versus repayment

$$r_i = \frac{1 - P_i}{P_i} = e^{-\mu_i}. \tag{11}$$

As the consumer's likelihood of a default increases, she faces a higher borrowing rate. Note that the financial utility of a consumer given in Equation (6) can be derived by assuming that the lending rate enters the utility through $-\log(r_i)$. If lending rates can be interpreted in the context of economic opportunities available to consumers, then a consumer with a better network score will be likely to receive a loan on better terms. This links network-based credit scores to financing access at the intensive margin.

4. Role of Signals from Social Contacts
In the preceding sections, we developed a model with continuous types and assumed that every individual had identical incentives to improve her credit score. In reality, there may be differences among consumers about how much utility they can gain from improving their credit score conditional on their type. In this section, we introduce a discrete version of the model to allow for this possibility. The discrete version allows us to analyze in greater detail how the firm uses signals of low versus high type friends when assessing a consumer's creditworthiness. This enables us to disentangle and contrast the role of high- and low-type contact signals in the network.

4.1. Credit Scoring and Tie Formation with High and Low Types
Consider a society with two types of borrowers: high types ($h$) and low types ($l$) where the prior is uniform, with $\Pr(x_i = l) = \Pr(x_i = h) = \tfrac{1}{2}$. Whereas high types have a low risk of credit default, low types have a higher risk. With probability $\nu$, any two consumers will meet. On meeting, they learn each other's type and their match value $m_{ij} > 0$, which is i.i.d. across pairs, with positive distribution density $f$. For $i$, the utility of becoming friends with $j$ is

$$m_{ij} - \mathbb{1}\{x_j \neq x_i\}, \tag{12}$$

where the disutility of becoming friends with a different type is normalized to 1. The utility of not becoming friends is 0. Given the specification, the probability that two same-type consumers will become friends conditional on meeting is 1, while the probability of two different types becoming friends is $p \equiv \Pr(m_{ij} > 1) < 1$. Hence the network features preference-based homophily. We retain the assumptions $S \to +\infty$ and $\nu \to 0$ and set $S\nu = N$ for some positive number $N$. With the discrete formulation, the expected number of friends for any type is $\tfrac{1}{2} S\nu (1 + p)$: Increasing the degree of homophily (a lower $p$) reduces the expected number of friends.

Network-based Score. We assume that the lender may observe a signal $y_i$ which is $-1$ or $1$, indicating a low or high type. The signal is credible but incorrect with probability $\varepsilon < \tfrac{1}{2}$. This implies, for example, that if the lender receives a signal from an $l$-type consumer, with probability $1 - \varepsilon$ it observes $y_i = -1$ and with the remaining probability it observes $y_i = 1$. Let $\mathbf{y}_i$ be the collection of signals from $i$ and the friends of $i$. We first explore how the firm perceives the probability of an agent being of $h$-type conditional on the structure of her social network.
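The signal and meeting structure just described can be simulated directly. The following is a minimal sketch, not the authors' procedure: it assumes a finite population of `S` consumers with meeting probability `nu = N / S` (the model itself works in the limit of large `S`), and the function name, arguments, and return layout are hypothetical choices made for illustration.

```python
import random


def simulate_signals_and_ties(S, N, p, eps, seed=0):
    """Draw types, noisy signals, and homophilous ties for S consumers.

    Illustrative assumptions: finite S with meeting probability nu = N / S;
    a tie between different types survives with probability p = Pr(m_ij > 1).
    Returns (types, signals, H, L), where H[i] and L[i] count i's friends
    whose observed signal is high (+1) or low (-1).
    """
    rng = random.Random(seed)
    nu = N / S
    types = [rng.choice("hl") for _ in range(S)]  # uniform prior over h and l
    signals = []
    for t in types:
        truth = 1 if t == "h" else -1
        # The observed signal is flipped with probability eps.
        signals.append(-truth if rng.random() < eps else truth)
    H = [0] * S
    L = [0] * S
    for i in range(S):
        for j in range(i + 1, S):
            if rng.random() >= nu:
                continue  # the pair never meets
            if types[i] != types[j] and rng.random() >= p:
                continue  # different types and match value too low
            # Same types always become friends; record each other's signal.
            for a, b in ((i, j), (j, i)):
                if signals[b] == 1:
                    H[a] += 1
                else:
                    L[a] += 1
    return types, signals, H, L
```

For large `S`, the average of `H[i] + L[i]` across consumers should be close to the expected degree $\tfrac{1}{2} N (1 + p)$ stated above, although any single draw will deviate from it.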

Lemma 3. In evaluating the type of $i$, the posterior for her to be high type is

$$\Pr(x_i = h \mid \mathbf{y}_i) = \left[ 1 + \left(\frac{\varepsilon}{1 - \varepsilon}\right)^{y_i} \left(\frac{\varepsilon p + (1 - \varepsilon)}{\varepsilon + (1 - \varepsilon) p}\right)^{L_i} \left(\frac{\varepsilon + (1 - \varepsilon) p}{\varepsilon p + (1 - \varepsilon)}\right)^{H_i} \right]^{-1}, \tag{13}$$

where $y_i$ is the signal observed for agent $i$, $H_i$ is the number of friends with high signal, and $L_i$ is the number of friends with low signal.

Lemma 3 suggests that low- and high-type signals observed for a consumer's social connections affect the lender's assessment of that consumer's creditworthiness in different directions. Note that $(\varepsilon + (1 - \varepsilon) p)/(\varepsilon p + (1 - \varepsilon)) < 1$ and $(\varepsilon p + (1 - \varepsilon))/(\varepsilon + (1 - \varepsilon) p) > 1$. Thus, high-type signals increase the likelihood that an agent will be categorized as high type, whereas low-type signals reduce this likelihood. Figures 1 and 2 illustrate how $\Pr(x_i = h \mid \mathbf{y}_i)$ changes with $H_i$ and $L_i$. The firm would prefer to extend credit to $l$-types with a higher number of $h$-type connections, if everything else remained the same. This suggests that in a given network where $l$-types are fairly segregated from $h$-types due to homophily, $l$-types who are bridges between $l$-types and $h$-types may be favored by the lender (compared to $l$-types surrounded by the same-types). Put differently, in-group centrality of $l$-types will hurt their financing opportunities whereas between-group centrality will improve them.

Figure 1. $\Pr(x_i = h \mid \mathbf{y}_i)$ vs. $H_i$ ($\varepsilon = 0.4$, $p = 0.6$, $L_i = 10$, $y_i = -1$)

Figure 2. $\Pr(x_i = h \mid \mathbf{y}_i)$ vs. $L_i$ ($\varepsilon = 0.4$, $p = 0.6$, $H_i = 10$, $y_i = -1$)

Endogenous Network Formation. Equation (13) applies only when tie formation is based only on social utility and excludes the credit score ($\alpha = 0$). We now consider the case wherein consumer utility includes credit score. We construct the utility of a borrower similar to §3.2. $P_i$ is the firm's assessment of borrower $i$'s probability of repayment, which we may take as the posterior probability that $i$ is a high type. The lending rate for borrower $i$ is again given by $r_i = (1 - P_i)/P_i$. Because the lending rate enters the utility additively through $-\alpha_{x_i} \log(r_i)$, we have

$$U_i = \sum_{ij \in G} \big(m_{ij} - \mathbb{1}\{x_i \neq x_j\}\big) + \alpha_{x_i} R_i, \tag{14}$$

where $R_i \equiv \log(P_i/(1 - P_i))$ is the log odds of repayment. A higher $R_i$ implies a lower risk of extending credit to an individual. Furthermore, the parameter $\alpha_{x_i}$ calibrates the importance of improving access to financing. Note that this formulation allows low and high types to have two different levels of financial need. When $\alpha_h < \alpha_l$, high types' utility is less dependent on improving financing compared to the low types. When $\alpha_h = \alpha_l$, both types have identical financial needs. This exposition mirrors our continuous-type model, except that different types may weigh financial concerns (represented by $R_i$) differently when forming ties.

Let consumers choose tie formation rules before the meeting process. Intuitively, given the network-based score, consumers will be more selective towards low types and less selective towards high types. Because of the simplicity of the discrete-type model, the friendship rules we allow are general and flexible. More specifically, two high types will continue to form a tie with probability 1 after they meet. As to the friendship between low types, a low type $i$ will set a threshold $\gamma_i$ and accept another low type $j$, iff

$$m_{ij} - \gamma_i > 0.$$

Because friendships are formed based on mutual consent, a friendship between a high and low type can only be formed when the high type accepts friendship. A high type $i$ will accept a low type $j$, iff

$$m_{ij} - \beta_i > 0.$$

As in the continuous case, social credit scoring makes consumers wary of forming ties with low types. In the discrete case low and high types are allowed to differ

in their need for financing, and low types face discrimination or social rejection from both low and high types. This result is interesting since discrimination is often thought to take place between groups or is believed to be exercised by one group on another. Interestingly, within-group discrimination arises endogenously with the use of the network-based scoring for the low types, in addition to the more common between-group discrimination. Within-group discrimination may make the surviving within-group ties more valuable, as we will see next in Lemma 4.

We define a symmetric profile characterized by two thresholds, i.e., $(\gamma, \beta)$, where $\gamma_i = \gamma$ for all low type $i$ and $\beta_i = \beta$ for all high type $i$. Let $(\gamma, \beta, \gamma_i)$ denote a symmetric profile except for possible deviation of a low type $i$. Let $\mathbb{E}(U_i \mid l, \gamma, \beta, \gamma_i)$ represent the expected utility before the meeting process for a low-type individual $i$

$$\mathbb{E}(U_i \mid l, \gamma, \beta, \gamma_i) = \mathbb{E}\Big[\sum_{j:\, ij \in G} \big(m_{ij} - \mathbb{1}\{x_i \neq x_j\}\big) \,\Big|\, l, \gamma, \beta, \gamma_i\Big] + \alpha_{x_i = l}\, \mathbb{E}\big[R_i(\gamma, \beta) \mid l, \gamma, \beta, \gamma_i\big],$$

where the lender's posterior assessment is $P_i(\gamma, \beta) = \Pr(x_i = h \mid \mathbf{y}_i, \gamma, \beta)$, consistent with the profile. Similarly, $\mathbb{E}(U_i \mid h, \gamma, \beta, \beta_i)$ is the corresponding expected utility of a high type. Using this utility formulation, we first lay out the lender's posterior about consumer types in Lemma 4.

Lemma 4. Let $(\gamma, \beta)$ be the symmetric criterion, $p_\gamma \equiv \Pr(m_{ij} > \gamma)$ the probability of two $l$-types forming a tie, and $p_\beta \equiv \Pr(m_{ij} > \beta)$ be the probability of a tie formation between $h$ and $l$ types. Then the posterior probability of $i$ being high type is

$$\Pr(x_i = h \mid \mathbf{y}_i, \gamma, \beta) = \left[ 1 + \left(\frac{\varepsilon}{1 - \varepsilon}\right)^{y_i} \left(\frac{\varepsilon p_\beta + (1 - \varepsilon) p_\gamma}{\varepsilon + (1 - \varepsilon) p_\beta}\right)^{L_i} \left(\frac{\varepsilon p_\gamma + (1 - \varepsilon) p_\beta}{\varepsilon p_\beta + (1 - \varepsilon)}\right)^{H_i} e^{(1/2) N (1 - p_\gamma)} \right]^{-1}, \tag{15}$$

where $H_i$ is the number of friends with high signal, and $L_i$ is the number of friends with low signal.

Lemma 4 presents a slightly different result compared with Lemma 3 in decomposing the contributions of high and low signals. When consumers form ties endogenously, the probability of a favorable risk assessment, $P_i(\gamma, \beta)$ (or the corresponding $R_i(\gamma, \beta)$), is increasing in the number of high signals (i.e., $H_i$) for any level of $p_\gamma$. By contrast, $R_i(\gamma, \beta)$ increases in the number of friends with low signals (i.e., $L_i$) only if $p_\gamma$ is sufficiently small,¹² and decreases in $L_i$ otherwise. In other words, when $l$-types are very selective in forming ties among themselves ($p_\gamma$ low), then in-group ties help to achieve a more favorable assessment from the firm, as low types have fewer ties than high types and a large friendship circle becomes a conspicuous signal, suggesting that one is more likely to be an $h$-type. That is the reason low-type signals can increase the high type perception, $P_i(\gamma, \beta)$. Yet when low types are less selective towards other own types, the negative signal begins to dominate the positive impact from size of social circle and $P_i$ decreases in $L_i$.

¹² Precisely, when $p_\gamma < p_\beta + (\varepsilon/(1 - \varepsilon))(1 - p_\beta)$.

We now turn to the impact of how selective low types are in forming ties among themselves, characterized by selection rule $\gamma$. $R_i(\gamma, \beta)$ is not always decreasing in $L_i$. In particular, we can define a value $\bar{\gamma}(\beta)$ such that the expected effect of an additional low type friend on $R_i(\gamma, \beta)$ is positive, iff $\gamma > \bar{\gamma}(\beta)$. Formally, $\bar{\gamma}$ can be defined by

$$\left(\frac{\varepsilon p_\beta + (1 - \varepsilon) p_{\bar{\gamma}}}{\varepsilon + (1 - \varepsilon) p_\beta}\right)^{1 - \varepsilon} \left(\frac{\varepsilon p_{\bar{\gamma}} + (1 - \varepsilon) p_\beta}{\varepsilon p_\beta + (1 - \varepsilon)}\right)^{\varepsilon} = 1.$$

It can be easily shown that $0 < \bar{\gamma}(\beta) < \beta$. We show, in detail, how a consumer's odds of a favorable risk assessment vary with respect to the selectivity of $l$-types in Lemma 5.

Lemma 5. The expected log odds for a low type under a common tie formation criterion $(\gamma, \beta)$, $\mathbb{E}[R_i(\gamma, \beta) \mid l, \gamma, \beta]$, is strictly quasi-concave in $\gamma$ and achieves its maximum at $\bar{\gamma}(\beta)$. Furthermore, $0 < \bar{\gamma}(\beta) < \beta$, i.e., the selectivity among low types which results in the most favorable risk assessment for a low type, is lower than the selectivity of high types towards the low types.

Figure 3. Expected Log Odds of Repayment vs. Selectivity. Vertical axis: $\mathbb{E}(R_i(\gamma, \beta) \mid l, \gamma, \beta)$; horizontal axis: $\gamma$. Note. $\varepsilon = 0.2$, $f \sim \Gamma(3, 2)$, $\beta = 5$.

Figure 3 plots a numerical example for the expected log odds of repayment as a function of $\gamma$. Note that very high or very low levels of within-group selectivity

result in lower expected odds, whereas medium levels of selectivity among low types yield the most favorable risk assessment for them. The inverse U-curve relationship stems from two competing forces that shape low-type borrowers' chances of receiving a loan. As the level of selectivity begins to increase from zero, the expected assessment initially improves. Consumers benefit from disassociating themselves from $l$-types, thus improving the appearance of being an $h$-type. As selectivity increases further, however, a second and competing effect starts to dominate: Consumers' ego-networks begin to shrink extensively. Recall that the size of a borrower's network becomes a conspicuous signal of her type when consumers can form ties endogenously. Extreme selectivity leads to a smaller number of ties and so reveals the true low type of a borrower, thus reducing her chances of a favorable credit assessment.

Lemma 6. The expected log odds for a low type is strictly decreasing in $\beta$ for $\gamma < \beta$. Higher levels of selectivity of high types towards low types reduce the chances of a favorable assessment for low types.

The lemma states that, unlike the within-group exclusion that helps low types to some degree, between-type exclusion strictly reduces their chances of improving their financial outcomes. As high types exclude lower types from their networks, the latter's chances of a favorable assessment from the firm decrease, resulting in further hardship for this segment.

We will look for a symmetric equilibrium where no consumers have ex-ante incentive to deviate, and the company's posterior is consistent with their equilibrium behaviors. More precisely, $(\gamma^*, \beta^*)$ is a symmetric equilibrium if, for all $i$, $\mathbb{E}(U_i \mid l, \gamma^*, \beta^*, \gamma_i)$ (or $\mathbb{E}(U_i \mid h, \gamma^*, \beta^*, \beta_i)$, depending on $i$'s type) is maximized by $\gamma_i = \gamma^*$ (or $\beta_i = \beta^*$). While ensuring that there will be no unilateral deviation, a Nash equilibrium in social networks does not necessarily rule out mutual improvement in the utility of consumers. For example, a very high acceptance criterion such as $\gamma^* = \infty$ can always be part of an equilibrium because if no $l$-type accepts another $l$-type, an $l$-type would have no incentive for unilaterally deviating from this threshold. We remove "unintuitive" equilibria similar to the one described from consideration. Formally, we will not consider tie formation criteria $(\gamma^*, \beta^*)$ an equilibrium if there is another profile $(\gamma^{**}, \beta^*)$ such that (i) low types are better off, and (ii) given that high types choose $\beta^*$ and every other low type chooses $\gamma^{**}$, a low type is willing to set her criterion $\gamma^{**}$ as well. Similarly, we do not consider $(\gamma^*, \beta^*)$ an equilibrium if there is a profile $(\gamma^*, \beta^{**})$ with unintuitive properties alike.

Note that from Lemma 5, for any equilibrium, $\gamma^* < \beta^*$ should hold. In words, when both low and high types need financing, regardless of how dire the needs of the low types are (i.e., independent of the value of $\alpha_l$), low types will face within and between group exclusion. More important, since high types are more successful in tie formation, they can afford to be selective in forming friendships. The low types, by contrast, cannot be picky choosers: If they set the friendship threshold too high, they find themselves on the downhill side of the expected log odds curve (Figure 3). They would achieve a higher score and higher social utility by being less selective. As a result, the within-group discrimination against low types is always lower than the between-group discrimination against them. This result is formally stated in Proposition 4.

Proposition 4. Suppose $\alpha_h, \alpha_l > 0$. In any symmetric equilibrium $(\gamma^*, \beta^*)$, we have $0 < \gamma^* < \bar{\gamma}(\beta^*)$ and $\beta^* > 1$, i.e., when both types gain utility from improving their credit scores, the within-group discrimination among low types is always lower than the between-group discrimination against them.

In summary, two forces influence the network-based score in equilibrium to be more or less diagnostic for detecting a low type. Compared with the scenario before people react, higher exclusion among low types makes social network-based scoring less powerful, by Lemma 5. Similarly, higher levels of exclusion on low types by high types increase the accuracy of the scores by Lemma 6.

4.2. Special Case: Lower Financing Needs for High Types
Until now, we have focused on an environment where the high types need financing. In reality, it is often the case that the need for financing (i.e., obtaining credit or a loan) is markedly more severe for low types. To address this possibility, we provide the outcomes from the special case when $\alpha_h = 0$. Note that, by continuity, this implies that similar results would hold if $\alpha_h$ is a very small positive number. Note also that when $\alpha_h = 0$, $\beta$ is no longer material, and high types form a tie with low types only when $m_{ij} > 1$ (i.e., $\beta = 1$).

Proposition 5. When $\alpha_h = 0$, there exists a unique equilibrium among low types such that $0 < \gamma^* < \bar{\gamma}(1)$. When only low types gain utility from improving their credit scores, there exists within-group discrimination among low types in equilibrium. This discrimination level is lower than the preference of high types to avoid forming relationships with low types due to mere homophily.

Proposition 5 suggests that when high types put no or very little weight on access to financing, they may reject many social ties with low types due to homophily. In addition, due to financial concerns, $l$-type consumers are systematically excluded even from the networks of others similar to them. Put differently, existing financial inequality breeds within-group discrimination and social isolation among those of lower type and greater need.
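The posterior in Lemma 4 and the sign condition behind $\bar{\gamma}(\beta)$ can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the tie probabilities $p_\gamma$ and $p_\beta$ are passed in directly instead of being derived from a match-value density $f$, and the last function simply evaluates the expected change in log odds from one additional low-type friend, whose signal is low with probability $1 - \varepsilon$ and high with probability $\varepsilon$.

```python
import math


def signal_ratios(eps, p_gamma, p_beta):
    """Likelihood ratios (low type vs. high type) per low- and high-signal friend."""
    low_ratio = (eps * p_beta + (1 - eps) * p_gamma) / (eps + (1 - eps) * p_beta)
    high_ratio = (eps * p_gamma + (1 - eps) * p_beta) / (eps * p_beta + (1 - eps))
    return low_ratio, high_ratio


def posterior_high(y, H, L, eps, p_gamma, p_beta, N):
    """Posterior Pr(x_i = h | signals), following the form of Equation (15).

    y is i's own signal (+1 or -1); H and L count friends with high and low signals.
    """
    low_ratio, high_ratio = signal_ratios(eps, p_gamma, p_beta)
    odds_low = ((eps / (1 - eps)) ** y
                * low_ratio ** L
                * high_ratio ** H
                * math.exp(0.5 * N * (1 - p_gamma)))
    return 1.0 / (1.0 + odds_low)


def expected_low_friend_effect(eps, p_gamma, p_beta):
    """Expected change in the log odds R_i from one more low-type friend."""
    low_ratio, high_ratio = signal_ratios(eps, p_gamma, p_beta)
    return -((1 - eps) * math.log(low_ratio) + eps * math.log(high_ratio))
```

Setting `p_gamma = 1` and `p_beta = p` recovers the form of Lemma 3; for example, `posterior_high(-1, 10, 10, 0.4, 1.0, 0.6, 20)` equals 0.4 up to rounding, because ten high and ten low friend signals cancel and only the consumer's own low signal remains. The sign of `expected_low_friend_effect` switches where `p_gamma` crosses the tie probability associated with $\bar{\gamma}(\beta)$.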

4.3. Explicit Discrimination Against Low Types
We have shown how strategic discrimination against low types may emerge endogenously even in the presence of nonstrategic homophily among low types. To extend the discussion on discrimination, we analyze an environment with exogenous discrimination against $l$-types. To formally express such discrimination, we construct the utility for $i$ of becoming friends with $j$ in a manner similar to but different from the specification in Equation (12)

$$m_{ij} - \mathbb{1}\{x_j = l\}.$$

Keeping the discrete matching formulation with this slight modification, the probability that two $h$-type individuals will become friends conditional on meeting is 1 and the probability that any other type of pairs will become friends is $p_1 \equiv \Pr(m_{ij} > 1) < 1$. Social utility is penalized whenever one becomes friends with an individual who is an $l$-type.

Parallel to Lemma 3, the following lemma gives the posterior before consumers strategically form their social ties to obtain better network-based scores. Note that mathematically the following lemma is a special case of Lemma 4 where $p_\gamma = p_\beta = p_1$.

Lemma 7. Let $p_1 \equiv \Pr(m_{ij} > 1)$ be the probability of formation of a tie with at least one low type. Then,

$$\Pr(x_i = h \mid \mathbf{y}_i) = \left[ 1 + \left(\frac{\varepsilon}{1 - \varepsilon}\right)^{y_i} \left(\frac{p_1}{\varepsilon + (1 - \varepsilon) p_1}\right)^{L_i} \left(\frac{p_1}{\varepsilon p_1 + (1 - \varepsilon)}\right)^{H_i} e^{(1/2) N (1 - p_1)} \right]^{-1}, \tag{16}$$

where $H_i$ is the number of friends with a high signal, and $L_i$ is the number of friends with a low signal.

The lemma says that having a friend with a low signal actually improves one's score. When explicit discrimination is present, the expected number of friends varies for each type: For a high type, the expected degree is $\tfrac{1}{2} S\nu (1 + p)$, whereas for a low type it is $S\nu p$. Similar to the endogenous rise of discrimination, a larger social network is a conspicuous signal. A consumer with a larger network emits a stronger signal that she is a high type. Because in expectation low types have a smaller social circle, any tie becomes a signal of being high type.

What happens when both exogenous discrimination and endogenous tie formation are at work? Lemma 7 implies that consumers will be less selective towards low types in an attempt to obtain better scores. Similar to the thresholds we defined for the homophily case, we let low types choose a criterion $\gamma_i \le 1$ towards their same type, and let high types choose $\beta_i \le 1$ towards low types. High types continue to form ties with probability 1 upon meeting. A tie between two different types forms only when the high type accepts the low type. It is not difficult to see that Lemmas 5 and 6 can be stated here without change. Furthermore, a result can be derived that corresponds to Proposition 4.

Proposition 6. When low types are exogenously discriminated against and $\alpha_l, \alpha_h > 0$, in a symmetric equilibrium, $\bar{\gamma}(\beta^*) < \gamma^* < 1$ and $\beta^* < 1$.

5. Effort to Become a High Type
Our results thus far have relied on the assumption that consumers are endowed with types that cannot be changed. In other words, we assumed that there is no social mobility. Although some type indicators (e.g., family, race, birthplace, country of origin) cannot be altered, other potential indicators, such as occupation or financial discipline, can be improved if low types exert effort (e.g., by investing in education). In this section, we extend our discussion to allow for this possibility. An array of factors may force $l$-type consumers to exert effort, but we will focus on factors endogenous to tie formation such as the reduction of borrowing costs and the threat of social exclusion.

We model the mechanism in the following fashion. Consider a friends network $G$ among $l$ and $h$ type consumers. Let $G_l$ denote the subnetwork among the low types. Furthermore, let $H_i$ denote the number of $h$-type contacts of a low-type $i$, which are collectively represented with the vector $\mathbf{H}$ for all of the low types. Similarly, let $L_i$ denote the number of $l$-type contacts of a low-type $i$. Each low-type consumer may exert effort $e_i \ge 0$ such that, with probability $e_i$, she will become a high type. Note that the effort therefore projects types of possible future contacts. We assume that given the network and parameters of our model, $e_i \le 1$ for all low-type $i$. High-type consumers exert zero effort and remain high types.

The utility that a low-type individual $i$ derives from exerting effort $e_i$ is composed of two parts

$$U_i(\mathbf{e}, G) = \sum_{j:\, ij \in G} \big\{ m_{ij} - \mathbb{1}\{x_j = l\}(1 - e_j) \big\} + u_i, \tag{17}$$

where

$$u_i = a e_i - b \Big[ \frac{e_i}{2} - \phi \Big( H_i + \sum_{j:\, ij \in G,\, x_j = l} e_j \Big) \Big] e_i. \tag{18}$$

The term in curly brackets in Equation (17) captures consumer $i$'s expected social utility under the assumption of explicit discrimination (§4.3) and exertion of own and friends' effort. Given the effort of a friend $e_j$, there is $1 - e_j$ probability that $j$ will remain a low type, in which case $i$'s utility from forming ties with $j$ will be discounted by a unit normalized to 1.

The term $u_i$ expresses the nonsocial benefits and costs of exerting effort. First, term $a e_i$ captures the expected intrinsic benefits of becoming a high type. Second, the

cost of effort is captured with the marginal cost $b e_i/2$ that is increasing in effort. Third, under social network-based scoring, a (potential) high-type friend j has a positive effect on i’s credit score and thus reduces i’s financing burden. We formally express this network effect by allowing the marginal cost of effort for i to decrease in the number of high-type friends she has and in the efforts of her low-type friends to become high types, at rate $\beta > 0$. Alternatively, $b\beta\bigl(H_i + \sum_{ij \in G} 1\{x_j = l\}\, e_j\bigr) e_i$ can be thought of as an interaction term, representing how the return to one’s own effort ($e_i$) is expected to be amplified by the number of friends one expects will be considered high type. Some investors, for instance, may prefer friends who are also invited to participate in exclusive investment opportunities (Bursztyn et al. 2014). In a very different setting, one is likely to gain admission to an exclusive bar or dance club if oneself and the rest of one’s party are attractively dressed.

It is important to make two notes here. First, the derivation of the functional form of $u_i$ is a reduced-form approach to motivate complementarity between one’s effort and the effort of her friends. It is possible to derive this form of complementarity based on the results provided in the earlier sections. (In the online appendix (available as supplemental material at http://dx.doi.org/10.1287/mksc.2015.0949), we offer a more detailed description of how Equation (18) can be derived through this route.) As demonstrated in §4, under network-based credit scoring with non-zero financing needs for both types, low types will face within-group and between-group discrimination. Under such pressure, l-type consumers would exert effort to increase their social and credit scoring utility from friendships. The benefits of exerting effort depend on the expected number of low- and high-type friends.

Second, it is possible to consider alternate specifications of social utility. For instance, we could also investigate an environment with pure homophily instead of discrimination, in which case Equation (17) would be replaced with

$$U_i(e, G) = \sum_{j:\, ij \in G} \Bigl\{ m_{ij} - e_i\, 1\{x_j = l\}(1 - e_j) - (1 - e_i)\bigl[1\{x_j = h\} + 1\{x_j = l\}\, e_j\bigr] \Bigr\} + u_i. \tag{19}$$

In an environment with homophily, consumer i will become a high type with probability $e_i$, in which case there will be a disutility for a tie with consumer j who, after exerting effort $e_j$, remains a low type (which happens with probability $1 - e_j$). With probability $1 - e_i$, consumer i will remain a low type, in which case she will face a disutility from ties with high types (including low types who become high types after exerting effort $e_j$).

Next, given the utility form in (18), we will first derive the optimal effort level in a given network.

5.1. Effort in an Exogenous Network
We are interested in the Nash equilibrium under which consumers simultaneously choose their efforts when the network is exogenously given. Proposition 7 summarizes the optimal level of effort for a consumer conditional on her social network, following Ballester et al. (2006).

Proposition 7. Let $A_l$ be the sociomatrix (i.e., the adjacency matrix) of $G_l$.
(i) Under a discriminating social utility, if the largest-magnitude eigenvalue of $A_l$ is smaller than $\beta^{-1}$, then the equilibrium effort is

$$e^{*} = (I - \beta A_l)^{-1}(ab^{-1} + \beta H) = (I + \beta A_l + \beta^{2} A_l^{2} + \cdots)(ab^{-1} + \beta H). \tag{20}$$

(ii) Under a homophilic social utility, if the largest-magnitude eigenvalue of $A_l$ is smaller than $(2b^{-1} + \beta)^{-1}$, the equilibrium effort is

$$e^{*} = [I - (2b^{-1} + \beta)A_l]^{-1}[(a + H - L)b^{-1} + \beta H] = [I + (2b^{-1} + \beta)A_l + (2b^{-1} + \beta)^{2} A_l^{2} + \cdots] \cdot [(a + H - L)b^{-1} + \beta H]. \tag{21}$$

Proposition 7 states that the effort exerted by consumers to improve their score relies on several factors. A discriminatory environment and an environment with homophily differ in the role of the low types in inducing effort. In both environments, a consumer with a higher number of high-type friends is likely to exert more effort, as her overall cost of borrowing is lower. In an environment with discrimination, if two l-type consumers are connected to the same number of h-type friends, the one with a higher number of l-type friends is incentivized to exert more effort. This is perhaps surprising, as sufficiently high within-group connectivity can be a stronger motivator of effort. By contrast, in homophily, increasing proportions of low-type friends can reduce effort due to enhanced social utility when a consumer with low-type friends remains low type with low effort.

The expression for the equilibrium level of effort given in Equations (20) and (21) is a form of Bonacich centrality. The effort exerted by an agent to improve her credit score is proportional to her Bonacich centrality measure, which is the “summed connections to others, weighted by their centralities” (Bonacich 1987, p. 1172). With a discriminating social utility, a consumer who is at the center of a social network is likely to be exposed to higher positive network effects, and therefore may exert greater effort. As a result, consumers who are more central in the network are more prone to social mobility when there are complementarities. In an environment with pure homophily, there will be two conflicting forces determining centrality and social
mobility relationships. First, being central in a network of high types and low types who exert effort can increase a consumer’s chances of social mobility. Second, if a low-type consumer is central among other low types who exert little effort, she will reduce her effort to “fit” and be similar to her network to enhance her social utility. Therefore, in tie formation based on homophily, it is possible for central low types to exert low effort leading to permanent low class membership and financial hardship.

5.2. Effort with Endogenous Network Formation Among Low Types Under Discriminating Utility
As we have specified in (17)–(19), the friendship utility of a friend of i depends on the effort that i will exert. Hence the effort of i plays an important role in her friends’ network formation. Moreover, in the last section, we saw that i’s effort depends on her position in the network. This mutual dependence between the network position and effort suggests the possibility of multiple stable situations. With discriminating social utility, for example, in one society, people may exert low effort, and as a result, may become sparsely connected. This in turn gives little incentive for them to exert effort. Conversely, in another society, people may exert high effort and thus may become more densely connected, reinforcing their high-effort behavior.

To further explore how effort mitigates the likelihood of exclusion, we consider a two-stage game under the discrimination environment. In the first stage, consumers choose friends, and friendships are formed bilaterally. In the second stage, consumers exert effort. Let $e^{*}(G)$ be the Nash effort for a given network G, which is characterized in Proposition 7. The first-stage reduced form utility for i depends on G only

$$U_i(e^{*}(G), G).$$

We look for pairwise-stable networks G under U. G is pairwise stable if (i) for any $ij \in G$, we have both $U_i(G) > U_i(G - ij)$ and $U_j(G) > U_j(G - ij)$; (ii) for any $ij \notin G$, $U_i(G) \geq U_i(G + ij)$ or $U_j(G) \geq U_j(G + ij)$. Example 3 provides an application of different stability outcomes in equilibrium.

Example 3. Consider a society with four low-type consumers and explicit discrimination, and assume that $a = 1$, $b = 5$, and $m_{ij} = 1/2$ for all i, j. Let $\beta = 1/5$. It can be easily verified that the empty network and the complete network are pairwise stable. For the empty network, each person exerts effort $1/5$ and obtains utility of $1/10$. For the complete network, each person exerts effort $1/2$ and has utility $5/8$.

The example demonstrates that the empty network is pairwise stable because everyone exerts very low effort. A single link between a pair will not generate a sufficiently large change. The disutility of friendship with a low type (which is normalized to 1) prevents any pair from becoming friends. Moreover, a complete network is pairwise stable because everyone exerts reasonable effort. The effort reduces the disutility of friendship between low types; the friendship utility between any pair is exactly zero. Breaking any one link increases the costs of effort for the pair; thus, they will decrease their efforts. This leads to higher costs for their friends, and eventually everyone’s effort will decrease. As a result, everyone receives less utility from the friendship and effort.

Overall, the example suggests that the network structure in different societies may facilitate social pressure to exert effort at different rates. In particular, in societies where network structure is sparse, social pressure is expected to be less effective and social mobility may remain limited. By contrast, in denser societies, social pressure can be more effective, motivating higher levels of social mobility. The difference suggests that network-based scoring practices are expected to reach different levels of success in different societies, and that the performance is conditional on the network structure of society.

6. Extensions

6.1. Uncertainty About Friends’ Types
In our main model, the underlying assumption was that upon meeting, consumers learn about each others’ types with certainty. In reality, types may be observed with some noise. Consider the case wherein consumers meet others but observe their types imperfectly. Let consumer i observe a signal of $x_j$ upon meeting with j, which is correct with probability $1 - \varepsilon$ with $0 < \varepsilon < 1/2$. This implies that the added utility from homophily relies on how the uncertainty about the other’s type is resolved: Expected social utility is $m_{ij} - \varepsilon$ if the signal is the same as one’s own type, and $m_{ij} - 1 + \varepsilon$ otherwise. Respectively, probabilities $p_{\varepsilon} \equiv P(m_{ij} > \varepsilon)$ and $p_{1-\varepsilon} \equiv P(m_{ij} > 1 - \varepsilon)$ define how likely two consumers are to become friends upon meeting.

Compared with the benchmark model, the added uncertainty implies that ties will be less informative for the firm to predict a consumer’s type. To see this, first note that under this formulation, the probability that two consumers of the same type will form a tie upon meeting is

$$q_s \equiv (1 - \varepsilon)^{2} p_{\varepsilon} + \bigl(1 - (1 - \varepsilon)^{2}\bigr) p_{1-\varepsilon}, \tag{22}$$

and that two consumers of opposite types will form a tie is

$$q_d \equiv \varepsilon^{2} p_{\varepsilon} + (1 - \varepsilon^{2}) p_{1-\varepsilon}. \tag{23}$$

Using these probabilities, we can formulate how the firm will assess a borrower’s type to be high as given in Lemma 8.¹³

¹³ The derivation of Lemma 8 follows the derivation of Lemma 4.
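As a numerical illustration of (22) and (23), the sketch below evaluates the two tie probabilities for one noise level. It is not part of the original analysis: the uniform match-value distribution on [0, 2], the value 0.1 for the signal error, and the function names `survival_uniform` and `tie_probabilities` are all assumptions chosen for illustration.

```python
# Sketch of the tie probabilities q_s and q_d in Equations (22) and (23).
# Assumption (not from the paper): match values m_ij are Uniform[0, 2],
# so P(m_ij > t) = 1 - t/2 for 0 <= t <= 2.

def survival_uniform(t, upper=2.0):
    """P(m_ij > t) under the hypothetical Uniform[0, upper] distribution."""
    return min(1.0, max(0.0, 1.0 - t / upper))


def tie_probabilities(eps, survival=survival_uniform):
    """Return (q_s, q_d) for a signal error probability eps in (0, 1/2)."""
    if not 0.0 < eps < 0.5:
        raise ValueError("eps must lie strictly between 0 and 1/2")
    p_eps = survival(eps)                # both sides see a same-type signal
    p_one_minus_eps = survival(1 - eps)  # at least one side sees a different-type signal
    both_correct = (1 - eps) ** 2        # same-type pair: both signals correct
    both_wrong = eps ** 2                # opposite-type pair: both signals wrong
    q_s = both_correct * p_eps + (1 - both_correct) * p_one_minus_eps
    q_d = both_wrong * p_eps + (1 - both_wrong) * p_one_minus_eps
    return q_s, q_d


eps = 0.1
q_s, q_d = tie_probabilities(eps)
p_1 = survival_uniform(1.0)  # benchmark cross-type tie probability without noise
print(f"p_1 = {p_1:.3f}, q_d = {q_d:.3f}, q_s = {q_s:.3f}")
```

Under these assumed inputs the ordering $p_1 < q_d < q_s < 1$ holds: noise raises the chance of a cross-type tie and lowers the chance of a same-type tie relative to the benchmark, which is why ties become less informative.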
Lemma 8. When consumers learn about each others’ types with uncertainty,

$$\Pr(x_i = h \mid y_i) = \left[ 1 + \left(\frac{\rho}{1 - \rho}\right)^{y_i} \left(\frac{\rho q_d + (1 - \rho) q_s}{\rho q_s + (1 - \rho) q_d}\right)^{L_i} \cdot \left(\frac{\rho q_s + (1 - \rho) q_d}{\rho q_d + (1 - \rho) q_s}\right)^{H_i} \right]^{-1},$$

where $H_i$ is the number of friends with high signal, and $L_i$ is the number of friends with low signal.

We are interested in how the presence of noise in detecting each other’s true types in social relationships may influence the firm’s ability to rely on social credit scores. We compare Lemma 8 with Lemma 3. Since $p_1 < q_d < q_s < 1$, $1 < (\rho q_d + (1 - \rho) q_s)/(\rho q_s + (1 - \rho) q_d) < (\rho p_1 + (1 - \rho))/(\rho + (1 - \rho) p_1)$ and $(\rho + (1 - \rho) p_1)/(\rho p_1 + (1 - \rho)) < (\rho q_s + (1 - \rho) q_d)/(\rho q_d + (1 - \rho) q_s) < 1$. In words, signals from contacts carry less weight in forming beliefs about a consumer’s type when types cannot be perfectly observed in friendship.

There are two observations related to this finding. First, the level of information sharing between consumers can change the appropriateness of a social network for credit scoring. For example, if an online network allows consumers to frequently communicate and exchange in-depth information, this may positively influence the efficiency of credit assessment by reducing the uncertainty about friends’ types. Second, the ability of peers to observe each other’s types may correlate with the characteristics of the network, including tie strength. For example, parameter $\varepsilon$ could reflect the strength of ties correlating with the ability to convey complex or subtle information (Van den Bulte and Wuyts 2007, pp. 71–72) and hence with one’s ability to observe a friend’s type. Next, in §6.2, we discuss this in detail.

6.2. Friendship Formation and Strength of Ties
In §6.1, we maintained the assumption that all relationships carry equal information and pointed out that the informativeness of a link may relate to tie strength. We will adjust the earlier model slightly to extend the earlier discussion.

Specifically, we assume that consumers can form weak and strong ties, and that they learn about others’ type with certainty only if they have strong ties with them. After meeting, a match value $m_{ij} > 0$ and the tie type are randomly determined. If the tie is strong, consumers obtain the utility $m_{ij} - 1\{x_i \neq x_j\}$ by forming a friendship. If the tie is weak, types remain unknown, and the social utility of forming a tie is $m_{ij} - \delta$. Parameter $\delta$ captures the disutility from forming a weak tie.

Because weak ties do not carry information about the type or the type difference between the ego and the friend, a firm cannot use them to update its posterior belief about a consumer’s type. Only the strong ties will reveal information about a contact’s type and become eligible for the firm to use to determine the social score.

The general implication is straightforward. Because strong ties are more homophilous than weak ties and since they provide a greater ability to learn about one’s contacts, the accuracy of social scoring increases with the relative prevalence of strong versus weak ties.

6.3. Effort to Enhance Probability of Meeting High Types
In §5, the model was built such that the low-type consumers exerted effort to climb social ladders by improving their type. Under some circumstances, consumers cannot change their type but can exert effort to increase the probability of meeting high types. Networking is an example of such directed effort. In this section we explore this possibility, which also allows us to endogenize the probability of meeting between two consumers.

We use the settings of the discrete-type model in §4 and allow individual i to choose an effort level $e_i$. Conditional on the effort exerted, the individual is likely to meet another person randomly with probability $(M/S)e_i$, where M is a constant that calibrates the chance of meeting another person proportional to the effort exerted in a society of size S. A meeting between i and j happens when either of the two “runs into” the other. Suppose a common effort e is exerted by everyone but i. Then the expected number of meetings for i becomes

$$S\left[1 - \left(1 - \frac{M}{S} e_i\right)\left(1 - \frac{M}{S} e\right)\right] = \left(e_i + e - \frac{e_i e M}{S}\right) M.$$

When $S \to +\infty$, the expected number of meetings increases to $(e_i + e)M$.

First consider the scenario of exogenous tie formation where an individual’s utility depends only on the social utility from friendships. Recall that, upon meeting with j, i always forms a tie if j is of the same type, and forms a tie if and only if $m_{ij} > 1$ when j is of the different type. Let f be the density of the matching value distribution. The expected social utility for i, given that a symmetric effort e is used except for possible deviation of i to $e_i$, can be derived from

$$\mathbb{E}\Bigl[\sum_{j:\, ij \in G} \bigl(m_{ij} - 1\{x_i \neq x_j\}\bigr) \Bigm| x_i, e, e_i\Bigr] = \frac{M}{2}(e_i + e)\left(\int_0^{\infty} t f(t)\, dt + \int_1^{\infty} (t - 1) f(t)\, dt\right).$$

Let $\Omega$ denote the term in the last parentheses. Let $\tfrac{1}{2} e_i^{2}$ be the cost of effort. The equilibrium effort is then given by

$$e^{*} = \frac{M\Omega}{2}. \tag{24}$$
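The meeting count and the equilibrium effort in (24) can be checked numerically. The following is a minimal sketch under stated assumptions rather than the authors' procedure: the Uniform[0, 2] match-value density, the values of `M`, `S`, and the effort levels, and all function names are hypothetical choices made for illustration.

```python
# Sketch of the networking-effort model in Section 6.3.
# Assumptions (not from the paper): m_ij ~ Uniform[0, 2], M = 0.4.

def expected_meetings(S, M, e_i, e):
    """S * [1 - (1 - (M/S) e_i)(1 - (M/S) e)]: expected meetings for consumer i."""
    p_i = M / S * e_i  # probability that i runs into a given person
    p_o = M / S * e    # probability that a given person runs into i
    return S * (1.0 - (1.0 - p_i) * (1.0 - p_o))


def midpoint_integral(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def omega(density, upper):
    """Term in the last parentheses: same-type part plus different-type part."""
    same_type = midpoint_integral(lambda t: t * density(t), 0.0, upper)
    diff_type = midpoint_integral(lambda t: (t - 1.0) * density(t), 1.0, upper)
    return same_type + diff_type


def uniform_density(t):
    """Density of the hypothetical Uniform[0, 2] match-value distribution."""
    return 0.5


M = 0.4
Om = omega(uniform_density, 2.0)  # analytically 1 + 1/4 for this density
e_star = M * Om / 2.0             # Equation (24)


def payoff(e_i, e):
    """Expected social utility minus the quadratic effort cost e_i^2 / 2."""
    return (M / 2.0) * (e_i + e) * Om - 0.5 * e_i ** 2


# Grid search for i's best reply when everyone else plays e_star.
grid = [k / 1000 for k in range(1001)]
best_reply = max(grid, key=lambda x: payoff(x, e_star))
print(Om, e_star, best_reply)
```

With these assumed inputs, the integral term is approximately 1.25, so (24) gives an effort of approximately 0.25, and the grid search returns the same best reply. Because the payoff is linear in others' effort, the best reply does not depend on e, which is why a single first-order condition pins down the symmetric equilibrium.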
Under this common effort level, the firm’s posterior on type is again given by (13) in Lemma 3.

Next, we set this equilibrium effort level as the baseline, and compare it to that when social relationships affect financial benefits. Because the credit score introduces asymmetric desirability of low-type and high-type friends, the effort levels exerted by low types and high types will, in general, be different. In principle, the firm’s posterior needs to incorporate the difference in efforts. Here for simplicity, we focus on how effort level will differentiate between types but omit how it would affect a firm’s posterior assessment. Formally, we let the credit score enter utility additively through $\alpha_{x_i} P_i$, with $P_i$ given simply by (13). We characterize a symmetric equilibrium, by which we mean the effort pair $(e_l^{*}, e_h^{*})$ where every low type chooses $e_l^{*}$ and every high type chooses $e_h^{*}$ such that no consumer has an incentive to deviate.

The following proposition summarizes how the consumer motivation to meet others changes compared with the effort they would exert simply to maximize their utility from friendships.

Proposition 8.
(i) For $\alpha_l = \alpha_h > 0$, $e_h^{*} > e^{*} > e_l^{*}$. When both types have identical needs for financing, high types exert more effort than low types.
(ii) For $\alpha_l$ sufficiently larger than $\alpha_h$, $e_l^{*} > e_h^{*} \geq e^{*}$. When low types have higher needs for financing, they exert more effort than high types.

The proposition suggests that when l and h type consumers have identical needs for financing, high types exert higher levels of effort to increase their probability of meeting others compared to low types and compared with the effort exerted when consumers only want to maximize social utility. This is because high types have a higher marginal return on effort than the low types (i.e., are more likely to form new ties as a result of effort). As a result, independent of their financial needs, high types always exert at least as much effort when they earn utility from improving their access to credit in addition to the gains in social utility as they would for social utility alone. Low types, by contrast, have lower returns, but when high types make an effort to meet others, they also benefit from it. With some probability, a meeting will take place between a low and a high type and a friendship will be formed if $m_{ij}$ is sufficiently high.

If, on the other hand, the low types’ utility from improving their credit scores is very high ($\alpha_l$ very high), this pattern could reverse. Low types would feel an immense pressure to increase the probability of becoming friends with high types, resulting in a higher level of effort exerted by low types compared to that of high types.

Finally, an environment where the low types exert sufficiently high levels of effort could help to create a bridge between the two types, possibly reducing the social separation. Therefore, low types who have more to gain from improving their financing ($\alpha_l > \alpha_h$) could exert sufficiently great networking effort to connect the two types.

7. Conclusion

7.1. Main Insights
Increasing access to financing is important in many countries where institutions and contract enforcement are weak (e.g., Feigenberg et al. 2013, Rona-Tas and Guseva 2014). In low-income countries, in particular, part of the credit access problem stems from the fact that reliable data on financial history do not exist, are limited, costly to collect or hard to verify. In these countries, lenders tend to be very conservative in accepting borrowers’ credit applications. This, of course, makes it even harder for those who are in financial hardship to obtain credit and generate a financial track record. Group lending has proven to be a popular way to address this problem. An alternative and possible complement is to use additional available data to assess applicants’ creditworthiness. Using social data is one such option.

Motivated by the importance of consumer access to credit and by the increasing use of network-based credit scoring, we analyzed the potential implications of such practices for consumers. Our study shows that there are benefits to collecting information from a consumer’s network rather than only individualized data. Simply put, when consumers have an above average chance of interacting with others of similar creditworthiness, then network ties provide additional reliable signals about their true creditworthiness. Hence, social scoring can reduce lenders’ misgivings about engaging applicants with limited personal financial history, which include many who are economically disadvantaged and underbanked.

As these new scoring methods gain popularity, consumers may adapt their personal networks, which in turn may affect the usefulness of these scores. If one’s network can influence one’s financing chances, some consumers, particularly those in more dire need of improving their credit score, may form social ties more selectively. If all consumers behave in this manner and forming social ties requires mutual agreement, the end result of such behavior will be social fragmentation into subnetworks where consumers only connect to others who are very similar to them. Though we expect that such fragmentation and balkanization will be deemed socially undesirable by many, its implications for network scoring accuracy are not straightforward. Although there will be fewer ties conveying information about
one’s contacts useful in updating lenders’ prior beliefs, each of the ties will be more informative. We find, however, that there are situations in which social scoring is beneficial even when consumers adjust their networks. Specifically, these are: (i) consumers place sufficiently low importance on the posterior mean of the firm’s beliefs about their type (low α), (ii) high precision on individual credit scores (high c), and (iii) relatively dense network (high N).

To focus on the role of connections to consumers with different levels of financial strength in the emergence of balkanized societal structures, we introduce discrete types and discrete type matching. Not surprisingly, connections to those with high-type signals have an overall positive impact. More interesting is that the impact of connections to low-type signal consumers can be positive or negative, depending on the tie formation rules used in society. As demonstrated in Figure 3, consumers with poor financial health would prefer others like them not to be too selective but also not to be too liberal in their willingness to associate with people with poor financial health. Intuitively, as the selectivity of same-type consumers increases, the impact of negative signals received from some of the low-type friends weakens. As the selectivity increases even further, low types’ social circles will shrink such that it will be harder for them to emit a high-type signal, since size of social circles is a conspicuous signal of type. As a result, disadvantaged consumers would prefer some intermediate level of ostracism and social isolation.

In our extensions, we discuss two scenarios that may reduce the reliability of social scores. First, if consumers cannot observe their social contacts’ types perfectly upon meeting, the added noise will imply that homophily will play a lesser role in the formation of social networks. As a result, firms’ ability to detect a borrower’s type by looking at her friends will be limited. Similarly, if the network consists mainly of weak rather than strong ties, this will also reduce social scores’ diagnosticity, since strength correlates with how well consumers know each other. In both of these scenarios, contacts’ signals carry lower value to the firm in assessing the risk of a borrower.

We also consider the possibility of exerting effort in two different ways. First, we move away from the static type model and allow consumers to improve their type. We find that when there is discrimination against low types, low- and high-type contacts play a role in motivating effort, but high types, in general, have a stronger effect. In an environment with only homophily, these results hold as well, unless a consumer is highly embedded in a network with many low-type friends who exert low effort. Such consumers are not motivated to exert effort towards improving themselves and are more likely to remain a low type. Second, when types are sticky and cannot be altered, we allow consumers to exert effort to improve their chances of meeting other people. This second model shows that consumers’ networking effort will depend on their need for financing. When high and low types have comparable needs for financing, high types have higher returns on their effort of creating new ties and thus exert more effort to meet others. Because the types are revealed only after meeting, low types’ likelihood of meeting a higher type increases when high types exert effort, too. Therefore they choose to free ride on others’ efforts. This outcome reverses when low types are in dire need of financing, and they become the primary driver of meetings in society.

One possible outcome of social scoring, which is not addressed in this research, is that consumers strategically manipulate the perception of their type by trading friendships for financial access. In particular, realizing their higher financial status, high types may want to offer their friendships in exchange for monetary rewards. To model an environment wherein friendships are traded, we may need to consider several additional layers of complexity. First, rationally, traded friendships would need to be formed such that the credit scoring firm should be unable to distinguish a fake relationship from a true friendship. Otherwise, low types would have no incentive to pay for a high type’s friendship. Second, high types must be financially motivated and the benefit from forming a friendship with a low type must exceed the losses from less favorable risk assessment. Third, trading friendships must be rare enough that a credit-scoring firm still benefits and desires to use data from the social networks. Altogether, modeling an environment of this sort would require a fairly complicated model, which goes beyond the purposes of the current study. Despite the complication, our expectation for the findings would be fairly simple: In line with the extensions discussed in §§6.1 and 6.2, if social ties have lower informative value and homophily is diluted, social credit scores will be less diagnostic in detecting one’s true creditworthiness.

¹⁴ This is so even though FICO and other leading institutions state that income is not a part of one’s individual credit score, as it is a self-reported item of assessment.
¹⁵ http://www.creditsesame.com/about/press/consumers-who-earn-60000-or-less-have-dangerously-high-credit-usage-levels-according-to-credit-sesame/.

7.2. Implications for Public Policy
The link between credit scores and income is hard to ignore.¹⁴ It is reported that most U.S. consumers with an income under $60K have a poor credit score.¹⁵ Moreover, a significant portion of the individualized credit score calculation relies on a consumer’s existing debt level. Those with higher amounts of debt, all else
equal, are expected to have lower credit scores. With network-based assessment, it is possible for immigrants, underbanked consumers, recent college graduates, and others who do not have a credit history but who are creditworthy to signal this to lenders with higher accuracy. The benefits introduced through network-based systems may help overcome a portion of the financing problems, particularly if networks are created based on attributes correlated to financial health.

However, our analysis also raises an important concern about discrimination against already financially disadvantaged and underbanked groups. For instance, the Equal Credit Opportunity Act (ECOA) prohibits lenders from discrimination based on sex, race, color, religion, national origin or age. To the extent that some of these characteristics correlate with creditworthiness and that homophily along those dimensions correlates with homophily along levels of creditworthiness, a side-effect of social credit scoring could be discrimination in access to credit along characteristics prohibited by the ECOA (National Consumer Law Center 2014, pp. 27–29). Aside from strict legality, there is also a concern that social scoring opens an additional back door to discrimination along dimensions that many may find objectionable (Dixon and Gelman 2014, Pasquale 2015).

Matters are even more complex as our results also show that social scoring may lead consumers with low creditworthiness to prefer being discriminated against (in tie formation at least) to some moderate extent. Thus moderate levels of discrimination and social ostracism by fellow consumers may actually help rather than harm disadvantaged consumers. Also, one hitherto ignored societal benefit of social scoring is that it can motivate rather than demotivate financially disadvantaged citizens to exert greater effort to improve their creditworthiness. The financial discrimination and social exclusion implications of social credit scoring, and how they balance against its benefits, warrant attention from policy makers and researchers.

Finally, our findings here are of interest to policy makers keen on understanding the mutual interaction between social status and network structure. As noted at the outset, our mathematical analysis of credit scoring applies broadly to social status. Some people command less respect than others. Differences in status are rarely based solely on differences in true but hard-to-observe ability or character. People often use the company that others keep as a signal when assessing the respect they deserve. Our analysis of the benefits and challenges of social credit scoring, including improved diagnosticity paired with the risk of unwitting discrimination and the seeming paradox of optimal ostracism, extends to situations wherein citizens, employees or customers are valued and accorded status based on the company they keep.

7.3. Implications for Management
To managers in the financial industry, our analysis suggests that lenders can expect to reduce their risk in the short run by incorporating network-based measures. This dovetails with new governmental policies on risk. For example, as part of the regulations by the Basel Committee on Banking Supervision, European banks have been encouraged to reduce the level of risk they undertake (Sousa et al. 2013). Regulations in the banking industry encourage U.S. financial institutions to better manage risk as well. These regulations have come at a time when big data analytics are enabling financial institutions to access larger and richer data sets. Indeed, it has been reported that social media and social network data are being used not only by start-ups but also by established and more institutionalized credit scoring firms such as Experian (Armour 2014). The trend toward using social data may prove to be useful in the post-crisis environment.

Our study also offers some insight to managers outside the financial industry who use social scoring for targeting customers when launching new products, targeting ads or designing referral programs. (i) The effectiveness of social scoring need not decrease when customers purposely adapt their networks to improve their score and their access to the benefits it entails. (ii) Marketers do not need information on the complete network. Data on the focal consumer’s immediate contacts already provide an improvement in scoring accuracy. (iii) Social scoring is likely to be most diagnostic in societies and communities (online or not) where consumers maintain many strong, rather than weak, ties. (iv) Smart marketers will go beyond generic ties and seek to leverage specific ties that correlate highly with the traits they seek in their target customers. A car manufacturer such as Audi, for instance, will benefit from focusing on Twitter connections pertaining to cars (personal communication). (v) The benefits of social scoring to the marketer are greater when the benefits of having a high score matter little to customers or at least have little impact on those with whom they choose to form ties. More generally, the benefits of social scoring are greater when they involve networks of ties that not only exhibit great homophily but also are built and maintained for intrinsic rather than extrinsic reasons. Examples of the former used in social scoring include telephone call data and kinship data (Benoit and Van den Poel 2012, Hill et al. 2006). Examples of the latter are many ties in general-purpose online social networking platforms, where linking is very easy and often occurs between casual contacts. (vi) Customers with a high number of connections (degree centrality) in an undirected network such as Facebook or LinkedIn are not necessarily the most attractive. This is not only because centrality in such networks cannot distinguish between opinion seekers
and opinion leaders (in-degree versus out-degree centrality) but also because (as our analysis shows) the most active networkers may be high-type or low-type customers, depending on whether low types value the benefits of a high consumer score more than high types. (vii) Marketers should be concerned that social customer scoring may create the impression of unfair discrimination. This is not only a legal and an ethical issue but also a commercial one. For instance, in January 2015, users of WeChat, the Chinese chat app, protested against discrimination after they were not targeted to see an ad for BMW, the luxury car maker. Some believed that the targeting algorithm involved social scoring based on those to whom the potential targets were connected (Clover 2015). Because social scoring uses inputs beyond one’s traits and history, marketers must balance improved diagnostics against actual and perceived fairness.

The insights in this paper also provide some guidance on data collection and system design. Several firms already create credit scores using social network data. One important question they face is whether the number of friends is a useful signal for measuring creditworthiness. Our study demonstrates that even when it is not a signal directly linked to one’s type, the practice of network scoring would endogenously make the number of friends a useful signal. Thus social credit scoring may shape credit assessment in its own image, i.e., help to construct the reality it is meant to describe, just as modern option theory did for valuing financial derivatives (MacKenzie 2006, MacKenzie and Millo 2003). This makes the implications of our analysis all the more important.

Supplemental Material
Supplemental material to this paper is available at http://dx.doi.org/10.1287/mksc.2015.0949.

Acknowledgments
The authors thank the senior editor, associate editor, and two anonymous referees for their comments and suggestions. The authors also benefited from comments by Lisa George, Yogesh Joshi, Upender Subramanian, Yi Zhu, and participants at the 2014 Columbia-NYU-Wharton-Yale Four School Marketing Conference, the 2014 Marketing Science Conference, the 12th ZEW Conference on Information and Communication Technologies at the University of Mannheim, the 2015 INSEAD Marketing Camp, and the 2015 CRES Strategy Conference at Washington University in St. Louis. The authors gratefully acknowledge financial support from the Rodney L. White Center for Financial Research and the Social Impact Initiative of the Wharton School of the University of Pennsylvania.

Appendix. Proofs

Proof of Proposition 1. Because once conditional on the types $x$, the signals $y$ are independent of the network, we have $\Pr(y \mid x) = \Pr(y \mid g, x)$. Using Bayes’ rule we have

$$\Pr(x \mid g, y) \propto \Pr(x)\,\Pr(g, y \mid x) = \Pr(x)\,\Pr(y \mid x)\,\Pr(g \mid x).$$

Thus

$$\Pr(x \mid g, y) \propto \prod_{i \in y} e^{-q x_i^2/2} \times \prod_{i \in y} e^{-c(y_i - x_i)^2/2} \times \prod_{ij \in g^1} e^{-(x_i - x_j)^2/2} \times \prod_{ij \in g^0:\, i, j \in y} \bigl[1 - \nu e^{-(x_i - x_j)^2/2}\bigr] \times \prod_{ij \in g^0:\, j \notin y} \Bigl(1 - \frac{\mathbb{E}(n_i \mid x_i)}{S}\Bigr). \tag{25}$$

In the expression above, $(1 - \mathbb{E}(n_i \mid x_i)/S)$ is the probability that $i$ is not friends with $j$ for some $i$ whose type is $x_i$ and some $j$ whose type is unknown. Fix some $i \in y$ and consider the term $\prod_{ij \in g^0:\, j \notin y} (1 - \mathbb{E}(n_i \mid x_i)/S)$. If $\{ij \in g^0 : j \notin y\}$ is not empty, then by our assumption on the information structure, it multiplies across everyone in the rest of the society. So its value under the limits of $S$, $\nu$, and $q$ is

$$\lim_{S \to \infty,\ \mathbb{E}(n_i \mid x_i) \to N} \Bigl(1 - \frac{\mathbb{E}(n_i \mid x_i)}{S}\Bigr)^{S - |y|} = e^{-N},$$

which is not a function of $x$ and thus does not contribute to the conditional density. Note that the rest of the terms in the right-hand side of (25) multiply across finite items. It is easy to see that as $\nu \to 0$ and $q \to 0$,

$$\Pr(x \mid g, y) \propto \prod_{i \in y} e^{-c(y_i - x_i)^2/2} \times \prod_{ij \in g^1} e^{-(x_i - x_j)^2/2}. \tag{26}$$

This implies that $\Pr(x \mid g, y)$ is a multivariate normal density $N(\mu, \Sigma)$. To find the parameters $\mu$ and $\Sigma$, all we need to do is match the coefficients. The coefficients of $x_i^2$, $x_i x_j$, and $x_i$ in the quadratic form $-\tfrac{1}{2}(x - \mu)' \Sigma^{-1} (x - \mu)$ are $-\tfrac{1}{2}(\Sigma^{-1})_{ii}$, $-(\Sigma^{-1})_{ij}$, and $(\Sigma^{-1})_{i1}\mu_1 + (\Sigma^{-1})_{i2}\mu_2 + \cdots$. The corresponding coefficients in the right-hand side of (26) are $-\tfrac{1}{2}(c + d_i)$, $\mathbf{1}\{ij \in g^1\}$, and $c y_i$. Matching them gives us the results in the Proposition.

Proof of Corollary 1. This is a special case of Proposition 1, where $i$ is fixed and $y = \{j \mid ij \in G\}$, $g^1 = \{ij \mid ij \in G\}$ and $g^0 = \{ij \mid ij \notin G,\ j \neq i\}$.

Proof of Proposition 2. Let $D$ be the diagonal matrix where $D_{ii} = c + d_i$, and $B = D^{-1}A$ where $A$ is the adjacency matrix of $g^1$. We express the covariance matrix as

$$\Sigma = (I - B)^{-1} D^{-1}.$$

Let $B_0$ denote the matrix $B$ when $c = 0$. Since $B_0$ is a stochastic matrix (i.e., each row summing up to 1), its largest-magnitude eigenvalue is 1. When $c > 0$, $B$ is non-negative and it is easy to see that

$$B < B_0.$$

By the Perron-Frobenius Theorem, we know that the largest-magnitude eigenvalue of $B$, which we denote by $\rho$, is smaller than that of $B_0$. Given that $\rho < 1$, we write

$$\Sigma = (I + B + B^2 + \cdots) D^{-1}.$$
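The matrix results in the proofs of Propositions 1 and 2 can be checked numerically on a toy network. The sketch below is illustrative only: the three-node path graph, the signal values, the choice c = 1, and all helper names are assumptions made for this example and do not come from the paper. It builds the posterior precision matrix with c + d_i on the diagonal and -1 for each tie, inverts it, and compares the result with the truncated series (I + B + B^2 + ...) D^{-1}.

```python
# Illustrative sketch: the graph, signals, and c below are assumed values.

def matmul(X, Y):
    """Plain-Python matrix product for small dense matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def invert(M):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(M)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

c = 1.0                                   # assumed signal precision
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]     # assumed path graph 0 - 1 - 2
y = [1.0, 0.0, -1.0]                      # assumed observed signals
n = len(A)
deg = [sum(row) for row in A]

# Posterior precision: c + d_i on the diagonal, -1 for each observed tie.
precision = [[(c + deg[i]) if i == j else -float(A[i][j]) for j in range(n)]
             for i in range(n)]
sigma = invert(precision)
# Matching the linear terms gives (Sigma^{-1} mu)_i = c * y_i, so mu = c * Sigma y.
mu = [c * sum(sigma[i][j] * y[j] for j in range(n)) for i in range(n)]

# Series form: Sigma = (I + B + B^2 + ...) D^{-1} with B = D^{-1} A.
B = [[A[i][j] / (c + deg[i]) for j in range(n)] for i in range(n)]
D_inv = [[1.0 / (c + deg[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
series = [[0.0] * n for _ in range(n)]
for _ in range(200):                      # truncation length is an arbitrary choice
    series = [[series[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    power = matmul(power, B)
sigma_series = matmul(series, D_inv)

print("posterior means:", mu)
print("covariance row of node 0:", sigma[0])
```

In this toy case the end nodes' posterior means are pulled halfway toward their neighbor's signal, and the covariance between node 0 and the others shrinks with network distance, which is the pattern the bound in Proposition 2 describes.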

Because for any k ≥ 1, Bk is non-negative and Bk < k , Next we turn to the matching value. We have
we have
Pr4mij 1 ij ∈ G5 = Pr4mij 5 Pr4ij ∈ g mij 5
4Bk 5ij < k 0 √
r
q Z xi +mij / −qxj2 /2

Now consider a node j whose distance from i in the subnet- −m2 /2
= mij e ij √ e dx j 0
work defined by g 1 is r4i1 j5 ≥ 1. Because A is the adjacency 2 xi −mij / i
matrix of g 1 , and there is no path between i and j whose So,
length is less than r4i1 j5, we know 4Bk 5ij = 0 for all k < r4i1 j5.

N −m2 /2 1 1
Hence an upper bound of 4I + B + B2 + · · · 5ij is S Pr4mij 1 ij ∈ G5 → √ m2ij e ij √ +p 1
2 i
+
which, with (28), implies that
k = r4i1 j5 /41 − 50
X
r
k=r4i1 j5 2 2 −m2ij /2
Pr4mij ij ∈ G5 → m e 0
ij
Proof of Lemma 1. Derivation of the lemma follows This is the density of a 3 distribution. So we have
similarly to the proof of Corollary 1. r
2
Proof of Lemma 2. Under a symmetric rule , i, and j Ɛ4mij ij ∈ G5 → 2 0 (30)

become friends, iff, they have met and mij > 4xi − xj 52 . Thus Expected social utility can be computed by summing over i’s
r expected social utility from each j in the society
Z + 2 q −qt2 /2
Ɛ4ni xi 1 5 = S e−4t−xi 5 /2 e dt Ɛ ui =
X
Pr4ij ∈ G5 Ɛ4−xj − xi ij ∈ G5 + Ɛ4mij ij ∈ G5

− 2
j6=i
q −44q5/4+q55xi 2 /2
r
= S e 0
= S Pr4ij ∈ G5 Ɛ4−xj − xi ij ∈ G5 + Ɛ4mij ij ∈ G5 0
q +
p Equipped with (28)–(30), we find its limiting value,
Recall that S q/4q + 15 = N . Taking q → 0 gives the
result. N 1 1 1 1
Ɛ ui → √ 2 p +√ − + 0 (31)
2 i i
Proof of Proposition 3. For notational simplicity, the
A nice intuitive result from this is that the social utility is
expectation sign Ɛ4·5 throughout this proof refers to the maximized at i = 1. Any deviation from that distorts the
conditional expectation Ɛ4· xi 1 1 i 5, which is computed friendship formation and is suboptimal in terms of social
conditional on the type xi and a symmetric rule except for utility.
possible deviation of i to i . Similarly, the notation Pr4 · 5 also Next we look at the expected utility from the network-
refers to the probability with the same conditionals. based score. The bias from using network-based scoring is
P
First we calculate the expected social utility, Ɛ ij∈G 4mij − P
ij∈G 4xj − xi 5

xj − xi 5, which we denote more compactly as Ɛ ui . For any j Ɛ i 45 − xi = Ɛ
we have c + + ni

X
Pr4xj 1ij ∈ G5 = Pr4xj 5Pr4ij ∈ Gxj 5 =Ɛ Ɛ 4xj − xi 5 ni
c + + ni ij∈G
2
(
q −qxj2 /2 e−i 4xi −xj 5 /2
r
if xj ≤ xi 1
= e · 2 (27) ni
2 e−4xi −xj 5 /2 if xj > xi 0 =Ɛ Ɛ4xj − xi ij ∈ G5
c + + ni
Equation (27) enables us to calculate the probability of being

ni
friends with j =Ɛ Ɛ4xj − xi ij ∈ G50
c + + ni
Z + The first equality comes from the fact that yi (and yj ) are
Pr4ij ∈ G5 = Pr4xj 1 ij ∈ G5 dxj unbiased signals of xi (and xj ). The second equality uses the
−
r iterated law of expectation. The last equality uses the fact
1 1 1 q −4q/4q+155x2 /2
= p +√ e i 1 that Ɛ4xj − xi ij ∈ G5 is not a function of ni .
2 i q +1 Using (27), we calculate, in a way similar to (29),
r
and in particular, its limiting value

2 1 1 1 1
Ɛ4xj − xi ij ∈ G5 → − √ +p 0
1

1 1
i i
S Pr4ij ∈ G5 → N p + √ 0 (28)
2 i So under the limits, we have the equality

ni
Similarly, (27) also enables us to calculate the conditional Ɛi 45−xi = Ɛ
type difference and its limiting value c ++ni
| {z }
Z +

Ɛ4−xj −xi ij ∈ G5 = −xj −xi Pr4xj ij ∈ G5dxj r

− 2 1 1 1 1
· − √ +p 0 (32)
p

1 1

1 1
i i
→ 2/ + p + √ 0 (29) | {z }
i i
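The limiting density of the matching value among friends used to obtain (30) can be sanity-checked by numerical integration. The sketch below assumes the density reads sqrt(2/pi) * m^2 * exp(-m^2/2) on m >= 0 (the normalizing constant is inferred from the derivation), and the function names, integration range, and step count are arbitrary choices for illustration. It checks that the density integrates to one and that its mean agrees with the closed form 2 * sqrt(2/pi).

```python
import math

def friend_match_density(m):
    """Assumed limiting density of m_ij given a tie: sqrt(2/pi) * m^2 * exp(-m^2/2)."""
    return math.sqrt(2.0 / math.pi) * m * m * math.exp(-m * m / 2.0)

def midpoint_integral(func, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(func(lower + (k + 0.5) * width) for k in range(steps)) * width

# The tail beyond m = 12 is numerically negligible, so [0, 12] stands in for [0, inf).
total_mass = midpoint_integral(friend_match_density, 0.0, 12.0)
mean_value = midpoint_integral(lambda m: m * friend_match_density(m), 0.0, 12.0)
closed_form = 2.0 * math.sqrt(2.0 / math.pi)

print("total mass:", total_mass)
print("numerical mean:", mean_value, "closed form:", closed_form)
```

Agreement between the two numbers supports reading the density as that of a chi distribution with three degrees of freedom, as the proof states.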

Because it is somewhat difficult to find an explicit expres- The first equality uses (8) for the expression of i 45. The
sion for , even under limits, we address it implicitly. From second equality comes from (9).
this point on, notations Ɛ Ui , Ɛ ui , and Ɛ i all refer to their If c ≤ 1, then the precision is decreasing
p in after 1. So
limiting values. it is smaller at ∗ than at 1. If c ≥ N /4N − 5 > 1, then
First, to find the equilibrium, we look at the best response the precision is no larger at 1 than at 41 − /N 5−1 , which
correspondence for i, that is, the value of i that maximizes is the upper bound of ∗ . Noting that the precision is also
Ɛ Ui for any . We use the derivative of Ɛ Ui quasiconcave in , we see that it is smaller at 1 than at ∗ .

¡ Ɛ ui ¡4Ɛ i 45 − xi 5
F 4i 1 5 2= + 0 Proof of Lemma 3. Using the definition of conditional
¡i ¡i
probability, we have
By (32),
¡4Ɛ i 45 − xi 5 ¡ ¡ Pr4xi = h1 yi 5 = Pr4yi xi = h5 Pr4xi = h50
= + 0
¡i ¡i ¡i
Note that (i) has the same sign as i − , (ii) ¡/¡i > 0, The prior Pr4xi = h5 = 1/2 by our assumption. The likelihood
(iii) 0 < < 1, and (iv) ¡/¡i < 0. The first three points are Pr4yi xi = h5 has three parts: (i) the probability that i is
clear. The last point can be seen by noting that ni is binomially friends with those whose signals are collected in yi and
distributed, and under the limits, Poisson distributed with that these friends have the signals as collected in yi , (ii) the
the mean given in (28). probability that i is not friends with anyone outside yi , and
Using (i)–(iv), we see two useful properties for the second (iii) the probability that i’s own signal is as that collected
component of F in yi . Formally,
¡4Ɛ i 45 − xi 5 ¡
< 1 for i ≥ 1 (33) Pr4yi xi = h5 =
X Y Y
418xj =h9 +p1 18xj =l9 5 Pr4yj xj 5
¡i ¡i
xi ij∈G ij∈G
and
¡4Ɛ i 45 − xi 5 1
·Pr4yi xi = h5· 12 1
Y
> 01 at i = 10 (34) × 1− 2
41+p1 5
¡i ijyG1 j6=i

Next we translate these two properties into two properties P

where xi is the summation across all possible vectors of
of F . First, (34) shows that
friends’ types, which contains 2ni items.
F 411 5 > 0 (35) Another way to express the probability Pr4xi = h1 yi 5 is

because ¡ Ɛ ui /¡i = 0 at i = 1, by (31). Pr4xi = h1yi 5

To obtain the second property, we look at a simpler case Y
of F where is ignored in the derivative = Pr4yj xj = h5+p1 Pr4yj xj = l5
ij∈G
¡ Ɛ ui ¡
F˜ 4i 1 5 2= + 0 1− 21 41+p1 5 ·Pr4yi xi = h5· 12 0
Y
¡i ¡i × (37)
ijyG1 j6=i
It can be easily verified that as long as < N , there is an
invariant solution to F˜ 4·1 5 = 0 Similarly, we find the corresponding expression for
Pr4xi = l1 yi 5.
−2

o
≡ 1−
N
Y
Pr4xi = l1yi 5 = p1 Pr4yj xj = h5+Pr4yj xj = l5
ij∈G
and Fe4i 1 5 ≤ 0 for i ≥ o . Together with (33), this shows that
1− 12 41+p1 5 ·Pr4yi xi = l5· 12 0
Y
×
F 4i 1 5 < F˜ 4i 1 5 ≤ 01 for any i ≥ max41 50 o
(36) ijyG1 j6=i

These two properties about F are sufficient to derive the Hence we compute the following ratio:
proposition. Define æi 45 2= arg maxi ≥1 Ɛ Ui as the best
response correspondence. Using Berge’s Theorem we show Pr4xi = l yi 5

yi
p1 + 1 −
Li
p1 − p1 +
Hi
that it is upper semi-continuous. Furthermore, (35) and (36) = 0
Pr4xi = h yi 5 1− + p1 − p1 1 − + p1
imply
1 < æi 45 < max41 o 50 This ratio, together with Pr4xi = l yi 5 + Pr4xi = h yi 5 = 1,
This shows that any fixed point of æi 4·5 must be between 1 proves the proposition.
and o . Using Kakutani Fixed-Point Theorem we show that a
fixed point exists. Proof of Lemma 4. Similar to the proof of Lemma 3,
we find the expression Pr4xi = h1 yi 5 by replacing p1 in (37)
Proof of Corollary 2. For the precision, with p . The expression for Pr4xi = l1 yi 5 is
c Y
Ɛ6i 45 7 = c + Ɛ n Pr4xi = l1yi 5 = p Pr4yj xj = h5+p Pr4yj xj = l5
c+ i ij∈G
√
c 1− 12 4p +p 5 ·Pr4yi xi = l5· 12 0
Y
×
= c+ N0
c+ ijyG1 j6=i

So the ratio becomes p > p . Hence we replace p by p on the right-hand side of

the above inequality
1 − 4p + p 5/2 S−ni
yi
Pr4xi = l yi 5
= ·
Pr4xi = h yi 5 1 − 41 + p 5/2 1−
2
− 41 − 52 41 − 52 − 2

6· · · 7 < + p 1
p + p − p Li p − p + p Hi + 41 − 5p p + 41 − 5

× 0
+ p − p 1 − + p
which is smaller than zero because the denominator in the
Taking the limits of S and completes the proof. first term is smaller than that of the second.

Proof of Lemma 5. We want to study the expected log Proof of Proposition 4. For notational simplicity, we
odds as a function of . By Lemma 4, we have omit the 41 5 in the conditional of any expectation operator.
We also use Ri short for Ri 41 5.
Ɛ6Ri 41 5 l1 1 7 We start with a low-type person. With Pi 41 5 given by (15),

we can easily write down i’s expected credit score for any
= 41 − 25 log
1− i ≥
p + 41 − 5p

− 12 N 6p + 41 − 5p 7 log Ɛ4Ri l1 i 5
+ 41 − 5p

p + 41 − 5p

= 41 − 25 log
− 21 N 6p + 41 − 5p 7 log 1−
p + 41 − 5
p + 41 − 5p
+

− 21 N 41 − p 50
Z
(38) − 12 N p + 41 − 5 f 4t5 dt log
R + i + 41 − 5p
Since p = f 4t5 dt where f is the density of the matching
p + 41 − 5p
Z +
value, the derivative of the above expected log odds w.r.t. is − 12 N f 4t5 dt + 41 − 5p log
i p + 41 − 5
¡ Ɛ6Ri 41 5 l1 1 7
− 21 N 41 − p 50
¡
¡ Ɛ6Ri 41 5 xi = l1 1 7 ¡p
= · Thus for any i ≥ ,
¡p ¡
p + 41 − 5p

p + 41 − 5p ¡ Ɛ4Ri l1 i 5

= 12 N 41 − 5 log = 12 N 41 − 5 log
+ 41 − 5p ¡i + 41 − 5p
p + 41 − 5p

p + 41 − 5p

+ log f 450 (39) + log f 4i 50 (40)
p + 41 − 5 p + 41 − 5
Note that the derivative is strictly increasing in p , thus strictly P
Next for the social utility Ɛ4 ij∈G mij − 18xj =h9 l1 i 5, which we
decreasing in . By the definition we gave to , ¯ the derivative
use Ɛ4ui l1 i 5 as a shorthand for, we have for any i ≥ ,
¯
is zero at 45. So we conclude that the expected log odds as
a function of is quasi-concave with the maximum attained
Z + Z +
¯
at 45. Ɛ4ui l1 i 5 = 12 N 4t − 15f 4t5 dt + tf 4t5 dt 0
i

Proof of Lemma 6. We want to study the expected log Thus for any i ≥ ,
odds as a function of . First, the expected log odds is
expressed as in (38). We take its derivative w.r.t. ¡ Ɛ4ui l1 i 5
= − 21 N i f 4i 50 (41)
¡i
¡Ɛ6Ri 415xi = l117
¡ First, we show that ∗ > 0. Consider the case wherein

p +41−5p

p +41−5p
every low type chooses = 0. Clearly, ¡ Ɛ4ui l1 i 5/¡i = 0
= 21 N log +41−5log but ¡ Ɛ4Ri l1 i 5/¡i > 0 at i = 0 for any ≥ 1. Hence
+41−5p p +41−5
¡ Ɛ4Ui l1 i 5/¡i > 0 and the low type wants to increase i
−41−5 p 41−52 −2 p
2 2

+ + f 450 above 0 and be more exclusive towards her fellows. This
+41−5p p +41−5 incentive to deviate means that = 0 cannot be part of an
equilibrium.
Since f is positive, we focus on the term within the brackets. ¯ ∗ 5, we use our refinement.
Second, to show that ∗ < 4
Using the inequality log4t5 < t − 1 except for t = 1, we have ¯ ∗ 5. Now
Suppose 4 1 5 is an equilibrium where ∗ ≥ 4
∗ ∗
∗∗
2 p + 41 − 5p 41 − 5p + 41 − 52 p consider a behavior that is smaller than but sufficiently
6· · · 7 < + ¯ ∗ 5 for the low type. Every low type will be better
close to 4
+ 41 − 5p p + 41 − 5
off in 4 1 ∗ 5 than in 4 ∗ 1 ∗ 5, because (i) by Lemma 5, we
∗∗

41 − 5p + 41 − 52 p 41 − 5p − 2 p know that Ɛ4Ri l5 is quasi-concave in and differentiably
− − 0 ¯
+ 41 − 5p p + 41 − 5 maximized at 45; and (ii) the social utility
Note that the right side is (linearly) decreasing in p . Recall
Z + Z +
that the condition of the Lemma is < , which implies that Ɛ4ui l5 = 12 N 4t − 15f 4t5 dt + tf 4t5 dt


is strictly decreasing in Last, consider a candidate profile 41 5 with = 1. From

the above we know that, for it to be an equilibrium, it must
¡ Ɛ4ui l5 ¯
be that > 45, which says that removal of one low-type
= − 12 N f 450 (42)
¡ friend strictly increases one’s expected credit score. Hence a
Thus Ɛ4Ui l5 is decreasing in on the interval [ ∗∗ 1 ∗ ]. high type has incentive to raise her criterion above 1. So it
Furthermore, given that every other low type chooses = ∗∗ cannot be an equilibrium.
and every high type chooses = ∗ , a low-type i has no

incentive to increase her criterion i beyond ∗∗ . This can Proof of Proposition 7. First, we look at the case of
be seen by comparing (39) with (40) and (41) with (42), and discrimination. Taking first-order condition (FOC) w.r.t. ei ,
from ¡ Ɛ4Ui l1 i 5/¡i < 0 for i > ∗∗ ; nor does she have we have for each i
incentive to lower the criterion because doing so changes

ei∗ = ab −1 + Hi + ej∗ 1
X
nothing (since friendships are established mutually). We
ij∈G1xi =l
conclude that 4 ∗ 1 ∗ 5 fails the refinement.
Last, we turn to the high types. We show that = 1 cannot which we write in the matrix form
be part of an equilibrium. The argument is similar to that for
the low types. Briefly, consider a symmetric profile 41 = 15. e∗ = ab −1 + H + Al e∗ 0
To be an equilibrium, it must be that < 415. ¯ This implies
that ¡ Ɛ4Ri h1 i 5/¡i > 0 at i = 1. But ¡ Ɛ4ui h1 i 5/¡i = 0 This implies that
at i = 1. Hence a high type wants to raise her i above 1.
This incentive to deviate means that = 1 cannot be part of 4I − Al 5e∗ = ab −1 + H0
an equilibrium.
By Perron-Frobenius Theorem, the largest-magnitude eigen-
Proof of Proposition 5. Let ä be the set of equilibria value of Al is real and positive. Furthermore, if this eigenvalue
without refinement. Given that h = 0 and = 1, we are is smaller than −1 , then A < 1, which implies that the
series k k
P
effectively looking for the point(s) in ä that maximizes the k=0 Al exists. One can readily verify that the series

expected total utility of low types. is the inverse of 4I − Al 5.

Using (39) with = 1, we see that 4¡ Ɛ4Ri l5/¡5/f 45 is The case of homophily can be similarly proved. In particu-
strictly decreasing in and equals 0 at = . ¯ Using (42), we lar, the FOC is
see 4¡ Ɛ4ui l5/¡5/f 45 is strictly decreasing in and equals 0
ei∗ = 4Hi − Li 5b −1 + 2b −1 ej∗ + ab −1 + Hi + ej∗ 0
X X
at = 0. These imply that
ij∈G1 xi =l ij∈G1 xi =l
¡ Ɛ4ui l5 ¡ Ɛ4Ri l5
+ l One can again write it into matrix form and solve for e∗ .
¡ ¡
is strictly decreasing and has a single point within 401 5 ¯ Proof of Lemma 8. Using
where it is zero. This is also where Ɛ4Ui l5 is maximized. Y
Denote this point by ∗ . We would be done if ∗ is shown to Pr4xi = hyi 5 = qs Pr4yj xj = h5+qd Pr4yj xj = l5
ij∈G
be an equilibrium without refinement. By comparing (39)
with (40) and (42) with (41), clearly at i = ∗ and i ≥ ∗ , 1− 21 4qs +qd 5 ·Pr4yi xi = h5· 12 1
Y
×
ijyG1 j6=i
¡ Ɛ4ui l1 i 5 ¡ Ɛ4Ri l1 i 5
+ l ≤ 01 Pr4xi = lyi 5 =
Y
qd Pr4yj xj = h5+qs Pr4yj xj = l5

¡i ¡i
ij∈G
and the inequality is strict for i > ∗ . This implies that a
1− 12 4qs +qd 5 ·Pr4yi xi = l5· 12 0
Y
×
low-type i has no incentive to deviate if every other low type ijyG1 j6=i
chooses ∗ .
So the ratio is
Proof of Lemma 7. Mathematically it is a special case of
Pr4xi = l yi 5
Lemma 4, with p = p = p1 .
Pr4xi = h yi 5
Proof of Proposition 6. The arguments closely resemble

yi
qd + qs − qs Li qd − qd + qs Hi

those of the proof of Proposition 4. Here we discuss them = 1
1− qs + qd − qd qs − qs + qd
briefly.
First, consider a candidate profile 41 5 with ≤ 45.¯ which, together with Pr4xi = l yi 5 + Pr4xi = l yi 5 = 1, gives us
One can show that a low type has incentive to increase her the result.
criterion because doing so increases both her social utility
and credit score. So it cannot be an equilibrium. Proof of Proposition 8. Again for notational simplicity,
Second, we filter ∗ = 1 with refinement. Suppose that all expectation operators in this proof are conditional on the
4 ∗ 1 ∗ 5 is an equilibrium. Compare it with 4 ∗∗ 1 ∗ 5 where symmetric profile 4el 1 eh 5. So, for example, Ɛ4Ui l1 ei 5 actually
∗∗ is smaller but sufficiently close to 1. One can show that refers to Ɛ4Ui l1 el 1 eh 1 ei 5, which is the expected utility of a
low types are better off under ∗∗ and that no single low low type when she chooses ei , while everyone else follows
type has incentive to increase criterion under ∗∗ . 4el 1 eh 5.

Given (13), we see that for any individual $i$, an additional high-type friend increases (and an additional low-type friend decreases) the expected utility from credit score by

$$D_{x_i} = \alpha_{x_i}\left[\gamma \log\frac{\gamma + (1-\gamma)p_1}{\gamma p_1 + (1-\gamma)} + (1-\gamma)\log\frac{\gamma p_1 + (1-\gamma)}{\gamma + (1-\gamma)p_1}\right].$$

Clearly, the friendship formation criteria will be: a low type accepts another low type iff $m_{ij} > D_l$; a high type accepts a low type iff $m_{ij} > 1 + D_h$; a low type accepts a high type iff $m_{ij} > 1 - D_l$; and a high type accepts another high type iff $m_{ij} > -D_h$ (which always holds). Using these criteria and the requirement that a tie is formed upon mutual acceptance, the expected utility of a low type when she chooses $e_i$ is

$$\mathbb{E}(U_i \mid l, e_i) = (e_i + e_l) M \int_{D_l}^{+\infty} (t - D_l) f(t)\,dt + (e_i + e_h) M \int_{1+D_h}^{+\infty} (t - 1 + D_l) f(t)\,dt,$$

and the expected utility of a high type when she chooses $e_i$ is

$$\mathbb{E}(U_i \mid h, e_i) = (e_i + e_l) M \int_{1+D_h}^{+\infty} (t - 1 - D_h) f(t)\,dt + (e_i + e_h) M \int_{0}^{+\infty} (t + D_h) f(t)\,dt.$$

Using FOCs, we have, in an equilibrium,

$$e_l^* = M \int_{D_l}^{+\infty} (t - D_l) f(t)\,dt + M \int_{1+D_h}^{+\infty} (t - 1 + D_l) f(t)\,dt, \tag{43}$$

and

$$e_h^* = M \int_{1+D_h}^{+\infty} (t - 1 - D_h) f(t)\,dt + M \int_{0}^{+\infty} (t + D_h) f(t)\,dt. \tag{44}$$

Comparing (43), (44), and (24), one can verify that: (i) $e_l^* < e^* < e_h^*$ for $\alpha_l = \alpha_h > 0$; (ii) $e_l^* > e_h^*$ iff

$$\int_{1+D_h}^{D_l} (D_l - t) f(t)\,dt > D_h \int_{0}^{1+D_h} f(t)\,dt + \int_{0}^{1+D_h} t f(t)\,dt,$$

which holds for sufficiently large $D_l$.

References

Ambrus A, Mobius M, Szeidl A (2014) Consumption risk-sharing in social networks. Amer. Econom. Rev. 104(1):149–182.
Armour S (2014) Borrowers hit social-media hurdles. Wall Street Journal (January 8). Accessed June 30, 2014, http://www.wsj.com/articles/SB10001424052702304773104579266423512930050.
Arrow KJ (1998) What has economics to say about racial discrimination? J. Econom. Perspectives 12(2):91–100.
Bagherjeiran A, Bhatt RP, Parekh R, Chaoji V (2010) Online advertising in social networks. Furht B, ed. Handbook of Social Network Technologies and Applications (Springer, New York), 651–689.
Bakshy E, Eckles D, Yan R, Rosenn I (2012) Social influence in social advertising: Evidence from field experiments. Proc. 13th ACM Conf. Electronic Commerce (EC’12) (ACM, New York), 146–161.
Ball S, Eckel C, Grossman P, Zame W (2001) Status in markets. Quart. J. Econom. 116(1):161–188.
Ballester C, Calvo-Armengol A, Zenou Y (2006) Who’s who in networks. Wanted: The key player. Econometrica 74(5):1403–1417.
Becker GS (1971) The Economics of Discrimination, 2nd ed. (University of Chicago Press, Chicago).
Benoit DF, Van den Poel D (2012) Improving customer retention in financial services using kinship network information. Expert Systems Appl. 39(13):11435–11442.
Bonacich P (1987) Power and centrality: A family of measures. Amer. J. Sociol. 92(5):1170–1182.
Bramoullé Y, Kranton R (2007a) Risk sharing across communities. Amer. Econom. Rev. 97(2):70–74.
Bramoullé Y, Kranton R (2007b) Risk-sharing networks. J. Econom. Behav. Organ. 64(3–4):275–294.
Bursztyn L, Ederer F, Ferman B, Yuchtman N (2014) Understanding mechanisms underlying peer effects: Evidence from a field experiment on financial decisions. Econometrica 82(4):1273–1301.
Castilla EJ (2005) Social networks and employee performance in a call center. Amer. J. Sociol. 110(5):1243–1283.
Chui M (2013) Social media to boost financial services. InFinance 127(2):34–36.
Clover C (2015) Backlash in China over wechats targeted adverts. Financial Times (January 28). Accessed February 22, 2015, http://www.ft.com/intl/cms/s/0/0fd1abc2-a610-11e4-abe9-00144feab7de.html.
Demirgüç-Kunt A, Levine R (2009) Finance and inequality: Theory and evidence. Ann. Rev. Financial Econom. 1(1):287–318.
Dixon P, Gelman R (2014) The scoring of America: How secret consumer scores threaten your privacy and your future. Accessed February 22, 2015, http://worldprivacyforum.org/2014/04/wpf-report-the-scoring-of-america-how-secret-consumer-scores-threaten-your-privacy-and-your-future/.
Feigenberg B, Field EM, Pande R (2013) The economic returns to social interaction: Experimental evidence from microfinance. Rev. Econom. Stud. 80(4):1459–1483.
Galak J, Small D, Stephen AT (2011) Micro-finance decision making: A field study of prosocial lending. J. Marketing Res. 48(SPL):S130–S137.
Goel S, Goldstein DG (2013) Predicting individual behavior with social networks. Marketing Sci. 33(1):82–93.
Guseva A, Rona-Tas A (2001) Uncertainty, risk, and trust: Russian and American credit card markets compared. Amer. Sociol. Rev. 66(5):623–646.
Haenlein M (2011) A social network analysis of customer-level revenue distribution. Marketing Lett. 22(1):15–29.
Haenlein M, Libai B (2013) Targeting revenue leaders for a new product. J. Marketing 77(3):65–80.
Herzenstein M, Sonenshein S, Dholakia UM (2011) Tell me a good story and I may lend you money: The role of narratives in peer-to-peer lending decisions. J. Marketing Res. 48(SPL):S138–S149.
Hill S, Provost F, Volinsky C (2006) Network-based marketing: Identifying likely adopters via consumer networks. Statistical Sci. 21(2):256–276.
Hofstede GH (2001) Culture’s Consequences: Comparing Values, Behaviors, Institutions and Organizations Across Nations (Sage Publications, Thousand Oaks, CA).
Hu Y, Van den Bulte C (2014) Nonmonotonic status effects in new product adoption. Marketing Sci. 33(4):509–533.
Iyengar R, Van den Bulte C, Lee JY (2015) Social contagion in new product trial and repeat. Marketing Sci. 34(3):408–429.
Jenkins P (2014) Big data lends new zest to banks’ credit judgments. Financial Times. Accessed June 24, 2014, http://www.ft.com/intl/cms/s/0/dfe64c0c-fadd-11e3-8959-00144feab7de.html.
Kornish LJ, Li Q (2010) Optimal referral bonuses with asymmetric information: Firm-offered and interpersonal incentives. Marketing Sci. 29(1):108–121.
Lin M, Prabhala NR, Viswanathan S (2013) Judging borrowers by the company they keep: Friendship networks and information asymmetry in online peer-to-peer lending. Management Sci. 59(1):17–35.
Liu K, Tang L (2011) Large-scale behavioral targeting with a social twist. Proc. 20th ACM Internat. Conf. Inform. Knowledge Management (ACM, New York), 1815–1824.
Lohr S (2015) Banking start-ups adopt new tools for lending. New York Times (January 18). Accessed February 22, 2015, http://www.nytimes.com/2015/01/19/technology/banking-start-ups-adopt-new-tools-for-lending.html?_r=0.
Lu Y, Jerath K, Singh PV (2013) The emergence of opinion leaders in a networked online community: A dyadic model with time dynamics and a heuristic for fast estimation. Management Sci. 59(8):1783–1799.
MacKenzie D (2006) An Engine, Not a Camera: How Financial Models Shape Markets (MIT Press, Cambridge, MA).
MacKenzie D, Millo Y (2003) Constructing a market, performing theory: The historical sociology of a financial derivatives exchange. Amer. J. Sociol. 109(1):107–145.
National Consumer Law Center (2014) Big data: A big disappointment for scoring consumer credit risk. NCLC, Boston.
Pasquale F (2015) The Black Box Society: The Secret Algorithms That Control Money and Information (Harvard University Press, Cambridge, MA).
Phelps ES (1972) The statistical theory of racism and sexism. Amer. Econom. Rev. 62(4):659–661.
Podolny JM (2008) Status Signals: A Sociological Study of Market Competition (Princeton University Press, Princeton, NJ).
Rona-Tas A, Guseva A (2014) Plastic Money: Constructing Markets for Credit Cards in Eight Postcommunist Countries (Stanford University Press, Stanford, CA).
Rosenblat TS, Mobius M (2004) Getting closer or drifting apart? Quart. J. Econom. 119(3):971–1009.
Roth PL, Bobko P, Van Iddekinge CH, Thatcher JB (2016) Social media in employee-selection-related decisions: A research agenda for uncharted territory. J. Management 42(1):269–288.
Rusli EM (2013) Bad credit? Start tweeting. Wall Street Journal (April 1), http://www.wsj.com/articles/SB10001424127887324883604578396852612756398.
Schmitt P, Skiera B, Van den Bulte C (2011) Referral programs and customer value. J. Marketing 75(1):46–59.
Sousa MR, Gama J, Brandao E (2013) Introducing time-changing economics into credit scoring. Technical Report 513, University of Porto, Porto, Portugal.
Stiglitz JE (1990) Peer monitoring and credit markets. World Bank Econom. Rev. 4(3):351–366.
Sudhir K, Priester J, Shum M, Atkin D, Foster A, Iyer G, Jin G, et al. (2015) Research opportunities in emerging markets: An inter-disciplinary perspective from marketing, economics, and psychology. Cust. Need. Solut. 2(4):264–276.
Toubia O, Stephen AT (2013) Intrinsic vs. image-related utility in social media: Why do people contribute content to Twitter? Marketing Sci. 32(3):368–392.
Townsend RM (1994) Risk and insurance in village India. Econometrica 62(3):539–591.
van Alstyne M, Brynjolfsson E (2005) Global village or cyber-balkans? Modeling and measuring the integration of electronic communities. Management Sci. 51(6):851–868.
Van den Bulte C, Wuyts S (2007) Social Networks and Marketing (Marketing Science Institute, Cambridge, MA).
Zeng Z, Xie Y (2008) A preference-opportunity-choice framework with applications to intergroup friendship. Amer. J. Sociol. 114(3):615–648.