Economist Versus Machine Matchmakers
All content following this page was uploaded by Ming-Jen Lin on 26 March 2019.
Abstract
We study recommender systems on two-sided platforms. We find that machine-learning
algorithms create congestion: they often generate recommendations that are concentrated on a few
users. We propose the use of equilibrium machine-learning algorithms, which inherit the
predictive power of machine learning and solve the congestion problem by using the market allocation
mechanism from economics. We apply our recommenders to an online-dating dataset that contains over
490,000 unique users. The out-of-sample hit rate of our equilibrium recommender outperforms the
baseline content filtering by a factor of 19. In the counterfactual simulations, it accelerates the
matching process by 200%.
Acknowledgement: We thank Stéphane Bonhomme, Whitney Newey, Matt Shum and the attendees at the 2018
California Econometrics Conference, Taiwan R User Group Workshop (Dec. 2018) for their comments. Ming-Jen Lin
acknowledges the support from the Taiwan Ministry of Science and Technology grant MOST 106-2410-H-002-041-MY2.
Our special thanks go to Sheng-Wei Chen for his valuable input. We are grateful to Shao-Mou Cheng, Chu-En
Ho, and Cheng-Kun Yang for their excellent research assistance.
Disclaimer: We use the confidential, anonymous data from SweetRing. The authors have no material financial
relationships with entities related to this research. The algorithms and empirical results expressed herein are those of
the authors and do not necessarily reflect SweetRing’s algorithms and views.
1
Kenneth C. Griffin Department of Economics, University of Chicago. Email: [email protected].
2
Department of Economics and Dornsife Institute of New Economic Thinking, University of Southern California.
Email: [email protected].
3
Department of Economics, National Taiwan University. Email: [email protected]
1. Introduction
Online two-sided platforms have become increasingly popular. Prominent examples include job
matching platforms (e.g., LinkedIn), expertise finders, and online dating websites.4 Modern e-
businesses deploy recommender systems to help consumers find products. At the core of recom-
mender systems are the users’ ranking scores for products, from which the algorithms can learn the
consumers’ preferences.
Machine-learning algorithms often use the popularity of items or the nearest neighbor of similar
users to make recommendations. While recommending the most popular movie to everyone might
be harmless, recommending the most attractive partner to everyone could be a disaster—causing
congestion in the matching market.5 Congestion has several welfare implications. First, congestion
lengthens the matching process. Second, congestion discourages the participation of new users
since the recommendation opportunity is skewed toward a small number of “superstars”. To the
best of our knowledge, this is the first paper that studies the externality and welfare consequences
of recommender systems.
We use a rich dataset containing more than 490,000 unique users from an online dating service.
On this platform, a user can browse other users’ profiles, and he/she can “like” other users. The
role of “like” is dual. From a user’s perspective, it serves as a signaling device similar to that
4
According to a recent survey from the Pew Research Center, 15% of American
adults have used online dating services. Source:
http://www.pewinternet.org/2016/02/11/15-percent-of-american-adults-have-used-online-dating-sites-or-mobile-dating-apps/
5
For example, several male users from a popular online dating service expressed their frustration when the algorithm
matched more than 150 men with a single woman in New York City. Source:
https://www.nytimes.com/2018/08/20/style/tinder-dating-scam-union-square.html
of Facebook. From the platform’s perspective, it reveals information about preferences similar to
that of Pandora. We exploit the information of likes and other users’ profiles to construct the
recommenders.
Our equilibrium matching algorithms consist of three modules: (1) a baseline machine-learning
algorithm, (2) a pseudo market, and (3) a separable matching model. For the baseline algorithms,
we use both flexible fixed-effect regression and Matrix Factorization to estimate the preference
parameters. They represent two polar cases of machine-learning algorithms: content filtering and
collaborative filtering, respectively.6 Given these estimates, we apply the transferable utility (TU)
matching model proposed by Choo and Siow (2006) (hereinafter CS) to the pseudo market, which
is an approximated matching market that includes potential partners and rivals constructed from
the browsing history.
Motivated by the seminal marriage model of Becker (1973), the CS model adjusts the demand-
and-supply through the “matching cost” to discourage users from seeking a relationship with popular
potential partners. Specifically, the CS model computes the equilibrium utility, which is defined
as the gross utility of the match net of the matching cost for any possible pair. For each user, we
rank his/her potential partners within his/her pseudo market according to the implied equilibrium
utilities, and we recommend the N best women/men. Since the agents may take the matching
cost into account, deploying an economic model may help to improve the prediction power of the
baseline machine-learning algorithm. On the other hand, the baseline machine-learning algorithms
tend to produce similar recommendations due to the highly correlated gross-utility estimates. By
introducing the matching cost, the equilibrium matching algorithms can achieve more diversified
recommendations.
Main Findings
We evaluate the prediction power and diversification of the proposed algorithms. These measures
try to address the following two crucial concerns: First, does our recommender provide quality
6
Content filtering utilizes user-item attributes, whereas collaborative filtering extracts information only from the
past user actions.
information? Second, are the recommendations diversified enough to promote the opportunity to
be recommended and to avoid competition?
We use the hit rate to answer the first question. We define it as the probability that at least one
member from the recommended list is liked. The baseline machine-learning algorithms have almost
zero hit rates. When combined with our pseudo market and equilibrium matching approach, the hit
rate reaches approximately 35% when recommending ten users. This suggests that our equilibrium
recommenders have prediction power.
Next, we evaluate the diversity of our recommendations and its welfare consequences. First,
we calculate the coverage rate: the number of distinct recommended users divided by the number
of recommendations. The baseline algorithms have coverage rates lower than 30%, whereas the
equilibrium matching algorithms achieve coverage rates above 60%. In particular, the equilibrium
collaborative filtering algorithm has almost 100% coverage. Second, we simulate the number of
rounds needed to clear the matching market in a Gale-Shapley style algorithm. Our equilibrium
matching algorithms cut the number of rounds needed by 50% compared to the baseline machine-
learning algorithms. These results suggest that our algorithms avoid recommending only a small
number of people to most users, which, in turn, improves the welfare by accelerating the matching
process and allocating the opportunity to be recommended more evenly.
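As a small illustration of the coverage rate defined above, the following Python sketch computes the number of distinct recommended users divided by the total number of recommendations. The function name `coverage_rate` and the toy recommendation lists are hypothetical and are not taken from the data.

```python
from typing import Dict, Hashable, List

def coverage_rate(recommendations: Dict[Hashable, List[Hashable]]) -> float:
    """Distinct recommended users divided by the total number of recommendations."""
    all_recs = [w for rec_list in recommendations.values() for w in rec_list]
    if not all_recs:
        return 0.0
    return len(set(all_recs)) / len(all_recs)

# Congested lists: every man is shown the same two women.
congested = {"m1": ["w1", "w2"], "m2": ["w1", "w2"], "m3": ["w1", "w2"]}
# Diversified lists: no woman is recommended twice.
diversified = {"m1": ["w1", "w2"], "m2": ["w3", "w4"], "m3": ["w5", "w6"]}
```

A coverage rate close to 1 means that almost every recommendation slot goes to a different user, while a low value indicates that the lists concentrate on a few users.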
Related Literature
Our paper is related to the literature on both the recommender systems in computer science
and the matching models in economics. In one-sided recommenders, Koren et al. (2009) adopt a
novel collaborative filtering algorithm—Matrix Factorization—that outperforms Netflix’s algorithm
in predicting movie ratings. In a different vein, algorithms that are tailored for the two-sided setup,
known as reciprocal recommenders, are studied in Malinowski et al. (2006), Brozovsky and Petricek
(2007), Richards et al. (2008), Krzywicki et al. (2010), Akehurst et al. (2011), and Kunegis et al.
(2012), among many others.
Our paper is also related to the matching literature in economics. An important class of matching
models is the separable matching models, including CS, Chiappori et al. (2015), Galichon and Salanié
(2015), and Mourifié and Siow (2017) for the TU cases.7 We refer to Chiappori and Salanié (2016) for
a more comprehensive survey. The primary focus of this literature, however, is on parameter
identification and comparative statics.8 In contrast, we use the separable matching model to provide
predictions and mitigate the externalities of the machine-learning algorithms.
The rest of the paper is organized as follows. Section 2 describes the mechanism of the online
dating service and provides some summary statistics. Section 3 illustrates Equilibrium Content
Filtering, and section 4 presents Equilibrium Collaborative Filtering. We then show our prediction
and congestion results in section 5 and section 6, respectively. Section 7 concludes.
2. Data
We use the data from one of the major online dating services in Taiwan, which contains 379,448
male accounts and 115,552 female accounts. Each user is required to sign up through his/her
Facebook account, on which the marital status must not be “married.” Furthermore, an account with
more than 50 friends is required in order to deter robot or spam accounts. The users are then asked to create
their profile, including age, height, physique, location, educational attainment, occupation, income,
assets, and other characteristics. By entering search keywords for desired attributes, the users can
browse the short profiles of other users. Three further actions can be made by the user: “click”,
“like”, and “send message.” First, the users can click 9 on the short profile and link to the detailed
profile which includes all characteristics. Second, the users can like users of the opposite sex. Third,
the users can send messages.10 We observe the detailed usage records from July 2016 to October
7
Hitsch et al. (2010a) argue that the non-transferable utility (NTU) matching framework is more appropriate for
online dating since it is unclear whether agents can exchange transfers without offline meetings. However, we find that
the recommenders based on the NTU matching model of Galichon and Hsieh (2017) produce similar results to the
CS model. It is worth noting that alternative NTU matching models, e.g., Gale and Shapley (1962), Dagsvik (2000),
and Menzel (2015), are not applicable for recommender systems since they produce only the equilibrium matching
without the implied equilibrium utility.
8
See, for example, Decker et al. (2013), Fox (2010), and Graham (2011).
9
The previous study by Hitsch et al. (2010b) focuses on estimating the mate preferences from the click data.
10
Female users can send messages without incurring additional charges, whereas male users have to pay the sub-
scription fee to enable this function.
2017, such as who likes whom, along with several of the user attributes.
An important feature of this website is the like function, from which the platform garners its
profit: an unpaid male user can like at most 5 women in 12 hours, while a male user who pays
the subscription fee (VIP) can like at most 15 women in 12 hours. On the other hand, the cap for
a female user is 40 men in 12 hours, regardless of VIP status. One can purchase the
ability to send additional likes.
We first present the like and click patterns from our data. On average, male users clicked 490.30
profiles and sent 115.76 likes, whereas female users clicked 331.38 profiles and sent 67.94 likes. In
machine learning, the difficulty of making an accurate recommendation depends on the sparsity of
the data, namely, the number of observed rankings relative to the number of all possible pairs. In our data,
the total number of clicks divided by the number of all possible pairs is 0.04%, which is much smaller
than the corresponding share in the Netflix data (1%).
We compute several summary statistics of the users’ attributes from our data, which we compare
to the general population using the 2015 sample of the Taiwan Social Change Survey (TSCS).
The TSCS is a cross-sectional survey database collected by academic institutions,11
and it records many attributes that overlap with our online dating dataset. A first comparison
of the two raw datasets is shown in Table 1. We find that people in the online dating dataset are
quite different from the general population. First, there are many more men than women in the
online dating website; the male to female ratio is almost 3 to 1. Second, the users of the online
dating service tend to be younger than the population. Third, more than half of our sample have
bachelor’s degrees, compared with only 35% of the population.12 Our sample is
biased toward a young, more highly educated male population. However, since our objective is to predict
matching in the online dating market, not to predict off-line marriage, as in Hitsch et al. (2010a),
a biased sample is less of a concern.
11
For more details, we refer the reader to http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/23400
12
This gap is driven by the significant expansion of colleges in Taiwan in 2000. As a result, the younger cohort in
our online dating sample has attained a much higher educational level than the general population.
3. Equilibrium Matching Algorithms
Our goal is to recommend partners that a user may want to send likes to, since this is the main
source of profit for the platform. We propose the equilibrium matching recommenders based on the
separable matching models of CS. The model takes a collection of men and women, and each agent’s
gross utility over potential partners to form a discrete-choice demand system. A pricing mechanism
clears the market as in a typical competitive equilibrium analysis; to match with an overdemanded
agent, one has to pay a “matching cost,” which decreases the net utility of matching. The model
essentially trades off between desirability and rivalry (measured by the endogenous matching cost).
For each agent, our equilibrium matching algorithm ranks partners according to the corresponding
net utilities and recommends the N best partners. Since the model explicitly takes into account
competition, it is less likely that it will recommend the same set of partners to observationally
similar agents.
Next, we discuss implementation details for the following tasks: (1) how to compute the equi-
librium matching, (2) how to estimate the mate preferences from the explicit feedback (e.g., data
of like or not), and (3) how to choose the relevant set of men and women (the pseudo market)
from the implicit feedback (e.g., click data). Our algorithm is modular: different methods, such
as different estimators for preferences or different matching models can be deployed to achieve a
better performance.
In this section, we review the separable matching models as the framework to predict the matching
outcome. At the core of these models is the combination of classical matching theory (e.g.,
Shapley and Shubik (1972)), which can predict matching outcomes, and random utility theory
(e.g., McFadden (1976)), which can incorporate unobserved preference heterogeneity.
We first introduce our notation. Let $m \in \{1, 2, 3, \ldots, M\}$ be the index for men, and
$w \in \{1, 2, 3, \ldots, W\}$ be the index for women. Let $x_m$ and $z_w$ be the vectors of the attributes of man
$m$ and woman $w$, respectively. The utility of man $m$ who matches with woman $w$ consists of a
deterministic component $\alpha$ that depends on the observable attributes $(x_m, z_w)$ and an additively
separable random component $\varepsilon_{x_m z_w}$:
\[
\alpha_{mw} = \alpha_{x_m z_w} + \varepsilon_{x_m z_w}. \tag{1}
\]
The agents can choose to be single. A single man obtains utility $\varepsilon_{x_m 0}$, and a single woman obtains
utility $\eta_{0 z_w}$. We define $\Phi_{mw}$, the social surplus, to be the sum of the men’s and women’s deterministic
utility matrices.
The separable models exclude the interactions between the unobserved characteristics of m and
w conditional on observed types (xm , zw ). It also implies that the preference ranking for partnership
only depends on the observed characteristics. However, as the online dating data contains a rich set
of regressors, we argue that the separability assumption is less problematic than in the traditional
setup.
CS postulates that a type-x man should pay τxz to match with a type-z woman. The price system
adjusts the demand and supply such that the number of type-x men who wish to match with a
type-z woman equals the number of type-z women who wish to match with a type-x man:
\[
n^M_x \Pr\{z = \arg\max_{z \in Z_0} (\alpha_{xz} - \tau_{xz} + \varepsilon_{xz})\} = n^W_z \Pr\{x = \arg\max_{x \in X_0} (\gamma_{xz} + \tau_{xz} + \eta_{xz})\}, \tag{4}
\]
where $n^M_x$ and $n^W_z$ are the total numbers of type-x men and type-z women. Galichon and Salanié
(2015) prove the existence and uniqueness of the TU matching equilibrium that satisfies Eq. (4)
under fairly general conditions for the random taste shifters. In particular, if the random utility
components εxm zw and ηxm zw follow an i.i.d. type-I extreme value distribution, CS shows that the
price system τxz in Eq. (4) can be concentrated out, and the equilibrium matching (µxz , µx0 , µ0z )
satisfies the following system of equations.
\[
\begin{aligned}
\mu_{xz} &= \mu_{x0}^{0.5}\, \mu_{0z}^{0.5} \exp\left(\frac{\alpha_{xz} + \gamma_{xz}}{2}\right), \\
\mu_{x0} + \sum_{z \in Z} \mu_{xz} &= n^M_x, \quad \forall x \in X, \\
\mu_{0z} + \sum_{x \in X} \mu_{xz} &= n^W_z, \quad \forall z \in Z,
\end{aligned} \tag{5}
\]
where µxz is the number of type-x men who marry type-z women, and µx0 (µ0z ) is the number of
type-x men (type-z women) who remain single. It is also interesting to note that the equilibrium
matching only depends on the social surplus Φxz = αxz + γxz .
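As a brief sketch of why the price system drops out of Eq. (5), consider the log-odds implied by the type-I extreme value assumption. The two intermediate identities below are the standard multinomial logit step, with the deterministic utility of remaining single normalized to zero; they are written here as an illustration and are not displayed in the text above.

```latex
\[
\ln\frac{\mu_{xz}}{\mu_{x0}} = \alpha_{xz} - \tau_{xz},
\qquad
\ln\frac{\mu_{xz}}{\mu_{0z}} = \gamma_{xz} + \tau_{xz}.
\]
% Adding the two identities eliminates \tau_{xz} and gives the first line of Eq. (5):
\[
2\ln\mu_{xz} - \ln\mu_{x0} - \ln\mu_{0z} = \alpha_{xz} + \gamma_{xz}
\;\Longrightarrow\;
\mu_{xz} = \mu_{x0}^{0.5}\,\mu_{0z}^{0.5}\exp\left(\frac{\alpha_{xz}+\gamma_{xz}}{2}\right).
\]
% Subtracting them instead recovers the matching cost from the equilibrium matching:
\[
\tau_{xz} = \frac{\alpha_{xz} - \gamma_{xz}}{2} + \frac{1}{2}\ln\frac{\mu_{x0}}{\mu_{0z}}.
\]
```

Under these assumptions, the net utility αxz − τxz equals ln(µxz/µx0), so it can be read off the equilibrium matching once the system in Eq. (5) is solved.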
We rank partners according to the net utility αxz − τxz , which trades off between desirability
and rivalry. Our algorithm will recommend type-z women more often if the desirability of type-z
women, as reflected in αxz , is higher. On the other hand, our algorithm will recommend type-z
women less often if the matching cost of type-z women, as reflected in τxz , is higher. The matching
cost deserves further explanation. If τxy is higher than τzy , where in this comparison x and z denote
two types of men and y denotes a type of women, it means that type-x men are
less competitive than type-z men when facing type-y women. Such competition may be driven
by the stronger preference for type-z men from the perspective of type-y women, or it may be
driven by more abundant type-x men in the market. The matching cost τ can summarize these
two very different sources of competition into one single number. By taking τ into account, the equilibrium
matching algorithm avoids recommending partners with whom there is only a small chance of matching.
Moreover, by avoiding recommending the “superstars” (those who have a higher market price τ),
our algorithm can effectively solve the congestion issue.
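The ranking step described above can be illustrated with a minimal Python sketch. It assumes the logit specification, under which the net utility αxz − τxz equals ln(µxz/µx0), and it solves the system in Eq. (5) with a simple fixed-point iteration on the square roots of the numbers of singles. The function names `solve_cs_equilibrium` and `recommend`, the tolerance, and the iteration cap are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def solve_cs_equilibrium(phi, n_men, n_women, tol=1e-12, max_iter=10_000):
    """Solve system (5) for (mu, mu_x0, mu_0z) by fixed-point iteration.

    phi[x, z] = alpha[x, z] + gamma[x, z] is the social surplus.
    Illustrative sketch only; names and defaults are assumptions.
    """
    phi = np.asarray(phi, dtype=float)
    n_men = np.asarray(n_men, dtype=float)
    n_women = np.asarray(n_women, dtype=float)
    kernel = np.exp(phi / 2.0)
    a = np.sqrt(n_men)      # a[x] = sqrt(mu_x0)
    b = np.sqrt(n_women)    # b[z] = sqrt(mu_0z)
    for _ in range(max_iter):
        # Men's constraint: a^2 + a * (kernel @ b) = n_men, solved for a > 0.
        s_men = kernel @ b
        a_new = np.sqrt(n_men + (s_men / 2.0) ** 2) - s_men / 2.0
        # Women's constraint: b^2 + b * (kernel.T @ a) = n_women, solved for b > 0.
        s_women = kernel.T @ a_new
        b_new = np.sqrt(n_women + (s_women / 2.0) ** 2) - s_women / 2.0
        change = max(np.abs(a_new - a).max(), np.abs(b_new - b).max())
        a, b = a_new, b_new
        if change < tol:
            break
    mu = np.outer(a, b) * kernel   # first line of Eq. (5)
    return mu, a ** 2, b ** 2

def recommend(phi, n_men, n_women, man, top_n):
    """Rank women for one man by net utility ln(mu_xz / mu_x0) and keep the top_n."""
    mu, mu_x0, _ = solve_cs_equilibrium(phi, n_men, n_women)
    net_utility = np.log(mu[man] / mu_x0[man])
    return np.argsort(-net_utility)[:top_n]

# Toy market: both men have the highest gross surplus with woman 0.
phi_toy = np.array([[2.0, 1.9],
                    [2.0, 0.0]])
ones = np.ones(2)
```

In this toy market, ranking by gross surplus alone would send both men to woman 0, whereas the equilibrium ranking sends man 0, who is nearly indifferent, to woman 1 because woman 0 is overdemanded.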
While the separable matching models are well-known, the way we apply them to the recommender
systems departs from the existing literature in three distinct ways. First, the separable matching
models are often used to explain the aggregate marriage distribution µ, and the implied matching
cost τ is often treated as a nuisance parameter in the estimation. We, on the other hand, utilize
τ to solve the congestion problem in recommender systems. Second, for the ease of estimation,
researchers often group individuals into discrete categories, e.g., the age category, when estimating
the model. Since our focus is prediction, we do not face such a practical constraint in the estimation.
We solve the matching equilibrium by treating each individual as a unique type.13 By doing so,
the matching cost will be individual-specific, not group-specific as in CS. This further prevents
13
This is a strategy advocated by Galichon et al. (2016) in the maximum likelihood estimation of separable matching
models.
recommending the same partners to everyone within the group defined by the attributes. Third,
online dating is a platform, where players can connect to many potential partners. The separable
matching models, however, are designed for one-to-one matching markets. The use of the one-to-one
matching model is a rather simplified approximation of the complex truth.
To solve the matching equilibrium, one has to first estimate the utility parameters (α, γ) and
specify the set of men and women, which we shall discuss next.
where ymw is the binary indicator of whether man m likes woman w, and d(xm , zw ) is some distance
function between the attributes of man and woman; e.g., the absolute value of difference in age. α1 is
intended to capture vertical preferences, in which users rank those attributes in the same way. α2 is
intended to capture horizontal preferences or homophily. For example, if d(xm , zw ) is the difference
in the years of schooling of man m and woman w, a negative α2 would mean the preference for
similarity in education.14 We use the fitted value α̂1 zw + α̂2 d(xm , zw ) as the deterministic utility
αmw in the matching model. Analogously, for female users, we estimate γxm zw by the following
regression:
The regressors include the user’s marital status (never married or widowed), smoking and
drinking status, occupation, religion, own evaluation of social ability, and whether one has children.
We also include whether one wants to have children, the ideal living arrangement, the ideal split of
house chores, and one’s attitude toward relationships. The distance measures include whether the two
users live in the same city and the differences in their heights and ages.
14
Similar specifications are also adopted in Hitsch et al. (2010b), Logan et al. (2008) and Galichon et al. (2016).
Since users send likes after clicking the profiles, for each man m, we restrict our sample to the set
of women whose profile has been clicked by m. This leaves us with 31,301,828 observations when
estimating Eq. (6). The same procedure also applies to each woman, leaving us with 16,333,134
observations when estimating Eq. (7). By the above sample construction, our data is a quasi-
panel, in which N equals the number of men (women), and T equals the number of women (men)
whose profiles have been clicked. Therefore, we further control for the individual fixed effect when
estimating Eq. (6) and (7).15 For example, when estimating a man’s preference, we add his fixed
effect. The fixed effect can control for the fact that some users systematically send more likes than
others, which would otherwise bias the utility estimation. Lastly, we add rich interaction terms to
account for preference heterogeneity. In particular, we divide the income, asset, education, and
physique attributes into 3, 2, 4, and 4 categories, respectively. The interaction terms include all
pairwise interactions of these 4 attributes. For instance, we interact the man’s income with the
woman’s education, the man’s education with the woman’s income, etc. Since our main goal is
prediction rather than explaining the mate preferences, as in Hitsch et al. (2010b), we do not report
the regression results here.16
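To make the fixed-effect step concrete, the following is a minimal sketch of a regression with one fixed effect per man, estimated through the within transformation (demeaning each variable by man). The function name `within_ols`, the toy regressors `z_w` and `age_gap`, and the simulated noise-free outcome are hypothetical and serve only as an illustration; the actual specification uses binary likes, many more regressors, and interaction terms.

```python
import numpy as np

def within_ols(y, X, group):
    """OLS of y on X with one fixed effect per group, via the within transformation."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    _, idx = np.unique(np.asarray(group), return_inverse=True)
    counts = np.bincount(idx).astype(float)

    def demean(v):
        sums = np.zeros((counts.size,) + v.shape[1:])
        np.add.at(sums, idx, v)  # per-group sums
        means = sums / counts.reshape((-1,) + (1,) * (v.ndim - 1))
        return v - means[idx]

    beta, *_ = np.linalg.lstsq(demean(X), demean(y), rcond=None)
    return beta

# Toy quasi-panel: 50 men, each of whom clicked 20 profiles (assumed numbers).
rng = np.random.default_rng(0)
man_id = np.repeat(np.arange(50), 20)
z_w = rng.normal(size=man_id.size)              # a woman's attribute
age_gap = np.abs(rng.normal(size=man_id.size))  # distance d(x_m, z_w)
man_effect = rng.normal(size=50)[man_id]        # some men like more often
y = man_effect + 0.5 * z_w - 0.3 * age_gap      # noise-free toy outcome
X = np.column_stack([z_w, age_gap])

beta_hat = within_ols(y, X, man_id)
alpha_hat = X @ beta_hat  # fitted deterministic utility, excluding the fixed effect
```

Because the man's fixed effect is common to all the women he clicked, it does not affect his ranking of women, which is why the sketch builds the fitted utility from the slope coefficients only.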
Next, we formally introduce our first equilibrium matching algorithm, Equilibrium Content
Filtering, which is based on the OLS-estimated preferences. Block 1 of Algorithm 1 constructs the pseudo market
that selects the relevant men and women. For a generic user $m_i$, we first find the set of women $O_1^W$
that he clicked before. $O_1^W$ are the women he might be interested in, as revealed by his browsing
history. On the other hand, those women in $O_1^W$ also clicked other men’s profiles; therefore, men in
$O_1^M$ are $m_i$’s competitors, as revealed by the clicking histories of the women in $O_1^W$. This procedure
can continue until the desired number of men and women is reached. For example, the women
in $O_2^W$ are competitors of those women in $O_1^W$. The pseudo matching market thus contains $m_i$’s
direct and indirect competitors and potential partners and these partners’ competitors. Block 2 of
our algorithm takes the estimated preferences illustrated in section 3.2 as the inputs and computes
the equilibrium introduced in section 3.1 to make the recommendations.
15
In principle, one can estimate a two-way fixed effect model to control for both the men’s and women’s individual
heterogeneity. However, our sample size goes beyond what canned software (e.g., Stata) can handle.
16
The regression results are available upon request.
Algorithm 1: Equilibrium Content Filtering Algorithm
⋄ Block 1:
input : history of clicks of all users
output : the pseudo matching market of man $m_i$: $(M(m_i), W(m_i))$
begin
  1. Find the set of women that man $m_i$ clicked: $O_1^W$.
  2. For each woman $w \in O_1^W$, randomly select $K_1$ men that $w$ clicked: $O^M(w)$.
     Set $O_1^M = \cup_{w \in O_1^W} O^M(w)$.
  3. For each man $m \in O_1^M$, randomly select $K_2$ women that $m$ clicked: $O^W(m)$.
     Set $O_2^W = \cup_{m \in O_1^M} O^W(m)$.
  repeat step 2 and step 3 until the desired number of men and women is reached
  4. return $M(m_i) = \{m_i\} \cup O_1^M$, $W(m_i) = O_1^W \cup O_2^W$
⋄ Block 2:
input : $(M(m_i), W(m_i))$ and the estimated preferences $(\alpha_{x_m z_w}, \gamma_{x_m z_w})$,
        $\forall m \in M(m_i), w \in W(m_i)$
output : N recommended women
begin
  1. Solve the matching equilibrium as defined in Eq. (4) and (5).
  2. Sort the resulting equilibrium net utility of $m_i$, $\alpha_{x_{m_i} z_w} - \tau_{x_{m_i} z_w}$, $\forall w \in W(m_i)$.
  3. return the women corresponding to the N highest net utilities for $m_i$
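Block 1 can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the authors' code: the function name `build_pseudo_market`, the dictionary-of-sets click format, the soft size caps `max_men` and `max_women`, and the choice to expand only from newly added women on each repetition are all assumptions.

```python
import random
from typing import Dict, Hashable, Set, Tuple

Clicks = Dict[Hashable, Set[Hashable]]

def build_pseudo_market(man: Hashable, men_clicks: Clicks, women_clicks: Clicks,
                        k1: int, k2: int, max_men: int, max_women: int,
                        seed: int = 0) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Grow the pseudo market of `man` from the click histories (Block 1 sketch)."""
    rng = random.Random(seed)
    men = {man}
    women = set(men_clicks.get(man, set()))   # step 1: women he clicked
    frontier = set(women)                      # women not yet expanded
    # The caps are soft: the last round may overshoot them.
    while frontier and (len(men) < max_men or len(women) < max_women):
        # Step 2: sample up to k1 men clicked by each frontier woman (rivals).
        new_men = set()
        for w in sorted(frontier, key=str):
            clicked = sorted(women_clicks.get(w, set()), key=str)
            new_men.update(rng.sample(clicked, min(k1, len(clicked))))
        new_men -= men
        men |= new_men
        # Step 3: sample up to k2 women clicked by each newly added man.
        new_women = set()
        for m in sorted(new_men, key=str):
            clicked = sorted(men_clicks.get(m, set()), key=str)
            new_women.update(rng.sample(clicked, min(k2, len(clicked))))
        new_women -= women
        women |= new_women
        frontier = new_women
    return men, women

# Toy click histories: m3 and w3 are disconnected from m1.
men_clicks_toy = {"m1": {"w1"}, "m2": {"w1", "w2"}, "m3": {"w3"}}
women_clicks_toy = {"w1": {"m1", "m2"}, "w2": {"m2"}, "w3": {"m3"}}
```

Sorting before sampling only makes the sketch reproducible for a given seed; it does not change which users are eligible.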
The rationale for a pseudo market is as follows: While in principle all registered users constitute
the matching market, it is computationally infeasible to deliver the recommended list in real time by
solving such a large-scale equilibrium problem. Further, the fact that users send likes after clicking profiles
suggests that one can restrict attention to a subset of users drawn from the clicking history.
Being the core algorithm deployed by the winning team of the Netflix Prize, Matrix Factorization
has attracted increasing attention in the domain of machine learning and recommender systems.
Here we briefly review Matrix Factorization, and we propose a novel Equilibrium Collaborative
Filtering algorithm that combines the strength of Matrix Factorization and the CS model.
Matrix Factorization (MF) assumes that ymw , user m’s rating on item w, can be approximated
by the inner product of K-dimensional latent factors pm and qw , where pm represents m’s attributes,
and qw represents w’s attributes. Since each user rates only a sparse fraction of the available items,
most of the entries in ymw are missing. MF uses these low-dimensional latent factors to impute the
missing rating data.
In our case, ymw is binary: ymw = 1 if man m liked woman w, and ymw = −1 otherwise. ymw is
missing if m did not click w’s profile.17 Formally, MF estimates the latent factors by solving the
following minimization problem:18
\[
\min_{p_m, q_w;\; m=1,\ldots,M,\; w=1,\ldots,W} \; \sum_{m=1}^{M} \sum_{w=1}^{W} (y_{mw} - p_m' q_w)^2 + \frac{\lambda_p}{2} \|p_m\|_2^2 + \frac{\lambda_q}{2} \|q_w\|_2^2, \tag{8}
\]
where ||·||2 is the ℓ2 norm. The penalization terms are added to avoid over-fitting, as in
standard ridge regression.
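A minimal sketch of how the objective in Eq. (8) can be minimized is given below, using plain stochastic gradient descent over the observed (man, woman, rating) triples. It assumes that the sums run over non-missing entries only and that the penalty is applied once per observed entry; the function name `fit_mf`, the learning rate, and the other defaults are hypothetical and do not describe the canned software used for the results.

```python
import numpy as np

def fit_mf(triples, n_men, n_women, k=2, lam_p=0.05, lam_q=0.05,
           lr=0.05, epochs=300, seed=0):
    """Fit latent factors P (men) and Q (women) by SGD on observed entries."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_men, k))
    Q = 0.1 * rng.standard_normal((n_women, k))
    triples = list(triples)
    for _ in range(epochs):
        for i in rng.permutation(len(triples)):
            m, w, y = triples[i]
            p, q = P[m].copy(), Q[w].copy()
            err = y - p @ q
            # Descent step on (y - p'q)^2 + lam_p/2 |p|^2 + lam_q/2 |q|^2.
            P[m] += lr * (2.0 * err * q - lam_p * p)
            Q[w] += lr * (2.0 * err * p - lam_q * q)
    return P, Q

# Toy data: man 0 likes woman 0 only, man 1 likes woman 1 only.
toy_triples = [(0, 0, 1.0), (0, 1, -1.0), (1, 0, -1.0), (1, 1, 1.0)]
P_hat, Q_hat = fit_mf(toy_triples, n_men=2, n_women=2)
y_hat = P_hat @ Q_hat.T  # fitted values, also available for unobserved pairs
```

The product `P_hat @ Q_hat.T` imputes a score for every pair, which is how the latent factors fill in the missing entries.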
17
In most of the applications, the percentage of the non-missing ymw is rather small. In our case, only 0.2% of ymw
is non-missing, while in the Netflix case 1% is non-missing.
18
It is possible to consider more general specifications as described in the influential paper by Koren et al. (2009).
We only consider the baseline specification due to the limitation of the available canned software, which will be discussed
later in Appendix A.
4.2. Equilibrium Collaborative Filtering
MF represents a polar case of recommender systems that do not rely on the observed user-item
attributes. However, since the estimated latent factors essentially play the same role as the observed
attributes, we argue that the fitted value $\hat{y}_{mw} = p_m' q_w$ in MF can be treated as the social surplus
matrix $\Phi_{mw}$ in the CS matching model. MF implies that the men’s and women’s j-th latent factors
are complements: $\partial^2 \hat{y}_{mw} / (\partial p_{mj} \, \partial q_{wj}) = 1$, which further implies that the fitted value $\hat{y}_{mw}$ can be interpreted
as a supermodular social surplus function.
We elaborate on this idea with the following toy example. Suppose there are 2 latent factors; the
first latent factor indicates whether one smokes (1 for smokers and -1 for nonsmokers), and the
second latent factor indicates whether one drinks (1 for drinkers and -1 for nondrinkers). Suppose
there is one man m1 with latent factor pm1 = (1, −1)′ and two women w1 and w2 with latent factors
qw1 = (1, 1)′ and qw2 = (1, −1)′ . In this example, both m1 and w2 smoke but do not drink, while w1
smokes and drinks. The fitted values for the two matches would be ŷm1 w1 = 1 × 1 + (−1) × 1 = 0
and ŷm1 w2 = 1 × 1 + (−1) × (−1) = 2. In this simple case, we can see that ŷmw is higher when both
sides have similar attributes. We summarize the Equilibrium Collaborative Filtering algorithm that
combines the MF and the CS model below:
Algorithm 2: Equilibrium Collaborative Filtering
⋄ Block 1:
input : history of clicks of all users
output : the pseudo matching market of man $m_i$: $(M(m_i), W(m_i))$
begin
  1. Find the set of women that man $m_i$ clicked: $O_1^W$.
  2. For each woman $w \in O_1^W$, randomly select $K_1$ men that $w$ clicked: $O^M(w)$.
     Set $O_1^M = \cup_{w \in O_1^W} O^M(w)$.
  3. For each man $m \in O_1^M$, randomly select $K_2$ women that $m$ clicked: $O^W(m)$.
     Set $O_2^W = \cup_{m \in O_1^M} O^W(m)$.
  repeat step 2 and step 3 until the desired number of men and women is reached
  4. return $M(m_i) = \{m_i\} \cup O_1^M$, $W(m_i) = O_1^W \cup O_2^W$
⋄ Block 2:
input : $(M(m_i), W(m_i))$ and the estimated social surplus from the MF as defined in Eq.
        (8): $\Phi_{mw} = p_m' q_w$, $\forall m \in M(m_i), w \in W(m_i)$
output : N recommended women
begin
  1. Solve the matching equilibrium as defined in Eq. (4) and (5).
  2. Sort the resulting equilibrium net utility of $m_i$, $\alpha_{x_{m_i} z_w} - \tau_{x_{m_i} z_w}$, $\forall w \in W(m_i)$.
  3. return the women corresponding to the N highest net utilities for $m_i$
5. Prediction
We compare the predictive performance of the two algorithms in terms of the hit rate, which
is defined as the probability that at least one member from the recommended list is liked. For-
mally,
Definition 1: (Hit Rate) Consider I men, {m1 , m2 , . . . , mI }, and for each man we recommend N
women. Suppose we recommend $\tilde{W}(m_i) = \{\tilde{w}_1^i, \tilde{w}_2^i, \ldots, \tilde{w}_N^i\}$ to $m_i$, and in the data, the set of women
that $m_i$ liked is observed. Let $h_i = 1$ if at least one woman in $\tilde{W}(m_i)$ belongs to that set, and
$h_i = 0$ otherwise. The hit rate is defined as $\bar{H} = \frac{1}{I} \sum_{i=1}^{I} h_i$.
Intuitively, hi measures whether the recommender system can suggest at least one woman that mi will
like. The hit rate is therefore the proportion of successful recommendations.
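The hit rate can be computed with a few lines of Python. The sketch below is illustrative; the function name `hit_rate` and the dictionary-based data layout are assumptions, not part of the paper.

```python
from typing import Dict, Hashable, Iterable

def hit_rate(recommended: Dict[Hashable, Iterable[Hashable]],
             liked: Dict[Hashable, Iterable[Hashable]]) -> float:
    """Share of men for whom at least one recommended woman is actually liked."""
    men = list(recommended)
    if not men:
        return 0.0
    hits = [1 if set(recommended[m]) & set(liked.get(m, ())) else 0 for m in men]
    return sum(hits) / len(hits)

# Toy example: m1 likes one of his two recommendations, m2 likes neither.
recommended_toy = {"m1": ["w1", "w2"], "m2": ["w3", "w4"]}
liked_toy = {"m1": {"w2", "w9"}, "m2": {"w5"}}
```

For an out-of-sample evaluation, `liked` would contain only likes that were held out from the estimation sample.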
We first discuss the out-of-sample hit rate of several recommenders for a sample of 5,000 men. We only
consider those who sent between 20 and 30 likes, which is approximately the average number of likes
in the data.19 The results are summarized in Figure 1, and the implementation details are gathered
in Appendix A. Since our algorithms are modular, we can investigate the prediction power of each
step. We consider the following recommenders:
1. OLS: for each m, choose the women who correspond to the N highest fitted values α̂mw estimated
by the procedure described in section 3.2.
2. MF: for each m, choose the women who correspond to the N highest fitted values $p_m' q_w$ estimated
by the procedure described in section 4.
3. Pseudo market: randomly choose N women from m’s pseudo market, which is defined in block
1 of Algorithm 1.
4. OLS+Pseudo market: same as OLS, except here, one chooses the top-N women from the
pseudo market instead of the whole sample.
5. MF+Pseudo market: same as MF, except here, one chooses the top-N women from the pseudo
market instead of the whole sample.
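The five recommenders above differ only in the score used and in the candidate pool, which the following sketch makes explicit. The helper names `top_n` and `random_n` and the toy scores are hypothetical; the scores stand in for either the OLS fitted values or the MF fitted values of one man.

```python
import numpy as np

def top_n(scores, n, candidates=None):
    """Indices of the n highest scores, optionally restricted to a candidate pool."""
    scores = np.asarray(scores, dtype=float)
    if candidates is None:
        pool = np.arange(scores.size)           # whole sample (recommenders 1 and 2)
    else:
        pool = np.asarray(sorted(candidates))   # pseudo market (recommenders 4 and 5)
    order = np.argsort(-scores[pool], kind="stable")
    return pool[order[:n]].tolist()

def random_n(candidates, n, rng):
    """Random draw of n women from the pseudo market (recommender 3)."""
    pool = sorted(candidates)
    return rng.choice(pool, size=min(n, len(pool)), replace=False).tolist()

# Toy fitted values of one man for five women, and his pseudo market.
scores_toy = [0.9, 0.1, 0.8, 0.7, 0.2]
pseudo_market_toy = {1, 3, 4}
```

Restricting the pool changes the recommendation even when the scores are identical, which is why the pseudo market can be evaluated as a separate module.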
We begin with the OLS and the MF recommender as the baseline cases and then add other
modules, such as the pseudo market and the CS model. The average hit rate under different N ,
the length of the recommendation list, can be found in Figure 1. We use triangles to label the
algorithms that are based on OLS, and we use circles to label the algorithms that are based on MF.
We use the dashed line to present the hit rate of the pseudo market. The algorithms are further
color-coded according to the different modules deployed within the algorithm.
Figure 1: Hit Rate
19
The extreme cases in which men liked an exceedingly small or large number of women will lead to a hit rate that
is close to zero or one, regardless of the algorithm.
We find that the plain OLS recommender (a variant of content filtering) outperforms the plain MF
recommender (a variant of collaborative filtering). In fact, the MF recommender has a prediction
power near zero for the like pattern, even when N = 20. In contrast, the OLS recommender has a
hit rate of approximately 2% when N = 10. The poor performances of both algorithms are mainly
due to the extremely sparse observations. By restricting to the pseudo market, even random draws
of recommendations perform much better than OLS or MF alone. When N = 10, the hit rate
is slightly above 20%. If we combine pseudo market and the OLS approach (the green triangle),
when N = 10, the hit rate increases by a factor of 1.8 compared with the random draw from the
pseudo-market. The combination of the pseudo market and MF (the green circle) doubles the hit
rate under the random draw from the pseudo market. Lastly, we add the TU matching model of
CS to make the recommendation. The Equilibrium Content Filtering (OLS+pseudo market+CS,
labeled by the blue triangle) further improves the hit rate of the OLS+pseudo market Algorithm,
whereas the Equilibrium Collaborative Filtering (MF+pseudo market+CS, labeled by the blue
circle) decreases the hit rate of the MF+pseudo market Algorithm. Overall, the hit rate curves of
these two equilibrium recommender algorithms are similar and outperform the randomized draws
from the pseudo market.
6. Congestion: Counterfactual Welfare Analysis
Both content and collaborative filtering can lead to congestion—the recommendation lists con-
centrate on only a few users. The content filtering utilizes user-item attributes to make predictions;
therefore, men who share similar attributes will be recommended to similar women by construction.
The collaborative filtering, such as the nearest neighbor approach, utilizes the choice patterns to
make predictions; consequently, men who are within the same neighborhood will be recommended
to similar women. On one hand, a highly clustered recommendation list may increase the dating
competition. On the other hand, congestion also implies that the dating chance is not evenly al-
located by the algorithm. If potential users realize that certain algorithms limit their opportunity
to be recommended, it may discourage them from joining the platform, which can undermine the
platform’s long-term revenue. In this section, we attempt to measure the degree of congestion and
its welfare consequences. We find that deploying the CS model can alleviate the congestion problem.
We present Monte Carlo evidence in this section.
6.1. Coverage
First, we consider a measure for the coverage rate of recommendation, which is defined as the
number of the distinct recommended women divided by the number of recommendations.
Definition 2: (Coverage Rate of Recommendation) Suppose we recommend $\tilde{W}(m_i) = \{\tilde{w}_1^i, \tilde{w}_2^i, \ldots, \tilde{w}_N^i\}$
to $m_i$, $i = 1, \ldots, I$. Define the coverage rate of the top-$N$ recommendations as
$c(N) = \frac{1}{I \times N} \left| \bigcup_{i \in I} \tilde{W}(m_i) \right|$.
Clearly, $c(N)$ is a decreasing function of congestion. In the case of a top-1 recommendation, suppose
every man is recommended to the same woman: $\tilde{w}_1^i = \tilde{w}_1^{i'}$ $\forall i, i'$; then $c(1) = \frac{1}{I} \times 1 = \frac{1}{I}$. On the
other hand, if every $m$ is recommended to a distinct $w$, then $\left| \bigcup_{i \in I} \tilde{w}_1^i \right| = I$ and, hence, $c(1) = 1$.
Therefore, the higher the coverage rate, the more diversified the recommendation lists are.
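The two polar cases in the definition can be checked with a short sketch. The function name `coverage_rate` and the toy lists are hypothetical; the computation simply counts distinct recommended women and divides by the number of recommendations, I × N.

```python
def coverage_rate(recommended, n):
    """Distinct recommended women divided by the total number of recommendations.

    recommended: dict mapping each man to his ordered recommendation list.
    n:           length of the top-n list used for every man.
    """
    distinct = set()
    for women in recommended.values():
        distinct.update(women[:n])
    return len(distinct) / (len(recommended) * n)


# Full congestion: every man is shown the same woman, so c(1) = 1/I.
congested = {"m1": ["w1"], "m2": ["w1"], "m3": ["w1"]}
# No congestion: every man is shown a different woman, so c(1) = 1.
diversified = {"m1": ["w1"], "m2": ["w2"], "m3": ["w3"]}

c_congested = coverage_rate(congested, 1)
c_diversified = coverage_rate(diversified, 1)
```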
For all simulations in this section, we remove the pseudo-market module.20 We randomly
draw 500 markets, with each market containing 1,000 men and 305 women (the gender ratio in
the data). We use the OLS preference estimates and the MF latent factors obtained from the
whole sample to compute the coverage rates of the top-1 recommenders produced by OLS, MF,
Equilibrium Content Filtering, and Equilibrium Collaborative Filtering.
20
The pseudo market is personalized: everyone faces a distinct matching market in the algorithm. This leads to
some difficulty in the counterfactual simulations. As will become clear later, the (simplified) Gale-Shapley algorithm
requires a common matching market.
The results are shown in Figure 2. The coverage rate of OLS (the green bar in the right panel)
is close to 10%, whereas the coverage rate of MF (the green bar in the left panel) is approximately
25%. This is evidence that the non-equilibrium-based algorithms can produce highly clustered
recommendation lists. Moreover, MF has a higher coverage rate than the OLS algorithm. This is
because the MF approach implicitly estimates individual-level coefficients. The richer preference
heterogeneity allowed in MF thus translates into more diversified recommendation lists.21
While it is possible to deploy more complex models to increase the coverage rate, doing so
generally leads to over-fitting. However, we show that the matching model can improve the
coverage rates without increasing the complexity of the model. In Figure 2, the coverage rates of
both OLS and MF (the blue bars) increase severalfold. In particular, MF almost achieves full
coverage when combined with the CS model.
21
For content filtering, it is possible to consider random coefficient regressions to achieve similar effects. We leave
this as a future endeavor.
6.2. Congestion
We further conduct a simulation study to understand how highly clustered recommendation
lists affect congestion. We apply the classic Gale-Shapley (GS) algorithm (Gale and
Shapley (1962)), with some modifications to fit the online dating market.22 The simplified Gale-
Shapley algorithm works as follows:
1. Each man contacts the first woman in the recommendation list generated by the recommender
algorithm.
2. Each woman replies to the best man who contacted her, according to the utility estimated by
OLS in Eq. (7), and ignores all others. Additionally, if the best man gives the woman negative
utility, she does not reply to anyone.
3. The men who received a response from some woman drop out of the market. The men who were
ignored continue by contacting the second-best woman in their recommendation lists.
4. Each woman replies to the best man from the new contact list and ignores all others.
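The steps above can be sketched as a small simulation. This is an illustrative reading of the procedure, not the authors' code: the names `simplified_gs` and `rounds_to_clear`, the tie-breaking rule, and the toy utilities are assumptions. Women are never removed, so they can reply to a new man in every round.

```python
def simplified_gs(rec_lists, utility, max_rounds=100):
    """Return the number of men who receive a reply in each round.

    rec_lists: dict mapping each man to his ordered recommendation list.
    utility:   dict mapping (woman, man) to the woman's estimated utility.
    """
    active = set(rec_lists)  # men still waiting for a reply
    matches_per_round = []
    for r in range(max_rounds):
        # Each active man contacts the r-th woman on his list, if he has one left.
        inbox = {}
        for m in sorted(active):
            if r < len(rec_lists[m]):
                inbox.setdefault(rec_lists[m][r], []).append(m)
        if not inbox:  # every remaining list is exhausted
            break
        # Each woman replies to her best contact unless his utility is negative.
        replied = set()
        for w, men in inbox.items():
            best = max(men, key=lambda m: utility[(w, m)])
            if utility[(w, best)] >= 0:
                replied.add(best)
        matches_per_round.append(len(replied))
        active -= replied  # men with a reply drop out of the market
        if not active:
            break
    return matches_per_round


def rounds_to_clear(matches_per_round, n_men, share=0.95):
    """First round by which at least `share` of the men have received a reply."""
    cleared = 0
    for r, k in enumerate(matches_per_round, start=1):
        cleared += k
        if cleared >= share * n_men:
            return r
    return None


# Toy market (hypothetical): every woman ranks m1 > m2 > m3, all utilities positive.
men, women = ["m1", "m2", "m3"], ["w1", "w2", "w3"]
utility = {(w, m): 3 - i for w in women for i, m in enumerate(men)}
clustered = {m: ["w1", "w2", "w3"] for m in men}
diversified = {"m1": ["w1", "w2", "w3"], "m2": ["w2", "w3", "w1"], "m3": ["w3", "w1", "w2"]}

clustered_path = simplified_gs(clustered, utility)      # one reply per round
diversified_path = simplified_gs(diversified, utility)  # everyone matched at once
```

In the clustered case all three men compete for the same woman in each round, so the market needs three rounds to clear; with diversified lists it clears in one. The same mechanism drives the round counts discussed later in this section.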
The simplified GS algorithm is only designed to simulate how fast a man can find a woman who
is willing to reply to him. It has nothing to do with stability or strategy-proofness, as studied in the
classic matching theory. Several remarks are in order. First, men are assumed to be myopic: they
simply follow their recommendation lists. Here, we abstract from the fact that in the real world, men
can both conduct their own search and consider the machine-generated recommendations. On the
other hand, women use the estimated preference parameters to determine to which men they reply.
This asymmetric behavioral assumption is merely a way to isolate the congestion effect from the
clustered recommendation lists of one side of the market. Second, there is no “deferred-acceptance”
phase as in the classic GS algorithm. The platform we study lets participants
send likes and exchange messages but does not enforce “one-to-one” matching. Therefore, as
soon as a man receives feedback from a woman, the purpose of the platform is achieved, and we
can remove that man to simplify the congestion calculation. Moreover, the dating platform is not
a real one-to-one marriage market; hence, at each round, there is no need for women to compare
the existing offers with the new proposals as in the classic GS algorithm. We therefore assume
that women keep in touch with the men to whom they previously replied (those who are dropped
from the simplified GS algorithm), and they continue to consider other men from the new contact
lists.
22
As pointed out by Che and Koh (2016), economists have yet to develop a benchmark model to analyze the issue
of congestion. Several attempts have been made, including those of Che and Koh (2016) and Galichon and Hsieh
(2017), and the references therein. However, it is unclear how to apply those models in our setup.
Figure 3 displays the boxplot of the number of matches made in the first round over the 500
simulated markets. We focus on MF and OLS without CS first. The median numbers of matches
for MF and OLS are 75 and 25, respectively. Since we have 1,000 men in the simulated market,
approximately 7.5% and 2.5% of the men find a dating partner in the first round. This pattern is
consistent with the coverage rates reported in Figure 2.23 The interquartile ranges of the boxplots
of OLS and MF are tight, suggesting little Monte Carlo variation. When combined with the CS
model, the numbers of matches made in the first round increase several times for both MF and
OLS. For MF+CS, the median is more than 400, or 40% of the men in the market. For OLS+CS,
the median can also be as high as 200, or 20% of the men in the market. In other words, deploying
the CS model creates roughly 5 times as many matches for MF and 8 times as many for OLS.
Figure 3
23
Generally, the numbers in these two graphs will not be equal because in the simplified GS algorithm, matches are
made only when women accept the offers.
We further summarize the average number of matches made (that is, the number of men who
drop out of the algorithm) in each of the first 10 rounds in Figure 4. The green lines correspond
to the MF and OLS algorithms without CS, whereas the blue lines correspond to the results combined with
CS. The market clears much faster in the presence of the CS module; the blue lines exhibit an
exponentially decreasing pattern as they successfully match more pairs in the first few rounds. By
contrast, the green lines are relatively flat; the less-diversified lists limit the number of matches at
each round.
Figure 4
Lastly, Figure 5 presents the boxplot of rounds needed to clear 95% of the men in the
market.24 The medians suggest that it takes approximately 10 rounds for MF and 15 rounds for OLS to
clear 95% of the market.25 When combined with CS, both MF and OLS clear the market much
faster. The CS module effectively reduces the number of rounds by a factor of two: it takes fewer
than 5 rounds for MF + CS and approximately 6 rounds for OLS + CS.
Figure 5
In summary, we find that deploying the CS model can substantially accelerate the matching
process, approximated by the simplified Gale-Shapley algorithm. The recommendations are more
diversified and the number of rounds required is only half of that without the CS module. These
results suggest that one can mitigate the congestion problem by deploying the CS model.
24
We truncate it at 95% due to the tail behavior of the simplified Gale-Shapley algorithm: there exist a few
unattractive men who require excessive rounds to fully clear the market.
25
While the interquartile ranges are tight, there are several outliers. We find that these outliers correspond to the
case when women are more “selective,” namely, when the utilities of the recommended men are all negative.
We suggest using the separable matching model in conjunction with content or collaborative
filtering, which were originally designed for one-sided recommenders. Separable matching
models are easy to deploy in practice. They are related to optimal transport theory, for which
several high-performance algorithms exist. Moreover, they can blend seamlessly into either content
or collaborative filtering without increasing the complexity of the underlying machine-learning
algorithm.26 The resulting equilibrium recommenders can accommodate two-sided preferences and
rivalry in online dating. In particular, we find that they can improve the hit rate of the baseline OLS
recommender.
On the other hand, the equilibrium recommenders help solve the issue of congestion. Interestingly,
despite the efforts to make matching more efficient, the platform’s incentives might not be
aligned with those of its users. Currently, many platforms charge a monthly subscription fee for usage;
therefore, a platform may not have the incentive to reduce congestion or minimize the time users
spend on the platform. Two-sided platforms face a trade-off between the length of a user’s
subscription and the platform’s reputation for making matches, two conflicting goals in terms of
congestion. Platforms may choose a less congested algorithm under an alternative pricing model, such
as a “matching guarantee program”: the platform charges a one-time fee that covers the service until
a match is found. We leave this for future research.
26
One can construct a recommender with a combination of (1) any machine-learning algorithm that produces an
estimate or an approximation to the gross utility, and (2) any matching model that produces rankable equilibrium
utility indices. This combination of machine-learning algorithms and matching models is therefore a “tuning
parameter” that can be optimized in practice.
References
Akehurst, J., I. Koprinska, K. Yacef, L. A. S. Pizzato, J. Kay, and T. Rej (2011): “CCR-A
Content-Collaborative Reciprocal Recommender for Online Dating,” in IJCAI, 2199–2204.
Becker, G. S. (1973): “A theory of marriage: part I,” Journal of Political Economy, 81, 813–846.
Brozovsky, L. and V. Petricek (2007): “Recommender System for Online Dating Service,” arXiv:
cs/0703042.
Che, Y.-K. and Y. Koh (2016): “Decentralized College Admissions,” Journal of Political Economy, 124,
1295–1337.
Chiappori, P.-A. and B. Salanié (2016): “The Econometrics of Matching Models,” Journal of Economic
Literature, 54, 832–861.
Chiappori, P.-A., B. Salanié, and Y. Weiss (2015): “Partner Choice and the Marital College Premium:
Analyzing Marital Patterns Over Several Decades,” working paper.
Choo, E. and A. Siow (2006): “Who Marries Whom and Why,” Journal of Political Economy, 114,
175–201.
Dagsvik, J. K. (2000): “Aggregation in Matching Markets,” International Economic Review, 41, 27–57.
Decker, C., E. H. Lieb, R. J. McCann, and B. K. Stephens (2013): “Unique Equilibria and Substi-
tution Effects in a Stochastic Model of the Marriage Market,” Journal of Economic Theory, 148, 778–792.
Gale, D. and L. S. Shapley (1962): “College Admissions and the Stability of Marriage,” The American
Mathematical Monthly, 69, 9–15.
Galichon, A. and Y.-W. Hsieh (2017): “A Model of Decentralized Matching without Transfers,” working
paper.
Galichon, A., S. Kominers, and S. Weber (2016): “Costly Concessions: An Empirical Framework for
Matching with Imperfectly Transferable Utility,” Forthcoming, Journal of Political Economy.
Galichon, A. and B. Salanié (2015): “Cupid’s Invisible Hand: Social Surplus and Identification in
Matching Models,” working paper.
Graham, B. S. (2011): Handbook of Social Economics, North-Holland, vol. 1, chap. Econometric Methods
for the Analysis of Assignment Problems in the Presence of Complementarity and Social Spillovers, 965–
1052.
Hitsch, G. J., A. Hortacsu, and D. Ariely (2010a): “Matching and Sorting in Online Dating,” American
Economic Review, 100, 130–163.
——— (2010b): “What Makes You Click? Mate Preferences In Online Dating,” Quantitative Marketing and
Economics, 8, 393–427.
Juan, Y.-C., W.-S. Chin, Y. Zhuang, B.-W. Yuan, M.-Y. Yang, and C.-J. Lin (2016): LIBMF: A
Matrix-factorization Library for Recommender Systems.
Koren, Y., R. Bell, and C. Volinsky (2009): “Matrix factorization techniques for recommender sys-
tems,” Computer, 30–37.
Krzywicki, A., W. Wobcke, X. Cai, A. Mahidadia, M. Bain, P. Compton, and Y. S. Kim (2010):
“Interaction-based collaborative filtering methods for recommendation in online dating,” in International
Conference on Web Information Systems Engineering, Springer, 342–356.
Kunegis, J., G. Gröner, and T. Gottron (2012): “Online dating recommender systems: The split-
complex number approach,” in Proceedings of the 4th ACM RecSys workshop on Recommender systems
and the social web, ACM, 37–44.
Logan, J. A., P. D. Hoff, and M. A. Newton (2008): “Two-Sided Estimation of Mate Preferences
for Similarities in Age, Education, and Religion,” Journal of the American Statistical Association, 103,
559–569.
Malinowski, J., T. Keim, O. Wendt, and T. Weitzel (2006): “Matching People and Jobs: A Bilateral
Recommendation Approach,” in Proceedings of the 39th Annual Hawaii International Conference on System
Sciences-Volume 06, IEEE Computer Society, 137–3.
McFadden, D. (1976): Behavioral Travel-Demand Models, Heath and Co., chap. The Mathematical Theory
of Demand Models, 305–314.
Menzel, K. (2015): “Matching Markets as Two-Sided Demand Systems,” Econometrica, 83, 897–941.
Mourifié, I. and A. Siow (2017): “The Cobb Douglas Marriage Matching Function: Marriage Matching
with Peer and Scale Effects,” working paper.
Qiu, Y., C.-J. Lin, Y.-C. Juan, W.-S. Chin, Y. Zhuang, B.-W. Yuan, and M.-Y. Yang (2017):
recosystem: Recommender System using Matrix Factorization, r package version 0.4.2.
Richards, D., M. Taylor, and P. Busch (2008): “Expertise recommendation: A two-way knowledge
communication channel,” in Autonomic and Autonomous Systems, 2008. ICAS 2008. Fourth International
Conference on, IEEE, 35–40.
Shapley, L. S. and M. Shubik (1972): “The Assignment Game, I: The Core,” International Journal of
Game Theory, 1, 111–130.
Appendix A. Software and Numerical Details
To construct the pseudo market used in both Algorithms 1 and 2, we set K1 = 5 (see footnote 27) and K2 = 3. We further
perform an additional random sampling to ensure the gender ratio in the pseudo market coincides with that
of the whole sample. Under these tuning parameters, there are approximately 264 men and 81 women in the
pseudo market.
To solve the MF problem, we use recosystem (see footnote 28), which is an R wrapper of the LIBMF library
developed by Juan et al. (2016). LIBMF uses a novel parallel stochastic gradient descent algorithm to
efficiently solve the high-dimensional optimization problem defined in Eq. (8). We use the tuning function
provided by recosystem to select the user-chosen parameters. We choose K = 15 latent factors for both
men and women, and we choose the regularization parameters λp = λq = 0.0001. We choose the following
values for the parameters associated with the numerical optimization: the learning rate is 0.01, the number
of threads for parallel computation is 8, the number of bins is 100, and the maximum number of iterations is
200,000.
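For readers unfamiliar with matrix factorization, the following is a minimal single-threaded sketch of stochastic gradient descent on a squared-error objective with L2 penalties. It is not LIBMF or recosystem, and it assumes, without access to Eq. (8) here, that the objective takes this standard form. The function name `mf_sgd`, the toy observations, and the number of epochs are hypothetical; only K = 15, the penalty 0.0001, and the learning rate 0.01 are taken from the text.

```python
import random


def mf_sgd(obs, n_men, n_women, k=15, lam=1e-4, lr=0.01, epochs=1000, seed=0):
    """Fit latent factors P (men) and Q (women) so that P[m] . Q[w] approximates y.

    obs: list of (m, w, y) triples with integer indices m, w and outcome y.
    """
    rng = random.Random(seed)
    P = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_men)]
    Q = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_women)]
    obs = list(obs)
    for _ in range(epochs):
        rng.shuffle(obs)  # visit the observed cells in random order
        for m, w, y in obs:
            pm, qw = P[m], Q[w]
            err = y - sum(a * b for a, b in zip(pm, qw))
            for f in range(k):
                pf, qf = pm[f], qw[f]
                # Gradient step on squared error plus L2 penalty.
                pm[f] += lr * (err * qf - lam * pf)
                qw[f] += lr * (err * pf - lam * qf)
    return P, Q


def mse(obs, P, Q):
    """Mean squared error of the fitted values on the observed cells."""
    errs = [(y - sum(a * b for a, b in zip(P[m], Q[w]))) ** 2 for m, w, y in obs]
    return sum(errs) / len(errs)


# Toy like matrix (hypothetical): 1 = like, 0 = no like, unobserved cells omitted.
obs = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1), (2, 0, 1), (2, 2, 0)]
P, Q = mf_sgd(obs, n_men=3, n_women=3)
fit_error = mse(obs, P, Q)
```

The fitted value for a pair (m, w) is the inner product of row m of `P` and row w of `Q`, which is the quantity used to rank women in the MF recommender.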
We use TraME, an R library developed by Galichon and O’Hara (2016–2018+), to solve the matching
equilibrium. We treat each individual as a unique type, and hence the number of each type defined in Eq. (4)
is set to one: $n_x^M = n_z^W = 1$. Notice that the CS model depends only on the social surplus matrix $\Phi_{mw}$, as seen
in Eq. (5). For Algorithm 1, $\Phi_{mw} = \left( \alpha_1 z_w + \alpha_2 d(x_m, z_w) \right) + \left( \gamma_1 x_m + \gamma_2 d(z_w, x_m) \right)$, where $(\alpha, \gamma)$ are estimates
from the fixed effect regression described in section 3.2. For Algorithm 2, $\Phi_{mw} = p_m' q_w$. The solver returns
the men’s equilibrium net utility $U_{mw} \equiv \alpha_{mw} - \tau_{mw}$ and the women’s equilibrium net utility $V_{mw} \equiv \gamma_{mw} + \tau_{mw}$,
where $\Phi_{mw} = U_{mw} + V_{mw}$. We use $U_{mw}$ to rank the potential partners for both algorithms.
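To give a sense of what such a solver computes, here is a minimal sketch under a stated assumption: the standard logit Choo and Siow specification with singles, in which $\mu_{mw} = \sqrt{\mu_{m0}\,\mu_{0w}}\,\exp(\Phi_{mw}/2)$ and every type has mass one. The sketch uses a simple iterative proportional fitting scheme; it is not TraME's code, and whether TraME uses this scheme internally is not claimed. The function name `choo_siow_ipfp` and the toy surplus matrix are hypothetical.

```python
import math


def choo_siow_ipfp(phi, tol=1e-12, max_iter=10_000):
    """Solve the logit TU matching equilibrium with one individual per type.

    phi: list of rows, phi[m][w] is the joint surplus of man m and woman w.
    Returns (mu, U, V, mu_m0, mu_0w): match probabilities, net utilities of
    men and women, and the probabilities of staying single.
    """
    n_m, n_w = len(phi), len(phi[0])
    K = [[math.exp(phi[m][w] / 2.0) for w in range(n_w)] for m in range(n_m)]
    a = [1.0] * n_m  # a[m] = sqrt(mu_m0)
    b = [1.0] * n_w  # b[w] = sqrt(mu_0w)

    def solve(s):
        # Positive root of x^2 + s*x = 1, written to avoid cancellation.
        return 2.0 / (math.sqrt(s * s + 4.0) + s)

    for _ in range(max_iter):
        a_new = [solve(sum(K[m][w] * b[w] for w in range(n_w))) for m in range(n_m)]
        b_new = [solve(sum(K[m][w] * a_new[m] for m in range(n_m))) for w in range(n_w)]
        change = max(
            max(abs(x - y) for x, y in zip(a_new, a)),
            max(abs(x - y) for x, y in zip(b_new, b)),
        )
        a, b = a_new, b_new
        if change < tol:
            break

    mu = [[a[m] * K[m][w] * b[w] for w in range(n_w)] for m in range(n_m)]
    # U = log(mu_mw / mu_m0) and V = log(mu_mw / mu_0w), so that U + V = phi.
    U = [[phi[m][w] / 2.0 + math.log(b[w]) - math.log(a[m]) for w in range(n_w)] for m in range(n_m)]
    V = [[phi[m][w] / 2.0 + math.log(a[m]) - math.log(b[w]) for w in range(n_w)] for m in range(n_m)]
    return mu, U, V, [x * x for x in a], [x * x for x in b]


# Toy surplus matrix (hypothetical): three men, two women.
phi = [[1.0, 0.2], [0.3, 0.8], [0.0, 0.5]]
mu, U, V, mu_m0, mu_0w = choo_siow_ipfp(phi)
# Rank women for man 0 by his equilibrium net utility U[0][w].
ranking_m0 = sorted(range(len(phi[0])), key=lambda w: U[0][w], reverse=True)
```

Under this specification, a man's ranking of women depends on $\Phi_{mw}/2 + \log \sqrt{\mu_{0w}}$. A woman in high demand has a low probability of staying single, so she is penalized in every man's list, which is one way to see how the equilibrium step spreads recommendations out.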
We use the following bootstrap sampling procedure to construct the training and testing samples when comparing
the out-of-sample hit rate of the Equilibrium Content Filtering algorithm:
1. Randomly draw mi
2. Construct his pseudo market (M (mi ), W (mi ))
3. The pseudo market of mi is the testing sample. The rest of the observations are the training sample
for estimating (α, γ).
4. Use the estimated regression coefficient to impute Φmw for m ∈ M (mi ) and w ∈ W (mi ).
5. Solve the matching equilibrium and compute hi as defined in Definition 1.
6. Repeat I times to calculate the empirical hit rate H̄ as defined in Definition 1.
Regarding the MF method, it is not possible to split the sample by completely excluding people in the
pseudo market. Unlike the regression-based approach, in which the utility parameters in the testing sample
can be extrapolated from the training sample, the MF requires that some non-missing observations in any
row and any column of ymw be supplied to the algorithm. While repeated OLS fitting and model-solving in
the bootstrap sampling is computationally cheap, this is not the case for MF. To avoid repeated training, we
use all available observations to train the MF model. When we make the recommendation from the pseudo
market as in the Equilibrium Collaborative Filtering, it is inevitable that all the women in the pseudo market
have been used in the training phase. As a result, the hit rate of the MF-related methods reported in Figure
1 is in fact an in-sample prediction. To investigate this potential issue, we further use only 50% and 20% of
the available ymw to train the MF model. In these cases, only some of the women in the pseudo market
are used in the training phase. However, the results remain similar.
27
If w clicked fewer than K1 men, then we include all these men.
28
See Qiu et al. (2017).
Appendix B. Tables
Table 1: Raw Comparison