Words Matter: The Role of Readability, Tone, and Deception Cues in Online Credit Markets
Mingfeng Lin
Georgia Institute of Technology Scheller College of Business
[email protected]
Richard Sias
University of Arizona Eller College of Management
[email protected] (corresponding author)
Abstract
Using debt crowdfunding data, we investigate whether borrowers’ writing style is associated with online lender and borrower behaviors, whether the information contained in linguistic style can mitigate information asymmetry in peer-to-peer markets, and whether online investors correctly interpret the economic value of written texts. Peer-to-peer lenders bid more aggressively on, are more likely to fund, and charge lower rates to online borrowers whose writing is more readable, more positive, and contains fewer deception cues. Moreover, such borrowers are less likely to default. Online investors, however, fail to fully account for the information contained in borrowers’ writing.
Words are free. It’s how you use them that may cost you.
– Unknown
I. Introduction
FinTech has brought disruptive change to the finance industry. For instance, crowdfunding (online markets that allow suppliers of capital and demanders of capital to connect directly) has experienced truly dramatic growth. Debt crowdfunding’s loan origination volume grew an astonishing 84% per quarter between 2007 and 2014 (Federal Reserve Bank of Cleveland (2015)) and is expected to reach $559 billion in the next 7 years (Khan, Goswami, and Kumar (2020)).
This article was previously circulated under the titles “Linguistic Features and Peer-to-Peer Loan Quality: A Machine Learning Approach” and “Economic Value of Texts in Online Debt Crowdfunding.” We thank an anonymous referee, Hendrik Bessembinder (the editor), and seminar participants at Carnegie Mellon University, Georgia Institute of Technology, Michigan State University, National Taiwan University, Shanghai University of Finance and Economics, Tsinghua University, University of Arizona, University of California San Diego, and University of Rochester, as well as conference participants at the 2014 Berkeley Crowdfunding Symposium, the Board of Governors of the Federal Reserve System’s 2016 Financial Innovations Conference, the 2014 Winter Conference on Business Intelligence, the INFORMS 2014 Annual Meeting, the INFORMS 2014 Conference on Information Systems and Technology, and the 2017 Marketplace Innovation Conference at Stanford University, for their valuable comments and suggestions.
There are now both mutual funds (e.g., Symfonie P2P Lending Fund) and ETFs (e.g., LEND) investing in peer-to-peer loans. Despite their growing importance, we know relatively little about how these markets function.
In this study, we focus on one dimension of the crowdfunding market: whether investors in an online marketplace can, and do, use information captured by individuals’ writing style when making decisions regarding the author. Specifically, we examine the informational content of writing by potential borrowers on an online debt crowdfunding platform (Prosper.com). As Iyer, Khwaja, Luttmer, and Shue ((2016), p. 1556) note, the scarcity of work on the role of writing in financial decisions arises because, “In general, coding soft information is challenging because it is difficult to quantify the information content of pictures or lengthy personal text descriptions.” We overcome this hurdle by exploiting a
combination of well-developed computational linguistics measures and advances
in machine learning to examine the roles of linguistic style (i.e., how borrowers
write). Specifically, motivated by previous work in finance, accounting, and com-
putational linguistics, we focus on three linguistic dimensions – readability, tone,
and deception cues – to investigate three issues: i) Do lenders use the information
contained in borrowers’ readability, tone, and deception cues when deciding which
loans to bid on, which loans to fund, and what rates to charge? ii) Do these three
linguistic dimensions reveal information regarding a borrower’s default likelihood? and iii) Do lenders fully incorporate the information that can be gleaned from these three dimensions of the prospective borrower’s writing?
Our first set of tests reveals evidence consistent with the hypothesis that
writing style influences lenders’ behavior. Specifically, controlling for verified hard
credit information (e.g., credit grade, number of open credit lines), unverified hard
credit information (e.g., employment status), and easily quantifiable nonstandard
information (e.g., the maximum rate a borrower is willing to pay), the likelihood of
funding, the number of bids, and the total dollar value of bids are all positively
related to the loan request’s readability and positive tone, but negatively related to
the number of deception cues in the loan request. The results are also economically meaningful. For instance, a 1-standard-deviation higher level of readability, more positive tone, or lower level of deception cues is associated with a 3.0%, 11.1%, and 2.5% increase, respectively, in funding likelihood relative to the mean funding likelihood.
Consistent with increased lender competition for borrowers whose writing is
more readable, more positive, and contains fewer deception cues, borrowers with
these characteristics enjoy lower market clearing rates. Although the results are
statistically significant for only one of the three metrics (when based on a two-tail
test; when based on a one-tail test, all three are at least marginally different from 0),
the economic magnitudes are relatively small – controlling for credit information
and auction characteristics, 1-standard-deviation higher readability, positive tone,
or lower deception cues is associated with, respectively, a 4, 11, and 3 basis point
reduction in the interest rate offered to the borrower.
Our second set of tests investigates whether the evidence suggests lenders
respond to borrowers’ writing because lenders extract information from linguistic
features (e.g., lenders are more likely to bid on loans by borrowers with more
readable texts because, ceteris paribus, such borrowers are less likely to default) or
because lenders irrationally respond to borrowers’ writing (e.g., borrowers with
https://doi.org/10.1017/S0022109022000850 Published online by Cambridge University Press
Gao, Lin, and Sias 3
more readable texts pay lower rates, but are just as likely to default as otherwise
similar borrowers). Consistent with the former, we provide the first direct evidence
(of which we are aware) that borrowers’ writing style contains information that can
be used to help explain default likelihood.1 Specifically, controlling for verified
hard credit information, unverified hard credit information, and easily quantifiable
nonstandard information, default likelihood is inversely related to readability and
positive tone, but positively related to deception cues. The results are also economically meaningful. For instance, controlling for other factors, a 1-standard-deviation higher level of deception cues is associated with a 5.70% higher default likelihood relative to the mean (statistically significant at the 1% level).
Our final set of tests examines whether the evidence suggests crowdfunding
lenders fully use the informational content associated with the linguistic features we
evaluate. Because lenders’ expected return is the rate charged to the borrowers less
the impact on returns due to losses from default, the rate charged to a borrower
should be a direct function of the borrower’s default risk.2 As a result, if lenders
fully incorporate the available information regarding default likelihood into the
rates they are willing to offer borrowers, factors beyond the interest rate should not
play a role in predicting default. Using this framework, we test whether the evidence
suggests lenders fully incorporate the information contained in the borrowers’
writing style. That is, lenders may overweight (e.g., charge too low a rate to a
borrower with a highly readable description) or underweight (e.g., although the
highly readable borrower receives a lower rate, the rate is still too high given the
borrower’s reduced default likelihood) linguistic features.
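To see why the relevant regressor is 1/(1 + r), consider a stylized zero-profit condition; this is our simplification, not Iyer et al.’s (2016) full model. Suppose a risk-neutral lender with required gross return k funds a loan that repays (1 + r) with probability (1 − p) and nothing with probability p. Then

(1 − p)(1 + r) = k, which implies p = 1 − k/(1 + r),

so default probability is linear in 1/(1 + r) (see footnote 2). If lenders fully price the information in borrowers’ writing, the linguistic measures should add no explanatory power for default once 1/(1 + r) is included.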
Our empirical tests suggest that although borrowers’ writing appears to influ-
ence lenders’ decisions, lenders fail to fully incorporate the informational content
that can be garnered from borrowers’ readability, tone, and deception cues. For
instance, although a borrower whose writing is less readable pays a higher interest
rate (on average) than an otherwise similar borrower, the rate is not high enough to
fully offset the increased default likelihood associated with lower readability. That
is, holding the interest rate constant and controlling for other credit and auction
characteristics, a borrower with a less readable loan description is more likely to
default. The results are especially strong for deception cues – holding interest rates, credit information, and auction characteristics constant, a 1-standard-deviation higher level of deception cues is associated with a 5.51% increase in default likelihood (relative to the mean; statistically significant at the 1% level).
Although our results are consistent with the hypothesis that linguistic dimen-
sions influence both lenders’ decisions and borrowers’ behaviors, it is possible that
the linguistic dimensions are associated with other variables that drive the relations.
For example, as detailed in the next section, work suggests that the characteristics
of photos posted on Prosper predict both lender and borrower behavior. A series of
robustness tests, however, suggest that linguistic style and photos offer unique infor-
mational content. For example, we largely find the same relations (for readability, tone,
and deception cues) when limiting the sample to listings and loans without photos
1 We discuss related literature in the next section.
2 For example, Iyer et al.’s (2016) model yields a direct linear relation between default probability and 1/(1 + r), where r is the rate charged to a borrower.
or when limiting the sample to listings and loans with photos and accounting for
image characteristics (gender, race, positive emotion, and attractiveness).
As a final robustness test, we conduct a series of 7 quasi-experimental studies
using Amazon Mechanical Turk (AMT) workers to help ensure that lenders are
responding to linguistic dimensions rather than other potentially correlated vari-
ables. Specifically, we use pairs of texts from our data that differ in readability (but
hold loan purpose, length, tone, and deception cues approximately constant) and
ask AMT workers to identify the request they perceive as more readable. We
conduct analogous experiments to test if AMT workers perceive texts to differ in
tone and deception. In the second set of tests, we present the more and less readable
texts and ask workers which loan they would fund with their money. We also repeat
these tests for loan pairs partitioned by positive or negative tone and high or low deception cues. Both sets of tests support our hypotheses – AMT workers both recognize differences in readability, tone, and deception cues, and these differences impact their lending decisions.
In sum, our results are consistent with the hypothesis that lenders are more
likely to bid on loans from borrowers with more readable texts, with a more positive
tone, and fewer deception cues because such borrowers are less likely to default.
Nonetheless, much of the information contained in borrowers’ writing style
remains unincorporated – although a borrower whose writing is less readable, less
positive, and contains more deception cues pays a higher rate, the rate is not high
enough to fully offset the additional default risk.
Our research contributes to understanding the fast-developing FinTech market, in which online peer-to-peer lending is a major component. Specifically, our results suggest that, similar to professional lenders (e.g., Agarwal and Hauswald (2010), Agarwal, Ambrose, Chomsisengphet, and Liu (2011)), peer-to-peer lenders directly vet borrowers using nonstandard information when making lending decisions. Although these investors appear to infer information from writing style, they fail to fully incorporate the information into their decisions (i.e., there is room for efficiency improvement). Moreover, in an effort to speed
up the funding process, most peer-to-peer platforms have eliminated written com-
ponents. This decision may be perfectly rational as it increases speed and ease of
funding which may drive more lenders and borrowers to the platform. Nonetheless,
our results do suggest that there is an informational efficiency cost associated with
eliminating borrowers’ written statements.3 As we discuss in greater detail in
Section V, however, one potential limitation of using soft information (such as
images and linguistic attributes) on such platforms is that soft information can be
easily manipulated by potential borrowers (i.e., it is not a costly signal).
Despite these limitations, our results suggest that crowdfunding and other
FinTech platforms can use easily scalable computational linguistic technologies
to extract information about the author from any data that contains writing.
Our results, for instance, suggest that platforms may be able to exploit information
3 It is also possible that deceptive borrowers could become more important over time if they realize their deceptions influence lenders, given evidence that lenders fail to fully incorporate this information.
On the other hand, one may propose that lenders would eventually learn and become more efficient at
using deception cues.
content, such as deception cues, to determine for which loan applications to verify credit information (e.g., income and employment status) before allowing the listings to be posted publicly. This could improve loan performance in their portfolio
and help improve a platform’s long-term reputation. Our methods can be readily
adapted or extended to other types of crowdfunding such as donations (e.g.,
Gofundme.com), rewards (e.g., Kickstarter.com), or equity (e.g., Seedrs.com). In
fact, texts may be even more valuable in many other, nonloan, crowdfunding
situations that do not have the luxury of extensive hard credit financial information.
II. Background
Research in the past two decades reveals the important role of writing in
financial decisions. The vast majority of this work, however, focuses on informa-
tional content gleaned from professionally written documents such as annual
reports, SEC filings, analyst reports, and media reports.4 Most of the limited work
examining the writing of nonprofessionals, primarily via internet message boards
(e.g., Antweiler and Frank (2004), Das and Chen (2007), and Sabherwal, Sarkar,
and Zhang (2011)), reveals that although non-professionals’ writing can impact
volatility, there is little evidence it contains meaningful information regarding
public equity valuations.
Relatively little work focuses on the role of linguistic style in crowdfunding
markets. Lenders in the Prosper market (in our sample period) can view four information components: hard credit information (e.g., the potential borrower’s number of open credit lines), easily quantifiable nonstandard information (e.g., the maximum rate the borrower is willing to pay), and two potential sources of hard-to-quantify soft data – an optional photo and a written description of the borrower’s financial situation and the purpose of the loan. In a clever study, Iyer et al. (2016) infer
that because the combination of hard data and easily quantifiable nonstandard
information does not fully explain lenders’ ability to discriminate among bor-
rowers, Prosper lenders must extract valuable information from the soft noneasily
quantifiable data. Specifically, the authors use the residual in a regression of the
interest rate on standard and easily quantifiable nonstandard data as an estimate for
the portion of the interest rate determined by noneasily quantifiable soft data
(i.e., photos and writing). This residual, of course, also captures information inves-
tors infer from hard data or the easily quantifiable nonstandard data, unless the
authors’ model is perfectly specified. Moreover, although consistent with Prosper
investors using soft information, their latent variable approach does not indicate
4 See, for example, Tetlock (2007), Li (2008), Tetlock, Saar-Tsechansky, and Macskassy (2008), You
and Zhang (2009), Cecchini, Aytug, Koehler, and Pathak (2010), Feldman, Govindaraj, Livnat, and
Segal (2010), Goel, Gangoly, Faerman, and Uzuner (2010), Miller (2010), Humpherys, Moffitt, Burns,
Burgoon, and Felix (2011), Lehavy, Li, and Merkley (2011), Loughran and McDonald (2011), (2013),
(2014), Dougal, Engelberg, Garcia, and Parsons (2012), Gurun and Butler (2012), Rennekamp (2012),
Jegadeesh and Wu (2013), Lawrence (2013), Huang, Zang, and Zheng (2014), Tan, Wang, and Zhou
(2014), Franco, Hope, Vyas, and Zhou (2015), Purda and Skillicorn (2015), Asay, Elliott, and Renne-
kamp (2017), Bonsall, Leone, Miller, and Rennekamp (2017), Hwang and Kim (2017), and Lo, Ramos,
and Rogo (2017).
what information investors use, how they use it, or whether investors fully exploit
the informational content of the soft data.
While our study focuses on written text, two previous studies provide evidence
that photos posted in Prosper listings (just over half of Prosper listings in our data
include photos) influence peer-to-peer investors’ decisions. Unfortunately, these
studies provide contradictory evidence regarding the Iyer et al. (2016) contention
that investors extract value-relevant information from soft data. Duarte, Siegel, and
Young (2012) find that borrowers whose photos are judged “trustworthy” are more
likely to have their loans funded, pay a lower rate, and are less likely to default. In
contrast, Ravina (2019) suggests that peer-to-peer investors do not extract value-
relevant information from borrowers’ photos. Specifically, Ravina concludes that
photos provide disinformation to investors by encouraging investors to fund and
offer lower rates to “beautiful” borrowers who are just as likely to default as less
attractive borrowers.5
Although not their primary focus, several previous studies consider some
linguistic dimensions in their investigations of peer-to-peer lending – primarily
in the form of easily quantifiable metrics such as number of words in the written text
or number of words within word list categories (e.g., “Money” words including, for
instance, “cash” and “owe”). In an early investigation into the role of friendship
networks on Prosper, Lin, Prabhala, and Viswanathan (2013) include the natural
logarithm of the number of words of text in the listing as a control variable in their
primary tests. In a robustness test, the authors select 12 categories (e.g., money words, positive emotion words, certainty words) from Linguistic Inquiry and Word Count (LIWC), which classifies words into five main categories and 80 subcategories
(see Tausczik and Pennebaker (2010)). Although the authors do not provide details
from these tests, they conclude, “Most variables [e.g., number of money words],
however, do not show consistent results across the three outcome variables [funding
likelihood, interest rate, default].” Similarly, Iyer et al. (2016) include several easily
quantifiable measures of the listing text including number of HTML characters,
number of text characters, average word length, average sentence length, number of
numerics, percent of misspelled words, number of dollar signs, percent of listing as
signs, and number of characters in listing title. Because the authors treat these
variables as controls, they do not report coefficients or discuss if any of their
linguistic metrics are meaningfully related to outcomes.
Ravina (2019) focuses on the role of soft information captured by images
(primarily beautiful vs. nonbeautiful borrowers and race) in a sample of Prosper
listings and loans that include images (approximately half of the Prosper listings
include an image). Ravina uses three different word list sources (LIWC, DICTION,
and positive and negative business-related words by Henry (2008)) to generate
20 linguistic measures (e.g., Tenacity counts words that reflect confidence and totality). The author primarily focuses on how these linguistic counts are related to images (e.g., the number of tenacious words is higher for borrowers whose photos are viewed as trustworthy, but there appears to be no relation between the number of
5 See both Duarte et al. (2012) and Ravina (2019) for discussions of differences (in samples, methods,
variables, and results) between these studies. Graham, Harvey, and Puri (2016) examine related issues in
evaluating CEOs.
tenacious words and beauty). Ravina also examines how incorporating the 20 lin-
guistic measures impacts the results for images. She finds, for instance, that beau-
tiful borrowers are still more likely to receive funding when controlling for the
20 linguistic measures (e.g., number of tenacious words). The author finds no
evidence of a meaningful relation between the categorical word counts and either funding likelihood or the internal rate of return earned on a loan (a measure of default likelihood) for the vast majority of linguistic word categories, and none of the word count categories is meaningfully related to both funding likelihood and internal rate of return.
As far as we are aware, only one previous study focuses on what information
investors may garner from borrowers’ writing style in peer-to-peer lending. Spe-
cifically, using two different peer-to-peer lending platforms in Germany, Dorfleit-
ner, Priberny, Schuster, Stoiber, Weber, de Castro, and Kammler (2016) examine
three linguistic measures: spelling errors, text length, and indicator variables for the
presence of four groups of “social and emotional” words (e.g., the five negative
emotion words are funeral, lament, sick, difficult, and deceased). The authors
hypothesize that i) fewer spelling errors and longer descriptions (“up to a certain
amount of words”) will be associated with greater funding likelihood and lower
default likelihood and ii) because investors will respond irrationally to social and
emotional words, the use of such words (either positive or negative) will be associated with greater funding success but a higher default likeli-
hood. Inconsistent with the indirect evidence in Iyer et al. (2016), Dorfleitner et al.
(2016) find no evidence that spelling errors, text length, or indicators for the
presence of any of the 4 groups of social and emotional words are meaningfully
associated with default likelihood.6 We more fully discuss the Dorfleitner et al.
(2016) study, as well as tangentially related work in the marketing and accounting
literatures, in the Supplementary Material.
III. Data

A. The Prosper Marketplace

Our debt crowdfunding data comes from Prosper, one of the largest peer-to-
peer lending sites in the USA, with more than $13 billion in funded loans since
inception. In our sample period (Feb. 12, 2007, to Oct. 16, 2008), Prosper operated
as a reverse Dutch auction.7 Prospective borrowers created a request for a 36-month
amortizing loan (known as a “listing”) containing information regarding the loan
and the borrower for any amount up to a maximum of $25,000.
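The reverse Dutch auction clears as follows: lower-rate bids win, earlier bids break ties, and winning lenders all receive the rate of the marginal (highest-rate) winning bid. Below is a minimal Python sketch of this clearing rule under those assumptions, which we infer from the example in footnote 11 below rather than from a published Prosper specification:

def clearing_outcome(amount_requested, bids):
    # bids: list of (lender, amount, rate) tuples in arrival order.
    # Rank by rate (ascending), breaking ties by arrival order.
    ranked = sorted(enumerate(bids), key=lambda ib: (ib[1][2], ib[0]))
    funded, winners = 0.0, []
    for _, (lender, amount, rate) in ranked:
        if funded >= amount_requested:
            break
        winners.append((lender, amount, rate))
        funded += amount
    if funded < amount_requested:
        return None  # listing fails to fund
    clearing_rate = max(rate for _, _, rate in winners)  # marginal winning bid
    return winners, clearing_rate

# Footnote 11's example: investors C and E win, and the loan clears at 19%.
bids = [("A", 500, 0.20), ("B", 500, 0.20), ("C", 500, 0.19),
        ("D", 500, 0.19), ("E", 500, 0.18)]
print(clearing_outcome(1000, bids))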
6 As discussed in greater detail in the Supplementary Material, direct comparison of the Dorfleitner
et al. (2016) evidence to the studies based on U.S. data is limited due to fundamental differences in
platform structure. For instance, credit scores (likely the most important hard information) are optional
and missing for more than 75% of the loans on one of the platforms Dorfleitner et al. examine.
7 Our sample matches the sample period in Iyer et al. (2016). Prosper voluntarily stopped funding loans in Oct. 2008 while the SEC (which issued a cease-and-desist order for all peer-to-peer lenders in Nov. 2008) investigated the peer-to-peer market. Prosper reopened in July 2009, but under a new model with different credit categories and classification mechanisms. Similarly, Prosper used substantially different credit classifications prior to the start of our sample period. See Iyer et al. for additional details.
B. Linguistic Metrics
Our study focuses on the linguistic features of borrowers’ writing (i.e., how
they write) by employing well-developed methods from the computational linguis-
tics literature. Specifically, we evaluate three linguistic features – readability, tone,
and deception cues – that have been the focus of extensive work in closely related
literature (almost uniformly based on professionally written texts such as annual
8 See Prosper’s S1-A filing (https://www.sec.gov/Archives/edgar/data/1416265/000110465908074769/
a08-29602_1s1a.htm) for additional details. Prosper later moved to a different system to assign credit
grades.
9 Iyer et al. (2016) provide an example Prosper listing in their online Appendix D (http://doi.org/
10.1287/mnsc.2015.2181).
10 Investors could also click on links that provide information regarding the borrowers’ Prosper
“friends” and “group membership.” See Lin et al. (2013) for details. Our empirical tests include
indicators for these variables.
11 Assume, for example, that a borrower asks for $1,000 at a maximum rate of 20% and receives bids
from investors A and B for $500 each at 20% followed by bids, sequentially, from investors C and D for
$500, each at 19%, and one bid for $500 at 18% from investor E prior to the listing end date. In this case,
investors C (the first to bid 19%) and E (with the lowest rate bid) would fund the loan at 19% (the lowest
market-clearing rate).
reports or analyst reports). Beattie ((2014), p. 112) points out, for example, that
most of the natural language processing (NLP) accounting literature focuses on
“readability, tone, and markers of deception.” Moreover, as detailed below, research
in a variety of fields demonstrates that these linguistic dimensions can both reveal
information about the writer and influence readers’ perceptions and behaviors. We
recognize, of course, that by its very nature, one could never capture every dimen-
sion of variation in writing style. However, as we detail below, these three linguistic
dimensions are well-established.12
1. Readability
Readability refers to the ease of understanding and comprehension due to the
writing style. Linguistics research demonstrates that readability enhances readers’
positive perceptions about the author (e.g., Oppenheimer (2006), Shah and Oppen-
heimer (2007)) and that clearer writing can indicate higher education levels and
social status (e.g., Hargittai (2006)).13 Empirical studies in a wide range of disci-
plines confirm this view. For example, readability is shown to improve reader
attention and retention on web blogs (Singh, Sahoo, and Mukhopadhyay (2014))
and increase product sales (Ghose and Ipeirotis (2011)). An extensive literature in
finance and accounting also supports this view for professionally written content;
for example, investors react more strongly to more readable annual reports or
analyst reports (e.g., Li, (2008), You and Zhang (2009), Miller (2010), Rennekamp
(2012), Lawrence (2013), Huang et al. (2014), Loughran and McDonald (2014),
Tan et al. (2014), Franco et al. (2015), and Hwang and Kim (2017)). Given this body
of work, we predict that more readable Prosper listings will generate greater lender
interest, be more likely to receive funding, and benefit from lower rates. Moreover,
if readability reveals information about the borrower, then higher readability should
be associated with lower default risk.
Drawing on existing literature, our readability metric is composed of three
dimensions: spelling errors, grammatical errors, and lexical complexity. The first
two dimensions are based on the simple idea that fewer spelling and grammatical
errors make a text more readable (e.g., Ghose, Ipeirotis, and Li (2012), Loughran
and McDonald (2014)). We measure spelling errors as the number of errors nor-
malized by the number of words. We use LanguageTool, a rule-based grammar-
checker, to measure the number of grammar errors in a text and, similar to spelling
errors, normalize by the number of words. We use the well-known Gunning FOG
measure (which is a combination of sentence length and proportion of words with
three or more syllables) to capture reading complexity.14 Several studies (Loughran
12 Because the linguistic data are “soft,” there is literally no limit on the potential number of measures.
We select our three dimensions based on related literature. Nonetheless, it is always possible that some of
our dimensions are correlated with other factors that influence online borrower or lender behavior. For
instance, perhaps a person asking for a loan for funding medical expenses will use more complicated
(less easily readable) language and peer-to-peer investors are less likely to fund medical expense loans
for some reason. Regardless, our empirical results demonstrate that our measures correlate with peer-to-
peer investor decisions and default risk.
13 Given three linguistic components, the related literature is very large. Here, we focus on a subset of
closely related papers.
14 The Gunning FOG index estimates the years of formal education one needs to understand the writing. A broad literature uses the metric to examine the readability of annual reports and analyst reports, for example, Li (2008), Miller (2010), Lawrence (2013), Franco et al. (2015), and Asay et al. (2017).
and McDonald (2014), Bonsall et al. (2017)) point out that the FOG metric may be a
poor measure of 10-K readability, because annual reports have a large number of
“complex” words that are easy for investors to understand (e.g., management). In
our case, however, the writing is non-professional and written by prospective online
borrowers with relatively few traditional credit options. As a result, the FOG index
is well-suited for our purposes. As discussed above, previous work uses some of
these dimensions as controls for their investigation of other factors. For instance,
both Duarte et al. (2012) and Lin et al. (2013) include average sentence length
(a component of the FOG index) as a control variable.
To generate our readability measure, we first rescale each component (spelling
errors, grammatical errors, and lexical complexity) to zero mean with unit variance
(i.e., we standardize so that each of the three components contributes equally to the
measure) and then sum the three standardized components. We then standardize the
sum and multiply it by 1 (such that a higher value indicates greater readability) to
generate the final readability measure. Panel A of Table 1 provides a brief overview
of the measure and the Supplementary Material provides construction details for the
three readability components.
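To make the construction concrete, below is a minimal Python sketch of the composite. The tokenizer and the vowel-group syllable heuristic are illustrative assumptions (recall that the spelling and grammar error rates, the latter from the rule-based LanguageTool checker, are computed separately; the sketch takes those rates as given):

import re
import numpy as np

def fog_index(text):
    # Gunning FOG: 0.4 * (average sentence length + 100 * (share of words with 3+ syllables)).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    def syllables(w):
        # Naive heuristic: count vowel groups (an assumption, not the authors' counter).
        return max(1, len(re.findall(r"[aeiouy]+", w.lower())))
    pct_complex = sum(syllables(w) >= 3 for w in words) / len(words)
    return 0.4 * (len(words) / len(sentences) + 100.0 * pct_complex)

def zscore(x):
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def readability(spell_err_rate, gramm_err_rate, fog):
    # Sum the three standardized components, flip the sign so that a higher
    # value means more readable, and restandardize (Table 1, Panel A).
    return zscore(-(zscore(spell_err_rate) + zscore(gramm_err_rate) + zscore(fog)))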
2. Tone
Extant research finds that investors respond favorably to positive tone in
professionally written text (e.g., Tetlock (2007), Tetlock et al. (2008), Loughran
and McDonald (2011), Gurun and Butler (2012), and Jegadeesh and Wu (2013)). In
addition, work suggests that entrepreneurs with greater positivity are both more
likely to succeed (e.g., Bird (1989), Zott and Huy (2007)) and receive greater
funding (Chen, Yao, and Kotha (2009)). Thus, analogous to readability, we hypoth-
esize that writing with a more positive tone will be associated with greater lender
demand, lower market-clearing interest rates, and lower default likelihood.
Given the nature of the data, it is not surprising that tone lacks a universal
definition or metric. In addition, previous work refers to tone as “tone” (e.g., Tetlock
et al. (2008)), “sentiment” (e.g., Tetlock (2007)), and both “tone” and “sentiment”
(e.g., Loughran and McDonald (2011)). There are (at least) two ways to measure
tone at scale – a lexicon (or dictionary-based) approach or a machine learning
approach. Most work uses the lexicon approach. For instance, Tetlock (2007) and
Tetlock et al. (2008) use the Harvard-IV-4 dictionary as the basis for their metrics,
Ravina (2019) uses the Henry (2008) dictionary to compute word counts for
positive and negative business-related words, Lin et al. (2013) use the Linguistic
Inquiry and Word Count (LIWC) dictionary, and Loughran and McDonald (2011)
build their own word lists. Most closely related to our work, Dorfleitner et al. (2016)
also build their own list. Specifically, Dorfleitner et al. use an indicator variable for
the presence of one of any seven “positive emotion” keywords (thank you, rejoice,
dream, urgent, healthy, desire, and trust) and a second indicator for any one of five “negative emotion” keywords (funeral, lament, sick, difficult, and deceased).15
15 Dorfleitner et al. (2016) word lists are based on translations from Google Translate.
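As Panel A of Table 1 notes, our positivity measure instead comes from a machine learning classifier trained on 3,379 hand-coded listings (70% training, 30% testing; construction details are in the Supplementary Material). A generic scikit-learn sketch of such a supervised tone classifier follows; the TF-IDF features and logistic regression model are illustrative assumptions, not the paper’s exact implementation:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def train_tone_classifier(texts, labels):
    # texts: listing descriptions; labels: hand-coded tone (positive/neutral/negative).
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.30, random_state=0)  # 70/30 split, as in Table 1
    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=2),
                        LogisticRegression(max_iter=1000))
    clf.fit(X_train, y_train)
    print("holdout accuracy:", clf.score(X_test, y_test))  # cf. footnote 17's precision/recall
    return clf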
https://doi.org/10.1017/S0022109022000850 Published online by Cambridge University Press
Gao, Lin, and Sias 11
TABLE 1
Definitions
Table 1 provides definitions and construction information for variables used in the study.

Panel A. Linguistic Metrics

READABILITY
SP_ERR: The ratio of the number of spelling errors to the number of words in the loan request, standardized to zero mean, unit variance.
GRAMM_ERR: The ratio of the number of grammatical errors to the number of words in the loan request, standardized to zero mean, unit variance.
LEXICAL_COMPLEXITY (FOG): An estimate of the years of formal education needed to understand the first reading of a text. The FOG index is 0.4 × (average sentence length + 100 × (percentage of words with at least three syllables)). Value is standardized to zero mean, unit variance.
READABILITY = −1 × (Standardized SP_ERR + Standardized GRAMM_ERR + Standardized FOG), rescaled to zero mean, unit variance.

TONE
POSITIVITY: Machine learning (based on a sample of 3,379 listings stratified across credit grades; 70% training, 30% testing) to identify tone. Tone is rescaled to zero mean, unit variance.

DECEPTION_CUES
EXCL_WORDS: Ratio of exclusion words (e.g., “but,” “without,” and “except”) to the number of words in the loan description, rescaled to zero mean, unit variance.
MOT_WORDS: Ratio of motion verbs (e.g., “walk,” “move,” and “go”) to the number of words in the loan description, rescaled to zero mean, unit variance.
FIRST_PER_PRON: Ratio of first-person pronouns to the number of words in the loan description, rescaled to zero mean, unit variance.
THIRD_PER_PRON: Ratio of third-person pronouns to the number of words in the loan description, rescaled to zero mean, unit variance.
NEG_EM_WORDS: Ratio of negative emotion words (e.g., “hate,” “sorry,” and “worthless”) to the number of words in the loan description, rescaled to zero mean, unit variance.
DECEPTION_CUES = (Standardized MOT_WORDS + Standardized THIRD_PER_PRON + Standardized NEG_EM_WORDS) − (Standardized EXCL_WORDS + Standardized FIRST_PER_PRON), rescaled to zero mean, unit variance. Deception cue word classifications are from the Linguistic Inquiry and Word Count (LIWC) dictionary.
Panel B. Verified Hard Credit Information

CR_GRADE: Letter assigned to the application based on the borrower’s credit score. The seven credit grades are AA (least risky), A, B, C, D, E, and HR (high risk; most risky).
BANKCARD_UTIL: The sum of the balances owed by the borrower on open bankcards divided by the sum of the cards’ credit limits.
AMT_DELQ: Natural logarithm of (1 + dollar amount delinquent).
DELQ(LAST_7_YR): Natural logarithm of (1 + number of delinquencies in the prior 7 years).
INQ_LAST_6_MTHS: The number of credit inquiries in the previous 6 months.
PUB_REC(10_YR): The number of public records (bankruptcy, civil judgments, and tax liens) in the prior 10 years.
PUB_REC(LAST_YR): The number of public records (bankruptcy, civil judgments, and tax liens) in the previous year.
CURRENT_LOC: Natural logarithm of (1 + borrower’s number of current credit lines).
OPEN_LOC: Natural logarithm of (1 + number of borrower’s open credit lines).
CR_HIS(MTHS): Natural logarithm of (1 + number of months of credit history).
REVOLV_CR_BAL: Natural logarithm of (1 + borrower’s dollar amount of revolving credit balance).
LOAN_AMT: Natural logarithm of the loan amount requested by the borrower.
DATE_IND: Indicator variables for each month in our sample period (total of 21 indicators).
Panel C. Unverified Credit Information

Panel E. Listing and Loan Outcome Variables
FUNDED_IND: An indicator that equals one if a listing is fully funded and becomes a loan.
NUM_OF_BIDS: The total number of bids a listing receives.
$AMT. BID/$AMT. REQSTD: The ratio of the total dollar value of all bids for a listing to the total requested amount of the listing (winsorized at the 99% level).
BWR_IR(r): Rate the borrower pays.
1/(1 + BWR_IR): 1/(1 + rate the borrower pays).
%PRINCIPAL_REPAID: (Original principal − defaulted value)/original principal.
DEFAULT_IND: An indicator that equals one if the loan defaults (status is “Defaulted (Bankruptcy),” “Defaulted (Delinquency),” “Charge-off,” or “4+ months late”).
3. Deception Cues
Because the written component is unverified, potential borrowers may attempt
to deceive lenders. Research shows, for example, that individuals intentionally misstate facts in online dating profiles (Toma and Hancock (2012)). Psycholinguistics research
demonstrates that fabricated stories linguistically differ from true stories and deception
cues can help identify dishonesty.18 Moreover, research suggests that most writers are
16 Our training and testing data set accounts for approximately 1% of loan applications in the Feb. 2006 to Jan. 2013 period, which includes data outside our sample period (Feb. 2007 to Oct. 2008). As noted above, following Iyer et al. (2016), we limit our sample period from Feb. 2007 (when Prosper changed their minimum credit score requirement) to Oct. 2008 (when Prosper closed temporarily due to SEC scrutiny regarding peer-to-peer lending).
17 Precision and recall are standard measures of pattern recognition. Precision is the number of
properly classified (positive, neutral, or negative) tone sentences over the number of identified (correctly
and incorrectly) sentences. Recall is the ratio of the number of correctly identified tone sentences to the
number of actual sentences.
18 Given the size of this literature, we primarily focus on empirical evidence. For theoretical models
supporting these metrics see, for example, Humpherys et al. (2011).
unaware of these cues, and even if they were, these cues are largely unconscious and
difficult to control (see, e.g., the discussion in Toma and Hancock (2012)).
A related body of work examines the linguistic features of annual reports
(usually the “Management Discussion and Analysis” section) to detect fraud (e.g.,
Cecchini et al. (2010), Goel et al. (2010), Humpherys et al. (2011), Purda and
Skillicorn (2015), Siering, Koch, and Deokar (2016), Dong, Liao, and Zhang
(2018), and Wang, Li, and Singh (2018)). Several related studies (e.g., Hobson,
Mayew, and Venkatachalam (2012), Larcker and Zakolyukina (2012)) evaluate
vocal markers or linguistic features from earnings conference calls for indications
of fraud.
We focus on 5 deceptive communication cues identified in previous work. The
first two components – exclusion words and motion verbs – come from research
(see, e.g., Newman, Pennebaker, Berry, and Richards (2003), Pennebaker, Mehl,
and Niederhoffer (2003), Hancock, Curry, Goorha, and Woodworth (2008), and
Toma and Hancock (2012)) that suggests lying is more cognitively demanding
than truth-telling. This work argues that exclusion words (e.g., “but,” “except,” and
“without”) are cognitively taxing for a liar because i) such words require keeping
track of what belongs in a category and what does not, and ii) it is harder to create
what did not happen than what did happen. Analogously, this literature suggests that
motion verbs (e.g., “walk,” “move,” and “go”) are less cognitively complex
because they provide simple concrete descriptions relative to more cognitively
demanding statements regarding judgments and evaluations and, as a result, liars
tend to use more motion words.19
The third and fourth components – first-person pronouns and third-person
pronouns – are based on work that suggests liars tend to dissociate themselves from
their lies. As a result, research (e.g., Newman et al. (2003), Hancock et al. (2008),
and Toma and Hancock (2012)) indicates that deceivers tend to use more third-
person pronouns (e.g., “he,” “him,” and “her”) and fewer first-person pronouns
(e.g., “I,” “we,” and “me”) in their writing.
The final deception cue comes from research (see, e.g., Knapp, Hart, and
Dennis (1974), Vrij (2000), Newman et al. (2003), Pennebaker et al. (2003), and
Toma and Hancock (2012)) that suggests the act of deceiving generates negative
emotions (such as anxiety, shame, and guilt) resulting in increased use of negative
emotion words such as “hate,” “sorry,” and “worthless.”
In sum, deceivers tend to use more motion words, third-person pronouns, and
negative emotion words, but fewer exclusion words and first-person pronouns.
Thus, we first scale each of these five cues by the total number of words in the
written description (i.e., each measured as a percentage of words in the listing) and
then standardize (i.e., rescale each component to zero mean with unit variance) each
of the five cues to create an equally weighted (across cues) measure of deception
cues. Specifically, we compute the sum of the first three (i.e., standardized motion words, standardized third-person pronouns, and standardized negative emotion words) cues and subtract the sum of the last two (i.e., standardized first-person pronouns and standardized exclusion words) cues. The resulting value is then rescaled to zero mean with unit variance to generate the final deception cues measure (Panel A of Table 1).
19 Newman et al. (2003) give the example that the motion verb in “I walked home” is less cognitively complex than the statement “Usually I take the bus, but it was such a nice day.”
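A minimal Python sketch of the composite, assuming the five LIWC-based category ratios have already been computed per listing (the underlying word classifications come from the proprietary LIWC dictionary and are not reproduced here):

import numpy as np

def zscore(x):
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def deception_cues(mot_words, third_per_pron, neg_em_words, excl_words, first_per_pron):
    # Each argument: per-listing category count / total words (Table 1, Panel A).
    # Cues deceivers use more enter positively; cues they use less enter negatively.
    raw = (zscore(mot_words) + zscore(third_per_pron) + zscore(neg_em_words)) \
          - (zscore(excl_words) + zscore(first_per_pron))
    return zscore(raw)  # higher value = more deception cues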
TABLE 2
Descriptive Statistics
Table 2 reports descriptive statistics for listings and loans during our sampling period (Feb. 12, 2007–Oct. 16, 2008). Table 1
provides definitions for all variables.
Variable N Mean Median Std. Dev. Minimum Maximum
20 As detailed in the Supplementary Material, the three linguistic measures are largely independent of each other, with correlations ranging from −0.13 (tone and deception cues) to 0.04 (readability and deception cues).
C. Descriptive Statistics
Table 1 reports definitions for the linguistic metrics (Panel A), the verified hard
credit information available to lenders (Panel B), the (generally) unverified self-
reported credit information (Panel C), easily quantifiable nonstandard information
(Panel D), and the listing and loan outcome variables (Panel E). Table 2 reports
descriptive statistics for these variables. Our sample includes all listings and loans
from Feb. 12, 2007, to Oct. 16, 2008.21 As shown in Table 2, our sample consists
of 215,930 loan requests, of which 17,443 were funded. Approximately 7.4% of
funded loans were withdrawn or canceled prior to close, yielding a total of 16,044
funded loans that were fully executed. As shown in the bottom row in Panel E of
Table 2, 38% of those loans subsequently defaulted.
21 We exclude one 0% interest rate loan made between family members who were both Prosper lenders “… to see what the borrowing side is like …”
22 Recall that the standard deviations of readability, tone, and deception cues are all one. Thus, dividing the coefficient by the overall funding likelihood (0.0808) generates the above figures.
23 To conserve space, the coefficients for the control variables are reported in the Supplementary Material. Control variable results are largely consistent with previous work and expectations; for example, high credit grade loans pay lower rates.
TABLE 3
Linguistic Measures and Funding Success
The first column in Table 3 reports marginal effects (standard errors in parentheses) from a probit regression of the funding
success indicator on hard credit information, unverified credit information, easily quantifiable nonstandard information, and
three linguistic features – readability, positivity, and deception cues. The second through fourth columns report marginal
effects (standard errors in parentheses) from Tobit regressions of number of bids, total dollar amount bid divided by amount
requested, and interest rate, respectively, on the same independent variables. Table 1 provides definitions for all variables.
***, **, and * indicate statistical significance at the 1%, 5%, and 10% levels from two-tailed tests, respectively. The sample
includes 215,930 listings between Feb. 12, 2007 and Oct. 16, 2008.
Loan Funded Indicator | Number of Bids | Total $Amount Bid/$Amt. Requested | Interest Rate
Recall that Prosper acted as a reverse Dutch auction over our sample period.
As a result, for most of the loans in our sample (i.e., as noted above and shown in
Table 2, 76% of loans were open for duration), the total number of bids and the total
amount of bids are unlimited.24 That is, even when a request is 100% funded,
another lender can bid a lower rate. Thus, we next examine the relation between the
linguistic characteristics and both the total number of bids received and the ratio of
total dollar amount bid to dollar amount requested. Specifically, we estimate Tobit
regressions of the number of bids and total dollar amount bid/dollar amount
requested on the three linguistic measures and the same set of controls (verified
credit information, unverified credit information, and easily quantifiable nonstan-
dard information). The results, reported in the second and third columns of Table 3,
reveal that loan listings that are more readable, more positive, and contain fewer
deception cues experience both a greater number of bids and a greater dollar value
of bids (normalized by the amount requested).25 Once again, the results are eco-
nomically meaningful. For instance, a 1-standard-deviation larger value of the deception cue metric (recall that the linguistic measures are standardized) is associated with 3.15 fewer bids and a 2.1% lower ratio of total dollars bid to dollars requested.
We next investigate if the increased investor demand associated with more
readable and positive descriptions with fewer deception cues results in lower rates
for borrowers whose writing exhibits those characteristics. Analogous to Duarte
24 Although our tests include an indicator to differentiate closed when funded listings from open for
duration listings, to ensure the number of bids and the total bid amount are not impacted by the closed
when funded loans, we repeat these tests limiting the sample to open for duration loans only. As shown in
the Supplementary Material, our results are qualitatively identical.
25 In the Supplementary Material, we also consider the total dollar value bid (i.e., not normalized by
amount requested) as a dependent variable and find similar results.
et al. (2012) and Ravina (2019), we estimate a Tobit regression of the interest rate
charged to borrowers on the three linguistic measures and the full set of controls
listed in Panels B–D of Table 2. The results, reported in the last column of Table 3,
reveal that only the coefficient associated with positive tone differs meaningfully
from 0 (at the 1% level). Although the coefficients associated with readability and
deception cues have the expected signs, their values do not differ meaningfully from
0 based on a two-tailed test. Arguably, given our hypotheses, these should be one-
tail tests – in which case, both coefficients are marginally (at the 10% level) different
from 0. The economic impact, however, is relatively small – 1-standard-deviation
higher readability, higher positive tone, or lower deception cues is associated with a
4 basis point, 11 basis point, and 3 basis point lower interest rate, respectively.
Relative to the mean interest rate of 18.3%, these values represent a 0.20%, 0.61%,
and 0.19% reduction in the interest rate paid.26
In short, regardless of how we frame the test, the results in Table 3 are
consistent with the hypothesis that improved readability, more positive writing,
and fewer deception cues are all associated with greater peer-to-peer investor
demand.
TABLE 4
Linguistic Measures and Default Risk
The first column in Table 4 reports marginal effects (standard errors in parentheses) from a probit regression of the default
indicator on hard credit information, unverified credit information, easily quantifiable nonstandard information, and three
linguistic features – readability, positivity, and deception cues. The second column reports the marginal effects of a Tobit
regression of the fraction of principal repaid on the same independent variables. Table 1 provides variable definitions. ***, **,
and * indicate statistical significance at the 1%, 5%, and 10% levels from two-tailed tests, respectively. The sample includes
16,044 funded loans between Feb. 12, 2007 and Oct. 16, 2008.
Default Indicator %Principal Repaid
meaningful – recall (see Table 2) that the standard deviations of readability, tone,
and deception cues are all one. Thus, controlling for the verified hard credit
information, the unverified credit information, and easily quantifiable nonstandard
information, 1-standard-deviation lower readability, less positive tone, or higher
level of deception cues is associated with a 0.79%, 1.71%, and 2.16% higher default
likelihood, respectively. Given 37.9% of the loans default (Panel E of Table 2),
1-standard-deviation higher level of deception cues implies a 5.70% increase over
the mean default likelihood (0.0216/0.379).
Following Iyer et al. (2016), we also consider the fraction of the loan repaid ((original principal − default amount)/original principal) as a dependent variable.
The second column in Table 4 reports marginal effects (and standard errors) from a
Tobit regression of the fraction of the loan repaid on the three linguistic metrics and
the same set of controls. We continue to find evidence that the linguistic measures
help forecast default risk as the three coefficients have the predicted signs and differ
meaningfully from 0 (at the 5% level or better). The results also suggest the impact
is economically meaningful. A 1-standard-deviation higher readability, for exam-
ple, is associated with 1.797% more of the principal repaid. In short, the results in
Table 4 provide the first direct evidence that information captured from online
borrowers’ writing style is indeed economically valuable and relevant.
TABLE 5
Do Lenders Fully Account for the Informational Content of Borrower’s Writing?
The first column in Table 5 reports marginal effects (standard errors in parentheses) from a probit regression of the default
indicator on hard credit information, unverified credit information, easily quantifiable nonstandard information,
1/(1 + BWR_IR), and three linguistic features – readability, positivity, and deception cues. The second column reports
marginal effects (and associated standard errors) from Tobit regression of fraction of principal repaid on the same
independent variables. Table 1 provides variable definitions. ***, **, and * indicate statistical significance at the 1%, 5%,
and 10% levels from two-tailed tests, respectively. The sample includes 16,044 funded loans between Feb. 12, 2007 and Oct.
16, 2008.
Default Indicator %Principal Repaid
information will be reflected in the rate charged to online borrowers – and thus, once
controlling for interest rates, the linguistic dimensions should not help explain
default. In contrast, if peer-to-peer lenders fail to fully incorporate the information contained in the linguistic measures (i.e., underreact to its informational content), then the linguistic measures will continue to help explain default, even when controlling for interest rates. Alternatively, if peer-to-peer
lenders overweight the information contained in the linguistic measures, then the
predicted sign of the coefficients associated with the linguistic measures will
change once including the rate charged to a borrower as an explanatory variable.
For instance, assume a typical online borrower should be charged 20% and an
otherwise identical second borrower with 1-standard-deviation more readable
description should be charged 18%. If peer-to-peer lenders fail to fully incorporate
the informational content of readability, then the second borrower, in our example,
would be charged 19% and readability will still be negatively associated with
default likelihood, even when controlling for the interest rate. In contrast, if lenders
charge borrowers with more readable descriptions too little, such as 17% for the
second borrower (i.e., “overreact” to the readability), then once accounting for
interest rates, the sign on readability will change and, controlling for interest rates,
more readable loans will be more likely to default.
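A sketch of this test in Python with statsmodels, assuming a per-loan DataFrame with the illustrative column names below (the control set is abbreviated here, not our exact specification):

import pandas as pd
import statsmodels.api as sm

def default_probit_with_rate(df, controls):
    # df: one row per funded loan; DEFAULT_IND is 0/1, BWR_IR is the rate charged.
    X = pd.DataFrame({"INV_GROSS_RATE": 1.0 / (1.0 + df["BWR_IR"])})
    X = pd.concat([X, df[["READABILITY", "POSITIVITY", "DECEPTION_CUES"] + controls]], axis=1)
    X = sm.add_constant(X)
    res = sm.Probit(df["DEFAULT_IND"], X).fit(disp=False)
    # If lenders fully price the linguistic information, the three linguistic
    # marginal effects should be indistinguishable from zero here.
    return res.get_margeff()  # marginal effects, as reported in Table 5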
Analogous to Table 4, Table 5 reports marginal effects (and associated standard errors) from a probit regression of the default indicator on 1/(1 + r), the three linguistic measures, and the same full set of controls. That is, the analysis in Table 5 is identical to the analysis in Table 4 except that we add the borrower rate (as 1/(1 + r)) as an explanatory variable.28 Consistent with Iyer et al. (2016), the results reveal
28 In the Supplementary Material, we repeat these tests using the interest rate (r), rather than 1/(1 + r), as an independent variable and find qualitatively identical results.
that even when controlling for the verified hard credit information, unverified credit
information, as well as easily quantifiable nonstandard information, borrowers who
are more likely to default are charged a higher rate; that is, the coefficient associated with 1/(1 + r) is negative and statistically significant at the 1% level.29
The next three rows suggest, however, that peer-to-peer lenders fail to fully incorporate the information contained in the linguistic measures into the rate charged to online borrowers. Specifically, relative to the results in the first column
of Table 4 (that exclude the interest rate variable), the coefficients associated with
readability, tone, and deception cues fall 8%, 13%, and 3%, respectively (e.g., the
coefficient associated with readability falls 8% from 0.0079 in Table 4 to 0.0073 in
Table 5). All three linguistic dimensions remain statistically significant (readability
at the 10% level, tone at the 5% level, and deception cues at the 1% level). The
information captured by the three linguistic style measures remains economically
meaningful. For instance, once controlling for the rate charged (and the other
controls), a 1-standard-deviation larger level of deception cues is associated with a
2.09-percentage-point higher default likelihood, representing a 5.51% increase over
the mean default likelihood (0.0209/0.379).
We also investigate the relation between the fraction of principal repaid and
the linguistic measures when adding the interest rate as an explanatory variable
(in addition to verified hard credit information, unverified hard credit information,
and easily quantifiable nonstandard information). The coefficients reported in the
last column of Table 5 also reveal only small reductions, relative to the last column
of Table 4, in the coefficients associated with the linguistic measures. In short, the
results in Table 5 suggest that peer-to-peer lenders fail to incorporate much of the
informational content of online borrowers’ text.
As detailed above, previous studies (e.g., Duarte et al. (2012), Ravina (2019))
examine the relation between photos posted with Prosper listings and funding and
default outcomes. In this section, we examine how images may impact our results.
We begin by partitioning our sample of 215,930 listings into three groups:
i) 104,376 listings that do not contain an image, ii) 71,873 listings that include
an image with a human face, and iii) 39,681 listings that include an image but not of
a human face (e.g., auto, pet, logo, and home). For listings that include a human
image, we use two automated processes (the Microsoft Face API and the Haystack
artificial intelligence algorithm) to generate four attributes for each image: gender,
race, positive emotion, and attractiveness. Analogous to the linguistic dimensions,
there are an infinite number of potential image attributes. We focus on gender, race,
emotion, and attractiveness because technology allows us to evaluate these dimen-
sions at scale. Clearly, there are other potentially important dimensions (e.g.,
29 Iyer et al. (2016) also investigate, and reject, the possibility that causation runs the other way, that
is, high interest rates cause more defaults. For additional detail, see the authors’ test in their Panel C of
Table 2 and associated discussion.
trustworthiness; see Duarte et al. (2012)) that we do not examine for our sample of
71,873 listings with human images. Construction details are reported in the Sup-
plementary Material.
As detailed in the Supplementary Material, the four image attributes we
examine are largely independent of the three linguistic dimensions. Specifically,
the (absolute) average correlation between the four image attributes (gender, race,
positive emotion, and attractiveness) and the three linguistic attributes (readability,
tone, and deception) is 2.6% and the largest (absolute) correlation is 5.3%.
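A minimal sketch of this independence check, assuming a pandas DataFrame listings whose image and linguistic attributes are already coded numerically (all column names are illustrative):

image_attrs = ["gender", "race", "positive_emotion", "attractiveness"]
text_attrs = ["readability", "tone", "deception_cues"]

# Absolute cross-correlations between the image and text attributes.
cross = listings[image_attrs + text_attrs].corr().loc[image_attrs, text_attrs].abs()
print(cross.values.mean())  # average absolute correlation (2.6% in the paper)
print(cross.values.max())   # largest absolute correlation (5.3% in the paper)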
Our primary tests focus on the relations between the linguistic dimensions and
funding likelihood (column 1 of Table 3), default likelihood (column 1 of Table 4),
and default likelihood once accounting for the rate charged to borrowers (column
1 of Table 5). We repeat each of these three tests for the sample limited to i) listings
and loans without images, and ii) listings and loans with a human image. For the
latter sample, we include the four image dimensions (gender, race, positive emo-
tion, and attractiveness) as additional regressors. As detailed in the Supplementary
Material, our conclusions are largely unchanged when limiting the sample to
listings and loans without images. Moreover, for listings with images, the relations
between the linguistic dimensions and outcomes are largely robust, with the excep-
tion that the coefficient associated with readability is no longer statistically signif-
icant when predicting default (either when excluding the rate charged as in Table 4
or including the rate charged as in Table 5).30
In sum, the four image attributes are largely independent of the three linguistic
attributes. Moreover, our conclusions remain unchanged when limiting the sample
to listings without images or when limiting the sample to listings with human
images and controlling for the four image attributes (gender, race, emotion, and
attractiveness).
30 When limited to the sample with images, the relation between positive tone and funding success is
positive and marginally significant (at the 10% level) based on a one-tailed test.
group of listings into high (above the 75th readability percentile), medium (between
the 25th and 75th readability percentiles), and low (below the 25th percentile)
readability groups. In addition, during our sample period, Prosper borrowers
selected one of 20 loan categories for their request (e.g., debt consolidation, home
improvement, and business). We randomly sample one loan request from the high
readability pool and then limit the sample of low-readability listings to loan requests
in the same category (e.g., debt consolidation) and of similar length (within
30 words). From this low-readability pool (matched on loan purpose and length)
we randomly select one “matched” loan request. We repeat the process an additional
nine times, generating 10 pairs of high- and low-readability loan requests that
exhibit similar tone, similar deception cues, similar text length, and identical loan
purpose.
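The pairing procedure can be sketched as follows, assuming a DataFrame listings with illustrative columns readability, word_count, and category; the analogous tone and deception-cue screens described in the Table 6 notes are omitted for brevity:

import numpy as np

# High- and low-readability pools (top and bottom quartiles).
hi = listings[listings["readability"] > listings["readability"].quantile(0.75)]
lo = listings[listings["readability"] < listings["readability"].quantile(0.25)]

rng = np.random.default_rng(0)
pairs = []
for _ in range(10):  # simplified: draws are not forced to be distinct
    treated = hi.loc[rng.choice(hi.index)]
    # Matched pool: same loan purpose, text length within 30 words.
    pool = lo[(lo["category"] == treated["category"])
              & ((lo["word_count"] - treated["word_count"]).abs() <= 30)]
    control = pool.loc[rng.choice(pool.index)]
    pairs.append((treated, control))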
We then ask 50 U.S.-based experienced AMT workers to examine each
matched pair of low and high readability loan requests and answer the question31:
On Prosper.com, an online peer-to-peer lending platform, individuals ask
to borrow money from potential lenders. The borrowers’ request includes
a loan description explaining why they need the loan and their financial
situation. As a lender, these are risky investments – historically more than
one-third of borrowers default. After reading these two loan request
descriptions below, please select which request is more readable.
Because AMT workers only view the texts, their views cannot be influenced
by nontext information (i.e., other correlated variables). In addition, because each
readability pair is an Amazon Human Intelligence Task (HIT), an AMT worker may
classify all 10 pairs with respect to readability or only a subset of those 10 pairs.32
Thus, in total, we collect 500 relative rankings (50 AMT classifications for each of
the 10 loan request pairs) of high- versus low-readability pairs from at least 50 AMT
workers.
We repeat these procedures for more and less positive tone listings (analo-
gously controlling for readability, deception cues, loan category, and listing length)
and replace the words “more readable” with “more positive.” Last, we repeat the
procedures for high and low deception cue listings (analogously controlling for
tone, readability, loan category, and listing length) and replace the last sentence
with, “Because the information in these descriptions is unverified, some borrowers
provide deceptive information (i.e., lie or fail to tell the whole truth). Please read
the following two texts and select which one is less deceptive (i.e., more likely the
whole truth).”
The first column of Table 6 reports the fraction of the 500 AMT classifications
that agree with our (automated) rankings and associated p-values based on a
31 We require AMT workers to be located in the U.S., have a HIT (Human Intelligence Task) approval
rating greater than 98, and at least 1,000 completed HITs. We pay AMT workers 11 cents per pair
evaluated and require they spend a minimum of 20 seconds on each pair. Thus, a worker classifying
3 pairs/minute would earn approximately $20/hour.
32 If an AMT worker (who meets the criteria discussed above) chooses to participate in coding
readability, a pair will be assigned to the worker. After the worker finishes coding the first pair, the
worker can choose to continue or stop. If the worker chooses to continue, another pair will be assigned to
the worker (the worker is never assigned the same pair of loan requests twice).
TABLE 6
Amazon Mechanical Turk Experiments
In Table 6, we randomly select 10 loan descriptions from the top readability quartile and 10 “similar” loans from the bottom
readability quartile, holding loan purpose (e.g., debt consolidation), text length (within 30 words), tone (between the 45th and
55th tone percentiles), and deception cues (between the 45th and 55th deception cue percentiles) approximately constant.
We then ask 50 Amazon Mechanical Turk (AMT) workers to select which of the two loan descriptions is more readable. We
analogously select 10 pairs of loan requests that differ with respect to tone or deception (holding the other dimensions
approximately constant) and ask workers to select which of the two descriptions is more positive (for tone) or less deceptive
(for deception cues). Fifty workers and 10 loan description pairs yield a total of 500 observations for each dimension. The first
column reports the fraction of these 500 AMT rankings that match our automated rankings, along with a p-value from a
binomial test that the value does not differ from 0.5. For each loan request pair, we then ask AMT workers to select which of
the two loans they would fund with their money. The results in the third column report the fraction of the 500 AMT choices that
select the more readable (top row), more positive (second row), or less deceptive (third row) request, along with a p-value
from a binomial test that the value does not differ from 0.5. Finally, we select 10 random loan requests from the intersection of
the top readability quartile, top tone quartile, and bottom deception cue quartile and a matched request (same loan purpose;
text length within 30 words) from the bottom readability quartile, bottom tone quartile, and top deception cue quartile, that is,
a sample of high-readability, positive tone, low-deception cue loan requests and a matched sample of low-readability,
negative tone, high-deception cue loan requests. The bottom row of the last column reports the fraction of 500 AMT
observations where the worker selects the more readable, more positive, fewer deception cue request, along with the
associated p-value from a binomial test that the value does not differ from 0.5. *** indicates statistical significance at the
1% level.
Columns: Sample; Percent Agree with Ranking (p-Value); Percent Lend To (p-Value)
binomial test that the fraction does not differ from the expected value if the rankings
were independent (i.e., 50%). For instance, the top cell suggests that 77% of the
500 AMT rankings of readability pairs agree with our ranking. Similarly, we find that
AMT workers’ views of tone and deception cues also tend to match our rankings
(69% for tone and 58% for deception cues). All three values differ significantly
(at the 1% level) from 50%. In sum, the results in the first column suggest that our
automated scalable methods capture, to some meaningful degree, human interpre-
tations of readability, tone, and deception.
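The binomial test itself is straightforward; a minimal sketch for the readability row (77% of 500 rankings, i.e., 385 agreements):

from scipy.stats import binomtest

# Null hypothesis: AMT rankings are independent of the automated rankings,
# so agreement occurs with probability 0.5.
result = binomtest(k=385, n=500, p=0.5, alternative="two-sided")
print(result.pvalue)  # far below 0.01, i.e., significant at the 1% level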
Our second set of tests examines whether the linguistic dimensions influence
lending decisions. Specifically, we submit the same 10 pairs of low
and high readability listings, the 10 pairs of more and less positive tone listings, and
the 10 pairs of high and low deception cue listings as new tasks and ask AMT
workers33:
On Prosper.com, an online peer-to-peer lending platform, individuals ask
to borrow money from potential lenders. The borrowers’ request includes
a loan description explaining why they need the loan and their financial
33 Because any AMT worker (who meets the criteria discussed above) can choose to participate in
coding for both readability verification and lending decisions, there is some overlap between AMT
workers answering the question of which listing is more readable and (some time later) answering the
lending preference question. For example, 118 unique AMT workers answer the question of which loan
they would fund (for loan pairs that differ with respect to readability). Of those 118 workers, 43 had also
previously evaluated at least some of the loan pairs, answering the question of which listing is more
readable.
C. Limitations
Given evidence that soft information – both text (this study) and images (e.g.,
Duarte et al. (2012), Ravina (2019)) – influences lenders’ decisions, borrowers may
learn to manipulate the information for their benefit. For instance, if attractive
borrowers are more likely to have their loans funded and pay lower rates, a simple
strategy would be to post a photo of somebody beautiful. Similarly, a clever
borrower could hire an editor to ensure their prose is readable, positive, and lacks
deception cues. If enough borrowers adopted such a strategy, these soft dimensions
would no longer have value, that is, they would no longer predict default likelihood. Similarly, if clever
lenders recognize clever borrowers are intentionally manipulating soft information,
34 We also consider the possibility that differences between our relative ranking and AMT workers’
rankings may influence the results. Specifically, we rerun the first three tests in the second column of
Table 6, where each loan is classified as more readable, more positive, or less deceptive based on the
majority of AMT workers’ rankings of the loan. (Because the AMT workers selecting lending preferences
differ from those selecting which listing is more readable, the revised classification is based on the average
AMT classification.) The results, however, are nearly identical because, with a few exceptions, the average
AMT ranking of loan pairs matches our rankings. For instance, AMT workers, on average, agree with our
readability rankings for all 10 pairs of more and less readable loan requests.
35 We also conducted a preliminary beta test of our process based on a randomly selected set of
25 pairs of matched loan descriptions categorized by five AMT workers (thus 125 relative rankings;
5 AMT workers × 25 pairs). Results based on this smaller sample (five coders per pair) were nearly
identical to those for the broader sample reported here.
lenders will no longer value the information. In short, textual descriptions (and
photos) are cheap signals that can be manipulated.
A platform attempting to maximize the value of soft information could,
ideally, create more costly soft information. For instance, with respect to linguistic
dimensions, platforms could increase the “cost” by requiring potential borrowers to
answer a series of questions with a minimum word count (e.g., all answers must be
at least 100 words) such as i) describe the purpose of the loan, ii) describe your
current financial position, and iii) describe how you plan to pay off the loan.
Although answers to these questions could also be manipulated, additional required
writing would increase the costs of doing so. Of course, platforms face a tradeoff –
increasing the cost (time and effort) to borrowers may attract lenders but repel
borrowers.
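As a minimal illustration of the mechanism, the following sketch validates the three example prompts against a 100-word minimum; the function and data structure are hypothetical, not a description of any platform's actual interface:

REQUIRED_PROMPTS = [
    "Describe the purpose of the loan.",
    "Describe your current financial position.",
    "Describe how you plan to pay off the loan.",
]
MIN_WORDS = 100  # per-answer minimum word count

def unmet_prompts(answers):
    """Return the prompts whose answers fall short of the word minimum."""
    return [p for p in REQUIRED_PROMPTS
            if len(answers.get(p, "").split()) < MIN_WORDS]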
VI. Conclusions
Our study finds evidence consistent with the hypotheses that i) peer-to-peer
borrowers’ linguistic style influences online lender behavior and borrower out-
comes; ii) the relation between linguistic style and peer-to-peer lenders’ behavior
results, at least in part, from these investors extracting information from online
borrowers’ linguistic style; iii) peer-to-peer lenders fail to fully incorporate the
informational content of the linguistic dimensions we measure; and iv) borrowers’
writing style appears to contain information, that is, each of our three linguistic style
dimensions is associated with subsequent default risk. Consistent with our
hypothesis, loans with descriptions that are more readable, more positive, and
contain fewer deception cues attract peer-to-peer investor attention: controlling
for verified hard credit information, unverified hard credit information, and easily
quantifiable nonstandard information, such loans are more likely to be funded, receive
more bids, receive more dollars bid (scaled by the amount requested), and enjoy
lower interest rates.
Consistent with peer-to-peer lenders inferring information from borrowers’
writing style (and also controlling for verified hard credit information, unverified
hard credit information, and easily quantifiable nonstandard information), online
borrowers whose writing is more readable, more positive, and contains fewer
deception cues are less likely to default. Our results provide support for the
hypothesis that peer-to-peer investors can, and do, extract meaningful information
from borrowers’ writing style. On the other hand, although our tests suggest that
peer-to-peer lenders garner information from borrowers’ linguistic style, our results
also suggest that these lenders fail to fully use the informational content available
from the linguistic dimensions we examine. This issue is especially pronounced for
deception cues.
Our results provide evidence that readability, tone, and deception cues contain
useful information for investors in these markets. Moreover, we demonstrate that it
is possible to extract economically meaningful linguistic features in a scalable
fashion. Platforms, or third parties, can potentially find opportunities to improve
market efficiency by better leveraging linguistic features, and existing and future
FinTech platforms should recognize that there may be an informational efficiency
cost associated with eliminating borrowers’ writing, even if there are other
benefits to doing so.
Supplementary Material
To view supplementary material for this article, please visit https://doi.org/
10.1017/S0022109022000850.
References
Agarwal, S.; B. W. Ambrose; S. Chomsisengphet; and C. Liu. “The Role of Soft Information in a
Dynamic Contract Setting: Evidence from the Home Equity Credit Market.” Journal of Money,
Credit, and Banking, 43 (2011), 633–655.
Agarwal, S., and R. Hauswald. “Distance and Private Information in Lending.” Review of Financial
Studies, 23 (2010), 2757–2788.
Antweiler, W., and M. Z. Frank. “Is All That Talk Just Noise? The Information Content of Internet Stock
Message Boards.” Journal of Finance, 59 (2004), 1259–1294.
Asay, H. S.; W. B. Elliott; and K. M. Rennekamp. “Disclosure Readability and the Sensitivity of
Investors’ Valuation Judgments to Outside Information.” Accounting Review, 92 (2017), 1–25.
Aue, A., and M. Gamon. “Customizing Sentiment Classifiers to New Domains: A Case Study.” In Pro-
ceedings of the International Conference on Recent Advances in Natural Language Processing,
Vol. 1. Borovets, Bulgaria (2005), 1–2.
Beattie, V. “Accounting Narratives and the Narrative Turn in Accounting Research: Issues, Theory,
Methodology, Methods and a Research Framework.” British Accounting Review, 46 (2014),
111–134.
Bird, B. J. Entrepreneurial Behavior. Glenview, IL: Scott Foresman and Company (1989).
Bonsall, S. B.; A. J. Leone; B. P. Miller; and K. Rennekamp. “A Plain English Measure of Financial
Reporting Readability.” Journal of Accounting and Economics, 63 (2017), 329–357.
Cecchini, M.; H. Aytug; G. J. Koehler; and P. Pathak. “Making Words Work: Using Financial Text as a
Predictor of Financial Events.” Decision Support Systems, 50 (2010), 164–175.
Chen, X.; X. Yao; and S. Kotha. “Entrepreneur Passion and Preparedness in Business Plan Presentations:
A Persuasion Analysis of Venture Capitalists’ Funding Decisions.” Academy of Management Jour-
nal, 52 (2009), 199–214.
Crammer, K., and Y. Singer. “On the Algorithmic Implementation of Multiclass Kernel-based Vector
Machines.” Journal of Machine Learning Research, 2 (2001), 265–292.
Das, S. R., and M. Y. Chen. “Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web.”
Management Science, 53 (2007), 1375–1388.
Dong, W.; S. Liao; and Z. Zhang. “Leveraging Financial Social Media Data for Corporate Fraud
Detection.” Journal of Management Information Systems, 35 (2018), 461–487.
Dorfleitner, G.; C. Priberny; S. Schuster; J. Stoiber; M. Weber; I. de Castro; and J. Kammler.
“Description-Text Related Soft Information in Peer-to-Peer Lending – Evidence from Two Leading
European Platforms.” Journal of Banking and Finance, 64 (2016), 169–187.
Dougal, C.; J. Engelberg; D. Garcia; and C. Parsons. “Journalists and the Stock Market.” Review of
Financial Studies, 25 (2012), 639–679.
Duarte, J.; S. Siegel; and L. Young. “Trust and Credit: The Role of Appearance in Peer-to-Peer Lending.”
Review of Financial Studies, 25 (2012), 2455–2484.
Federal Reserve Bank of Cleveland. “Peer-to-Peer Lending is Poised to Grow.” Available at https://
www.clevelandfed.org/newsroom-and-events/publications/economic-trends/2014-economic-
trends/et-20140814-peer-to-peer-lending-is-poised-to-grow.aspx (2015).
Feldman, R.; S. Govindaraj; J. Livnat; and B. Segal. “Management’s Tone Change, Post Earnings
Announcement Drift, and Accruals.” Review of Accounting Studies, 15 (2010), 915–953.
Franco, G.; O. Hope; D. Vyas; and Y. Zhou. “Analyst Report Readability.” Contemporary Accounting
Research, 32 (2015), 76–104.
Ghose, A., and P. G. Ipeirotis. “Estimating the Helpfulness and Economic Impact of Product Reviews:
Mining Text and Reviewer Characteristics.” IEEE Transactions on Knowledge and Data Engineer-
ing, 23 (2011), 1498–1512.
Ghose, A.; P. G. Ipeirotis; and B. Li. “Designing Ranking Systems for Hotels on Travel Search Engines
by Mining User-Generated and Crowdsourced Content.” Marketing Science, 31 (2012), 493–520.
Goel, S.; J. Gangolly; S. R. Faerman; and O. Uzuner. “Can Linguistic Predictors Detect Fraudulent
Financial Filings?” Journal of Emerging Technologies in Accounting, 7 (2010), 25–46.
Graham, J. R.; C. R. Harvey; and M. Puri. “A Corporate Beauty Contest.” Management Science, 63
(2016), 3044–3056.
Gurun, U. G., and A. W. Butler. “Don’t Believe the Hype: Local Media Slant, Local Advertising, and
Firm Value.” Journal of Finance, 67 (2012), 561–598.
Hancock, J. T.; L. E. Curry; S. Goorha; and M. Woodworth. “On Lying and Being Lied To: A Linguistic
Analysis of Deception in Computer-Mediated Communication.” Discourse Processes, 45 (2008),
1–23.
Hargittai, E. “Hurdles to Information Seeking: Spelling and Typographical Mistakes During Users’
Online Behavior.” Journal of the Association for Information Systems, 7 (2006), 52–67.
Henry, E. “Are Investors Influenced By How Earnings Press Releases Are Written?” Journal of Business
Communication, 45 (2008), 363–407.
Hobson, J. L.; W. J. Mayew; and M. Venkatachalam. “Analyzing Speech to Detect Financial
Misreporting.” Journal of Accounting Research, 50 (2012), 349–392.
Huang, A. H.; A. Y. Zang; and R. Zheng. “Evidence on the Information Content of Text in Analyst
Reports.” Accounting Review, 89 (2014), 2151–2180.
Humpherys, S. L.; K. C. Moffitt; M. B. Burns; J. Burgoon; and W. F. Felix. “Identification of Fraudulent
Financial Statements using Linguistic Credibility Analysis.” Decision Support Systems, 50 (2011),
585–594.
Hwang, B., and H. H. Kim. “It Pays to Write Well.” Journal of Financial Economics, 124 (2017),
373–394.
Iyer, R.; A. I. Khwaja; E. F. P. Luttmer; and K. Shue. “Screening Peers Softly: Inferring the Quality of
Small Borrowers.” Management Science, 62 (2016), 1554–1577.
Jegadeesh, N., and D. Wu. “Word Power: A New Approach for Content Analysis.” Journal of Financial
Economics, 110 (2013), 712–729.
Joachims, T.; T. Finley; and C. Yu. “Cutting-Plane Training of Structural SVMs.” Machine Learning, 77
(2009), 27–59.
Khan, S.; A. Goswami; and V. Kumar. “Peer to Peer Lending Market by Business Model (Alternate
Marketplace Lending and Traditional Lending), Type (Consumer Lending and Business Lending),
and End User (Consumer Credit Loans, Small Business Loans, Student Loans, and Real Estate
Loans): Global Opportunity Analysis and Industry Forecast, 2020–2027.” Allied Market Research.
Available at: https://www.alliedmarketresearch.com/peer-to-peer-lending-market (2020).
Knapp, M. L.; R. P. Hart; and H. S. Dennis. “An Exploration of Deception as a Communication
Construct.” Human Communication Research, 1 (1974), 15–29.
Larcker, D. F., and A. A. Zakolyukina. “Detecting Deceptive Discussions in Conference Calls.” Journal
of Accounting Research, 50 (2012), 495–540.
Lawrence, A. “Individual Investors and Financial Disclosure.” Journal of Accounting and Economics,
56 (2013), 130–147.
Lehavy, R.; F. Li; and K. Merkley. “The Effect of Annual Report Readability on Analyst Following and
the Properties of their Earnings Forecasts.” Accounting Review, 86 (2011), 1087–1115.
Li, F. “Annual Report Readability, Current Earnings, and Earnings Persistence.” Journal of Accounting
and Economics, 45 (2008), 221–247.
Lin, M.; N. R. Prabhala; and S. Viswanathan. “Judging Borrowers by the Company They Keep:
Friendship Networks and Information Asymmetry in Online Peer-to-Peer Lending.” Management
Science, 59 (2013), 17–35.
Lo, K.; F. Ramos; and R. Rogo. “Earnings Management and Annual Report Readability.” Journal of
Accounting and Economics, 63 (2017), 1–25.
Loughran, T., and B. McDonald. “When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and
10-Ks.” Journal of Finance, 66 (2011), 35–65.
Loughran, T., and B. McDonald. “IPO First-Day Returns, Offer Price Revisions, Volatility, and Form S-1
Language.” Journal of Financial Economics, 109 (2013), 307–326.
Loughran, T., and B. McDonald. “Measuring Readability in Financial Disclosures.” Journal of Finance,
69 (2014), 1643–1671.
Miller, B. P. “The Effects of Reporting Complexity on Small and Large Investor Trading.” Accounting
Review, 85 (2010), 2107–2143.
Newman, M. L.; J. W. Pennebaker; D. S. Berry; and J. M. Richards. “Lying Words: Predicting Deception
from Linguistic Styles.” Personality and Social Psychology Bulletin, 29 (2003), 665–675.
Oppenheimer, D. M. “Consequences of Erudite Vernacular Utilized Irrespective of Necessity: Problems
with using Long Words Needlessly.” Applied Cognitive Psychology, 20 (2006), 139–156.