Do Liquidity Measures Measure Liquidity by Goyenko, Holden, and Trzcinka (JFE 2009)



ARTICLE IN PRESS

Journal of Financial Economics 92 (2009) 153–181

Contents lists available at ScienceDirect

Journal of Financial Economics


journal homepage: www.elsevier.com/locate/jfec

Do liquidity measures measure liquidity?☆


Ruslan Y. Goyenko a, Craig W. Holden b, Charles A. Trzcinka b,*

a Desautels Faculty of Management, McGill University, Montreal, Quebec, Canada H3A 1G5
b Kelley School of Business, Indiana University, 1309 East Tenth Street, Bloomington, IN 47405-1701, USA

Article info

Article history:
Received 21 February 2005
Received in revised form 25 February 2008
Accepted 9 June 2008
Available online 3 February 2009

JEL classifications: C15, G12, G20

Keywords: Liquidity; Transaction costs; Effective spread; Price impact; Asset pricing

Abstract

Given the key role of liquidity in finance research, identifying high quality proxies based on daily (as opposed to intraday) data would permit liquidity to be studied over relatively long timeframes and across many countries. Using new measures and widely employed measures in the literature, we run horseraces of annual and monthly estimates of each measure against liquidity benchmarks. Our benchmarks are effective spread, realized spread, and price impact based on both Trade and Quote (TAQ) and Rule 605 data. We find that the new effective/realized spread measures win the majority of horseraces, while the Amihud [2002. Illiquidity and stock returns: cross-section and time-series effects. Journal of Financial Markets 5, 31–56] measure does well measuring price impact. © 2009 Published by Elsevier B.V.

1. Introduction

The role of liquidity in empirical finance has grown rapidly over the past five years, influencing conclusions in asset pricing, market efficiency, and corporate finance. A number of studies have proposed liquidity measures derived from daily return and volume data as proxies for investors' liquidity and transaction costs. These studies usually test whether security returns are related to these liquidity measures but rarely test whether the measures are related to actual transaction costs. The assumption

☆ We thank Utpal Bhattacharya, Andrew Ellul, Jaden Falcone, Joel Hasbrouck, Christian Lundblad, Darius Miller, Marios Panayides, Xiaoyun Yu, and seminar participants at Indiana University and the Frontiers of Finance Conference in Bonaire, Netherlands Antilles. We also thank Charles Jones for making Dow spreads available. We are solely responsible for any errors.
* Corresponding author. Tel.: +1 812 855 9908; fax: +1 812 855 5875. E-mail address: [email protected] (C.A. Trzcinka).

that the available liquidity proxies capture the transaction costs of market participants is often not tested because of the limited availability of actual trading costs. In the US markets, transaction data are only available since 1983, and in many countries transaction data are not available at all. The consequence of not testing liquidity proxies on actual trading data is that there is little consensus on which measures are better and little evidence that any of the proposed measures are related to investor experience. Further, while a handful of studies, Lesmond, Ogden, and Trzcinka (1999), Lesmond (2005), and Hasbrouck (2009), test whether some of the available liquidity proxies are related to liquidity benchmarks computed from transaction data, they construct the liquidity proxies on an annual or quarterly basis. Yet the vast majority of the literature using liquidity proxies employs them on monthly (or finer) data. Given the limited number of liquidity proxies previously tested, the limited set of liquidity benchmarks used in the literature, and the absence of monthly proxies, it is not surprising that there

0304-405X/$ - see front matter © 2009 Published by Elsevier B.V. doi:10.1016/j.jfineco.2008.06.002


are conflicting views about which measure is better and that there is little assurance that these measures actually capture the transaction costs of market participants. In short, not much is known about whether transaction cost proxies measure what researchers claim they measure.

The purpose of this paper is to address this gap in the literature by providing a comprehensive study of liquidity measures. We run horseraces of all the widely used proxies for liquidity, plus three new proxies for effective and realized spread, and nine new proxies for price impact. We use multiple liquidity benchmarks, two high-frequency data sets (TAQ and Rule 605 data), multiple performance metrics, and a long sample period that includes the decimals regime.

We find a close association between many of the measures and actual transaction costs. Some measures are able to precisely estimate the magnitude of effective and realized spreads and many are highly correlated with both spreads and price impact. We can safely assert that the literature has generally not been mistaken in the assumption that liquidity proxies measure liquidity. The new measures we introduce in this paper consistently win a majority of the effective/realized spread horseraces. A measure commonly used in the literature, Pastor and Stambaugh's (2003) Gamma, is clearly dominated by other measures, while the widely used Amihud (2002) measure is a good proxy for price impact.

The paper is organized as follows. Section 2 discusses the empirical design of the paper. In Section 3 we develop the high-frequency liquidity benchmarks used in the horserace, and in Sections 4 and 5 we develop the low-frequency spread proxies and price impact proxies used in the horserace. Section 6 describes the data sets and methodology. Section 7 presents the horserace results. Section 8 concludes the paper.

2. Empirical design

Our basic hypothesis is that useful monthly and annual liquidity measures can be constructed from low-frequency (daily) stock returns and volume data, giving researchers access to liquidity measures over a long price history and in many markets. The US daily stock returns and volume data are available from the Center for Research in Security Prices (CRSP), covering NYSE/AMEX firms from 1926 to the present and NASDAQ firms from 1983 to the present. A wide variety of vendors provide daily stock returns and volume data for international equity markets. For example, Thomson Financial's Datastream provides daily stock returns and volumes covering firms in more than 60 countries from 1994 to the present and daily stock returns for several developed markets going back to the early 1970s.

These tests should be of interest to a broad spectrum of empirical research in financial economics. In the asset pricing literature, Chordia, Roll, and Subrahmanyam (2000) show that various spread measures vary systematically. Goyenko (2006) shows that various spread measures are priced. Sadka (2006), Acharya and Pedersen

(2005), Pastor and Stambaugh (2003), and Watanabe and Watanabe (2006) show that various price impact measures are priced. Fujimoto (2003), Korajczyk and Sadka (2008), Hasbrouck (2009), and others test the pricing of both spread and price impact measures in the US, while Bekaert, Harvey, and Lundblad (2007) test the measures in emerging markets where liquidity concerns may be more pronounced. All of these studies use monthly liquidity estimates. Reliable monthly spread and price impact measures going back in time and/or across countries are needed to determine if these asset pricing relationships hold up.

In the market efficiency literature, De Bondt and Thaler (1985), Jegadeesh and Titman (1993, 2001), Chan, Jegadeesh, and Lakonishok (1996), Rouwenhorst (1998), and many others have found monthly trading strategies that appear to generate significant abnormal returns. Yet, Chordia, Goyal, Sadka, Sadka, and Shivakumar (2008) show that one of the oldest trading strategies in the literature, the post earnings announcement drift, cannot produce returns greater than the Keim and Madhavan (1997) measures. Clearly liquidity measures over time and/or across countries are needed in order to determine if these trading strategies are truly profitable net of a relatively precise measure of cost of trading.

Finally, there is a growing need in corporate finance research for useful monthly liquidity measures. Kalev, Pham, and Steen (2003), Dennis and Strickland (2003), Cao, Field, and Hanka (2004), Lipson and Mortal (2004a), Schrand and Verrecchia (2004), Lesmond, O'Connor, and Senbet (2008), and many others examine the impact of corporate finance events on stock liquidity. Heflin and Shaw (2000), Lipson and Mortal (2004b), Lerner and Schoar (2004), and many others examine the influence of liquidity on capital structure, security issuance form, and other corporate finance decisions. Liquidity measures over a longer period of time would expand the potential sample size of this literature. Liquidity measures across many additional countries would greatly extend the potential diversity of international corporate finance environments that this literature could analyze.

To determine which liquidity measures are best, we compare proxies calculated from low-frequency data to sophisticated benchmarks of liquidity calculated from two high-frequency data sets using time-series correlations, cross-sectional correlations, and prediction errors. Specifically, we compare spread proxies to effective and realized spreads and we compare price impact proxies to two price impact benchmarks. All four of these benchmarks are calculated using the NYSE's Trade and Quote (TAQ) data set from 1993 to 2005. Our monthly benchmarks are computed as monthly averages based on every trade and corresponding BBO¹ quote over the month, and our annual benchmarks are computed as annual averages based on every trade and corresponding BBO quote over the year. We also compare spread proxies to the effective

1 BBO means the best bid and offer. It is the highest bid and lowest ask available for a given stock at a moment in time.


spread for marketable orders² and compare price impact proxies to the price impact across order sizes.³ Both of these benchmarks are calculated using data disclosed under Securities and Exchange Commission (SEC) Rule 605 of Regulation NMS (formerly Regulation 11Ac1-5) from October 2001 to December 2005. Rule 605 requires that all exchanges and other market centers disclose detailed order-based performance statistics by stock, order type, and order size, providing a cross-check to the TAQ-based results.

Our tests consist of running monthly and annual horseraces between 12 spread proxies and 12 price impact proxies, gauging their abilities to match the salient features of our high-frequency-based benchmarks. While some contestants are well established in the literature, many are being tested for the first time. The new spread proxies (described in detail below) are: Effective Tick and Effective Tick2, developed jointly by this paper and Holden (2009); Holden from Holden (2009); and LOT Y-split, developed by this paper. The other spread proxies from the previous literature are: Roll from Roll (1984); Gibbs from Hasbrouck (2004); LOT Mixed, Zeros, and Zeros2 from Lesmond, Ogden, and Trzcinka (1999); Amihud from Amihud (2002); Pastor and Stambaugh from Pastor and Stambaugh (2003); and finally Amivest Liquidity.⁴ The latter three measures are also tested on the price impact dimension. The other nine price impact contestants (also described below) are developed by this paper as extensions of the Amihud measure.

Our first performance metric is the average cross-sectional correlation based on individual firms between the low-frequency liquidity proxy and the high-frequency liquidity benchmark (effective spread, realized spread, or one of the price impact benchmarks). Our second performance metric is the time-series correlation based on an equally weighted portfolio between the liquidity proxy and the liquidity benchmark. Both of these performance metrics are most relevant for asset pricing purposes, where the magnitude of the correlation, not the scale of the low-frequency proxy, matters. Our third and fourth performance metrics are the prediction error between the liquidity proxy and the liquidity benchmark as measured by mean bias and the root mean squared error, respectively. These metrics are most relevant for market efficiency and corporate finance tests, where the scale of the proxy does matter as one wishes to subtract a correctly scaled proxy for transaction costs.

Hasbrouck (2009) runs annual tests between four effective cost measures, comparing each to effective
2 Marketable orders are a combination of market orders and marketable limit orders.
3 Defined as the difference in the effective spread between large and small orders divided by the difference in the average share size between large and small orders.
4 The Amihud, Pastor and Stambaugh, and Amivest measures are perhaps more naturally thought of as price impact measures, but the use of these measures in the literature has been more broadly and loosely justified. Therefore, we test these measures relative to both effective spread and price impact benchmarks.

spread and price impact computed from TAQ data for the 1993 to 2005 period. Among the measures he tests, Gibbs dominates as a proxy for annual effective spread and Illiquidity dominates as a proxy for annual price impact.⁵ Using three annual measures, Lesmond, Ogden, and Trzcinka (1999) find that LOT dominates Roll and Zeros. Lesmond (2005) runs quarterly horseraces between five liquidity measures for 23 emerging countries, and finds that LOT dominates Roll, Illiquidity, Liquidity, and Turnover.

We generally conclude that liquidity measures based on daily data provide good measures of high-frequency transaction cost benchmarks (i.e., liquidity measures do measure liquidity). In the monthly and annual effective and realized spread horseraces, we find that Holden, Effective Tick, and LOT Y-split are the best overall. We also find that in more recent years, during the decimals regime, the performance of all measures deteriorates with the exception of Zeros and the Amihud measures. In the price impact horseraces, the new class of price impact measures introduced in this paper either marginally dominates the Amihud measure or is insignificantly different from it, depending on the benchmark. The new class of price impact measures is also able to capture the magnitude of the special Rule 605 version of price impact. Pastor and Stambaugh's Gamma and Amivest's Liquidity are never in the winning group of any horserace and have very low association with the six liquidity benchmarks analyzed.
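The four performance metrics described in this section can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the toy data, and the choice to pool all firm-periods when computing mean bias and root mean squared error are ours, not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def avg_cross_sectional_corr(proxy, benchmark):
    """Correlate proxy and benchmark across firms within each period,
    then average the per-period correlations."""
    corrs = [pearson(p, b) for p, b in zip(proxy, benchmark)]
    return sum(corrs) / len(corrs)


def portfolio_time_series_corr(proxy, benchmark):
    """Form an equally weighted portfolio each period, then correlate
    the two portfolio series over time."""
    p_port = [sum(row) / len(row) for row in proxy]
    b_port = [sum(row) / len(row) for row in benchmark]
    return pearson(p_port, b_port)


def prediction_errors(proxy, benchmark):
    """Mean bias and RMSE, pooling all firm-periods (an assumption)."""
    errs = [p - b for prow, brow in zip(proxy, benchmark)
            for p, b in zip(prow, brow)]
    mean_bias = sum(errs) / len(errs)
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    return mean_bias, rmse


# Hypothetical data: rows are periods, columns are firms.
benchmark = [[0.010, 0.020, 0.030],
             [0.012, 0.025, 0.041],
             [0.009, 0.030, 0.050]]
# A proxy that is perfectly correlated but biased upward by 0.002.
proxy = [[b + 0.002 for b in row] for row in benchmark]

xs_corr = avg_cross_sectional_corr(proxy, benchmark)
ts_corr = portfolio_time_series_corr(proxy, benchmark)
bias, rmse = prediction_errors(proxy, benchmark)
```

The toy proxy shows why both kinds of metric are reported: a constant upward shift leaves both correlations at one, which is all that matters for asset pricing uses, while the mean bias and RMSE expose the scaling error that matters when the proxy is subtracted from returns as a transaction cost.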

3. High-frequency liquidity benchmarks

3.1. Spread benchmarks

We analyze three spread benchmarks. Our first spread benchmark is the effective spread as calculated from the high-frequency TAQ database. Specifically, for a given stock, the TAQ effective spread on the kth trade is defined as

\[
\text{Effective Spread (TAQ)}_k = 2 \cdot \left| \ln(P_k) - \ln(M_k) \right|, \tag{1}
\]

where \(P_k\) is the price of the kth trade and \(M_k\) is the midpoint of the consolidated BBO prevailing at the time of the kth trade. Aggregating over a time interval i (either a month or a year), a stock's Effective Spread (TAQ)\(_i\) is the dollar-volume-weighted average of Effective Spread (TAQ)\(_k\) computed over all trades in time interval i. Our second spread benchmark is the realized spread from Huang and Stoll (1996), which is the temporary component of the effective spread. Specifically, for a given

5 Hasbrouck extends his basic model to include a latent common liquidity factor for a subsample of stocks. He also estimates his Gibbs measure for all common NYSE/AMEX/NASDAQ stocks from 1927 to 2005 and tests whether liquidity is a priced risk factor.


stock, the TAQ realized spread on the kth trade is defined as

\[
\text{Realized Spread (TAQ)}_k =
\begin{cases}
2 \cdot \left( \ln(P_k) - \ln(P_{k+5}) \right) & \text{when the kth trade is a buy,} \\
2 \cdot \left( \ln(P_{k+5}) - \ln(P_k) \right) & \text{when the kth trade is a sell,}
\end{cases} \tag{2}
\]

where \(P_{k+5}\) is the price of the trade five minutes after the kth trade. The trades are signed according to the Lee and Ready (1991) algorithm. Aggregating over a time interval i (either a month or a year), a stock's Realized Spread (TAQ)\(_i\) is the dollar-volume-weighted average of Realized Spread (TAQ)\(_k\) computed over all trades in time interval i.

Our third spread benchmark is the effective spread as aggregated from the Rule 605 database. Specifically, for a given stock, the Rule 605 dollar effective spread based on the trade generated by the kth order is defined as

\[
\text{\$Effective Spread (605)}_k =
\begin{cases}
2 \cdot (P_k - m_k) & \text{for marketable buys,} \\
2 \cdot (m_k - P_k) & \text{for marketable sells,}
\end{cases} \tag{3}
\]

where \(m_k\) is the midpoint of the consolidated BBO prevailing at the time of receipt of the kth order at the exchange.⁶ Aggregating over month i, a stock's Effective Spread (605)\(_i\) is the share-volume-weighted average of $Effective Spread (605)\(_k\) computed over all market centers (spanning all trades) in month i divided by \(\bar{P}_i\), the average price in month i.

In principle, Effective Spread (605)\(_i\) should be an improvement over Effective Spread (TAQ)\(_i\), as each market center constructs its Rule 605 figures from order data, which are more refined than trade and quote data for several reasons. First, the Rule 605 midpoint is based on an order's time of receipt, whereas a TAQ midpoint is based on the trade's time of execution; an order's time of receipt is a closer proxy to the trader's information set at the time of order submission. Second, there is no confusion in the Rule 605 data about buys vs. sells or about marketable orders vs. non-marketable orders, whereas Lee and Radhakrishna (2000) report that the Lee and Ready (1991) method commonly used with TAQ data incorrectly classifies 24% of inside-the-spread trades that have a clear trade initiator. Third, there is no confusion in the Rule 605 data when a marketable buy is crossed with a marketable sell. Lee and Radhakrishna (2000) find that 40% of the trades in their NYSE Trades, Orders, Reports, and Quotes (TORQ) sample are nondirectional trades, where a marketable buy and marketable sell are crossed. The Rule 605 data correctly treat this case as two marketable executions (both a marketable buy execution and a marketable sell execution). By contrast, users of TAQ data cannot distinguish nondirectional trades vs. directional trades and usually treat this case as a single execution.⁷ Accordingly, the Rule 605 data provide a useful cross-check to the TAQ-based results; however, the Rule 605 data are only available from mid-2001, so the comparison is limited to only 51 months in our sample.

6 Marketable buys are market buy orders and marketable limit buy orders. Marketable sells are market sell orders and marketable limit sell orders. Effective spreads are not reported for non-marketable limit orders in the 605 data.

3.2. Price impact benchmarks

Based upon the literature, we analyze three different price impact benchmarks. A static version of price impact is the slope of the price function at a moment in time. Essentially, this is the cost of demanding additional instantaneous liquidity and can be thought of as the first derivative of the effective spread with respect to order size. Our first price impact benchmark uses two (aggregated) points on this curve to measure the slope. Specifically, for a given stock, the static price impact based on Rule 605 data over time interval i is

\[
\text{Static Price Impact (605)}_i =
\frac{\left( \text{\$Effective Spread (605)}_{\text{Big Orders},i} / \bar{P}_i \right) - \left( \text{\$Effective Spread (605)}_{\text{Small Orders},i} / \bar{P}_i \right)}
{\text{Ave Trade Size (605)}_{\text{Big Orders},i} - \text{Ave Trade Size (605)}_{\text{Small Orders},i}}, \tag{4}
\]

where Big Orders,i is the set of all orders in the range of 2000–9999 shares that execute in time interval i and Small Orders,i is the set of all orders in the range of 100–499 shares that execute in time interval i.

Our second price impact benchmark introduces a time dimension that is not present in Static Price Impact. Five-minute price impact measures the derivative of the cost of demanding a certain amount of liquidity over five minutes, which may be very different from the analogous curve for demanding the same amount of liquidity immediately. In constructing this measure, we follow Hasbrouck (2009) and calculate the price impact as the slope coefficient \(\lambda(\text{TAQ})\) of the regression

\[
r_n = \lambda(\text{TAQ}) \cdot S_n + u_n, \tag{5}
\]

where for the nth five-minute period,⁸ \(r_n\) is the stock return, \(S_n\) is the signed square-root dollar volume, that is, \(S_n = \sum_k \operatorname{sign}(v_{kn}) \sqrt{|v_{kn}|}\), \(v_{kn}\) is the signed dollar volume of the kth trade in the nth five-minute period, and \(u_n\) is the error term.

Our third price impact benchmark focuses on the change in quote midpoint after a signed trade. Price impact is commonly defined as the increase (decrease) in the midpoint over a five-minute interval beginning at the time of the buyer- (seller-) initiated transaction. This is the permanent price change of a given transaction, or equivalently, the permanent component of the effective
7 There are downsides to 605 data as well. An order that is re-routed between market centers is double-counted. Further, the 605 data do not include block trades. The SEC is therefore an imperfect monitor of data quality. For more discussion of these issues, see Boehmer, Jennings, and Wei (2003).
8 We also tested a 15-minute interval with similar results, suggesting that our results are independent of the time interval over which we aggregate the data.


spread. Specifically, for a given stock, the TAQ five-minute price impact on the kth trade is

\[
\text{5-Minute Price Impact (TAQ)}_k =
\begin{cases}
2 \cdot \left( \ln(M_{k+5}) - \ln(M_k) \right) & \text{when the kth trade is a buy,} \\
2 \cdot \left( \ln(M_k) - \ln(M_{k+5}) \right) & \text{when the kth trade is a sell,}
\end{cases} \tag{6}
\]

where \(M_{k+5}\) is the midpoint of the consolidated BBO prevailing five minutes after the kth trade, and \(M_k\) is the midpoint prevailing at the time of the kth trade. We follow the Lee and Ready (1991) algorithm to identify buy and sell transactions. For a given stock aggregated over a time interval i (either a month or a year), the 5-Minute Price Impact (TAQ)\(_i\) is the dollar-volume-weighted average of 5-Minute Price Impact (TAQ)\(_k\) computed over all trades in time interval i.

4. Low-frequency spread proxies

Nine low-frequency spread proxies are explained below. For each measure, we require that the measure always produce a numerical result.⁹

9 If a measure cannot be computed we substitute a default value. Our results are not sensitive to the default value selected.

4.1. Roll

Roll (1984) develops an estimator of the effective spread based on the serial covariance of the change in price as follows. Let \(V_t\) be the unobservable fundamental value of the stock on day t. Assume that it evolves as

\[
V_t = V_{t-1} + e_t, \tag{7}
\]

where \(e_t\) is the mean-zero, serially uncorrelated public information shock on day t. Next, let \(P_t\) be the last observed trade price on day t. Assume it is determined by

\[
P_t = V_t + \tfrac{1}{2} S Q_t, \tag{8}
\]

where S is the effective spread and \(Q_t\) is a buy/sell indicator for the last trade that equals +1 for a buy and −1 for a sell. Assume that \(Q_t\) is equally likely to be +1 or −1, is serially uncorrelated, and is independent of \(e_t\). Taking the first difference of Eq. (8) and combining it with Eq. (7) yields

\[
\Delta P_t = \tfrac{1}{2} S \Delta Q_t + e_t, \tag{9}
\]

where \(\Delta\) is the change operator. Given this setup, Roll shows that the serial covariance is

\[
\operatorname{Cov}(\Delta P_t, \Delta P_{t-1}) = -\tfrac{1}{4} S^2, \tag{10}
\]

or equivalently

\[
S = 2 \sqrt{-\operatorname{Cov}(\Delta P_t, \Delta P_{t-1})}. \tag{11}
\]

When the sample serial covariance is positive, the formula above is undefined and so we substitute a default numerical value of zero. We therefore use a modified version of the Roll estimator:

\[
\text{Roll} =
\begin{cases}
2 \sqrt{-\operatorname{Cov}(\Delta P_t, \Delta P_{t-1})} & \text{when } \operatorname{Cov}(\Delta P_t, \Delta P_{t-1}) < 0, \\
0 & \text{when } \operatorname{Cov}(\Delta P_t, \Delta P_{t-1}) \ge 0.
\end{cases} \tag{12}
\]

4.2. Effective tick

Holden (2009) and this paper jointly develop a proxy of the effective spread based on observable price clustering.¹⁰ Based on the negotiation cost theory of Harris (1991), we assume that trade prices are clustered in order to minimize negotiation costs between potential traders. Let \(S_t\) be the realization of the effective spread at the closing trade of day t. Assume that the realization of the spread on the closing trade of day t is randomly drawn from a set of possible spreads \(s_j,\ j = 1, 2, \ldots, J\), with corresponding probabilities \(\gamma_j,\ j = 1, 2, \ldots, J\). By convention, the possible effective spreads \(s_1, s_2, \ldots, s_J\) are ordered from smallest to largest. For example, on a $1/8 price grid, \(S_t\) is modeled as having a probability \(\gamma_1\) of an \(s_1 = \$1/8\) spread, \(\gamma_2\) of an \(s_2 = \$1/4\) spread, \(\gamma_3\) of an \(s_3 = \$1/2\) spread, and \(\gamma_4\) of an \(s_4 = \$1\) spread.

10 Holden (2009) also develops and tests additional versions of the Effective Tick measure.

Following the intuition of Christie and Schultz (1994), we assume that price clustering is completely determined by spread size. For example, if the spread is $1/4, the model assumes that the bid and ask prices employ only even quarters. The quote could be $25 1/4 bid, $25 1/2 offered, but never $25 3/8 bid, $25 5/8 offered. Thus, if odd-eighth transaction prices are observed, one infers that the spread must be $1/8. This implies that the simple frequency with which closing prices occur in particular price clusters can be used to estimate the spread probabilities \(\hat{\gamma}_j,\ j = 1, 2, \ldots, J\). For example, on a $1/8 fractional price grid, the frequency with which trades occur in four mutually exclusive price sets (odd 1/8s, odd 1/4s, odd 1/2s, and whole dollars) can be used to estimate the probability of a $1/8 spread, $1/4 spread, $1/2 spread, and a $1 spread. Similarly, for a decimal price grid, the frequency with which trades occur in five mutually exclusive sets (off pennies, off nickels, off dimes, off half-dollars, and whole dollars) can be used to estimate the probability of a penny spread, nickel spread, dime spread, quarter spread, and whole dollar spread.

Let \(N_j\) be the number of trades on prices corresponding to the jth spread (\(j = 1, 2, \ldots, J\)) using only positive-volume days in the time interval. In the $1/8 price grid example (where J = 4), \(N_1\) through \(N_4\) are the number of trades on odd 1/8 prices, the number of trades on odd 1/4 prices, the number of trades on odd 1/2 prices, and the number of trades on whole dollar prices, respectively. Let \(F_j\) be the probabilities of trades on prices corresponding to the jth spread (\(j = 1, 2, \ldots, J\)). These empirical probabilities are computed as

\[
F_j = \frac{N_j}{\sum_{j=1}^{J} N_j} \quad \text{for } j = 1, 2, \ldots, J. \tag{13}
\]
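The cluster counts \(N_j\) and empirical frequencies \(F_j\) of Eq. (13) can be sketched for the $1/8 price grid example as follows. This is a hedged illustration: the function names, the input layout of (closing price, volume) pairs, and the sample prices are hypothetical, and the sketch assumes every price lies exactly on the $1/8 grid.

```python
from fractions import Fraction


def cluster_index(price):
    """Map a price on a $1/8 grid to its cluster j = 1..4:
    1 = odd eighth, 2 = odd quarter, 3 = odd half, 4 = whole dollar."""
    eighths = round(price * 8)  # price expressed in eighths of a dollar
    if eighths % 2 == 1:
        return 1
    if eighths % 4 == 2:
        return 2
    if eighths % 8 == 4:
        return 3
    return 4


def cluster_frequencies(days):
    """days: iterable of (closing_price, volume) pairs.
    Returns (N, F), where N[j-1] counts positive-volume days whose closing
    price falls in cluster j and F[j-1] = N_j / sum(N), as in Eq. (13).
    Raises ZeroDivisionError if there are no positive-volume days."""
    counts = [0, 0, 0, 0]
    for price, volume in days:
        if volume > 0:  # only positive-volume days enter the counts
            counts[cluster_index(price) - 1] += 1
    total = sum(counts)
    freqs = [Fraction(n, total) for n in counts]
    return counts, freqs


# Hypothetical closing prices and volumes; the last day has zero volume.
sample_days = [(25.125, 1000), (25.25, 500), (25.5, 800), (26.0, 300),
               (25.375, 700), (25.75, 200), (25.0, 0)]
N, F = cluster_frequencies(sample_days)
```

Exact fractions are used here only so that the frequencies visibly sum to one; floating-point division would serve equally well in practice.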


Let \(U_j\) be the unconstrained probability of the jth spread (\(j = 1, 2, \ldots, J\)). The unconstrained probability of the effective spread is

\[
U_j =
\begin{cases}
2 F_j, & j = 1, \\
2 F_j - F_{j-1}, & j = 2, 3, \ldots, J-1, \\
F_j - F_{j-1}, & j = J.
\end{cases} \tag{14}
\]

The effective tick model directly assumes price clustering (i.e., a higher frequency on rounder increments). However, in small samples it is possible that reverse price clustering may be realized (i.e., a lower frequency on rounder increments). Reverse price clustering unintentionally causes the unconstrained probability of one or more effective spread sizes to go above one or below zero. Thus, constraints are added to generate proper probabilities. Let \(\hat{\gamma}_j\) be the constrained probability of the jth spread (\(j = 1, 2, \ldots, J\)). It is computed in order from smallest to largest as follows:

\[
\hat{\gamma}_j =
\begin{cases}
\min\left[ \max\{U_j, 0\},\ 1 \right], & j = 1, \\
\min\left[ \max\{U_j, 0\},\ 1 - \sum_{k=1}^{j-1} \hat{\gamma}_k \right], & j = 2, 3, \ldots, J.
\end{cases} \tag{15}
\]

Finally, the effective tick measure is simply a probability-weighted average of each effective spread size divided by \(\bar{P}_i\), the average price in time interval i:

\[
\text{Effective Tick} = \frac{\sum_{j=1}^{J} \hat{\gamma}_j s_j}{\bar{P}_i}. \tag{16}
\]

A second version, called Effective Tick2, is otherwise the same except that it uses the daily prices from all days, rather than from positive-volume days only. The difference between the two measures depends on the informativeness of the no-trade prices.

4.3. Holden

Holden (2009) develops a model that uses both serial correlation (like the Roll measure) and price clustering (like Effective Tick) to estimate the effective spread. Indeed, the Holden model formally nests both the Roll model and the Effective Tick model as special cases. His model is based on modifying the model of Huang and Stoll (1997). Huang and Stoll develop a generalized model of the components of the bid-ask spread. A byproduct of the Holden model is a two-way decomposition of the bid-ask spread as estimated from low-frequency data. Holden begins by modifying the Huang and Stoll model to account for changing spreads linked to price clustering. Just like the Effective Tick model above, he specifies a random probability of jumping each period among multiple spreads that are linked to price cluster regimes.

Next, he derives a price change process that is a natural extension of Eq. (9) above:

\[
\Delta P_t = \tfrac{1}{2} S_t Q_t - (1 - \lambda) \tfrac{1}{2} S_{t-1} Q_{t-1} + e_t, \tag{17}
\]

where the effective spread \(S_t\) is allowed to change each day and \(\lambda\) is the percentage of the half-spread attributable to the sum of adverse selection and inventory holding costs. Conversely, \(1 - \lambda\) is the percentage of the half-spread attributable to order processing costs.¹¹ The public information shock \(e_t\) is assumed to be normally distributed with mean \(\bar{e}\) and standard deviation \(\sigma_e\).

Let \(\mu\) be the probability of a trading day and \(1 - \mu\) be the probability of a non-trading day. Consider a $1/8 price grid where \(S_t\) has a probability \(\gamma_1\) of an \(s_1 = \$1/8\) spread, \(\gamma_2\) of an \(s_2 = \$1/4\) spread, \(\gamma_3\) of an \(s_3 = \$1/2\) spread, and \(\gamma_4\) of an \(s_4 = \$1\) spread. Of course, the spread probabilities must sum to one: \(\sum_{j=1}^{J} \gamma_j = 1\). The Holden spread proxy is just the weighted average of the possible spreads:

\[
\text{Holden} \equiv S_H = \sum_{j=1}^{J} \gamma_j s_j. \tag{18}
\]

Define the variable \(C_t\) as the observable price cluster on day t. Specifically, on a zero-volume day, let \(C_t = 0\). On a positive-volume day, let clusters \(C_t = 1, 2, 3,\) and \(4\) correspond to when the trade price is on odd 1/8s, odd 1/4s, odd 1/2s, and whole dollars, respectively. Define \(\hat{Q}_t\) as a buy/sell/zero-volume indicator on day t that equals +1 for a buy, −1 for a sell, and zero for a zero-volume day. Define the unobserved signed half-spread on day t as \(H_t = \tfrac{1}{2} S_t \hat{Q}_t\). Considering all spread and indicator combinations, there are nine possible values of the signed half-spread \(H_t\): −$1/2, −$1/4, −$1/8, −$1/16, $0, $1/16, $1/8, $1/4, $1/2.

For three successive trading days we observe a price triplet \((P_t, P_{t+1}, P_{t+2})\), which corresponds to a price cluster triplet \((C_t, C_{t+1}, C_{t+2})\). Define \(\mathbf{H}\) as the set of all half-spread triplets \((H_t, H_{t+1}, H_{t+2})\) that are feasible given the observed price cluster triplet.¹² For a given set of parameter values \((\mu, \gamma_1, \gamma_2, S_H, \bar{e}, \sigma_e, \lambda)\), Holden calculates the likelihood of the price triplet

\[
\begin{aligned}
\Pr(P_t, P_{t+1}, P_{t+2} \mid \mu, \gamma_1, \gamma_2, S_H, \bar{e}, \sigma_e, \lambda)
= \sum_{(H_t, H_{t+1}, H_{t+2}) \in \mathbf{H}} \Big\{ & \Pr(C_t) \Pr(C_{t+1}) \Pr(C_{t+2}) \\
& \cdot \Pr(H_t \mid C_t) \Pr(H_{t+1} \mid C_{t+1}) \Pr(H_{t+2} \mid C_{t+2}) \\
& \cdot n\left[ P_{t+1} - H_{t+1} - P_t + (1 - \lambda) H_t \right] \\
& \cdot n\left[ P_{t+2} - H_{t+2} - P_{t+1} + (1 - \lambda) H_{t+1} \right] \Big\},
\end{aligned} \tag{19}
\]

where \(n(\cdot)\) is the normal density with mean \(\bar{e}\) and standard deviation \(\sigma_e\). Using three prices at a time allows the serial correlation of the price changes to be picked up, but avoids the combinatoric explosion of feasible half-spread

11 This component also includes any liquidity provider rents due to market power or price discreteness. 12 For example, suppose that the price P t $251, which is an odd8 eighth that corresponds to price cluster C t 1. For this price cluster the 1 only feasible spread is St $8. Thus, there are only two feasible values of 1 1 the signed half-spreads Ht 2 f$16; $16g: Similarly, P t1 and P t2 imply the feasible values of the signed half-spreads Ht1 and Ht2 . Taking all combinations of the feasible values on each day yields the set of feasible half-spread triplets.

ARTICLE IN PRESS
R.Y. Goyenko et al. / Journal of Financial Economics 92 (2009) 153181 159

combinations that would result if all observations were used at the same time. Taking the log of Eq. (19), the likelihood function is the sum of the log likelihoods of all price triplets in the time period of aggregation
T 2 X t1

stock j is given by Rjt R a1j jt Rjt R jt Rjt R jt a2j when R oa1j jt when a2j oR . jt (22)

when a1j oR oa2j jt

LnPrP t ; P t1 ; P t2 jm; g1 ; g2 ; SH ; e; se ; l,

(20)

The LOT liquidity measure is simply the difference between the percent buying cost and the percent selling cost: LOT aj2 aj1 . (23)

where T is the number of days in the time period of aggregation. The likelihood function is maximized by choice of the parameters m; g1 ; g2 ; SH ; e; se ; l subject to the constraints that g1 ; g2 ; g3 ; g4 ; m; SH ; se ; and l are greater than or equal to zero and the constraints that g1 ; g2 ; g3 ; g4 ; m; and l are less than or equal to one.13 4.4. Gibbs Hasbrouck (2004) introduces a Gibbs sampler estimation of the Roll model using prices from all days. Hasbrouck assumes that the public information shock et in the Roll model is normally distributed with mean of zero and variance of s2 : He denotes the half-spread in the e Roll model as c  1S. 2 Hasbrouck uses the Gibbs sampler to numerically estimate the model parameters fc; s2 g, the latent buy/ e sell/no-trade indicators Q fQ 1 ; Q 2 ; . . . ; Q T g; and the latent efcient prices V fV 1 ; V 2 ; . . . ; V T g, where T is the number of days in the time interval.14 4.5. LOT Lesmond, Ogden, and Trzcinka (1999) develop an estimator of the effective spread based on the assumption of informed trading on non-zero-return days and the absence of informed trading on zero-return days. A standard market model relationship holds on nonzero-return days, but a at horizontal segment applies on zero-return days. The LOT model assumes that the unobserved true return R of a stock j on day t is given by jt R jt bj Rmt jt , (21)

Lesmond, Ogden, and Trzcinka develop the following maximum likelihood estimator of the models parameters: La1j ; a2j ; bj ; sj jRjt ; Rmt Y 1 Rjt a1j bj Rmt ! n
1

sj   ! a2j bj Rmt a1j bj Rmt N N sj sj 0 Y 1 Rjt a2j bj Rmt ! n sj sj 2


Y  0; aj2 ! 0; bj X0; sj X0, (24)

sj

S:T: aj1

where N( ) is the cumulative normal distribution. A very important issue concerning LOT is the denition of the three regions over which the estimation is done. The original LOT (1999) measure, which we call LOT Mixed, distinguishes the three regions based on both the X-variable and the Y-variable. That is, region 0 is Rjt 0, region 1 is Rjt a0 and Rmt 40, and region 2 is Rjt a0 and Rmt o0. In this paper we develop an alternative measure, LOT Y-split, that breaks out the three regions based on the Y-variable. That is, region 0 is Rjt 0, region 1 is Rjt 40 and region 2 is Rjt o0. Interestingly, LOT Y-split and LOT Mixed sometimes produce very different results, so it is worth tracking both of them. 4.6. Zeros Lesmond, Ogden, and Trzcinka (1999) introduce the proportion of days with zero returns as a proxy for liquidity. Two key arguments support this measure. First, stocks with lower liquidity are more likely to have zerovolume days and thus are more likely to have zero-return days. Second, stocks with higher transaction costs have less private information acquisition (because it is more difcult to overcome higher transaction costs), and thus, even on positive volume days, they are more likely to have no-information-revelation, zero-return days. Lesmond, Ogden, and Trzcinka dene the proportion of days with zero returns as Zeros # of days with zero returns=T, (25)

where bj is the sensitivity of stock j to the market return Rmt on day t and jt is a public information shock on day t. They assume that jt is normally distributed with mean zero and variance s2 . Let a1j p0 be the percent transaction j cost of selling stock j and a2j X0 be the percent transaction cost of buying stock j. Then the observed return Rjt on a

13 The constraints g3 X0 and g3 p1 can be expressed as a function of the parameters to be estimated m; g1 ; g2 ; SH ; e; se ; l as: 21 S g1 7 8 g2 3X0 and 21 S g1 7 g2 3p1, respectively. Similarly, the con4 8 4 straints g4 X0 and g4 p1 can be expressed as: 1 g1 g2 21 S g1 7 g2 3X0 and 1 g1 g2 2 1 S g1 7 g2 3 p1, respec8 4 8 4 tively. 14 Hasbrouck generously provides the programming code to compute the Gibbs estimator on his Web site. We directly use his code without modication of the main routines for both monthly and annual computations.

where T is the number of trading days in a month. An alternative version of this measure, Zeros2, is dened as Zeros2 # of positive-volume days with zero return=T. (26) For emerging markets, the Zeros measure has been used by Bekaert, Harvey, and Lundblad (2007).
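As a rough illustration of Eqs. (25) and (26), the two Zeros measures can be computed from one stock's daily returns and volumes as sketched below. The function name `zeros_measures` and the sample data are hypothetical and not from the paper, and T is assumed to be the number of daily observations supplied for the period.

```python
from typing import Sequence, Tuple


def zeros_measures(returns: Sequence[float], volumes: Sequence[float]) -> Tuple[float, float]:
    """Return (Zeros, Zeros2) for one stock over one period, following Eqs. (25)-(26)."""
    if len(returns) != len(volumes) or len(returns) == 0:
        raise ValueError("returns and volumes must be non-empty and of equal length")
    n_days = len(returns)  # T: assumed to be the number of trading days supplied
    # Zeros counts every zero-return day, traded or not.
    zero_return_days = sum(1 for r in returns if r == 0)
    # Zeros2 counts only zero-return days that had positive volume.
    zero_return_traded_days = sum(
        1 for r, v in zip(returns, volumes) if r == 0 and v > 0
    )
    return zero_return_days / n_days, zero_return_traded_days / n_days


# Hypothetical five-day period: three zero-return days, one of which had trading volume.
daily_returns = [0.0, 0.01, 0.0, -0.02, 0.0]
daily_volumes = [0, 1500, 200, 900, 0]
zeros, zeros2 = zeros_measures(daily_returns, daily_volumes)
print(zeros, zeros2)
```

By construction Zeros2 can never exceed Zeros, since it counts a subset of the zero-return days.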


4.7. Other proxies

Three additional proxies are tested in the spread horseraces: (1) "Illiquidity" from Amihud (2002), (2) "Gamma" from Pastor and Stambaugh (2003), and (3) the (Amivest) "Liquidity" ratio. These measures are intended to proxy for price impact. Therefore, they are tested only for correlation with effective and realized spreads. All three are described below.

5. Low-frequency price impact proxies

Next, we explain 12 low-frequency price impact proxies. As before, we require that each measure always produce a numerical result.

5.1. Amihud

Amihud (2002) develops a price impact measure that captures the "daily price response associated with one dollar of trading volume." Specifically, he uses the ratio

$$\text{Illiquidity} = \text{Average}\left(\frac{|r_t|}{\text{Volume}_t}\right), \tag{27}$$

where $r_t$ is the stock return on day t and $\text{Volume}_t$ is the dollar volume on day t. The average is calculated over all positive-volume days, since the ratio is undefined for zero-volume days.

5.2. Extended Amihud proxies

We develop a new class of price impact proxies by extending the Amihud measure. We start with the Amihud base model. We then decompose the total return in the base model numerator into a liquidity component and a non-liquidity component. This is done by dividing both sides of the modified Huang and Stoll model in Eq. (17) by $P_{t-1}$ to obtain

$$r_t = \frac{\tfrac{1}{2} S_t Q_t - (1-\lambda)\tfrac{1}{2} S_{t-1} Q_{t-1}}{P_{t-1}} + \frac{e_t}{P_{t-1}}, \tag{28}$$

where the first term on the right-hand side is the liquidity component and the second term is the non-liquidity component. $\tfrac{1}{2} S_t Q_t - (1-\lambda)\tfrac{1}{2} S_{t-1} Q_{t-1}$ is the signed effective half-spread (which includes three components: adverse selection, order processing, and inventory costs) at time t minus the order processing component of the lagged signed effective half-spread at t-1, and $e_t$ is the mean-zero, serially uncorrelated public information shock on day t. This model includes the Glosten (1987) model as a special case when inventory costs are zero. Substituting Eq. (28) into Eq. (27), we get

$$\text{Average}\left(\frac{\left|\dfrac{\tfrac{1}{2} S_t Q_t - (1-\lambda)\tfrac{1}{2} S_{t-1} Q_{t-1}}{P_{t-1}} + \dfrac{e_t}{P_{t-1}}\right|}{\text{Volume}_t}\right). \tag{29}$$

By assumption, the random variable $e_t$ is independent of the liquidity component. We therefore drop the non-liquidity component to measure the liquidity costs associated with one dollar of trading volume as

$$\text{Average}\left(\frac{\left|\dfrac{\tfrac{1}{2} S_t Q_t - (1-\lambda)\tfrac{1}{2} S_{t-1} Q_{t-1}}{P_{t-1}}\right|}{\text{Volume}_t}\right). \tag{30}$$

Essentially, this eliminates a noise term that is unrelated to the variable of interest. The average numerator value is close (at least in magnitude) to the percent effective half-spread. Since we do not observe the numerator in low-frequency data sets, we construct an extended Amihud proxy for time interval i by using a spread proxy over time interval i and the average daily dollar volume over the same time interval as follows:

$$\text{Extended Amihud Proxy}_i = \frac{\text{Spread Proxy}_i}{\text{Average Daily Dollar Volume}_i}, \tag{31}$$

where the whole spread convention is used instead of the half-spread convention. The original Amihud measure computes the average of daily ratios, where each daily ratio is absolute return/dollar volume. The extended Amihud proxies use an alternative convention by computing the ratio of two averages. If we view the spread proxy as representing the average daily spread over interval i, then the ratio can be interpreted as the average daily spread/average daily dollar volume.15

The equation above defines a class of price impact proxies depending on which proxy for percent effective spread is used. For example, one member of this class is Roll Impact for time interval i, which uses the Roll measure for time interval i and the average daily dollar volume over time interval i as follows:

$$\text{Roll Impact}_i = \frac{\text{Roll}_i}{\text{Average Daily Dollar Volume}_i}. \tag{32}$$
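The contrast between the Amihud ratio in Eq. (27), an average of daily ratios, and the extended Amihud proxies in Eq. (31), a ratio of two averages, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and sample figures are hypothetical, the average daily dollar volume is taken over all days supplied (the text does not say whether zero-volume days are included), and raising an error when no positive-volume day exists is a placeholder rather than the paper's treatment.

```python
from typing import Sequence


def amihud_illiquidity(returns: Sequence[float], dollar_volumes: Sequence[float]) -> float:
    """Average of |r_t| / Volume_t over positive-volume days, as in Eq. (27)."""
    ratios = [abs(r) / v for r, v in zip(returns, dollar_volumes) if v > 0]
    if not ratios:
        # Assumption: the paper does not describe this case in the section above.
        raise ValueError("no positive-volume days in the interval")
    return sum(ratios) / len(ratios)


def extended_amihud(spread_proxy: float, dollar_volumes: Sequence[float]) -> float:
    """Spread proxy divided by average daily dollar volume, as in Eq. (31)."""
    average_volume = sum(dollar_volumes) / len(dollar_volumes)  # assumed: all days
    if average_volume <= 0:
        raise ValueError("average daily dollar volume must be positive")
    return spread_proxy / average_volume


# Hypothetical four-day interval with one zero-volume day.
returns = [0.01, -0.02, 0.0, 0.005]
dollar_volumes = [1_000_000.0, 2_000_000.0, 0.0, 500_000.0]

illiquidity = amihud_illiquidity(returns, dollar_volumes)
# A hypothetical Roll estimate of 2% plays the role of Spread Proxy_i.
roll_impact = extended_amihud(0.02, dollar_volumes)
print(illiquidity * 1_000_000, roll_impact * 1_000_000)
```

The final scaling by 1,000,000 mirrors the convention stated in the note to Table 1.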

We test nine versions of this class of price impact measures based on nine proxies for percent effective spread. The nine measures we test are: Roll Impact, Effective Tick Impact, Effective Tick2 Impact, Holden Impact, Gibbs Impact, LOT Mixed Impact, LOT Y-split Impact, Zeros Impact, and Zeros2 Impact.

15 Both the original Amihud measure and the extended Amihud proxies aggregate trades up to the level of a day. This is justified if all trades are of identical size, but if trades are of varying size, then this is a somewhat arbitrary normalization.

5.3. Pastor and Stambaugh

Pastor and Stambaugh (2003) develop a measure of price impact called Gamma by running the regression

$$r^e_{t+1} = \theta + \phi r_t + (\text{Gamma})\,\text{sign}(r^e_t)(\text{Volume}_t) + \varepsilon_t, \tag{33}$$

where $r^e_t$ is the stock's excess return above the CRSP value-weighted market return on day t and $\text{Volume}_t$ is the dollar volume on day t. Intuitively, Gamma measures the reverse of the previous day's order flow shock. Gamma should have a negative sign. The larger the absolute value of Gamma, the larger the implied price impact.

5.4. Amivest liquidity

The Amivest Liquidity ratio is a measure of price impact:

$$\text{Liquidity} = \text{Average}\left(\frac{\text{Volume}_t}{|r_t|}\right). \tag{34}$$

The average is calculated over all non-zero-return days, since the ratio is undefined for zero-return days. A larger value of Liquidity implies a lower price impact. This measure has been used by Cooper, Groth, and Avera (1985), Amihud, Mendelson, and Lauterbach (1997), Berkman and Eleswarapu (1998), and others.

6. Data

To compute our effective spread, realized spread, and price impact benchmarks, we use two high-frequency data sets. First, we use NYSE TAQ data from 1993 to 2005. Because of the computational limits associated with some of the measures, we select a random sample. Following the methodology of Hasbrouck (2009), a stock must meet five criteria to be eligible: (1) it is a common stock, (2) it is present on the first and last TAQ master file for the year, (3) it has NYSE, AMEX, or NASDAQ as the primary listing exchange, (4) it does not change primary exchange, ticker symbol, or CUSIP over the year, and (5) it is listed in CRSP. We randomly select 400 stocks from the universe of eligible stocks in 1993. Rolling forward, if any of the 1993 selections is not eligible in 1994, we randomly draw a replacement from the universe of eligible stocks in 1994. We continue rolling forward in likewise fashion over a 13-year span. Thus, we have 5,200 stock-years. We use the same set of stocks for the monthly measures. We lose a small number of observations in extremely illiquid stocks because of insufficient trades (two or fewer) on positive-volume days to run the Bayesian regression that is part of the Gibbs measure. This results in 62,100 stock-months from TAQ.

Second, we use data that are required to be disclosed under Rule 605 of Regulation NMS (formerly Regulation 11Ac1-5) from October 2001 to December 2005.
The data are collected and manually assembled from the Transaction Auditing Group, Inc. (www.tagaudit.com). We use the same stocks as above. Data on NYSE/AMEX firms are taken from their respective market center statistics. Data on NASDAQ firms are aggregated by volume-weighting the disclosed statistics from the following market centers: Small Order Execution System (SOES), all Electronic Communication Networks (ECNs) (Archipelago (ARCA), Instinet (INET), Island (ISLD), NexTrade (NTRD), Redibook (REDI)), and the top 10 NASDAQ market makers16 (Schwab (SCHB), Brutt (BRUT), Goldman Sachs (GSCO), Knight (NITE and TRIM), GVR (GVRC), B-Trade (BTRD), Lehman Brothers (LEHM), Credit Suisse First Boston (FBCO), Merrill Lynch (MLCO), and J.P. Morgan (JPMS)).

16 The top 10 list is based on NASDAQ composite volume for the month of March 2004 at www.nasdaqtrader.com.

To compute our low-frequency liquidity measures, we use the Daily Stock database from CRSP over the same time period. We notice that the analytic-formula proxies (Roll, Effective Tick, Effective Tick2, Zeros, Zeros2, Illiquidity, Gamma, and Liquidity) are fast to compute. By contrast, the single-measure, numerically iterated proxies (Gibbs, LOT Mixed, and LOT Y-split) are slower to compute, as is the combination measure, Holden, which is the most computationally intensive. In perspective, all low-frequency proxies, with the exception of the Holden measure, are faster to compute than their high-frequency counterparts.

Table 1 provides summary descriptive statistics. Panel A describes monthly spread benchmarks and proxies calculated from 1993–2005 TAQ data. The high-frequency benchmark, Effective Spread (TAQ), has a mean of 0.029 and a median of 0.016. Since the effective costs are logarithmic, the mean corresponds to effective costs of about 3%. Looking across the spread proxies, we see that Roll, Effective Tick, Effective Tick2, Holden, Gibbs, and LOT Y-split are approximately the same in magnitude as the benchmark. LOT Mixed is approximately double the benchmark. The rest of the low-frequency measures are completely different in order of magnitude. Panel B describes annual spread benchmarks and proxies, where the picture about order of magnitude is essentially the same.

Realized spread is the temporary component of effective spread. Its mean corresponds to 1.5%, which is approximately half of the effective spread for monthly data (Panel A). Effective Tick, Effective Tick2, Holden, and Gibbs are very close in magnitude to the realized spread. The same pattern persists for annual data (Panel B).

Panel C of Table 1 describes monthly spread benchmarks and proxies calculated from 10/2001–12/2005 Rule 605 data. Effective Spread (605) has a mean of 0.015 and a median of 0.006. Again, the low-frequency proxies have essentially the same magnitude relationships as in Panel A. Compared to the monthly TAQ effective spread in Panel A, Effective Spread (605) is roughly half as large. This difference can be attributed to the following. The TAQ effective spread is the percent dollar-volume-weighted average spread for each month, while the Rule 605 effective spread is the share-weighted average monthly dollar spread reported by market centers, normalized by the average monthly price. Further, the TAQ effective spread is obtained as the absolute value of the difference between price and the BBO midpoint, while the Rule 605 effective spread is computed by market center as the signed value, where buy and sell transactions are identified by market makers.

Panel D of Table 1 describes monthly price impact benchmarks and proxies calculated from 1993–2005 TAQ data. The high-frequency benchmark, Lambda (TAQ), has a mean of 130.425 and a median of 15.793, after multiplying by 1,000,000. At its median value, the TAQ-based price impact coefficient Lambda implies that a $10,000 buy order would move the log price by approximately
Effective Spread (605) has a mean of 0.015 and a median of 0.006. Again, the low-frequency proxies have essentially the same magnitude relationships as in Panel A. Compared to monthly TAQ effective spread in Panel A, effective spread (605) is almost twice smaller in magnitude. This difference can be attributed to the following. The TAQ effective spread is the percent dollar-volumeweighted average spread for each month while the Rule 605 effective spread is the dollar share-weighted average monthly spread reported by market centers normalized by the average monthly price. Further, the TAQ effective spread is obtained as the absolute value of the difference between price and the BBO midpoint, while the Rule 605 effective spread is computed by market center as the signed value, where buy and sell transactions are identied by market makers. Panel D of Table 1 describes monthly price impact benchmarks and proxies calculated from 19932005 TAQ data. The high-frequency benchmark, Lambda (TAQ), has a mean of 130.425 and a median of 15.793, after multiplying by 1,000,000. At its median value, the TAQ-based price impact coefcient Lambda implies that a $10,000 buy order would move the log price by approximately

Table 1
Descriptive statistics

The benchmarks Effective spread (TAQ), Realized spread (TAQ), Lambda (TAQ), and 5-Minute Price Impact (TAQ) are calculated from every trade and corresponding BBO quote in the NYSE TAQ database for a sample firm-month or firm-year. Effective spread (TAQ) is the dollar-volume-weighted average of two times the absolute value of log price minus log midpoint. Realized spread (TAQ) is the dollar-volume-weighted average of two times the log price minus log of the five-minutes-later price for buys and the negative of previous for sells. Lambda (TAQ) is the coefficient from regressing the stock return over a five-minute interval on the signed square-root dollar-volume over the same interval with intercept omitted. 5-Minute Price Impact (TAQ) is the dollar-volume-weighted average of two times the log five-minutes-later midpoint minus the log midpoint for buys and negative of previous for sells. Lambda (TAQ) is in (percent return)/(square root of dollars). The other three TAQ benchmarks are unitless.

The benchmarks Effective Spread (605) and Static Price Impact (605) are calculated from data required to be disclosed under SEC Rule 605 (formerly 11Ac1-5) for a sample firm-month. Effective spread (605) is the share-weighted average of two times the price minus midpoint for buys and of two times the midpoint minus price for sells, then divided by the average price over the month or year. Static Price Impact (605) is dollar effective spread for big orders divided by average price minus dollar effective spread for small orders divided by average price, then divided by the average trade size of big orders minus the average trade size of small orders. Effective spread (605) is unitless. Static Price Impact (605) is in dollars/share.

All spread proxies and price impact proxies are calculated from CRSP daily stock price and volume data for a sample firm-month or firm-year. The spread proxies are: Roll from Roll (1984), Effective Tick and Effective Tick2 developed here and in Holden (2009), Holden from Holden (2009), Gibbs from Hasbrouck (2004), LOT Mixed, Zeros, and Zeros2 from Lesmond, Ogden, and Trzcinka (1999), LOT Y-split developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The price impact proxies are: Roll Impact, Effective Tick Impact, Effective Tick2 Impact, Holden Impact, Gibbs Impact, LOT Mixed Impact, and LOT Y-split Impact developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio.

The TAQ sample spans 1993–2005 inclusive and consists of 400 randomly selected stocks with annual replacement of stocks that do not survive, resulting in 62,100 firm-months or 5,200 firm-years. The Rule 605 sample spans 10/2001 to 12/2005 inclusive and consists of 400 randomly selected stocks with annual replacement of stocks that do not survive, resulting in 19,039 firm-months.

Panel A: Monthly, 1993–2005, using a TAQ benchmark

Measure                        Average    Std dev    Min        Median     Max
Spread benchmarks
Effective spread (TAQ)         0.029      0.040      0.0001     0.016      0.896
Realized spread (TAQ)          0.015      0.032      -0.370     0.005      1.320
Spread proxies
Roll                           0.027      0.037      0.000      0.016      0.906
Effective Tick                 0.017      0.032      0.000      0.008      0.929
Effective Tick2                0.016      0.030      0.000      0.007      0.949
Holden                         0.018      0.030      0.000      0.009      0.917
Gibbs                          0.018      0.021      0.000      0.012      0.673
LOT Mixed                      0.056      0.089      0.000      0.031      1.000
LOT Y-split                    0.023      0.051      0.000      0.009      1.000
Zeros                          0.143      0.147      0.000      0.095      0.909
Zeros2                         0.127      0.130      0.000      0.095      0.909

Panel B: Annual, 1993–2005, using a TAQ benchmark

Measure                        Average    Std dev    Min        Median     Max
Spread benchmarks
Effective spread (TAQ)         0.026      0.034      0.0003     0.016      0.672
Realized spread (TAQ)          0.014      0.024      -0.044     0.007      0.808
Spread proxies
Roll                           0.025      0.032      0.000      0.016      0.327
Effective Tick                 0.013      0.019      0.000      0.007      0.289
Effective Tick2                0.013      0.018      0.000      0.007      0.340
Holden                         0.014      0.019      0.000      0.008      0.269
Gibbs                          0.014      0.018      0.001      0.007      0.190
LOT Mixed                      0.074      0.117      0.000      0.039      1.787
LOT Y-split                    0.027      0.061      0.000      0.011      1.119
Zeros                          0.145      0.126      0.000      0.115      0.917
Zeros2                         0.128      0.101      0.000      0.109      0.653

Panel C: Monthly, 10/2001–12/2005, using a 605 benchmark

Measure                        Average    Std dev    Min        Median     Max
Spread benchmark
Effective spread (605)         0.015      0.033      0.000      0.006      0.948
Spread proxies
Roll                           0.019      0.028      0.000      0.012      0.906
Effective Tick                 0.006      0.015      0.000      0.002      0.425
Effective Tick2                0.005      0.014      0.000      0.002      0.447
Holden                         0.007      0.014      0.000      0.003      0.482
Gibbs                          0.013      0.015      0.000      0.009      0.393
LOT Mixed                      0.025      0.040      0.000      0.014      1.000
LOT Y-split                    0.006      0.018      0.000      0.000      0.581
Zeros                          0.049      0.073      0.000      0.000      0.667
Zeros2                         0.046      0.069      0.000      0.000      0.667

Panel D: Monthly, 1993–2005, using a TAQ benchmark

Measure                        Average    Std dev        Min           Median     Max
Price impact benchmarks (a)
Lambda (TAQ)                   130.425    2446.202       -41544.120    15.793     398507
5-Minute Price Impact (TAQ)    0.031      0.038          0.000         0.020      1.022
Price impact proxies (a)
Roll Impact                    3.816      57.617         0.000         0.015      6978
Effective Tick Impact          4.587      154.809        0.000         0.020      32742
Effective Tick2 Impact         4.049      147.568        0.000         0.019      32742
Holden Impact                  4.068      93.306         0.000         0.024      16371
Gibbs Impact                   3.626      75.851         0.000         0.029      11399
LOT Mixed Impact               12.211     288.448        0.000         0.074      42000
LOT Y-split Impact             9.295      284.875        0.000         0.018      42000
Zeros Impact                   20.917     305.990        0.000         0.202      38000
Zeros2 Impact                  7.782      102.754        0.000         0.148      21000
Amihud                         6.314      91.957         0.000         0.104      14160
Pastor and Stambaugh           0.179      10.129         -1508.411     0.000      798
Amivest Liquidity              639,355    155,561,102    0.000         26.622     38,762,898,699

Panel E: Annual, 1993–2005, using a TAQ benchmark

Measure                        Average    Std dev       Min           Median     Max
Price impact benchmarks (a)
Lambda (TAQ)                   70.285     300.430       -10943.480    15.535     7655.088
5-Minute Price Impact (TAQ)    0.031      0.031         0.002         0.021      0.414
Price impact proxies (a)
Roll Impact                    2.045      17.937        0.000         0.015      834.616
Effective Tick Impact          1.569      25.274        0.000         0.015      1644.99
Effective Tick2 Impact         1.335      22.932        0.000         0.015      1504.080
Holden Impact                  1.353      13.734        0.000         0.017      581.405
Gibbs Impact                   1.486      14.257        0.000         0.014      578.151
LOT Mixed Impact               6.604      87.651        0.000         0.089      5381.836
LOT Y-split Impact             4.346      70.645        0.000         0.023      3826.47
Zeros Impact                   12.879     191.552       0.000         0.237      11554.83
Zeros2 Impact                  4.972      31.360        0.000         0.236      1424.65
Amihud                         6.307      46.973        0.000         0.148      1681.365
Pastor and Stambaugh           0.018      0.292         -5.598        0.000      8.436
Amivest Liquidity              586,003    41,202,127    0.007         36.563     2,970,331,874

Panel F: Monthly, 10/2001–12/2005, using a 605 benchmark

Measure                        Average      Std dev        Min          Median     Max
Price impact benchmark (a)
Static Price Impact (605)      1.016        31.278         -1491.101    0.326      2407.128
Price impact proxies (a)
Roll Impact                    1.600        19.639         0.000        0.003      1525.001
Effective Tick Impact          1.057        28.910         0.000        0.002      3590.67
Effective Tick2 Impact         0.985        39.373         0.000        0.002      5229.895
Holden Impact                  0.875        12.177         0.000        0.004      699.319
Gibbs Impact                   1.071        15.198         0.000        0.012      1372.41
LOT Mixed Impact               2.659        40.269         0.000        0.013      3773.920
LOT Y-split Impact             1.213        27.799         0.000        0.000      3255.19
Zeros Impact                   5.713        125.983        0.000        0.000      15587.53
Zeros2 Impact                  2.963        20.595         0.000        0.000      894.38
Amihud                         4.046        66.740         0.000        0.034      7245.073
Pastor and Stambaugh           0.025        3.446          -91.366      0.000      408.992
Amivest Liquidity              2,066,923    280,924,448    0.002        94.631     38,762,898,699

Panel G: Observations classified by exchange listing

Data                           Total      NYSE       AMEX      NASDAQ
Monthly TAQ, 1993–2005         62,100     15,536     4,431     42,133
Annual TAQ, 1993–2005          5,200      1,295      370       3,535
Monthly 605, 10/2001–12/2005   19,039     5,167      1,633     12,239

(a) All price impact benchmarks and proxies are multiplied by 1,000,000, except for Liquidity, which is divided by 1,000,000, and 5-Minute Price Impact, which is not scaled.


$\sqrt{10{,}000} \times 16 \times 10^{-6} = 0.0016$, i.e., 16 basis points. The mean of the 5-Minute Price Impact (TAQ) benchmark corresponds to 3% with a median of 2%. Looking at the means of the price impact proxies, we see that none of the proxies are of the same order of magnitude as Lambda (TAQ) or 5-Minute Price Impact (TAQ). The same holds true in Panel E for annual price impact proxies. Panel F describes monthly price impact benchmarks and proxies calculated from 10/2001–12/2005 Rule 605 data. Static Price Impact (605) has a mean of 1.016 and a median of 0.326, after multiplying by 1,000,000.

Panel G breaks the firms down by exchange. Roughly 68% are listed on NASDAQ, 25% on the NYSE, and the rest on AMEX. This breakdown is nearly the same as the eligible universe of TAQ and Rule 605 stock symbols.

7. Results

7.1. Monthly/annual spread results

Table 2 provides monthly spread evidence. It compares spread proxies calculated from daily prices and volumes each month (e.g., using a maximum of 23 daily prices and volumes per month) with monthly effective and realized spread benchmarks calculated from the TAQ data (e.g., a volume-weighted average of the effective/realized spread of every trade and corresponding BBO quote over the month). In the tables we highlight the winner of each race by drawing a box around the best-performing measure (or measures if there is a tie).

Panel A reports the average cross-sectional correlation of each low-frequency spread proxy with the effective and realized spreads calculated from TAQ. This is computed in the spirit of Fama and MacBeth (1973) by: (1) calculating, for each month, the cross-sectional correlation across all 400 firms, and then (2) calculating the average correlation value over all 156 months. We find that six measures, Effective Tick, Effective Tick2, Holden, Gibbs, LOT Mixed, and LOT Y-split, have average cross-sectional correlations greater than 0.6. The Holden measure has the highest average cross-sectional correlation at 0.682.
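The two-step procedure described for Panel A (a cross-sectional correlation each month, then an average over months) can be sketched as below. This is only an illustration: it assumes the Pearson correlation, the function names and the tiny two-month panel are hypothetical, and the actual sample has about 400 firms per month over 156 months.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_cross_sectional_correlation(
    panel: Dict[str, List[Tuple[float, float]]]
) -> float:
    """Average over periods of the within-period correlation across firms.

    `panel` maps a period label to (proxy, benchmark) pairs, one pair per firm.
    """
    period_correlations = []
    for pairs in panel.values():
        proxy = [p for p, _ in pairs]
        benchmark = [b for _, b in pairs]
        period_correlations.append(pearson(proxy, benchmark))  # step (1)
    return sum(period_correlations) / len(period_correlations)  # step (2)


# Hypothetical panel: two months, three firms each.
panel = {
    "1993-01": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],  # perfectly correlated month
    "1993-02": [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0)],  # weaker month
}
average_correlation = average_cross_sectional_correlation(panel)
print(average_correlation)
```

Keeping the per-month correlations in a list, rather than pooling all firm-months, is what allows the time series of correlations to be tested afterwards.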
The cross-sectional correlation with the realized spread is lower and fluctuates around 0.4 across the same six measures.

We test whether the average cross-sectional correlations are different from each other in Tables 2–8 by running a t-test based on the time-series similar to Fama-MacBeth.17 Specifically, we calculate the cross-sectional correlation each period (month or year) and then compute the pairwise difference in correlations between two candidate measures. We assume that the time series of differences is i.i.d. over time, and test whether the average correlation difference is different from zero. Standard errors are adjusted for autocorrelation with a Newey-West correction using four lags for monthly data and three lags for annual data.

Table 2, Panel A reports that the correlations of Gibbs and Holden with effective spread are insignificantly different from each other and the remaining proxies are statistically significantly lower than Holden. Put differently, considering the measure with the highest correlation, Holden, we find that Gibbs is inside of its 95% confidence region and the remaining spread proxies are outside. The same result holds for the realized spread.

Next, we form equally weighted portfolios across all 400 stocks in a given month. Specifically, we compute a portfolio spread proxy in month i by taking the average of that spread proxy over all 400 stocks in month i. Panel B reports the time-series correlation over 156 months of each low-frequency portfolio spread proxy with the effective and realized spreads of an equally weighted portfolio calculated from TAQ. Asset pricing researchers may be especially interested in the time-series correlations since so much of asset pricing research involves forming portfolios and exploring co-movement over time. It is worth noting that Panel B results may differ from those in Panel A, not only because they are computed over the time-series vs. across the cross-section, but also because some measurement error that affects individual stocks may be diversified away in portfolios. Consistent with a diversification effect, we find relatively high time-series correlations. Six measures, Roll, Effective Tick, Effective Tick2, Holden, Gibbs, and LOT Y-split, have time-series correlations greater than 0.9.

We test whether time-series correlations are statistically different from each other in Tables 2–9 using Fisher's Z-test. The Holden measure has the highest time-series correlation at 0.951, and Effective Tick, Effective Tick2, and LOT Y-split are in its 95% confidence interval (see Table 2, Panel B). All of the time-series correlations significantly different from zero are highlighted in boldface.18 Our spread proxies also do a good job in capturing time-series variation in realized spread. The correlation is as high as 0.972 for LOT Y-split, with Effective Tick, Effective Tick2, and Holden being in its 95% confidence interval. Roll and Gibbs, which can be thought of as proxies for the realized spread since the versions we estimate do not include an asymmetric information component, do not do as well. Pastor and Stambaugh's Gamma and Amivest significantly underperform all other proxies in both Panels A and B.

To look at the consistency of the measures' performance, we break the time-series correlations down by subperiods in Panel C. Specifically, we use the same portfolio liquidity measures as above, but compute time-series correlations for three subperiods that closely correspond to minimum tick-size regimes. The subperiods are 1993–1996, 1997–2000, and 2001–2005, which relate to the minimum tick-size regimes of $1/8, $1/16, and $0.01, respectively. Consistent with Panel B, the same six measures, Roll, Effective Tick, Effective Tick2, Holden, Gibbs, and LOT Y-split, do consistently well in each subperiod in terms of correlation with effective spread. All six measures have time-series correlations greater

17 We are grateful to an anonymous referee for this suggestion.

18 We test all correlations in Tables 2–9 to see if they are statistically different from zero at the 5% significance level and highlight the correlations that are significant in boldface. For an estimated correlation $\sigma$, Swinscow (1997, Ch. 11) gives the appropriate test statistic as $t = \sigma \sqrt{\dfrac{D-2}{1-\sigma^2}}$, where D is the sample size.
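A small sketch of the two correlation tests mentioned in the text follows. The first function implements the t-statistic from footnote 18. The second is one common form of Fisher's Z-test, the version for two independent samples; the text does not say which variant is used, so this choice is an assumption, and it ignores the dependence that arises when both correlations are computed against the same benchmark. Function names and the second correlation value are hypothetical.

```python
from math import atanh, sqrt


def correlation_t_stat(corr: float, sample_size: int) -> float:
    """t = corr * sqrt((D - 2) / (1 - corr^2)) for H0: correlation is zero."""
    if sample_size <= 2 or abs(corr) >= 1:
        raise ValueError("need sample_size > 2 and |corr| < 1")
    return corr * sqrt((sample_size - 2) / (1 - corr ** 2))


def fisher_z_difference(corr_1: float, corr_2: float, n_1: int, n_2: int) -> float:
    """z-statistic for H0: two correlations are equal (independent-sample form)."""
    if n_1 <= 3 or n_2 <= 3:
        raise ValueError("each sample needs more than three observations")
    # Fisher transformation: atanh(r) = 0.5 * ln((1 + r) / (1 - r))
    z_1, z_2 = atanh(corr_1), atanh(corr_2)
    standard_error = sqrt(1 / (n_1 - 3) + 1 / (n_2 - 3))
    return (z_1 - z_2) / standard_error


# 156 monthly observations, as in the time-series correlations of Panel B.
t_stat = correlation_t_stat(0.5, 156)
# 0.951 is the Holden correlation quoted in the text; 0.900 is a hypothetical rival.
z_stat = fisher_z_difference(0.951, 0.900, 156, 156)
print(t_stat, z_stat)
```

Under the null, `z_stat` is compared with standard normal critical values (1.96 in absolute value at the 5% level), and `t_stat` with a t distribution on D - 2 degrees of freedom.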

Table 2
Monthly spread proxies compared to TAQ benchmarks

The benchmarks Effective spread (TAQ) and Realized spread (TAQ) are calculated from every trade and corresponding BBO quote in the NYSE TAQ database for a sample firm-month. All spread proxies are calculated from CRSP daily stock price and volume data for a sample firm-month. The spread proxies are: Roll from Roll (1984), Effective Tick and Effective Tick2 developed here and in Holden (2009), Holden from Holden (2009), Gibbs from Hasbrouck (2004), LOT Mixed, Zeros, and Zeros2 from Lesmond, Ogden, and Trzcinka (1999), LOT Y-split developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The sample spans 1993–2005 inclusive and consists of 400 randomly selected stocks with annual replacement of stocks that do not survive, resulting in 62,100 firm-months. Bold numbers are statistically significant at the 5% level. * means that the correlation is statistically significantly different at the 5% level from all other correlations in the same row.


than 0.900 in 19931996, in the interval [0.663, 0.886] in 19972000, and greater than 0.86 in 20012005. It is not clear why all six measures did worse during the $1/16 years. Gibbs has the highest correlation in 19931996, Effective Tick is the highest in 19972000, and Roll is the highest in 20012005. While the measures based on the price clustering do slightly worse in the third subperiod compared to the rst subperiod, the performance of the Amihud measure moves in the opposite direction. Thus, Amihud seems to represent the effective spread better in the last subperiod, during the decimalization era, where it achieves a correlation of 0.833. This might be associated with a decrease in price clustering during the decimals regime as a result of the majority of trading being done automatically via computerized systems. A slightly different picture emerges for correlations with realized spread. Measures based on price clustering, Effective Tick, Effective Tick2, and Holden, achieve the highest correlation during decimalization, ranging between 0.933 and 0.956. LOT Mixed, which does not show up as a winner so far, has the highest correlation with realized spread, 0.96. Similar to effective spread, the correlations are lower for all measures during the second subperiod. The drop in correlations is very severe for Roll and Gibbs. We form decile portfolios stratied by rm size (market capitalization) and by effective spread to check the robustness of the measures. For rm size, we sort the 400 stocks each month by market capitalization, assigning the rst 40 stocks with the smallest size to Portfolio 1, and so on. Each decile portfolio is equally weighted. Panel D reports the time-series correlation of size decile portfolios for both effective and realized spreads. Four measures do quite well across the decile portfolios. 
Effective Tick, Effective Tick2, Holden, and LOT Y-split have high and statistically significant time-series correlations overall, with mildly lower correlations for larger size portfolios. By contrast, Roll and Gibbs do very poorly with the larger firms in Portfolios 7–10. Specifically, they obtain time-series correlations of 0.4 or lower for effective spread and negative but insignificant correlations for realized spread, which appears to be a serious robustness problem. They do much better with the small and medium-size firms in Portfolios 1–6. All measures do much worse than their own average with the largest firms in Portfolio 10.

Next, we form decile portfolios stratified by effective spread in the same manner as above, assigning the 40 stocks with the lowest effective spread to Portfolio 1, and so on. Each decile portfolio is equally weighted. Panel E reports the time-series correlations of these decile portfolios for both effective and realized spreads. Consistent with Panel D, the same four measures, Effective Tick, Effective Tick2, Holden, and LOT Y-split, do quite well, with high and statistically significant time-series correlations overall and mildly lower correlations in lower effective spread portfolios. By contrast, Roll and Gibbs do very poorly in Portfolios 1–4. Specifically, they obtain time-series correlations lower than 0.322 for effective spread and lower than 0.161 for realized spread, which continues to represent a serious robustness problem. Undoubtedly, there is a great deal of overlap between

these low effective spread portfolios and the large size portfolios. Roll and Gibbs do far better in Portfolios 6–10. Nearly all measures do worse than their own average with the lowest effective spread firms in Portfolio 1. It therefore appears that large firms and firms with small effective spreads are the most challenging firms for all low-frequency spread proxies.

Finally, we calculate the prediction error between the low-frequency spread proxies and effective spread as calculated from TAQ. Panel F reports two performance metrics: (1) mean bias (i.e., the difference between the low-frequency mean and the high-frequency mean) and (2) root mean squared error. The mean bias is computed over all 62,100 firm-months. The root mean squared error is calculated every month and then averaged over 156 months. We exclude Zeros, Zeros2, Amihud, Pastor and Stambaugh, and Amivest from these tests because they are measured in different units than the effective spread. We find that Roll, Effective Tick, Effective Tick2, Holden, Gibbs, and LOT Y-split have relatively small biases compared to the effective spread benchmark, ranging from 0.002 to 0.013. However, all of these biases are significantly different from zero based on a t-test. Roll has the smallest bias. This is consistent with Schultz (2000), who shows that Roll captures the magnitude of the effective spread well for intraday data. Roll, Effective Tick, Effective Tick2, Gibbs, and Holden have relatively low root mean squared errors, ranging from 0.029 to 0.032 (see footnote 19). Holden and Gibbs have the lowest root mean squared errors, which are not significantly different from each other based on a paired t-test. For the realized spread, Panel G, Effective Tick2 has the smallest mean bias of 0.001, and Gibbs has the lowest root mean squared error. Interestingly, Roll, which can be thought of as a proxy for realized spread, is outperformed by the new measures on this dimension.
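The two prediction-error metrics described for Panel F can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function names and the toy firm-month rows are hypothetical, and the sketch simply follows the description in the text (bias pooled over all firm-months, root mean squared error computed within each month and then averaged across months).

```python
from collections import defaultdict
from math import sqrt


def mean_bias(rows):
    """Pooled mean of (proxy - benchmark) over all firm-months."""
    diffs = [proxy - bench for _, proxy, bench in rows]
    return sum(diffs) / len(diffs)


def average_monthly_rmse(rows):
    """RMSE computed within each month, then averaged across months."""
    squared_errors = defaultdict(list)
    for month, proxy, bench in rows:
        squared_errors[month].append((proxy - bench) ** 2)
    monthly_rmse = [sqrt(sum(sq) / len(sq)) for sq in squared_errors.values()]
    return sum(monthly_rmse) / len(monthly_rmse)


# Hypothetical toy data: (month, low-frequency proxy, high-frequency benchmark).
rows = [
    ("1993-01", 0.030, 0.020),
    ("1993-01", 0.010, 0.020),
    ("1993-02", 0.050, 0.020),
    ("1993-02", 0.050, 0.020),
]

bias = mean_bias(rows)             # pooled over the four firm-months
rmse = average_monthly_rmse(rows)  # mean of the two monthly RMSEs
print(f"mean bias = {bias:.4f}, average monthly RMSE = {rmse:.4f}")
```

In the toy data the errors in the first month cancel in the bias but not in the root mean squared error, which is why the two metrics are reported side by side.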
Summarizing the monthly spread evidence in Table 2, we generally conclude that low-frequency measures designed to estimate spread do, in fact, provide accurate measures of both effective and realized spreads computed from TAQ data. These measures are highly correlated at the firm and the portfolio levels, and provide low bias and small mean squared error. Not surprisingly, we find that measures intended to capture other features of transaction costs (Amihud, Pastor and Stambaugh, and Amivest) do a poor job of estimating effective and realized spreads, and zero returns is inferior to all other measures designed to capture effective spread. Note that we think of winning as providing high and consistent correlations together with low bias and low root mean squared error. Clearly, Effective Tick, Effective Tick2, Holden, and LOT Y-split fit this definition. Roll and Gibbs do well in many

19 We test all root mean squared errors generated by the liquidity proxies in Tables 2, 3, 6, and 8 to see if they are statistically significant using the U-statistic developed by Theil (1966). Here, if U^2 = 1, then the low-frequency liquidity proxy has no predictive power beyond just assuming no deviation from the sample mean. If U^2 = 0, then the low-frequency liquidity proxy predicts perfectly. U^2 has an F distribution where the number of degrees of freedom for both the numerator and denominator is the sample size.

Table 3. Annual spread proxies compared to TAQ benchmarks. The benchmarks Effective spread (TAQ) and Realized spread (TAQ) are calculated from every trade and corresponding BBO quote in the NYSE TAQ database for a sample firm-year. All spread proxies are calculated from CRSP daily stock price and volume data for a sample firm-year. The spread proxies are: Roll from Roll (1984), Effective Tick and Effective Tick2 developed here and in Holden (2009), Holden from Holden (2009), Gibbs from Hasbrouck (2004), LOT Mixed, Zeros, and Zeros2 from Lesmond, Ogden, and Trzcinka (1999), LOT Y-split developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The sample spans 1993–2005 inclusive and consists of 400 randomly selected stocks with annual replacement of stocks that do not survive, resulting in 5,200 firm-years. Bold numbers are statistically significant at the 5% level. * means that the correlation is statistically significantly different at the 5% level from all other correlations in the same row.


cases, but they are not consistent: they have periods of much lower correlation (1997–2000) and subsamples with much lower correlations (large cap stocks and low effective spread stocks) than the other measures.

The annual results in Table 3 are mostly consistent with the monthly evidence. We therefore summarize them briefly (see footnote 20). We again generally conclude that low-frequency measures designed to estimate spread provide accurate measures of effective/realized spread computed from TAQ data. Overall, six measures dominate, in the sense of having a high and consistent correlation together with low bias and mean squared error: Roll, Effective Tick, Effective Tick2, Holden, Gibbs, and LOT Y-split. The discussion of Table 9 below highlights a failure of Roll and Gibbs over annual data in an out-of-sample test. Therefore, effectively, Effective Tick/Tick2, Holden, and LOT Y-split are the best measures on this dimension.
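The decile-portfolio robustness check used for Table 2, Panels D and E, can be sketched as follows. This is a rough illustration rather than the authors' procedure in detail: the function names and the synthetic panel are hypothetical, ties and uneven group sizes are handled in a simple way that the paper does not specify, and the sketch assumes there are at least as many stocks as portfolios.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def assign_deciles(sort_key, n_groups=10):
    """Rank stocks by sort_key (ascending); Portfolio 1 holds the smallest."""
    ranked = sorted(sort_key, key=sort_key.get)
    per_group = max(1, len(ranked) // n_groups)  # e.g. 400 stocks -> 40 each
    return {s: min(i // per_group, n_groups - 1) + 1 for i, s in enumerate(ranked)}


def decile_time_series_corr(panel, n_groups=10):
    """panel: list of months, each mapping stock -> (sort key, proxy, benchmark).

    Returns {decile: correlation over months between the equally weighted
    proxy and the equally weighted benchmark of that decile portfolio}.
    """
    proxy_ts = {d: [] for d in range(1, n_groups + 1)}
    bench_ts = {d: [] for d in range(1, n_groups + 1)}
    for month in panel:
        deciles = assign_deciles({s: v[0] for s, v in month.items()}, n_groups)
        for d in proxy_ts:
            members = [s for s, dec in deciles.items() if dec == d]
            proxy_ts[d].append(sum(month[s][1] for s in members) / len(members))
            bench_ts[d].append(sum(month[s][2] for s in members) / len(members))
    return {d: pearson(proxy_ts[d], bench_ts[d]) for d in proxy_ts}


# Synthetic panel (hypothetical): 3 months, 20 stocks, proxy linear in benchmark.
panel = []
for m in range(3):
    month = {}
    for i in range(20):
        bench = 0.01 * (m + 1) / (i + 1)
        month[f"S{i:02d}"] = ((i + 1) * 100.0, 2 * bench + 0.001, bench)
    panel.append(month)

corrs = decile_time_series_corr(panel)
print({d: round(c, 3) for d, c in corrs.items()})
```

Sorting on effective spread instead of market capitalization only changes the first element of each tuple, which is why the same helper covers both stratifications.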

7.2. Monthly/annual price impact results

Table 4 provides monthly price impact evidence, comparing price impact proxies calculated from daily prices and volumes each month with two monthly price impact benchmarks (Lambda and 5-Minute Price Impact) calculated from TAQ data. Panel A reports the average cross-sectional correlation of each low-frequency price impact proxy with each price impact benchmark. If we look at the measure with the largest correlation and then consider the measures within its confidence interval, we get a picture of which measures are superior. Amihud has the highest correlation with Lambda, 0.317, and is insignificantly different from Roll Impact, Effective Tick Impact, Effective Tick2 Impact, Holden Impact, Gibbs Impact, LOT Mixed Impact, LOT Y-split Impact, and Zeros Impact. Therefore, all nine measures are in the top leadership group for this horserace. For the 5-Minute Price Impact, Amihud has the highest correlation at 0.516 and is statistically significantly higher than any other measure.

Next, we form equally weighted portfolios across all 400 stocks in a given month. Panel B reports the time-series correlation over 156 months of each low-frequency price impact proxy portfolio with each price impact benchmark portfolio calculated from TAQ. As before, most portfolio correlations are higher than the individual stock correlations. Roll Impact has the highest correlation with Lambda, 0.562, and is insignificantly different from all measures except Gamma and Amivest at the 5% level. Roll Impact, however, is significantly different from Effective Tick/Tick2 Impact and Amihud at the 10% level. Overall, all measures except Pastor and Stambaugh's Gamma and Amivest do a reasonable job on this dimension. Roll Impact has the highest correlation with 5-Minute Price Impact, 0.517, and is insignificantly different from Gibbs Impact, Holden Impact, LOT Mixed Impact, LOT Y-split Impact, Zeros Impact, Zeros2 Impact, and
20 A detailed discussion of the results is available from the authors upon request.

Table 4. Monthly price impact proxies compared to TAQ benchmarks. The benchmarks Lambda (TAQ) and 5-Minute Price Impact (TAQ) are calculated from every trade and corresponding BBO quote in the NYSE TAQ database for a sample firm-month. All price impact proxies are calculated from CRSP daily stock price and volume data for a sample firm-month. The price impact proxies are: Roll Impact, Effective Tick Impact, Effective Tick2 Impact, Holden Impact, Gibbs Impact, LOT Mixed Impact, and LOT Y-split Impact developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The sample spans 1993–2005 inclusive and consists of 400 randomly selected stocks with annual replacement of stocks that do not survive, resulting in 62,100 firm-months. Bold numbers are statistically significant at the 5% level. * means that the correlation is statistically significantly different at the 5% level from all other correlations in the same row.


Table 5. Annual price impact proxies compared to TAQ benchmarks. The benchmarks Lambda (TAQ) and 5-Minute Price Impact (TAQ) are calculated from every trade and corresponding BBO quote in the NYSE TAQ database for a sample firm-year. All price impact proxies are calculated from CRSP daily stock price and volume data for a sample firm-year. The price impact proxies are: Roll Impact, Effective Tick Impact, Effective Tick2 Impact, Holden Impact, Gibbs Impact, LOT Mixed Impact, and LOT Y-split Impact developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The sample spans 1993–2005 inclusive and consists of 400 randomly selected stocks with annual replacement of stocks that do not survive, resulting in 5,200 firm-years. Bold numbers are statistically significant at the 5% level. * means that the correlation is statistically significantly different at the 5% level from all other correlations in the same row.


Amihud. These eight measures are in the top leadership group for this horserace.

The prediction error and mean squared error comparisons do not provide any meaningful information if the two variables are on completely different scales. Therefore, we omit the mean bias and root mean squared error calculations for the price impact measures.

The annual results of Table 5 are generally consistent with the monthly evidence. For brevity, we skip the discussion of Table 5 and summarize monthly and annual results together. Summarizing the Lambda (TAQ) horseraces of Tables 4 and 5, Roll Impact seems to have a slight edge because it has the highest correlation in two of the four horseraces. However, in most horseraces, it is statistically insignificantly different from the rest of the new class of price impact proxies developed in this paper and the Amihud measure. Gamma and Amivest are consistently dominated. Summarizing the 5-Minute Price Impact horseraces of Tables 4 and 5, Amihud is the best single proxy of the five-minute price impact, being in the leadership group in all four correlation tests and standing by itself in one of them. In three of the four horseraces, the new class of price impact proxies is insignificantly different from Amihud. Roll Impact yields the highest correlations of the new class, so it is a close second behind Amihud.
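The difference between the two correlation designs used throughout this section (Panel A: correlate across stocks within each month, then average; Panel B: form equally weighted portfolios each month, then correlate over time) can be illustrated with a small sketch. The function names and the toy panel below are hypothetical and chosen only to make the contrast visible; they are not taken from the paper's data.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_cross_sectional_corr(panel):
    """Correlate proxy with benchmark across stocks in each month, then average."""
    monthly = [pearson([p for p, _ in obs], [b for _, b in obs]) for obs in panel]
    return mean(monthly)


def portfolio_time_series_corr(panel):
    """Equally weight proxy and benchmark each month, then correlate over months."""
    proxy_port = [mean([p for p, _ in obs]) for obs in panel]
    bench_port = [mean([b for _, b in obs]) for obs in panel]
    return pearson(proxy_port, bench_port)


# Toy panel (hypothetical): one list per month of (proxy, benchmark) per stock.
# Within each month the proxy ranks stocks in reverse order, yet the monthly
# averages of proxy and benchmark move together over time.
panel = [
    [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)],
    [(2.0, 4.0), (3.0, 3.0), (4.0, 2.0)],
    [(4.0, 6.0), (5.0, 5.0), (6.0, 4.0)],
]

cs = average_cross_sectional_corr(panel)
ts = portfolio_time_series_corr(panel)
print(f"average cross-sectional = {cs:.3f}, portfolio time-series = {ts:.3f}")
```

The extreme toy case shows why the two panels need not agree: averaging across stocks removes firm-level noise, so a proxy can track the market-wide level over time even when it ranks individual stocks poorly.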

7.3. Rule 605 results

As discussed above, the new Rule 605 data allow us to test the robustness of our previous results by using a completely different high-frequency database. Accordingly, Table 6 presents evidence based on Rule 605 data from October 2001 to December 2005. Panels A, B, and C compare spread proxies with effective spread calculated from the Rule 605 data. Panels D, E, and F compare price impact proxies with static price impact calculated from Rule 605 data.

The Rule 605 results presented in Panel A are relatively similar to the corresponding TAQ results. The same six measures have relatively high average cross-sectional correlations in nearly the same range as the TAQ data and are statistically significant. Amihud has the highest correlation at 0.533, and Effective Tick and Holden are in its 95% confidence interval.

The time-series correlations are presented in Panel B for the Rule 605 data. Like the TAQ results, the time-series correlations of the portfolios are much higher than the cross-sectional correlations of individual stocks. Effective Tick has the highest time-series correlation, and all measures except Gamma and Amivest are in its 95% confidence interval. In contrast to the TAQ results, the highest time-series correlation with the Rule 605 effective spread is 0.528, vs. a time-series correlation of 0.951 with the TAQ effective spread. It is not clear why the correlations are so different, but the two benchmarks are fundamentally different. Effective Spread (TAQ) is the average cost of all trades, whereas Effective Spread (605) is the average cost of all marketable orders executed.

A market buy and market sell that cross at the midpoint (with a zero effective spread) count as one TAQ trade, but count as two Rule 605 marketable order executions. In addition, there are differences in: (1) trade type uncertainty in TAQ vs. certainty in Rule 605, (2) effective spread computation (absolute value in TAQ vs. signed value in Rule 605), (3) aggregation (dollar-volume-weighted with TAQ vs. share-volume-weighted with Rule 605), and (4) midpoint timing (midpoint at time of trade in TAQ vs. midpoint at time of order submission in Rule 605). However, the leading low-frequency proxies remain in the leadership group no matter which benchmark (TAQ or Rule 605 effective spread) we select.

Next, the Rule 605 results on the prediction error presented in Panel C are roughly similar to those in Table 2. Effective Tick2 has the smallest bias and is statistically significantly smaller than any other measure. Gibbs has the smallest root mean squared error and is insignificantly different from Holden. Summarizing Panels A to C, the monthly Rule 605 spread results show that low-frequency measures computed from daily returns are able to capture effective spreads reported by the market centers. Overall, in terms of correlations and prediction errors, Holden, Effective Tick, and Effective Tick2 are the best proxies of Rule 605 effective spread.

In Panel D, we present evidence on price impact for the Rule 605 data. Recall that Lambda (TAQ) is calculated from a regression, whereas Static Price Impact (605) is calculated as the difference between the effective spreads associated with large and small orders, divided by the difference between large and small order shares. Thus, it is not especially surprising to see very different results for Static Price Impact (605) presented in Panel D and for Lambda (TAQ). Essentially, all of the average cross-sectional correlations between the price impact proxies and Static Price Impact (605) are insignificantly different from zero.
All of the proxies fail to pick up Static Price Impact (605). In Panel E, we get similar results. Finally, Panel F reports the prediction errors of the price impact proxies with respect to Static Price Impact (605). We report mean prediction bias and root mean squared error only for the measures that are on the same scale as Static Price Impact (605). While Panels D and E show that the measures fail to capture most of the variation of Static Price Impact (605), they do reasonably well in estimating the level in Panel F. The mean bias is the smallest in absolute value for Effective Tick2 Impact, 0.031, with Holden and Gibbs Impact falling in its 95% confidence interval. Root mean squared error is the smallest for Gibbs Impact, with Effective Tick/Tick2 and Holden Impact being in its 95% confidence interval. Summarizing Panels D to F, while all of the price impact proxies fail to capture time-series or cross-sectional variations in Static Price Impact (605), the new class of price impact proxies does a good job of predicting the level.

Overall, Table 6 shows that actual effective spread data reported by the market centers can be accurately estimated using measures computed from daily returns. The table also shows that the new price impact measures developed in this paper can be used to estimate the level of Static Price Impact (605).
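The Static Price Impact (605) benchmark described above is a simple slope between two order-size buckets, which a short sketch makes concrete. The function name, the bucket inputs, and the numbers are hypothetical; which Rule 605 order-size categories count as "large" and "small" is defined elsewhere in the paper and is not reproduced here. The scaling by 1,000,000 mirrors the note to Table 6.

```python
def static_price_impact(spread_large, spread_small, shares_large, shares_small):
    """Change in effective spread per additional share between two buckets.

    spread_*: average effective spread of large and small orders.
    shares_*: average order size (in shares) of large and small orders.
    """
    if shares_large == shares_small:
        raise ValueError("order-size buckets must differ in average shares")
    return (spread_large - spread_small) / (shares_large - shares_small)


# Hypothetical inputs: large orders average 5,000 shares at a 0.30% effective
# spread; small orders average 500 shares at a 0.10% effective spread.
impact = static_price_impact(0.0030, 0.0010, 5000, 500)

# Per-share impacts are tiny, so they are rescaled for reporting.
scaled = impact * 1_000_000
print(f"static price impact (x 1,000,000) = {scaled:.4f}")
```

Because this benchmark uses only two points per firm-month, it is a much coarser object than the regression-based Lambda (TAQ), which is consistent with the very different results reported for the two benchmarks.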

Table 6. Monthly spread and price impact proxies compared to 605 benchmarks. The benchmarks Effective Spread (605) and Static Price Impact (605) are calculated from data required to be disclosed under SEC Rule 605 (formerly 11Ac1-5) for a sample firm-month. All spread proxies and price impact proxies are calculated from CRSP daily stock price and volume data for a sample firm-month. The spread proxies are: Roll from Roll (1984), Effective Tick and Effective Tick2 developed here and in Holden (2009), Holden from Holden (2009), Gibbs from Hasbrouck (2004), LOT Mixed, Zeros, and Zeros2 from Lesmond, Ogden, and Trzcinka (1999), LOT Y-split developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The price impact proxies are: Roll Impact, Effective Tick Impact, Effective Tick2 Impact, Holden Impact, Gibbs Impact, LOT Mixed Impact, and LOT Y-split Impact developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The sample spans 10/2001 to 12/2005 inclusive and consists of 400 randomly selected stocks with annual replacement of stocks that do not survive, resulting in 19,039 firm-months. Bold numbers are statistically significant at the 5% level. * means that the correlation is statistically significantly different at the 5% level from all other correlations in the same row.


All price impact measures are multiplied by 1,000,000, except for Liquidity which is divided by 1,000,000.

Table 7. NYSE/AMEX vs. NASDAQ breakdown for monthly proxies compared to TAQ benchmarks. The benchmarks Effective spread (TAQ), Realized spread (TAQ), Lambda (TAQ), and 5-Minute Price Impact (TAQ) are calculated from every trade and corresponding BBO quote in the NYSE TAQ database for a sample firm-month. All spread proxies and price impact proxies are calculated from CRSP daily stock price and volume data for a sample firm-month. The spread proxies are: Roll from Roll (1984), Effective Tick and Effective Tick2 developed here and in Holden (2009), Holden from Holden (2009), Gibbs from Hasbrouck (2004), LOT Mixed, Zeros, and Zeros2 from Lesmond, Ogden, and Trzcinka (1999), LOT Y-split developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The price impact proxies are: Roll Impact, Effective Tick Impact, Effective Tick2 Impact, Holden Impact, Gibbs Impact, LOT Mixed Impact, and LOT Y-split Impact developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The sample spans 1993–2005 inclusive and consists of 400 randomly selected stocks with annual replacement of stocks that do not survive, resulting in 62,100 firm-months. Bold numbers are statistically significant at the 5% level.


7.4. Results by exchange

For robustness, we explore the degree to which our results vary across exchanges. In Table 7, we break out the monthly spread and price impact evidence by exchange, sorting firms into two groups based on NYSE/AMEX and NASDAQ. In Panel A, with respect to average cross-sectional correlations with effective and realized spreads, all spread proxies except Gibbs and Roll (see footnote 21) show a lower correlation for NASDAQ stocks than for NYSE stocks. The largest differences are associated with Effective Tick and Holden, where the first digit of the correlation coefficient changes. In contrast, the time-series correlations in Panel B show that the measures do better for NASDAQ stocks than for NYSE stocks. Nearly the same pattern holds for correlations with the realized spread. Finally, the price impact measures are mixed across exchanges. The conclusion from this table is that the exchange does not matter very much and should not be a factor in using low-frequency spread or price impact proxies.

7.5. Results by year

Our next robustness check is to explore how our results vary over time. Specifically, Table 8 breaks out the monthly effective spread, realized spread, and price impact evidence by year. Panels A and B report the time variation of cross-sectional correlations and root mean squared error for the effective spread benchmark. In each month there are 400 observations for a correlation and root mean squared error, which are averaged over the year. The two panels tell opposite stories. Panel A shows that the cross-sectional correlations decrease over time for seven measures (Roll, Effective Tick, Effective Tick2, Holden, Gibbs, LOT Mixed, and LOT Y-split). The decline is strongest during the decimal era (2001–2005). By contrast, the Amihud measure does not decline over time and joins the leadership group in the decimal era only. This result contrasts with the Table 2, Panel C result that the $1/8 era and decimal era had very high time-series correlations, while the $1/16 era had somewhat lower time-series correlations. In Panel B, all measures improve in their ability to predict the effective spread. LOT Mixed has a root mean squared error that is 81% more accurate in 2005 than in 1993. The same pattern is observed for the realized spread benchmark in Panels C and D. The mean squared error is the square of the bias plus the variance of the estimator. The fact that the correlation coefficient has fallen but the errors are smaller is the result of the measure having lower bias and smaller variance.

In Panels E and F we present the average correlations between the price impact measures and the two high-frequency measures of price impact used in this paper. Generally, the measures are statistically significant in all tables and demonstrate considerable volatility in Panel E (Lambda), and deterioration, except Amihud, in Panel F (5-Minute Price Impact).

7.6. Dow Jones data

Our final robustness test is to test the spread measures out-of-sample. We examine the stocks in the Dow Jones Industrial Average from 1962 to 2000 (see footnote 22). The spread benchmark is the percent quoted spread of the Dow portfolio as computed by Jones (2002). For every year we compute each of the low-frequency spread proxies for each of the 30 Dow stocks and then equally weight the measures across stocks for the year, since the historical spreads for the Dow stocks are available only on an annual basis.

Table 9 shows the results. The biggest surprise is the large and significantly negative correlation coefficients on the Roll and Gibbs measures. Roll's time-series correlation is -0.642 and Gibbs' time-series correlation is -0.395. Of course, the Dow Jones stocks are large capitalization stocks with low effective spreads. In that respect, the poor annual performance of Roll and Gibbs with the Dow Jones stocks is very consistent with the poor monthly performance of Roll and Gibbs with large capitalization deciles and low effective spread deciles in Table 2, Panels D and E. As a double-check on this result, we estimate the average autocovariance of daily price changes for each stock. Whenever we have positive autocovariance we change it to a zero value, consistent with the way we construct the Roll measure. We then correlate the average absolute value of the autocovariance with the spread and find a -55% correlation. Thus, in this sample of large, liquid stocks, the lower the spread the higher the absolute value of the autocovariance. This is the opposite of the relationship supposed by Roll, who argues that liquid stocks should have lower autocovariance than illiquid stocks.

For the other measures in Table 9, the correlations between the average measure and the average quoted spread are generally smaller than the time-series portfolio correlations of Table 3, Panel B, but they are still large and significant. Effective Tick, Effective Tick2, and Holden all have time-series correlations greater than 0.840 and are statistically insignificantly different from each other. Also, LOT Y-split and Zeros/Zeros2 fall in their 95% confidence interval. Fig. 1 shows the time series for the quoted spread of the Dow Jones portfolio and the low-frequency measures Holden, LOT Y-split, and Effective Tick. These data generate the correlations of Table 9. The low-frequency measures track the quoted spread very well, especially at the end of the sample. The conclusion of Table 9 and Fig. 1 is that the measures are useful on a different sample of stocks over a different time period.

21 Schultz (2000) estimates the Roll measure using intraday TAQ data. He finds that the intraday Roll measure is a very accurate estimate of effective spread, because various biases in Roll tend to offset each other in his NASDAQ sample.

22 We thank Charles Jones for these data.

8. Conclusion

The purpose of this paper is to test the hypothesis that low-frequency measures of transaction costs, measured

Table 8. Year-by-year breakdown for monthly proxies compared to TAQ benchmarks. The benchmarks Effective spread (TAQ), Realized spread (TAQ), Lambda (TAQ), and 5-Minute Price Impact (TAQ) are calculated from every trade and corresponding BBO quote in the NYSE TAQ database for a sample firm-month. All spread proxies and price impact proxies are calculated from CRSP daily stock price and volume data for a sample firm-month. The effective spread proxies are: Roll from Roll (1984), Effective Tick and Effective Tick2 developed here and in Holden (2009), Holden from Holden (2009), Gibbs from Hasbrouck (2004), LOT Mixed, Zeros, and Zeros2 from Lesmond, Ogden, and Trzcinka (1999), LOT Y-split developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The price impact proxies are: Roll Impact, Effective Tick Impact, Effective Tick2 Impact, Holden Impact, Gibbs Impact, LOT Mixed Impact, and LOT Y-split Impact developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest Liquidity ratio. The sample spans 1993–2005 inclusive and consists of 400 randomly selected stocks with annual replacement of stocks that do not survive, resulting in 62,100 firm-months. Bold numbers are statistically significant at the 5% level.

Table 9. Annual spread proxies compared to the quoted spread of the Dow portfolio, 1962–2000. For a given year, the benchmark Quoted spread (Dow) is the percentage quoted spread of the Dow Jones Industrial Average portfolio as compiled by Charles Jones. For a given year, all spread proxies are calculated from CRSP daily stock price and volume data for each stock in the Dow 30 and then equally weighted to get the portfolio value. The spread proxies are: Roll from Roll (1984), Effective Tick and Effective Tick2 developed here and in Holden (2009), Holden from Holden (2009), Gibbs from Hasbrouck (2004), LOT Mixed, Zeros, and Zeros2 from Lesmond, Ogden, and Trzcinka (1999), LOT Y-split developed here, Amihud from Amihud (2002), Pastor and Stambaugh from Pastor and Stambaugh (2003), and the Amivest ratio. The sample size is 39 portfolio-years. Bold numbers are statistically significant at the 5% level.

monthly and annually, can usefully estimate high-frequency measures, and if so, to determine which measures are best. Using a sample of 400 randomly selected stocks over the period 1993 to 2005, we compare all prior proxies, three new spread measures, and nine new price impact measures. Specifically, we first compute the effective and realized spreads and several measures of price impact from two high-frequency data sets: TAQ and Rule 605 data disclosed by market centers to the SEC. We then compute the low-frequency measures from daily return and volume data available on CRSP on a monthly and annual basis. We statistically determine how well the low-frequency measures capture high-frequency benchmarks.

The evidence is overwhelming that both monthly and annual low-frequency measures capture high-frequency measures of transaction costs. Indeed, in many applications the correlations are high and the mean squared error low enough that the effort of using high-frequency measures is simply not worth the cost. The only real question then is: which measure should a researcher use? The answer depends on what, exactly, the researcher wants to measure.

For monthly and annual effective and realized spreads, we find that three measures dominate the remaining nine in correlations and mean squared prediction errors. The simplest of the dominant measures is the analytic Effective Tick. The most computationally intensive is the Holden measure. Intermediate in computational requirements is LOT Y-split. All provide statistically significant and useful measures, high correlations, and low root mean squared errors, regardless of the database we use (TAQ or Rule 605). Without considering computational requirements, Holden delivers the best performance overall. Considering ease of computation, Effective Tick is the best measure to use.
Measures widely used in the literature, namely, Amihud's Illiquidity, Pastor and Stambaugh's Gamma, and Amivest's Liquidity, are not appropriate to use as proxies for effective or realized spreads.

We find that price impact is more difficult to capture in our data than effective or realized spread. The measures are not designed to capture the magnitude of high-frequency price impact benchmarks, and the correlations with price impact are lower than in the effective/realized spread tests. However, both the new class of price impact measures we introduce in this paper and the Amihud measure do a reasonably good job in the sense that they produce statistically significant positive correlations. Pastor and Stambaugh's Gamma and Amivest's Liquidity are ineffective in capturing price impact in our data. We suggest using either the Amihud measure or one of our effective spread measures divided by volume if a researcher wants to capture price impact.

For specific high-frequency transaction cost benchmarks we suggest different low-frequency measures. To capture Lambda (TAQ), which is the coefficient from regressing return on the square root of signed trading volume over five-minute intervals, we suggest either Amihud's Illiquidity or one of the new measures. To measure 5-Minute Price Impact, or the five-minute

ARTICLE IN PRESS
180 R.Y. Goyenko et al. / Journal of Financial Economics 92 (2009) 153181

0.9%
Percent quoted spread

Dow port. percent quoted spread and spread proxies

0.8% 0.7% 0.6% 0.5% 0.4% 0.3% 0.2% 0.1% 0.0% 1960

LOT Y-split Holden Effective Tick

1965

1970

1975

1980 Year

1985

1990

1995

2000

Fig. 1. Dow portfolio percent quoted spread and spread proxies (19622000). For a given year, the benchmark is the percent quoted spread of the Dow Jones Industrial Average portfolio as compiled by Charles Jones. For a given year, all spread proxies are calculated from CRSP daily stock price and volume data for each stock in the Dow 30 and then equally weighted to get the portfolio value.

change in midpoint after the trade, we suggest using the Amihud Illiquidity measure. All price impact measures fail to capture crosssectional or time-series variation of Static Price Impact (605). It is possible that this difculty lies primarily in the fact that Rule 605 data exclude block trades, where price impact should be most severe. In other words, much of the variation of Static Price Impact (605) may be noise. However, the new class of price impact measures does a good job in predicting the level of Static Price Impact (605) and has very low mean bias and root mean squared error. We conduct several robustness checks on these conclusions. First, we examine the pattern of these measures over time. Second, we examine whether listing exchange matters. Finally, we test the ability of these measures to predict the percent quoted spread of the Dow portfolio from 1962 to 2000. The conclusions are essentially the same in these tests. The measures vary over time in their ability to capture high-frequency measures, but the dominant measures are the same group over time. Interestingly, all measures based on price clustering seem to deteriorate in capturing the effective spread during the decimals regime, while the Amihud correlations continue to perform reasonably well during the last years of the sample. Further, exchange listing does not matter and the low-frequency measures do well in predicting the quoted spreads on Dow stocks. As with any empirical paper several caveats should be mentioned. First, using a random sample in this paper means that caution should be used in applying these measures to other samples or other time periods. Second, we do not know whether the measures are effective on

international data, especially in relation to those stocks with extremely thin trading. Both limitations suggest avenues for future research. With these limitations in mind, we think the results of this paper are strong enough that use of the low-frequency proxies to extend asset pricing, market efciency, and corporate nance research back in time and around the world is a step that the nance literature needs to take. References
Acharya, V., Pedersen, L., 2005. Asset pricing with liquidity risk. Journal of Financial Economics 77, 375–410.
Amihud, Y., 2002. Illiquidity and stock returns: cross-section and time-series effects. Journal of Financial Markets 5, 31–56.
Amihud, Y., Mendelson, H., Lauterbach, B., 1997. Market microstructure and securities values: evidence from the Tel Aviv stock exchange. Journal of Financial Economics 45, 365–390.
Bekaert, G., Harvey, C., Lundblad, C., 2007. Liquidity and expected returns: lessons from emerging markets. Review of Financial Studies 20, 1783–1831.
Berkman, H., Eleswarapu, V., 1998. Short-term traders and liquidity: a test using Bombay Stock Exchange data. Journal of Financial Economics 47, 339–355.
Boehmer, E., Jennings, R., Wei, L., 2003. Public disclosure and private decisions: the case of equity market execution quality. Working Paper, Indiana University.
Cao, C., Field, L., Hanka, G., 2004. Does insider trading impair market liquidity? Evidence from IPO lockup expirations. Journal of Financial and Quantitative Analysis 39, 25–46.
Chan, L., Jegadeesh, N., Lakonishok, J., 1996. Momentum strategies. Journal of Finance 51, 1681–1713.
Chordia, T., Roll, R., Subrahmanyam, A., 2000. Commonality in liquidity. Journal of Financial Economics 56, 3–28.
Chordia, T., Goyal, A., Sadka, G., Sadka, R., Shivakumar, L., 2008. Liquidity and the post-earnings-announcement-drift. Working Paper, University of Washington.


Christie, W., Schultz, P., 1994. Why do NASDAQ market makers avoid odd-eighth quotes? Journal of Finance 49, 1813–1840.
Cooper, S., Groth, K., Avera, W., 1985. Liquidity, exchange listing and common stock performance. Journal of Economics and Business 37, 19–33.
De Bondt, W., Thaler, R., 1985. Does the stock market overreact? Journal of Finance 40, 793–805.
Dennis, P., Strickland, D., 2003. The effect of stock splits on liquidity and excess returns: evidence from shareholder ownership composition. Journal of Financial Research 26, 355–370.
Fama, E., MacBeth, J., 1973. Risk, return, and equilibrium: empirical tests. Journal of Political Economy 81, 607–636.
Fujimoto, A., 2003. Macroeconomic sources of systematic liquidity. Working Paper, Yale University.
Glosten, L., 1987. Components of the bid–ask spread and the statistical properties of transaction prices. Journal of Finance 42, 1293–1307.
Goyenko, R., 2006. Stock and bond pricing with liquidity risk. Working Paper, Indiana University.
Harris, L., 1991. Stock price clustering and discreteness. Review of Financial Studies 4, 389–415.
Hasbrouck, J., 2004. Liquidity in the futures pits: inferring market dynamics from incomplete data. Journal of Financial and Quantitative Analysis 39, 305–326.
Hasbrouck, J., 2009. Trading costs and returns for US equities: estimating effective costs from daily data. Journal of Finance, forthcoming.
Heflin, F., Shaw, K., 2000. Blockholder ownership and market liquidity. Journal of Financial and Quantitative Analysis 35, 621–633.
Holden, C., 2009. New low-frequency liquidity measures. Working Paper, Indiana University.
Huang, R., Stoll, H., 1996. Dealer versus auction markets: a paired comparison of execution costs on NASDAQ and the NYSE. Journal of Financial Economics 41, 313–357.
Huang, R., Stoll, H., 1997. The components of the bid–ask spread: a general approach. Review of Financial Studies 10, 995–1034.
Jegadeesh, N., Titman, S., 1993. Returns to buying winners and selling losers: implications for market efficiency. Journal of Finance 48, 65–92.
Jegadeesh, N., Titman, S., 2001. Profitability of momentum strategies: an evaluation of alternative explanations. Journal of Finance 56, 699–720.
Jones, C., 2002. A century of stock market liquidity and trading costs. Working Paper, Columbia University.

Kalev, P., Pham, P., Steen, A., 2003. Underpricing, stock allocation, ownership structure and post-listing liquidity of newly listed firms. Journal of Banking and Finance 27, 919–947.
Keim, D., Madhavan, A., 1997. Transactions costs and investment style: an inter-exchange analysis of institutional equity trades. Journal of Financial Economics 46, 265–292.
Korajczyk, R., Sadka, R., 2008. Pricing the commonality across alternative measures of liquidity. Journal of Financial Economics, forthcoming.
Lee, C., Radhakrishna, B., 2000. Inferring investor behavior: evidence from TORQ data. Journal of Financial Markets 3, 83–112.
Lee, C., Ready, M., 1991. Inferring trade direction from intraday data. Journal of Finance 46, 733–746.
Lerner, J., Schoar, A., 2004. The illiquidity puzzle: theory and evidence from private equity. Journal of Financial Economics 72, 3–40.
Lesmond, D., 2005. Liquidity of emerging markets. Journal of Financial Economics 77, 411–452.
Lesmond, D., O'Connor, P., Senbet, L., 2008. Capital structure and equity liquidity. Working Paper, Tulane University.
Lesmond, D., Ogden, J., Trzcinka, C., 1999. A new estimate of transaction costs. Review of Financial Studies 12, 1113–1141.
Lipson, M., Mortal, S., 2004a. Liquidity and firm characteristics: evidence from mergers and acquisitions. Working Paper, University of Georgia.
Lipson, M., Mortal, S., 2004b. Capital structure decision and equity market liquidity. Working Paper, University of Georgia.
Pastor, L., Stambaugh, R., 2003. Liquidity risk and expected stock returns. Journal of Political Economy 111, 642–685.
Roll, R., 1984. A simple implicit measure of the effective bid–ask spread in an efficient market. Journal of Finance 39, 1127–1139.
Rouwenhorst, G., 1998. International momentum strategies. Journal of Finance 53, 267–284.
Sadka, R., 2006. Liquidity risk and asset pricing. Journal of Financial Economics 80, 309–349.
Schrand, C., Verrecchia, R., 2004. Disclosure choice and cost of capital: evidence from underpricing in initial public offerings. Working Paper, University of Pennsylvania.
Schultz, P., 2000. Regulatory and legal pressures and the costs of NASDAQ trading. Review of Financial Studies 13, 917–957.
Swinscow, T., 1997. Statistics at Square One, ninth ed. BMJ Publishing Group, London.
Theil, H., 1966. Applied Economic Forecasts. North-Holland, Amsterdam.
Watanabe, A., Watanabe, M., 2006. Time-varying liquidity risk and the cross section of stock returns. Review of Financial Studies, forthcoming.
