Greene GravityModelsofTrade Mar2014
and endogeneity∗
Felix Chan†, Mark N. Harris†, William Greene#, and László Kónya\
† School of Economics and Finance, Curtin University, Australia
# Economics Department, Stern School of Business, New York University, USA
\ School of Economics, La Trobe University, Australia
March, 2014
Abstract
We consider the estimation of the usual gravity model of trade, which involves
flows of trade, say exports, from country i to country j in time period t. We
suggest an easy-to-implement generalised method of moments estimator that avoids
the issues associated with the usual fixed effects treatment of the unobserved
heterogeneity in this type of model and, at the same time, provides consistent
parameter estimates in the face of potentially endogenous covariates.
JEL Classification: C33, F14, F15
Keywords: Gravity model, unobserved heterogeneity, Generalised Method of Moments.
∗ Corresponding author: Mark Harris, [email protected]. The usual caveats apply.
1 Introduction
With the increasing integration of national economies, societies, cultures and ideas, the
current phase of globalization has seen international trade at unprecedented levels: in
spite of the recent global recession and debt crisis, world trade of goods and services
amounted to about 27 to 31 percent of world GDP every year between 2007 and 2012.
Given the sheer size of international trade and the important role it has in improving
productivity and efficiency by providing access to enlarged world markets, stimulating
stronger competition, and generating technological spillover, specialization and division
of labour, understanding the main empirical drivers of international bilateral trade is
clearly a key issue for policymakers. This is true both at the national level, and also at
the international level, for groups and trading blocs of countries, such as APEC and the
European Union.
There are several approaches to formally modelling trade behaviour: for example,
partial equilibrium models if the primary interest is in the effects of policy on a
specific sector, or computable general equilibrium models if the interest is
economy-wide and the focus is on the relationship between production, consumption,
goods and factors of production. Still, since the pioneering studies of Tinbergen
(1962) and Pöyhönen (1963), the so-called gravity model has proved to be the
workhorse of empirical models of bilateral trade flows. Originally developed from
Isaac Newton’s Law of Universal Gravitation, this model embodies the somewhat vague
notion that the strength of interaction between two units is primarily determined by
their sizes and the distance or friction (in a physical or non-physical sense)
between them. In particular, in its simplest form the gravity model of international
trade posits that export (import) flows are positively related to the “masses” of the
trading partners (as proxied, for example, by their GDP and population), and
inversely related to the distance between them.1
The popularity of the basic gravity model of international trade and of its various ex-
tensions is mainly due to their empirical success in capturing the impact of such factors as
common language, religion, border, trade liberalization etc. on trade, and in assessing the
1
For a useful summary see Anderson (2011). Apart from bilateral trade flows the gravity model has
been successfully adapted to a wide range of research topics ranging from Reilly’s law of retail gravitation
(Reilly, William J., 1931, The Laws of Retail Gravitation, New York, Knickerbocker Press) through cross-
border equity flows (Magee 2008) and productivity flows (Anderson 2009) to the movement of people and
ideas between places (see e.g. Karemera et al., 2000, A gravity model analysis of international migration
to North America, Applied Economics, 32, 1745-1755).
effect of geographic regions and international agreements on trade. Moreover, by
considering many more goods than production factors and allowing for complete
specialization in different product varieties across countries, several different
theoretical models, such as the monopolistic competition model and the
Heckscher-Ohlin-Samuelson model with a continuum of goods, are consistent with the
gravity model of international trade. Nevertheless, in spite of its versatility and
commendable empirical performance, the gravity model has long been criticised by
theorists for the lack of a solid theoretical foundation. In fact, the simple analogy
with Newton’s law, which makes it so attractive and easily adaptable to many
different issues and conditions, also makes it void of serious economic theory, but,
at the same time, consistent with several potentially contradicting theories.
Recent advances in this field have focused on the underlying economic theory (e.g.
Anderson and van Wincoop (2003)), on the empirical specification (e.g. Rose (2004),
Liu (2009), Subramanian and Wei (2007)), and on various econometric issues (e.g.
Egger (2000), Anderson and van Wincoop (2003), Silva and Tenreyro (2006), Baier and
Bergstrand (2007)). The current paper belongs to this latter branch of the
literature. In particular, it focuses on the treatment of the unobserved
heterogeneity that has been the topic of many recent papers (such as Wall and Cheng
(1999), Egger (2000), Egger (2002), Egger and Pfaffermayr (2003), Baier and
Bergstrand (2007)). We suggest a parsimonious estimation procedure, based on a
generalised method of moments approach, that remains consistent in the presence of
endogeneity. We illustrate our technique with an application to a model of export
flows within the OECD group of countries.
partners and derived a system of non-linear equations that was later linearized and
solved analytically by Straathof (2008). An alternative solution based on a
first-order Taylor-series expansion of the MRTs was proposed by Baier and Bergstrand
(2009). Still another possible way to deal with this problem is to use country-time
fixed effects to approximate the unobservable but potentially time-varying MRTs,
along with ordered country-pair fixed effects to account for unobservable but
time-constant heterogeneity (see e.g. Baltagi et al., 2003; Baldwin and Taglioni,
2006; Baier and Bergstrand, 2007). Adopting the popular log-linear specification of
the gravity model, this latter empirical strategy leads to the following linear
three-way error components model:
for $i, j = 1, \ldots, N$, $i \neq j$ and $t = 1, \ldots, T_{ij}$, where $T_{ij}$
denotes the number of time series observations for the $(i, j)$ pair. $y_{ijt}$
represents the volume of trade (typically either export or import flows) from country
$i$ to $j$ at time $t$; $x_{ijt}$ is the $K \times 1$ vector of structural
explanatory variables (such as GDP and population), which may, or may not, vary in
the complete $ijt$ index space; $\beta_0$ is the $K \times 1$ true parameter vector,
which is typically unknown and is the item of interest; $N$ and $T$ correspond to the
number of countries and time periods, respectively; and $u_{ijt}$ is the usual
disturbance term. As indicated by the notation $T_{ij}$, there is no requirement for
this panel to be balanced in any dimension. Importantly, $\alpha_{ij}$, $\gamma_{it}$
and $\lambda_{jt}$ are the unobserved country-pair and country-time specific effects
(such as unobserved supply, demand, and time effects), which have consistently been
shown in the literature to be very important; see, for example, Wall and Cheng
(1999), Egger (2000), Egger (2002), Egger and Pfaffermayr (2003), Baier and
Bergstrand (2007).
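For reference, the symbol definitions above and the error decomposition used in the Appendix are consistent with a composite-error form such as the one below. This is a sketch inferred from the notation, not a quotation of equations (1) and (2); in particular, how the two parts are split across those equation numbers is an assumption.

```latex
y_{ijt} = x_{ijt}'\beta_0 + v_{ijt},
\qquad
v_{ijt} = \alpha_{ij} + \gamma_{it} + \lambda_{jt} + u_{ijt},
\qquad i \neq j .
```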
A seemingly straightforward way to estimate this model is by the Least Squares
Dummy Variables (LSDV) estimator, whereby the unobserved individual characteristics
are captured by sets of dummy variables for the exporter-importer country-pair
effects and for the exporter-time and importer-time effects. This brute force method,
however, can be very cumbersome with many countries and time periods because of the
excessively large number of dummy variables. In fact, it might even be infeasible due
to computer memory limitations and/or intrinsic restrictions of popular software
packages on the number of variables and on the maximum allowable matrix size. It is
not by chance that all studies that report LSDV estimation results for gravity models
with MRTs and reasonably large $N$ and $T$ (e.g. Baier and Bergstrand, 2007;
Subramanian and Wei, 2007; Eicher and Henn, 2011) work with 5-yearly data,
effectively reducing the number of $\gamma_{it}$ and $\lambda_{jt}$ specific effects
by 80 percent. This data reduction, however, has its own price tag: valuable
information might be lost by reducing the data frequency and averaging yearly
observations, and there is no guarantee that the five-yearly estimation results
faithfully describe all facets of the underlying data generating process.
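To make the scale of the dummy-variable problem concrete, the following minimal sketch counts the fixed effects in a balanced three-way panel and shows the 80 percent drop in country-time effects when only every fifth year is kept. The function name and the panel dimensions (30 countries, 50 years) are hypothetical, and the count ignores the normalisations needed to avoid perfect collinearity.

```python
def lsdv_dummy_counts(n_countries: int, n_periods: int) -> dict:
    """Count the fixed-effect dummies of a balanced three-way gravity panel."""
    pair = n_countries * (n_countries - 1)      # alpha_ij: one per ordered pair
    exporter_time = n_countries * n_periods     # gamma_it
    importer_time = n_countries * n_periods     # lambda_jt
    return {
        "pair": pair,
        "country_time": exporter_time + importer_time,
        "total": pair + exporter_time + importer_time,
    }


# Hypothetical panel: 30 countries observed yearly for 50 years.
yearly = lsdv_dummy_counts(30, 50)
# The same panel sampled every fifth year keeps 10 periods.
five_yearly = lsdv_dummy_counts(30, 10)

reduction = 1 - five_yearly["country_time"] / yearly["country_time"]
print(yearly)
print(five_yearly)
print(f"country-time dummies removed: {reduction:.0%}")
```

The pair effects are untouched by the change in frequency; only the country-time effects shrink.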
If one is not particularly concerned with the estimation of the individual fixed
effects, in certain situations these problems can be overcome by applying some fixed
effects (FE) data transformation prior to estimation.2 However, there is no single,
relatively simple transformation for unbalanced data sets that could wipe out all of
the $\alpha_{ij}$, $\gamma_{it}$ and $\lambda_{jt}$ two-dimensional FEs
simultaneously. Alternatively, one might combine the FE data transformation and the
LSDV estimator (FELSDV method). For example, one might eliminate the country-pair FEs
by the appropriate transformation and then apply the LSDV estimator to the
transformed data to take care of the country-time FEs. However, the FE transformation
has to be performed on the $\gamma_{it}$ and $\lambda_{jt}$ dummies as well, and the
resulting transformed variables are no longer simple 0/1 dummy variables, so the
computational burden and computer memory requirement might still prove to be
insurmountable obstacles. A further possibility is to eliminate $\gamma_{it}$ and
$\lambda_{jt}$ by the difference of difference method (Head et al., 2010). The
disadvantage of this approach is that, while the estimation results might be
sensitive to the choice of the partner countries, there is no simple rule for
selecting them, especially when there are many country pairs in the data set.
Moreover, in the case of unbalanced panel data sets it might be necessary to consider
different partner countries for different time periods.
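As an illustration of the first step of the FELSDV idea, the sketch below applies a within (pair-demeaning) transformation to an unbalanced toy panel and checks that the pair effect drops out. The helper name `demean_by_pair` and all numbers are hypothetical; the country-time effects are deliberately left out of the toy data, since this transformation alone does not remove them.

```python
import numpy as np


def demean_by_pair(values, pair_ids):
    """Subtract each pair's own time average (works for unequal T_ij)."""
    values = np.asarray(values, dtype=float)
    _, group = np.unique(np.asarray(pair_ids), return_inverse=True)
    means = np.bincount(group, weights=values) / np.bincount(group)
    return values - means[group]


rng = np.random.default_rng(0)
pair_ids = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])   # unbalanced: T_ij = 3, 2, 4
alpha = np.array([5.0, -3.0, 10.0])                 # hypothetical pair effects
noise = rng.normal(size=pair_ids.size)
y = alpha[pair_ids] + noise

# The pair effect drops out: demeaned y equals demeaned noise.
print(np.allclose(demean_by_pair(y, pair_ids), demean_by_pair(noise, pair_ids)))
```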
reasons:
1. Standard panel data literature: “With a large number of random draws from the
cross section, it almost always makes sense to treat the unobserved effects as
random...” (Wooldridge 2010).3
2. Very few, if any, of the key variables that are of interest to researchers and
policy-makers alike vary in the $ijt$ index. Therefore, including a full set of dummy
variables as implied by equation (1) essentially means that it is impossible to
identify the effects of most of our variables of interest.
3. As the data sets grow in any dimension ($N$ or $T$, but especially the former),
the number of required dummy variables, and the associated loss in degrees of
freedom, is enormous. This is because, for $N$ countries, there are $N(N-1)$ possible
pairs of trade relationships, so the total number of observations for a balanced
panel is $N(N-1)T$. Hence, it is effectively impossible to estimate a gravity model
of the form of equation (1) for a reasonably large set of countries over an extended
period of time, as the computational task is beyond the capabilities of standard
software packages such as Stata, Limdep, Gauss etc. running on even very large and
fast desktop computers.
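The orders of magnitude in point 3 can be sketched as follows. The function below is a hypothetical helper that sizes a dense LSDV design matrix stored in 8-byte floats for a balanced panel; it assumes no sparse storage or absorption tricks, so it is an upper bound on what a naive implementation would need rather than a statement about any particular package.

```python
def dense_lsdv_footprint(n_countries: int, n_periods: int, n_regressors: int = 0) -> dict:
    """Rows, columns and memory of a dense LSDV design matrix (balanced panel)."""
    pairs = n_countries * (n_countries - 1)
    rows = pairs * n_periods                                   # N(N-1)T observations
    cols = pairs + 2 * n_countries * n_periods + n_regressors  # dummies + covariates
    return {"rows": rows, "cols": cols, "gigabytes": rows * cols * 8 / 1e9}


# 33 countries over 46 years, then a hypothetical 150 countries over 50 years.
small = dense_lsdv_footprint(33, 46)
large = dense_lsdv_footprint(150, 50)
print(small)
print(large)
```

Because both the row count and the column count grow with $N^2$, the memory requirement grows roughly with $N^4$, which is why adding countries is far more costly than adding years.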
Clearly, an issue with treating these unobserved effects as random is the risk of
potential endogeneity arising from correlations between any of $\alpha_{ij}$,
$\gamma_{it}$ and $\lambda_{jt}$ and the observed covariates (Baier and Bergstrand
2007). Moreover, although an instrumental variable approach could be considered here
(Serlenga and Shin 2007), it appears that, in general, finding appropriate
instruments that are both strictly exogenous to all of the unobserved elements of the
model and strongly related to the observed covariates is somewhat of a search for the
Holy Grail.
However, given the stochastic structure of $v_{ijt}$ as defined in equation (2), it
is possible to derive a set of second order moment conditions that yield a consistent
estimate of $\beta_0$ by applying the non-linear generalised method of moments (GMM).
The aim of this section is to derive such moment conditions based on the
variance-covariance structure of $v_{ijt}$ and to show that nonlinear GMM under these
moment conditions is consistent and asymptotically normal. An important contribution
is that the nonlinear GMM estimator under the proposed moment conditions is
consistent in the presence of endogeneity without the need to identify other
variables as instruments. In other words, this approach utilises the second order
moments to eliminate the effects of endogeneity.
3
p. 286.
To understand this, it is useful to point out that, in his seminal work, Hansen
(1982) showed (Theorem 2.1) that under certain conditions GMM is still consistent in
the presence of endogeneity. This paper argues that endogeneity generally affects
only one of the conditions required for consistency, namely that the population
moment conditions are satisfied only at $\beta_0$. Therefore, it is possible to
obtain a consistent GMM estimator in the presence of endogeneity without instrumental
variables, provided that there exists a set of suitable moment conditions that can
only be satisfied at $\beta_0$.
The following notation will be used for the rest of the paper. Let
$x_{ijt} = (x_{it}', x_{jt}', x_{ijt}')'$, where $x_{it}$ and $x_{jt}$ are
$K_1 \times 1$ and $K_2 \times 1$ vectors denoting the variables specific to
countries $i$ and $j$, respectively, and $x_{ijt}$ is a $K_3 \times 1$ vector
denoting the variables specific to the $(i, j)$ pair. Note that
$K = K_1 + K_2 + K_3$. Let $\{Y_{ij}, X_{ij}\}$ denote the time series data for
$y_{ijt}$ and $x_{ijt}$, which is an outcome of a sequence of random variables
$\{W_{ij}\}$ such that $\{Y_{ij}, X_{ij}\} \subset \{W_{ij}\}$. Similarly,
$\{Y_t, X_t\}$ denotes the cross section data for time period $t$. Let
$Y_{i\bullet t} = (y_{i1t}, \ldots, y_{iNt})'$ and
$Y_{\bullet jt} = (y_{1jt}, \ldots, y_{Njt})'$ be $(N-1) \times 1$ vectors,4 and let
$\{Y, X\}$ denote the full dataset. $\otimes$ denotes the Kronecker product, $i_N$
denotes an $N \times 1$ vector of 1’s, $0_K$ denotes a $K \times 1$ vector of 0’s and
$I_K$ denotes a $K \times K$ identity matrix. The subscript may be omitted if the
size of the matrix is clear from the context. $D_i$ denotes an $(N-1) \times N$
selection matrix such that $D_i A = i_{N-1} \otimes A_i$, where $A_i$ denotes the
$i$th row of the matrix $A$. $E_i$ is the $(N-1) \times N$ elimination matrix such
that $E_i A$ removes the $i$th row of $A$. $\|A\|$ denotes the Euclidean norm of $A$,
$\xrightarrow{p}$ denotes convergence in probability and $\xrightarrow{d}$ denotes
convergence in distribution. If $x$ is an $m \times n$ matrix, then $x < y$ denotes
element-wise inequality if $y$ is an $m \times n$ matrix; if $y$ is a scalar, it
means that all elements of $x$ are less than $y$. The same definition extends to
$\leq$ and $\geq$ in a natural way.
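The selection and elimination matrices defined above are easy to build explicitly. The following sketch constructs them with NumPy and checks the two defining properties on a small example; the function names are hypothetical and the index `i` is zero-based in the code, unlike in the text.

```python
import numpy as np


def selection_matrix(i: int, n: int) -> np.ndarray:
    """D_i: (n-1) x n matrix that copies row i of A into each of n-1 rows."""
    d = np.zeros((n - 1, n))
    d[:, i] = 1.0
    return d


def elimination_matrix(i: int, n: int) -> np.ndarray:
    """E_i: the n x n identity with row i deleted, so E_i A drops row i of A."""
    return np.delete(np.eye(n), i, axis=0)


n, i = 4, 1                        # i is zero-based here
A = np.arange(12.0).reshape(n, 3)
D, E = selection_matrix(i, n), elimination_matrix(i, n)
ones = np.ones((n - 1, 1))

print(np.array_equal(D @ A, np.kron(ones, A[i:i + 1])))   # D_i A = i_{N-1} (x) A_i
print(np.array_equal(E @ A, np.delete(A, i, axis=0)))     # E_i A removes row i
```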
Consider the following assumptions:
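A1. For all $i = 1, \ldots, N$,
$E(\alpha_{i\bullet}\alpha_{i\bullet}') = \sigma_\alpha^2 I_{N-1}$.
A2. Let $\gamma_t = (\gamma_{1t}, \ldots, \gamma_{Nt})'$. Then
$$E(\gamma_t \gamma_s') = \begin{cases} \sigma_\gamma^2 I_N & t = s \\ 0_{N \times N} & t \neq s. \end{cases}$$
4
$y_{iit}$ does not exist.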
A5. The four unobservable components, namely $\alpha_{ij}$, $\gamma_{it}$,
$\lambda_{jt}$ and $u_{ijt}$, are independent of each other for all $(i, j)$ pairs
and for all $t = 1, \ldots, T$.
A6. For each $(i, j)$ pair, $\{W_{ij}\}_{t=1}^{T}$ is stationary and ergodic.
Moreover, $E(\|X \otimes X\|^2) < \infty$.
A7. For each $(i, j)$ pair, $E(u_{ijt} \mid x_{ijt}) = 0$,
$E(\alpha_{ij} \mid x_{ijt}) = 0$, $E(x_{it}\gamma_{it}) = \gamma < \infty$,
$E(x_{jt}\lambda_{jt}) = \lambda < \infty$ and
$E(\gamma_{it} \mid x_{jt}) = E(\lambda_{jt} \mid x_{it}) = 0$ for
$t = 1, \ldots, T$ and $i \neq j$.
Remark 2 Assumption 7 specifies the correlation structure between $x_{ijt}$ and
$v_{ijt}$. The moment conditions presented below will obviously change if a different
correlation structure is assumed. However, the proof of Proposition 1 demonstrates
that it is possible to derive a set of valid moment conditions subject to the
correlation structure between $x_{ijt}$ and $v_{ijt}$. An interesting question is how
general this correlation structure can be before it is no longer possible to
eliminate the problem of endogeneity using second moment information. This is beyond
the scope of the current paper and is left for further research.
$0_M$ with $M \geq K$. The GMM estimator is defined to be
where $\Sigma^{-1}$ is the optimal weight matrix. Following the standard approach in
the literature, define $\beta^*$ as the solution to the optimisation in equation (3)
with $\Sigma = I_M$; then the optimal weight matrix can be estimated by
$$\hat\Sigma(\beta^*) = T^{-1}\sum_{s=1}^{T} g(\beta^*; Y_s, X_s)\, g'(\beta^*; Y_s, X_s). \tag{4}$$
Hence, an efficient GMM estimator can be obtained as
C2. For all $i, j = 1, \ldots, N$, $i \neq j$,
$E(v_{\bullet it}v_{\bullet it}' - v_{\bullet jt}v_{\bullet jt}') = 0_{(N-1)\times(N-1)}$, $t = 1, \ldots, T$.
C3. For all $i, j = 1, \ldots, N$, $i \neq j$,
$E(v_{i\bullet t}v_{i\bullet t-1}' - v_{j\bullet t}v_{j\bullet t-1}') = 0_{(N-1)\times(N-1)}$, $t = 2, \ldots, T$.
C4. For all $i, j = 1, \ldots, N$, $i \neq j$,
$E(v_{i\bullet t}v_{i\bullet t}' - v_{j\bullet t-1}v_{j\bullet t-1}') = 0_{(N-1)\times(N-1)}$, $t = 2, \ldots, T$.
C5. For all $i, j = 1, \ldots, N$,
$E(v_{i\bullet t}v_{j\bullet t-1}') = 0_{(N-1)\times(N-1)}$, for $t = 2, \ldots, T$.
C6. For all $i, j = 1, \ldots, N$,
$E(v_{\bullet it}v_{\bullet jt-1}') = 0_{(N-1)\times(N-1)}$, for $t = 2, \ldots, T$.
C1-C6 lead to a maximum of $\frac{5}{2} T N^2 (N-1)$ moment conditions. Their
validity is implied by the following propositions:
Proposition 1 If $\{Y_{ij}, X_{ij}\}$ follows the gravity model as defined in
equations (1)-(2), then under Assumptions 1-7, Assumptions 2.1-2.5 in Hansen (1982)
and the moment conditions C1-C6, the GMM estimator as defined in equation (3) is
consistent, that is, $\hat\beta \xrightarrow{p} \beta_0$ as $T \to \infty$.
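Proposition 2 Under the Assumptions in Proposition 1 and Assumptions 3.5 and 3.6 in
Hansen (1982), $\sqrt{T}(\hat\beta - \beta_0) \xrightarrow{d} N(0, V)$, where
$$V = \left. E\left[\frac{\partial g'(\beta)}{\partial\beta}\,\big(g(\beta)g'(\beta)\big)^{-1}\,\frac{\partial g(\beta)}{\partial\beta'}\right]^{-1}\right|_{\beta=\hat\beta}. \tag{6}$$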
Proof. See Appendix.
Given the results in Propositions 1 and 2, the test of over-identifying moment
conditions can be conducted in the usual way. That is,
$$J = T\, g'(\hat\beta; Y, X)\,\Sigma^{-1} g(\hat\beta; Y, X) \xrightarrow{d} \chi^2(M-K). \tag{7}$$
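The two-step procedure and the over-identification statistic described above can be sketched generically. The code below assumes the objective takes the standard quadratic form in the averaged moments and uses a simple Gauss-Newton iteration with a numerical Jacobian; these choices, and all function names, are illustrative assumptions rather than the authors' implementation. The toy moment function is an ordinary linear instrumental-variable moment, chosen only to exercise the machinery; it is not one of the second-order conditions C1-C6.

```python
import numpy as np


def mean_moments(moments, beta):
    """Average the T x M array of per-period moments over time."""
    return moments(beta).mean(axis=0)


def jacobian(moments, beta, eps=1e-6):
    """Central-difference Jacobian of the averaged moments (M x K)."""
    cols = []
    for unit in np.eye(beta.size):
        hi = mean_moments(moments, beta + eps * unit)
        lo = mean_moments(moments, beta - eps * unit)
        cols.append((hi - lo) / (2 * eps))
    return np.column_stack(cols)


def minimise(moments, beta, weight, n_iter=50, tol=1e-10):
    """Gauss-Newton iterations on Q(beta) = gbar' W gbar."""
    beta = np.asarray(beta, dtype=float)
    for _ in range(n_iter):
        g = mean_moments(moments, beta)
        G = jacobian(moments, beta)
        step = np.linalg.solve(G.T @ weight @ G, G.T @ weight @ g)
        beta = beta - step
        if np.linalg.norm(step) < tol:
            break
    return beta


def two_step_gmm(moments, beta_start):
    beta_start = np.asarray(beta_start, dtype=float)
    T, M = moments(beta_start).shape
    beta_star = minimise(moments, beta_start, np.eye(M))   # step 1: identity weight
    g_star = moments(beta_star)
    sigma_hat = g_star.T @ g_star / T                      # estimated moment covariance
    weight = np.linalg.inv(sigma_hat)
    beta_hat = minimise(moments, beta_star, weight)        # step 2: optimal weight
    g = mean_moments(moments, beta_hat)
    j_stat = T * g @ weight @ g                            # over-identification statistic
    return beta_hat, j_stat


# Toy data (hypothetical): one endogenous regressor, three instruments.
rng = np.random.default_rng(0)
T = 2000
z = rng.normal(size=(T, 3))
e = rng.normal(size=T)
x = z @ np.array([1.0, 0.5, -0.5]) + 0.8 * e + rng.normal(size=T)
y = 2.0 * x + e


def iv_moments(beta):
    return z * (y - x * beta[0])[:, None]


beta_hat, j_stat = two_step_gmm(iv_moments, [0.0])
print(beta_hat, j_stat)
```

With three moments and one parameter, the statistic in this toy case would be compared with a chi-squared distribution with two degrees of freedom.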
4 Empirical Application
For the sake of illustration we consider an unbalanced dataset of 1,056 trading
partners and 33,514 observations of the current OECD countries over the years
1960-2005. The data were primarily sourced from: the IMF’s Direction of Trade
Statistics; the IMF’s International Financial Statistics; the World Bank’s World
Development Indicators; the World Trade Organization’s website (www.wto.org); and the
CIA’s World Factbook.
The dependent variable is the (log of) real export flows. The explanatory variables
are:
• ONEIN: dummy variable for one of the two countries being in GATT/WTO.
• LNRGDPPOP_it, LNRGDPPOP_jt: (log of) real per capita GDP.
• COMCOL_ij: dummy variable for a common coloniser.
Given the definitions of the dependent and independent variables, the coefficients of
LNRGDP_it, LNRGDP_jt, LNRGDPPOP_it and LNRGDPPOP_jt are elasticities, while the
remaining coefficients are semi-elasticities. As regards their expected signs, if
GATT/WTO membership has a positive effect on trade, one would expect ONEIN_ijt and
BOTHIN_ijt to have positive coefficients and the coefficient of BOTHIN_ijt to
dominate that of ONEIN_ijt. However, if these variables represent trade diversion and
trade creation effects of GATT/WTO membership, then one would expect a negative
coefficient for ONEIN_ijt and a positive coefficient for BOTHIN_ijt (although the
former could still be positive due to the externalities of GATT/WTO on non-member
countries). Moreover, one would expect richer countries; countries that share a
language, land border, or colonial history; and countries that belong to the same
trading group or monetary union, or have some special bilateral agreement, all to
trade more with each other. By contrast, countries that are far apart or
geographically larger are likely to trade less. Finally, whether being landlocked or
an island nation encourages trade or not seems to be ambiguous.
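Let $y_{ijt}$ denote the log of real export flow from country $i$ to country $j$ at
time $t$ and $x_{ijt}$ be the vector of the corresponding explanatory variables.
Given the model defined in equations (1) and (2) and $\nu(\beta) = Y - X\beta$, the
sample counterparts of the moment restrictions C1-C6 are, for $i = 1, \ldots, N$:
$$
\begin{aligned}
g_{1t}(\beta) &= \mathrm{vech}\Big[\nu_{i\bullet t}(\beta)\nu_{i\bullet t}'(\beta) - N^{-1}\sum_{j=1}^{N}\nu_{j\bullet t}(\beta)\nu_{j\bullet t}'(\beta)\Big] \\
g_{2t}(\beta) &= \mathrm{vech}\Big[\nu_{\bullet it}(\beta)\nu_{\bullet it}'(\beta) - N^{-1}\sum_{j=1}^{N}\nu_{\bullet jt}(\beta)\nu_{\bullet jt}'(\beta)\Big] \\
g_{3t}(\beta) &= \mathrm{vech}\Big[\nu_{i\bullet t}(\beta)\nu_{i\bullet t-1}'(\beta) - N^{-1}\sum_{j=1}^{N}\nu_{j\bullet t}(\beta)\nu_{j\bullet t-1}'(\beta)\Big] \\
g_{4t}(\beta) &= \mathrm{vech}\Big[\nu_{\bullet it}(\beta)\nu_{\bullet it-1}'(\beta) - N^{-1}\sum_{j=1}^{N}\nu_{\bullet jt}(\beta)\nu_{\bullet jt-1}'(\beta)\Big] \\
g_{5t}(\beta) &= \mathrm{vech}\Big[N^{-1}\sum_{j=1}^{N}\nu_{i\bullet t}(\beta)\nu_{j\bullet t-1}'(\beta)\Big] \\
g_{6t}(\beta) &= \mathrm{vech}\Big[N^{-1}\sum_{j=1}^{N}\nu_{\bullet it}(\beta)\nu_{\bullet jt-1}'(\beta)\Big]
\end{aligned}
$$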
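As a concrete reading of the first of these sample moments, the sketch below computes $g_{1t}$ from an $N \times N$ array of residuals for a single period, stacking the blocks for $i = 1, \ldots, N$. It assumes a balanced cross section in which every off-diagonal residual is observed; the function names and the array layout (`v_t[i, j]` holding $\nu_{ijt}$, diagonal ignored) are illustrative assumptions.

```python
import numpy as np


def off_diagonal_rows(v):
    """Return an N x (N-1) array whose i-th row is (v[i, j] for j != i)."""
    n = v.shape[0]
    mask = ~np.eye(n, dtype=bool)
    return v[mask].reshape(n, n - 1)


def vech(a):
    """Stack the lower triangle of a square matrix column by column."""
    return a.T[np.triu_indices(a.shape[0])]


def g1t(v_t):
    """Stacked g_1t blocks for i = 1..N from one period's residual matrix."""
    rows = off_diagonal_rows(v_t)                    # row i is nu_{i.t}
    outer = np.einsum("ia,ib->iab", rows, rows)      # nu_{i.t} nu_{i.t}' for each i
    centred = outer - outer.mean(axis=0)             # subtract N^{-1} sum over j
    return np.concatenate([vech(block) for block in centred])


rng = np.random.default_rng(1)
v_t = rng.normal(size=(4, 4))        # hypothetical residuals, N = 4
g = g1t(v_t)
print(g.shape)
```

Because each block is a deviation from the cross-sectional average, the blocks sum to zero across $i$, so one of the $N$ blocks is redundant in this construction.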
The parameter vector $\beta$ can be estimated by the non-linear GMM estimator as
defined in equation (5) with $g(\beta) = (T-1)^{-1}\sum_{t=2}^{T} g_t(\beta)$, where
$g_t(\beta) = (g_{1t}'(\beta), \ldots, g_{6t}'(\beta))'$. Following the standard
approach and introducing $\beta^* = \arg\min_{\beta} g'(\beta)g(\beta)$, the optimal
weight matrix is constructed as
$$\hat\Sigma(\beta^*) = (T-1)^{-1}\sum_{t=2}^{T} g_t(\beta^*)\,g_t(\beta^*)'. \tag{8}$$
The nonlinear GMM estimator is then the solution to the optimisation problem
$\hat\beta = \arg\min_{\beta} g'(\beta)\hat\Sigma^{-1}(\beta^*)g(\beta)$. Moreover,
the variance-covariance matrix of $\hat\beta$ can be estimated by
$$\hat V = T^{-1}\sum_{t=2}^{T}\left.\frac{\partial g_t'(\beta)}{\partial\beta}\right|_{\beta=\hat\beta}\left[\sum_{t=2}^{T} g_t(\hat\beta)\,g_t'(\hat\beta)\right]^{-1}\sum_{t=2}^{T}\left.\frac{\partial g_t(\beta)}{\partial\beta'}\right|_{\beta=\hat\beta}.$$
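Note that for any $K \times K$ matrix $A$, $\mathrm{vech}\,A = S\,\mathrm{vec}\,A$,
where $S$ is a $K(K+1)/2 \times K^2$ selection matrix that selects the appropriate
elements of $\mathrm{vec}\,A$. This implies
$$\frac{\partial g(\beta)}{\partial\beta'} = (T-1)^{-1}\sum_{t=2}^{T}\begin{pmatrix} S\,\dfrac{\partial\,\mathrm{vec}\,g_{1t}(\beta)}{\partial\beta'} \\ \vdots \\ S\,\dfrac{\partial\,\mathrm{vec}\,g_{6t}(\beta)}{\partial\beta'} \end{pmatrix}, \tag{9}$$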
where
∂vecg1t (β)
= − [νi•t (β) ⊗ Xi•t + Xi•t ⊗ νi•t (β)]
∂β 0
N
X
−1
−N [νj•t (β) ⊗ Xj•t + Xj•t ⊗ νj•t (β)] i = 1, . . . , N
j=1
∂vecg2t (β)
= − [ν•it (β) ⊗ X•it + X•it ⊗ ν•it (β)]
∂β 0
N
X
−1
−N [ν•jt (β) ⊗ X•jt + X•jt ⊗ ν•jt (β)] i = 1, . . . , N
j=1
∂vecg3t (β)
= − [νi•t−1 (β) ⊗ Xi•t + Xi•t−1 ⊗ νi•t (β)]
∂β 0
XN
− N −1 [νj•t−1 (β) ⊗ Xj•t + Xj•t−1 ⊗ νj•t (β)] i = 1, . . . , N
j=1
∂vecg4t (β)
= − [ν•it−1 (β) ⊗ X•it + X•it−1 ⊗ ν•it (β)]
∂β 0
N
X
−1
−N [ν•jt−1 (β) ⊗ X•jt + X•jt−1 ⊗ ν•jt (β)] i = 1, . . . , N
j=1
N
∂vecg5t (β) −1
X
= − N [νj•t−1 (β) ⊗ Xi•t + Xj•t−1 ⊗ νi•t ] i = 1, . . . , N
∂β 0 j=1
N
∂vecg6t (β) −1
X
= − N [ν•jt−1 (β) ⊗ X•it + X•jt−1 ⊗ ν•it ] i = 1, . . . , N.
∂β 0 j=1
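Two building blocks of these derivatives can be checked numerically. The sketch below first constructs a selection matrix $S$ with $\mathrm{vech}\,A = S\,\mathrm{vec}\,A$, and then compares the Kronecker-product expression for the derivative of $\mathrm{vec}(\nu\nu')$, with $\nu(\beta) = y - X\beta$, against a finite-difference approximation. The dimensions and random inputs are hypothetical, and only the leading term of the $g_{1t}$ derivative is checked, not the cross-sectional average.

```python
import numpy as np


def vec(a):
    """Stack the columns of a matrix."""
    return a.reshape(-1, order="F")


def vech(a):
    """Stack the lower triangle of a square matrix column by column."""
    return a.T[np.triu_indices(a.shape[0])]


def vech_selection_matrix(k):
    """S with vech(A) = S vec(A) for any k x k matrix A."""
    rows = []
    for col in range(k):
        for row in range(col, k):
            e = np.zeros(k * k)
            e[col * k + row] = 1.0      # position of A[row, col] in vec(A)
            rows.append(e)
    return np.array(rows)


rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
S = vech_selection_matrix(3)
print(S.shape, np.allclose(S @ vec(A), vech(A)))

# Derivative of vec(nu nu') = nu (x) nu with respect to beta'.
n, K = 4, 2
X = rng.normal(size=(n, K))
y = rng.normal(size=n)
beta = rng.normal(size=K)
nu = y - X @ beta
analytic = -(np.kron(nu[:, None], X) + np.kron(X, nu[:, None]))


def vec_outer(b):
    r = y - X @ b
    return np.kron(r, r)


eps = 1e-6
numeric = np.column_stack(
    [(vec_outer(beta + eps * e) - vec_outer(beta - eps * e)) / (2 * eps) for e in np.eye(K)]
)
print(np.allclose(analytic, numeric, atol=1e-6))
```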
Table 1 presents two sets of estimation results. The first column, labelled
“Identity”, contains the results for the case $\Sigma = I$ and the second column,
labelled “Optimal”, contains the results from the two-step estimator, that is,
$\Sigma = \hat\Sigma(\beta^*)$, where $\hat\Sigma(\beta^*)$ is the estimated optimal
weight matrix as defined in equation (8). It is based on the parameter estimates from
the first case, where $\Sigma = I$.
The J test defined in equation (7), based on the estimated optimal weight matrix,
gives 9.081, which does not provide sufficient evidence against the over-identifying
restrictions. Considering the “Identity” column, the slope coefficients are all
strongly significant and have logical signs, except those of LNLAND_i, LNLAND_j,
CBORD_ij and EVCOL_ij. The coefficients of the two most important independent
variables, ONEIN_ijt and BOTHIN_ijt, imply that bilateral trade between a GATT/WTO
member country and a non-member country is expected to be about 77 percent higher,
and trade between two GATT/WTO members about 161 percent higher, than between two
non-members. The results are consistent with the “Optimal” case and the two sets of
results are not qualitatively different from each other. This is expected, as the
“Optimal” results should be more efficient than the “Identity” results but are not
expected to differ significantly in terms of the parameter estimates.
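The percentage effects quoted above follow from the usual conversion of a dummy-variable coefficient in a log-linear model, $100\,(e^{b} - 1)$. A minimal sketch using the “Identity” coefficients from Table 1 (the function name is hypothetical):

```python
import math


def dummy_effect_percent(coefficient: float) -> float:
    """Percentage change in the level implied by a dummy in a log-linear model."""
    return 100.0 * (math.exp(coefficient) - 1.0)


one_in = dummy_effect_percent(0.569)    # ONEIN, "Identity" column
both_in = dummy_effect_percent(0.958)   # BOTHIN, "Identity" column
print(round(one_in), round(both_in))    # 77 161
```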
5 Conclusion
Gravity models with multiple indices (here i, j and t) are extremely popular in
international trade as they adequately account for the likely presence of several
sets of unobserved heterogeneity and tend to fit the data very well. The
heterogeneity is often handled using a fixed-effects approach, essentially including
thousands of dummy variables in the regression model. Alternatively, one can adopt a
random-effects approach, though it yields inconsistent estimates if some explanatory
variables are correlated with the unobserved heterogeneity, which is often the case.
As a possible remedy, here we propose the use of a nonlinear GMM estimator based on
moment conditions implied by the usual assumptions of the empirical gravity model in
a panel data setting. This approach avoids well-known problems of the fixed effects
approach, such as the huge loss of degrees of freedom, the inability to identify the
effects of all covariates, and the difficulty of estimation on large data sets.
Moreover, it provides consistent parameter estimates in the face of potential
endogeneity, so long as the moment conditions used are valid.
Table 1: Estimation Results

Variable         Identity                 Optimal
ONEIN_ijt          0.569 (6.979)            0.632 (7.336)
BOTHIN_ijt         0.958 (12.220)           1.038 (12.711)
LNRGDP_it          0.038 (1.872)            0.090 (1.831)
LNRGDP_jt          0.710 (363.035)          0.740 (363.122)
LNRGDPPOP_it       0.822 (24.839)           0.863 (25.403)
LNRGDPPOP_jt       0.210 (67.373)           0.144 (67.705)
LNDIST_ij         -0.852 (-173.368)        -0.834 (-174.025)
LNLAND_i           1.073 (61.547)           1.156 (62.348)
LNLAND_j           0.027 (17.416)          -0.005 (17.557)
CLANG_ij           0.748 (57.952)           0.771 (58.176)
CBORD_ij          -0.387 (-9.832)          -0.289 (-9.207)
LLOCK_i            0.771 (9.710)            0.711 (9.157)
LLOCK_j           -0.426 (-47.048)         -0.520 (-47.364)
ISLAND_i          -0.541 (-6.570)          -0.591 (-6.988)
ISLAND_j          -0.075 (-13.266)         -0.169 (-14.176)
EVCOL_ij          -3.640 (-63.628)         -3.556 (-64.032)
COMCOL_ij          1.150 (23.303)           1.165 (24.139)
MUNI_ijt           0.906 (40.433)           1.004 (41.021)
TA_ijt             0.076 (6.889)            0.147 (7.680)
C                -25.138 (-88.646)        -25.117 (-89.537)
σ²_γ              36.660 (3.797)           44.820 (9.517)
σ²_λ               0.247 (3.284)            0.154 (3.479)
σ²_u               6.656 (2.970)            6.732 (4.099)
σ²_α             366.924 (2.277)          349.281 (6.916)
σ²_v             410.488 (3.379)          400.988 (20.578)

*t-statistics are in parentheses.
References
Anderson, J. (2011): “The Gravity Model,” Annual Review of Economics, 3, 133–160.
Baier, S., and J. Bergstrand (2007): “Do Free Trade Agreements Actually Increase
Members’ International Trade?,” Journal of International Economics, 71, 72–95.
Egger, P. (2000): “A Note on the Proper Specification of the Gravity Model,”
Economics Letters, 66, 25–31.
Egger, P. (2002): “An Econometric View on the Estimation of Gravity Models and the
Calculations of Trade Potentials,” The World Economy, 25, 297–312.
Liu, X. (2009): “GATT/WTO Promotes Trade Strongly: Sample Selection and Model
Specification,” Review of International Economics, 17, 428–446.
Rose, A. (2004): “Do We Really Know That the WTO Increases Trade?,” American
Economic Review, 94(1), 98–114.
Serlenga, L., and Y. Shin (2007): “Gravity Models of Intra-EU Trade: Application of
the CCEP-HT Estimation in Heterogeneous Panels with Unobserved Common Time-
Specific Factors,” Journal of Applied Econometrics, 22(2), 361–381.
Silva, S., and S. Tenreyro (2006): “The Log of Gravity,” Review of Economics and
Statistics, 88, 641–658.
Subramanian, A., and S.-J. Wei (2007): “The WTO Promotes Trade, Strongly but
Unevenly,” Journal of International Economics, 72, 151–175.
Wall, H., and I. Cheng (1999): “Controlling for Heterogeneity in Gravity Models of
Trade,” Working Paper 99-010, Federal Reserve Bank of St. Louis.
Wooldridge, J. (2010): Econometric Analysis of Cross Section and Panel Data, 2e.
MIT Press, Cambridge, Massachusetts.
Appendix
The following lemma is useful for the proof of Proposition 1.
and hence, for any β ∈ Θ, define
Similarly,
$$\nu_{\bullet it}(\beta) = X_{\bullet it}(\beta_0 - \beta) + \alpha_{\bullet i} + E_i\gamma_t + D_i\lambda_t + u_{\bullet it}. \tag{12}$$
Note that νi•t (β) is used to denote the residual function which depends on β whereas
vi•t denotes the unobserved random variables. Under Assumptions 1-7, direct calculation
gives:
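$$E\big[\nu_{i\bullet t}(\beta)\nu_{i\bullet t}'(\beta)\big] = E\big[X_{i\bullet t}(\beta_0-\beta)(\beta_0-\beta)'X_{i\bullet t}' + v_{i\bullet t}v_{i\bullet t}' + v_{i\bullet t}(\beta_0-\beta)'X_{i\bullet t}' + X_{i\bullet t}(\beta_0-\beta)v_{i\bullet t}'\big]$$
with
$$
\begin{aligned}
E(v_{i\bullet t}v_{i\bullet t}') &= E\big[\alpha_{i\bullet}\alpha_{i\bullet}' + D_i\gamma_t\gamma_t'D_i' + E_i\lambda_t\lambda_t'E_i' + u_{i\bullet t}u_{i\bullet t}'\big] \\
&= (\sigma_\alpha^2 + \sigma_\lambda^2 + \sigma_u^2)\,I + \sigma_\gamma^2\,ii'.
\end{aligned}
$$
The last line follows from Lemma 1 and Assumptions 1-5. For C1, this implies
$$
\begin{aligned}
\mathrm{vec}\,E\big[\nu_{i\bullet t}(\beta)\nu_{i\bullet t}'(\beta) - \nu_{j\bullet t}(\beta)\nu_{j\bullet t}'(\beta)\big] &= E\big[X_{i\bullet t}\otimes X_{i\bullet t} - X_{j\bullet t}\otimes X_{j\bullet t}\big]\,\mathrm{vec}\big((\beta_0-\beta)(\beta_0-\beta)'\big) \\
&\quad + E\big(v_{i\bullet t}\otimes X_{i\bullet t} - v_{j\bullet t}\otimes X_{j\bullet t}\big)\,\mathrm{vec}(\beta_0-\beta) \\
&\quad + E\big(X_{i\bullet t}\otimes v_{i\bullet t} - X_{j\bullet t}\otimes v_{j\bullet t}\big)\,\mathrm{vec}(\beta_0-\beta) \\
&= E\big[X_{i\bullet t}\otimes X_{i\bullet t} - X_{j\bullet t}\otimes X_{j\bullet t}\big]\,\mathrm{vec}\big((\beta_0-\beta)(\beta_0-\beta)'\big).
\end{aligned}
$$
The last line follows from Assumption 7 and the fact that
$E(v_{i\bullet t}v_{i\bullet t}') = (\sigma_\alpha^2 + \sigma_\lambda^2 + \sigma_u^2)I + \sigma_\gamma^2 ii'$
for all $i = 1, \ldots, N$. Under Assumption 6,
$E(X_{i\bullet t}\otimes X_{i\bullet t} - X_{j\bullet t}\otimes X_{j\bullet t})$
exists and has full rank, and thus the left-hand side is zero if and only if
$\beta = \beta_0$. The same arguments apply to C2. For C3, first note that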
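$$
\begin{aligned}
E\big(\nu_{i\bullet t}(\beta)\nu_{i\bullet t-1}'(\beta)\big) &= E\big[X_{i\bullet t}(\beta_0-\beta)(\beta_0-\beta)'X_{i\bullet t-1}'\big] + E\big[X_{i\bullet t}(\beta_0-\beta)v_{i\bullet t-1}'\big] \\
&\quad + E\big[v_{i\bullet t}v_{i\bullet t-1}'\big] + E\big[v_{i\bullet t}(\beta_0-\beta)'X_{i\bullet t-1}'\big].
\end{aligned}
$$
Under Assumptions 1-5, it is straightforward to show that
$E(v_{i\bullet t}v_{i\bullet t-1}') = \sigma_\alpha^2 I$ and hence
$$\mathrm{vec}\,E\big[\nu_{i\bullet t}(\beta)\nu_{i\bullet t-1}'(\beta) - \nu_{j\bullet t}(\beta)\nu_{j\bullet t-1}'(\beta)\big] = E\big(X_{i\bullet t-1}\otimes X_{i\bullet t} - X_{j\bullet t-1}\otimes X_{j\bullet t}\big)\,\mathrm{vec}\big((\beta_0-\beta)(\beta_0-\beta)'\big)$$
under Assumption 6. Since
$E(X_{i\bullet t-1}\otimes X_{i\bullet t} - X_{j\bullet t-1}\otimes X_{j\bullet t})$
has full rank, the left-hand side is zero if and only if $\beta = \beta_0$. The same
arguments apply to C4. For C5, note that
$$
\begin{aligned}
E\big[\nu_{i\bullet t}(\beta)\nu_{j\bullet t-1}'(\beta)\big] &= E\big[X_{i\bullet t}(\beta_0-\beta)(\beta_0-\beta)'X_{j\bullet t-1}'\big] + E\big[v_{i\bullet t}(\beta_0-\beta)'X_{j\bullet t-1}'\big] \\
&\quad + E\big[X_{i\bullet t}(\beta_0-\beta)v_{j\bullet t-1}'\big] + E\big[v_{i\bullet t}v_{j\bullet t-1}'\big] \\
&= E\big[X_{i\bullet t}(\beta_0-\beta)(\beta_0-\beta)'X_{j\bullet t-1}'\big].
\end{aligned}
$$
The last line follows from the fact that all terms on the right-hand side except the
first are identically equal to zero under Assumptions 1-5 and 7. This implies:
$$\mathrm{vec}\,E\big[\nu_{i\bullet t}(\beta)\nu_{j\bullet t-1}'(\beta)\big] = E\big[X_{j\bullet t-1}\otimes X_{i\bullet t}\big]\,\mathrm{vec}\big((\beta_0-\beta)(\beta_0-\beta)'\big) = 0$$
if and only if $\beta = \beta_0$. The same arguments apply to C6. This completes the proof.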
Proof of Proposition 2. Under Assumptions 1-7 and the result as presented in
Proposition 1, Proposition 2 is a straightforward application of Theorem 3.1 in Hansen
(1982). This completes the proof.
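The covariance structure $(\sigma_\alpha^2 + \sigma_\lambda^2 + \sigma_u^2)I + \sigma_\gamma^2 ii'$ used in the proof of Proposition 1 can be checked by simulation. The sketch below draws the four error components independently with zero means, which is how Assumptions 1-5 are read here; the variances are hypothetical, and normality is chosen only for convenience.

```python
import numpy as np

rng = np.random.default_rng(0)
n_minus_1, draws = 3, 200_000
var_alpha, var_gamma, var_lambda, var_u = 1.0, 0.5, 0.3, 0.2   # hypothetical values

alpha = rng.normal(scale=np.sqrt(var_alpha), size=(draws, n_minus_1))
gamma = rng.normal(scale=np.sqrt(var_gamma), size=(draws, 1))   # one gamma_it per draw
lam = rng.normal(scale=np.sqrt(var_lambda), size=(draws, n_minus_1))
u = rng.normal(scale=np.sqrt(var_u), size=(draws, n_minus_1))

# v_{i.t}: the exporter-time effect is common to all N-1 partners (broadcast).
v = alpha + gamma + lam + u

sample = np.cov(v, rowvar=False)
theory = (var_alpha + var_lambda + var_u) * np.eye(n_minus_1) \
    + var_gamma * np.ones((n_minus_1, n_minus_1))
print(np.round(sample, 2))
print(theory)
```

The common exporter-time effect is what produces the non-zero off-diagonal elements; the other three components contribute only to the diagonal.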