
Journal of Econometrics 96 (2000) 1-23

A simple framework for nonparametric specification testing
Glenn Ellison*, Sara Fisher Ellison
MIT Department of Economics, 50 Memorial Drive, Cambridge, MA 02139, USA
Received 1 November 1993; received in revised form 1 April 1999

Abstract
This paper presents a simple framework for testing the specification of parametric conditional means. The test statistics are based on quadratic forms in the residuals of the null model. Under general assumptions the test statistics are asymptotically normal under the null. With an appropriate choice of the weight matrix, the tests are shown to be consistent and to have good local power. Specific implementations involving matrices of bin and kernel weights are discussed. Finite sample properties are explored in simulations.
© 2000 Elsevier Science S.A. All rights reserved.
JEL classification: C14; C12
Keywords: Consistent testing; Specification testing; Nonparametric; Quadratic form

1. Introduction
Specification testing has become commonplace in econometrics, both as a means of testing economic theories which predict specific functional forms and as a regression diagnostic. Early specification tests,1 although useful in many settings, are not consistent, i.e., there are alternatives which they fail to detect regardless of the amount of data available. Partly in response to this
*Corresponding author.
1 See, for example, Ramsey (1969), Hausman (1978), Davidson and MacKinnon (1981),
(1985), Tauchen (1985), and White (1987).

0304-4076/00/$ - see front matter © 2000 Elsevier Science S.A. All rights reserved.
PII: S0304-4076(99)00048-2



G. Ellison, S.F. Ellison / Journal of Econometrics 96 (2000) 1-23

concern a large recent literature has examined the behavior of specification tests which exploit nonparametric techniques.2 The literature considers a variety of techniques including series estimation, spline estimation, and kernel estimation to test a null (parametric) model, with some of the tests having been shown to be consistent against all alternatives.
Our approach to testing a null model $y_i = f(x_i; \alpha) + u_i$ employs test statistics based on quadratic forms in the model's residuals, $\sum_{ij} w_{ij} \tilde{u}_i \tilde{u}_j$. One intuition is straightforward: quadratic forms can detect a spatial correlation in the residuals which would result from a functional form misspecification. We provide general conditions sufficient to ensure that the test statistics will be asymptotically normally distributed under the null and will be consistent, and we explore the finite sample performance of specific implementations of the test via Monte Carlo simulations.
Much of the previous literature on nonparametric specification testing has been motivated as testing the orthogonality between a model's residuals and an alternative nonparametric model.3 Our testing framework can also be seen in this light. Consider a nonparametric estimator $\hat{y} = Wy$, e.g., kernel, spline, series, or other linear smoother. A Davidson-MacKinnon style test of orthogonality with $\hat{y}$ as the misspecification indicator would be of the form:
$$
\begin{aligned}
T &= \frac{1}{c_N} \sum_i \hat{y}_i \tilde{u}_i \\
  &= \frac{1}{c_N} \sum_i \Big( \sum_j w_{ij} (x_j \tilde{\beta} + \tilde{u}_j) \Big) \tilde{u}_i \\
  &= \frac{1}{c_N} \sum_{ij} w_{ij} \tilde{u}_i \tilde{u}_j + \frac{1}{c_N} \sum_i \tilde{u}_i \Big( \sum_j w_{ij} x_j \tilde{\beta} \Big).
\end{aligned}
$$
The first term is a quadratic form in the residuals. The second measures the orthogonality between the residuals and something that is of the form of an estimate of X and should be small. Hence, we can think of a quadratic form test with a weight matrix W as similar to an orthogonality test with Wy as the misspecification indicator.4

2 Work in this vein includes Azzalini et al. (1989), Bierens (1982, 1984, 1990), Bierens and Ploberger (1997), de Jong and Bierens (1994), Delgado and Stengos (1994), Eubank and Spiegelman (1990), Fan and Li (1996), Gozalo (1993), Härdle and Mammen (1993), Hidalgo (1992), Hong and White (1995), Horowitz and Härdle (1994), Rodriguez and Stoker (1993), White and Hong (1993), Wooldridge (1992), Yatchew (1992), and Zheng (1996).
3 This, for example, is the approach of Hong and White (1995), Eubank and Spiegelman (1990) and Wooldridge (1992).
4 A previous version of this paper showed the equivalence between a quadratic form test and an orthogonality test in some (but only some) situations.


We hope that our approach may be seen as useful for a few reasons. The construction is general and transparent, allowing researchers to base tests on a variety of nonparametric estimation techniques and to easily tailor tests to detect various types of misspecification, if desired. The tests have good local power and in simulations appear to have reliable size in small samples. The framework is also well-suited to the application of standard binning techniques and thus allows for tests which are undemanding computationally.

The remainder of the paper is structured as follows. Section 2 introduces the class of quadratic form test statistics with which we shall be concerned and establishes their asymptotic normality with correct specification in a fairly general environment. Several specific implementations are then discussed, including one based on a kernel regression estimator that is similar to the test which was independently proposed by Zheng (1996). Potential finite sample corrections are also discussed. Section 3 contains a fairly general theorem establishing the consistency of the tests. With an appropriate kernel implementation the local power of the tests is equal to that of the best of the prior and contemporaneously proposed tests, e.g. Eubank and Spiegelman (1990) and Hong and White (1995), and is superior to that of many other approaches.5 Section 4 presents the results of a set of Monte Carlo simulations which examine the power of our tests and the reliability of the asymptotic critical values in finite samples. In a comparison with several other tests, our tests (with the suggested finite sample corrections) appear to have an advantage in the finite-sample accuracy of asymptotic critical values, and perform fairly well also in terms of power.

2. The test statistic: Definition and asymptotic distribution
We first introduce a general class of specification tests for a nonlinear regression model and prove that the statistics are asymptotically normal under the null. We then discuss several special cases in more detail to illustrate the utility of the framework.
2.1. General definition and asymptotic normality
We will be concerned with testing the specification of a nonlinear regression model of the form $E(y_i \mid X) = f(x_i; \alpha)$. We explore procedures consisting of two steps. The null model is first estimated by some $\sqrt{N}$-consistent procedure
5 The one test we are aware of which has slightly better local power is that of Bierens and Ploberger (1997) which is consistent not only against local alternatives which shrink at a rate slower than $N^{-1/2}$, but also against local alternatives which shrink at a rate of exactly $N^{-1/2}$.


producing residuals $\tilde{u}$. Test statistics based on a quadratic form $\tilde{u}' W \tilde{u}$ are then formed. Large positive values of the test statistic indicate misspecification.
We begin with a very general proposition which defines a class of test statistics $T_N$ and gives their asymptotic distribution. Given conditions on the eigenvalues of the weight matrix, the quadratic form test statistics are asymptotically normal. We would like to emphasize that Proposition 1 is applicable not only to consistent tests, but also to quadratic form tests tailored to detect particular forms of misspecification.
To state the proposition, we need a few definitions. The matrix A is said to be nonnegative if each of its elements is nonnegative. For an N×N matrix A, we write r(A) for the spectral radius of A defined by
$$
r(A) \equiv \sup_{v \in R^N,\ v \neq 0} \frac{\|Av\|}{\|v\|}.
$$
When A is symmetric with eigenvalues $|\gamma_1| \ge |\gamma_2| \ge \cdots \ge |\gamma_N|$, it is well known that $r(A) = |\gamma_1|$. Define
$$
s(A) \equiv \Big( \sum_{i,j} a_{ij}^2 \Big)^{1/2}.
$$
When A is symmetric, it is easy to see that $s(A) = (\sum_i \gamma_i^2)^{1/2}$.
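The two matrix norms just defined can be computed directly. The following is a minimal sketch in Python with NumPy (the language, the function names `spectral_radius` and `frobenius_s`, and the 3×3 example matrix are illustrative assumptions, not part of the paper); it also checks the two eigenvalue identities stated for symmetric matrices.

```python
import numpy as np

def spectral_radius(A):
    # r(A) = sup ||Av|| / ||v||: the induced 2-norm (largest singular value).
    return float(np.linalg.norm(A, ord=2))

def frobenius_s(A):
    # s(A) = (sum_ij a_ij^2)^(1/2).
    return float(np.sqrt(np.sum(np.square(A))))

# Hypothetical symmetric weight matrix with zero diagonal.
A = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

gamma = np.linalg.eigvalsh(A)            # eigenvalues of the symmetric matrix
r, s = spectral_radius(A), frobenius_s(A)
ratio = r / s                            # the quantity r(A)/s(A) used in Proposition 1
print(r, s, ratio)
```

For a symmetric matrix, `r` should coincide with the largest absolute eigenvalue and `s` with the root of the sum of squared eigenvalues, which is what the definitions above assert.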

Proposition 1. Suppose $y_i = f(x_i; \alpha_0) + u_i$, where $\{x_i\}$ is a sequence of i.i.d. random variables having compact support $D \subset R^d$, and $\{u_i\}$ is a sequence of independent random variables with $u_i$ independent of $x_j$ for $j \neq i$, $E(u_i \mid x_i) = 0$, $0 < \underline{\sigma}^2 \le \mathrm{Var}(u_i \mid x_i) \le \bar{\sigma}^2 < \infty$, and $E(u_i^4 \mid x_i) \le m < \infty$ for all $i$ and $x_i$. Assume also that $f: D \times R^l \to R$ is twice continuously differentiable. Let $\tilde{\alpha}_N$ be a $\sqrt{N}$-consistent estimate for $\alpha_0$. Define $\tilde{u}_{iN} \equiv y_i - f(x_i; \tilde{\alpha}_N)$. Write $\tilde{u}^N$ for the N-vector $(\tilde{u}_{1N}, \ldots, \tilde{u}_{NN})$, and $\tilde{U}^N$ for the N×N diagonal matrix with iith element $\tilde{u}_{iN}$. Let $W_N: D^N \to R^{N^2}$ be a function associating a symmetric N×N matrix to each realization of $(x_1 \ldots x_N)$. Suppose that $w_{ii} = 0$ for $i = 1, 2, \ldots, N$, $r(W_N)/s(W_N) \xrightarrow{p} 0$ as $N \to \infty$, and that $FSC_N \xrightarrow{p} 0$ as $N \to \infty$. Let
$$
T_N = \frac{\tilde{u}^{N\prime} W_N \tilde{u}^N}{\sqrt{2}\, s(\tilde{U}^N W_N \tilde{U}^N)} + FSC_N .
$$
Then, $T_N \xrightarrow{L} Z \sim N(0,1)$.6
6 We take this opportunity to explain a few of the notational liberties we will take. First, we will normally write $W_N$ for the matrix $W_N(x_1, \ldots, x_N)$. Note that $W_N$ is stochastic when $\{x_i\}$ is stochastic. Second, we will often drop the N subscript or superscript when no confusion will arise. For example, the elements of the matrix $W_N$ will be written $w_{ij}$, and the vector $u^N$ will often be simply u. Finally, to refer to the matrix of explanatory variables in our models, we will sometimes use the notation $x^N$ and sometimes X, depending on the context.


Remark 1. Proposition 1 shows that $T_N$ converges to a standard normal when $r(W_N)/s(W_N) \xrightarrow{p} 0$. We will see in Sections 2.3 and 2.4 that this condition is automatically satisfied given appropriate regularity conditions when W is a matrix of bin or kernel weights. For any other sequence of weight matrices, one can always try to apply Proposition 1 directly. In any case, it may be useful to compute the ratio $r(W_N)/s(W_N)$ and see how close it is to zero to get a rough idea of how far the statistics may be from normal in the given finite sample. In simulations described in Ellison and Ellison (1998), we find the true size of a test with 5% asymptotic critical values to be between 4% and 6% in each of the specifications we tried for which $r(W_N)/s(W_N)$ was less than 0.4.
2. The term $FSC_N$ in the test statistic is a finite-sample correction. The most straightforward application of Proposition 1 would be to the development of a consistent test for misspecification of a linear regression model (with a constant). For such an application with X the matrix of nonconstant explanatory variables and with $W_N$ being a weight matrix such that $W_N y$ is a consistent estimator of f (such as a matrix of kernel weights) we recommend and use in our Monte Carlo study the correction
$$
FSC_N = \frac{1 + \mathrm{rank}(X)}{\sqrt{2}\, s(W_N)} .
$$
While the use of nonparametric specification tests is often motivated by a desire for consistency, at other times an applied econometrician might want to use a nonparametric test designed to detect specific forms of misspecification. For example, one might be particularly interested in nonlinearity in one variable or in the presence of an omitted variable.7 Our more general recommendation for such cases would be to use the correction
$$
FSC_N = \frac{\sum_{k=0}^{d} \hat{\beta}_k}{\sqrt{2}\, s(W_N)} ,
$$
where $\hat{\beta}_k$ is the coefficient on $X_{\cdot k}$ (the kth explanatory variable in the null model) in a regression of $W_N X_{\cdot k}$ on X (and a constant) and $\hat{\beta}_0$ is the constant term from a regression of $W_N 1_N$ on X. We provide motivation for our suggested correction in Section 2.2.
The proof of Proposition 1, as well as all other proofs, is in the Appendix.
Intuitively, the reason why a quadratic form in the residuals is asymptotically normal is that any symmetric matrix A can be written as $A = \Phi' \Lambda \Phi$ with

7 Chevalier and Ellison (1997), for example, use a quadratic form test to investigate whether
a regression is nonlinear in one of several independent variables.


$\Lambda$ diagonal and $\Phi$ the orthogonal matrix of eigenvectors of A. If u is a vector of independent random variables we then have
$$
u' A u = u' \Phi' \Lambda \Phi u = v' \Lambda v = \sum_i \lambda_i v_i^2 ,
$$
where $v = \Phi u$ is a vector of uncorrelated random variables. We thus have that $u' W u$ is a weighted sum of the squares of a set of uncorrelated random variables, and this is asymptotically normal provided the square of the largest weight (which is equal to $r(A)^2$) becomes arbitrarily small compared to the sum of the squares of the weights (which is equal to $s(A)^2$).
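The decomposition in this argument is easy to confirm numerically. The sketch below (Python with NumPy; the random matrix, its size, and the seed are arbitrary illustrative choices) builds a symmetric matrix with zero diagonal, diagonalizes it, and compares the quadratic form computed directly with the weighted sum of squares.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6

# Arbitrary symmetric matrix with zero diagonal, standing in for a weight matrix.
B = rng.normal(size=(N, N))
A = (B + B.T) / 2.0
np.fill_diagonal(A, 0.0)

# eigh returns A = Q diag(lam) Q'; setting Phi = Q' gives A = Phi' Lambda Phi.
lam, Q = np.linalg.eigh(A)
Phi = Q.T

u = rng.normal(size=N)      # independent draws playing the role of the errors
v = Phi @ u                 # rotated vector v = Phi u

quad_direct = u @ A @ u                 # u'Au
quad_spectral = np.sum(lam * v ** 2)    # sum_i lambda_i v_i^2
print(quad_direct, quad_spectral)
```

The largest weight in absolute value is `np.max(np.abs(lam))`, and the root of the sum of squared weights is the Frobenius norm of `A`, so the ratio of the two is what must become small for the normal approximation to apply.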
2.2. Motivation for a finite-sample correction
The form of our suggested finite-sample correction, $FSC_N$, is motivated by an analysis of the finite sample mean of the numerator of the test statistic in the simplest case, the parametric null being a linear regression with homoskedastic errors estimated by OLS. In this case we have
$$
\begin{aligned}
E(\tilde{u}' W \tilde{u}) &= E(u'(I - P_X) W (I - P_X) u) \\
&= E(u' W u) - E(u' P_X W u) - E(u' W P_X u) + E(u' P_X W P_X u) \\
&= -\sigma^2\, \mathrm{Tr}(P_X W) = -\sigma^2\, \mathrm{Tr}((X'X)^{-1} X' W X).
\end{aligned}
$$
(The last line follows from repeatedly applying the identities $E(u' A u) = \sigma^2\, \mathrm{Tr}(A)$ and $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$.)
Writing $\hat{X}$ for $WX$, the kth diagonal element of the matrix $(X'X)^{-1} X' W X$ is simply the coefficient on $X_{\cdot k}$ in a regression of $\hat{X}_{\cdot k}$ on X. This motivates our more general suggested finite sample correction. When W is the weight matrix corresponding to a consistent estimator, $\hat{X}_{\cdot k}$ will approach $X_{\cdot k}$ as $N \to \infty$ (under appropriate conditions), and hence each of these regression coefficients should approach one. The simpler finite sample correction we recommend for such W is motivated by there being 1 + rank(X) such regression coefficients when X is augmented by a column of ones.
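As an illustration of the two corrections and of the trace identity behind them, here is a hedged sketch in Python with NumPy. The function names `fsc_simple` and `fsc_general`, the triangular kernel, the bandwidth 0.3, and the simulated regressor are all assumptions made for the example; only the formulas come from the text.

```python
import numpy as np

def fsc_simple(X, W):
    # (1 + rank(X)) / (sqrt(2) s(W)), with X the nonconstant regressors.
    X = np.asarray(X, float).reshape(len(X), -1)
    return (1 + np.linalg.matrix_rank(X)) / (np.sqrt(2.0) * np.sqrt(np.sum(W ** 2)))

def fsc_general(X, W):
    # sum_k beta_hat_k / (sqrt(2) s(W)), k = 0 being the constant.
    X = np.asarray(X, float).reshape(len(X), -1)
    Xc = np.column_stack([np.ones(len(X)), X])
    # Column k of B holds the coefficients from regressing W @ Xc[:, k] on Xc,
    # so trace(B) = Tr((Xc'Xc)^{-1} Xc' W Xc) is the sum of the beta_hat_k.
    B = np.linalg.lstsq(Xc, W @ Xc, rcond=None)[0]
    return np.trace(B) / (np.sqrt(2.0) * np.sqrt(np.sum(W ** 2)))

rng = np.random.default_rng(1)
N = 50
x = rng.uniform(size=N)

# Illustrative weights: row-normalised triangular kernel with zero diagonal.
K = np.maximum(0.0, 1.0 - np.abs(x[:, None] - x[None, :]) / 0.3)
np.fill_diagonal(K, 0.0)
W = K / K.sum(axis=1, keepdims=True)

Xc = np.column_stack([np.ones(N), x])
P = Xc @ np.linalg.solve(Xc.T @ Xc, Xc.T)     # projection onto the regressors
M = np.eye(N) - P

# Under homoskedasticity E(u'Au) = sigma^2 Tr(A), so the displayed derivation
# reduces to Tr((I-P) W (I-P)) = -Tr(P W) when W has a zero diagonal.
trace_lhs = np.trace(M @ W @ M)
trace_rhs = -np.trace(P @ W)
print(trace_lhs, trace_rhs, fsc_simple(x, W), fsc_general(x, W))
```

When the smoother reproduces the regressors well, each regression coefficient is near one and the two corrections should be close to each other.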
2.3. A special case: The kernel test
The proposition of the previous section identifies the asymptotic distribution of a broad class of test statistics. Recall that one motivation for looking at a statistic of the form $\tilde{u}' W \tilde{u}$ is that it is similar to a test of orthogonality between $\tilde{u}$ and the nonparametric estimate $\hat{y} = Wy$. A typical application of our framework suggested by this motivation is to take the matrix W to be the weight matrix from a kernel regression of y on X.


An easy application of Proposition 1 shows that a test statistic formed from
kernel weights is asymptotically normal given very minimal restrictions on the
kernel and the rate at which the window width shrinks to zero.
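Corollary 1. Let y, f, $\{x_i\}$, $\{u_i\}$, $\tilde{\alpha}$, $\tilde{u}$, $\tilde{U}$, D be as in Proposition 1. Suppose also that the distribution of $x_i$ has a twice continuously differentiable density $p(x) \ge \underline{p} > 0$ on D.
Let the kernel K(x) be a nonnegative function satisfying $\int_{R^d} K(x)\,dx = 1$ and $\int_{R^d} K(x)^2\,dx < \infty$, and define $W_N$ by
$$
w_{ijN} =
\begin{cases}
\dfrac{K\big((1/h_N)(x_i - x_j)\big)}{\sum_{k \neq i} K\big((1/h_N)(x_i - x_k)\big)} & \text{if } j \neq i \text{ and } \sum_{k \neq i} K\big((1/h_N)(x_i - x_k)\big) > 0, \\
0 & \text{otherwise.}
\end{cases}
$$
Let $W^s_N = (W_N + W_N')/2$. Suppose that for some $\delta > 0$, $N^{\delta} h_N \to 0$ and $N h_N^{d} h_N^{-\delta} \to \infty$. Then
$$
T_N = \frac{\tilde{u}' W_N \tilde{u}}{\sqrt{2}\, s(\tilde{U}^N W^s_N \tilde{U}^N)} + \frac{1 + d}{\sqrt{2}\, s(W^s_N)} \xrightarrow{L} Z \sim N(0,1).
$$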
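The kernel statistic of Corollary 1 can be sketched in a few lines. The Python/NumPy code below is an illustration under stated assumptions, not the authors' implementation: the product Epanechnikov kernel, the helper names (`kernel_weights`, `kernel_test_stat`, `ols_residuals`), the bandwidth, and the simulated data are all hypothetical choices. The null model is taken to be linear and estimated by OLS.

```python
import numpy as np

def kernel_weights(X, h):
    # Leave-one-out kernel weights with each row normalised to sum to one.
    X = np.asarray(X, float).reshape(len(X), -1)
    z = (X[:, None, :] - X[None, :, :]) / h
    # Product Epanechnikov kernel: nonnegative and integrates to one.
    K = np.prod(np.where(np.abs(z) <= 1.0, 0.75 * (1.0 - z ** 2), 0.0), axis=2)
    np.fill_diagonal(K, 0.0)                       # w_ii = 0
    row = K.sum(axis=1, keepdims=True)
    return np.divide(K, row, out=np.zeros_like(K), where=row > 0)

def kernel_test_stat(resid, X, h):
    X = np.asarray(X, float).reshape(len(X), -1)
    d = X.shape[1]
    W = kernel_weights(X, h)
    Ws = 0.5 * (W + W.T)                           # symmetrised weights W^s
    UWU = resid[:, None] * Ws * resid[None, :]     # U W^s U for diagonal U
    fsc = (1 + d) / (np.sqrt(2.0) * np.sqrt(np.sum(Ws ** 2)))
    return (resid @ W @ resid) / (np.sqrt(2.0) * np.sqrt(np.sum(UWU ** 2))) + fsc

def ols_residuals(y, X):
    Xc = np.column_stack([np.ones(len(y)), X])
    return y - Xc @ np.linalg.lstsq(Xc, y, rcond=None)[0]

rng = np.random.default_rng(0)
N, h = 200, 0.5
x = rng.uniform(-2.0, 2.0, size=N)
y_null = 1.0 + x + rng.normal(scale=0.5, size=N)   # linear model is correct
y_alt = x ** 2 + rng.normal(scale=0.5, size=N)     # linear model is misspecified

t_null = kernel_test_stat(ols_residuals(y_null, x), x, h)
t_alt = kernel_test_stat(ols_residuals(y_alt, x), x, h)
print(t_null, t_alt)
```

Because large positive values indicate misspecification, the statistic would be compared with an upper-tail standard normal critical value (1.645 at the 5% level). The dense N×N construction here is chosen for clarity rather than speed.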

Remark 1. The test described in Corollary 1 is quite similar to the test which was independently proposed in Zheng (1996). The principal difference is that the density weighting of the various terms in the quadratic form differs because we have made each row of the weight matrix sum to one rather than using raw kernel weights.
2. Some assumptions have been made purely for convenience. For example, the assumption that the kernel is nonnegative is used only because it makes it easy to conclude that $\sum_{ij} w_{ij} w_{ji} \ge 0$ and that $r(W_N)$ remains bounded as $N \to \infty$. If one wanted to use a kernel which was sometimes negative, a variety of other assumptions could be added to obtain these conclusions.8
3. The calculations in the proof illustrate our earlier comment that nonparametric test statistics may converge to their asymptotic distributions very slowly. In the proof we note that $s(W_N) = O_p(h^{-d/2})$. This implies that our recommended finite-sample correction only tends to zero like $h^{d/2}$. If, for example, one is testing a model with one explanatory variable and chooses $h_N = h_0 N^{-1/5}$, our finite sample correction term will only vanish at the rate of
8 Without the assumption that the kernel is nonnegative, an additional assumption, e.g. $\int_{R^d} |K(x)|\,dx < \infty$, would also be needed to ensure that the kernel density estimates used in the proof are consistent.


$N^{-1/10}$.9 Finite sample corrections may thus be important, even when hundreds of thousands of observations are available.
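To give a sense of how slow this rate is, the short Python sketch below evaluates $N^{-1/10}$ (the rate implied by $h^{d/2}$ with $d = 1$ and $h_N$ proportional to $N^{-1/5}$) at a few sample sizes. The sample sizes are arbitrary illustrative choices, and the constant $h_0$ is ignored.

```python
# Rate at which the finite-sample correction shrinks when d = 1 and
# h_N is proportional to N^(-1/5): h^(d/2) is proportional to N^(-1/10).
sample_sizes = [100, 10_000, 1_000_000]
rates = {N: N ** (-1.0 / 10.0) for N in sample_sizes}

for N, rate in rates.items():
    print(f"N = {N:>9,d}: N^(-1/10) = {rate:.3f}")
```

Going from one hundred to one million observations reduces the factor only from roughly 0.63 to roughly 0.25.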
2.4. 'Binning' and other tests
The testing framework can accommodate a wide variety of weight matrices. For example, W could be the weight matrix corresponding to any smoother of the form $W_N y$, such as k-nearest neighbor estimators, splines, orthogonal series estimators, and convolution smoothing.10 If one suspects nonlinearity in one of several X variables, one could choose weights which depend only on differences in that one variable. Such a test would not be consistent of course (misspecifications in other X variables could go undetected), but it might have better power in detecting that particular form of misspecification.
One simple alternate implementation is a 'bin' version of the test.11 It may be obtained by dividing the data into m(N) bins and setting all nondiagonal weights equal to each other inside the bins and equal to zero outside the bins. In contrast to a kernel test statistic, which requires $O(N^2 h)$ computations, a bin test statistic requires O(N) computations. An O(N) computation of the test statistic and general conditions sufficient to ensure its asymptotic normality are described by the following corollary.
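Corollary 2. Let y, f, $\{x_i\}$, $\{u_i\}$, $\tilde{\alpha}$, $\tilde{u}$, D be as in Proposition 1. Suppose also that each $x_i$ is drawn from a distribution with measure $\nu$ on D.
Consider a sequence of partitions $\{P_{kN}\}$ with $D = P_{1N} \cup P_{2N} \cup \cdots \cup P_{m(N)N}$, $P_{kN} \cap P_{jN} = \emptyset$ for $k \neq j$, $m(N) \to \infty$, and $N \inf_k \nu(P_{kN}) \to \infty$. Write $C_{kN}$ for the random variable giving the number of elements of $\{x_1, \ldots, x_N\}$ which lie in $P_{kN}$, $S_N$ for $\big(\sum_{k\ \mathrm{s.t.}\ C_{kN} \ge 2} C_{kN}/(C_{kN} - 1)\big)^{1/2}$, and $V_{kN}(n)$ for $\sum_{i\ \mathrm{s.t.}\ x_i \in P_{kN}} \tilde{u}_{iN}^{\,n}$. Define
$$
T_N = \frac{\displaystyle \sum_{k\ \mathrm{s.t.}\ C_{kN} \ge 2} \frac{V_{kN}(1)^2 - V_{kN}(2)}{C_{kN} - 1}}{\displaystyle \Big( 2 \sum_{k\ \mathrm{s.t.}\ C_{kN} \ge 2} \frac{V_{kN}(2)^2 - V_{kN}(4)}{(C_{kN} - 1)^2} \Big)^{1/2}} + \frac{1 + d}{\sqrt{2}\, S_N} .
$$
Then $T_N \xrightarrow{L} Z \sim N(0,1)$.12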
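The bin statistic needs only per-bin sums of powers of the residuals. The following Python/NumPy sketch is an illustration under assumptions: the function names, the integer bin labels, and the simulated inputs are hypothetical. It computes the statistic from those sums and compares it with a dense quadratic-form version in which every off-diagonal weight within a bin equals $1/(C_{kN} - 1)$.

```python
import numpy as np

def bin_test_stat(resid, bin_ids, d=1):
    # bin_ids: nonnegative integer bin label for each observation.
    resid = np.asarray(resid, float)
    bin_ids = np.asarray(bin_ids)
    C = np.bincount(bin_ids).astype(float)           # bin counts C_k
    V1 = np.bincount(bin_ids, weights=resid)         # V_k(1)
    V2 = np.bincount(bin_ids, weights=resid ** 2)    # V_k(2)
    V4 = np.bincount(bin_ids, weights=resid ** 4)    # V_k(4)
    keep = C >= 2                                    # bins with at least two points
    C, V1, V2, V4 = C[keep], V1[keep], V2[keep], V4[keep]
    num = np.sum((V1 ** 2 - V2) / (C - 1.0))
    den = np.sqrt(2.0 * np.sum((V2 ** 2 - V4) / (C - 1.0) ** 2))
    S = np.sqrt(np.sum(C / (C - 1.0)))
    return num / den + (1 + d) / (np.sqrt(2.0) * S)

def bin_test_stat_dense(resid, bin_ids, d=1):
    # Same statistic via the explicit N x N weight matrix (for checking only).
    resid = np.asarray(resid, float)
    bin_ids = np.asarray(bin_ids)
    same = bin_ids[:, None] == bin_ids[None, :]
    np.fill_diagonal(same, False)
    others = same.sum(axis=1, keepdims=True)         # C_k - 1 for each row
    W = np.divide(same.astype(float), others,
                  out=np.zeros(same.shape), where=others > 0)
    UWU = resid[:, None] * W * resid[None, :]
    return ((resid @ W @ resid) / (np.sqrt(2.0) * np.sqrt(np.sum(UWU ** 2)))
            + (1 + d) / (np.sqrt(2.0) * np.sqrt(np.sum(W ** 2))))

rng = np.random.default_rng(2)
N, m = 120, 8
x = rng.uniform(size=N)
bin_ids = np.minimum((x * m).astype(int), m - 1)     # m equal-width bins on [0, 1]
resid = rng.normal(size=N)                           # stand-in for model residuals

t_fast = bin_test_stat(resid, bin_ids)
t_dense = bin_test_stat_dense(resid, bin_ids)
print(t_fast, t_dense)
```

The first function makes a fixed number of passes over the data, which is the source of the O(N) cost; the dense version exists only to confirm that the two forms agree.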
9 Local power calculations presented later will suggest using window widths which shrink even more slowly.
10 See Härdle (1990) for a discussion of the weight matrices corresponding to the estimators mentioned above and others.
11 See Tukey (1961) and Loftsgaarden and Quesenberry (1965) for early discussions of computationally simpler smoothing estimators, such as the regressogram. These estimators would result in a weight matrix similar to that which we suggest for the bin test.
12 If one wishes to use bins which are based on finer and finer divisions of only one, or more generally z, of the X variables, the (1 + d) term in the finite sample correction could be replaced by (1 + z).


3. Consistency and local power
An important motivation for nonparametric specification testing is that parametric tests will fail to detect departures from the null in certain directions. In this section we verify that, given fairly general conditions on the choice of a weight matrix, the tests described in the previous section are indeed consistent.
As is standard, we consider whether the test can detect alternatives $g_N(x)$ which approach $f(x; \alpha_0)$ as $N \to \infty$, e.g., $g_N(x) = f(x; \alpha_0) + N^{-m} e(x)$ with $m \ge 0$ and $e(x)$ orthogonal to the space of null functions f. The proposition shows that if the alternatives do not approach the null too quickly, i.e., if $m < \bar{m}$, then the test will detect the alternative with probability one. Only two fairly weak conditions on the weight matrix are required: that the eigenvalues satisfy $r(W_N) = 1$ and $s(W_N) \xrightarrow{p} \infty$, and that the nonparametric estimator $\hat{e}_N(x) \equiv W_N e(x)$ have a mean squared error smaller than the function being estimated. These conditions are satisfied for bin and kernel weight matrices, among others, if $e(x)$ is piecewise continuous.
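Proposition 2. Suppose $y_i = g_N(x_i) + u_i$ with $\{x_i\}$, $\{u_i\}$, $\tilde{u}$, $\tilde{U}$, D as in Corollary 1, and $g_N: D \to R$ a sequence of functions. Write X for the matrix $(x_1 \ldots x_N)$. Let $\tilde{\alpha}(y, X)$ be an estimator for which there exists a sequence $\alpha^*_N$ such that $\sqrt{N}(\tilde{\alpha}(y, X) - \alpha^*_N) \xrightarrow{L} Z \sim N(0, \Omega)$ and $\alpha^*_N \to \alpha_0$ as $N \to \infty$ for some $\alpha_0$. Let $W_N(X)$ be a sequence of matrices with $w_{ii} = 0\ \forall i$, $r(W_N) \xrightarrow{p} 1$, and $1/s(W^s_N) \xrightarrow{p} 0$ as $N \to \infty$. Suppose $FSC_N \xrightarrow{p} 0$ as $N \to \infty$. Let $\tilde{f}$ be the vector whose ith element is $f(x_i; \tilde{\alpha})$ and similarly for other functions of $x_i$.
Let $\bar{m}$ be defined by $\bar{m} \equiv \sup\{ m \mid N^{2m-1} s(\Sigma^N W^s_N \Sigma^N) \xrightarrow{p} 0 \}$, and suppose there exists a constant m, $0 \le m < \bar{m}$, and a bounded function $e(x)$ with $\int_D e(x)^2 p(x)\,dx \neq 0$ such that $N^m (g_N(x) - f(x; \alpha^*_N)) \to e(x)$ uniformly in x. Suppose also that $W_N$ is such that $\Pr\{ \|W_N e - e\| \le (1 - \delta) \|e\| \} \to 1$ as $N \to \infty$ for some $\delta > 0$.
Let
$$
T_N = \frac{\tilde{u}' W_N \tilde{u}}{\sqrt{2}\, s(\tilde{U}^N W^s_N \tilde{U}^N)} + FSC_N .
$$
Then, $T_N \xrightarrow{p} \infty$ as $N \to \infty$.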
It is instructive here to comment on the rate at which the local alternatives may approach the null. Recall that in the case of the kernel test we saw (in the proof of Corollary 1) that $s(W^s_N) = O_p(h_N^{-d/2})$. Hence, the definition of $\bar{m}$ gives $\bar{m} = \frac{1}{2} + \lim_{N \to \infty} (d/2) \log_N h_N$. For example, if d = 1 and the kernel is chosen to be the standard 'optimal'13 kernel with $h_N = O(N^{-1/5})$ then the test is consistent against alternatives of order $N^{-m}$ for $m < \frac{2}{5}$. With a more slowly shrinking window width, the test can be made to have power against local alternatives of order $N^{-m}$ for any $m < \frac{1}{2}$. Similarly, the bin test has $s(W^s_N) = O_p(m(N))$ so $\bar{m} = \frac{1}{2} + \lim_{N \to \infty} -\frac{1}{2} \log_N m(N)$. We again get $\bar{m}$ close to $\frac{1}{2}$ if we let the number of bins grow slowly.
This local power is equal to that of the best of the prior and contemporaneously proposed nonparametric tests, such as Eubank and Spiegelman (1990) and Hong and White (1995), and is superior to that of many other approaches. The only test of which we are aware that obtains slightly superior local power is the test of Bierens and Ploberger (1997) which is consistent against local alternatives which shrink at a rate of exactly $N^{-1/2}$ as well.

4. Simulation results
Here we present a Monte Carlo study of the finite sample power of our tests and the reliability of asymptotic critical values. A more comprehensive set of Monte Carlo results can be found in Ellison and Ellison (1998). For easy comparison (and to convince the reader that we had not designed the simulations to highlight our tests' attributes), we have chosen to piggyback on the work done by Hong and White (1995) by simply adding statistics on the performance of our tests to tables containing the results of their Monte Carlo study.
Table 1 speaks to the ability of applied researchers to rely on the asymptotic critical values of the various tests. For this table, we used 10,000 simulations to estimate the size of two kernel implementations of our tests when they are performed using 5% asymptotic critical values on the null specification used in Hong and White (1995). The specification involves a linear model with two explanatory variables. The test labelled Ellison-Ellison1 is based on a kernel weight matrix with $h_{100} = 1.0$, and that labelled Ellison-Ellison2 is a kernel test with $h_{100} = 1.5$. The sizes of other tests are merely reprinted from Hong and White (1995). We did not repeat their simulations. The tests labelled Bierens$i$, ES&J$i$, Hong-White$i$, Wooldridge$i$, and Yatchew$i$ are versions of the tests of Bierens (1990), Eubank and Spiegelman (1990) and Jayasuriya (1996), Hong and White (1995), Wooldridge (1992), and Yatchew (1992), respectively, with the particular smoothing parameters, series expansions, etc. described in Hong and White (1995).
The true sizes of our tests in the six sample size-window width combinations are 3.8%, 4.1%, 4.4%, 4.8%, 4.9%, and 4.9%, figures which are much closer to 5% than are those for any of the other tests. Several of the other tests often have

13 Here we mean optimal kernel in the context of estimation, not testing.


Table 1
Comparison of finite-sample ACV size
Rejection rates (%) with 5% ACV under null

Test statistic      N=100   N=300   N=500
Ellison-Ellison1      4.9     4.9     4.8
Ellison-Ellison2      3.8     4.1     4.4
Bierens1              4.8     3.6     6.1
Bierens2              6.7     7.5    10.7
ES & J1               2.0     1.3     2.2
ES & J2               2.7     3.2     3.0
Hong-White1           1.6     2.0     2.0
Hong-White2           2.8     2.8     2.7
Wooldridge1           8.7     5.3     7.5
Wooldridge2          10.0     9.7    10.5
Yatchew1              7.2     6.2     8.5
Yatchew2             10.4     9.9    13.1

Source: Figures for Ellison-Ellison tests computed from 10,000 simulations. Figures for other tests taken from Table 2 of Hong and White (1995).

an ACV size substantially in excess of 5%. We noted earlier that nonparametric test statistics tend to converge to their asymptotic distributions relatively slowly. Here, many of the test statistics do no better with 500 observations than they do with 100 observations.14 Because ACV sizes improve so slowly, finite sample performance is of great importance.
Duncan and Jones (1994) have in the course of their empirical work on labor supply also performed a Monte Carlo study which compares our test with those of Gozalo (1993) and Delgado and Stengos (1994). Their results also indicate that the asymptotic critical values of our test are more reliable than those of the other two tests in finite samples.
Table 2 compares the finite sample power of our test with that of the other nonparametric tests mentioned above. The table reports the rejection rates of our and other tests (using empirically estimated critical values) against the three other alternatives mentioned in Hong and White (1995).15 For each alternative we report rejection rates from simulations involving 100 and 300 observations.16
14 We would guess that they would probably show little improvement even with ten thousand observations.
15 We have chosen to use the set of simulations with $\sigma^2 = 1$ for alternatives 1 and 3, and those with $\sigma^2 = 4$ for alternative 2 so as to make the power of the tests against the three alternatives more comparable. Note that in the case of alternative 2 the exponent within the exponential term is 2 not -2, which we believe is the proper correction to a misprint in the text of Hong and White (1995).
16 The rejection rates for the tests other than ours are drawn from Hong and White's Tables 3-5.


Table 2
Comparison of power
Rejection rates (%) with 5% empirical critical values

                    Alternative 1     Alternative 2     Alternative 3
Test statistic      N=100   N=300     N=100   N=300     N=100   N=300
Ellison-Ellison1     31.2    81.2      21.8    59.0       5.5     5.1
Ellison-Ellison2     37.2    89.4      34.3    75.7       4.3     6.0
Bierens1             12.1    31.6      38.6    83.7       4.8     4.6
Bierens2             10.5    31.3      43.0    88.9       5.1     4.7
ES & J1              43.8    88.9      22.8    61.3       5.3     4.6
ES & J2              36.2    79.1      20.0    54.9       5.7     5.3
Hong-White1          46.5    91.6      28.0    71.1       5.4     5.0
Hong-White2          36.8    81.5      21.4    59.5       5.4     5.2
Wooldridge1           6.2     2.8      19.5    29.3      19.8    30.8
Wooldridge2           4.7     4.1      15.4    38.4      22.9    43.6
Yatchew1              8.6     9.3       7.5     8.2       6.5     5.0
Yatchew2              8.4    10.4       7.7     7.8       6.2     5.2

Source: Figures for Ellison-Ellison tests computed from 1000 simulations. Figures for other tests taken from Tables 3-5 of Hong and White (1995).

Looking at each of the alternatives in turn, it appears that the Hong-White test, Eubank-Spiegelman-Jayasuriya test, and our test do much better than the others against alternative 1. Bierens' test appears to be most powerful against alternative 2, followed by our test, Hong and White's, and Eubank and Spiegelman and Jayasuriya's. Only Wooldridge's test does at all well against alternative 3. It should be noted, of course, that assessing power from performance against so few alternatives may be misleading. This is particularly true here because the power of the tests based on series expansions is greatly affected by the degree to which the alternatives and the included series terms are collinear and because all of the alternatives in Hong and White's study are similar in that they involve low frequency misspecifications.

5. Conclusion
In this paper, we have presented a framework for specification testing which
involves working directly with quadratic forms in a model's residuals. The
framework allows one to construct asymptotically normal test statistics exploiting a variety of nonparametric techniques, and we have seen that these tests can
be consistent and have good local power.


We hope that several factors may make our tests attractive to applied
researchers. First, the tests are very intuitive, which we feel is important not only
because one is always more comfortable with a test which one understands well,
but because this understanding makes it easy to adapt the test to the particular
situation one is facing. Second, because the null distributions of nonparametric
tests tend to converge slowly to their asymptotic limits, and computational
concerns make simulating null distributions undesirable, it is particularly important that the asymptotic approximation to the null distribution of a nonparametric test be accurate in small samples. Using the finite sample correction
we suggest, our test does substantially better on this count than other tests in
our Monte Carlo simulation. Finally, it is easy to construct computationally
undemanding versions of the test.
As for future extensions, we see the greatest loose end in the current formulation as being the need for a choice of a smoothing parameter. While simulations
can provide some guidance, it would be interesting to explore criteria for
choosing the smoothing parameter automatically. A preliminary idea is to base
a test on choosing the smoothing parameter to maximize a quadratic form test
statistic. Given the success of Bierens and Ploberger (1997), we hope that such
a construction might both eliminate an arbitrary choice and improve local
power.

Acknowledgements
We thank the following persons for assistance and helpful comments: Joseph Beaulieu, Jerry Hausman, Guido Imbens, Roger Klein, Whitney Newey, Danny Quah, Tom Stoker, Jeff Wooldridge, and anonymous referees. We also thank Emek Basker and Sean May for excellent research assistance. Financial support from National Science Foundation grant SBR-95-15076 and a Sloan Research Fellowship is acknowledged. This paper is a revised version of Sara Fisher Ellison's 1991 paper, "A Nonparametric Residual-based Specification Test: Asymptotic, Finite-sample, and Computational Properties".

Appendix A
Proof of Proposition 1. To begin, we note that an elementary probabilistic argument shows that it suffices to prove that the result holds whenever $\{x_i\}$ is nonstochastic and $r(W_N)/s(W_N) \to 0$ as $N \to \infty$. To see this, apply Lemma A.1 below with $a_N(x^N) = r(W_N)/s(W_N)$ and $t_N(x^N, u^N) = T_N$.
To show that $T_N \xrightarrow{L} Z \sim N(0,1)$ when $\{x_i\}$ is nonstochastic and $r(W_N)/s(W_N) \to 0$, note first that Theorem 1.1 of Mikosch (1991) (using $A_N = \Sigma^N W_N \Sigma^N$) implies that
$$
\frac{u^{N\prime} W_N u^N}{\sqrt{2}\, s(\Sigma^N W_N \Sigma^N)} \xrightarrow{L} Z \sim N(0,1),
$$
where $\Sigma^N$ is an N×N diagonal matrix with $\Sigma^N_{ii} = \mathrm{Var}(u_i)^{1/2}$. Two additional lemmas then let us conclude that $T_N$ also converges in distribution to a standard normal. The conclusion of Lemma A.2 is that
$$
\frac{\tilde{u}^{N\prime} W_N \tilde{u}^N - u^{N\prime} W_N u^N}{\sqrt{2}\, s(\Sigma^N W_N \Sigma^N)} \xrightarrow{p} 0,
$$
which implies that
$$
\frac{\tilde{u}^{N\prime} W_N \tilde{u}^N}{\sqrt{2}\, s(\Sigma^N W_N \Sigma^N)} \xrightarrow{L} Z \sim N(0,1).
$$
Lemma A.3 establishes that
$$
\frac{\sum_{ij} w_{ij}^2 \tilde{u}_{iN}^2 \tilde{u}_{jN}^2}{s(\Sigma^N W_N \Sigma^N)^2} \xrightarrow{p} 1,
$$
which implies that
$$
\frac{\tilde{u}^{N\prime} W_N \tilde{u}^N}{\sqrt{2}\, s(\tilde{U}^N W_N \tilde{U}^N)} \xrightarrow{L} Z \sim N(0,1).
$$
We have assumed that $FSC_N \xrightarrow{p} 0$ so this implies $T_N \xrightarrow{L} Z \sim N(0,1)$. □
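Lemma A.1. Let $\{u_i\}$ and $\{x_i\}$ be as above. Write $u^N$ for $(u_1, \ldots, u_N)$, $x^N$ for $(x_1 \ldots x_N)$ and $\underline{x}^N$ for a realization of $x^N$. Let $a_N(x^N)$ and $t_N(x^N, u^N)$ be measurable functions. If
$$
a_N(\underline{x}^N) \to 0 \ \Rightarrow\ t_N(\underline{x}^N, u^N) \xrightarrow{L} Z \sim N(0,1),
$$
and
$$
a_N(x^N) \xrightarrow{p} 0,
$$
then $t_N(x^N, u^N) \xrightarrow{L} Z \sim N(0,1)$.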
Proof of Lemma A.1. As a first step we note that the result follows easily from the first condition in the lemma and $a_N(x^N) \xrightarrow{a.s.} 0$ using the Dominated Convergence Theorem. Under those assumptions

$$\lim_{N\to\infty} \mathrm{Prob}\{t_N(x^N, u^N) \le z\} = \lim_{N\to\infty} \int_x \mathrm{Prob}\{t_N(\underline{x}^N, u^N) \le z\}\, d\mu(x)$$
$$= \int_x \lim_{N\to\infty} \mathrm{Prob}\{t_N(\underline{x}^N, u^N) \le z\}\, d\mu(x)$$
$$= \int_x \Phi(z)\, d\mu(x) = \Phi(z),$$

with the last line following from the almost sure convergence of $a_N(x^N)$.

Next we show that only convergence in probability of $a_N(x^N)$, not almost sure convergence, is necessary. To see this, let $G_N(z) = \mathrm{Prob}\{t_N(x^N, u^N) \le z\}$. Note that $G_N(z)$ depends on the joint distribution of the sequence of matrices $\{x^N u^N\}$ only through the distribution of $x^N u^N$. Hence the sequence $\{G_N(z)\}$ is unchanged if we choose any other joint distribution on the sequence $\{x^N u^N\}$ with the same marginal distribution on each $x^N u^N$. It is always possible to choose such a joint distribution so that $a_N(x^N) \xrightarrow{a.s.} 0$ and hence we know $G_N(z) \to \Phi(z)$. (To see that such a joint distribution exists, write $H_N$ for the cumulative distribution function of $a_N(x^N)$, let $h_N : D^N \to D^{N+1}$ be a sequence of mappings with

$$H_{N+1}(h_N(x^N)) = H_N(x^N),$$

and let the joint distribution on $\{x^N\}$ be the distribution of $\{x^1, h_1(x^1), h_2(h_1(x^1)), \ldots\}$, with the $u_i^N$ being related to the $x_i^N$ in the obvious way. With this construction, any realization $\underline{x}^1, \underline{x}^2, \underline{x}^3, \ldots$ has $a_N(\underline{x}^N) = H_N^{-1}(H_1(\underline{x}^1)) \to 0$.) □
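Lemma A.2. Let $\{W_N\}$ be a sequence of symmetric matrices (with $W_N$ being $N \times N$) such that $r(W_N)/s(W_N) \to 0$ as $N \to \infty$. Let $u^N$ and $\tilde u^N$ be as in Proposition 1 for a given sequence $\{x_i\}$. Then,

$$\frac{\tilde u^{N\prime} W_N \tilde u^N - u^{N\prime} W_N u^N}{\sqrt{2}\, s(\Sigma^N W_N \Sigma^N)} \xrightarrow{p} 0.$$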
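Proof of Lemma A.2. To begin, note that

$$\tilde u_i - u_i = -\frac{\partial f}{\partial \alpha}(x_i; \alpha_0)(\tilde\alpha - \alpha_0) - \frac{1}{2}\sum_{k,j=1}^{l} \frac{\partial^2 f}{\partial \alpha_k \partial \alpha_j}(x_i; \bar\alpha(x_i, \tilde\alpha))(\tilde\alpha_k - \alpha_{0k})(\tilde\alpha_j - \alpha_{0j}),$$

for some $\bar\alpha(x_i, \tilde\alpha)$ between $\alpha_0$ and $\tilde\alpha$. Let

$$B_1 = \sup_x \left| \frac{\partial f}{\partial \alpha}(x; \alpha_0) \right|,$$

$$B_2(\alpha) = \frac{l^2}{2} \sup_{t \in [0,1],\, x \in D,\, k,\, j} \left| \frac{\partial^2 f}{\partial \alpha_k \partial \alpha_j}(x; \alpha_0 t + \alpha(1-t)) \right|.$$

Because $f$ has two continuous derivatives and $D$ is compact, $B_1$ and $B_2(\alpha)$ exist, and $B_2(\alpha)$ is continuous. If we write

$$\tilde u_i - u_i = v_{1i}(\tilde\alpha - \alpha_0) + \tilde v_{2i},$$

we immediately have $|v_{1i}| \le B_1$ and $|\tilde v_{2i}| \le B_2(\tilde\alpha)(\tilde\alpha - \alpha_0)'(\tilde\alpha - \alpha_0)$.

Now, consider the expansion

$$\tilde u' W_N \tilde u - u' W_N u = \big(u' W_N u + 2(\tilde u - u)' W_N u + (\tilde u - u)' W_N (\tilde u - u)\big) - u' W_N u$$
$$= 2\big((\tilde\alpha - \alpha_0)' v_1' W_N u\big) + 2\big(\tilde v_2' W_N u\big) + (\tilde\alpha - \alpha_0)' v_1' W_N v_1 (\tilde\alpha - \alpha_0) + 2(\tilde\alpha - \alpha_0)' v_1' W_N \tilde v_2 + \tilde v_2' W_N \tilde v_2.$$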
We now show that $(\tilde u' W_N \tilde u - u' W_N u)/s(\Sigma^N W_N \Sigma^N) \xrightarrow{p} 0$ by showing that each of the five terms on the right-hand side of the expression above has plim zero when divided by $s(W_N)$ (which suffices because $s(\Sigma^N W_N \Sigma^N) \ge \underline{\sigma}^2 s(W_N)$).

1. $(1/s(W_N))(\tilde\alpha - \alpha_0)' v_1' W_N u = \sqrt{N}(\tilde\alpha - \alpha_0)' \cdot \big(1/(\sqrt{N} s(W_N))\big) v_1' W_N u$. $\sqrt{N}(\tilde\alpha - \alpha_0)$ has an asymptotic distribution. The vector it is multiplied by has

$$\mathrm{Var}\left( \frac{1}{\sqrt{N}\, s(W_N)} v_1' W_N u \right) \le \frac{\bar\sigma^2}{N s(W_N)^2}\, v_1' W_N W_N v_1.$$

Each column of $v_1$ is an $N \times 1$ vector of norm $\le \sqrt{N} B_1$, so each column of $W_N v_1$ has a norm of at most $r(W_N)\sqrt{N} B_1$. Each element of $v_1' W_N W_N v_1$ is then at most $N r(W_N)^2 B_1^2$, and the variance–covariance matrix thus goes to the zero matrix.

2. $|(1/s(W_N))\, \tilde v_2' W_N u|^2 \le (1/s(W_N)^2)\, \|\tilde v_2' W_N\|^2 \|u\|^2 \le (r(W_N)^2/s(W_N)^2)\, \|\tilde v_2\|^2 \|u\|^2$. $\mathrm{Var}(u_i) \le \bar\sigma^2$ and $E(u_i^4) \le m$ implies that $\|u\|^2/N = O_p(1)$. Also,

$$N \|\tilde v_2\|^2 \le N \cdot N B_2(\tilde\alpha)^2\, (\tilde\alpha - \alpha_0)'(\tilde\alpha - \alpha_0) \cdot (\tilde\alpha - \alpha_0)'(\tilde\alpha - \alpha_0).$$

As $N \to \infty$, $B_2(\tilde\alpha)^2 \xrightarrow{p} B_2(\alpha_0)^2$ and $(\tilde\alpha - \alpha_0)'(\tilde\alpha - \alpha_0) N$ is bounded in probability, so $N \|\tilde v_2\|^2 = O_p(1)$. Hence, $r(W_N)/s(W_N) \to 0$ implies that term 2 has plim 0.

3. $(1/s(W_N))(\tilde\alpha - \alpha_0)' v_1' W_N v_1 (\tilde\alpha - \alpha_0) = \sqrt{N}(\tilde\alpha - \alpha_0)' \big(v_1' W_N v_1 / (N s(W_N))\big) \sqrt{N}(\tilde\alpha - \alpha_0)$. The middle term is an $l \times l$ matrix, and as in 1, each term is bounded by $B_1^2\, r(W_N)/s(W_N)$, so the matrix converges in probability to zero.

4. The fourth term has

$$\left| \frac{1}{s(W_N)} (\tilde\alpha - \alpha_0)' v_1' W_N \tilde v_2 \right|^2 \le \|\sqrt{N}(\tilde\alpha - \alpha_0)\|^2\, \frac{1}{N s(W_N)^2}\, \|v_1' W_N \tilde v_2\|^2$$
$$\le \|\sqrt{N}(\tilde\alpha - \alpha_0)\|^2\, \frac{1}{N s(W_N)^2}\, l N B_1^2\, r(W_N)^2\, \frac{N \|\tilde v_2\|^2}{N} \xrightarrow{p} 0.$$

5. $(1/s(W_N)^2)\, |\tilde v_2' W_N \tilde v_2|^2 \le \big(1/(N^2 s(W_N)^2)\big)\, N\|\tilde v_2\|^2\, r(W_N)^2\, N\|\tilde v_2\|^2 \xrightarrow{p} 0$. □
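The algebra behind the five-term expansion in the proof of Lemma A.2, and the column-norm bound used in term 1, can be checked numerically. The sketch below is only an illustration under assumed inputs: the matrices and vectors are random stand-ins, `delta` plays the role of $\tilde\alpha - \alpha_0$, and `v2` plays the role of the second-order remainder $\tilde v_2$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, l = 50, 3                                  # sample size and parameter dimension (illustrative)

A = rng.standard_normal((N, N))
W = (A + A.T) / 2.0                           # arbitrary symmetric weight matrix
u = rng.standard_normal(N)                    # errors
v1 = rng.standard_normal((N, l))              # first-order term, one column per parameter
delta = rng.standard_normal(l) / np.sqrt(N)   # stand-in for alpha_tilde - alpha_0
v2 = rng.standard_normal(N) / N               # stand-in for the second-order remainder
u_tilde = u + v1 @ delta + v2                 # residuals

# Left-hand side: difference of the two quadratic forms.
lhs = u_tilde @ W @ u_tilde - u @ W @ u

# Right-hand side: the five terms of the expansion, in the order used in the proof.
terms = [
    2 * delta @ v1.T @ W @ u,
    2 * v2 @ W @ u,
    delta @ v1.T @ W @ v1 @ delta,
    2 * delta @ v1.T @ W @ v2,
    v2 @ W @ v2,
]
rhs = sum(terms)

# Column-norm bound from term 1: ||W v|| <= r(W) ||v|| for symmetric W.
r_W = np.max(np.abs(np.linalg.eigvalsh(W)))
col_norms = np.linalg.norm(W @ v1, axis=0)
col_bounds = r_W * np.linalg.norm(v1, axis=0)
```

The identity holds for any symmetric `W`; the proof's contribution is showing that each of the five terms is negligible relative to $s(W_N)$.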
Lemma A.3. Let $\{W_N\}$ be a sequence of symmetric matrices (with $W_N$ being $N \times N$) such that $r(W_N)/s(W_N) \to 0$ as $N \to \infty$. Let $u^N$ and $\tilde u^N$ be as in Proposition 1 for a given sequence $\{x_i\}$. Then,

$$\frac{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ijN}^2\, \tilde u_{iN}^2\, \tilde u_{jN}^2}{s(\Sigma^N W_N \Sigma^N)^2} \xrightarrow{p} 1.$$
Proof. We show this in two steps: showing first that

$$\frac{\sum_{ij} w_{ij}^2 u_i^2 u_j^2 - s(\Sigma^N W_N \Sigma^N)^2}{s(\Sigma^N W_N \Sigma^N)^2} \xrightarrow{p} 0$$

and then that

$$\frac{\sum_{ij} w_{ij}^2 \tilde u_{iN}^2 \tilde u_{jN}^2 - \sum_{ij} w_{ij}^2 u_i^2 u_j^2}{s(\Sigma^N W_N \Sigma^N)^2} \xrightarrow{p} 0.$$

For the first step, let $z$ be a vector with $i$th element $z_i = u_i^2 - \mathrm{Var}(u_i)$. Note that $E(z_i) = 0$ and $\mathrm{Var}(z_i) = E(u_i^4) - \mathrm{Var}(u_i)^2 < m$. Let $W_2$ be a matrix with $ij$th element equal to $w_{ij}^2$. Let $v$ be a vector with $i$th element $\mathrm{Var}(u_i)$. We then have

$$\sum_{ij} w_{ij}^2 u_i^2 u_j^2 - s(\Sigma^N W_N \Sigma^N)^2 = (z+v)' W_2 (z+v) - v' W_2 v = z' W_2 z + 2 z' W_2 v.$$

We now show that each of the terms on the right-hand side of this expression has plim zero when divided by $s(\Sigma^N W_N \Sigma^N)^2$.


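First, note that $z_1, \ldots, z_N$ are independent random variables with $E(z_i) = 0$ and $\mathrm{Var}(z_i) < m$.¹⁷ For any symmetric nonnegative matrix $A$ with zeros on the diagonal we then have $E(z'Az) = 0$ and

$$\mathrm{Var}(z'Az) = \mathrm{Var}\Big( \sum_{ij} a_{ij} z_i z_j \Big) = \sum_{ijkl} a_{ij} a_{kl}\, \mathrm{Cov}(z_i z_j, z_k z_l) = 2 \sum_{ij} a_{ij}^2\, \mathrm{Var}(z_i z_j) < 2 m^2 s(A)^2.$$

Hence, $E(z' W_2 z) = 0$ and $\mathrm{Var}(z' W_2 z) < 2 m^2 \sum_{ij} w_{ij}^4$, and

$$\mathrm{Var}\left( \frac{z' W_2 z}{s(\Sigma^N W_N \Sigma^N)^2} \right) < \frac{2 m^2 \sum_{ij} w_{ij}^4}{\underline{\sigma}^8 \big(\sum_{ij} w_{ij}^2\big)^2} \le \frac{2 m^2 (\max_{ij} w_{ij}^2) \sum_{ij} w_{ij}^2}{\underline{\sigma}^8 \big(\sum_{ij} w_{ij}^2\big)^2} \le \frac{2 m^2}{\underline{\sigma}^8} \frac{r(W_N)^2}{s(W_N)^2} \to 0.$$

Second, we similarly have $E(2 v' W_2 z) = 0$ and

$$\mathrm{Var}\left( \frac{2 v' W_2 z}{s(\Sigma^N W_N \Sigma^N)^2} \right) \le \frac{4}{\underline{\sigma}^8 s(W_N)^4}\, \mathrm{Var}\Big( \sum_{ij} w_{ij}^2 v_i z_j \Big) < \frac{4 \bar\sigma^4 m}{\underline{\sigma}^8 s(W_N)^4} \sum_j \Big( \sum_i w_{ij}^2 \Big)^2$$
$$\le \frac{4 \bar\sigma^4 m}{\underline{\sigma}^8 s(W_N)^4} \Big( \max_j \sum_i w_{ij}^2 \Big) \sum_j \sum_i w_{ij}^2$$
$$\le \frac{4 \bar\sigma^4 m}{\underline{\sigma}^8 s(W_N)^4}\, r(W_N)^2 s(W_N)^2 \to 0.$$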
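The variance identity $\mathrm{Var}(z'Az) = 2\sum_{ij} a_{ij}^2 \mathrm{Var}(z_i z_j)$ used in the first step can be verified exactly in a small case. The sketch below is an illustration under an assumed distribution: it takes independent $z_i = \pm 1$ with equal probability (so $E(z_i) = 0$ and $\mathrm{Var}(z_i z_j) = 1$ for $i \ne j$) and enumerates all $2^n$ outcomes instead of simulating. The matrix `A` is an arbitrary symmetric nonnegative matrix with a zero diagonal.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n = 6

A = np.abs(rng.standard_normal((n, n)))
A = (A + A.T) / 2.0            # symmetric and nonnegative
np.fill_diagonal(A, 0.0)       # zeros on the diagonal

# All 2^n equally likely sign vectors; each outcome has probability 2^-n.
values = np.array([
    np.array(z) @ A @ np.array(z)
    for z in itertools.product([-1.0, 1.0], repeat=n)
])

mean_exact = values.mean()              # exact E(z'Az)
var_exact = values.var()                # exact Var(z'Az) over the 2^n outcomes
var_formula = 2.0 * np.sum(A ** 2)      # 2 * sum_ij a_ij^2, since Var(z_i z_j) = 1
```

The zero diagonal matters here: with nonzero $a_{ii}$ the terms $a_{ii} z_i^2$ would contribute a nonzero mean and the formula would change.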
Turning now to the second main step, note that

$$\frac{\sum_{ij} w_{ij}^2 \tilde u_{iN}^2 \tilde u_{jN}^2 - \sum_{ij} w_{ij}^2 u_i^2 u_j^2}{s(\Sigma^N W_N \Sigma^N)^2} = \frac{\tilde u^{2\prime} W_2 \tilde u^2 - u^{2\prime} W_2 u^2}{s(\Sigma^N W_N \Sigma^N)^2}$$
$$= \frac{2(\tilde u^2 - u^2)' W_2 u^2}{s(\Sigma^N W_N \Sigma^N)^2} + \frac{(\tilde u^2 - u^2)' W_2 (\tilde u^2 - u^2)}{s(\Sigma^N W_N \Sigma^N)^2},$$

where we have written $\tilde u^2$ for the vector with $i$th element $\tilde u_i^2$ and $u^2$ for the vector with $i$th element $u_i^2$. To show that both of the terms on the right-hand side of this expression have plim 0 it will suffice to show that (i) $\|\tilde u^2 - u^2\| = O_p(1)$, (ii) $r(W_2)/s(W_N)^2 \to 0$, and (iii) $\|W_2 u^2 / s(W_N)^2\| \xrightarrow{p} 0$.

Result (i) is standard: $\|\tilde u^2 - u^2\|$ is just the difference between the sum of squared residuals and the sum of squared errors.

17 Here as in Lemma A.2 the sequence $\{x_i\}$ is taken to be fixed and thus we write $E(z_i)$ rather than $E(z_i \mid x_i)$ and similarly for other expectations and variances.


To derive result (ii) note that for any symmetric matrix $A$, $r(A) \le s(A)$. Hence, $r(W_2)/s(W_N)^2 \le s(W_2)/s(W_N)^2$, and (ii) follows from

$$\frac{s(W_2)^2}{s(W_N)^4} = \frac{\sum_{ij} w_{ij}^4}{s(W_N)^4} \le \frac{(\max_{ij} w_{ij}^2) \sum_{ij} w_{ij}^2}{s(W_N)^4} \le \frac{r(W_N)^2 s(W_N)^2}{s(W_N)^4} \to 0.$$

Finally, to derive result (iii) note that

$$\|W_2 u^2 / s(W_N)^2\| \le \|W_2 v / s(W_N)^2\| + \|W_2 z / s(W_N)^2\|,$$

where again we have written $v$ for the vector with $i$th element $v_i = \mathrm{Var}(u_i)$ and $z$ for $u^2 - v$. Using calculations similar to those above it is easy to see that the first of these terms converges to zero:

$$\left\| \frac{W_2 v}{s(W_N)^2} \right\|^2 \le \frac{\bar\sigma^4 \sum_i \big( \sum_j w_{ij}^2 \big)^2}{s(W_N)^4} \le \frac{\bar\sigma^4 s(W_N)^2 r(W_N)^2}{s(W_N)^4} \to 0.$$
To see that the second term has plim 0 also we write

$$\|W_2 z / s(W_N)^2\|^2 = s(W_N)^{-4} z' W_2' W_2 z = s(W_N)^{-4} z' B_N z + \sum_{i=1}^{N} c_{iN} z_i^2,$$

where $B_N$ is the $N \times N$ matrix with $b_{iiN} = 0$ and $b_{ijN} = (W_2' W_2)_{ij}$ for $i \ne j$ and $c_{iN}$ is the $ii$th element of $W_2' W_2 / s(W_N)^4$. To see that $s(W_N)^{-4} z' B_N z \xrightarrow{p} 0$ we note as before that $E(z' B_N z) = 0$ and

$$\mathrm{Var}\left( \frac{z' B_N z}{s(W_N)^4} \right) \le \frac{2 m^2 s(B_N)^2}{s(W_N)^8} \le \frac{2 m^2 s(W_2' W_2)^2}{s(W_N)^8} \le \frac{2 m^2 s(W_2)^4}{s(W_N)^8} \to 0,$$

with the second to last conclusion following from the fact that for any symmetric matrix $A$, $s(A'A)^2 = \sum_i \lambda_i^4 \le \big(\sum_i \lambda_i^2\big)^2 = s(A)^4$. Finally, to see that $\sum_{i=1}^{N} c_{iN} z_i^2 \xrightarrow{p} 0$ as $N \to \infty$ we note that Theorem 3.4.9 of Taylor (1978) concludes that a weighted sum of independent random variables of the form $\sum_{i=1}^{N} c_{iN} e_i$ has plim zero provided that five conditions hold: (1) $E(e_i) = 0$; (2) $E(|e_i|) < \infty$; (3) $\max_i |c_{iN}| \to 0$ as $N \to \infty$; (4) there exists a $C$ such that $\sum_{i=1}^{N} |c_{iN}| \le C$ for all $N$; and (5) there exists a random variable $e$ such that $\mathrm{Prob}(|e_i| \ge t) \le \mathrm{Prob}(|e| \ge t)$ for all $t \ge 0$ and all $N$. Each of these hypotheses holds for $e_i = z_i^2 - E(z_i^2)$: the first is trivial; the second and fifth are immediate consequences of $z_i^2$ being nonnegative and the assumed uniform bounds on the second and fourth moments of the $u_i$ conditional on $x_i$; the third and fourth follow from the facts that $c_{iN} = s(W_N)^{-4} \sum_j w_{ijN}^4$ is nonnegative and $\sum_{i=1}^{N} c_{iN} = s(W_N)^{-4} s(W_2)^2 \to 0$ as $N \to \infty$. Applying the theorem gives that

$$\sum_{i=1}^{N} c_{iN} \big( z_i^2 - E(z_i^2) \big) \xrightarrow{p} 0,$$

and the desired result follows from noting that $\sum_{i=1}^{N} c_{iN} E(z_i^2) \le m \sum_i c_{iN} \to 0$ as well. □
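Several elementary matrix inequalities carry the argument in the proof of Lemma A.3: $r(A) \le s(A)$ and $s(A'A)^2 \le s(A)^4$ for symmetric $A$, together with $\sum_{ij} w_{ij}^4 \le (\max_{ij} w_{ij}^2) \sum_{ij} w_{ij}^2$ and $\max_{ij} w_{ij}^2 \le r(W)^2$. A minimal numerical sanity check, using a randomly generated symmetric nonnegative matrix as an assumed example (the helper names `r` and `s` are illustrative), might look like this:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 40

A = rng.random((N, N))
W = (A + A.T) / 2.0            # symmetric, nonnegative
np.fill_diagonal(W, 0.0)
W2 = W ** 2                    # matrix of squared weights, written W_2 in the text

def r(M):
    """Largest absolute eigenvalue of a symmetric matrix."""
    return float(np.max(np.abs(np.linalg.eigvalsh(M))))

def s(M):
    """Square root of the sum of squared entries."""
    return float(np.sqrt(np.sum(M ** 2)))

tol = 1e-9                     # slack for floating-point rounding
checks = {
    "r(A) <= s(A)": r(W) <= s(W) + tol,
    "s(A'A)^2 <= s(A)^4": s(W2.T @ W2) ** 2 <= s(W2) ** 4 + tol,
    "sum w^4 <= max w^2 * sum w^2": np.sum(W ** 4) <= np.max(W ** 2) * np.sum(W ** 2) + tol,
    "max w^2 <= r(W)^2": np.max(W ** 2) <= r(W) ** 2 + tol,
}
```

A check of this kind cannot prove the inequalities, but it catches transcription errors in the exponents, which are easy to get wrong in this chain of bounds.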
Proof of Corollary 1. Clearly $w^s_{ii} = 0$, so it suffices to show that $r(W^s_N)/s(W^s_N) \xrightarrow{p} 0$ and $(1+d)/\sqrt{2}\, s(W^s_N) \xrightarrow{p} 0$. Note first that because $W_N$ and $W_N'$ have the same eigenvalues,

$$r(W^s_N) = \sup_{v \ne 0} \frac{1}{2} \frac{\|(W_N + W_N') v\|}{\|v\|} \le \frac{1}{2} \sup_{v \ne 0} \frac{\|W_N v\|}{\|v\|} + \frac{1}{2} \sup_{v \ne 0} \frac{\|W_N' v\|}{\|v\|} = \frac{1}{2}\big( r(W_N) + r(W_N') \big) = r(W_N).$$

Also, because $K$ is nonnegative

$$s(W^s_N)^2 = \sum_{ij} \left( \frac{w_{ij} + w_{ji}}{2} \right)^2 = \frac{1}{2} s(W_N)^2 + \frac{1}{2} \sum_{ij} w_{ij} w_{ji} \ge \frac{1}{2} s(W_N)^2.$$

Hence it suffices to show that $r(W_N)/s(W_N) \xrightarrow{p} 0$ and $1/s(W_N) \xrightarrow{p} 0$. Let $W^*_N$ be the $N \times N$ matrix with $w^*_{ii} = 1$ if the $i$th row of $W_N$ is identically zero and $w^*_{ij} = w_{ij}$ for all other $i, j$. $W^*_N$ is a Markov transition matrix so $r(W_N) \le r(W^*_N) = 1$. Hence, we need only show $s(W_N) \xrightarrow{p} \infty$. This follows directly from standard results.
To see this, note that when the model $y = m(x) + e$ (with $m(x)$ continuous and $e$ an i.i.d. $N(0, \sigma^2)$ error) is estimated by $\hat y = W_N y$, the average conditional variance of $\hat y$ is

$$\frac{1}{N} \sum_i \mathrm{Var}(\hat y_i \mid x_1, \ldots, x_N) = \frac{1}{N} \sum_i \mathrm{Var}\Big( \sum_j w_{ij} y_j \Big) = \frac{\sigma^2}{N} \sum_i \sum_j w_{ij}^2 = \frac{\sigma^2}{N} s(W_N)^2.$$

The fact that the average conditional variance of the kernel estimator in this environment is $O_p(N^{-1} h^{-d})$ thus implies that $s(W_N) = O_p(h^{-d/2})$. □
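The two symmetrization bounds and the variance identity in the proof of Corollary 1 can be illustrated with a concrete kernel weight matrix. The sketch below is an assumption-laden example, not the paper's implementation: it uses a Gaussian kernel, leave-one-out rows normalized to sum to one, a scalar regressor, and an arbitrary bandwidth `h`. The operator norm $\sup_{v \ne 0} \|Wv\|/\|v\|$ is computed as the matrix 2-norm.

```python
import numpy as np

def kernel_weights(x, h):
    """Leave-one-out kernel regression weights with w_ii = 0 and rows summing to one.

    The Gaussian kernel is an assumed choice made for this illustration.
    """
    d = (x[:, None] - x[None, :]) / h
    K = np.exp(-0.5 * d ** 2)                 # nonnegative kernel values
    np.fill_diagonal(K, 0.0)                  # exclude the own observation
    row_sums = K.sum(axis=1, keepdims=True)
    # Rows with no neighbours (zero row sum) are left identically zero.
    return np.divide(K, row_sums, out=np.zeros_like(K), where=row_sums > 0)

rng = np.random.default_rng(4)
N = 200
x = rng.uniform(0.0, 1.0, size=N)
W = kernel_weights(x, h=0.05)
Ws = (W + W.T) / 2.0                          # symmetrized weights

op_norm_W = np.linalg.norm(W, 2)              # sup ||Wv|| / ||v||
op_norm_Ws = np.linalg.norm(Ws, 2)
s2_W = np.sum(W ** 2)                         # s(W)^2
s2_Ws = np.sum(Ws ** 2)                       # s(W^s)^2
cross = np.sum(W * W.T)                       # sum_ij w_ij w_ji, nonnegative here

# Average conditional variance of y_hat = W y under i.i.d. errors with variance sigma2.
sigma2 = 2.0
avg_var = np.trace(sigma2 * W @ W.T) / N      # (1/N) sum_i Var(y_hat_i | x)
```

Because the kernel is nonnegative, `cross` is nonnegative, which is what gives $s(W^s_N)^2 \ge \tfrac{1}{2} s(W_N)^2$.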
Proof of Corollary 2. Note that $T_N$ is of the same form as the test statistic in Proposition 1 where $W_N$ is the symmetric matrix given by

$$w_{ij} = \begin{cases} \dfrac{1}{C_{kN} - 1} & \text{if } i \ne j,\ x_i, x_j \in P_{kN}, \\ 0 & \text{otherwise.} \end{cases}$$

Again $W_N$ is non-negative with all rows summing to at most one, so $r(W_N) \le 1$ with equality holding provided that not all bins are empty. Hence, we need only show $1/s(W_N) \xrightarrow{p} 0$.

To see this, note that $s(W_N)^2 \ge \sum_k a_{kN}$, where $a_{kN}$ is a random variable given by $a_{kN} = 0$ if $C_{kN} \le 1$, $a_{kN} = 1$ if $C_{kN} \ge 2$.

$$E\left( \frac{\sum_k a_{kN}}{m(N)} \right) \ge \inf_k E(a_{kN}) = \inf_k \Big( 1 - \big( (1 - \nu_k)^N + N \nu_k (1 - \nu_k)^{N-1} \big) \Big) \to 1,$$

where we have written $\nu_k$ for $\nu(P_{kN})$.

$$\mathrm{Var}\left( \frac{\sum_k a_{kN}}{m(N)} \right) \le \sup_k \mathrm{Var}(a_{kN}) \le \sup_k E[(a_{kN} - 1)^2] = \sup_k \big( (1 - \nu_k)^N + N \nu_k (1 - \nu_k)^{N-1} \big) \to 0.$$

Therefore, $r(W_N)/s(W_N) = O_p(1/\sqrt{m(N)})$, as desired. □
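The bin weight matrix in the proof of Corollary 2 is easy to construct explicitly. The following sketch assumes observations have already been assigned integer bin labels (the labels below are made up for illustration). It builds $w_{ij} = 1/(C_k - 1)$ for distinct observations sharing bin $k$, and evaluates the probability that a bin receives at most one of $N$ draws, written $(1-\nu)^N + N\nu(1-\nu)^{N-1}$ in the proof. The function names are hypothetical.

```python
import numpy as np

def bin_weights(bin_ids):
    """w_ij = 1/(C_k - 1) if i != j fall in the same bin k, and 0 otherwise."""
    bin_ids = np.asarray(bin_ids)
    same = bin_ids[:, None] == bin_ids[None, :]
    counts = same.sum(axis=1)                      # C_k for the bin containing observation i
    W = np.zeros(same.shape)
    rows = counts > 1                              # singleton bins keep an all-zero row
    W[rows] = same[rows] / (counts[rows][:, None] - 1.0)
    np.fill_diagonal(W, 0.0)
    return W

def prob_at_most_one(nu, N):
    """Probability that a bin with mass nu receives zero or one of N i.i.d. draws."""
    return (1 - nu) ** N + N * nu * (1 - nu) ** (N - 1)

# Hypothetical bin labels: bins of size 3, 2, 1 and 4.
bin_ids = np.array([0, 0, 0, 1, 1, 2, 3, 3, 3, 3])
W = bin_weights(bin_ids)

s2 = np.sum(W ** 2)                                # s(W)^2
r_W = np.max(np.abs(np.linalg.eigvalsh(W)))        # r(W)
bins_with_two_plus = sum(np.sum(bin_ids == k) >= 2 for k in np.unique(bin_ids))
```

Each bin with $C \ge 2$ observations contributes $C/(C-1) \ge 1$ to $s(W_N)^2$, which is the source of the bound $s(W_N)^2 \ge \sum_k a_{kN}$.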
Proof of Proposition 2. Given the bounds on the error moments and on $e(x)$, $N^{2m-1} s(\Sigma^N W^s_N \Sigma^N) \xrightarrow{p} 0$ implies that $N^{2m-1} s(\tilde\Sigma^N W^s_N \tilde\Sigma^N) \xrightarrow{p} 0$. Hence, to show that $T_N \xrightarrow{p} \infty$ it suffices to show that $N^{2m-1} \tilde u' W_N \tilde u$ is bounded away from zero (in probability) as $N \to \infty$. To see this note that

$$\tilde u_i = y_i - f(x_i; \tilde\alpha) = u_i + g_N(x_i) - f(x_i; \alpha^*_N) + f(x_i; \alpha^*_N) - f(x_i; \tilde\alpha).$$

Writing $e_N$ for $g_N - f^*$ we have

$$N^{2m-1} \tilde u' W_N \tilde u = N^{2m-1} \big[ u' W_N u + 2 u' W_N (f^* - \tilde f) + 2 u' W_N e_N + (f^* - \tilde f)' W_N (f^* - \tilde f) + 2 (f^* - \tilde f)' W_N e_N + e_N' W_N e_N \big].$$

From the proof of Lemma A.2 and $N^{2m-1} s(\Sigma^N W^s_N \Sigma^N) \xrightarrow{p} 0$, we know that $N^{2m-1} u' W_N u$, $N^{2m-1} u' W_N (f^* - \tilde f)$, and $N^{2m-1} (f^* - \tilde f)' W_N (f^* - \tilde f)$ each have plim zero.

To see that $N^{2m-1} (f^* - \tilde f)' W_N e_N \xrightarrow{p} 0$ we write $(f^* - \tilde f)' = (\tilde\alpha - \alpha^*_N)' v_1' + \tilde v_2'$ as in the proof of Lemma A.2, and first note that $N^{2m-1} (\tilde\alpha - \alpha^*_N)' v_1' W_N e_N = \sqrt{N} (\tilde\alpha - \alpha^*_N)' v_1' W_N N^m e_N N^{m - 3/2}$, which has plim zero because $\sqrt{N}(\tilde\alpha - \alpha^*_N)$ has an asymptotic distribution and

$$\| v_1' W_N N^m e_N N^{m - 3/2} \|^2 \le l^2 B_1^2\, \frac{1}{N} \| N^m e_N \|^2\, N^{2m-1}.$$

$N^{2m-1} \tilde v_2' W_N e_N$ has plim zero because

$$\| \tilde v_2' W_N e_N \|\, N^{2m-1} \le N^{2m-1} \|\tilde v_2\| \|e_N\| = N^{2m-1}\, O_p(1/\sqrt{N})\, O_p(\sqrt{N} \cdot N^{-m}).$$

Finally,

$$N^{2m-1} e_N' W_N e_N = N^{2m-1} \|e_N\|^2 + N^{2m-1} e_N' (W_N e_N - e_N).$$

The first term is

$$N^{2m-1} \|e_N\|^2 = \frac{1}{N} \sum_i \big( N^m e_N(x_i) \big)^2 \to \int_D e(x)^2 p(x)\, dx.$$

The second term has magnitude at most $N^{2m-1} \|e_N\| \|W_N e_N - e_N\| \le N^{2m-1} (1 - \delta) \|e_N\|^2$ with probability approaching one. Hence,

$$\Pr\left\{ N^{2m-1} \tilde u' W \tilde u > \frac{\delta}{2} \int_D e(x)^2 p(x)\, dx \right\} \to 1$$

as desired. □

References
Azzalini, A., Bowman, A., Härdle, W., 1989. On the use of nonparametric regression for model
checking. Biometrika 76, 1–11.
Bierens, H., 1982. Consistent model specification tests. Journal of Econometrics 20, 105–134.
Bierens, H., 1984. Model specification testing for time series regression. Journal of Econometrics 26,
323–353.
Bierens, H., 1990. A consistent conditional moment test of functional form. Econometrica 58,
1443–1458.
Bierens, H., Ploberger, W., 1997. Asymptotic theory of integrated conditional moment tests.
Econometrica 65, 1129–1151.
Chevalier, J., Ellison, G., 1997. Risk-taking by mutual funds as a response to incentives. Journal of
Political Economy 105, 1167–1200.
Davidson, R.J., MacKinnon, J., 1981. Several tests for model specification in the presence of
alternative hypotheses. Econometrica 49, 781–793.
Delgado, M.A., Stengos, T., 1994. Semiparametric testing of non-nested models. Review of
Economic Studies 61, 291–303.
deJong, P., Bierens, H.J., 1994. On the limiting behavior of a Chi-square type test if the number of
conditional moments tested approaches infinity. Econometric Theory 10, 53–90.
Duncan, A., Jones, A., 1994. On the specification of labour supply models: A nonparametric
evaluation. Discussion Paper No. 94/6, University of York.
Ellison, G., Ellison, S.F., 1998. A simple framework for nonparametric specification testing. NBER
Technical Working Paper 234.
Eubank, R., Spiegelman, C., 1990. Testing the goodness of fit of a linear model via nonparametric
regression techniques. Journal of the American Statistical Association 85, 387–392.
Fan, Y., Li, Q., 1996. Consistent model misspecification tests: omitted variables and semiparametric
functional forms. Econometrica 64, 865–890.
Gozalo, P., 1993. A consistent model specification test for nonparametric estimation of regression
function models. Econometric Theory 9, 451–477.


Härdle, W., 1990. Applied Nonparametric Regression. Cambridge University Press, Cambridge.
Härdle, W., Mammen, E., 1993. Comparing nonparametric versus parametric regression fits. Annals
of Statistics 21, 1926–1947.
Hausman, J., 1978. Specification tests in econometrics. Econometrica 46, 1251–1271.
Hidalgo, J., 1992. A nonparametric test for nested and non-nested models. Mimeo, London School
of Economics.
Hong, Y., White, H., 1995. Consistent specification testing via nonparametric series regression.
Econometrica 63, 1133–1159.
Horowitz, J., Härdle, W., 1994. Testing a parametric model against a semiparametric alternative.
Econometric Theory 10, 821–848.
Jayasuriya, B., 1996. Testing for polynomial regression using nonparametric regression techniques.
Journal of the American Statistical Association 91, 1626–1631.
Loftsgaarden, D., Quesenberry, G., 1965. A nonparametric estimate of a multivariate density
function. Annals of Mathematical Statistics 36, 1049–1051.
Mikosch, T., 1991. Functional limit theorems for random quadratic forms. Stochastic Processes and
their Applications 37, 81–98.
Newey, W., 1985. Maximum likelihood specification testing and conditional moment tests.
Econometrica 53, 1047–1070.
Ramsey, J.B., 1969. Tests for specification errors in classical least squares regression analysis.
Journal of the Royal Statistical Society, Series B 31, 350–371.
Rodriguez, D., Stoker, T., 1993. Regression test of semiparametric index model specification.
Working paper 9200-6, Center for Energy and Environmental Policy.
Tauchen, G., 1985. Diagnostic testing and evaluation of maximum likelihood models. Journal of
Econometrics 30, 415–443.
Taylor, R.L., 1978. Stochastic Convergence of Weighted Sums of Random Elements in Linear
Spaces. Springer, Berlin.
Tukey, J.W., 1961. Curves as parameters and touch estimation. Proceedings of the Fourth Berkeley
Symposium, pp. 681–694.
White, H., 1987. Specification testing in dynamic models. Advances in Econometrics Fifth World
Congress, vol. I. Cambridge University Press, Cambridge, pp. 1–58.
White, H., Hong, Y., 1993. M testing using finite and infinite dimensional parametric estimators.
University of California at San Diego, Department of Economics Working paper 93-01.
Wooldridge, J., 1992. A test for functional form against nonparametric alternatives. Econometric
Theory 8, 452–475.
Yatchew, A., 1992. Nonparametric regression tests based on an infinite dimensional least squares
procedure. Econometric Theory 8, 435–451.
Zheng, J., 1996. A consistent test of functional form via nonparametric estimation techniques.
Journal of Econometrics 75, 263–289.