JURNAL ILMIAH MATEMATIKA DAN TERAPAN (JIMT)

MODEL-CHECKS FOR HOMOSCHEDASTIC SPATIAL LINEAR REGRESSION MODEL BASED
ON BOOTSTRAP METHOD

W. Somayasa
Mathematics Department of Haluoleo University, Kendari
email: [email protected]
Abstract

In this paper we propose an Efron residual-based bootstrap approximation method for asymptotic model-checks of homoschedastic spatial linear regression models. It is shown that, under some regularity conditions on the known regression functions, the bootstrap version of the sequence of least squares residual partial sums processes converges in distribution to a centred Gaussian process having sample paths in the space of continuous functions on $I := [0,1] \times [0,1]$. Thus, the Efron residual-based bootstrap is a consistent approximation in the usual sense. The finite sample performance of the bootstrap level-$\alpha$ Kolmogorov–Smirnov (KS) type test is also investigated by means of Monte Carlo simulation.

Key words: residual-based bootstrap, asymptotic model-check, homoschedastic spatial linear regression models, partial sums, Gaussian process.

I. Introduction

Practically, the correctness of an assumed linear model is usually evaluated by analysing the cumulative sums (CUSUM) of the least squares residuals. To this end a huge amount of literature is available; see among others MacNeill (1978) or Bischoff and Miller (2000) for the one-dimensional context. In the spatial context we refer the reader to MacNeill and Jandhyala (1993) and Xie and MacNeill (2006).
To see the above-mentioned problem in more detail, let us consider an experiment performed on an experimental region $I := [0,1] \times [0,1]$ under $n^2$ experimental conditions taken from a regular lattice, given by
$$\Xi_n := \{(\ell/n, k/n) : 1 \le \ell, k \le n\} \subset I, \quad n \ge 1. \tag{1}$$

Accordingly, we put together the observations carried out in $\Xi_n$ in an $n \times n$ matrix $Y_{n\times n} := (Y_{\ell k})_{k=1,\ell=1}^{n,n} \in \mathbb{R}^{n\times n}$, where for $1 \le \ell, k \le n$, $Y_{\ell k}$ is the observation at the point $(\ell/n, k/n)$, and $\mathbb{R}^{n\times n}$ is the space of $n \times n$ real matrices furnished with the Euclidean inner product $\langle A, B \rangle_{n\times n} := \mathrm{trace}(A^t B)$, for $A, B \in \mathbb{R}^{n\times n}$, and the corresponding norm given by $\|A\|_{n\times n} := \sqrt{\mathrm{trace}(A^t A)}$. It is worth noting that for our model we take $I$ as the experimental region without loss of generality instead of any compact subset of $\mathbb{R}^2$. For any $f : I \to \mathbb{R}$, let $f(\Xi_n) := (f(\ell/n, k/n))_{k=1,\ell=1}^{n,n} \in \mathbb{R}^{n\times n}$; then we have
$$Y_{n\times n} = g(\Xi_n) + E_{n\times n},$$
where $g$ is a true but unknown real-valued regression function having bounded variation on $I$, and $E_{n\times n} := (\varepsilon_{\ell k})_{k=1,\ell=1}^{n,n}$ is the corresponding matrix of independent and identically distributed random errors having mean zero and variance $\sigma^2 \in (0,\infty)$, for $1 \le \ell, k \le n$. We are interested in testing the hypotheses

$$H_0 : g(\Xi_n) \in W_n \quad \text{vs.} \quad K : g(\Xi_n) \notin W_n, \quad n \ge 1, \tag{2}$$
where $W_n := [f_1(\Xi_n), \ldots, f_p(\Xi_n)]$ is a subspace of $\mathbb{R}^{n\times n}$ generated by $f_1(\Xi_n), \ldots, f_p(\Xi_n)$.

Under $H_0$, the matrix $R_{n\times n}$ of least squares residuals is given by
$$R_{n\times n} := Y_{n\times n} - \mathrm{pr}_{W_n} Y_{n\times n} = \mathrm{pr}_{W_n^{\perp}} E_{n\times n}$$
(Stapleton, 1995), where $\mathrm{pr}_{W_n}$ is the orthogonal projector onto $W_n$ and $W_n^{\perp}$ is the orthogonal complement of $W_n$. Without loss of generality we assume throughout this work that $f_1(\Xi_n), \ldots, f_p(\Xi_n)$ is an orthogonal basis of $W_n$. Hence, by elementary linear algebra we get under $H_0$,
$$R_{n\times n} = E_{n\times n} - \sum_{i=1}^{p} \frac{\langle f_i(\Xi_n), E_{n\times n} \rangle_{n\times n}}{\| f_i(\Xi_n) \|_{n\times n}^{2}} \, f_i(\Xi_n). \tag{3}$$
Equivalently, by the $\mathrm{vec}$ operator we have
$$\mathrm{vec}(R_{n\times n}) = \mathrm{vec}(E_{n\times n}) - X_n (X_n^t X_n)^{-1} X_n^t \, \mathrm{vec}(E_{n\times n}),$$
where $X_n$ is the design matrix of the model under $H_0$, i.e., an $n^2 \times p$ matrix whose $k$-th column is given by the $n^2$-dimensional vector $\mathrm{vec}(f_k(\Xi_n))$, $k = 1, \ldots, p$. A consistent estimator for $\sigma^2$ is given by $\hat{\sigma}_n^2 := \| R_{n\times n} \|_{n\times n}^{2} / (n^2 - p)$, in the sense that $\hat{\sigma}_n^2$ converges in probability to $\sigma^2$ as $n \to \infty$.

Suppose that the regression functions $f_1, \ldots, f_p$ are linearly independent and square integrable on $I$ with respect to the Lebesgue measure $\lambda_I$. Then under $H_0$, MacNeill and Jandhyala (1993) and Xie and MacNeill (2006) showed that
$$\frac{1}{n\sigma} T_n(\mathrm{vec}(R_{n\times n}))(\cdot) \xrightarrow{\;D\;} B_f(\cdot) \ \text{in } C(I), \ \text{as } n \to \infty, \tag{4}$$
where for $(t,s) \in I$,
$$B_f(t,s) := B_2(t,s) - \int_{[0,t]\times[0,s]} f^t \, d\lambda_I \; G^{-1} \, (R)\!\int_I f \, dB_2,$$
with $f := (f_1, \ldots, f_p)^t$, and $T_n : \mathbb{R}^{n^2} \to C(I)$ is the partial sums operator defined in Park (1971). Here $G := \bigl( \int_I f_k f_\ell \, d\lambda_I \bigr)_{k=1,\ell=1}^{p,p} \in \mathbb{R}^{p\times p}$, $B_2$ is the standard Brownian sheet having sample paths in $C(I)$, and $(R)\!\int$ stands for the Riemann–Stieltjes integral.
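To make the matrix construction above concrete, the following sketch (the function names are mine, not the paper's) builds the design matrix $X_n$ for the first order model $f_1 = 1$, $f_2 = t$, $f_3 = s$ on the lattice $\Xi_n$, computes the residual vector $\mathrm{vec}(R_{n\times n})$, and forms the variance estimator $\hat{\sigma}_n^2$:

```python
import numpy as np

# Illustrative sketch (function names are mine, not from the paper): the
# design matrix X_n whose columns are vec(f_i(Xi_n)), the least squares
# residual vector vec(R) = vec(E) - X (X'X)^{-1} X' vec(E), and the
# variance estimator sigma_hat^2 = ||R||^2 / (n^2 - p).

def design_matrix(n):
    """Columns vec(f_i(Xi_n)) for f1 = 1, f2 = t, f3 = s on the lattice (l/n, k/n)."""
    grid = np.arange(1, n + 1) / n
    T, S = np.meshgrid(grid, grid, indexing="ij")   # T[l-1, k-1] = l/n, S[l-1, k-1] = k/n
    return np.column_stack([np.ones(n * n), T.ravel(), S.ravel()])

def ls_residuals(y, X):
    """vec(R) = y - X (X'X)^{-1} X' y, computed via a least squares solve."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return y - X @ beta

rng = np.random.default_rng(0)
n, p = 20, 3
X = design_matrix(n)
e = rng.normal(size=n * n)          # iid errors with mean 0 (here sigma^2 = 1)
r = ls_residuals(e, X)              # under H0 this is the projection of the errors
sigma2_hat = (r @ r) / (n * n - p)  # consistent estimator of sigma^2
```

The residual vector is orthogonal to the columns of $X_n$, which is a quick numerical check that the projection was computed correctly.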


JIMT, Vol. 7, No.2, Nopember 2010 : 51 – 59

If $t_n$ is a realisation of the Kolmogorov–Smirnov (KS) statistic
$$KS_{n,f} := \max_{0 \le \ell, k \le n} \left| \frac{1}{n\hat{\sigma}_n} T_n(\mathrm{vec}(R_{n\times n}))(\ell/n, k/n) \right|,$$
then a level-$\alpha$ test of KS type based on the residual partial sums process for testing (2) will reject $H_0$ if and only if $t_n \ge c_{1-\alpha}$, where $c_{1-\alpha}$ is a constant that satisfies the equation
$$P_{H_0}\!\left( \max_{0 \le \ell, k \le n} \left| \frac{1}{n\hat{\sigma}_n} T_n(\mathrm{vec}(R_{n\times n}))(\ell/n, k/n) \right| \ge c_{1-\alpha} \right) = \alpha.$$
Hence, by the continuous mapping theorem (Billingsley, 1968, pp. 29–30), $c_{1-\alpha}$ can be approximated by means of the distribution of $\sup_{0 \le t, s \le 1} |B_f(t,s)|$.
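At the lattice points, $T_n(\mathrm{vec}(R_{n\times n}))(\ell/n, k/n)$ is simply the sum of the residuals over the first $\ell$ rows and $k$ columns, so the KS statistic reduces to a double cumulative sum; the sketch below (names are mine) illustrates this:

```python
import numpy as np

# Sketch of the KS type statistic at the lattice points (names are mine):
# the partial sum at (l/n, k/n) is the sum of the residual matrix over the
# first l rows and k columns, i.e. a double cumulative sum.

def ks_statistic(R, sigma):
    """max over the lattice of |partial sum| / (n * sigma) for an n x n matrix R."""
    n = R.shape[0]
    partial = np.cumsum(np.cumsum(R, axis=0), axis=1)
    return np.abs(partial).max() / (n * sigma)

rng = np.random.default_rng(1)
R = rng.normal(size=(30, 30))
t_n = ks_statistic(R, sigma=1.0)
```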

As is apparent from (4), the limiting distribution of the KS type statistic under $H_0$ is not mathematically tractable, because it depends on the structure of the design and on the regression functions. Therefore the application of the preceding test procedure is limited. It is the purpose of the present paper to investigate the performance of a bootstrap test of (2). Although the application of the bootstrap appears naturally in the present context, to the knowledge of the author this topic has not found much attention in the literature. One purpose of the present paper is to demonstrate that the Efron residual-based bootstrap (Shao and Tu, 1995) provides a simple and reliable alternative for constructing the asymptotic critical region of such a KS type test. This will be discussed in Section 2. In Section 3 we develop a Monte Carlo simulation to investigate the finite sample characteristics of the bootstrap test. An application of the developed method to real data will be presented in Section 4. Finally, we close the paper in Section 5 by presenting conclusions and some remarks for future research.
II. Bootstrap Approximation

Let $r_{\ell k}$ be the component of the residual matrix in the $k$-th row and $\ell$-th column computed based on the original observations $Y_{n\times n}$ under $H_0$, $1 \le \ell, k \le n$. We define an $n \times n$ matrix of centred residuals
$$\hat{R}_{n\times n} := (r_{\ell k} - \bar{r})_{k=1,\ell=1}^{n,n}, \quad \text{where } \bar{r} := \frac{1}{n^2} \sum_{k=1}^{n} \sum_{\ell=1}^{n} r_{\ell k}.$$
Further, let $F_{\hat{R}}$ be the empirical distribution function of the components of $\hat{R}_{n\times n}$, putting mass $1/n^2$ on $r_{\ell k} - \bar{r}$, $1 \le \ell, k \le n$; i.e., for $x \in \mathbb{R}$,
$$F_{\hat{R}}(x) := \frac{1}{n^2} \sum_{k=1}^{n} \sum_{\ell=1}^{n} 1_{(-\infty,\, x]}(r_{\ell k} - \bar{r}),$$
where $1_A$ is the indicator function of $A$. The Efron residual-based bootstrap defines the bootstrap observations by
$$Y^{*}_{n\times n} := (Y^{*}_{\ell k})_{k=1,\ell=1}^{n,n} := \hat{g}(\Xi_n) + E^{*}_{n\times n}, \tag{5}$$
where $E^{*}_{n\times n} := (\varepsilon^{*}_{\ell k})_{k=1,\ell=1}^{n,n}$ is a random matrix whose components are a random sample from $F_{\hat{R}}$, and
$$\hat{g}(\Xi_n) := \sum_{i=1}^{p} \frac{\langle f_i(\Xi_n), Y_{n\times n} \rangle_{n\times n}}{\| f_i(\Xi_n) \|_{n\times n}^{2}} \, f_i(\Xi_n)$$
is the ordinary least squares estimator of $g(\Xi_n)$ under $H_0$. Let $E^{*}$ and $\mathrm{var}^{*}$ be the expectation and variance operators conditioned on $F_{\hat{R}}$. Then by the construction it is clear that $E^{*}(\varepsilon^{*}_{\ell k}) = 0$ and
$$\hat{\sigma}_n^{*2} := \mathrm{var}^{*}(\varepsilon^{*}_{\ell k}) = \frac{n^2 - p}{n^2} \, \hat{\sigma}_n^{2} - (\bar{r})^2,$$
which clearly converges in probability to $\sigma^2$ by the weak law of large numbers and Slutsky's theorem.
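A minimal sketch of this resampling step (the helper name is mine): sampling with replacement from the centred residuals is exactly sampling from $F_{\hat{R}}$, and $\mathrm{var}^{*}(\varepsilon^{*}_{\ell k})$ is the mean of the squared centred residuals.

```python
import numpy as np

# Minimal sketch of Efron's resampling step (the helper name is mine):
# drawing n^2 values with replacement from the centred residuals is exactly
# sampling from F_Rhat; var*(eps*) is the mean of the squared centred residuals.

def bootstrap_errors(R, rng):
    """Draw an n x n matrix of bootstrap errors from the centred residuals of R."""
    centred = (R - R.mean()).ravel()        # r_lk - rbar, each with mass 1/n^2
    return rng.choice(centred, size=R.shape, replace=True)

rng = np.random.default_rng(2)
R = rng.normal(size=(25, 25))
E_star = bootstrap_errors(R, rng)
sigma2_star = ((R - R.mean()) ** 2).mean()  # var*(eps*) under F_Rhat
```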
The bootstrap analogue of the residual partial sums process is given by
$$\frac{1}{n\hat{\sigma}_n^{*}} T_n(\mathrm{vec}(R^{*}_{n\times n}))(\cdot) = \frac{1}{n\hat{\sigma}_n^{*}} T_n\bigl( \mathrm{vec}(E^{*}_{n\times n}) - X_n (X_n^t X_n)^{-1} X_n^t \, \mathrm{vec}(E^{*}_{n\times n}) \bigr)(\cdot). \tag{6}$$
Hence, the bootstrap version of the KS type statistic can be represented by
$$KS^{*}_{n,f} := \max_{0 \le \ell, k \le n} \left| \frac{1}{n\hat{\sigma}_n^{*}} T_n(\mathrm{vec}(R^{*}_{n\times n}))(\ell/n, k/n) \right|. \tag{7}$$

Theorem 1
Suppose that the regression functions $f_1, \ldots, f_p$ are linearly independent in the space of square integrable functions with respect to $\lambda_I$, denoted by $L_2(\lambda_I)$, continuous, and have bounded variation on $I$. Then under $H_0$ it holds that
$$\frac{1}{n\hat{\sigma}_n^{*}} T_n(\mathrm{vec}(R^{*}_{n\times n}))(\cdot) \xrightarrow{\;D\;} B_f(\cdot) \ \text{in } C(I), \ \text{as } n \to \infty.$$

Proof
For any rectangle $A := [t_1, t_2] \times [s_1, s_2] \subseteq I$, $t_1 \le t_2$, $s_1 \le s_2$, and any real-valued function $f$ on $I$, the increment of $f$ over $A$ is defined by
$$\Delta_A f := f(t_2, s_2) - f(t_2, s_1) - f(t_1, s_2) + f(t_1, s_1).$$
Further, for a fixed natural number $n$, let us define operators $U_n : C(I) \to \mathbb{R}^{n^2}$ and $O_n : C(I) \to C(I)$, such that for any $u \in C(I)$ and $(t,s) \in I$,
$$U_n(u) := \mathrm{vec}\bigl( (\Delta_{I_{\ell k}} u)_{k=1,\ell=1}^{n,n} \bigr), \tag{8}$$
$$O_n(u)(t,s) := T_n\bigl( U_n(u) - X_n (X_n^t X_n)^{-1} X_n^t \, U_n(u) \bigr)(t,s), \tag{9}$$
where $I_{\ell k} := [(\ell-1)/n, \ell/n] \times [(k-1)/n, k/n]$, $1 \le \ell, k \le n$. It is obvious that $O_n$ is linear by the linearity of $T_n$. Moreover, since for every $a \in \mathbb{R}^{n^2}$, $U_n(T_n(a)) = a$, it follows that $O_n$ is idempotent. It can also be shown, using the expression
$$O_n(u)(t,s) = T_n(U_n(u))(t,s) - \sum_{i=1}^{p} \frac{\langle \mathrm{vec}(f_i(\Xi_n)), U_n(u) \rangle}{\| \mathrm{vec}(f_i(\Xi_n)) \|^2} \, T_n(\mathrm{vec}(f_i(\Xi_n)))(t,s),$$
and the definition of the norm of an operator on a Banach space (Conway, 1985, p. 70), that
$$\| O_n \| \le 1 + 4 \sum_{i=1}^{p} \frac{\| f_i \|_{\infty}^{2}}{\inf_{0 \le t, s \le 1} f_i(t,s)^2} < \infty.$$


Thus, $O_n$ is uniformly bounded on $C(I)$. This directly implies the continuity of $O_n$ on $C(I)$, where $\| \cdot \|_{\infty}$ is the supremum norm. Let us define an operator $O : C(I) \to C(I)$, such that for every $u \in C(I)$ and $(t,s) \in I$,
$$O(u)(t,s) := u(t,s) - \int_{[0,t]\times[0,s]} f^t \, d\lambda_I \; G^{-1} \, (R)\!\int_I f \, du.$$

The right-hand side of (9) can also be written as
$$O_n(u)(t,s) = T_n(U_n(u))(t,s) - \frac{1}{n^2} \mathbf{1}_{\lfloor nt \rfloor \lfloor ns \rfloor}^{t} X_n \left( \frac{1}{n^2} X_n^t X_n \right)^{-1} X_n^t \, U_n(u),$$
where $\mathbf{1}_{\lfloor nt \rfloor \lfloor ns \rfloor} \in \mathbb{R}^{n^2}$ is the vector whose first $\lfloor nt \rfloor \lfloor ns \rfloor$ components are one and whose remaining components are zero. By the definition and by the continuity of $u$, we have
$$T_n(U_n(u))(t,s) \to u(t,s), \quad (t,s) \in I. \tag{10}$$
Furthermore, since for $i = 1, \ldots, p$, $f_i$ is bounded on $I$, it holds componentwise that
$$\frac{1}{n^2} \mathbf{1}_{\lfloor nt \rfloor \lfloor ns \rfloor}^{t} X_n \to \left( \int_{[0,t]\times[0,s]} f_1 \, d\lambda_I, \ \ldots, \ \int_{[0,t]\times[0,s]} f_p \, d\lambda_I \right). \tag{11}$$
The componentwise convergence
$$\left( \frac{1}{n^2} X_n^t X_n \right)^{-1} \to \left( \left( \int_I f_i f_j \, d\lambda_I \right)_{i=1,j=1}^{p,p} \right)^{-1} = G^{-1} \tag{12}$$
follows from the fact that the mapping $B \mapsto B^{-1}$ is continuous on the space of invertible matrices. Further, since for $i = 1, \ldots, p$, $f_i$ has bounded variation and $u$ is continuous on $I$, the Riemann–Stieltjes integral of $f_i$ with respect to $u$ exists for all $i$. Consequently, we have the componentwise convergence
$$X_n^t \, U_n(u) \to \left( (R)\!\int_I f_1 \, du, \ \ldots, \ (R)\!\int_I f_p \, du \right)^{t}. \tag{13}$$
Thus, from (10), (11), (12) and (13), for every $u \in C(I)$ it holds that $\| O_n(u) - O(u) \|_{\infty} \to 0$ as $n \to \infty$.

Next let $(u_n)_{n \ge 1}$ be any sequence in $C(I)$ such that $\| u_n - u \|_{\infty} \to 0$. Then, since $O_n$ is uniformly bounded on $C(I)$, we finally have
$$\| O_n(u_n) - O(u) \|_{\infty} \le \| O_n \| \, \| u_n - u \|_{\infty} + \| O_n(u) - O(u) \|_{\infty} \to 0.$$

The last result shows that the subset of $C(I)$ consisting of those $u$ for which there exists a sequence $(u_n)_{n \ge 1}$ converging to $u$ in $C(I)$ while $(O_n(u_n))_{n \ge 1}$ does not converge to $O(u)$ is empty. Hence, by Theorem 5.5 in Billingsley (1968) and Theorem 4 in Park (1971), we get
$$\frac{1}{n\hat{\sigma}_n^{*}} T_n(\mathrm{vec}(R^{*}_{n\times n})) = O_n\!\left( \frac{1}{n\hat{\sigma}_n^{*}} T_n(\mathrm{vec}(E^{*}_{n\times n})) \right) \xrightarrow{\;D\;} O(B_2), \quad \text{as } n \to \infty.$$
The proof of the theorem is complete since $B_2(t,s) = 0$ almost surely if $t = 0$ or $s = 0$.
Corollary 1
By the continuous mapping theorem, under $H_0$ we have
$$KS^{*}_{n,f} \xrightarrow{\;D\;} \sup_{0 \le t, s \le 1} | B_f(t,s) |, \quad \text{as } n \to \infty.$$
Hence, for $\alpha \in (0,1)$, the constant $c_{1-\alpha}$ defined above can now be approximated by the constant $c^{*}_{1-\alpha}$ that satisfies the equation $P^{*}( KS^{*}_{n,f} \ge c^{*}_{1-\alpha} ) = \alpha$. If $t_n$ is a realisation of $KS_{n,f}$, then an asymptotic level-$\alpha$ KS type test will reject $H_0$ if $t_n \ge c^{*}_{1-\alpha}$. We will illustrate the finite sample properties of this approach by means of a simulation study in Section 3.
Remark 3.
The extension of Theorem 1 to the case of a $p \ge 3$ dimensional unit cube and an experimental design given by an $n_1 \times n_2 \times \cdots \times n_p$ regular lattice with $n_i \ne n_j$ for $i \ne j$ is straightforward.

III. Simulation

We develop a Monte Carlo simulation for constructing the finite sample critical region of the bootstrap level-$\alpha$ test of the hypotheses
$$H_0 : g \in W \quad \text{vs.} \quad K : g \in W', \tag{15}$$
where $W := [f_1, f_2, f_3]$ and $W' := [f_1, f_2, f_3, f_4, f_5, f_6]$, with $f_1, \ldots, f_6 : I \to \mathbb{R}$ defined by $f_1(t,s) := 1$, $f_2(t,s) := t$, $f_3(t,s) := s$, $f_4(t,s) := t^2$, $f_5(t,s) := ts$, and $f_6(t,s) := s^2$. The simulation is developed according to the following algorithm.
Begin Algorithm
Step 1: Generate an $n \times n$ matrix of pseudo-random errors $E_{n\times n}$ from a distribution having mean zero and variance $\sigma^2$.
Step 2: Compute the least squares residuals by the equation
$$\mathrm{vec}(R_{n\times n}) = \mathrm{vec}(E_{n\times n}) - X_n (X_n^t X_n)^{-1} X_n^t \, \mathrm{vec}(E_{n\times n}).$$
Step 3: Generate an $n \times n$ matrix of bootstrap errors $E^{*}_{n\times n}$ by sampling with replacement from the centred residuals.
Step 4: Compute the bootstrap residuals by the equation
$$\mathrm{vec}(R^{*}_{n\times n}) = \mathrm{vec}(E^{*}_{n\times n}) - X_n (X_n^t X_n)^{-1} X_n^t \, \mathrm{vec}(E^{*}_{n\times n}).$$
Step 5: Compute $\hat{\sigma}_n^{*}$ and $KS^{*}_{n,f}$.
Step 6: Repeat Step 1 to Step 5 $M$ times.
Step 7: Compute $c^{*}_{1-\alpha}$: sort the $M$ values of $KS^{*}_{n,f}$ obtained in Step 5 in ascending order, i.e., $KS^{*(1)}_{n,f} \le KS^{*(2)}_{n,f} \le \cdots \le KS^{*(M)}_{n,f}$; then
$$c^{*}_{1-\alpha} = \begin{cases} KS^{*(M(1-\alpha))}_{n,f}, & \text{if } M(1-\alpha) \in \mathbb{N}, \\ KS^{*(\lfloor M(1-\alpha) \rfloor + 1)}_{n,f}, & \text{if } M(1-\alpha) \notin \mathbb{N}, \end{cases}$$
where $\mathbb{N}$ is the set of natural numbers.
End Algorithm
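Steps 1 to 7 can be sketched end to end as follows (a hedged illustration: the helper names and the small $n$ and $M$ are my choices to keep the run fast, not the paper's settings):

```python
import numpy as np

# End-to-end sketch of the algorithm above (helper names and the small n, M
# are mine, chosen to keep the example fast). The first order design f1 = 1,
# f2 = t, f3 = s on the regular lattice is used for the H0 model.

def design_matrix(n):
    grid = np.arange(1, n + 1) / n
    T, S = np.meshgrid(grid, grid, indexing="ij")
    return np.column_stack([np.ones(n * n), T.ravel(), S.ravel()])

def bootstrap_critical_value(n, M, alpha, rng):
    X = design_matrix(n)
    H = X @ np.linalg.solve(X.T @ X, X.T)                   # hat matrix of the H0 model
    ks_star = np.empty(M)
    for m in range(M):
        e = rng.normal(0.0, np.sqrt(0.5), size=n * n)       # Step 1
        r = e - H @ e                                       # Step 2
        centred = r - r.mean()
        sigma_star = np.sqrt((centred ** 2).mean())         # sd of F_Rhat
        e_star = rng.choice(centred, size=n * n, replace=True)  # Step 3
        r_star = e_star - H @ e_star                        # Step 4
        partial = np.cumsum(np.cumsum(r_star.reshape(n, n), axis=0), axis=1)
        ks_star[m] = np.abs(partial).max() / (n * sigma_star)   # Step 5
    ks_star.sort()                                          # Step 7
    idx = int(np.ceil(M * (1.0 - alpha)))   # = M(1-a) if integer, floor + 1 otherwise
    return ks_star[idx - 1]

rng = np.random.default_rng(3)
c_star = bootstrap_critical_value(n=15, M=200, alpha=0.05, rng=rng)
```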

 n   errors   α=0.010  α=0.025  α=0.050  α=0.100  α=0.150  α=0.200
 30  df0      0.965    0.893    0.844    0.776    0.746    0.717
     df1      1.183    1.119    1.057    0.994    0.954    0.928
     df2      1.381    1.287    1.168    1.094    1.047    1.013
     df3      1.327    1.240    1.150    1.092    1.026    0.970
 40  df0      0.934    0.868    0.816    0.757    0.730    0.697
     df1      1.490    1.331    1.214    1.129    1.056    1.017
     df2      1.304    1.232    1.184    1.093    1.030    0.989
     df3      1.133    1.085    1.052    0.995    0.945    0.919
 50  df0      0.918    0.893    0.853    0.776    0.736    0.717
     df1      1.316    1.234    1.165    1.081    1.023    0.993
     df2      1.291    1.175    1.147    1.086    1.045    1.000
     df3      1.336    1.220    1.173    1.098    1.047    1.014

Table 1. Simulated critical values $c^{*}_{1-\alpha}$ of the Kolmogorov–Smirnov bootstrap test ($M = 500$).
For the simulation, we generate the error variables as i.i.d. random variables having mean zero and variance 0.5. Four cases are considered, namely $\varepsilon_{\ell k} \sim N(0, 0.5)$ and $\varepsilon_{\ell k} \sim \chi^2_\nu$, $\nu = 1, 2, 3$, for $1 \le \ell, k \le n$. The simulation results are presented in Table 1 for the sample sizes $n = 30, 40$, and $50$ and levels $\alpha = 1\%, 2.5\%, 5\%, 10\%, 15\%$, and $20\%$, with 500 replications. The entries in the rows named $df_0$ are simulation results for which the errors are generated from $N(0, 0.5)$, whereas those in the rows named $df_\nu$ are generated from a chi-square distribution having $\nu = 1, 2, 3$ degrees of freedom, centred and scaled such that $E(\varepsilon_{\ell k}) = 0$ and $\mathrm{var}(\varepsilon_{\ell k}) = 0.5$, for $1 \le \ell, k \le n$. We note that for the simulation the generation of the random errors is restricted neither to a specific family of distributions nor to distributions having variance 0.5 only. The variance as well as the family of the distribution may vary as long as they satisfy the conditions specified in Theorem 1.
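The centred and scaled chi-square cases can be generated as follows (a small sketch; the helper name is mine): since a $\chi^2_\nu$ variable has mean $\nu$ and variance $2\nu$, the transformation $(X - \nu)\sqrt{0.5/(2\nu)}$ has mean 0 and variance 0.5, as required for the rows $df_1$ to $df_3$.

```python
import numpy as np

# Sketch of the chi-square error cases (helper name is mine): chi2(nu) has
# mean nu and variance 2*nu, so (X - nu) * sqrt(0.5 / (2*nu)) has mean 0
# and variance 0.5.

def chi2_errors(nu, size, rng):
    x = rng.chisquare(nu, size=size)
    return (x - nu) * np.sqrt(0.5 / (2.0 * nu))

rng = np.random.default_rng(5)
e = chi2_errors(3, size=100_000, rng=rng)
```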
IV. Example

As an example we consider the wheat-yield data (Mercer and Hall's data) presented in Cressie (1993), pp. 454–455. The data are yields of grain (in pounds) observed over a 25 x 20 lattice of plots, having 20 rows of plots running from east to west and 25 columns of plots running from north to south. The experiment consists of giving the 500 plots the same treatment (presumably fertilizer, water, etc.), from which it can be inferred that the data are a realization of 500 independent random variables. As reported in Cressie (1993), the plots are assumed to be equally spaced, with the dimension of each plot being 10.82 ft by 8.05 ft. Figure 2 presents the perspective plot of the data.
Observing Figure 2, we postulate under $H_0$ a first order polynomial model as in Hypothesis (16). Since the variance is unknown, we use a consistent estimator $\hat{\sigma}_{nm}^2$. Calculated under $H_0$, the data give $\hat{\sigma}_{nm}^2 = 0.1898$, where
$$\hat{\sigma}_{nm}^2 := \frac{1}{nm - 3} \left\| Y_{n\times m} - \sum_{i=1}^{3} \frac{\langle f_i(\Xi_{nm}), Y_{n\times m} \rangle_{n\times m}}{\| f_i(\Xi_{nm}) \|_{n\times m}^{2}} \, f_i(\Xi_{nm}) \right\|_{n\times m}^{2},$$
with $\Xi_{nm} := \{ (\ell/25, k/20) : 1 \le \ell \le 25, \ 1 \le k \le 20 \}$. The functions $f_1$, $f_2$, and $f_3$ are defined as in Section 3. After computation, we get the value of the KS type statistic 1.5919, and the bootstrap approximation of the p-value of the test computed by simulation is given by $\hat{p} = 0.0001$. Thus it can be concluded that $H_0$ is rejected for all suitable values of $\alpha$.
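The computation in this section can be reproduced in outline as follows. Since the Mercer–Hall yields are not reprinted here, the sketch runs on synthetic stand-in data over the same 25 x 20 lattice, and the $\sqrt{nm}$ normalisation for the rectangular case is my assumption (the natural analogue of the factor $n$ in the square case); all names are mine. The bootstrap p-value is the fraction of resampled $KS^{*}$ values at least as large as the observed statistic.

```python
import numpy as np

# Hedged sketch of the Section 4 computation on a 25 x 20 lattice with a
# first order polynomial model under H0. The data below are synthetic
# stand-ins (the actual wheat yields are not reproduced here), and the
# sqrt(n*m) normalisation is my assumed analogue of n from the square case.

def design_matrix_rect(n, m):
    """Columns vec(f_i) for f1 = 1, f2 = t, f3 = s on the lattice (l/n, k/m)."""
    T, S = np.meshgrid(np.arange(1, n + 1) / n, np.arange(1, m + 1) / m, indexing="ij")
    return np.column_stack([np.ones(n * m), T.ravel(), S.ravel()])

def ks_rect(resid, n, m, sigma):
    partial = np.cumsum(np.cumsum(resid.reshape(n, m), axis=0), axis=1)
    return np.abs(partial).max() / (np.sqrt(n * m) * sigma)

rng = np.random.default_rng(4)
n, m = 25, 20
X = design_matrix_rect(n, m)
H = X @ np.linalg.solve(X.T @ X, X.T)
y = 1.0 + rng.normal(0.0, np.sqrt(0.1898), size=n * m)  # synthetic stand-in yields
r = y - H @ y
sigma2_nm = (r @ r) / (n * m - 3)                       # analogue of the estimator above
t_obs = ks_rect(r, n, m, np.sqrt(sigma2_nm))

B = 300
centred = r - r.mean()
sigma_star = np.sqrt((centred ** 2).mean())
ks_star = np.empty(B)
for b in range(B):
    e_star = rng.choice(centred, size=n * m, replace=True)
    ks_star[b] = ks_rect(e_star - H @ e_star, n, m, sigma_star)
p_hat = (ks_star >= t_obs).mean()                       # bootstrap p-value
```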

Figure 2. The perspective plot of Mercer and Hall’s data
V. Concluding Remark

The Efron residual-based bootstrap approximation to the KS type statistic based on the least squares residual partial sums processes of a homoschedastic spatial linear regression model is consistent. In a forthcoming paper we shall investigate the application of the Efron as well as the wild bootstrap in the case of heteroschedastic spatial linear regression models.

The experimental design considered so far is restricted to the regular lattice on the unit square $I$, since there the prerequisite conditions of the theory are well satisfied. In practice this situation is sometimes not reasonable; therefore we need to develop the theory in a more general setting.
VI. References

1. Billingsley, P., 1968, Convergence of Probability Measures, John Wiley & Sons, Inc., New York, Singapore, Toronto.
2. Bischoff, W. and Miller, F., 2000, Asymptotically Optimal Tests and Optimal Designs for Testing the Mean in Regression Models with Application to Change-Point Problems, Ann. Inst. Statist. Math., Vol. 52, 658–679.
3. Conway, J.B., 1985, A Course in Functional Analysis, Springer-Verlag New York Inc., New York.
4. Cressie, N.A.C., 1993, Statistics for Spatial Data, Revised Edition, John Wiley & Sons, Inc., New York, Singapore, Toronto.
5. MacNeill, I.B., 1978, Properties of Sequences of Partial Sums of Polynomial Regression Residuals with Applications to Tests for Change of Regression at Unknown Times, Ann. Statist., Vol. 6, 422–433.
6. MacNeill, I.B. and Jandhyala, V.K., 1993, Change Point Methods for Spatial Data, in Multivariate Environmental Statistics, eds. G.P. Patil and C.R. Rao, Elsevier Science Publishers B.V., 298–306.
7. Park, W.J., 1971, Weak Convergence of Probability Measures on the Function Space $C([0,1]^2)$, J. Multivariate Anal., Vol. 1, 433–444.
8. Shao, J. and Tu, D., 1995, The Jackknife and Bootstrap, Springer-Verlag, New York.
9. Stapleton, J.H., 1995, Linear Statistical Models, John Wiley & Sons, Inc., New York, Toronto, Singapore.
10. Xie, L. and MacNeill, I.B., 2006, Spatial Residual Processes and Boundary Detection, South African Statist. J., Vol. 40, No. 1, 33–53.
