D.T. Price et al. Agricultural and Forest Meteorology 101 2000 81–94 83
sonal communication, 1999, Canadian Forest Service, Edmonton.
2. Methods
2.1. GIDS
GIDS relies on multiple linear regression (MLR) analysis of data from a set of nearby stations to estimate the local gradients for each climate variable, treating latitude, longitude and elevation as independent variables. The climate value for a target point in the neighbourhood is then predicted from the same station data, using the MLR coefficients to correct for the differences from each station’s position. The contributions of each station to the final estimate are inverse-weighted by their squared distances to the target point, i.e., GIDS assumes that climate data are spatially autocorrelated. Clearly, there are some important caveats to this assumption, particularly in mountainous regions where topography tends to reduce spatial correlations (at least when considering two-dimensional space) and thus affects the predictability of both temperature and precipitation.
Following Nalder and Wein (1998), computer programs were written to perform MLRs for each monthly climate variable at the location of every test station or grid point, using routines for the matrix solution of linear equations from Press et al. (1986). The solutions yielded sets of coefficients $C_X$, $C_Y$ and $C_Z$, representing the observed gradients of variable $V$ in response to independent variables $X$, $Y$, $Z$, denoting longitude, latitude and elevation (asl), respectively. To account for the effect of possible non-linearities associated with convergence of the meridians at high latitudes, differences in spherical coordinates between the target location and the neighbouring stations were also mapped on to planar distance coordinates using an Azimuthal Equidistant projection algorithm given in Snyder (1987). In practice, this transformation was found to have negligible impacts on the average $R^2$ of the MLRs and was not used further.
The MLR coefficients were then used to predict each climate variable, $V_p$, at each target climate station, from the coordinates and elevation of the neighbouring stations, using
$$V_p = \frac{\displaystyle\sum_{i=1}^{N} \frac{V_i + (X - X_i)\,C_X + (Y - Y_i)\,C_Y + (Z - Z_i)\,C_Z}{d_i^2}}{\displaystyle\sum_{i=1}^{N} \frac{1}{d_i^2}} \qquad (1)$$
where $N$ is the number of neighbouring stations contributing data to the MLR, $X$, $Y$ and $Z$ are longitude, latitude and elevation of the target station and $X_i$, $Y_i$, and $Z_i$ are the corresponding coordinates of the $i$th neighbouring station respectively, while $d_i$ is the great circle distance to, and $V_i$ the value of $V$ observed at, station $i$. For this study, $N$ was arbitrarily set to 40.
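The GIDS estimate in Eq. (1) can be illustrated with a minimal Python sketch. This is not the authors' program: the function names, the `(X, Y, Z, V)` tuple layout, the mean-centring of the predictors, the haversine distance with a 6371 km Earth radius, and the handling of a station that coincides with the target are all assumptions made for illustration.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_gradients(stations):
    """Least-squares fit of V on X, Y, Z; returns the slopes (C_X, C_Y, C_Z).

    stations: list of (X, Y, Z, V) tuples. Predictors are mean-centred
    (an assumption made here for numerical stability; slopes are unaffected).
    """
    n = len(stations)
    means = [sum(s[k] for s in stations) / n for k in range(3)]
    rows = [(1.0, s[0] - means[0], s[1] - means[1], s[2] - means[2])
            for s in stations]
    vals = [s[3] for s in stations]
    # Normal equations (A^T A) c = A^T v for the 4 regression coefficients.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atv = [sum(r[i] * v for r, v in zip(rows, vals)) for i in range(4)]
    _, cx, cy, cz = solve_linear(ata, atv)
    return cx, cy, cz

def great_circle_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Haversine great-circle distance in km (spherical Earth assumed)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))

def gids_estimate(target, stations):
    """Evaluate Eq. (1) at target = (X, Y, Z) from the neighbouring stations."""
    x, y, z = target
    cx, cy, cz = fit_gradients(stations)
    num = den = 0.0
    for xi, yi, zi, vi in stations:
        corrected = vi + (x - xi) * cx + (y - yi) * cy + (z - zi) * cz
        d = great_circle_km(x, y, xi, yi)
        if d == 0.0:
            # Assumption, not from the paper: a co-located station is
            # returned directly to avoid dividing by zero.
            return corrected
        num += corrected / d ** 2
        den += 1.0 / d ** 2
    return num / den
```

In the study the station list passed to such a routine would hold the N = 40 nearest neighbours of each target; selecting those neighbours is left out of the sketch.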
2.2. ANUSPLIN
ANUSPLIN is a suite of FORTRAN programs developed at the Australian National University that calculates and optimizes thin plate smoothing splines fitted to data sets distributed across an unlimited number of climate station locations (Hutchinson, 1991, 1999). It has been applied in numerous regions including Australia, New Zealand, Europe, South America, Africa, China and parts of southeast Asia. A general model for a thin plate spline function $f$ fitted to $n$ data values $z_i$ at positions $x_i$ is given by (Hutchinson, 1995a):
$$z_i = f(x_i) + \varepsilon_i \quad (i = 1, \ldots, n) \qquad (2)$$
where the $x_i$ typically represent longitude, latitude and suitably scaled elevation. The $\varepsilon_i$ are zero mean random errors which account for measurement error as well as deficiencies in the spline model, such as local effects below the resolution of the data network. The $\varepsilon_i$ are assumed to have a covariance matrix $V\sigma^2$ where $V$ is a known positive definite $n \times n$ matrix, usually diagonal, while $\sigma^2$ is usually unknown. The function $f$
is estimated by minimizing:
$$(z - f)^T V^{-1} (z - f) + \rho J_m(f) \qquad (3)$$
where $z = (z_1, \ldots, z_n)^T$, $f = (f_1, \ldots, f_n)^T$, and $T$ signifies the matrix transpose, with
$$f_i = f(x_i) \qquad (4)$$
and $J_m(f)$ is a measure of the roughness of the spline function $f$ defined in terms of $m$th order partial derivatives of $f$. The positive number $\rho$ is called the smoothing parameter. It is determined objectively by minimizing the Generalized Cross Validation (GCV) statistic, a measure of the predictive error of the surface. The GCV is calculated by implicitly removing each data point and summing the square of the difference of each omitted data point from a surface fitted to all the remaining data points (see Hutchinson and Gessler, 1994). The procedure provides an estimate of the value of $\sigma^2$. For the interpolation of mean temperature, $V$ was set to the identity matrix, as local effects are the main contribution to model error. For the interpolation of mean precipitation, $V$ was set to a diagonal matrix with entries $v_{ii}$ given by
$$v_{ii} = \sigma_i^2 / n_i \qquad (5)$$
where $\sigma_i^2$ is the year-to-year variance of the monthly totals at location $x_i$ and $n_i$ is the number of years of record. This is the approximate Method 5.1 described in Hutchinson (1995a). In the case of precipitation, year-to-year variation is a significant contributor to model error when using means for different periods. Further details can be found in the references cited earlier. To properly scale the independent variables, longitude was transformed by the cosine of the central latitude. This had only a marginal effect on the precipitation surface. Elevation units are specified in km to scale this term appropriately (see Hutchinson, 1995a).
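The precipitation weights of Eq. (5) and the coordinate scaling described above can be sketched as follows. This is an illustrative reading rather than ANUSPLIN code: the function names are hypothetical, the sample variance (n - 1 denominator) is assumed because the text does not state which estimator was used, and "transformed by the cosine" is read as multiplication.

```python
import math
from statistics import variance

def precip_error_weight(monthly_totals_by_year):
    """Return v_ii = sigma_i^2 / n_i for one station and one calendar month.

    monthly_totals_by_year: one precipitation total per year of record.
    Assumption: sigma_i^2 is the sample variance (n - 1 denominator).
    """
    n_years = len(monthly_totals_by_year)
    if n_years < 2:
        raise ValueError("need at least two years of record")
    return variance(monthly_totals_by_year) / n_years

def scale_coordinates(lon_deg, lat_deg, elev_m, central_lat_deg):
    """Scale the spline's independent variables.

    Longitude is multiplied by cos(central latitude) and elevation is
    converted from metres to kilometres; latitude is left unchanged.
    """
    lon_scaled = lon_deg * math.cos(math.radians(central_lat_deg))
    return lon_scaled, lat_deg, elev_m / 1000.0
```

A station with a short record receives a larger v_ii, so its mean is treated as less certain when the spline is fitted.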
2.3. Data and comparison
Monthly climate data for two regions of the country were extracted from Environment Canada’s CD-ROM of Canadian 1961–1990 climate normals (Environment Canada, 1994). The variables selected for this study were: monthly mean precipitation, $P$, and monthly mean daily maximum and minimum temperatures, $T_{max}$ and $T_{min}$, respectively. The two study regions (southern British Columbia/Alberta and southern Ontario/southwestern Québec) were chosen to represent a diverse range of topography and available data. The BC/Alberta region extends across the Canadian Pacific coast, coastal and interior mountain ranges including the Rockies, and parts of the western boreal forest and prairies. As is typical in most parts of the world, climate data are particularly sparse in higher elevation areas. The Ontario/Québec region is comparatively flat but includes some higher elevation locations and areas downwind of the Great Lakes. In combination, these study areas represented the climatic conditions found in both forested and agricultural regions across much of southern Canada. The locations of the stations providing data for each variable in both regions are shown in Fig. 1. For each of $T_{max}$, $T_{min}$ and $P$, respectively, data were available from 434, 436 and 406 stations in the BC/Alberta region, and from 407, 405 and 371 stations in Ontario/Québec.
For each study region, approximately 50 stations were selected at random from the data set and withheld from the interpolation calculations. The performance of the two interpolation methods was then compared by using them both to estimate $P$, $T_{max}$ and $T_{min}$ at the coordinates of the withheld stations. Residuals were calculated as the differences between the estimated and observed values for each station for each month. Following Hulme et al. (1995) and Daly (1994), precipitation residuals were calculated as percentage differences from the observed data. This approach provides a better relative measure of the differences as compared to absolute values, particularly when the spatial variation in monthly $P$ is large, as is the case in BC. The residuals and squared residuals were pooled to determine Mean Errors (ME) and Root Mean Square Errors (RMSE), respectively.
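The withholding step and the two error statistics described above can be sketched in a few lines of Python. The function names, the fixed random seed and the list-based data layout are assumptions for illustration; the sketch also assumes no observed precipitation value is zero, since the percentage residual is undefined in that case.

```python
import math
import random

def split_stations(station_ids, n_withheld=50, seed=0):
    """Randomly withhold n_withheld stations; return (used, withheld)."""
    rng = random.Random(seed)  # seed is a hypothetical choice for repeatability
    withheld = set(rng.sample(station_ids, n_withheld))
    used = [s for s in station_ids if s not in withheld]
    return used, sorted(withheld)

def residuals(estimated, observed, as_percent=False):
    """Estimated minus observed; optionally as a percentage of observed."""
    out = []
    for est, obs in zip(estimated, observed):
        diff = est - obs
        out.append(100.0 * diff / obs if as_percent else diff)
    return out

def mean_error(res):
    """ME: mean of the pooled residuals (a measure of bias)."""
    return sum(res) / len(res)

def rmse(res):
    """RMSE: square root of the mean of the pooled squared residuals."""
    return math.sqrt(sum(r * r for r in res) / len(res))
```

Temperature residuals would be pooled with `as_percent=False` and precipitation residuals with `as_percent=True`, following the convention stated in the text.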
Fig. 1. Locations of climate stations used for comparison of the GIDS and ANUSPLIN spatial interpolation methods, applied to 1961–1990 AES normals for study areas in British Columbia/Alberta (left) and Ontario/Québec (right). (a) monthly mean daily maximum temperature; (b) monthly mean daily minimum temperature; (c) monthly mean total precipitation. Symbols distinguish the stations used for interpolation (o) from those withheld as control stations used for error assessment (x).
The ME was used to detect bias in the two methods. Bias can be very important, both statistically and when using the interpolated data for further analysis or modelling. In comparison, RMSE is sensitive to the size of outliers and was used as an indicator of the magnitude of extreme errors (i.e., lower RMSE indicates greater central tendency and generally smaller extreme errors). Box plots were developed to display the residuals for both methods for each variable. As a further assessment, the number of months in which one method gave a lower RMSE than the other was counted and checked for significance using a simple binomial test.
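The month-count comparison can be illustrated with an exact two-sided binomial test under the null hypothesis that each method is equally likely to give the lower RMSE in any month. The text does not say which form of the test was used, so the two-sided exact version below, and the function name, are assumptions.

```python
from math import comb

def binomial_two_sided_p(wins, n, p=0.5):
    """Exact two-sided binomial p-value.

    wins: number of months in which one method had the lower RMSE.
    n:    number of months compared.
    The p-value sums the probabilities of all outcomes that are no more
    likely than the observed one under success probability p.
    """
    probs = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = probs[wins]
    # Small relative tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-12))
    return min(1.0, total)
```

For example, if one method had the lower RMSE in 10 of 12 months, the call `binomial_two_sided_p(10, 12)` gives 158/4096, or about 0.039.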
2.4. Map generation
To assess the behaviour of each interpolation method, maps were generated for a monthly variable giving poor agreement between observed and predicted data (July precipitation in the BC/Alberta study region was selected for this test). Using a 1 km Digital Elevation Model (DEM) for this region (obtained by resampling the GTOPO30 data set (Verdin and Jenson, 1996) to a Lambert Conformal Conic projection), precipitation values were estimated for the centroid of each DEM pixel. The interpolated grids of precipitation data were then imported into ARC/INFO GRID™ and printed as coloured images.
3. Results