152 S. Landau et al. Agricultural and Forest Meteorology 101 2000 151–166
Broadbalk wheat experiment at Rothamsted (Fisher, 1924; Tippett, 1926; Alumnus, 1932; Cochran, 1935;
Buck, 1961; Thorne et al., 1988 and Chmielewski and Potts, 1995, amongst others). With these single-site
models, workers were able to find correlations between observed and predicted yields in the range
0.45–0.6, thus explaining 20–35% of the variability in grain yields. Further wheat–climate investigations in
the UK include the early work of Lawes and Gilbert (1880), Hooker (1907) and Barnard (1936) and the
relatively recent work of Spence (1989).
Regression models have been criticised, since the underlying mechanisms which transform climatic input
into yield are not explicitly described and the hierarchical structure of the underlying physiological
processes is not taken into account (Katz, 1977; Monteith, 1981; France and Thornley, 1984; Touré et al., 1994).
For example, monthly climatic effects predicted by a regression model are not easily interpreted in
physiological terms because the model can only approximate the underlying processes, and
may fail to include some of them. Because of their empirical nature, regression models are also restricted
to the range of climate data from which they were developed.
Advances in scientific understanding of the plant’s growth processes led to the formulation of
deterministic growth models. These attempt to simulate crop growth throughout the year by modelling
relevant plant processes. The Global Climate and Terrestrial Ecosystems (GCTE) group recognises that
there are at least 14 wheat models which attempt to account for the physiological processes that
govern wheat growth and development (GCTE, 1992). Wheat simulation models applicable to UK climatic
conditions include AFRCWHEAT2 (Porter, 1993), CERES-wheat (Ritchie and Otter, 1985) and SIRIUS
(Jamieson et al., 1998). Such models are widely applied in decision support and in studies of the impact of
climate change on wheat production.
Crop simulation models assume that the dynamic mechanistic process formulations can be represented
accurately, and that the model parameters can be correctly determined. However, of necessity, most of
the important processes within simulation models are described by empirical functions, since no models
exist for the enormously complex and poorly understood mechanisms underlying phenology, canopy
development and senescence, partitioning etc. Thus, simulation models also cannot be used with
confidence outside the region for which they were developed. Young et al. (1996) argued that a
“data-based mechanistic modelling” philosophy is the way forward. This suggests a model category, a
mechanistic and statistical model, lying between the extreme modelling approaches outlined earlier,
which assume that either nothing or everything is known about the crop–climate system. The means to
achieve this is the derivation of the most parsimonious model based on knowledge of the system, i.e.
one that uses the minimum number of parameters without losing predictive power.
In the present study, we employ a large yield data set from agricultural experiments on winter wheat
in the UK to develop a new model for predicting well-managed wheat grain yields from climatic
input. The objective of this work was to develop, as an example of the new methodology described
earlier, a parsimonious, empirically-based model which takes into account mechanisms of wheat growth and
development. In order to establish how well such a hybrid-model behaves in the multi-site, multi-year
UK environment, we also perform an extensive independent test of the suggested model.
2. Data sources and methods
2.1. Winter wheat trials database and weather data
An electronic database was established consisting of wheat trials undertaken during the period
September 1975–August 1993. The database contains grain yields at 85% dry matter together with
additional information such as treatments, experimental design, grid reference, altitude, sowing date,
cultivar and type of trial (e.g. variety trial, nitrogen response trial). Trials from most UK agricultural
institutes were included (details in Landau, 1998). Data were restricted to autumn-sown,
fungicide-treated trials of cultivars of bread-making varieties that had been on the UK recommended
list. For each variety the treatment combination that produced the highest average plot yield was taken
to reflect the trial’s best managed yield. This procedure implicitly assumed that a high-yielding
treatment had been applied and that the crops were overall well-managed in terms of factors such as
cultivation, spraying regime and nutrient supply.
Fig. 1. Locations of crop trials included in the well-managed yield data set (840 trials). The plotting symbol indicates the yield sample
to which a trial was allocated: the first (O) or second (1) development sample, or the test sample (×).
The well-managed yield data set consisted of 840 trials (site × year combinations) and 1992 average
yield observations from major wheat-growing areas in the UK (Fig. 1). The trial set was split into three
samples. We refer to these samples as the ‘first development sample’, the ‘second development sample’
and the ‘test sample’. The development samples were used to develop the new yield-climate model, while
the yield data in the test sample were reserved for independent testing of the new model. Two-thirds of the
field trials were assigned at random to a development sample and the remaining third to the test sample.
Due to time constraints on the project, the development sample was further divided, again at random. A
test of the predictive accuracy of existing crop models (Landau et al., 1998) and the early model development
stage were carried out only with the first development sample.
The allocation of the mainland trials to the three samples was performed by stratified random sampling
from the set of trials. Strata were defined by regions within harvest years. Ten homogeneous weather zones
were used to define regions (UK Meteorological Office, 1985). The sizes of the subsamples from each
stratum were chosen to be proportional to the stratum sizes. Due to late availability, all Northern Ireland
trials were allocated to the test sample. Fig. 1 shows that all three samples were representative of
mainland wheat-growing regions and Table 1 shows the size of the sampling variation in annual average
yields.
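The stratified allocation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name `allocate_stratified`, the tuple layout of `trials` and the equal `shares` are assumptions made here, since the text does not state the exact split between the two development samples.

```python
import random
from collections import defaultdict

def allocate_stratified(trials, shares=(1/3, 1/3, 1/3), seed=0):
    """Assign each trial to sample 1, 2 or 3 by stratified random sampling.

    trials: list of (trial_id, region, harvest_year) tuples (hypothetical layout).
    shares: target fraction of every stratum per sample (assumed values).
    Returns a dict mapping trial_id to a sample number.
    """
    rng = random.Random(seed)
    # A stratum is one region within one harvest year.
    strata = defaultdict(list)
    for trial_id, region, year in trials:
        strata[(region, year)].append(trial_id)

    allocation = {}
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        n = len(members)
        # Cut points keep the subsample sizes proportional to the stratum size.
        cut1 = round(n * shares[0])
        cut2 = round(n * (shares[0] + shares[1]))
        for i, trial_id in enumerate(members):
            allocation[trial_id] = 1 if i < cut1 else 2 if i < cut2 else 3
    return allocation
```

Trials that must go to a particular sample regardless of stratum (as with the Northern Ireland trials) would simply be assigned before calling such a function.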
Daily climate data were considered, as in simulation models, as our starting point in the search
for a more empirically-based statistical model. We considered minimum and maximum daily
temperatures and daily radiation and rainfall totals to be the essential climatic variables affecting
wheat yields. Daily weather data for the period covered by the winter wheat trials database were
retrieved for 234 meteorological stations within the UK. Radiation records were sparse (only 19
meteorological stations recorded radiation throughout the period of interest) but sunshine durations
were available. Because meteorological stations do not necessarily coincide with the crop sites,
methods to interpolate yearly series of daily weather data for arbitrary sites in the UK were developed
(Landau and Barnett, 1996) and used to generate daily weather data at the crop sites. The selected
interpolation method took account of the spatial and temporal variation in the weather data and
explained variation in observed weather well for a set of dates at randomly selected sites (94, 97, 84
and 74% of the variation in minimum and maximum temperature, sunshine hours and rainfall were
explained, respectively). Because radiation records were sparse, radiation was estimated from the
interpolated sunshine durations by a standard linear relationship (Rietfield, 1978). The latter method
was able to explain 95% of the variation in daily radiation measurements over 2 years at the available
recording stations.
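A linear sunshine-to-radiation relationship of the kind referred to above can be sketched as below. The sketch assumes the common form R_s = (a + b n/N) R_a, with n the sunshine hours, N the astronomical daylength and R_a the extraterrestrial radiation, computed from widely used astronomical formulas. The function names and the coefficients a = 0.18 and b = 0.55 are placeholders chosen for illustration; the values used in the study are not given in this section.

```python
import math

def daylength_and_extraterrestrial(lat_deg, day_of_year):
    """Return (daylength in h, extraterrestrial radiation in MJ m-2 day-1).

    Valid for non-polar latitudes, where the sun rises and sets every day.
    """
    phi = math.radians(lat_deg)
    # Inverse relative Earth-Sun distance and solar declination (radians).
    dr = 1 + 0.033 * math.cos(2 * math.pi * day_of_year / 365)
    delta = 0.409 * math.sin(2 * math.pi * day_of_year / 365 - 1.39)
    # Sunset hour angle (radians).
    ws = math.acos(-math.tan(phi) * math.tan(delta))
    gsc = 0.0820  # solar constant, MJ m-2 min-1
    ra = (24 * 60 / math.pi) * gsc * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )
    return 24 * ws / math.pi, ra

def radiation_from_sunshine(sun_hours, lat_deg, day_of_year, a=0.18, b=0.55):
    """Estimate daily global radiation (MJ m-2 day-1) from sunshine duration.

    a and b are illustrative placeholder coefficients (an assumption).
    """
    n_max, ra = daylength_and_extraterrestrial(lat_deg, day_of_year)
    # Relative sunshine duration, clamped to the physically possible range.
    frac = min(max(sun_hours / n_max, 0.0), 1.0)
    return (a + b * frac) * ra
```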
Table 1. Descriptive statistics for the three yield samples (1 = first development sample, 2 = second development sample, 3 = test sample) by harvest year

Harvest year      Number of trials        Number of yield values  Mean yield (t ha⁻¹)     Standard deviation (t ha⁻¹)
Yield sample         1       2       3       1       2       3       1       2       3       1       2       3
1976                 3       2       1       4       3       1    7.51    7.17    6.21    0.55    0.71       –
1977                 2       1       1       4       2       2    6.82    7.50    8.30    0.67    0.43    0.78
1978                 2       3       2       3       6       4    7.11    6.47    6.45    0.15    0.94    0.25
1979                 3       3       1       9       9       4    7.67    6.60    6.04    0.86    0.61    0.57
1980                 4       4       1      10       6       1    8.24    8.12    5.13    0.86    1.74       –
1981                35      20      22      45      22      25    8.35    8.09    8.58    1.40    1.54    1.47
1982                41      26      33      90      47      69    8.29    8.03    7.82    1.19    1.37    1.70
1983                39      28      29      81      57      59    8.84    8.04    8.69    1.38    1.43    0.84
1984                28      20      29      63      40      75    9.77    9.52    9.12    1.52    1.19    1.40
1985                26      22      21      79      54      57    7.95    7.85    7.43    1.03    0.98    1.10
1986                35      19      30      95      58      93    8.65    8.27    7.94    1.09    0.90    1.12
1987                30      16      23      53      40      62    7.44    7.21    7.57    1.14    1.01    1.09
1988                26      16      18      50      25      29    8.12    8.21    7.89    0.91    1.09    1.11
1989                12       9       7      29      21      17    8.11    8.36    7.71    0.86    0.90    1.20
1990                11       8      18      35      18      44    8.85    9.01    8.34    1.20    1.11    1.23
1991                13      12      17      29      28      39    8.36    8.94    8.01    1.18    0.90    1.54
1992                14       6      10      56      22      42    8.66    8.62    8.59    1.25    1.97    1.06
1993                17       7      14      80      25      71    9.23    8.87    8.80    1.70    1.42    1.58
1976–1993          341     222     277     815     483     694    8.52    8.23    8.19    1.37    1.36    1.40
2.2. Strategies adopted for developing the new hybrid-model
It was envisaged in the initial stages of this project that an existing crop model could be identified which
offered the closest prediction of grain yield, and that a parsimonious model would then be developed on
the basis of the selected model. However, we found that none of the crop models considered
(AFRCWHEAT2; CERES-wheat, modified version 3.0; and SIRIUS, version 3) was able to explain the variation
in grain yields in the first development sample of well-managed yields (Landau et al., 1998), the
reasons for which are discussed in Jamieson et al. (1999) and Landau et al. (1999). Hence, rather than reducing
the complexity of a simulation model, the development of a parsimonious model was approached from
a minimalist viewpoint, introducing effects one by one and allowing inclusion in the final model only if
they improved prediction.
The first and second development samples of wheat yields represented the empirical basis upon which to
build the model. The hybrid-model was developed by employing elements of both empirical and mechanistic
modelling. Knowledge of wheat physiology was used to suggest simple expressions of climate effects on
yields. Then this large set of observed yields was used to assess the empirical importance of the suggested
climate variables. Only empirically important climate variables were included in the final model.
The climate response sub-model was developed in two stages: a variable selection process and a model
fitting process. Firstly, during the variable selection process, a set of physiologically meaningful climate
variables, which could potentially explain variations in the observed yields, was established. This set of
potential explanatory variables constitutes the maximum model. Secondly, during the model fitting
process, this model was reduced to contain only statistically significant terms to form a parsimonious model.
The variable selection process was based on yields from the first development sample alone, whereas the
final model was fitted to yields from both the first and second development samples.
2.3. The variable selection process
A pool of climatic explanatory variables to be considered during the variable selection process was
created using physiological/agronomical guidance as follows. Firstly, in order to reduce the number of
inputs greatly, summaries of the climatic effects over physiologically-meaningful periods were
investigated rather than the effects of daily climatic inputs. The crop year was split into five phases:
a vegetative phase, an early-reproductive phase, an anthesis phase, a grain-filling phase and the
remaining pre-harvest phase. Details concerning the choice of these phases are given in Section 2.4.
Secondly, knowledge of wheat physiology and agronomy was employed to suggest simple expressions for
expected climate effects within these phases. An expression could contain a single variable, or a set of
variables (principally two variables) whose added effects would reflect the expected climate effect. For
example, one possible expression of an expected disease-related rainfall effect during a phase could be
the additive effects of mean and maximum rainfall during the phase. Knowledge was also used to
determine the type of dependence expected for the variables involved in each expression, i.e. whether
each was expected to be positively or negatively related to grain yield.
A cumulative procedure using regression methods was adopted in the variable selection to establish the
maximum model. During the procedure, expressions of expected climate effects were assessed for their
potential to explain variation in observed yields. Within each step of the procedure, alternative
expressions of a single expected climate effect were compared and the explanatory variables involved in
the best expression, if any, were added to the set of potential explanatory variables. The percentage
of yield variation explained by a regression model was measured by the adjusted coefficient of
determination ($R^2_{ad}$; for definition see Payne et al., 1993). In each step the expression which
increased the $R^2_{ad}$ most, relative to the $R^2_{ad}$ of the best model in the previous step, and
whose coefficient estimates showed the expected signs, was selected as a potential candidate for the
final model.
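The cumulative selection procedure can be illustrated with the sketch below. It is one possible reading of the description, not the authors' implementation: the function names, the data layout (a list of steps, each holding alternative expressions made of `(name, values, expected_sign)` terms) and the use of unweighted least squares are all assumptions made for brevity.

```python
import numpy as np

def fit_adjusted_r2(columns, y):
    """Least-squares fit of y on an intercept plus the given columns.

    Returns (adjusted R^2, coefficients excluding the intercept).
    """
    n = len(y)
    A = np.column_stack([np.ones(n)] + list(columns))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    p = A.shape[1] - 1  # number of explanatory variables
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2_adj = 1.0 - (float(resid @ resid) / (n - p - 1)) / (ss_tot / (n - 1))
    return r2_adj, beta[1:]

def select_expressions(steps, y):
    """Add, step by step, the best alternative expression of each climate effect.

    An expression is accepted only if it raises the adjusted R^2 above the best
    value so far and all of its coefficients carry the expected sign (+1 or -1).
    """
    chosen = []  # (name, values) pairs already in the candidate model
    best_r2, _ = fit_adjusted_r2([], y)
    for alternatives in steps:
        winner = None
        for expr in alternatives:
            cols = [v for _, v in chosen] + [v for _, v, _ in expr]
            r2, coefs = fit_adjusted_r2(cols, y)
            new_coefs = coefs[len(chosen):]
            signs_ok = all(np.sign(c) == s for c, (_, _, s) in zip(new_coefs, expr))
            if signs_ok and r2 > best_r2:
                best_r2, winner = r2, expr
        if winner is not None:
            chosen += [(name, v) for name, v, _ in winner]
    return [name for name, _ in chosen], best_r2
```

The order of `steps` matters in such a scheme, which is why the effects need to be ranked by anticipated importance before the procedure is run.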
We considered climatic effects in the order of their anticipated importance to grain yield. The importance
of each effect was evaluated in terms of our knowledge of wheat physiology and agronomy combined
with results from an explorative analysis (details in Landau, 1998). When selecting variables, yield
observations were used in aggregated form. As in the test of the crop models (Landau et al., 1998), the
observations were averaged for each year within 1 km squares, which represent the precision of the available
grid references. The ‘replicates’ within the squares reflect trials where several valid cultivars were tested
together, or where several trials were situated in the same square. Aggregation of the observed data set
simplified the variable selection process by reducing the variation in grain yield to a component potentially
explainable by climate variables. To take account of the fact that means based on many observations are less
variable than means calculated from few observations, weights reflecting the number of replicates were
employed in all the regressions.
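The aggregation and weighting step can be sketched as follows. The record layout (grid coordinates in metres), the function names and the restriction to a single explanatory variable are assumptions made to keep the example short; the study used multiple regression.

```python
from collections import defaultdict

def aggregate_yields(records):
    """Average yields per 1 km square and harvest year.

    records: (easting_m, northing_m, harvest_year, yield_t_ha) tuples
             (a hypothetical layout).
    Returns {(km_east, km_north, year): (mean_yield, n_replicates)}.
    """
    groups = defaultdict(list)
    for east, north, year, y in records:
        groups[(int(east // 1000), int(north // 1000), year)].append(y)
    return {k: (sum(v) / len(v), len(v)) for k, v in groups.items()}

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b).

    With w equal to the number of replicates behind each mean, every original
    observation contributes equally to the fit.
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b
```

Weighting a mean by its replicate count gives the same fitted line as repeating that mean once per replicate in an unweighted fit.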
2.4. Initial choice of phenological stages
We aimed to define periods within which to aggregate the climate data that reflect key periods of plant
development. Although the general dependence of development stages on thermal time is well established,
differences exist in the approaches taken to model the relationship. Date of anthesis has been shown to
be the most crucial developmental date for predicting grain yields from climate. Grain yields have been
found to be most sensitive to changes in radiation levels around anthesis (Fischer, 1975, 1985; Willington
and Biscoe, 1985; Savin and Slafer, 1991; Triboï and Ntonga, 1993; Mitchell et al., 1996) and the date of
anthesis determines the onset of the grain-filling phase. The latter is important in that climatic
conditions during this phase, especially those determining the duration of grain-filling itself, have been
shown to affect grain yields (Monteith and Scott, 1982; Fischer, 1983; Spiertz and Vos, 1985; Moot et al., 1996).
We therefore identified the best existing predictor of anthesis dates by comparing anthesis date predictions
of crop models applicable to the UK (AFRCWHEAT2, CERES-wheat and SIRIUS; run for cultivar Avalon)
with a set of observed anthesis dates from 57 UK trials (described in Landau, 1998). This showed that
CERES-wheat best predicted the date of anthesis (Table 2). CERES-wheat and AFRCWHEAT2 showed
almost identical correlations with the observed dates but CERES-wheat predictions were less biased.
SIRIUS widely overpredicted the anthesis dates due to a fault which has been corrected in later versions of
the model (M.A. Semenov, personal communication).
Table 2. Correlation between observed and predicted anthesis dates (r), bias of predictions (bias) and root mean square error (RMSE) of predictions from crop models for the set of 57 observed anthesis dates

Crop model      CERES-wheat   AFRCWHEAT2   SIRIUS
r                      0.61         0.62     0.18
Bias (days)            3.11        −5.32     9.02
RMSE (days)            7.41         8.57    14.51
Fig. 2. Alternative phase definitions used during the variable selection process. Phase I starts with sowing (S) and Phase V ends with harvest (H). Phase III is the 21-day window around the anthesis date predicted by CERES-wheat ($A_{CE}$). There are three options for the start of Phase II: $S_{IIa}$ (18th April) and $S_{IIb}$ (10th May) are fixed dates, while the date of $S_{IIc}$ depends on thermal time (500 degree days before the start of Phase III). There are two options for the end of Phase IV: $E_{IVa}$ is 600 degree days above a base temperature of 0 °C and $E_{IVb}$ is 470 degree days above a base temperature of 1 °C and below an upper limit of 26 °C after the end of Phase III.
CERES-wheat was therefore used to define a 21-day anthesis phase by allowing a 10-day window on either
side of the predicted anthesis date.
The crop year was then split into five phases (Phases I–V) by estimating an early-reproductive phase to
occur before and a grain-filling phase to occur after the predicted anthesis phase. The remaining days at
the beginning and end of the crop year were assigned to a vegetative and a pre-harvest phase,
respectively. Fig. 2 illustrates alternative definitions for the phases before (Phase II) and after
(Phase IV) the anthesis phase (Phase III). $S_{IIa}$ and $S_{IIb}$ represent an early and a late estimate
of terminal spikelet. The fixed-day predictors were derived from the idea that sensitivity to radiation
peaks at anthesis and diminishes towards terminal spikelet, so that variations in terminal spikelet dates
would be unimportant as long as the area of low sensitivity was identified. $S_{IIc}$ measures the distance
between anthesis and terminal spikelet in thermal time using an approximation of the ARCWHEAT1 thermal
time requirement (Weir et al., 1984). All formulations used to predict the end of grain-filling were based
on thermal time. $E_{IVa}$ reflects an approach to estimating the grain-filling phase which is intermediate
between AFRCWHEAT2 and SIRIUS, whereas $E_{IVb}$ is defined according to the CERES-wheat formula (all
for cultivar Avalon). The latter generally results in a shorter grain-filling duration than the other two
models. During the variable selection process, expressions of climate effects as well as these alternative
phase definitions were tested.
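The thermal-time definitions of the end of grain-filling can be sketched as below. The daily thermal time is approximated here from the mean of minimum and maximum temperature, which is an assumption; the crop models cited use their own, more detailed rules. Function names and the day-index convention (positions in a daily weather series) are hypothetical.

```python
def degree_days(tmin, tmax, base=0.0, upper=None):
    """Daily thermal time from mean temperature, capped at upper, floored at base."""
    tmean = (tmin + tmax) / 2.0
    if upper is not None:
        tmean = min(tmean, upper)
    return max(tmean - base, 0.0)

def anthesis_phase(anthesis_day, half_width=10):
    """First and last day index of the 21-day Phase III window."""
    return anthesis_day - half_width, anthesis_day + half_width

def end_of_grain_filling(tmin, tmax, phase3_end, target, base=0.0, upper=None):
    """First day after phase3_end on which accumulated thermal time reaches target.

    tmin, tmax: daily temperature series indexed by day.
    Returns None if the series ends before the target is reached.
    """
    total = 0.0
    for day in range(phase3_end + 1, len(tmin)):
        total += degree_days(tmin[day], tmax[day], base, upper)
        if total >= target:
            return day
    return None
```

With these helpers, the two options in Fig. 2 correspond to `target=600, base=0.0` and `target=470, base=1.0, upper=26.0`.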
2.5. The model fitting process
At the end of the variable selection process the best definitions of the phenological phases were
identified. The next step was the translation of these definitions into a phenology sub-model and the
generation of the climate input required in the maximum model using this sub-routine. Then, during the
model fitting process, formal inference was used to assess the significance of terms included in the
maximum model on the basis of both development samples. A parsimonious yield response sub-model was
determined by step-wise dropping of variables from the maximum model. At each step the explanatory
variable which tested as insignificant at the 5% level using the F-test and gave the smallest variance
ratio, or a variable which showed an unexpected sign for its coefficient estimate, was dropped. All dummy
variables were kept in the model as adjustment factors. Variables reflecting main effects were only
dropped once any interaction term involving them had disappeared. Also, non-linear threshold parameters
were retained as they were believed to be needed to ensure physiological meaningfulness. The significance
testing assumed independently distributed normal errors with expectation zero and unknown but constant
variance. Residual diagnostics were employed throughout to check the distributional assumptions. In
contrast to the variable selection process, the original non-aggregated yields were used during the
fitting process.
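The step-wise dropping of variables can be sketched as a simple backward elimination. This is a simplified illustration under stated assumptions: each term has one degree of freedom, the 5% critical value of the variance ratio is passed in as `f_crit` (about 3.84 for a large number of residual degrees of freedom), and the additional rules in the text (sign checks, retained dummy variables, main effects protected by interactions) are omitted. The function names are hypothetical.

```python
import numpy as np

def rss(A, y):
    """Residual sum of squares of the least-squares fit of y on the columns of A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def backward_eliminate(X, y, names, f_crit=3.84):
    """Repeatedly drop the term with the smallest variance ratio below f_crit.

    X holds the explanatory variables without the intercept, which is always kept.
    Returns the names of the terms that remain.
    """
    n = len(y)
    keep = list(range(X.shape[1]))
    while keep:
        full = np.column_stack([np.ones(n)] + [X[:, j] for j in keep])
        rss_full = rss(full, y)
        df_resid = n - full.shape[1]
        ratios = {}
        for j in keep:
            reduced = np.column_stack([np.ones(n)] + [X[:, k] for k in keep if k != j])
            # Variance ratio for dropping term j (one degree of freedom).
            ratios[j] = (rss(reduced, y) - rss_full) / (rss_full / df_resid)
        worst = min(ratios, key=ratios.get)
        if ratios[worst] >= f_crit:
            break  # every remaining term is significant
        keep.remove(worst)
    return [names[j] for j in keep]
```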
The relative importance of each term in the final climate response sub-model was assessed by
decomposing the model’s regression sum of squares into components due to each term. Because the
explanatory variables were not orthogonal to each other, the order in which the terms were added
into/dropped out of the model affected the part of the sum of squares that was attributed to them.
Therefore, for comparison purposes, a forward and a backward selection procedure were employed to
achieve a decomposition.
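The order dependence of the decomposition can be demonstrated with a short sketch. It attributes to each term the reduction in residual sum of squares obtained when that term enters the model; the function name and the use of a fixed entry order (rather than a full forward or backward selection procedure) are assumptions made for illustration.

```python
import numpy as np

def sequential_ss(X, y, order):
    """Sum of squares attributed to each column of X when added in the given order.

    The intercept is fitted first; returns {column index: sum of squares}.
    """
    n = len(y)
    cols = [np.ones(n)]
    prev_rss = float(((y - y.mean()) ** 2).sum())
    contrib = {}
    for j in order:
        cols.append(X[:, j])
        A = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        cur_rss = float(((y - A @ beta) ** 2).sum())
        contrib[j] = prev_rss - cur_rss
        prev_rss = cur_rss
    return contrib
```

For correlated variables the components differ between orders, while their total (the regression sum of squares of the full model) does not, so comparing two orders brackets each term's contribution.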
2.6. Testing the new hybrid-model
The new parsimonious hybrid-model was tested with independent observed yields in the test sample
to provide an assessment of the predictive accuracy of the new hybrid-model in practice. This also allowed
comparison of the predictive accuracy of the new hybrid-model with that of the mechanistic crop models,
which had already undergone independent testing for UK well-managed yields (Landau et al., 1998).
As in Landau et al. (1998), observed yields were averaged within 1 km squares within each year to match
the precision of the interpolated weather variables. The root mean square error (RMSE) of differences
between observed and predicted yields and correlations were employed to measure the accuracy of the new
hybrid-model for predicting temporally and spatially distributed UK yields. The new model’s accuracy for
predicting annual average yields was also measured in order to assess the model’s ability to predict purely
temporal variation in UK well-managed yields. To take account of the fact that the variance of average yields
is inversely proportional to the number of originally available yields, all accuracy measures were weighted
accordingly.
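Weighted versions of the two accuracy measures can be sketched as follows, with the weight of each averaged yield taken to be the number of original observations behind it. The exact weighting formulas used in the study are not given here, so these definitions and function names are assumptions.

```python
import math

def weighted_rmse(obs, pred, w):
    """Root mean square error with each squared difference weighted by w."""
    sw = sum(w)
    return math.sqrt(sum(wi * (o - p) ** 2 for wi, o, p in zip(w, obs, pred)) / sw)

def weighted_corr(obs, pred, w):
    """Correlation between observed and predicted values using weights w."""
    sw = sum(w)
    mo = sum(wi * o for wi, o in zip(w, obs)) / sw
    mp = sum(wi * p for wi, p in zip(w, pred)) / sw
    cov = sum(wi * (o - mo) * (p - mp) for wi, o, p in zip(w, obs, pred))
    vo = sum(wi * (o - mo) ** 2 for wi, o in zip(w, obs))
    vp = sum(wi * (p - mp) ** 2 for wi, p in zip(w, pred))
    return cov / math.sqrt(vo * vp)
```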
3. Results