C.W. Rougoor et al. / Livestock Production Science 66 (2000) 71–83
algorithm uses the estimates of the synthetic factors in the first stage to estimate the inner and outer relations, without location parameters. The third step of the algorithm estimates the location parameters of the synthetic factors and the structural relations estimated in the first two stages (Wold, 1982). A detailed overview of these three steps is given by Wold (1985).

No distributional assumptions are made in PLS (Fornell and Cha, 1994). Therefore, the traditional statistical testing methods are not well suited. The variance extracted measures the amount of variance of the X- or Y-variable that is captured by the synthetic factor. This variable can vary from 0 to 1. The average variance extracted (AVE) is the average of the variances extracted of all X- or Y-variables of one specific synthetic factor. A high AVE indicates that the amount of variance captured by the synthetic factor is big compared with the amount of unexplained variance of the X- or Y-variables. It is a measure to evaluate the relationship between the synthetic factor and its X-variables: the outer model. This can be used to evaluate the goodness of the measurement model, that is, the reliability of the synthetic factors (Fornell and Cha, 1994).

The R² measures the explanatory power of the relations between the different synthetic factors. It shows how well a synthetic factor is predicted by other synthetic factors. This value is dependent upon the set-up of the path-model.

The predictive value of the model can be shown by the Stone–Geisser test or by jack knifing. The Stone–Geisser test calculates a criterion Q² that indicates how well the observed values can be reconstructed by the model. It is evaluated as an R² in Ordinary Least Squares (OLS) without loss of degrees of freedom. The general form of the Q² is Q² = 1 − E/O, where E is the sum of squares of the prediction errors and O is the sum of squares of the errors from the prediction given by the mean of the remaining data points. When Q² > 0 it indicates that there is predictive relevance of the model, whereas Q² < 0 suggests lack of relevance.

Jack knifing can be used to obtain standard deviations of the parameter estimates (Miller, 1974). This is done by estimating the parameters N times in a data set with N observations, each time cutting off just one observation. The different estimates for the same parameter, then, are used to compute the mean and S.D. of that parameter. Jack knifing provides information about the precision of the parameter estimates. The PLS-model was estimated with the LVPLS 1.8 program (Lohmöller, 1987).

3. Results

3.1. Correlation between variables

The correlations between variables within the synthetic variable ‘Breeding Value Conformation’ varied between 0.65 and 0.94. The correlations between variables within the synthetic variable ‘Farm Size’ varied between 0.73 and 0.93. These examples show that the correlations between variables within a synthetic variable can be high. So, multicollinearity is likely to exist. Afifi and Clark (1984) stated that when two variables are highly correlated (greater than 0.95), it may be simplest to use only one of them, since one variable conveys essentially all of the information contained in the other. However, all correlations were smaller than 0.95 in this case. Besides that, the presence of these big correlations might emphasise differences between PLS and PCR, so all variables were retained in the analysis.

3.2. Principal Components Regression (PCR)

The percentage of variance explained by the 19 PCs and the eigen values of these PCs are shown in Table 2. These results also showed that multicollinearity is present in the dataset, because component nineteen had an eigen value close to zero (0.01). When the rule of thumb was used that a PC has to explain at least 100/P% of the variance to be included in the regression, the percentage of variance explained by one PC has to be at least 100/19 = 5.26%. Only the first five of the original 19 PCs could satisfy this criterion (see Table 2). These five PCs together explained 74.14% of the variance in the data set. These five PCs were used in a linear regression. The coefficients were then transformed back to the original variables on a standardized and on their original scale. These regression coefficients are shown in Table 3. Because the regression coefficients were reconstituted, no significance values were available for these variables. The standardized regression coefficients were used to compare the outcome with the outcome of the PLS-modelling. The regression coefficients based on the original scale could be used to interpret the results. For instance, the regression coefficient on the percentage of use of natural services indicates that at farms with a 1% higher use of natural services the 305-day milk production is expected to be 436 kg lower.

Table 2
Percentage of variance explained and the eigen values of the 19 principal components

Principal component   % variation   Eigen value
PC1                   37.17         7.06
PC2                   15.77         3.00
PC3                    9.23         1.75
PC4                    6.23         1.18
PC5                    5.74         1.09
PC6                    5.18         0.98
PC7                    4.60         0.88
PC8                    3.96         0.75
PC9                    3.26         0.62
PC10                   2.38         0.45
PC11                   2.06         0.39
PC12                   1.62         0.31
PC13                   0.82         0.16
PC14                   0.74         0.14
PC15                   0.48         0.09
PC16                   0.31         0.06
PC17                   0.24         0.05
PC18                   0.19         0.04
PC19                   0.03         0.01

The regression coefficients on the original scale were used to calculate the synthetic variables, which were used in a multivariate path-analysis. Fig. 2 shows the outcome of this path-analysis. The R² showed that the model could explain 36% of the differences in milk production.
Table 3
Results of principal components regression on 305-day milk production with five PCs included

                               Regression coefficients on
Variable                       standardized scale   original scale   Note
CSF-Production                        0.155              64.75        a
CSF-Culling                           0.064              35.30
CSF-Winter milk                      −0.126             −77.73
BG-Kg milk                           −0.046              −3.05        b
BG-Udder                              0.025               2.52
Farm Size-No inseminated             −0.155             −16.69        c
Farm Size-Total no of cows           −0.093              −3.74
Farm Size-Avg. no of mc              −0.090              −4.21
Use natural service – Cow            −0.056            −435.51        d
BV Milk                               0.059               0.29        e
BV Fat                                0.061               8.05
BV INET                               0.107               1.63        f
BV-Development                        0.001               0.68        g
BV-Type                               0.039              15.87
BV-Udder                              0.014               5.35
BV-Legs                               0.030              18.14
BV-Total                              0.019               6.60
Age heifers                          −0.185              −4.25        h
Calving Age                          −0.091              −0.48

a Change in farm average 305-day milk production per point change in CSF.
b Ditto per percent change in breeding goal.
c Ditto per extra cow.
d Ditto per percent change in use of natural service sires.
e Ditto per kg change in breeding value.
f Ditto per point change in INET.
g Ditto per point change in breeding value.
h Ditto per day change in age.
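
The retention rule described in Section 3.2 (keep a PC only if it explains at least 100/P% of the variance) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `retained_components` is hypothetical, and the percentages are simply copied from the "% variation" column of Table 2.

```python
# Percentage of variance explained by each of the 19 PCs (Table 2).
PCT_VARIANCE = [37.17, 15.77, 9.23, 6.23, 5.74, 5.18, 4.60, 3.96, 3.26, 2.38,
                2.06, 1.62, 0.82, 0.74, 0.48, 0.31, 0.24, 0.19, 0.03]


def retained_components(pct_variance):
    """Apply the 100/P rule of thumb (hypothetical helper).

    Returns the threshold in percent and the 1-based numbers of the PCs
    whose explained variance reaches that threshold.
    """
    threshold = 100.0 / len(pct_variance)
    retained = [number
                for number, pct in enumerate(pct_variance, start=1)
                if pct >= threshold]
    return threshold, retained


threshold, retained = retained_components(PCT_VARIANCE)
# Share of total variance carried by the retained PCs.
cumulative = sum(PCT_VARIANCE[number - 1] for number in retained)

print(f"threshold: {threshold:.2f}%")
print(f"retained PCs: {retained}")
print(f"cumulative variance: {cumulative:.2f}%")
```

With the Table 2 values, the threshold is 100/19 and only the first five PCs pass it, which agrees with the five PCs and the 74.14% reported in the text. Because each percentage equals the eigen value divided by P, the same rule amounts to keeping PCs with an eigen value of at least 1 when the eigen values sum to P.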
Fig. 2. Path coefficients for PCR-modelling. NS = not significant; * = P < .05; ** = P < .01.
Table 4
Measurement part of the PLS-model

• Synthetic factor                 Factor     Mult. R²   Average variance
    variable                       loading               extracted
• Critical success factor                     NA(a)      0.45
    Milk production                 0.39      0.15
    Culling                         0.60      0.35
    Winter milk                    −0.91      0.83
• Breeding goal producer                      0.06       0.59
    Kg milk                        −0.81      0.65
    Udder                           0.72      0.52
• Use natural service sires                   0.16       NA(b)
    Cow                            −1.00
• Farm size                                   NA         0.87
    No. inseminated                −0.90      0.80
    Total no. of cows              −0.94      0.89
    Avg. no. milking cows          −0.95      0.90
• Breeding value production                   0.36       0.91
    Milk                            0.94      0.88
    Fat                             0.96      0.92
    INET                            0.97      0.95
• Breeding value conformation                 NA         1.00
    Development                     1.00      1.00
    Type                            1.00      1.00
    Udder                           1.00      1.00
    Legs                            1.00      1.00
    Total                           1.00      1.00
• Milk production                             0.47       NA
    305-day milk production         1.00

a NA = not available; this synthetic factor was not predicted by any other synthetic factor.
b NA = not available; only single indicator.
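
The link between the factor loadings and the average variance extracted (AVE) in Table 4 can be checked with a short sketch. It assumes standardized indicators, so that the variance extracted of a variable equals its squared loading; that assumption is ours and is not stated in the text, although the tabulated values agree with it up to rounding. The helper names are hypothetical.

```python
# Factor loadings copied from Table 4 for three synthetic factors.
LOADINGS = {
    "Critical success factor": [0.39, 0.60, -0.91],
    "Breeding goal producer": [-0.81, 0.72],
    "Farm size": [-0.90, -0.94, -0.95],
}


def variance_extracted(loading):
    """Variance of one indicator captured by its synthetic factor.

    Assumption: indicators are standardized, so this is the squared loading.
    """
    return loading ** 2


def average_variance_extracted(loadings):
    """Average of the variances extracted over all indicators of a factor."""
    return sum(variance_extracted(value) for value in loadings) / len(loadings)


AVE = {name: average_variance_extracted(values)
       for name, values in LOADINGS.items()}

for name, value in AVE.items():
    print(f"{name}: AVE = {value:.2f}")
```

Under this assumption the computed AVE values round to those listed in Table 4 for the three factors shown.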
The synthetic factor ‘Age at Calving’ was not used by the model because all path coefficients to and from this factor were smaller than 0.20. Table 3 and Fig. 2 show that milk production was higher on farms with managers who thought that ‘milk production per cow’ was a CSF for their farm. At these farms the breeding value for conformation was higher. The breeding goal of the producer indicated, however, that these producers put relatively much emphasis on the quality of the udder and less on the kg of milk.

3.3. Partial Least Squares (PLS)

Table 4 provides the factor loadings for each of the measures. The R² of each synthetic factor, the variance extracted for each variable, and the average variance extracted for each synthetic factor are given. The factor loadings show that the variable ‘Winter milk’ is the most important variable of the synthetic variable ‘Critical Success Factors’. The positive and negative signs of the two variables in the synthetic variable ‘Breeding Goal Producer’ show that a farmer who has a high score on this synthetic factor has said that the udder is an im- [...]

[...] to the PCR-model, no significance values are given here, because traditional statistical testing methods are not well suited. The Stone–Geisser test criterion Q² was used as an alternative method to evaluate the model. It had a value of 0.31, indicating that the model had predictive relevance, because it was bigger than zero. The same main results as with PCR were found with PLS. Small differences were found in the relation between the synthetic factors ‘Natural Service Sires’ and ‘Breeding Value Conformation’. PCR found a path coefficient of 0.25, whereas in the PLS-model it was smaller than 0.20 and therefore deleted. This indicates that a high percentage of natural services at the farm has a relatively strong negative effect on the breeding value for production and a smaller negative effect on breeding value conformation. Besides that, in the PLS-model, direct effects of the synthetic factor ‘Critical Success Factors’ on ‘Breeding Value Production’ and ‘Natural Service Sires’ were found, whereas in the PCR-model these path coefficients were too small.
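
The two leave-one-out procedures used to judge the model, the Stone–Geisser criterion Q² = 1 − E/O and jack knifing, can be illustrated on a toy problem. The sketch below is not the LVPLS procedure: it swaps in a simple one-predictor least squares line for the PLS path model, omits one observation at a time, and uses made-up data. All function names are hypothetical.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Least squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def leave_one_out(xs, ys):
    """Yield (omitted x, omitted y, remaining xs, remaining ys)."""
    for i in range(len(xs)):
        yield xs[i], ys[i], xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]


def q_squared(xs, ys):
    """Q^2 = 1 - E/O.

    E: sum of squared errors when each omitted point is predicted by the
       model fitted on the remaining points.
    O: sum of squared errors when it is predicted by the mean of the
       remaining points.
    """
    e = o = 0.0
    for x, y, rest_x, rest_y in leave_one_out(xs, ys):
        intercept, slope = fit_line(rest_x, rest_y)
        e += (y - (intercept + slope * x)) ** 2
        o += (y - mean(rest_y)) ** 2
    return 1.0 - e / o


def jackknife_slope(xs, ys):
    """Mean and S.D. of the N slope estimates, each with one point omitted."""
    slopes = [fit_line(rest_x, rest_y)[1]
              for _, _, rest_x, rest_y in leave_one_out(xs, ys)]
    return mean(slopes), stdev(slopes)


# Made-up illustration data (not from the paper).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

q2 = q_squared(xs, ys)
slope_mean, slope_sd = jackknife_slope(xs, ys)
print(f"Q2 = {q2:.3f}")
print(f"jackknife slope: mean = {slope_mean:.3f}, S.D. = {slope_sd:.3f}")
```

A Q² above zero means the fitted model predicts the omitted observations better than the mean of the remaining data, which is how the reported value of 0.31 is read in the text. The jack knife part follows the description in the paper literally (mean and S.D. of the N estimates); any rescaling of that spread into a standard error is left out of the sketch.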
4. Discussion