and, necessarily, OE. Hence, these forces will tend to form part of the efficient reference set of police forces for a large number of inefficient forces, and a detailed comparison of
these “best-practice” forces with the less efficient units could provide very useful information in any reorganization/restructuring process.
So far we have not addressed the issue of the statistical significance of the differences in our efficiency scores across staff size groups. We rectify this in the next section using
analysis of variance (ANOVA) and discriminant analysis techniques.
4. MDA results
To assess the DEA results further, we adopt a dual post-hypothesis testing strategy that utilizes ANOVA and MDA. Both of these statistical techniques allow us to determine
whether there are any significant differences between grouped police forces (see Hair et al., 1995, for an introduction). In this analysis, the categorical variable partitions the police
forces into four groups that are determined by the number of total police and civilian staff operating in each force. This allows us to determine, for example, whether large police forces (by
total staff employed) display SE, PTE, or OE scores that are significantly better or worse than those of their smaller counterparts. If a police force has 0 to 1,500 total staff, it is a member of
staff group 1; between 1,501 and 3,000, it is a member of staff group 2; between 3,001 and 4,500, it is a member of staff group 3; and above 4,501, it is a member of staff group
4. To ensure that we follow at least the minimum requirements necessary for MDA, we have stacked the 5 years of data before estimation. Table 7 gives the total 1992–1993 to 1996–1997
stacked grouped summary statistics for the three independent variables, OE, PTE, and SE.
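The grouping rule described above can be sketched as a small lookup. This is only an illustration: the function name `staff_group` is hypothetical, and it assumes that a force with exactly 4,501 staff falls in staff group 4, which the stated ranges imply but do not spell out.

```python
from bisect import bisect_left

# Inclusive upper bounds of staff groups 1-3 as stated in the text;
# any larger total falls into staff group 4 (assumption for exactly 4,501).
UPPER_BOUNDS = [1500, 3000, 4500]


def staff_group(total_staff: int) -> int:
    """Return the staff group (1-4) for a force's total police and civilian staff."""
    if total_staff < 0:
        raise ValueError("total_staff must be non-negative")
    # bisect_left counts how many upper bounds are strictly below total_staff,
    # so a value equal to a bound stays in the lower group.
    return bisect_left(UPPER_BOUNDS, total_staff) + 1
```

For example, a force with 3,000 staff is placed in staff group 2 and one with 3,001 staff in staff group 3.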
As outlined previously, in terms of SE, staff group 3 has the highest mean value and the lowest standard deviation, while for PTE, staff group 1 has the highest mean value with the
smallest standard deviation. The results for OE reveal that staff groups 1, 2, and 3 are very close in terms of their overall mean values and standard deviations. However, staff group 4 has
the lowest overall mean value with the largest standard deviation. This can be attributed to the wide variations in OE for the West Midlands and the Metropolitan police forces that are
evident in Table 4.
The estimation analysis that is followed in this study involves further testing procedures after the DEA estimation. The first stage is the estimation of an ANOVA, a univariate test,
where the dependent variables are PTE and SE, and the independent variable is the categorical staff group.[4]
The null hypothesis of interest is that the means associated with the staff groups are equal. As can be seen from Table 8, the F-statistic is greater than the critical
value, and we can conclude that there is a statistically significant size difference associated with our two measures of efficiency. However, we do not know whether the differences are
between staff groups 1 and 2, 1 and 3, 1 and 4, 2 and 3, 2 and 4, or, finally, 3 and 4.
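As a rough illustration of the univariate test described here, the sketch below computes a one-way ANOVA F-statistic from grouped scores and the corresponding univariate Wilks' lambda (within-group sum of squares divided by total sum of squares). The function names are hypothetical and the sketch is not the authors' SPSS procedure; it only shows the standard textbook calculation.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return (F, df_between, df_within, wilks_lambda) for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # For a single variable, Wilks' lambda is SS_within / SS_total.
    wilks = ss_within / (ss_between + ss_within)
    return f_stat, df_between, df_within, wilks


def wilks_from_f(f_stat, df_between, df_within):
    """Univariate Wilks' lambda implied by an F-statistic and its degrees of freedom."""
    return 1.0 / (1.0 + f_stat * df_between / df_within)
```

Applied to the figures reported in Table 8, `wilks_from_f(12.880, 3, 208)` rounds to 0.843, which agrees with the Wilks' lambda listed there for SE.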
[4] As OE is a product of the multiplication of PTE and SE, it is excluded in the second-stage estimation due to problems of multicollinearity.
L. Drake, R. Simper / International Review of Law and Economics 20 (2000) 53–73
MDA is much like ANOVA, but in this case the dependent variable is the categorical staff
group, and the independent variables are PTE and SE. The reason for estimating the MDA is that it offers an alternative insight into the ANOVA results found in Table 8. For example,
if we are trying to predict to which staff group a police force should belong, given a value for PTE or SE, then MDA will derive the linear combination of the two independent variables
that would discriminate best between the staff groups. MDA distinguishes between the groups by multiplying PTE and SE by their corresponding weights and then adding these
products together, giving a single discriminant score for each police force. After averaging the discriminant scores within each staff group, we obtain the group centroid, which we can use to
compare how “far apart” the staff groups are. In this case, our hypothesis of equal means for the staff groups is based on comparing the distributions of the discriminant scores. The test
analysis is such that “if the overlap in the distribution is small, the discriminant function separates the groups well. If the overlap is large, the function is a poor discriminator between
the groups” (Hair et al., 1992).[5]
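The scoring and averaging steps described above can be sketched as follows. The weights, the optional constant, and the sample records are hypothetical placeholders rather than the estimated coefficients; the sketch only shows how a weighted sum of PTE and SE yields a discriminant score and how group centroids are the means of those scores.

```python
from collections import defaultdict
from statistics import fmean


def discriminant_score(pte, se, w_pte, w_se, constant=0.0):
    """Linear discriminant score: weighted sum of PTE and SE plus a constant."""
    return w_pte * pte + w_se * se + constant


def group_centroids(records, w_pte, w_se, constant=0.0):
    """Average the discriminant scores within each staff group.

    records is an iterable of (staff_group, pte, se) tuples.
    """
    scores = defaultdict(list)
    for group, pte, se in records:
        scores[group].append(discriminant_score(pte, se, w_pte, w_se, constant))
    return {group: fmean(values) for group, values in sorted(scores.items())}
```

Comparing the resulting centroids gives a first impression of how far apart the groups lie on a given function.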
Before discussing the MDA results, we need to compare the hit ratio with the maximum-chance and proportional-chance criteria to assess the predictive accuracy of the function. The
maximum-chance criterion is calculated as the probability of correctly classifying all scores by placing them in the staff group with the greatest probability of occurrence, which in this
model is 48%. However, with unequal groups, we can calculate a proportional-chance criterion, which in this model equals 33.34%. Hence, as our hit ratio (49.1%) exceeds the
maximum-chance and proportional-chance criteria, we can conclude that the MDA model is valid based on these measures. We also checked our model using Press’s Q-statistic, which
tests whether the number of correct staff group classifications by MDA exceeds the number that would be expected by chance. With a total of 104 predicted group memberships correctly
classified, the estimate of Press’s Q value equals 62.89, which is significant at the 5% critical level. Therefore, utilizing the results obtained from the maximum-chance criterion, the
proportional-chance criterion, and the hit ratio, we can conclude that the MDA model is better at predicting staff group membership than if the prediction is carried out by chance.
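The accuracy checks named above follow standard textbook definitions, sketched here under that assumption: the maximum-chance criterion is the share of the largest group, the proportional-chance criterion is the sum of squared group shares, and Press's Q is (N - nK)^2 / (N(K - 1)), where N is the sample size, n the number of correct classifications, and K the number of groups. The function names and any inputs are illustrative; the paper's group sizes are not reproduced here.

```python
def maximum_chance(group_sizes):
    """Share of observations in the largest group."""
    return max(group_sizes) / sum(group_sizes)


def proportional_chance(group_sizes):
    """Sum of squared group proportions."""
    total = sum(group_sizes)
    return sum((size / total) ** 2 for size in group_sizes)


def hit_ratio(n_correct, n_total):
    """Proportion of observations classified into their actual group."""
    return n_correct / n_total


def press_q(n_total, n_correct, n_groups):
    """Press's Q statistic, compared against a chi-square with 1 degree of freedom."""
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))
```

With hypothetical group sizes of 50, 30, 15, and 5, the maximum-chance criterion is 0.5 and the proportional-chance criterion is 0.365; a Press's Q above roughly 3.84 is significant at the 5% level.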
[5] MDA was estimated using SPSS version 8 (SPSS Inc., Chicago, IL), with the stepwise Mahalanobis distance, Fisher’s function coefficients method, a decision rule of F being between 0.05 and 0.15, prior probabilities
computed from group size, and the use of the within-groups covariance matrix.

Table 8
Test for equality of staff group means

Independent variables   Wilks' λ (a)   Univariate F ratio (b)   Significance
OE                      0.905          7.246                    0.000
PTE                     0.936          4.708                    0.003
SE                      0.843          12.880                   0.000

(a) Degrees of freedom, 3.
(b) Degrees of freedom, 208.

Table 9 provides the overall MDA results and indicates that the discriminant functions are highly significant, as measured by the Wilks' λ and χ² statistics. Overall, the first function
accounts for 74.4% of the variance, and the second function accounts for 25.6%. However, the functions display low canonical correlations of 0.39 and 0.25, respectively; that is,
15.21% and 6.25% of the variance in the dependent variable can be explained by this model (in regression models this is the R² statistic).
In this context, although the latter figures are relatively low, we must remember that MDA involves the process of determining which staff
group a police force should be included in. However, it cannot take into account sociological and political factors that cannot be included in the calculation but that will have an effect on
how large a police force will be and, therefore, on its staff group classification.
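The summary measures discussed for Table 9 are linked to the eigenvalues by standard identities, which the following sketch illustrates: the share of variance is each eigenvalue over their sum, the canonical correlation is sqrt(λ / (1 + λ)), and Wilks' lambda for a function and all later functions is the product of 1 / (1 + λ) over those functions. The function name `canonical_summary` is hypothetical, and because the published eigenvalues are rounded to three decimals, the recomputed figures match Table 9 only approximately.

```python
from math import prod, sqrt


def canonical_summary(eigenvalues):
    """Derive per-function summary measures from discriminant eigenvalues."""
    total = sum(eigenvalues)
    rows = []
    for i, eigenvalue in enumerate(eigenvalues):
        rows.append({
            # Share of the between-group variance captured by this function.
            "pct_variance": 100 * eigenvalue / total,
            # Canonical correlation between the function and group membership.
            "canonical_corr": sqrt(eigenvalue / (1 + eigenvalue)),
            # Wilks' lambda testing this function together with all later ones.
            "wilks_lambda": prod(1 / (1 + e) for e in eigenvalues[i:]),
        })
    return rows
```

Calling `canonical_summary([0.188, 0.065])` gives canonical correlations of about 0.398 and 0.247 and Wilks' lambda values of about 0.790 and 0.939; squaring the canonical correlations gives the shares of explained variance referred to in the text.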
Table 9
Multivariate results for four staff group discriminant analysis

Function   Eigenvalue   % of variance            Canonical     Wilks' λ   χ²       df   Significance
                        Function    Cumulative   correlation
1          0.188 (a)    74.4        74.4         0.397         0.791      48.792   6    0.000
2          0.065 (a)    25.6        100.0        0.246         0.939      13.038   2    0.001

Independent   Standardised canonical discriminant   Structure matrix (b)
variables     function coefficients
              Function 1    Function 2              Function 1    Function 2
PTE           -0.121        0.994                   0.993 (c)     0.121
SE            0.988         0.162                   -0.162        0.987 (c)

Classification function coefficients (Fisher’s linear discriminant functions)

Independent variables   Police staff groups
                        1          2          3          4
PTE                     0.867      0.805      0.795      0.773
SE                      1.022      1.125      1.137      1.030
Constant                -89.466    -91.137    -92.112    -80.666

(a) First two canonical discriminant functions were used in the analysis.
(b) Pooled within-groups correlations between discriminating variables and standard canonical discriminant functions. Variables ordered by absolute size of correlation within function.
(c) Largest absolute correlation between each variable and any discriminant function.

We have found above that the discriminant function is able to distinguish between the categories of the dependent variable and that there are unequal staff group mean values. A first check of
whether there are indeed staff group mean differences is shown in Table 10, where we can see that none of the group centroids are equal in value. That is, it appears that the first and
second discriminant functions significantly discriminate between all groups. To determine which pairs of group means are significantly different, we estimate a second-stage ANOVA
that allows us to calculate post-hypothesis pairwise tests: Tukey HSD, Scheffe, and least significant difference (see Sharma, 1996, for an introduction to this stage of analysis).
Table 11 gives the ANOVA results for the discriminant scores and the post-hypothesis pairwise tests. The initial analysis shows that the discriminant scores are significantly
different across the staff groups. The Scheffe test (our preferred post-hypothesis pairwise test) shows that function Z1 (which corresponds to PTE) is significantly different
across staff groups, with the exception of groups 1 and 4 and groups 2 and 3, and that for function Z2 (which corresponds to SE) only staff groups 1 and 4 are significantly different. This
latter result corresponds well with our earlier analysis in the sense that staff groups 2 and 3
appear to be operating closer to the constant-returns region of production and, hence, would not be expected to exhibit significant differences in SE scores. Staff groups 1 and 4, however,
would be expected to exhibit increasing and decreasing returns to scale, respectively, and hence might be expected to produce significant differences in SE scores.
The results for PTE, however, suggest a more complicated story than is apparent in Table 7, which suggested that PTE scores declined with police size. This further statistical
analysis suggests that staff groups 1 and 4 tend to be more technically efficient, with staff groups 2 and 3 being less efficient but not significantly different in PTE terms. This
interpretation is supported by the very high mean PTE score for police staff group 1 (99.64) and by the very high mean PTE scores of some large police forces such as the Metropolitan
and Greater Manchester forces. The relatively low mean PTE score for staff group 4 (88.36) is probably explained by the presence of outliers. This is supported by the relatively high
standard deviation of 15.49 evident in Table 7.
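Table 9 also reports Fisher's classification function coefficients. As a hedged illustration of how such functions assign a force to a staff group, the sketch below evaluates one linear function per group and picks the group with the largest value. It assumes the constants in Table 9 are negative (the source extraction renders the minus sign ambiguously), the function name `classify` is hypothetical, and the example inputs are invented rather than taken from the data.

```python
# Fisher classification coefficients per staff group, read from Table 9:
# (PTE coefficient, SE coefficient, constant). Negative constants are an
# assumption about the garbled minus signs in the extracted table.
FISHER_COEFFICIENTS = {
    1: (0.867, 1.022, -89.466),
    2: (0.805, 1.125, -91.137),
    3: (0.795, 1.137, -92.112),
    4: (0.773, 1.030, -80.666),
}


def classification_scores(pte, se):
    """Evaluate each group's linear classification function at (PTE, SE)."""
    return {
        group: w_pte * pte + w_se * se + constant
        for group, (w_pte, w_se, constant) in FISHER_COEFFICIENTS.items()
    }


def classify(pte, se):
    """Return the staff group whose classification function is largest."""
    scores = classification_scores(pte, se)
    return max(scores, key=scores.get)
```

Because the published coefficients are rounded, classifications near a boundary between groups may differ from those produced by the full-precision functions.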
Table 10
Functions at group centroids

Police staff group   Function 1   Function 2
Staff group 1        -0.851       0.611
Staff group 2        0.214        0.005
Staff group 3        0.344        -0.003
Staff group 4        -0.607       -0.393

Table 11
ANOVA and post hoc multiple comparison tests

Dependent variable   ANOVA F-statistic   Significance
Z1                   13.004              0.000
Z2                   4.485               0.004

Test criteria   Tukey HSD            Scheffe              LSD
Z1              1–2, 1–3, 2–4, 3–4   1–2, 1–3, 2–4, 3–4   1–2, 1–3, 2–4, 3–4
Z2              1–4                  1–4                  1–2, 1–3, 1–4, 2–4
5. Conclusions