
Productivity estimation and the size-efficiency relationship in English and Welsh police forces

An application of data envelopment analysis and multiple discriminant analysis

Leigh Drake*, Richard Simper

Department of Economics, Loughborough University, Loughborough, LE11 3TU, England

Abstract

This article utilizes data envelopment analysis (DEA) to estimate the productivity of the English and Welsh police forces and to determine whether there are categorical scale effects in policing using multiple discriminant analysis (MDA). The article demonstrates that by using DEA efficiency results it is possible to make inferences about the optimal size and structure of the English and Welsh police forces. In terms of individual force efficiency, the DEA results suggest that the Surrey police force appears to be 38% less efficient than its efficient reference set and that only three police forces (Cleveland, Dorset, and Leicestershire) are consistently efficient. © 2000 Elsevier Science Inc. All rights reserved.

1. Introduction

During the Thatcher/Major Conservative governments, state services were restructured so that they utilized business techniques in creating “value for money” (see Her Majesty’s Inspector of Constabulary, 1995). The reform of the police service instigated by the Conservative government was in response to the steady increase in crime since 1979, the disproportionate increase in the fear of crime, and the increasing cost of the police service in real terms, from over £1 billion in 1979–1980 to nearly £7 billion in 1996–1997 (source: Home Office). These factors led to an inspection and review of the police under the Conservative government, which included agencies such as Her Majesty’s Inspector of Constabulary (HMIC) and the Audit Commission, and the introduction of various public charters, including the Citizen’s Charter and the Victim’s Charter (for a discussion, see Stephens, 1994; Sullivan, 1998).

* Corresponding author. Tel.: +44-0-1509-222709; fax: +44-0-1509-2233910.

E-mail addresses: L.M.Drake@lboro.ac.uk (L. Drake), R.Simper@lboro.ac.uk (R. Simper)

0144-8188/00/$ – see front matter © 2000 Elsevier Science Inc. All rights reserved.

This comprehensive review resulted in various publications concerned with efficiency in the service and included Audit Commission (1990), Home Office (1993), Police Research Group (1993), and the report by Sheehy (1993), which led to recommendations included in The Police and Magistrates’ Courts’ Act of 1994. One of the main recommendations of the Sheehy report was to change the nature of police management from a public to a business-oriented organization and to introduce efficiency targets that were coordinated with local police authorities. Sullivan (1998) argues that the police reform of the 1990s led to the managerialism of the service. That is, “managerialism referred to the belief that all state services do better when reconceived and restructured in terms of the business community’s values of efficiency and effectiveness” (p. 307). The government’s concept of “value for money” in the police service has led us to posit a socioeconomic model of the modern police force. That is, we introduce a methodology that is based on the reorganization of the police force that was begun in the Thatcher/Major government’s reforms but that is set within the concept of the economics of the firm.

The new Labour government has carried on this agenda of ensuring efficiency in the police force (Home Office Inspectorate of Constabulary, 1998). The report reiterated the previous Conservative government’s efficiency drive in the police service with the HMIC arguing that, “police managers need to work harder to ensure that VFM [value for money] is achieved, for competitive pressure has to be created internally. The costing of activity with subsequent measurement and comparison of performance provide the means by which such encouragement is given” (p. 8, paragraph 10).

This article utilizes data envelopment analysis (DEA) to estimate the relative efficiency of the English and Welsh police forces. To determine whether there are categorical size effects in policing, we also utilize multiple discriminant analysis (MDA). To the authors’ knowledge, this is the first article to examine the relative efficiency of the English and Welsh police forces. The article is organized as follows. In Section 2, we discuss the methodology utilized in our DEA analysis of police forces and provide details on the variables and data sources. Section 3 presents the results of the DEA efficiency rankings together with a discussion of how certain forces have fared over the 1992–1993 to 1996–1997 sample period. In Section 4, we undertake MDA tests to discover whether there are categorical size effects that can discriminate between police forces in the sense that one size of force is likely to be more efficient than another. We conclude the article with Section 5.

2. Methodology and data

The term DEA was coined by Charnes et al. (1978) and denotes a linear-programming technique for constructing the extremal, piecewise frontiers originally developed by Farrell (1957). The constructed relative efficiency frontiers are nonstatistical or nonparametric in the sense that they are constructed through the envelopment of the decision-making units (DMUs), with the “best practice” DMUs forming the nonparametric frontier.

DEA is a leading analytical technique for measuring relative efficiency and has been widely used by both academics and practitioners in evaluating the efficiency of DMUs within an organization or industry in terms of converting resources/inputs into outputs. The technique was originally developed to determine performance measures in non-profit-making organizations where the usual monetary criteria of return on assets/capital, for example, were not appropriate. Hence, DEA has been widely used for relative performance measurement in public sector services such as education (Chalos & Cherian, 1995; Sarrico et al., 1997), health services (see Salinas-Jiménez & Smith, 1995, for an example assessing primary-care performance in the English Family Health Service Authorities, and Thanassoulis et al., 1996, for an example using data concerned with prenatal care in England), and criminal courts (see Pedraja-Chaparro & Salinas-Jiménez, 1996, for an example of Spanish court efficiency). Although DEA is ideally suited to the examination of the relative efficiency of law enforcement units, to our knowledge, this is the first study to apply this technique to the analysis of relative police force efficiency.

A particular advantage of nonparametric techniques such as DEA, relative to statistical or parametric techniques such as stochastic frontier analysis (Drake & Weyman-Jones, 1996; Ferrier & Lovell, 1990), is that the latter must assume a particular functional form that characterizes the relevant economic production function or cost function. Hence, any resultant efficiency scores will be partially dependent on how accurately the chosen functional form represents the true production relationship (i.e., the relationship between inputs/resources and outputs). As DEA is nonparametric and envelops the input/output data of the DMUs under consideration, the derived efficiency results do not suffer from this problem of functional form dependency.

The use of DEA is not confined to public sector enterprises, however. DEA can be applied to any organization/industry in which a reasonably homogeneous set of DMUs use the same set of resources, possibly in different combinations, to produce an identifiable range of outputs or “deliverables,” again possibly in different combinations. In this context, DEA has been applied to the analysis of individual building societies and banks within the U.K. financial sector (Drake & Weyman-Jones, 1992, 1996; Drake, 1997), to the relative efficiency of hotels within a hotel chain (Johns et al., 1997), and to the analysis of the relative efficiency of the individual bank branches of a U.K. clearing bank (Drake & Howcroft, 1994).

2.1. Measuring relative efficiency using DEA

Within the methodological framework of DEA it is possible to decompose the relative efficiency performance of DMUs into the categories initially suggested by Farrell (1957), and later elaborated on by Banker et al. (1984) and Fare et al. (1985). Farrell’s categories are best illustrated, for the single-output/two-input case, in the unit isoquant diagram (Fig. 1), where the unit isoquant (yy) shows the various combinations of the two inputs (x1, x2) that can be used to produce one unit of the single output (y). The firm at E is productively (or overall) efficient in choosing the cost-minimizing production process given the relative input prices represented by the slope of WW’. A DMU at Q is allocatively inefficient in choosing an inappropriate input mix, while a DMU at R is both allocatively inefficient (in the ratio OP/OQ) and technically inefficient (in the ratio OQ/OR) because it requires an excessive amount of both inputs, x, compared with a firm at Q producing the same level of output, y. The use of the unit isoquant implies the assumption of constant returns to scale. However, a firm using more of both inputs than the combination represented by Q may experience either increasing or decreasing returns to scale so that, in general, the technical efficiency ratio OQ/OR may be further decomposed into scale efficiency, OQ/OS, and pure technical efficiency, OS/OR, with point Q in Fig. 1 representing the case of constant returns to scale. The former arises because the firm is at an input-output combination that differs from the equivalent constant returns-to-scale situation. Only the latter pure technical efficiency represents the failure of the firm to extract the maximum output from its adopted input levels and, hence, may be thought of as measuring the unproductive use of resources. In summary,

$$ \text{productive efficiency} = \text{allocative efficiency} \times \text{scale efficiency} \times \text{pure technical efficiency} $$

$$ OP/OR = [OP/OQ] \times [OQ/OS] \times [OS/OR]. \tag{1} $$

Due to the difficulties in accurately measuring all input prices in public sector services such as the police force, this article does not consider allocative efficiency. Hence, concentrating on overall technical efficiency, Farrell (1957) suggested constructing, for each observed DMU, a pessimistic, piecewise, linear approximation to the isoquant, using activity analysis applied to the observed sample of DMUs in the organization/industry in question. This produces a relative rather than an absolute measure of efficiency because the DMUs on the piecewise, linear isoquant constructed from the boundary of the set of observations are defined to be the efficient DMUs.
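As a minimal numerical sketch of Eq. (1), the following Python snippet takes hypothetical distances from the origin O to the points P, Q, S, and R along the ray through R in Fig. 1 and confirms that the three component ratios multiply to OP/OR. The function name and the distances are illustrative assumptions; they are not read from the figure.

```python
def farrell_decomposition(op, oq, os_, or_):
    """Return Farrell's efficiency components for a DMU located at R.

    Arguments are distances from the origin O to the points P, Q, S and R
    on the ray through R in the unit isoquant diagram.
    """
    if not 0 < op <= oq <= os_ <= or_:
        raise ValueError("expected 0 < OP <= OQ <= OS <= OR")
    allocative = op / oq          # OP/OQ
    scale = oq / os_              # OQ/OS
    pure_technical = os_ / or_    # OS/OR
    return {
        "allocative": allocative,
        "scale": scale,
        "pure_technical": pure_technical,
        # Product of the three components, which collapses to OP/OR.
        "productive": allocative * scale * pure_technical,
    }


# Hypothetical distances, chosen only for illustration.
parts = farrell_decomposition(op=6.0, oq=7.5, os_=8.0, or_=10.0)
for name, value in parts.items():
    print(f"{name}: {value:.4f}")
```

Because each ratio shares a term with its neighbor, the intermediate distances cancel, which is why the product depends only on OP and OR.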

Subsequent developments have extended this mathematical linear-programming approach. If there are n DMUs in the industry, all the observed inputs and outputs are represented by the n-column matrices X and Y. The input requirement set, or reference technology, can then be represented by the free disposal convex hull of the observations, i.e., the smallest convex set containing the observations consistent with the assumption that having less of an input cannot increase output. We do this by choosing weighting vectors, $\lambda$ (one for each firm), to apply to the columns of X and Y to show that firm’s efficiency performance in the best light. For each DMU in turn, using x and y to represent its particular observed inputs and outputs, pure technical efficiency is calculated by solving the problem of finding the lowest multiplicative factor, $\theta$, which must be applied to the firm’s use of inputs, x, to ensure that it is still a member of the input requirements set or reference technology. That is, choose $\{\theta, \lambda\}$ to

$$ \min \theta \quad \text{such that} \quad \theta x \geq \lambda' X, \quad y \leq \lambda' Y, \quad \lambda_i \geq 0, \quad \sum_i \lambda_i = 1, \quad i = 1, \ldots, n. \tag{2} $$

To determine scale efficiency, we solve the technical efficiency problem (2) without the constraint that the input requirements set be convex; i.e., we drop the constraint $\sum_i \lambda_i = 1$. This permits scaled-up or scaled-down input combinations to be part of the production possibility set of the DMUs. Fig. 2 illustrates this for the case of a single input and a single output.
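Program (2), with and without the convexity constraint, can be sketched numerically as follows. This is an illustrative implementation rather than the authors’ own code: it assumes NumPy and SciPy are available, uses `scipy.optimize.linprog` as the solver, and stores inputs and outputs as matrices with one column per DMU, as in the text. The function name `dea_input_efficiency` and the four-DMU data set are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def dea_input_efficiency(X, Y, k, convex=True):
    """Input-oriented efficiency score (theta) of DMU k.

    X: (inputs x n) array, Y: (outputs x n) array, one column per DMU.
    convex=True keeps sum(lambda) = 1 (pure technical efficiency);
    convex=False drops it (overall technical efficiency).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    m, n = X.shape
    s = Y.shape[0]

    # Decision vector z = [theta, lambda_1, ..., lambda_n]; minimise theta.
    c = np.zeros(n + 1)
    c[0] = 1.0

    # Inputs:  X @ lambda - theta * x_k <= 0
    A_inputs = np.hstack([-X[:, [k]], X])
    # Outputs: -Y @ lambda <= -y_k, i.e. Y @ lambda >= y_k
    A_outputs = np.hstack([np.zeros((s, 1)), -Y])
    A_ub = np.vstack([A_inputs, A_outputs])
    b_ub = np.concatenate([np.zeros(m), -Y[:, k]])

    A_eq, b_eq = None, None
    if convex:
        A_eq = np.concatenate([[0.0], np.ones(n)]).reshape(1, -1)
        b_eq = np.array([1.0])

    bounds = [(0, None)] * (n + 1)  # theta >= 0 and every lambda_i >= 0
    result = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                     bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return float(result.fun)


# Hypothetical data: one input and one output for four DMUs (columns).
X = np.array([[2.0, 4.0, 8.0, 6.0]])
Y = np.array([[1.0, 3.0, 4.0, 2.0]])

pte = [dea_input_efficiency(X, Y, k, convex=True) for k in range(4)]
oe = [dea_input_efficiency(X, Y, k, convex=False) for k in range(4)]
print("with convexity constraint:   ", np.round(pte, 4))
print("without convexity constraint:", np.round(oe, 4))
```

Solving the program once per DMU, with that DMU’s own column supplying x and y, mirrors the “for each DMU in turn” wording above; only the equality constraint changes between the two runs.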

In Fig. 2, the production possibility set under constant returns to scale is the region to the right of the ray, OC, through the leftmost input-output observation. Any scaled-up or scaled-down versions of the observations are also in the production possibility set under this assumption of constant returns to scale. Imposing the convexity constraint $\sum_i \lambda_i = 1$ ensures that the production possibility set is the area to the right of the piecewise linear frontier VV’, which does not assume constant returns to scale, but allows for the possibility of increasing returns to scale at low output levels and decreasing returns at high output levels. The resulting overall technical and pure technical efficiency ratios, AQ/AR and AS/AR, are illustrated for one of the observations. Scale efficiency is the ratio of the two results.
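The geometry of Fig. 2 can be reproduced for the single-input, single-output case without a linear-programming solver. The sketch below is an illustration under stated assumptions: the four (input, output) observations are hypothetical, and the helper names are invented here. It computes the horizontal distances AQ (to the constant-returns ray), AS (to the piecewise linear frontier), and AR (the observed input), and from them the two efficiency ratios and their quotient.

```python
def crs_min_input(points, y_target):
    """Input on the constant-returns ray needed for y_target (distance AQ)."""
    best_ratio = max(y / x for x, y in points)
    return y_target / best_ratio


def vrs_min_input(points, y_target):
    """Input on the piecewise linear frontier needed for y_target (distance AS).

    With one input and one output, the cheapest convex combination uses
    either a single observation or a mix of two, so every such candidate
    is enumerated and the smallest input is kept.
    """
    candidates = []
    for xi, yi in points:
        if yi >= y_target:
            candidates.append(xi)
        for xj, yj in points:
            if yi < y_target < yj:
                w = (y_target - yi) / (yj - yi)
                candidates.append((1 - w) * xi + w * xj)
    return min(candidates)


def decompose(points, k):
    """Return (AQ/AR, AS/AR, scale efficiency) for observation k."""
    x_obs, y_obs = points[k]                          # AR equals x_obs
    overall = crs_min_input(points, y_obs) / x_obs    # AQ / AR
    pure = vrs_min_input(points, y_obs) / x_obs       # AS / AR
    return overall, pure, overall / pure


# Hypothetical (input, output) observations; the last one plays the role of R.
points = [(2.0, 1.0), (4.0, 4.0), (9.0, 6.0), (6.0, 3.0)]
overall, pure, scale = decompose(points, 3)
print(round(overall, 4), round(pure, 4), round(scale, 4))
```

In this toy data set the observation at (6, 3) could produce its output with 3 units of input on the constant-returns ray and with 10/3 units on the piecewise frontier, so most of its shortfall is pure technical rather than scale related.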

In the case of program (2), the efficiency ratios with and without the convexity constraint may be labeled $\theta_p$ and $\theta_o$, and scale efficiency $\theta_s$ is then $\theta_o/\theta_p$. In the subsequent results, we refer to overall technical efficiency as OE, pure technical efficiency as PTE, and scale efficiency as SE. As explained above, it follows that:

$$ \mathrm{OE} = \mathrm{PTE} \times \mathrm{SE}, \quad \text{and} \quad \mathrm{SE} = \mathrm{OE}/\mathrm{PTE} $$

Using DEA to derive a measure of OE, but also to decompose the results into the components of PTE and SE, allows us to examine not only the effectiveness of the use of resources in policing (PTE) but also to gain an insight into the relationship between efficiency and the size of police forces (SE). All economic organizations that use resources to produce outputs are prone to output ranges that display, first, increasing, then constant, and finally decreasing returns to scale. Obtaining this type of information about English and Welsh police forces may enable us to shed some light on the optimal size and structure of police forces from the perspective of economic efficiency, although it is recognized that there will be many other factors that inevitably impinge on the size and structure of forces.
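The identity SE = OE/PTE can be checked against the figures reported for a single force. The short sketch below uses Surrey’s yearly OE and PTE scores as they appear in Tables 1 and 2 and recomputes the corresponding SE scores on the 0 to 100 scale used in the results; the variable and function names are illustrative only.

```python
# Surrey's yearly scores as reported in Table 1 (OE) and Table 2 (PTE).
years = ["1992-1993", "1993-1994", "1994-1995", "1995-1996", "1996-1997"]
surrey_oe = [43.12, 81.43, 62.15, 65.36, 59.89]
surrey_pte = [61.86, 83.46, 66.77, 68.92, 65.68]


def scale_efficiency(oe, pte):
    """SE = OE / PTE, with all scores expressed on a 0-100 scale."""
    return 100.0 * oe / pte


surrey_se = [round(scale_efficiency(o, p), 2)
             for o, p in zip(surrey_oe, surrey_pte)]

for year, se in zip(years, surrey_se):
    print(year, se)
```

The recomputed values can be compared with the Surrey row of Table 3. Note that the identity holds year by year; the mean SE is the average of the yearly ratios, not the ratio of the mean OE to the mean PTE.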

2.2. The identification of inputs and outputs in policing

The measurement of the police force in its actions and activities is complex because it involves many accountable and nonaccountable services. For example, Redshaw et al. (1997) argue that policing consists of the “prevention and detection of crime and the maintenance of public order, but it also embraces a social service role such as welfare and the prevention of crime” (p. 284). Byrne et al. (1996) differentiate between two main police functions: traditional law enforcement, which includes the prevention and repression of crime; and public service duties, including the regulation of noncriminal activities. However, the complications of measurement rest not with the inputs of the police but with their outputs. That is, the former can be grouped as if the service was a firm and, therefore, include labor and various capital costs.

In our model, we break down the inputs of each police force into four distinct categories, as outlined in the Chartered Institute of Public Finance and Accountancy Police Force Statistics. The first input in our estimation methodology is employment costs. This is the total cost of the employed staff of each police force, which includes all police officer ranks, traffic wardens, civilian staff, and other staff development expenses that occur on a daily basis. We have included civilian staff in the summation of police staff costs because the demarcation between the police function and the civilian involvement in policing has become ever more blurred. In a recent report by Her Majesty’s Inspectorate for the Constabulary (1998), for example, the employment of civilian staff was thought to lead to an enhancement of “efficiency and effectiveness,” and the report revealed that civilian staff represented approximately 30% of total staff employed in the service in 1995–1996. Furthermore, the report argues that “the classification of roles into police/civilian was in itself a redundant concept. Instead, it would be more appropriate to shift the focus to the actual cost of delivering a service function . . .” (HMIC, 1998, p. 55, paragraph 2.48). We believe, therefore, that a total labor cost variable should be utilized as an input, as many of the functions once wholly undertaken by the police are now beginning to be undertaken by civilian staff.

The second input is premises-related expenses, which is the sum of all premises expenses and covers the general daily running costs including repair and maintenance. The third input is transport-related expenses, which includes the running costs and repairs of police vehicles. Finally, the fourth input is capital and other costs. This latter variable includes capital-financing costs and all those costs associated with equipment bought for internal use such as information technology, communications, and furniture, and also includes contracted-in and contracted-out services. This variable has been noted as one that could lead to greater pressure on future capital expenditure due to the need to update information technology facilities so that police forces have the latest equipment. In total, the average annual per capita expenditure of all forces in England and Wales on capital equipment has increased from £67 in 1987–1988 to £123 in 1995–1996 (HMIC, 1998).

A major problem inherent in measuring the efficiency of the police service is how to quantify the role of the police in society. There have been many different measurable outputs that have been advanced as useful in compiling efficiency rankings. The first relate to surveys, where some authors have argued that surveys on the evaluation of police performance “provide more easily quantified measures that dominate HMIC requirements and that . . . can lead to improvements in policing” (Redshaw et al., 1997, p. 284).

It also has been argued, however, that it would be incorrect to survey the public about police service actions as this would introduce bias when using qualitative judgments on how well the police service operates. For example, Shaw & Williamson (1972) argued that young people and the working class rated the service lower than did older people and the middle classes. Recently, Waddington & Braddock (1991) found that white and Asian youths had mixed views of how the police operate, whether as “guardians” or “bullies,” but that black youths tended “to favour the ‘bullies’ perception” (p. 39). The authors concluded that “what distinguishes the races is not the absence of some whites and Asians who regard the police as ‘bullies,’ but the virtual absence amongst their black counterparts of any conception of police as ‘guardians’” (p. 39). These problems and socioeconomic stereotypes imply that surveys could lead to a misinterpretation by the public of police functions.

It is for the above reasons that in addressing the issue of the use of survey data as possible performance indicators of police functions, the HMIC (1998) report “What Price Policing?” concluded that “surveys are an imperfect measure and are affected by sample size, survey methodology and the nature of the population targeted” (p. 85, paragraph 2.167). Skogan (1996) also argued that local surveys were fraught with difficulties because there are inter-area differences. In a Greater Manchester police survey across 13 districts, for example, Skogan found “that the percentage of residents rating ‘burglary and theft’ the ‘single most serious problem’ in their area ranged from 2% to 22%. The range for ‘street crime’ was from less than 1% to 22%, car crime 13% to 28%, and ‘young people hanging/drilling around’ from 5% to 24%” (p. 427).

In addition, a study by Redshaw et al. (1997) surveyed police officers, neighborhood watch coordinators, and members of the public and found that when asked to rank 37 jobs that the police are asked to perform, they responded by ranking the top three as 1) respond immediately to emergencies, 2) detect and arrest offenders, and 3) investigate crime, all three of which relate to the classification of a reactive police force variable. However, the authors note that “even where activities appear to have no immediate ‘crime control’ payoffs, there is widespread acceptance that the British tradition of local, community-based, service-oriented, policing needs to be preserved” (p. 300). It is hoped that in future research we will be able to include in our model a variable that can proxy this all-important function of policing.1 For the above reasons, and because of the lack of quantifiable data on other police functions, we use traditional outputs associated with response/reactive policing.

The response/reactive methodology of measuring policing can be found in a number of studies, including Todd & Ramanathan (1994) and Byrne et al. (1996), who argue that even though half of the police’s community work cannot be modeled, a production function can still be estimated. They break down police activities into crime prevention “where crime is contemplated but not committed,” and crime repression, where the “crime has occurred,” and they use an argument from Schmidt & Witte (1984) that any criminal is likely to assess the probability of getting caught after committing a crime. Todd & Ramanathan (1994) also state that outputs should be a measure of activity, such as the number of arrests made, and that “employee allocations are explained marginally better by background demand for services, . . .” (p. 131).2 Hence, the probability of arrest is linked to the number of arrests in a police force and, in particular, to the number of convictions. For this reason we feel that the clear-up rate used in this study is a good proxy for the preventive methods used by the police.

We also note the criticisms of Walker (1992) in using clear-up rates as an output variable and have, therefore, split our sample police forces into non-Metropolitan English, Welsh, and the Metropolitan and London police forces, and into four size groupings.3 This will allow comparisons of forces that are closely linked by geographic circumstance and economic size, and will mitigate any possible bias in our analysis of police forces and the results presented in Section 4.

1 Jackson (1992) found, using data from the United States, that a sizeable proportion of the cost of policing could be attributed to other factors, such as the decline in the demographic and socioeconomic bases of many cities. Most importantly, these factors, even when held constant, still led to increases in fiscal expenditure as the “police are called upon to manage the social threats that rise from the ashes of social decay” (p. 202).

2 O’Brien (1996) has argued that there is some level of police discretion in reporting or recording criminal incidences. Hence, instead of using recorded crime statistics for a variety of crimes, it would be better to consider serious crimes such as murder, where there is evidence (a body) and with which, therefore, police productivity can be better assessed. This methodology is not used in our estimation because it would exclude a considerable number of other crimes, which constitute a higher proportion of crime in the United Kingdom than in the United States, as represented by the study of O’Brien.

3 Walker (1992) believes that clear-up rates can be misleading and forcibly argues that they “should not be commended as performance indicators by which to judge the policing service delivered to the public. Comparisons between forces in these rates are invidious, and they may lead to inefficient and even possibly corrupt practices” (p. 305).

The second output variable is the total number of traffic offenses that the police and contracted civilian staff (such as traffic wardens) deal with in a year, which includes prosecutions, the number of written warnings, and fixed-penalty fines. This is an important variable as it measures the effect on policing of the 6% increase in registered vehicles (from 21.6 million in 1988 to 22.9 million in 1995–1996) and the associated increased traffic problems encountered by the police. Furthermore, in line with the response/reactive methodology it would be expected that increases in the number of recorded traffic offenses would, ceteris paribus, tend to reduce the per capita number of traffic offenses.

In recent years, the government has implemented a strict drunk-driving campaign, which can take up police time with respect to performing breathalyzer tests on drivers. In fact, there has been a 76% increase in breathalyzer tests since 1988, and the 781,100 tests carried out by police in 1996–1997 is the largest number of tests since breathalyzer tests were introduced in 1967 (source: Home Office). We would expect that, as more people have breathalyzer tests administered to them, serious road accidents would be likely to drop, thereby freeing up more police time for other activities. As mentioned above, we would also expect that increased administration of breathalyzer tests would act as a deterrent to drunk driving and, hence, should, ceteris paribus, ultimately reduce the level of per capita drunk-driving offenses. Following the methodology of Byrne et al. (1996), this action can be classified as a reactive approach to reducing car accidents, and so, the total number of breathalyzer tests constitutes our final output variable. The next section discusses the results from the DEA and MDA using the methodology outlined above.
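The four inputs and three outputs described in this section can be organized into the input and output matrices used by the DEA program. The sketch below is one possible layout, not the authors’ data format: the class name, field names, and figures are hypothetical placeholders, and only the variable definitions come from the text.

```python
from dataclasses import dataclass


@dataclass
class ForceYear:
    """One police force in one financial year (field names are hypothetical)."""
    force: str
    year: str
    # Four inputs
    employment_costs: float
    premises_expenses: float
    transport_expenses: float
    capital_and_other_costs: float
    # Three outputs
    clear_up_rate: float
    traffic_offenses: float
    breathalyzer_tests: float

    def inputs(self):
        return [self.employment_costs, self.premises_expenses,
                self.transport_expenses, self.capital_and_other_costs]

    def outputs(self):
        return [self.clear_up_rate, self.traffic_offenses,
                self.breathalyzer_tests]


def to_matrices(records):
    """Build X (inputs x n) and Y (outputs x n), one column per record."""
    X = [list(row) for row in zip(*(r.inputs() for r in records))]
    Y = [list(row) for row in zip(*(r.outputs() for r in records))]
    return X, Y


# Placeholder figures, invented purely to show the layout.
records = [
    ForceYear("Force A", "1996-1997", 90.0, 5.0, 4.0, 12.0, 30.0, 40000.0, 9000.0),
    ForceYear("Force B", "1996-1997", 150.0, 9.0, 7.0, 20.0, 25.0, 65000.0, 15000.0),
]
X, Y = to_matrices(records)
print(len(X), "input rows;", len(Y), "output rows;", len(X[0]), "DMUs")
```

Keeping one column per force-year matches the n-column matrix convention of Section 2.1, so each column can be passed in turn to an efficiency program as the DMU under evaluation.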

3. DEA results

The DEA results for OE, PTE, and SE for the English and Welsh police forces are detailed in Tables 1, 2, and 3. The corresponding results for the London and Metropolitan police forces are given in Tables 4, 5, and 6. It is important to note, however, that all the efficiency scores were derived by contrasting each police force with all its peers, although we elected to summarize the results separately for the English, Welsh, and Metropolitan police forces. As can be seen from the tables, we produce DEA relative efficiency scores for each year from 1992–1993 to 1996–1997 and also provide details of the mean efficiency scores for each police force over these years.

With respect to the DEA efficiency results, the ratios discussed previously typically produce efficiency scores of unity for efficient DMUs and less than unity for inefficient units. We choose to use a score of 100 for efficient units, however, as this permits a ready interpretation of the degree of inefficiency in percentage terms. In Table 1, for example, Surrey appears to have the least efficient force with an average OE score of 62.39 over the 5-year period and with individual year scores ranging from 43.12 in 1992–1993 to 81.43 in 1993–1994. The interpretation of these results is that, on average, the Surrey force is around 38% less efficient than its efficient reference set forces (those forces that form the relevant frontier and have scores of 100) in terms of translating its available resources into the specified outputs. Furthermore, in 1992–1993 this relative degree of inefficiency was as high as 57%. If we analyze the PTE and SE results for Surrey, we can gain some insight into the source of this level of inefficiency. The mean PTE score is 69.34, for example, while the mean SE score is 89.27. This suggests that the bulk of the inefficiency is not caused by a failure to operate under constant returns to scale. In fact, the SE score in most years is over 90, suggesting that the Surrey force is not too far removed from the constant returns region of operation. The mean PTE score of 69.34, however, suggests that the main factor behind the observed low overall efficiency levels is a failure to utilize resources effectively. Specifically, the mean figure of 69.34 suggests that the Surrey force should be able to reduce their use of resources by around 31%, on average, across the range of inputs without adversely affecting the capacity of the force to deliver the observed outputs. It is possible to make this type of assertion because the DEA results tell us that, in comparison with the Surrey force, other forces with similar input and output configurations are using, on average, 31% fewer inputs to deliver similar output levels.

Table 1

DEA of English and Welsh police forces OE results^a

                      1992–1993  1993–1994  1994–1995  1995–1996  1996–1997       Mean
Non-metropolitan England
Avon and Somerset         93.46      95.31      77.45      81.95      83.83      86.40
Bedfordshire                100      80.58      88.12      82.04        100      90.15
Cambridgeshire              100        100        100      90.26      96.07      97.27
Cheshire                  78.81      76.35      69.59      83.88      81.18      77.96
Cleveland                   100        100        100        100        100        100
Cumbria                   88.97       86.7      79.43      95.76      85.71      87.31
Derbyshire                85.94        100      95.61      96.85        100      95.68
Devon and Cornwall        89.65      78.89       91.9      96.65      93.53      90.12
Dorset                      100        100        100        100        100        100
Durham                    64.08      84.59      82.05      77.07      60.85      73.73
Essex                     87.21      88.02      N/A^b        N/A      81.15      85.46
Gloucestershire             100      91.41      91.06      98.42        100      96.18
Hampshire                 87.54      89.65      86.38        100        100      92.71
Hertfordshire             98.54      88.37      80.40      82.68      88.55      87.71
Humberside                60.39      64.56      72.02      90.82      78.14      73.19
Kent                      74.75      80.61      79.39      77.92      75.81      77.70
Lancashire                66.91      82.77      73.89      87.12        100      82.14
Leicestershire              100        100        100        100        100        100
Lincolnshire              77.48        100        N/A        100        100      94.37
Norfolk                   93.34      77.47      78.09      74.47      87.65       82.0
Northamptonshire            100        100        100        100      98.98      99.80
North Yorkshire           71.15      78.93      70.82      69.79      75.37      73.21
Nottinghamshire           68.79        100        100        100        100      93.76
Staffordshire             87.33      92.83      83.33      93.62        100      91.42
Suffolk                   79.08      80.33      72.90        100      85.79      83.62
Surrey                    43.12      81.43      62.15      65.36      59.89      62.39
Sussex                    94.69      93.93      79.14      87.69      96.01      90.29
Thames Valley               100        100        100      87.15      84.98      94.43
Warwickshire              76.17      77.17      83.28      97.49      93.94      85.61
West Mercia               83.13      86.72      81.93      83.31      78.66      82.75
Wiltshire                 78.85      87.59      75.25      98.41      79.12      83.84
Wales
Dyfed-Powys               75.99      82.45      76.10      83.66      87.81      81.20
Gwent                     72.11        100        100        100        100      94.42
North Wales               86.80      97.14        100      81.69      83.41      89.81
South Wales               92.70        100        100        100        100      98.54

a Data for the Essex (1994–1995 and 1995–1996) and Lincolnshire (1994–1995) police forces were unavailable.

b

Table 2

DEA of English and Welsh police forces PTE results^a

                      1992–1993  1993–1994  1994–1995  1995–1996  1996–1997       Mean
Non-metropolitan England
Avon and Somerset           100        100      82.73      83.03      85.00      90.15
Bedfordshire                100      95.19        100      93.56        100      97.75
Cambridgeshire              100        100        100      96.56        100      99.31
Cheshire                  81.61      77.11      69.88      84.72      84.49      79.56
Cleveland                   100        100        100        100        100        100
Cumbria                   98.59      97.15      94.60        100        100      98.07
Derbyshire                94.58        100      97.58        100        100      98.43
Devon and Cornwall        96.78      87.54        100      98.98      94.59      95.58
Dorset                      100        100        100        100        100        100
Durham                    88.94      84.59      82.25      78.41      78.07      82.45
Essex                     89.42      90.68      N/A^b        N/A      81.48      87.19
Gloucestershire             100        100        100        100        100        100
Hampshire                 97.24      96.77        100        100        100      98.80
Hertfordshire             99.97      88.81      81.96      83.57      88.87      88.64
Humberside                61.92      69.02      72.19      92.61      78.60      74.87
Kent                      79.45      80.66      83.78      89.00      94.03      85.38
Lancashire                75.00      82.79      76.69      91.72        100      85.24
Leicestershire              100        100        100        100        100        100
Lincolnshire              88.72        100        N/A        100        100      97.18
Norfolk                   96.34      82.61      80.58      77.52      88.81      85.17
Northamptonshire            100        100        100        100      99.06      99.81
North Yorkshire           80.21      83.48      80.15      77.90      83.65      81.08
Nottinghamshire           69.23        100        100        100        100      93.85
Staffordshire             95.90      92.89      83.34      94.15        100      93.26
Suffolk                   90.16      92.66      91.48        100        100      94.86
Surrey                    61.86      83.46      66.77      68.92      65.68      69.34
Sussex                      100        100      86.09      91.17      96.23      94.70
Thames Valley               100        100        100      87.99      91.61      95.92
Warwickshire              94.40      98.71        100        100        100      98.62
West Mercia               84.00      88.15      81.99      83.40      80.18      83.54
Wiltshire                 90.55      97.71      91.19        100        100      95.89
Wales
Dyfed-Powys                 100        100        100        100        100        100
Gwent                       100        100        100        100        100        100
North Wales               90.27      98.04        100      89.34      88.57      93.24
South Wales                 100        100        100        100        100        100

a Data for the Essex (1994–1995 and 1995–1996) and Lincolnshire (1994–1995) police forces were unavailable.

b

If we turn now to Tables 4, 5 and 6, we see that the observed OE levels of the Metropolitan police force require a different interpretation. Table 4 indicates that the Metropolitan force has a mean OE score of only 57.52, which is the lowest of any force in the sample. With the exception of 1993–1994, the figures range from 30.56 in 1992–1993 to 62.06 in 1994–1995.

Table 3

DEA of English and Welsh police forces SE results (a)

Force                 1992–1993  1993–1994  1994–1995  1995–1996  1996–1997  Mean

Non-metropolitan England
Avon and Somerset     93.46      95.31      93.62      98.69      98.63      95.94
Bedfordshire          100        84.65      88.12      87.69      100        92.09
Cambridgeshire        100        100        100        93.47      96.07      97.91
Cheshire              96.57      99.01      99.59      99.01      96.08      98.051
Cleveland             100        100        100        100        100        100
Cumbria               90.24      89.24      83.96      95.76      85.71      88.98
Derbyshire            90.86      100        97.98      96.85      100        97.14
Devon and Cornwall    92.63      90.12      91.90      97.64      98.88      94.24
Dorset                100        100        100        100        100        100
Durham                72.05      100        99.76      98.29      77.94      89.61
Essex                 97.53      97.07      N/A        N/A        99.59      98.06
Gloucestershire       100        91.41      91.06      98.42      100        96.18
Hampshire             90.02      92.64      86.38      100        100        93.81
Hertfordshire         98.57      99.50      98.10      98.94      99.64      98.95
Humberside            97.53      93.54      99.76      98.07      99.41      97.66
Kent                  94.08      99.94      94.76      87.55      80.62      91.39
Lancashire            89.21      99.98      96.35      94.98      100        96.10
Leicestershire        100        100        100        100        100        100
Lincolnshire          87.33      100        N/A        100        100        96.83
Norfolk               96.89      93.78      96.91      96.07      98.69      96.47
Northamptonshire      100        100        100        100        99.92      99.98
North Yorkshire       88.70      94.55      88.36      89.58      90.10      90.26
Nottinghamshire       99.36      100        100        100        100        99.87
Staffordshire         91.06      99.94      99.99      99.44      100        98.08
Suffolk               87.71      86.69      79.69      100        85.79      87.98
Surrey                69.71      97.57      93.08      94.83      91.181     89.27
Sussex                94.69      93.93      91.93      96.18      99.771     95.30
Thames Valley         100        100        100        99.05      92.77      98.36
Warwickshire          80.689     78.18      83.28      97.49      93.94      86.72
West Mercia           98.96      98.38      99.93      99.89      98.10      99.05
Wiltshire             87.079     89.64      82.52      98.41      79.12      87.35

Wales
Dyfed-Powys           75.99      82.45      76.10      83.66      87.81      81.20
Gwent                 72.11      100        100        100        100        94.42
North Wales           96.156     99.08      100        91.41      94.17      96.17
South Wales           92.70      100        100        100        100        98.54

(a) Data for the Essex (1994–1995 and 1995–1996) and Lincolnshire (1994–1995) police forces were unavailable.

It would be inappropriate to label the Metropolitan as a highly inefficient police force, however, as Table 5 indicates that the corresponding PTE scores are 100 in each year of the study. This suggests that, given the scale of the Metropolitan’s operations, it is a highly efficient police force with no obvious inefficiencies in resource utilization. In contrast, Table 6 reveals an average SE score of only 57.52, confirming that all of the observed overall inefficiency is associated with scale effects. Given that the Metropolitan is the largest force in the country, this result strongly suggests that there are significant diseconomies of scale at work in large police force operations. As in other large organizations, this is probably attributable to the extra bureaucracy and layers of management structure that tend to accompany large-scale operations.
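The decomposition behind this argument treats overall efficiency as the product of pure technical and scale efficiency, so that SE can be recovered as OE divided by PTE. A minimal Python sketch of that identity, checked against two reported scores; the function name is illustrative and not taken from the article:

```python
def scale_efficiency(oe: float, pte: float) -> float:
    """Return SE on a 0-100 scale from OE and PTE on a 0-100 scale.

    Relies on the identity OE = PTE x SE (scores expressed as fractions).
    """
    if pte <= 0:
        raise ValueError("PTE must be positive")
    return 100.0 * oe / pte


# Merseyside, 1992-1993: OE 69.33 (Table 4) and PTE 81.64 (Table 5).
merseyside_se = scale_efficiency(69.33, 81.64)

# Metropolitan, sample mean: OE 57.52 (Table 4); PTE is 100 in every year
# according to the text.
metropolitan_se = scale_efficiency(57.52, 100.0)

print(round(merseyside_se, 2), round(metropolitan_se, 2))
```

Both values agree with the SE scores listed in Table 6 (84.92 for Merseyside in 1992–1993 and a mean of 57.52 for the Metropolitan force).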

Further statistical examination of the relative efficiency results is undertaken in Section 4, but it is interesting to note from Table 7 that the mean SE levels for the largest and smallest police forces (86.99 and 85.85, respectively) are considerably lower than those of the intermediate-size forces, staff group 2 (95.11) and staff group 3 (96.23). This is not surprising because we would expect that a large proportion of staff group 1 forces would exhibit increasing returns to scale, while the majority of staff group 4 forces would exhibit diseconomies of scale. Hence, both sets of police forces would have SE scores well below 1. In contrast, staff groups 2 and 3 would be expected to be operating much closer to, if not at, the constant-returns region of the production relationship and would, therefore, have SE scores at, or closer to, unity. The fact that staff group 3 exhibits the highest mean SE score (96.23), with by far the lowest standard deviation (4.04), strongly suggests that police forces in this size band and staff group are close to the optimum in terms of scale efficiency. Clearly, this type of information could prove highly informative in the context of any proposed restructuring of police forces such as the merging of forces or the redrawing of police force boundaries, etc.

Table 4

DEA of London and Metropolitan police force OE results

Force                 1992–1993  1993–1994  1994–1995  1995–1996  1996–1997  Mean

Metropolitan
Greater Manchester    90.32      100        100        100        100        98.06
Merseyside            69.33      66.75      60.87      76.01      72.64      69.12
South Yorkshire       74.61      76.68      69.66      72.99      78.70      74.53
Northumbria           73.41      60.96      69.76      73.21      76.05      70.68
West Midlands         18.23      74.54      68.68      71.58      74.28      61.46
West Yorkshire        65.49      69.65      70.99      74.51      77.08      71.54

London
City                  100        100        75.00      73.21      71.22      83.89
Metropolitan          30.56      99.35      62.06      42.24      53.41      57.52

Table 5

DEA of London and Metropolitan police force PTE results

Force                 1992–1993  1993–1994  1994–1995  1995–1996  1996–1997  Mean

Metropolitan
Greater Manchester    100        100        100        100        100        100
Merseyside            81.64      71.55      87.74      92.00      74.50      81.49
South Yorkshire       81.21      82.48      74.72      76.85      79.16      78.88
Northumbria           80.65      61.07      80.79      82.74      78.09      76.67
West Midlands         18.82      74.54      82.05      100        100        75.08
West Yorkshire        78.73      69.73      77.40      100        100        85.17

London
City                  100        100        100        100        100        100

Interestingly, Table 7 indicates that the smallest forces (those in staff group 1) appear to be the most technically efficient, with a mean PTE score of 99.64 and a standard deviation of only 1.30. Furthermore, the mean PTE scores appear to decline with size as the scores for staff groups 2, 3, and 4 are 91.73, 90.57, and 88.36, respectively. This suggests that, leaving aside the issue of the scale of operations, effective resource usage and cost control are easier to accomplish in smaller police forces than in larger ones.

Table 6

DEA of London and Metropolitan police forces SE results

Force                 1992–1993  1993–1994  1994–1995  1995–1996  1996–1997  Mean

Metropolitan
Greater Manchester    90.32      100        100        100        100        98.06
Merseyside            84.92      93.29      69.38      82.62      97.50      85.54
South Yorkshire       91.87      92.97      93.23      94.98      99.42      94.49
Northumbria           91.02      99.82      86.35      88.48      97.39      92.61
West Midlands         96.86      100        83.71      71.58      74.28      85.29
West Yorkshire        83.18      99.89      91.72      74.51      77.08      85.28

London
City                  100        100        75         73.21      71.22      83.89
Metropolitan          30.56      99.35      62.06      42.24      53.41      57.52

Table 7

Group descriptive statistics and the test for equality of group means between different staff groups, 1992–1993 to 1996–1997

Dependent variable    Independent variables            Sample size
                      SE         PTE        OE

Group means
Staff group 1         85.85      99.64      85.56      19
Staff group 2         95.11      91.73      87.43      101
Staff group 3         96.23      90.57      87.17      53
Staff group 4         86.99      88.36      76.13      39
Total                 93.07      91.52      85.12      212

Standard deviations
Staff group 1         11.02      1.30       11.23      19
Staff group 2         6.61       10.44      12.73      101
Staff group 3         4.04       9.72       10.24      53
Staff group 4         16.87      15.49      18.89      39

Hence, we appear to have an interesting dichotomy in the sense that levels of PTE appear to decline with police force size, but there is clear evidence of an inverted U-shaped relationship with respect to scale efficiency. The latter is suggestive of the classic U-shaped average-cost curve that is typically attributed to increasing, and eventually decreasing, returns to scale. Indeed, it is interesting to note that the mean SE scores support the notion of a “saucer-shaped” average cost curve for policing in the sense that there appear to be substantial increasing and decreasing returns to scale in evidence at the extreme ends of the size spectrum, but a relatively large region of constant returns or modest economies/ diseconomies of scale at intermediate size ranges. Although this is a very common finding in economic studies of industrial production, it is a particularly interesting result to find that the same economic production relationship appears to hold good in public sector services such as policing.

Clearly, the apparent tradeoff between PTE and SE presents particular problems in the context of decisions over police force management and structure, and it warrants further research and investigation. A simple way of characterizing the problem is to think of the SE results (and the corresponding notional average cost curve) as revealing the minimum level of average costs that could be attained for any given scale of output, provided that all resources are used effectively. Furthermore, this information reveals the relationship between size and efficiency, or size and unit costs. Economists frequently refer to this as revealing the minimum efficient scale of operation, i.e., the minimum level of output that exhausts all economies of scale. Our results suggest that this would be at staff group 2 or 3.

As Leibenstein (1966) pointed out, however, these notional minimum costs at each given scale of operation are not always realized, owing to factors such as managerial inefficiency. He referred to this failure to realize the minimum possible unit costs as “X-inefficiency,” and the DEA analog to this is our PTE results, which show whether resources are being used at their maximum efficiency for any given scale of output. Clearly, a failure to utilize resources at their maximum efficiency would result in unit costs exceeding their potential minimum. Our finding that PTE declines with the scale of output has powerful implications because it suggests that X-inefficiency increases with size and, hence, that the wedge between minimum and actual unit costs will be increasing, while at the same time minimum unit costs are actually declining with size up to a point.

Our results suggest, therefore, that to enhance the overall efficiency of the English and Welsh police forces two approaches are necessary. First, consideration must be given to some structural reorganization such that individual forces are operating closer to the minimum efficient scale, which appears to be staff groups 2 to 3. Second, the apparent problem of worsening X-inefficiencies must be investigated and tackled in larger police forces. While DEA can provide valuable insights into the reductions in inputs necessary to achieve PTE for given output levels (using comparisons with police forces that have efficient reference sets), it seems likely that a review of management and staffing structures in the larger police forces may be required. It is interesting to note in this respect that only three police forces, Cleveland, Dorset, and Leicestershire, are consistently efficient in terms of both SE and PTE (and, necessarily, OE). Hence, these forces will tend to form part of the efficient reference set of police forces for a large number of inefficient forces, and a detailed comparison of these “best-practice” forces with the less efficient units could provide very useful information in any reorganization/restructuring process.

So far we have not addressed the issue of the statistical significance of the differences in our efficiency scores across staff size groups. We rectify this in the next section using analysis of variance (ANOVA) and discriminant analysis techniques.

4. MDA results

To assess the DEA results further, we adopt a dual post-hypothesis testing strategy that utilizes ANOVA and MDA. Both of these statistical techniques allow us to determine whether there are any significant differences between grouped police forces (see Hair et al., 1995, for an introduction). In this analysis, the categorical variable partitions the police forces into four groups that are determined by the total number of police and civilian staff operating in each force. This allows us to determine, for example, whether large police forces (by total staff employed) display SE, PTE, or OE scores that are significantly better (or worse) than those of their smaller counterparts. A police force with 0 to 1,500 total staff is a member of staff group 1; one with 1,501 to 3,000, staff group 2; one with 3,001 to 4,500, staff group 3; and one with more than 4,500, staff group 4. To ensure that we meet at least the minimum requirements necessary for MDA, we stacked the 5 years of data before estimation. Table 7 gives the stacked group summary statistics for 1992–1993 to 1996–1997 for the three independent variables, OE, PTE, and SE. As outlined previously, staff group 3 has the highest mean SE and the lowest standard deviation, while staff group 1 has the highest mean PTE and the smallest standard deviation. The results for OE reveal that staff groups 1, 2, and 3 are very close in terms of both their means and their standard deviations. However, staff group 4 has the lowest mean value and the largest standard deviation. This can be attributed to the wide variations in OE for the West Midlands and the Metropolitan police forces that are evident in Table 4.
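The grouping rule described above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and it assumes that any force with more than 4,500 total staff belongs to staff group 4, so that no staff count is left unassigned.

```python
def staff_group(total_staff: int) -> int:
    """Map total police and civilian staff to staff groups 1-4.

    Thresholds follow the text: 0-1,500, 1,501-3,000, 3,001-4,500,
    and (by assumption) everything above 4,500.
    """
    if total_staff < 0:
        raise ValueError("total_staff must be non-negative")
    if total_staff <= 1500:
        return 1
    if total_staff <= 3000:
        return 2
    if total_staff <= 4500:
        return 3
    return 4


# Illustrative staff counts (not taken from the article's data).
print([staff_group(n) for n in (900, 2200, 4100, 7000)])
```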

The estimation analysis that is followed in this study involves further testing procedures after the DEA estimation. The first stage is the estimation of an ANOVA, a univariate test, where the dependent variables are PTE and SE, and the independent variable is the categorical staff group.4 The null hypothesis of interest is that the means of the staff groups are equal. As can be seen from Table 8, the F-statistic is greater than the critical value, and we can conclude that there is a statistically significant size difference associated with our two measures of efficiency. However, we do not know whether the differences are between staff groups 1 and 2, 1 and 3, 1 and 4, 2 and 3, 2 and 4, or, finally, 3 and 4.
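The univariate test described here can be illustrated directly from the group sizes, means, and standard deviations reported in Table 7. The sketch below is not the authors' procedure (they used SPSS); it assumes the reported standard deviations are sample standard deviations (n − 1 denominator) and uses a hypothetical helper name.

```python
# Group sizes, means and standard deviations as reported in Table 7
# (staff groups 1 to 4).
SIZES = [19, 101, 53, 39]
MEANS = {
    "SE": [85.85, 95.11, 96.23, 86.99],
    "PTE": [99.64, 91.73, 90.57, 88.36],
    "OE": [85.56, 87.43, 87.17, 76.13],
}
SDS = {
    "SE": [11.02, 6.61, 4.04, 16.87],
    "PTE": [1.30, 10.44, 9.72, 15.49],
    "OE": [11.23, 12.73, 10.24, 18.89],
}


def anova_from_summary(sizes, means, sds):
    """Return (F ratio, univariate Wilks' lambda) from group summaries."""
    n_total = sum(sizes)
    k = len(sizes)
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    # Assumes each sd is a sample standard deviation (n - 1 denominator).
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
    f_ratio = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    wilks_lambda = ss_within / (ss_between + ss_within)
    return f_ratio, wilks_lambda


results = {name: anova_from_summary(SIZES, MEANS[name], SDS[name])
           for name in MEANS}
for name, (f_ratio, wilks_lambda) in results.items():
    print(f"{name}: F = {f_ratio:.3f}, Wilks' lambda = {wilks_lambda:.3f}")
```

Under these assumptions the computed F ratios and Wilks' lambda values come out close to those listed in Table 8, with small differences attributable to rounding in the published summary statistics.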

4 As OE is a product of the multiplication of PTE and SE, it is excluded in the second-stage estimation due to problems of multicollinearity.

MDA is much like ANOVA, but in this case the dependent variable is the categorical staff group, and the independent variables are PTE and SE. The reason for estimating the MDA is that it offers an alternative insight into the ANOVA results found in Table 8. For example, if we are trying to predict to which staff group a police force should belong, given a value for PTE or SE, then MDA will derive the linear combination of the two independent variables that would discriminate best between the staff groups. MDA distinguishes between the groups by multiplying PTE and SE by their corresponding weights and then adding these products together, giving a single discriminant score for each police force. After averaging the discriminant scores in each staff group, we obtain the centroid, which we can use to compare how “far apart” the staff groups are. In this case, our hypothesis of equal means for the staff groups is based on comparing the distributions of the discriminant scores. The test analysis is such that “if the overlap in the distribution is small, the discriminant function separates the groups well. If the overlap is large, the function is a poor discriminator between the groups” (Hair et al., 1992).5

Before discussing the MDA results, we need to compare the hit ratio with the maximum-chance and proportional-chance criteria to assess the predictive accuracy of the function. The maximum-chance criterion is the proportion of forces that would be classified correctly if all of them were placed in the staff group with the greatest probability of occurrence, which in this model is 48%. However, with unequal groups, we can also calculate a proportional-chance criterion, which in this model equals 33.34%. Hence, as our hit ratio (49.1%) exceeds both the maximum-chance and proportional-chance criteria, we can conclude that the MDA model is valid based on these measures. We also checked our model using Press’s Q-statistic, which tests whether the classification of staff groups by MDA is more accurate than a classification carried out by chance. With a total of 104 predicted group memberships correctly classified, the estimate of Press’s Q equals 62.89, which is significant at the 5% critical level. Therefore, utilizing the results obtained from the maximum-chance criterion, the proportional-chance criterion, and the hit ratio, we can conclude that the MDA model is better at predicting staff group membership than a prediction carried out by chance.
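The classification-accuracy checks in this paragraph follow standard textbook formulas, which can be sketched as below. The sketch assumes that the Table 7 group sizes serve as the group probabilities and takes the count of 104 correct classifications from the text; figures computed this way need not match the quoted percentages to the last digit.

```python
# Group sizes from Table 7 and the number of correct classifications
# quoted in the text.
SIZES = [19, 101, 53, 39]
N_CORRECT = 104

n_total = sum(SIZES)
n_groups = len(SIZES)

# Maximum-chance criterion: share of the largest group.
max_chance = max(SIZES) / n_total

# Proportional-chance criterion: sum of squared group proportions.
prop_chance = sum((n / n_total) ** 2 for n in SIZES)

# Hit ratio: share of observations classified correctly.
hit_ratio = N_CORRECT / n_total

# Press's Q = (N - n*K)^2 / (N * (K - 1)), compared with the chi-square
# critical value for 1 degree of freedom at the 5% level.
press_q = (n_total - N_CORRECT * n_groups) ** 2 / (n_total * (n_groups - 1))
CHI2_CRITICAL_1DF_5PCT = 3.84

print(f"maximum chance      = {max_chance:.1%}")
print(f"proportional chance = {prop_chance:.1%}")
print(f"hit ratio           = {hit_ratio:.1%}")
print(f"Press's Q           = {press_q:.2f}, "
      f"significant: {press_q > CHI2_CRITICAL_1DF_5PCT}")
```

Because Press's Q is compared with a chi-square critical value, it is a test statistic rather than a percentage.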

5 MDA was estimated using SPSS version 8 (SPSS Inc; Chicago, IL), with the stepwise Mahalanobis distance, Fisher’s function coefficients method, a decision rule of F being between 0.05 and 0.15, prior probabilities computed from group size, and the use of the within-groups covariance matrix.

Table 8

Test for equality of staff group means

Independent variables  Wilks’ λ (a)  Univariate F ratio (b)  Significance

OE                     0.905         7.246                   0.000
PTE                    0.936         4.708                   0.003
SE                     0.843         12.880                  0.000

(a) Degrees of freedom, 3.

Table 9 provides the overall MDA results and indicates that the discriminant functions are highly significant, as measured by Wilks’ λ and the χ² statistics. Overall, the first function accounts for 74.4% of the variance, and the second function accounts for 25.6%. However, the functions display a low canonical correlation of 0.39 and 0.25, respectively; that is, 15.21% and 6.25% of the variance in the dependent variable can be explained by this model (in regression models this is the R² statistic). In this context, although the latter figures are relatively low, we must remember that MDA involves the process of determining which staff group a police force should be included in. However, it cannot take into account sociological and political factors that cannot be included in the calculation but that will have an effect on how large a police force will be and, therefore, on its staff group classification.
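The summary measures discussed here are linked to the eigenvalues in Table 9 by standard identities, which the following sketch illustrates. It starts from the rounded eigenvalues as printed (0.188 and 0.065), so the results can differ from the published figures in the last digit.

```python
import math

# Eigenvalues of the two discriminant functions, as printed in Table 9.
EIGENVALUES = [0.188, 0.065]

# Share of between-group variance attributed to each function.
total = sum(EIGENVALUES)
pct_variance = [100.0 * e / total for e in EIGENVALUES]

# Canonical correlation of each function: sqrt(eigenvalue / (1 + eigenvalue)).
canonical_corr = [math.sqrt(e / (1.0 + e)) for e in EIGENVALUES]

# Wilks' lambda for "functions i through last": product of 1 / (1 + eigenvalue).
wilks_lambda = []
for i in range(len(EIGENVALUES)):
    lam = 1.0
    for e in EIGENVALUES[i:]:
        lam /= 1.0 + e
    wilks_lambda.append(lam)

# Squared canonical correlations (the R-squared analog mentioned in the text).
squared_corr = [r ** 2 for r in canonical_corr]

for i in range(len(EIGENVALUES)):
    print(f"function {i + 1}: {pct_variance[i]:.1f}% of variance, "
          f"canonical r = {canonical_corr[i]:.3f}, "
          f"r^2 = {squared_corr[i]:.3f}, "
          f"Wilks' lambda = {wilks_lambda[i]:.3f}")
```

The squared correlations obtained from the unrounded correlations are roughly 0.16 and 0.06; the 15.21% and 6.25% quoted in the text correspond to squaring the rounded values 0.39 and 0.25.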

We have found above that the discriminant function is able to distinguish between the dependent variables and that there are unequal staff group mean values. A first check of whether there are indeed staff group mean differences is shown in Table 10, where we can see that none of the group centroids are equal in value. That is, it appears that the first and second discriminant functions significantly discriminate between all groups. To determine which pairs of group means are significantly different, we estimate a second-stage ANOVA that allows us to calculate post-hypothesis pairwise tests: Tukey HSD, Scheffe, and least significant difference (see Sharma, 1996, for an introduction to this stage of analysis).

Table 11 gives the ANOVA results for the discriminant scores and the post-hypothesis pairwise tests. The initial analysis shows that the discriminant scores are significantly different across the staff group means and that the Scheffe test (our preferred post-hypothesis pairwise test) shows that function Z1 (which corresponds to PTE) is significantly different across staff groups, with the exception of 1 and 4 and 2 and 3, and that function Z2 (which corresponds with SE) shows that only staff groups 1 and 4 are significantly different. This latter result corresponds well with our earlier analysis in the sense that staff groups 2 and 3 appear to be operating closer to the constant-returns region of production and, hence, would not be expected to exhibit significant differences in SE scores. Staff groups 1 and 4, however, would be expected to exhibit increasing and decreasing returns to scale, respectively, and hence might be expected to produce significant differences in SE scores.

Table 9

Multivariate results for four staff group discriminant analysis

Function  Eigenvalue  % of variance  Cumulative %  Canonical correlation  Wilks’ λ  χ²      df  Significance

1         0.188 (a)   74.4           74.4          0.397                  0.791     48.792  6   0.000
2         0.065 (a)   25.6           100.0         0.246                  0.939     13.038  2   0.001

Independent variables  Standardised canonical discriminant coefficients  Structure matrix (b)
                       Function 1    Function 2                          Function 1    Function 2

PTE                    −0.121        0.994                               0.993 (c)     0.121
SE                     0.988         0.162                               −0.162        0.987 (c)

Classification function coefficients (Fisher’s linear discriminant functions)

Independent variables  Police staff groups
                       1          2          3          4

PTE                    0.867      0.805      0.795      0.773
SE                     1.022      1.125      1.137      1.030
Constant               −89.466    −91.137    −92.112    −80.666

(a) First two canonical discriminant functions were used in the analysis.

(b) Pooled within-groups correlations between discriminating variables and standard canonical discriminant functions. Variables ordered by absolute size of correlation within function.

The results for PTE, however, suggest a more complicated story than is apparent in Table 7 (which suggested that PTE scores declined with police size). This further statistical analysis suggests that staff groups 1 and 4 tend to be more technically efficient, with staff groups 2 and 3 being less efficient but not significantly different in PTE terms. This interpretation is supported by the very high mean PTE score for police staff group 1 (99.64) and by the very high mean PTE scores of some large police forces such as the Metropolitan and Greater Manchester forces. The relatively low mean PTE score for staff group 4 (88.36) is probably explained by the presence of outliers. This is supported by the relatively high standard deviation of 15.49 evident in Table 7.

Table 10

Functions at group centroids

Police staff group  Function 1  Function 2

Staff group 1       −0.851      0.611
Staff group 2       0.214       0.005
Staff group 3       0.344       −0.003

Table 11

ANOVA and post hoc multiple comparison tests

Dependent variable  ANOVA F-statistic  Significance

Z1                  13.004             0.000
Z2                  4.485              0.004

Test criteria (significantly different staff group pairs)

                    Tukey HSD           Scheffe             LSD

Z1                  1–2, 1–3, 2–4, 3–4  1–2, 1–3, 2–4, 3–4  1–2, 1–3, 2–4, 3–4
Z2                  1–4                 1–4                 1–2, 1–3, 1–4, 2–4

5. Conclusions

This article is the first to use DEA to examine the relative efficiency of the English and Welsh police forces. Relatively few forces were identified as being consistently efficient throughout the sample period. The “best-practice” police forces could be used as valuable comparators in any attempt to restructure police forces to improve productivity and efficiency. The study revealed important information concerning the size-efficiency relationship in English and Welsh policing. Specifically, evidence of significant increasing and decreasing returns was found at the extremes of the size spectrum, and the SE results were supportive of a “saucer-shaped” average cost curve in policing. This was confirmed by subsequent ANOVA and MDA analysis. Furthermore, the SE results suggest that the optimal size of police forces in England and Wales is at staff group 2 or 3.

Interestingly, however, the PTE scores suggested a very different size-efficiency relationship for X-efficiency. Specifically, the smallest and largest forces tend to produce relatively higher PTE scores than the intermediate-size forces, although staff group 1 showed the greatest consistency with respect to high levels of X-efficiency. The clear differences between the SE and PTE results suggest that the process of enhancing overall police force efficiencies in England and Wales will necessarily be difficult and complex. Nevertheless, this article demonstrates that DEA can produce highly informative results that could, in conjunction with other types of analysis, potentially influence the design of efficiency-enhancing reforms in English and Welsh policing.

References

Audit Commission. (1990). Effective Policing – Performance Review in Police Forces. Police Paper No. 8, London.

Banker, R. D., Charnes, A., & Cooper, W. W. (1984). Some models for estimating technical and scale efficiencies in data envelopment analysis. Management Sci 30, 1078–1092.

Byrne, D., Dezhbakhsh, H., & King, R. (1996). Unions and police productivity: an econometric investigation. Industrial Relations 35, 566–584.

Chalos, P., & Cherian, J. (1995). An application of data envelopment analysis to public-sector performance-measurement and accountability. J Accounting Public Policy 14, 143–160.

Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring the efficiency of decision making units. Eur J Operational Res 2, 429–444.

Drake, L. (1997). Measuring efficiency in UK banking. Economic Research Paper No. 97/18, Department of Economics, Loughborough University.

Drake, L., & Howcroft, B. (1994). Relative efficiency in the branch network of a UK bank: an empirical study. Omega Int J Management Sci 22, 83–90.

Drake, L., & Weyman-Jones, T. G. (1992). Technical and scale efficiency in UK building societies. Appl Financial Econ 2, 1–9.

Drake, L., & Weyman-Jones, T. G. (1996). Productive and allocative inefficiencies in U.K. building societies: a comparison of non-parametric and stochastic frontier techniques. Manchester Sch of Econ of Soc Studies 114, 22–37.

Fare, R., Grosskopf, S., & Lovell, C. A. K. (1985). The Measurement of Efficiency of Production. Boston, MA: Kluwer Nijhoff.

Ferrier, G. D., & Lovell, C. A. K. (1990). Measuring cost efficiency in banking: econometric and linear programming evidence. J Econometrics 46, 229–245.

Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1992). Multivariate Data Analysis With Readings. New York: Macmillan Publishing Company.

Home Office. (1993). Police Reform. The Government’s Proposals for the Police Service in England and Wales. London, UK: Her Majesty’s Stationery Office, CM 2281.

Her Majesty’s Inspector of Constabulary. (1995). Obtaining Value for Money in the Police Service. London, UK: HMIC.

Her Majesty’s Inspector of Constabulary. (1998). What Price Policing. London, UK: HMIC.

Jackson, P. I. (1992). The police and social threat: urban transition, youth gangs, and social control. Policing and Society 2, 193–204.

Johns, N., Howcroft, B., & Drake, L. (1997). The use of data envelopment analysis to monitor hotel productivity. Prog Tourism Hospitality Res 3, 119–127.

Leibenstein, H. (1966). Allocative efficiency vs. x-efficiency. Am Econ Rev 56, 392–415.

O’Brien, R. M. (1996). Police productivity and crime rates: 1973–1992. Criminology 34, 183–207.

Pedraja-Chaparro, F., & Salinas-Jiménez, J. (1996). An assessment of the efficiency of Spanish courts using DEA. Appl Econ 28, 1391–1403.

Police Research Group. (1993). Opportunities for reducing the administrative burdens on the police. Police Research Series, Paper No. 3, Home Office Police Department, London.

Redshaw, J., Mawby, R. I., & Bunt, P. (1997). Evaluating core policing in Britain: the views of police and consumers. Int J Sociology Law 25, 283–301.

Sarrico, C. S., Hogan, S. M., Dyson, R. G., & Athanassopoulos, A. D. (1997). Data envelopment analysis and university selection. J Operational Res Soc 48, 1163–1177.

Salinas-Jiménez, J., & Smith, P. (1996). Data envelopment analysis applied to quality in primary health care. Ann Operations Res 67, 141–161.

Schmidt, P., & Witte, A. (1984). An Economic Analysis of Crime and Justice: Theory, Methods, and Applications. Orlando, FL: Academic Press.

Sharma, S. (1996). Applied Multivariate Techniques. New York: John Wiley and Sons.

Shaw, M., & Williamson, W. (1972). Public attitudes to the police. The Criminologist 7, 18–33.

Sheehy Report. (1993). Police Responsibilities and Rewards, Cm. 2280, London, HMSO.

Skogan, W. G. (1996). The police and public opinion in Britain. Am Behav Scientist 39, 421–432.

Stephens, M. (1994). Care and control: the future of British policing. Policing Soc 4, 237–251.

Sullivan, R. R. (1998). The politics of British policing in the Thatcher/Major state. The Howard J 37, 306–318.

Thanassoulis, E., Boussofiane, A., & Dyson, R. G. (1996). A comparison of data envelopment analysis and ratio analysis as tools for performance assessment. Omega Int J Management Sci 24, 229–244.

Todd, R., & Ramanathan, K. V. (1994). Perceived social needs, outcomes, measurements, and budgetary responsiveness in a not-for-profit setting; some empirical evidence. The Accounting Rev 69, 122–137.

Waddington, P. A. J., & Braddock, Q. (1991). ‘Guardians’ or ‘bullies’? Perceptions of the police amongst adolescent black, white and Asian boys. Policing Soc 2, 31–45.


(1)

(and, necessarily, OE). Hence, these forces will tend to form part of the efficient reference set of police forces for a large number of inefficient forces, and a detailed comparison of these “best-practice” forces with the less efficient units could provide very useful information in any reorganization/restructuring process.

So far we have not addressed the issue of the statistical significance of the differences in our efficiency scores across staff size groups. We rectify this in the next section using analysis of variance (ANOVA) and discriminant analysis techniques.

4. MDA results

To assess the DEA results further, we adopt a dual post-hypothesis testing strategy that utilizes ANOVA and MDA. Both of these statistical techniques allow us to determine whether there are any significant differences between grouped police forces (see Hair et al., 1995, for an introduction). In this analysis, the categorical variable partitions the police forces into four groups that are determined by the number of total police and civilian staff operating in each force. This allows us to determine, for example, if large police forces (by total staff employed) display SE, PTE, or OE scores that are significantly better (or worse) than their smaller counterparts. If a police force has 0 to 1,500 total staff, it is a member of staff group 1; between 1,501 to 3,000, they are a member of staff group 2; between 3,001 to 4,500, they are a member of staff group 3; and above 4,501, they are a member of staff group 4. To ensure that we follow at least the minimum requirements necessary for MDA, we have stacked the 5 years before estimation. Table 7 gives the total 1992–1993 to 1996 –1997 stacked grouped summary statistics for the three independent variables, OE, PTE, and SE. As outlined previously, in terms of SE staff group 3 has the highest mean value, and the lowest standard deviation, while for PTE staff group 1 has the highest mean value with the smaller standard deviation. The results for OE reveal that staff groups 1, 2, and 3 are very close in terms of the overall mean rankings and their deviation. However, staff group 4 has the lowest overall mean value with the largest standard deviations. This can be attributed to the wide variations in OE for the West Midlands and the Metropolitan police force that are evident in Table 4.

The estimation analysis that is followed in this study involves further testing procedures after the DEA estimation. The first stage is the estimation of an ANOVA, a univariate test, where the dependent variables are PTE and SE, and the independent variable is the cate-gorical staff group.4The null hypothesis under interest is that each mean associated with the staff group is equal. As can be seen from Table 8, the F-statistic is greater than the critical value, and we can conclude that there is a statistically significant size difference associated with our two measures of efficiency. However, we do not know whether the differences are between staff groups 1 and 2, 1 and 3, 1 and 4, 2 and 3, 2 and 4, or, finally, 3 and 4.

MDA is much like ANOVA, but in this case the dependent variable is the categorical staff

4As OE is a product of the multiplication of PTE and SE, it is excluded in the second-stage estimation due to


(2)

group, and the independent variables are PTE and SE. The reason for estimating the MDA is that it offers an alternative insight into the ANOVA results found in Table 8. For example, if we are trying to predict to which staff group a police force should belong, given a value for PTE or SE, then MDA will derive the linear combination of the two independent variables that would discriminate best between the staff groups. MDA distinguishes between the groups by multiplying PTE and SE by their corresponding weights and then adds these products together giving a single discriminant score for each police force. After averaging each discriminant score in each staff group, we obtain the centroid, which we can use to compare how “far apart” the staff groups are. In this case, our hypothesis of equal means for the staff groups are based on comparing the distribution of the discriminant scores. The test analysis is such that “if the overlap in the distribution is small, the discriminant function separates the groups well. If the overlap is large, the function is a poor discriminator between the groups” (Hair et al., 1992).5

Before discussing the MDA results, we need to compare the hit ratio with the maximum-chance and proportional-chance criteria to assess the predictive accuracy of the function. The maximum-chance criterion is the proportion of forces that would be correctly classified by placing them all in the staff group with the greatest probability of occurrence, which in this model is 48%. However, with unequal groups, we can also calculate a proportional-chance criterion, which in this model equals 33.34%. Hence, as our hit ratio (49.1%) exceeds both the maximum-chance and proportional-chance criteria, we can conclude that the MDA model is valid based on these measures. We also checked our model using Press’s Q-statistic, which tests whether the staff group classification by MDA is better than the classification that would be obtained by chance. With a total of 104 group memberships correctly classified, the estimate of Press’s Q equals 62.89, which is significant at the 5% level. Therefore, utilizing the results obtained from the maximum-chance criterion, the proportional-chance criterion, and the hit ratio, we can conclude that the MDA model is better at predicting staff group membership than if the prediction were carried out by chance.
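The benchmarks above can be computed with a few lines of code. The sketch below uses the textbook definitions (maximum chance as the largest group share, proportional chance as the sum of squared group shares, and Press's Q as [N − nK]² / [N(K − 1)], where N is the sample size, n the number correctly classified, and K the number of groups). The group sizes and the number of correct classifications are hypothetical and do not reproduce the figures reported in this study.

```python
def classification_benchmarks(group_sizes, n_correct):
    """Hit ratio, chance criteria and Press's Q for a classification result."""
    n_total = sum(group_sizes)
    k = len(group_sizes)
    shares = [n / n_total for n in group_sizes]

    hit_ratio = n_correct / n_total
    max_chance = max(shares)                       # assign all cases to the largest group
    proportional_chance = sum(p * p for p in shares)
    press_q = (n_total - n_correct * k) ** 2 / (n_total * (k - 1))

    return {
        "hit_ratio": hit_ratio,
        "max_chance": max_chance,
        "proportional_chance": proportional_chance,
        "press_q": press_q,
    }


# Hypothetical example: four groups of unequal size, 55 of 100 cases correct.
result = classification_benchmarks([40, 30, 20, 10], n_correct=55)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Press's Q is compared with the chi-square critical value with one degree of freedom (3.84 at the 5% level); a value above it indicates classification better than chance.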

5 MDA was estimated using SPSS version 8 (SPSS Inc., Chicago, IL), with the stepwise Mahalanobis distance, Fisher’s function coefficients method, a decision rule of F being between 0.05 and 0.15, prior probabilities computed from group size, and the use of the within-groups covariance matrix.

Table 8
Test for equality of staff group means

Independent variables    Wilks' λ^a    Univariate F ratio^b    Significance
OE                       0.905         7.246                   0.000
PTE                      0.936         4.708                   0.003
SE                       0.843         12.880                  0.000

a Degrees of freedom, 3.
b

Table 9 provides the overall MDA results and indicates that the discriminant functions are highly significant, as measured by the Wilks' λ and χ² statistics. Overall, the first function accounts for 74.4% of the variance, and the second function accounts for 25.6%. However, the functions display low canonical correlations of 0.39 and 0.25, respectively; that is, squaring these values, 15.21% and 6.25% of the variance in the dependent variable can be explained by this model (in regression models this is the R² statistic). In this context, although the latter figures are relatively low, we must remember that MDA involves determining in which staff group a police force should be included. It cannot take into account sociological and political factors that cannot be included in the calculation but that will affect how large a police force will be and, therefore, its staff group classification.
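The quantities reported for each discriminant function are linked through the eigenvalues. The sketch below applies the standard identities for canonical discriminant analysis (they are not formulas stated in this article) to the two eigenvalues listed in Table 9: the share of variance is each eigenvalue divided by their sum, the canonical correlation is the square root of e/(1 + e), and Wilks' λ for functions i onward is the product of 1/(1 + e) over those functions. Because the eigenvalues are rounded to three decimals, the results should agree with Table 9 only up to rounding.

```python
import math

# Eigenvalues of the two discriminant functions, as reported in Table 9.
eigenvalues = [0.188, 0.065]


def canonical_summary(eigs):
    """Derive variance share, canonical correlation and Wilks' lambda per function."""
    total = sum(eigs)
    rows = []
    for i, e in enumerate(eigs):
        share = e / total
        canon_r = math.sqrt(e / (1.0 + e))
        # Wilks' lambda tests functions i, i+1, ... jointly.
        wilks = 1.0
        for later in eigs[i:]:
            wilks /= 1.0 + later
        rows.append({"share": share, "canonical_r": canon_r, "wilks": wilks})
    return rows


summary = canonical_summary(eigenvalues)
for number, row in enumerate(summary, start=1):
    print(
        f"function {number}: share = {row['share']:.3f}, "
        f"canonical r = {row['canonical_r']:.3f}, "
        f"R^2 = {row['canonical_r'] ** 2:.3f}, Wilks = {row['wilks']:.3f}"
    )
```

The squared canonical correlation plays the role of R² for each function, which is why low canonical correlations imply that only a modest share of the variation in group membership is explained.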

We have found above that the discriminant functions are able to distinguish between the staff groups and that there are unequal staff group mean values. A first check of whether there are indeed staff group mean differences is shown in Table 10, where we can see that none of the group centroids are equal in value. That is, it appears that the first and second discriminant functions discriminate between all groups. To determine which pairs of group means are significantly different, we estimate a second-stage ANOVA that allows us to calculate post hoc pairwise tests: Tukey HSD, Scheffe, and least significant difference (LSD) (see Sharma, 1996, for an introduction to this stage of analysis).
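A minimal sketch of a Scheffe pairwise comparison is given below. Two groups are declared significantly different when the squared difference in their means, divided by the within-group mean square times (1/n_i + 1/n_j), exceeds (k − 1) times the critical F value. The discriminant scores in `scores` are hypothetical, and the critical value `f_crit` is supplied by the user as an assumption (3.49 is the approximate 5% critical value of F with 3 and 12 degrees of freedom, matching the toy sample size used here).

```python
from itertools import combinations


def scheffe_pairs(groups, f_crit):
    """Return the pairs of groups whose means differ under the Scheffe criterion."""
    k = len(groups)
    n_total = sum(len(v) for v in groups.values())
    means = {g: sum(v) / len(v) for g, v in groups.items()}

    # Pooled within-group mean square from the one-way ANOVA.
    ss_within = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    ms_within = ss_within / (n_total - k)

    significant = []
    for a, b in combinations(sorted(groups), 2):
        diff_sq = (means[a] - means[b]) ** 2
        scale = ms_within * (1.0 / len(groups[a]) + 1.0 / len(groups[b]))
        if diff_sq / scale > (k - 1) * f_crit:
            significant.append((a, b))
    return significant


# Hypothetical discriminant scores for four staff groups (illustration only).
scores = {
    1: [-1.0, -0.8, -0.9, -0.7],
    2: [0.2, 0.3, 0.1, 0.2],
    3: [0.3, 0.4, 0.3, 0.4],
    4: [0.9, 1.1, 1.0, 1.2],
}
pairs = scheffe_pairs(scores, f_crit=3.49)
print("significantly different pairs:", pairs)
```

The Scheffe criterion is the most conservative of the three tests named above, so it tends to flag fewer pairs than Tukey HSD or LSD on the same data.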

Table 9
Multivariate results for four staff group discriminant analysis

Function   Eigenvalue   % of Variance            Canonical     Wilks' λ   χ²       df   Significance
                        Function    Cumulative   correlation
1          0.188^a      74.4        74.4         0.397         0.791      48.792   6    0.000
2          0.065^a      25.6        100.0        0.246         0.939      13.038   2    0.001

Independent   Standardised Canonical Discriminant    Structure Matrix^b
variables     Function Coefficients
              Function 1    Function 2               Function 1    Function 2
PTE           −0.121        0.994                    0.993^c       0.121
SE            0.988         0.162                    −0.162        0.987^c

Classification function coefficients: Fisher’s Linear Discriminant Functions

Independent   Police Staff Groups
variables     1          2          3          4
PTE           0.867      0.805      0.795      0.773
SE            1.022      1.125      1.137      1.030
Constant      −89.466    −91.137    −92.112    −80.666

a First two canonical discriminant functions were used in the analysis.
b Pooled within-groups correlations between discriminating variables and standard canonical discriminant functions. Variables ordered by absolute size of correlation within function.
c

Table 11 gives the ANOVA results for the discriminant scores and the post hoc pairwise tests. The initial analysis shows that the discriminant scores are significantly different across the staff groups. The Scheffe test (our preferred post hoc pairwise test) shows that function Z1 (which corresponds to PTE) is significantly different across staff groups, with the exception of 1 and 4 and 2 and 3, and that for function Z2 (which corresponds to SE) only staff groups 1 and 4 are significantly different. This latter result corresponds well with our earlier analysis in the sense that staff groups 2 and 3 appear to be operating closer to the constant-returns region of production and, hence, would not be expected to exhibit significant differences in SE scores. Staff groups 1 and 4, however, would be expected to exhibit increasing and decreasing returns to scale, respectively, and hence might be expected to produce significant differences in SE scores.

The results for PTE, however, suggest a more complicated story than is apparent in Table 7 (which suggested that PTE scores declined with police size). This further statistical analysis suggests that staff groups 1 and 4 tend to be more technically efficient, with staff groups 2 and 3 being less efficient but not significantly different in PTE terms. This interpretation is supported by the very high mean PTE score for police staff group 1 (99.64) and by the very high mean PTE scores of some large police forces such as the Metropolitan and Greater Manchester forces. The relatively low mean PTE score for staff group 4 (88.36) is probably explained by the presence of outliers. This is supported by the relatively high standard deviation of 15.49 evident in Table 7.

Table 11
ANOVA and post hoc multiple comparison tests

Dependent variable   ANOVA F-statistic   Significance
Z1                   13.004              0.000
Z2                   4.485               0.004

Test criteria
      Tukey HSD             Scheffe               LSD
Z1    1–2, 1–3, 2–4, 3–4    1–2, 1–3, 2–4, 3–4    1–2, 1–3, 2–4, 3–4
Z2    1–4                   1–4                   1–2, 1–3, 1–4, 2–4

Table 10
Functions at group centroids

Police staff group   Function 1   Function 2
Staff group 1        −0.851       0.611
Staff group 2        0.214        0.005
Staff group 3        0.344        −0.003

5. Conclusions

This article is the first to use DEA to examine the relative efficiency of the English and Welsh police forces. Relatively few forces were identified as being consistently efficient throughout the sample period. The “best-practice” police forces could be used as valuable comparators in any attempt to restructure police forces to improve productivity and efficiency. The study revealed important information concerning the size-efficiency relationship in English and Welsh policing. Specifically, evidence of significant increasing and decreasing returns was found at the extremes of the size spectrum, and the SE results were supportive of a “saucer-shaped” average cost curve in policing. This was confirmed by the subsequent ANOVA and MDA. Furthermore, the SE results suggest that the optimal size of police forces in England and Wales is at staff group 2 or 3.

Interestingly, however, the PTE scores suggested a very different size-efficiency relationship for X-efficiency. Specifically, the smallest and largest forces tend to produce relatively higher PTE scores than the intermediate-size forces, although staff group 1 showed the greatest consistency with respect to high levels of X-efficiency. The clear differences between the SE and PTE results suggest that the process of enhancing overall police force efficiencies in England and Wales will necessarily be difficult and complex. Nevertheless, this article demonstrates that DEA can produce highly informative results that could, in conjunction with other types of analysis, potentially influence the design of efficiency-enhancing reforms in English and Welsh policing.

References

Audit Commission. (1990). Effective Policing – Performance Review in Police Forces. Police Paper No. 8, London.

Banker, R. D., Charnes, A., & Cooper, W. W. (1984). Some models for estimating technical and scale efficiencies in data envelopment analysis. Management Sci 30, 1078–1092.

Byrne, D., Dezhbakhsh, H., & King, R. (1996). Unions and police productivity: an econometric investigation. Industrial Relations 35, 566–584.

Chalos, P., & Cherian, J. (1995). An application of data envelopment analysis to public-sector performance-measurement and accountability. J Accounting Public Policy 14, 143–160.

Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring the efficiency of decision making units. Eur J Operational Res 2, 429–444.

Drake, L. (1997). Measuring efficiency in UK banking. Economic Research Paper No. 97/18, Department of Economics, Loughborough University.

Drake, L., & Howcroft, B. (1994). Relative efficiency in the branch network of a UK bank: an empirical study. Omega Int J Management Sci 22, 83–90.

Drake, L., & Weyman-Jones, T. G. (1992). Technical and scale efficiency in UK building societies. Appl Financial Econ 2, 1–9.

Drake, L., & Weyman-Jones, T. G. (1996). Productive and allocative inefficiencies in U.K. building societies: a comparison of non-parametric and stochastic frontier techniques. Manchester Sch of Econ and Soc Studies 114, 22–37.

Färe, R., Grosskopf, S., & Lovell, C. A. K. (1985). The Measurement of Efficiency of Production. Boston, MA: Kluwer-Nijhoff.

Ferrier, G. D., & Lovell, C. A. K. (1990). Measuring cost efficiency in banking: econometric and linear programming evidence. J Econometrics 46, 229–245.

Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1992). Multivariate Data Analysis With Readings. New York: Macmillan Publishing Company.

Home Office. (1993). Police Reform. The Government’s Proposals for the Police Service in England and Wales. London, UK: Her Majesty’s Stationery Office, CM 2281.

Her Majesty’s Inspector of Constabulary. (1995). Obtaining Value for Money in the Police Service. London, UK: HMIC.

Her Majesty’s Inspector of Constabulary. (1998). What Price Policing. London, UK: HMIC.

Jackson, P. I. (1992). The police and social threat: urban transition, youth gangs, and social control. Policing and Society 2, 193–204.

Johns, N., Howcroft, B., & Drake, L. (1997). The use of data envelopment analysis to monitor hotel productivity. Prog Tourism Hospitality Res 3, 119–127.

Leibenstein, H. (1966). Allocative efficiency vs. x-efficiency. Am Econ Rev 56, 392–415.

O’Brien, R. M. (1996). Police productivity and crime rates: 1973–1992. Criminology 34, 183–207.

Pedraja-Chaparro, F., & Salinas-Jiménez, J. (1996). An assessment of the efficiency of Spanish courts using DEA. Appl Econ 28, 1391–1403.

Police Research Group. (1993). Opportunities for reducing the administrative burdens on the police. Police Research Series, Paper No. 3, Home Office Police Department, London.

Redshaw, J., Mawby, R. I., & Bunt, P. (1997). Evaluating core policing in Britain: the views of police and consumers. Int J Sociology Law 25, 283–301.

Sarrico, C. S., Hogan, S. M., Dyson, R. G., & Athanassopoulos, A. D. (1997). Data envelopment analysis and university selection. J Operational Res Soc 48, 1163–1177.

Salinas-Jiménez, J., & Smith, P. (1996). Data envelopment analysis applied to quality in primary health care. Ann Operations Res 67, 141–161.

Schmidt, P., & Witte, A. (1984). An Economic Analysis of Crime and Justice: Theory, Methods, and Applications. Orlando, FL: Academic Press.

Sharma, S. (1996). Applied Multivariate Techniques. New York: John Wiley and Sons.

Shaw, M., & Williamson, W. (1972). Public attitudes to the police. The Criminologist 7, 18–33.

Sheehy Report. (1993). Police Responsibilities and Rewards, Cm. 2280, London, HMSO.

Skogan, W. G. (1996). The police and public opinion in Britain. Am Behav Scientist 39, 421–432.

Stephens, M. (1994). Care and control: the future of British policing. Policing Soc 4, 237–251.

Sullivan, R. R. (1998). The politics of British policing in the Thatcher/Major state. The Howard J 37, 306–318.

Thanassoulis, E., Boussofiane, A., & Dyson, R. G. (1996). A comparison of data envelopment analysis and ratio analysis as tools for performance assessment. Omega Int J Management Sci 24, 229–244.

Todd, R., & Ramanathan, K. V. (1994). Perceived social needs, outcomes, measurements, and budgetary responsiveness in a not-for-profit setting: some empirical evidence. The Accounting Rev 69, 122–137.

Waddington, P. A. J., & Braddock, Q. (1991). ‘Guardians’ or ‘bullies’? Perceptions of the police amongst adolescent black, white and Asian boys. Policing Soc 2, 31–45.

