Directory UMM :Data Elmu:jurnal:O:Operations Research Letters:Vol28.Issue2.2001:

Operations Research Letters 28 (2001) 93–98
www.elsevier.com/locate/dsw

A comment on multi-stage DEA methodology
Laurens Cherchye ∗ , Tom Van Puyenbroeck
Center for Economic Studies, Catholic University of Leuven, Naamsestraat 69, B-3000 Leuven, Belgium
Received 24 May 1999; accepted 20 June 2000

Abstract
Coelli (Oper. Res. Lett. 23 (1998) 143) introduced a multi-stage (MS-) DEA methodology that represents an interesting
alternative for the treatment of slacks remaining after proportional correction in inputs or outputs. By solving a sequence of
radial models, one thus arrives at the identi cation of “more representative ecient points” (in terms of similar input=output
mixes). We show that “most representative ecient points” can be found using a direct approach and may di er from those
obtained by MS-DEA. Nevertheless, MS-DEA remains very attractive in view of its economic (intrinsic price) legitimation.
c 2001 Elsevier Science B.V. All rights reserved.

Keywords: Multi-stage DEA; Mix deviation; Cost interpretation

1. Introduction
As a solution to the slack problem of radial DEA
measures, Coelli [6] proposed a multi-stage (MS-)

DEA methodology that helps to identify “more representative ecient points” for the orientated DEA models relative to those obtained by the original two-stage
linear programming process presented in [1]. Specifically, he proposes to proceed in each further step
by radially scaling down that (input=output) subvector in which the evaluated observation is dominated in
the Koopmans [10] sense. His conclusion is that “this
approach seeks to identify ecient projected points
which have input and output mixes which are as sim∗ Corresponding author. Tel.: +32-16-326854; fax: +32-16326796.
E-mail address: laurens.cherchye@econ.kuleuven.ac.be
(L. Cherchye).

ilar as possible to those of the inecient points” (p.
148). However, he gives no precise speci cation of
how mix deviation should be measured. In this paper
we show that, for an intuitive de nition of the concept,
the above assertion is not strictly correct. Nevertheless,
because of its computational convenience, MS-DEA
remains recommendable. Moreover, it can be given an
attractive economic legitimation as we subsequently
demonstrate. Some remarks are contained in the nal
section.
We use 5 (=) to indicate “less (greater) than or

equal to”, while 6 (¿) means 5 (=) and =. Further, cos(u; z) represents cos  with  being the angle between vectors u and z. For ease of exposition,
we focus solely on input eciency in the following
and discuss mix deviation in terms of input combinations. Extensions to output projections are straightforward, however. We further restrict attention to the
constant returns to scale (CRS) DEA model presented

c 2001 Elsevier Science B.V. All rights reserved.
0167-6377/01/$ - see front matter 
PII: S 0 1 6 7 - 6 3 7 7 ( 0 0 ) 0 0 0 6 8 - 7

94

L. Cherchye, T.Van Puyenbroeck / Operations Research Letters 28 (2001) 93–98

in [3], but a similar discussion applies to other DEA
models.

2. Minimizing mix deviation
An appropriate indicator for the deviation between
reference and evaluated input–output mixes appears
to be the angle between the two vectors (see also [4]).

In this respect, maximizing the cosine of this angle
implies minimization of dispersion between the mixes.
Obviously, standard DEA (radial) measures imply a
cosine of value 1 as mixes of evaluated and reference
combinations coincide. The search for a Koopmans
ecient reference as similar as possible in terms of
input and output mixes, then implies maximizing the
cosine over the ecient subset.
Consider a setting with N rms. Each ith rm uses a
semi-positive input vector xi = (xi1 ; : : : ; xiK ) to produce
a semi-positive output vector yi = (yi1 ; : : : ; yiM ). Let
the input correspondence associated with yi be represented by L(yi ). For DEA-CRS the latter is de ned as
(for j = 1; : : : ; N )
LCRS (yi ) ≡

 
N
N

 


j yj = yi ;
j xj 5 xi ; j = 0 :
x 

 
j=1
j=1

Further, let E LCRS (yi ) represent the Koopmans
ecient subset of LCRS (yi ):
E LCRS (yi )
≡ {x | x ∈ LCRS (yi ); x′ 6x ⇒ x′ ∈ LCRS (yi )}:

(2.1)

The following conditions should then be met by a
reference vector xR for xi :
(C.1) xR ∈ E LCRS (yi ),
(C.2) cos(xR ; xi ) = cos(xR′ ; xi ) ∀xR′ ∈ E LCRS (yi ).

xR can be identi ed in two steps: rst locate
E LCRS (yi ) and look for its cosine maximizing element in a second step. In a DEA setting ecient
subsets are constituted by ecient facets of convex
polyhedrons. Algorithms and software codes to identify these facets have been developed in the multiple
objective linear programming eld (see [14] for an
overview). An ecient facet F can be expressed as

the set of all convex combinations of P ecient extreme points {xE1 ; : : : ; xEP }: So, identifying for each F
the element which maximizes the cosine with xi requires solving the following non-linear programming
problem:
K l P
l
1
l=1 xi (
j=1 j xEj )


max
(2.2)
P

||xi ||
K
l 2
l=1 (
j=1 j xEj )
subject to
P


j = 1;

j = 0:

j=1

This problem can be solved by means of simple
Lagrangian techniques. The same procedure should be
followed for each rm in the sample (i.e. N times).
The maximum cosine is then associated with the most
“resembling” reference in terms of closest input mix.

Note that it may sometimes be computationally more
ecient to proceed by identifying all ecient facets
of the whole production possibility set (and not of
each separate LCRS (yi ) individually) following, e.g.
the algorithm of Yu et al. [15]. A straightforwardly
analogous procedure then applies.
The “mix dispersion minimizing” reference will
not necessarily coincide with that obtained from
the MS-DEA algorithm as can be illustrated
by means of a simple counterexample. Suppose
we have three observations all of which produce
the same amount of output. The input vectors are
x1 = (10; 9; 15); x2 = (10; 7=2; 2) and x3 = (10; 5=2; 8).
The one-step radial procedure labels all three observations ecient. However, it is clear that only x2 and
x3 are Koopmans ecient as x1 has a slack in both
the second and third inputs. Convex combinations of
x2 and x3 constitute the ecient subset. The MS-DEA
method will proceed in the second step by comparing
collinearly the subvector (9; 15) of x1 to convex combinations of (7=2; 2) and (5=2; 8), which is illustrated
in Fig. 1. The eventual MS-DEA projection for x1 is

(10; 3; 5). The cosine of the angle between this reference and x1 is 0.866. We can now apply model (2.2).
The cosine values are depicted in Fig. 2 as a function
of the intensity parameter , which gives the weight
of x3 in the reference vector. In Fig. 2 we also indicate
the point = 0:5, i.e. the value associated with the
MS-DEA projection. This gure reveals that there are
other combinations of x2 and x3 with a higher cosine

L. Cherchye, T.Van Puyenbroeck / Operations Research Letters 28 (2001) 93–98

Fig. 1. Second-step projection.

Fig. 2. Evolution of cosine value.

value. It appears that mix deviation is minimized for
= 1, or: x3 is the most representative reference. The
associated cosine value between x1 = (10; 9; 15) and
its mix dispersion minimizing reference (10; 5=2; 8)
is 0.922, which is clearly above 0.866.
A point that deserves special attention is the fact

that the cosine measure (2.2) is not invariant to the
units in which the di erent (input) quantities are measured. The units invariance property has been especially advocated by Russell [13]. To establish units invariance of the cosine estimate prescaled data xˆjl are to
be used instead of the original data xjl (l=1; : : : ; K; j=
1; : : : ; N ).
We will brie
y discuss two possible prescaling procedures. A procedure that is frequently employed uses
xˆjl = xjl =l , where l represents the standard deviation
of each input variable l (see [11] where it was rst
proposed in DEA). Strictly speaking, this procedure
cannot be applied to our speci c example as 1 =0. Of
course, this is a somewhat pathological case. Thus, as-

95

suming there is at least one other observation x4 with
x41 = xi1 (i = 1; 2; 3), such a normalization procedure
could have generated the data of our example, and the
case against the MS-DEA procedure still holds even
when a units invariant procedure for measuring mix
deviation (in cosine terms) is employed.

A particularly interesting alternative prescaling procedure uses xˆjl =xjl =xil , where i stands for the evaluated
rm (see e.g. [2]). Incorporating this in (2.2) yields
the following objective:
K
l ( )
1
max √
l=1
;
(2.3)

K
K
2
l=1 (l ( ))
P
l
where l ( ) = [( j=1 j xEj
)=xil ] denotes the proportional shrinkage in each input l to get from the evaluated
P input vector (xi ) to the reference input vector

( j=1 j xEj ). In fact, the measure in (2.3) captures
the cosine of the angle between the shrinkage vector ( ) = (1 ( ); : : : ; K ( )) and the unit vector e =
(1; : : : ; 1). When l ( )=l′ ( ) for all l; l′ ∈ {1; : : : ; K}
(i.e. the cosine maximizing projection equals the radial projection) then the cosine will equal unity as the
shrinkage vector is collinear to the unit vector. On the
other hand, the more ( ) deviates from the unit vector e in “mix” terms, the lower is the cosine value.
This is intuitively appealing as in that case the proportional shrinkage factors (i.e. the l ( )) are increasingly unequal. It is easy to check that applying this
prescaling procedure to our example leads to exactly
the same qualitative results, i.e. x3 would be obtained
as the “most representative reference” for x1 .
An interesting by-product of this last procedure is
that it allows for immediately constructing an eciency measure as a useful complement to the projection procedure, viz., by aggregating the obtained (cosine maximizing) references. For example,
K one could
simply take the arithmetic mean (i.e. l=1 l ( )=K)
K
or the harmonic mean (i.e. K= l=1 (l ( ))−1 ) as an
eciency gauge (compare with [8,5], respectively, although we emphasize that the obtained eciency measure need not be associated with the same projection
point as identi ed in [8, 5]).
We have thus established that MS-DEA does not
necessarily yield most representative Koopmans ecient references in terms of mix properties. One could
claim that this nding depends on how mix deviation
is measured, viz., by means of the angle between the

96

L. Cherchye, T.Van Puyenbroeck / Operations Research Letters 28 (2001) 93–98

two vectors. However, the latter appears to be a natural choice and it is unlikely that our result would be
contradicted for any other indicator.
While the above suggests that Coelli’s original motivation for MS-DEA is debatable, the projection procedure clearly has very attractive features. As pointed
out by Coelli [6], it is units invariant (no prescaling
is hence required) and computationally not very demanding. Last but not least, there exists an economic
justi cation for MS-DEA as we will show in the next
section.

3. An economic legitimation for MS-DEA
Let us denote the radial (input) eciency estimate
associated with an input–output vector (xi ; yi ) by i .
Debreu [7] proved that this measure can also be expressed as follows:
i ≡ max
xR

(pR )T xR
;
(pR )T xi

with xR ∈ Isoq LCRS (yi );
(3.1)

where pR = 0 is a shadow (or “implicit”) price vector
associated with xR (i.e. (pR )T xR 5 (pR )T xl ∀xl ∈
LCRS (y)) and Isoq LCRS (yi ) the isoquant of LCRS (yi ):
Isoq LCRS (yi ) ≡
{x | x ∈ LCRS (yi ); x ∈ LCRS (yi ) for  ∈ [0; 1)}:

(3.2)

Clearly, from comparison of (2.1) with (3.2),
E LCRS (yi ) ⊆ Isoq LCRS (yi ): radially ecient input
combinations are not necessarily Koopmans ecient.
So, (3.1) reveals that radial projection results in a reference belonging to the isoquant that maximizes the
ratio of reference to actual costs (referred to as “cost
ratio” in the following) evaluated against a shadow
price vector associated with the latter (this shadow
price vector need not necessarily be unique and is determined up to a positive multiplicative scalar). This
gives a nice economic legitimation for (one-step) radially projecting on the production frontier. That is,
the radial measure has an evocative characterization
as an upper bound to economic eciency using the
shadow prices implicit in the production technology
(see also [12]).

The slack problem of radial DEA measures boils
down to the possibility of zero elements in pR . In the
example of the previous section a zero implicit price
is associated with the second and third inputs of x1 .
On the other hand, Koopmans [10] showed that in
a linear activity setting (like DEA) each element of
the ecient subset corresponds to at least one strictly
positive shadow price vector. This Koopmans characterization of ecient production is widely accepted
because of its intimate link with the rst two fundamental theorems of welfare economics, which say that
(for this speci c setting) a productive organization is
(input and output) ecient if and only if one can construct a strictly positive price vector under which it
becomes pro t maximizing. Analogously, in the input
space a vector is ecient if and only if there exists
a strictly positive price combination under which it is
cost minimizing. In our example this is the case for
x2 ; x3 and all their convex combinations.
The MS-DEA projection obviously meets the Koopmans de nition. Moreover, it can be shown to maximize the cost ratio, but now precisely evaluated against
a strictly positive shadow price vector. To see this,
suppose that the one-step radial procedure projects xi
on xR ∈ Isoq LCRS (yi ) \ E LCRS (yi ) (xR = i xi ). Denote the set of all xE ∈ E LCRS (yi ) with the same
amount of inputs in which xR does not exhibit slack by
XE ≡ {x | x ∈ E LCRS (yi ) and x6xR }. Clearly, all elements of XE have the same cost ratio as xR (with zero
prices accorded to those dimensions in which there
is slack). E.g., in the above example this applies for
convex combinations of x2 and x3 when x1 is evaluated. For all other xE′ ∈ E L(yi ) \ XE the cost ratio is
strictly exceeded by that of xR . Each element of XE is
by construction also associated with a strictly positive
shadow price vector. Thus, by choosing the shadow
prices corresponding to those dimensions in which xR
has a slack small enough compared to those of the
non-slack dimensions, we still have that the resulting
cost ratio related to each xE ∈ XE will strictly be above
the cost ratios of all xE′ ∈ E L(yi ) \ XE : That is, we
are led to choose among the xE ∈ XE when searching for (in terms of cost ratios) a suitable Koopmans
ecient reference.
Split up xR1 in a slack subvector xRS and a non-slack
subvector xRN , and similarly decompose the shadow
price vector pR associated with xR in pRS (=0) and
pRN (¿ 0). In the second step only xRS is further

L. Cherchye, T.Van Puyenbroeck / Operations Research Letters 28 (2001) 93–98

reduced. Also decompose xi and xE (∈ XE ) in xiS
and xiN (so that xRl = i xil (l = N; S)) and xES and
xEN (=i xiN ), respectively. One thus has to nd xR∗
(decomposable in xR∗ S 6xRS with associated implicit
price vector pR∗ S ¿0 and xR∗ N = i xiN with, without
loss of generality, pR∗ N = pRN ) such that:
i (pRN )T xiN + (pR∗ S )T xR∗ S
(pRN )T xiN + (pR∗ S )T xiS
=

i (pRN )T xiN + (pES )T xES
;
(pRN )T xiN + (pES )T xiS

(3.3)

97

of R∗ S are equal to zero). In this case one can formulate a straightforwardly analogous argument to justify
a radial procedure in the third and (if necessary) each
further step.
Summarizing, the MS-DEA algorithm has a sound
economic basis which is very similar to the one-step
procedure, but strengthens the latter by making
shadow prices in the cost ratio strictly positive. There
always exists a pair of strictly positive implicit price
vectors under which the MS-DEA cost ratio exceeds
that associated with any other element of the ecient
subset.

where xES corresponds to the xE ∈ XE with a shadow
price vector pES ¿0. (3.3) can be formulated as
i + (R∗ S )T xR∗ S
i + (ES )T xES
=
;
1 + (R∗ S )T xiS
1 + (ES )T xiS

4. Conclusion
(3.4)

where (R∗ S )T xR∗ S = (ES )T xES ≡ c, with c being a positive constant. To see this, note that (3.3)
should hold for all rescalings of the vectors pRN and

pkS (k = R ∗ ; E). Next, we use that xRS ; xRS
and xE
T
are radially ecient, i.e. (pRN ) (i xiN ) 5 (pRN )T xlN
and (pRN )T (i xiN ) + (pkS )T xkS 5 (pRN )T xlN +
(pkS )T xlS (k = R ∗ ; E) for all xl = (xlN ; xlS ) ∈
LCRS (y). Convex combinations of these conditions
(with k ∈ [0; 1]; k = R ∗ ; E) yield (pRN )T (i xiN ) +
k (pkS )T xkS 5 (pRN )T xlN + k (pkS )T xlS . That is,
for xR∗ and xE it is possible to rescale shadow prices
associated with slack inputs independent of those
corresponding to non-slack inputs while maintaining
cost minimization. Then, xing R∗ and E appropriately gives (pRN )T (i xiN ) + R∗ (pR∗ S )T xR∗ S =
(pRN )T (i xiN ) + E (pES )T xES ; from which we obtain (R∗ S )T xR∗ S = (ES )T xES ≡ c for (kS )T xkS =
[(k (pkS )T xkS )=((pRN )T (i xiN ))] (k = R ∗ ; E): Condition (3.4) becomes
(i + c)((ES )T xiS − (R∗ S )T xiS ) = 0:

(3.5)

Eq. (3.5) is ful lled when xR∗ S is obtained from
equiproportionate reduction of xRS , as from (3.1):
(R∗ S )T xR∗ S
(ES )T xES
=
T
i (R∗ S ) xiS
i (ES )T xiS
or (ES )T xiS − (R∗ S )T xiS = 0:
This establishes an economic legitimation for applying radial (subvector) projection in the second step.
Of course, it could be that xR∗ ∈ XE (some elements

When gauging the relative performance of a productive unit it often seems defendable to use benchmarks that are its peers in the most genuine sense.
This is after all one intuitive rephrasement of the
bene t-of-the-doubt perspective underlying a great
deal of the (in)eciency measurement literature. On
the other hand, sticking to Farrell’s [9] “most conservative assumption” regarding mixes can yield peers
that are themselves relatively inecient. Identifying a mix-minimizing ecient reference in the way
described above safeguards the conservative assumption to the furthest extent possible. In particular, it
will weakly dominate MS-DEA in identifying the
‘best’ ecient peers. Evidently, this yields a “most
representative ecient reference” only to the extent
that one indeed interprets the general notion of representativeness in terms of product mixes. Or stated
otherwise, the produced peers can still be criticized
in view of the fact that an (almost) equiproportionate rescaling of input=output vectors is not always
implementable in practice.
Although the latter issue may occasionally arise in
evaluation practice, the basic idea of radial correction is still defended by many in view of its intuitive economic interpretation as a bene t-of-the-doubt
cost ratio. Adhering to this framework, the wasteful
production associated with slacks can be interpreted
as the presence of implicit zero prices in that ratio.
When slacks are removed using MS-DEA this e ectively implies that the economic interpretation can be
strengthened to a bene t-of-the-doubt cost ratio with
strictly positive implicit prices. Even if it were only on

98

L. Cherchye, T.Van Puyenbroeck / Operations Research Letters 28 (2001) 93–98

this account, MS-DEA seems a particularly appealing
method for the treatment of slacks in the programming
approach to productive eciency measurement.
Acknowledgements
We want to thank Tim Coelli, Holger Scheel and an
anonymous referee for helpful comments on previous
versions of this paper.
References
[1] A.I. Ali, L.M. Seiford, The mathematical programming
approach to eciency analysis, in: H.O. Fried, C.A.K.
Lovell, S.S. Schmidt (Eds.), The Measurement of Productive
Eciency, Oxford University Press, New York, 1993, pp.
120–159.
[2] I. Bardhan, W.F. Bowlin, W.W. Cooper, T. Sueyoshi, Models
and measures for eciency dominance in DEA: Part II: Free
Disposal Hull and Rusell measure approaches, J. Oper. Res.
Soc. Japan 39 (1996) 333–334.
[3] A. Charnes, W.W. Cooper, E. Rhodes, Measuring the
eciency of decision making units, Eur. J. Oper. Res. 2
(1978) 429–444.
[4] L. Cherchye, T. Van Puyenbroeck, Non-radial eciency
as semi-radial eciency, in: G. Westermann (Ed.), Data
Envelopment Analysis in the Service Sector, Gabler,
Wiesbaden, 1999, pp. 51–64.

[5] L. Cherchye, T. Van Puyenbroeck, Product mixes as objects
of choice in non-parametric eciency measurement, Eur. J.
Oper. Res. 132 (2001) 45–53.
[6] T.J. Coelli, A multistage methodology for the solution of
orientated DEA models, Oper. Res. Lett. 23 (1998) 143–149.
[7] G. Debreu, The coecient of resource utilization,
Econometrica 19 (3) (1951) 273–292.
[8] R. Fare, C.A.K. Lovell, Measuring the technical eciency of
production, J. Econom. Theory 19 (1) (1978) 150–162.
[9] M. Farrell, The measurement of productive eciency, J. Roy.
Statist. Soc. Ser. A 120 (1) (1957) 253–281.
[10] T. Koopmans, Analysis of production as an ecient
combination of activities, in: T. Koopmans (Ed.), Activity
Analysis of Production and Allocation: Proceedings of a
Conference, Yale University Press, New Haven, 1951, pp.
33–97.
[11] C.A.K. Lovell, J.T. Pastor, Units invariance and translation
invariance in DEA models, Oper. Res. Lett. 18 (1998) 147–
151.
[12] R.R. Russell, Measures of technical eciency, J. Econom.
Theory 35 (1) (1985) 109–126.
[13] R.R. Russell, On the axiomatic approach to the measurement
of technical eciency, in: W. Eichhorn (Ed.), Measurement
in Economics: Theory and Application of Economic Indices,
Physica-Verlag, Heidelberg, 1988, pp. 207–220.
[14] R.E. Steuer, Random problem generation and the computation
of ecient extreme points in multiple objective linear
programming, Comput. Optim. Appl. 3 (1994) 333–347.
[15] G. Yu, Q. Wei, P. Brockett, L. Zhou, Construction of all
ecient surfaces of the production possibility set under the
generalized data envelopment analysis model, Eur. J. Oper.
Res. 95 (1996) 491–510.