European Journal of Operational Research 161 (2005) 663–672

Continuous Optimization

Linear programming system identification

Marvin D. Troutt a,*, Suresh K. Tadisina b, Changsoo Sohn c, Alan A. Brandyberry d

a Department of Management and Information Systems, Kent State University, Kent, OH 44242-0001, USA
b Department of Management, Southern Illinois University, Carbondale, IL 62901-4627, USA
c Department of Business Administration, School of Business Administration, The Catholic University of Korea, San 43-1, Yokkok 2-Dong, Wonmi-Gu, Puchon City, Gyuonggi-Do 420-743, South Korea
d Department of Management and Information Systems, Kent State University, Kent, OH 44242-0001, USA

* Corresponding author. Tel.: +1-330-672-1145; fax: +1-330-672-2953. E-mail addresses: mtroutt@kent.edu (M.D. Troutt), suresht@siu.edu (S.K. Tadisina), csohn@catholic.ac.kr (C. Sohn), abrandyb@kent.edu (A.A. Brandyberry).

Received 22 February 2002; accepted 30 July 2003
Available online 13 December 2003
Abstract
We define a version of the Inverse Linear Programming problem that we call Linear Programming System Identification. This version of the problem seeks to identify both the objective function coefficient vector and the constraint
matrix of a linear programming problem that best fits a set of observed vector pairs. One vector is that of actual
decisions that we call outputs. These are regarded as approximations of optimal decision vectors. The other vector
consists of the inputs or resources actually used to produce the corresponding outputs. We propose an algorithm for
approximating the maximum likelihood solution. The major limitation of the method is the computation of exact
volumes of convex polytopes. A numerical illustration is given for simulated data.
© 2003 Elsevier B.V. All rights reserved.
Keywords: Inverse optimization; Linear programming; Parameter estimation; Constraint matrix estimation; Maximum likelihood


1. Introduction
Linear programming (LP) is perhaps the most widely applicable technique in Operational Research. However, obtaining the parameters, namely, the objective function coefficients and the technological coefficient matrix, is typically the most difficult step of a modeling project based on LP. The method proposed here can estimate the LP objective coefficient vector and the technological coefficient matrix from time series or cross-sectional data on actual decisions and resources used. We call the actual decisions outputs and, similarly, we call the resources inputs. Thus, given data on output vectors and input vectors, we wish to infer the best fitting LP model.

This problem lies in the class known as inverse
optimization and can be referred to as an instance
of Inverse LP. Tarantola (1987) has given a general

account of inverse problem solving and discusses
several applications, particularly in the physical
sciences. Although LP is used in that work for
algorithmic steps, the Inverse LP problem itself
was not addressed. On the other hand, Zhang and Liu (1996), and Ahuja and Orlin (2001), respectively, have treated versions of the Inverse LP
problem. Zhang and Liu (1996) considered inverse
assignment and minimum cost flow problems.
Ahuja and Orlin (2001) studied a more general LP
problem and a version of the problem as follows.
Given a feasible solution $\mathbf{x}^0$, find the minimum perturbation of the objective function coefficients so that $\mathbf{x}^0$ becomes an optimal solution. Solutions to this problem were obtained for both the $L_1$- and $L_\infty$-norms. In this paper, we consider a version of the inverse LP problem that seeks estimates of both the objective function vector and the constraint matrix coefficients from a sample of observed resource vectors and decision vectors. We call this version LP system identification. It may be stated formally following some notational conventions.
Notational conventions: Vectors and matrices are written in boldface fonts. Vectors are column vectors. The transpose of vector $\mathbf{y}$ is denoted by $\mathbf{y}'$. We denote by $m(\cdot)$ the volume (Lebesgue measure) of a set $(\cdot)$. We abbreviate the terms probability density function(s) as pdf(s). Similarly, we abbreviate the term cumulative distribution function by cdf.
We may now state the problem formally. Suppose that for each observation index $t = 1, \ldots, T$ there are given data on output or decision vectors $\mathbf{y}^t$, with components $y_r^t$, $r = 1, \ldots, R$, and input vectors $\mathbf{x}^t$, with components $x_i^t$, $i = 1, \ldots, I$. Suppose further that (1) there exists a nonnegative technological coefficient matrix $\mathbf{A} = \{a_{ir}\}$ such that $\mathbf{A}\mathbf{y}^t \le \mathbf{x}^t$ for all $t$; and (2) there exists a nonnegative objective coefficient vector $\boldsymbol{\pi}$ with components $\pi_r$, such that the observed $\mathbf{y}^t$-vectors approximate, in a sense to be defined below, optimal solutions of the linear programming problems $P_t$ given by

$$P_t: \quad \max\ \boldsymbol{\pi}'\mathbf{y}, \quad \text{s.t.}\ \mathbf{A}\mathbf{y} \le \mathbf{x}^t,\ \ \mathbf{y} \ge \mathbf{0}\ \text{componentwise}. \tag{1.1}$$

Let $\mathbf{y}^{t*}$ denote any optimal solution to problem $P_t$. Then the ratio

$$v^t = \boldsymbol{\pi}'\mathbf{y}^t / \boldsymbol{\pi}'\mathbf{y}^{t*} \tag{1.2}$$

is a measure of the degree to which the actual performance for observation $t$ achieves its maximum value. Such ratios are instances of decisional efficiency measures as proposed in Troutt (1995). In that paper, a method of parameter estimation called the maximum decisional efficiency (MDE) principle was proposed. This paper applies the MDE principle to solve the sub-problems of estimating $\boldsymbol{\pi}$ given $\mathbf{A}$ as $\mathbf{A}$ varies over a set of admissible choices. Following estimation of $\boldsymbol{\pi}$ given $\mathbf{A}$, the imputed values of the ratios in (1.2) may be obtained. A pdf can then be fitted to these ratios, and a joint pdf can be constructed for the set of feasible output vectors for each linear programming problem $P_t$. From these pdfs, a likelihood score can be obtained for the observed sample. Thus, we develop a maximum likelihood approach for the joint estimation of $\boldsymbol{\pi}$ and $\mathbf{A}$.
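As a concrete illustration of (1.1) and (1.2), the sketch below computes the efficiency ratios $v^t$ for a given candidate pair $(\boldsymbol{\pi}, \mathbf{A})$ by solving each problem $P_t$ with an off-the-shelf LP solver. The paper's own computations used Excel and SAS/IML; this Python/scipy version, with made-up example data, is only an illustrative sketch.

import numpy as np
from scipy.optimize import linprog

def efficiency_ratios(pi, A, Y, X):
    """Efficiency ratios v_t = pi'y_t / pi'y*_t of (1.2).

    pi : (R,) candidate objective coefficient vector
    A  : (I, R) candidate technological coefficient matrix
    Y  : (T, R) observed output (decision) vectors y_t
    X  : (T, I) observed input (resource) vectors x_t
    """
    ratios = []
    for y_t, x_t in zip(Y, X):
        # Solve P_t: max pi'y  s.t.  A y <= x_t, y >= 0  (linprog minimizes).
        res = linprog(c=-pi, A_ub=A, b_ub=x_t,
                      bounds=[(0, None)] * len(pi), method="highs")
        z_star = -res.fun                      # optimal value pi'y*_t
        ratios.append(float(pi @ y_t) / z_star)
    return np.array(ratios)

# Hypothetical example based on the stipulated model of Section 5.3.
pi = np.array([5.0, 3.0])
A = np.array([[2.0, 1.0], [1.0, 3.0]])
Y = np.array([[10.0, 14.0], [8.0, 12.0]])      # two illustrative decisions
X = np.array([[40.0, 60.0], [40.0, 60.0]])
print(efficiency_ratios(pi, A, Y, X))          # values in (0, 1]
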
The rest of the paper is organized as follows. Section 2 discusses the Maximum Decisional Efficiency principle and a maximin estimation model derived from it. In Section 3, a method for representing the set of admissible or candidate A-matrices is proposed. Section 4 develops a joint likelihood function for the sample of output vectors. In terms of the above notation, this likelihood is a function of the data and parameters, denoted by $L(\mathbf{y}^1, \ldots, \mathbf{y}^T; \mathbf{x}^1, \ldots, \mathbf{x}^T \mid \boldsymbol{\pi}, \mathbf{A})$. Section 5 summarizes the estimation method in pseudo-code form and provides a numerical illustration on simulated data. Section 6 provides a discussion of limitations and potential extensions, and Section 7 provides conclusions.

2. Overviews of the strategy and maximum decisional efficiency estimation
An overview of the estimation strategy is as follows. The MDE approach is used to solve the sub-problem of finding the best $\boldsymbol{\pi}$-estimate, $\boldsymbol{\pi}^* = \boldsymbol{\pi}^*(\mathbf{A})$, corresponding to each A-matrix candidate. This problem is called the MMA model and is given in (2.11)–(2.15). After a review of the general MDE method, the MMA model is derived. Then the space of admissible A-matrices is constructed in Section 3. A likelihood score is associated to each $\mathbf{A}$ and $\boldsymbol{\pi}$ pair in Section 4, and the space of admissible A-matrices is searched for the pair $(\mathbf{A}, \boldsymbol{\pi}^*(\mathbf{A}))$ with the largest likelihood. Thus, we propose a maximum likelihood procedure for the joint estimation of $\boldsymbol{\pi}$ and $\mathbf{A}$.

For the derivation of the MMA model below, it is useful to briefly review the MDE estimation principle in general form as discussed in Troutt (1995). Let $f(\mathbf{x}, \mathbf{p})$ be an objective function depending on an unknown, or perhaps imprecisely known, parameter vector $\mathbf{p}$. Let $X \subset \mathbb{R}^n$ be compact, and let $P \subset \mathbb{R}^m$ be compact. Assume $f(\mathbf{x}, \mathbf{p}) \ge 0$ for all $\mathbf{x} \in X$ and $\mathbf{p} \in P$. Finally, let $\mathbf{x}^*(\mathbf{p})$ be any solution of $\max_{\mathbf{x} \in X} f(\mathbf{x}, \mathbf{p})$. Then define the decisional efficiency, $v^t$, of observation $\mathbf{x}^t$ by

$$v^t = v^t(\mathbf{p}) = f(\mathbf{x}^t, \mathbf{p}) / f(\mathbf{x}^*(\mathbf{p}), \mathbf{p}). \tag{2.1}$$

Clearly, $0 \le v^t(\mathbf{p}) \le 1$ for all $\mathbf{p} \in P$. Thus, $v^t(\mathbf{p})$ measures the degree of optimality of the observed decision, $\mathbf{x}^t$, as a function of the unknown parameter vector, $\mathbf{p}$. Since it is also the ratio of an actual return or benefit to the maximum potential one, it can be regarded as an efficiency measure; hence the term decisional efficiency.
In the case of one observation, the MDE principle chooses the estimate of the true value of $\mathbf{p}$ as that $\mathbf{p}^0$ for which

$$v(\mathbf{p}^0) = \max_{\mathbf{p} \in P} v(\mathbf{p}). \tag{2.2}$$

To extend the notion to multiple observations, let $\mathbf{v}(\mathbf{p})$ be the vector with components $v^t(\mathbf{p})$. Let $\sigma$ be an aggregator function such as the sum, mean, geometric mean, minimum (Leontief), etc. Then the MDE principle estimates the true value of $\mathbf{p}$ by $\mathbf{p}^0$, where $\mathbf{p}^0$ is the parameter vector that maximizes the aggregate decisional efficiency $\sigma\{\mathbf{v}(\mathbf{p})\}$. For example, if we choose summation as the aggregator, then the MDE estimate $\mathbf{p}^0$ maximizes $\sum_{t=1}^{T} v^t(\mathbf{p})$.
We next model the main sub-problem for $\boldsymbol{\pi}^* = \boldsymbol{\pi}^*(\mathbf{A})$ using the MDE approach. We select the minimum aggregator, which results in a maximin formulation. The MDE sub-problem for $\boldsymbol{\pi}$ given $\mathbf{A}$ is

$$\max\ \min_t\ \boldsymbol{\pi}'\mathbf{y}^t / \boldsymbol{\pi}'\mathbf{y}^{t*}. \tag{2.3}$$


By LP duality, we also have

$$\boldsymbol{\pi}'\mathbf{y}^{t*} = \boldsymbol{\xi}^{t*\prime}\mathbf{x}^t, \tag{2.4}$$

where the $\boldsymbol{\xi}^{t*}$ are the optimal dual variables for problem $P_t$. Therefore, the decisional efficiency $v^t$ for observation $t$ can be expressed as

$$v^t = \boldsymbol{\pi}'\mathbf{y}^t / \boldsymbol{\xi}^{t*\prime}\mathbf{x}^t, \tag{2.5}$$

where $\boldsymbol{\xi}^{t*\prime}\mathbf{x}^t$ is the optimal objective function value of the dual problem $D_t$ associated with $P_t$, and where $D_t$ is given by
$$D_t: \quad \min\ \boldsymbol{\xi}^{t\prime}\mathbf{x}^t, \tag{2.6}$$
$$\text{s.t.}\quad \mathbf{A}'\boldsymbol{\xi}^t = \boldsymbol{\pi}, \tag{2.7}$$
$$\boldsymbol{\xi}^t \ge \mathbf{0}\ \text{componentwise}. \tag{2.8}$$

We have chosen the constraints (2.7), as opposed to $\mathbf{A}'\boldsymbol{\xi}^t \ge \boldsymbol{\pi}$, for the following reason. We assume that all data vectors $\mathbf{y}^t$ have strictly positive components. In turn, we assume that the unobserved $\mathbf{y}^{t*}$-vectors also have strictly positive components. Since these $\mathbf{y}^{t*}$-vectors may be considered as the dual variables for the constraints (2.7), the slack variables of the constraints $\mathbf{A}'\boldsymbol{\xi}^t \ge \boldsymbol{\pi}$ must be zero by complementary slackness.
Next, we note that a normalization is necessary. Namely, if $\boldsymbol{\pi}$ is replaced by $k\boldsymbol{\pi}$ in problems $P_t$ or $D_t$, where $k > 0$, then the corresponding dual variables are also multiplied by $k$ without changing the value of the efficiency ratios $v^t$. For this purpose, we choose the normalization constraint

$$\boldsymbol{\xi}^{t_0\prime}\mathbf{x}^{t_0} = 1, \tag{2.9}$$

where $t_0$ is a particular index choice.

Finally, it is necessary to ensure that the efficiency ratios do not exceed unity. This can be accomplished by including the constraints

$$\boldsymbol{\pi}'\mathbf{y}^t \le \boldsymbol{\xi}^{t\prime}\mathbf{x}^t \quad \text{for all } t. \tag{2.10}$$

Collecting these considerations, the estimation model for the $\boldsymbol{\pi}$ given $\mathbf{A}$ sub-problem is then given by model MMA:

$$\text{MMA:}\quad \max\ \min_t\ \boldsymbol{\pi}'\mathbf{y}^t / \boldsymbol{\xi}^{t\prime}\mathbf{x}^t, \tag{2.11}$$
$$\text{s.t.}\quad \mathbf{A}'\boldsymbol{\xi}^t = \boldsymbol{\pi}\ \text{componentwise for all } t, \tag{2.12}$$
$$\boldsymbol{\pi}'\mathbf{y}^t \le \boldsymbol{\xi}^{t\prime}\mathbf{x}^t \quad \text{for all } t, \tag{2.13}$$
$$\boldsymbol{\xi}^{t_0\prime}\mathbf{x}^{t_0} = 1, \tag{2.14}$$
$$\boldsymbol{\pi},\ \boldsymbol{\xi}^t \ge \mathbf{0}\ \text{componentwise for all } t. \tag{2.15}$$


It is important to note that the maximization of the minimum ratio in (2.11) requires that $\boldsymbol{\xi}^{t\prime}\mathbf{x}^t$ be simultaneously minimized, which, together with constraints (2.12) and (2.13), makes the resulting $\boldsymbol{\xi}^t$ optimal for the problems $D_t$ corresponding to the vector $\boldsymbol{\pi}$. In particular, when $\boldsymbol{\pi}$ is the optimal $\boldsymbol{\pi}$-solution for problem MMA, the associated optimal $\boldsymbol{\xi}^t$ can be seen to be the corresponding optimal dual variables $\boldsymbol{\xi}^{t*}$.
Model MMA is a generalized fractional programming problem, for which several algorithms have been proposed. See, for example, Crouzeix et al. (1985), Crouzeix and Ferland (1991), Pardalos and Phillips (1991), Barros et al. (1996a,b) and Gugat (1996). However, it is also amenable to solution by general-purpose nonlinear programming algorithms such as the GRG2 nonlinear solver in Microsoft Excel™. For the numerical illustration below, a further simplification is shown by which the MMA model can be reduced to a linear programming problem.
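The following sketch is one way, not the authors' procedure, to attack the maximin objective (2.11) with a general-purpose LP solver: for a fixed level $w$, the requirement that every ratio be at least $w$ becomes the linear constraints $\boldsymbol{\pi}'\mathbf{y}^t \ge w\,\boldsymbol{\xi}^{t\prime}\mathbf{x}^t$, so feasibility of $w$ together with (2.12)–(2.15) is an LP feasibility check, and the largest feasible $w$ can be found by bisection. Function and variable names are illustrative assumptions.

import numpy as np
from scipy.optimize import linprog

def solve_mma_bisection(A, Y, X, tol=1e-4):
    """Approximate the MMA maximin problem (2.11)-(2.15) by bisection on the
    minimum efficiency level w; each trial level is an LP feasibility check.
    Assumes A y^t <= x^t for all t, so that w = 0 is feasible."""
    I, R = A.shape
    T = len(Y)
    n = R + T * I                       # variables: pi (R), then xi^1,...,xi^T (I each)

    def xi_slice(t):
        return slice(R + t * I, R + (t + 1) * I)

    def feasible(w):
        A_eq, b_eq, A_ub, b_ub = [], [], [], []
        for t in range(T):
            # (2.12): A' xi^t - pi = 0, componentwise
            for r in range(R):
                row = np.zeros(n); row[r] = -1.0; row[xi_slice(t)] = A[:, r]
                A_eq.append(row); b_eq.append(0.0)
            # (2.13): pi'y^t - xi^t'x^t <= 0
            row = np.zeros(n); row[:R] = Y[t]; row[xi_slice(t)] = -X[t]
            A_ub.append(row); b_ub.append(0.0)
            # level constraint: w * xi^t'x^t - pi'y^t <= 0
            row = np.zeros(n); row[:R] = -Y[t]; row[xi_slice(t)] = w * X[t]
            A_ub.append(row); b_ub.append(0.0)
        # (2.14): xi^{t0}' x^{t0} = 1, taking t0 as the first observation
        row = np.zeros(n); row[xi_slice(0)] = X[0]
        A_eq.append(row); b_eq.append(1.0)
        res = linprog(np.zeros(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                      A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                      bounds=[(0, None)] * n, method="highs")
        return res.x[:R] if res.status == 0 else None

    lo, hi, pi_best = 0.0, 1.0, None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        pi_mid = feasible(mid)
        if pi_mid is not None:
            lo, pi_best = mid, pi_mid
        else:
            hi = mid
    return lo, pi_best
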

3. Generation of the admissible candidate constraint matrices

It is useful to limit the class from which estimates of the A-matrix can be chosen.

Definition 1. We call matrix $\mathbf{A}$ a technically efficient constraint matrix if (i) $\mathbf{A}\mathbf{y}^t \le \mathbf{x}^t$ componentwise for all $t$, (ii) the elements of $\mathbf{A}$ are nonnegative, and (iii) for each index-$i$, the $i$th constraint of $\mathbf{A}\mathbf{y}^t \le \mathbf{x}^t$ holds as an equality for at least one index-$t$.

The aim of Definition 1 is to limit the search to those nonnegative matrices that are ''tight fitting'' as discussed further below. This class of matrices can be represented with the help of:

Definition 2. We call $\boldsymbol{\Lambda} = \{\lambda_{ir}\}$ a constraint generator matrix if $\lambda_{ir} \ge 0$ for all $i$ and $r$, and $\sum_r \lambda_{ir} = 1$ for all $i$. For each row-$i$, define scalars $\kappa_i$, called row scale factors, by

$$\kappa_i = \min_t \left\{ x_i^t \Big/ \sum_r \lambda_{ir} y_r^t \right\}. \tag{3.1}$$

Similarly, we call vector $\boldsymbol{\lambda}_i = \{\lambda_{ir}\}$ a row generator vector. Finally, we define the corresponding admissible A-matrix by

$$\mathbf{A} = \{a_{ir}\} = \{\kappa_i \lambda_{ir}\}. \tag{3.2}$$

Thus, vector $\boldsymbol{\lambda}_i$ determines the orientation of the $i$th constraint hyperplane, and $\kappa_i$ is a scale factor chosen so that the constraint $\sum_r a_{ir} y_r^t = \sum_r \kappa_i \lambda_{ir} y_r^t \le x_i^t$ is the tightest fitting half-space that encloses all data vectors. Thus, by tightest fitting, we mean that the constraint forms the smallest half-space that includes all of the data.

Representation of the $\boldsymbol{\Lambda}$-matrix can be further simplified as follows. Define variables $u_{ir}$, for $i = 1, \ldots, I$ and $r = 1, \ldots, R-1$, as follows. First, let $0 \le u_{ir} \le 1$ for all $i$ and $r$. Then let

$$\lambda_{i1} = u_{i1}, \quad \lambda_{i2} = (1 - \lambda_{i1})u_{i2}, \quad \lambda_{i3} = (1 - \lambda_{i1} - \lambda_{i2})u_{i3}, \ \ldots$$
$$\lambda_{ir} = \left(1 - \sum_{n=1}^{r-1} \lambda_{in}\right) u_{ir} \quad \text{for all } i \text{ and for } r \le R-1,$$
$$\lambda_{iR} = 1 - \sum_{n=1}^{R-1} \lambda_{in} \quad \text{for all } i. \tag{3.3}$$

By discretizing the interval for each $u_{ir}$ into $N$ steps, this search space can be refined to any desired degree. For example, with $N = 100$ and a step size of $N^{-1} = 0.01$, values of $u_{ir}$ could be selected from $\{0.01, 0.02, \ldots, 1.00\}$. Section 5 describes this process in detail.
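The construction of Definitions 1 and 2 can be sketched in a few lines of numpy: build $\boldsymbol{\Lambda}$ from the $u_{ir}$ by the recursion (3.3), compute the row scale factors $\kappa_i$ of (3.1) from the data, and form the admissible matrix $\mathbf{A}$ of (3.2). The example data at the bottom are hypothetical.

import numpy as np

def lambda_from_u(U):
    """Build the constraint generator matrix Lambda (rows sum to 1) from
    u_ir in [0, 1], i = 1..I, r = 1..R-1, using the recursion (3.3)."""
    I, Rm1 = U.shape                  # U has R-1 columns
    Lam = np.zeros((I, Rm1 + 1))
    for i in range(I):
        remaining = 1.0
        for r in range(Rm1):
            Lam[i, r] = remaining * U[i, r]
            remaining -= Lam[i, r]
        Lam[i, Rm1] = remaining       # last component takes up the slack
    return Lam

def admissible_A(Lam, Y, X):
    """Scale each generator row by kappa_i of (3.1) so that the constraint
    sum_r a_ir y_r^t <= x_i^t is tight for at least one observation t (3.2)."""
    kappa = np.min(X / (Y @ Lam.T), axis=0)     # (3.1), minimum over t
    return kappa[:, None] * Lam                 # A = {kappa_i * lambda_ir}

# Example: the generators of Section 5.3, lambda_1 = (2/3, 1/3) and
# lambda_2 = (1/4, 3/4), recovered from u_11 = 2/3 and u_21 = 1/4.
U = np.array([[2/3], [1/4]])
Y = np.array([[10.0, 14.0], [8.0, 12.0]])       # hypothetical observed outputs
X = np.array([[40.0, 60.0], [40.0, 60.0]])
print(admissible_A(lambda_from_u(U), Y, X))
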

4. Construction of the likelihood function

Let $\mathbf{A}$ be a specified candidate matrix. By solution of model MMA, an estimate $\boldsymbol{\pi}^* = \boldsymbol{\pi}^*(\mathbf{A})$ can be determined. Also, the efficiency ratios

$$v^t = v^t(\mathbf{A}) = \boldsymbol{\pi}^{*\prime}\mathbf{y}^t / \boldsymbol{\xi}^{t*\prime}\mathbf{x}^t = \boldsymbol{\pi}^{*\prime}\mathbf{y}^t / z^{t*} \tag{4.1}$$

can then be computed. The $v^t$ are distributed over the interval $[0, 1]$, and a specific pdf model can be fitted to the $v^t$-data. This pdf, denoted $g(v)$, may be used to construct a pdf on the set of output
vectors $\mathbf{y}$, for each problem $P_t$, as will be seen below. The following additional definitions are needed. Let $F^t = \{\mathbf{y} : \mathbf{A}\mathbf{y} \le \mathbf{x}^t,\ y_r \ge 0 \text{ for all } r\}$ be the set of feasible decision vectors $\mathbf{y}$ for the problem $P_t$, and let $z^{t*}$ be the optimal objective function value for $P_t$. Next, let $S^t(v, \mathbf{x}^t) = \{\mathbf{y} \in F^t : \boldsymbol{\pi}'\mathbf{y} = v z^{t*}\}$. These sets contain those feasible vectors that have efficiency score $v$. We note that these sets vary with $t$. This is because, while the constraint matrix $\mathbf{A}$ is the same for each problem $P_t$, the differing resource data vectors $\mathbf{x}^t$ create different constraints (half-spaces) for each $t$. Fig. 1 depicts problem $P_t$ and these relationships for the case of $\mathbf{y} \in \mathbb{R}^2$.

Next, define $W^t(v) = \bigcup_{v' \ge v} S^t(v', \mathbf{x}^t)$ as the set of vectors that are feasible for problem $P_t$ and which also have efficiency scores of $v$ or greater. Also, define the functions $V^t(\mathbf{y}) = \boldsymbol{\pi}'\mathbf{y} / z^{t*}$. These are defined on the set of feasible vectors for problem $P_t$ and give the efficiency values of the feasible $\mathbf{y}$-vectors. For example, $V^t(\mathbf{y}^t) = v^t$.

We are now able to propose pdf models $f^t(\mathbf{y})$ for the feasible sets of problems $P_t$. These pdfs represent the composition of a two-step process. First, a value $v$, $0 \le v \le 1$, is selected according to the pdf $g(v)$. Then, given $v$, a vector $\mathbf{y}$ is selected on the set $S^t(v, \mathbf{x}^t)$ according to the uniform pdf on that set. Let $\Delta v$ be a small positive number and consider the approximation of $\mathrm{Prob}(v \le V(\mathbf{y}) \le v + \Delta v)$ in two ways. First, this probability is given by

$$\int_v^{v+\Delta v} g(u)\,du \cong g(v)\,\Delta v. \tag{4.2}$$

By the uniform pdf assumption, $f(\mathbf{y})$ is constant on $S^t(v, \mathbf{x}^t)$ for each $v$ and has value $u(v)$, say, on these sets. It follows that

$$\mathrm{Prob}(v \le V(\mathbf{y}) \le v + \Delta v) = \int_{\{\mathbf{y}:\, v \le V(\mathbf{y}) \le v+\Delta v\}} f(\mathbf{y}) \prod_{r=1}^{R} dy_r \cong u(v)\,[\,m(W(v, \mathbf{x})) - m(W(v + \Delta v, \mathbf{x}))\,]. \tag{4.3}$$

The volume measure in brackets can be further approximated. For small $\Delta v$, it is given by the product of the surface measure $m(S^t(v, \mathbf{x}^t))$ and the distance element corresponding to $\Delta v$ and orthogonal to $S^t(v, \mathbf{x}^t)$. This distance is the length of the projection of the vector $(\Delta v)\mathbf{y}^*$ in the direction of the vector $\boldsymbol{\pi}$, which is

$$(\Delta v)\, z^* / \lVert\boldsymbol{\pi}\rVert. \tag{4.4}$$

It follows that

$$m(W(v, \mathbf{x})) - m(W(v + \Delta v, \mathbf{x})) \cong m(S^t(v, \mathbf{x}^t))\,(\Delta v)\, z^* / \lVert\boldsymbol{\pi}\rVert. \tag{4.5}$$

Combining results, we have, in the limit as $\Delta v \to 0$,

$$g(v) = u(v)\, m(S^t(v, \mathbf{x}^t))\, z^* / \lVert\boldsymbol{\pi}\rVert. \tag{4.6}$$

Therefore

$$f^t(\mathbf{y}) = u(V^t(\mathbf{y})), \tag{4.7}$$

where

$$u(v) = \lVert\boldsymbol{\pi}\rVert\, g(v)\, [z^*\, m(S^t(v, \mathbf{x}^t))]^{-1}. \tag{4.8}$$
To summarize, the pdf for vectors $\mathbf{y}$ in problem $P_t$ is given by

$$f^t(\mathbf{y}) = \lVert\boldsymbol{\pi}\rVert\, g(V^t(\mathbf{y}))\, [z^{t*}\, m(S^t(v, \mathbf{x}^t))]^{-1}. \tag{4.9}$$

An alternative derivation of this pdf is given in Troutt et al. (2003).

[Fig. 1. Problem $P_t$ in $\mathbb{R}^2$. The figure shows the feasible region $\mathbf{A}\mathbf{y} \le \mathbf{x}^t$, the level set $S^t(V(\mathbf{y}^t), \mathbf{x}^t)$ on which $\boldsymbol{\pi}'\mathbf{y} = v z^*$, the optimal point $\mathbf{y}^{t*}$ at the level $v = 1$, the observed $\mathbf{y}^t$, and the level $v = 0$. Although depicted, $\mathbf{y}^{t*}$ is unobserved and is not used in the estimation procedure.]

Finally, assuming independence, we obtain the likelihood of the data sample as

$$L(\mathbf{y}^1, \ldots, \mathbf{y}^T; \mathbf{x}^1, \ldots, \mathbf{x}^T \mid \boldsymbol{\pi}, \mathbf{A}) = \prod_{t=1}^{T} f^t(\mathbf{y}^t), \tag{4.10}$$
when each $f^t(\mathbf{y}^t)$ is defined. One or more of these could, in theory, be undefined in the event that for some $t$, $m(S^t(v, \mathbf{x}^t)) = 0$ in (4.9). In that case, we propose the following heuristic. If any $f^t(\mathbf{y}^t)$ is undefined, its value is replaced by the largest of those which are defined. Then (4.10) is calculated with the revised values. This heuristic, which we call indented likelihood, has the following properties. A sample will have a defined likelihood unless $\mathbf{y}^t = \mathbf{y}^{t*}$ for every $t$. This condition could only occur if every observed $\mathbf{y}^t$ is an optimal solution of problem $P_t$ for the estimated $\boldsymbol{\pi}$ and $\mathbf{A}$, a condition that can be regarded as unlikely. Except for that case, the likelihood will be defined and may have any value in $[0, \infty)$. This heuristic is discussed further in Section 6.
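A minimal sketch of how (4.9), (4.10) and the indented likelihood heuristic combine, assuming the fitted pdf $g$, the estimate $\boldsymbol{\pi}$, the optimal values $z^{t*}$ and the measures $m(S^t(v^t, \mathbf{x}^t))$ have already been computed; the function name and interface are illustrative.

import numpy as np

def indented_likelihood(g, pi, v, z_star, measures):
    """Likelihood (4.10) from the densities (4.9), with undefined terms
    (zero measure m(S^t)) replaced by the largest defined density.

    g        : fitted pdf on [0, 1], callable and vectorized over arrays
    pi       : estimated objective coefficient vector
    v        : array of efficiency ratios v^t, t = 1..T
    z_star   : array of optimal values z^{t*}
    measures : array of m(S^t(v^t, x^t)) values
    """
    norm_pi = np.linalg.norm(pi)
    f = np.full(len(v), np.nan)
    defined = measures > 0
    f[defined] = norm_pi * g(v[defined]) / (z_star[defined] * measures[defined])
    if not defined.any():
        raise ValueError("every observation is exactly optimal; likelihood undefined")
    f[~defined] = f[defined].max()          # the indented-likelihood replacement
    return float(np.prod(f))
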
Remark. The density model given by (4.9) reflects the combined influences of both efficiency with respect to the objective function value and the geometry of the constraints. An output vector will have a high pdf value to the extent that it has a large $g(v)$-value and/or a small $m(S^t(v, \mathbf{x}^t))$-value. The density models $f^t(\mathbf{y})$ may be regarded as combining these separate influences.

5. Summary of the method and a numerical illustration

5.1. Initialization and summary of the method

A summary of the estimation method is given in the form of pseudo-code in Fig. 2.

To initialize, choose the number of steps $N$ for discretizing the $[0, 1]$ ranges of the $u_{ir}$. Let $Z_{i,r}$ be integer variables. Set all entries of the matrix $\mathbf{U} = \{u_{ir}\}$ to the value $N^{-1}$. Initialize LIST as an empty table with the number of fields given by $1 + R + I \times R$, where 1 accounts for the scalar value of $L$, $R$ accounts for the components of $\boldsymbol{\pi}$, and $I \times R$ accounts for the components of $\mathbf{A}$.
BEGIN:
Repeat for Z_{1,1} = 1 To N
  Repeat for Z_{1,2} = 1 To N
    ...
    Repeat for Z_{1,R-1} = 1 To N
      Repeat for Z_{2,1} = 1 To N
        ...
        Repeat for Z_{I,R-1} = 1 To N
          Set u_{ir} = N^{-1} Z_{ir} for i = 1,...,I and r = 1,...,R-1
          Construct the Λ-matrix according to (3.3)
          Compute the κ_i, i = 1,...,I, according to (3.1)
          Construct the A-matrix according to (3.2)
          Solve problem MMA (2.11)-(2.15) for π
          Using π and A, solve problems P_t (1.1) for z^{t*}, t = 1,...,T
          Compute v^t for t = 1,...,T according to (4.1)
          Fit a pdf model g(v) to the v^t-data
          Compute the set volumes m(S^t(v^t, x^t)) for t = 1,...,T
          Compute the likelihood L according to (4.9) and (4.10)
          Add the line (L, components of π, components of A) to LIST
        End
      ...
End
Sort LIST in descending order of L
Return the estimate as the top row of LIST

Fig. 2. Summary of the estimation procedure in pseudo-code form.
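Fig. 2's nested Repeat loops simply enumerate the grid $\{1/N, 2/N, \ldots, 1\}$ for every $u_{ir}$. The sketch below expresses the same enumeration with a single product over the $I \times (R-1)$ grid variables; the commented driver refers to helper functions from the earlier sketches or to hypothetical placeholders (solve_mma, fit_g, likelihood) and is not the authors' implementation.

import itertools
import numpy as np

def enumerate_candidates(N, I, R):
    """Yield every matrix U = {u_ir}, u_ir in {1/N, ..., 1}, as in Fig. 2."""
    grid = [k / N for k in range(1, N + 1)]
    for combo in itertools.product(grid, repeat=I * (R - 1)):
        yield np.array(combo).reshape(I, R - 1)

# Sketch of the outer search of Fig. 2 (helper functions are placeholders):
# LIST = []
# for U in enumerate_candidates(N, I, R):
#     Lam = lambda_from_u(U)                 # (3.3)
#     A = admissible_A(Lam, Y, X)            # (3.1)-(3.2)
#     pi = solve_mma(A, Y, X)                # (2.11)-(2.15)
#     v = efficiency_ratios(pi, A, Y, X)     # (4.1)
#     g = fit_g(v)                           # pdf model for the v^t-data
#     LIST.append((likelihood(g, pi, A, Y, X), pi, A))   # (4.9)-(4.10)
# LIST.sort(key=lambda row: row[0], reverse=True)        # report the top row
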

5.2. Fitting the pdf and computing volumes of polytopes

Since the $v^t$-data lie in the interval $[0, 1]$, a versatile pdf modeling approach can be based on the two-parameter gamma pdf. First, transform the data by way of $x = (\ln v)^2$ so that $x \in [0, \infty)$. Then model the pdf of $x$ as the two-parameter gamma pdf. Some related details can be found in Troutt et al. (2000), which provides a method of moments approach to fitting the gamma model.
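A sketch of the fitting step just described, under the assumption that the transform and moment-matching details follow Troutt et al. (2000): the $v^t$-data are mapped to $x = (\ln v)^2$ and a two-parameter gamma pdf is fitted to the $x$-data by matching the sample mean and variance.

import numpy as np

def fit_gamma_to_v(v):
    """Transform v in (0, 1] to x = (ln v)^2 and fit a two-parameter gamma
    pdf to x by the method of moments (shape k, scale theta)."""
    x = np.log(v) ** 2
    mean, var = x.mean(), x.var(ddof=1)
    shape = mean ** 2 / var            # k = mean^2 / variance
    scale = var / mean                 # theta = variance / mean
    return shape, scale

v = np.array([0.95, 0.88, 0.99, 0.80, 0.91, 0.97])   # illustrative ratios
print(fit_gamma_to_v(v))
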
For the evaluation of the volumes of the sets $S^t(v^t, \mathbf{x}^t)$, which are convex polytopes, exact algorithms have been proposed. See, for example, Büeler et al. (1998), Lawrence (1991), Verschelde et al. (1994), and Cohen and Hickey (1979). Codes in C are available through the home page of Prof. K. Fukuda (http://www.ifor.math.ethz.ch/~fukuda/fukuda.html). In the numerical illustration that we present in the next section, the calculations are elementary and do not require the specialized software.


5.3. A numerical illustration on simulated data
An example of the procedure is developed here for simulated data. For simplicity, we consider two outputs and two inputs. In addition, constant input vectors are used. This permits an additional simplification in that problem MMA simplifies to an LP problem. This can be seen as follows. We have $\mathbf{x}^t = (x_1, x_2)'$ and $\boldsymbol{\xi}^t = (\xi_1, \xi_2)'$ for all $t$. Then, in problem MMA, the objective function simplifies to

$$\max\ \min_t\ \boldsymbol{\pi}'\mathbf{y}^t / \boldsymbol{\xi}^{t\prime}\mathbf{x}^t = \max\ \min_t\ (\pi_1 y_1^t + \pi_2 y_2^t)/(\xi_1 x_1 + \xi_2 x_2). \tag{5.1}$$

Now, by the normalization constraint, $\xi_1 x_1 + \xi_2 x_2 = 1$, so that (5.1) simplifies further to

$$\max\ \min_t\ \pi_1 y_1^t + \pi_2 y_2^t. \tag{5.2}$$

Objective functions of this kind can be converted to linear ones by the introduction of an auxiliary variable, $w$, for which

$$w = \min_t\ \pi_1 y_1^t + \pi_2 y_2^t. \tag{5.3}$$

Then maximization of $w$, along with the additional constraints

$$w \le \pi_1 y_1^t + \pi_2 y_2^t \quad \text{for all } t, \tag{5.4}$$

ensures that (5.3) holds at the optimal solution. Thus, it follows that problem MMA reduces to the following LP problem:

$$\max\ w, \tag{5.5}$$
$$\text{s.t.}\quad w \le \pi_1 y_1^t + \pi_2 y_2^t \quad \text{for all } t, \tag{5.6}$$
$$\mathbf{A}'\boldsymbol{\xi} = \boldsymbol{\pi} \quad \text{componentwise}, \tag{5.7}$$
$$\pi_1 y_1^t + \pi_2 y_2^t \le 1 \quad \text{for all } t, \tag{5.8}$$
$$\xi_1 x_1 + \xi_2 x_2 = 1, \tag{5.9}$$
$$w,\ \pi_1,\ \pi_2,\ \xi_1,\ \xi_2 \ge 0. \tag{5.10}$$
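The LP (5.5)–(5.10) can be handed directly to any LP solver once the coefficient rows are assembled. The sketch below does this for the two-output, two-input, constant-input case with the variable ordering $(w, \pi_1, \pi_2, \xi_1, \xi_2)$; it uses scipy rather than the Excel or SAS/IML tools reported in the paper.

import numpy as np
from scipy.optimize import linprog

def solve_mma_lp(A, Y, x):
    """Solve the LP reduction (5.5)-(5.10) of model MMA for the constant-input
    case. Variables are ordered (w, pi_1, pi_2, xi_1, xi_2)."""
    A_ub, b_ub = [], []
    for y1, y2 in Y:
        A_ub.append([1.0, -y1, -y2, 0.0, 0.0]); b_ub.append(0.0)   # (5.6)
        A_ub.append([0.0, y1, y2, 0.0, 0.0]);  b_ub.append(1.0)    # (5.8)
    A_eq = [
        [0.0, -1.0, 0.0, A[0, 0], A[1, 0]],     # (5.7), component r = 1
        [0.0, 0.0, -1.0, A[0, 1], A[1, 1]],     # (5.7), component r = 2
        [0.0, 0.0, 0.0, x[0], x[1]],            # (5.9), normalization
    ]
    b_eq = [0.0, 0.0, 1.0]
    res = linprog(c=[-1.0, 0.0, 0.0, 0.0, 0.0],  # maximize w
                  A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * 5, method="highs")
    w, pi1, pi2, xi1, xi2 = res.x
    return w, np.array([pi1, pi2]), np.array([xi1, xi2])
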

Data sets consisting of 30 output vectors $(y_1^t, y_2^t)$, $t = 1, \ldots, 30$, were simulated from the following stipulated true model:

Stipulated true LP model:
$$\max\ \boldsymbol{\pi}_0'\mathbf{y} = 5y_1 + 3y_2, \tag{5.11}$$
$$\text{s.t.}\quad 2y_1 + y_2 \le 40, \tag{5.12}$$
$$y_1 + 3y_2 \le 60, \tag{5.13}$$
$$y_1,\ y_2 \ge 0. \tag{5.14}$$

The constraints and objective function orientations for this model are depicted in Fig. 1. Thus, $\boldsymbol{\pi}_0 = (5, 3)'$, $\mathbf{x}^t = (40, 60)'$, and $z^{t*} = 108$ for $t = 1, \ldots, 30$. Random $\mathbf{y}^t$-observations were generated as follows. First, a $v^t$-value was simulated from the positive exponential density $g(v) = g_a(v) = C_a e^{av}$, $v \in [0, 1]$, for which $C_a = a(e^a - 1)^{-1}$, and for which the cdf is given by $G(v) = (e^a - 1)^{-1}(e^{av} - 1)$. The parameter value $a = 8$ was chosen. Then, for each $v^t$-value, a $\mathbf{y}^t$ such that

$$\boldsymbol{\pi}_0'\mathbf{y}^t = v^t z^{t*} \tag{5.15}$$

was generated using the uniform pdf on the line segment defined by the constraints (5.12)–(5.14) and (5.15). Fig. 1 may be referenced again. The end-points of this segment were determined by maximizing and minimizing $y_2^t$ subject to the feasibility constraints. The Lebesgue measure of the set $S^t(v, \mathbf{x}^t)$ is the length of the line segment defined by those end-points, and the uniform pdf on the segment is the reciprocal of that length.

Several candidate A-matrices were constructed as follows. Pairs of row generator vectors were selected as in Table 1. Then, the $\kappa_i$-factors and corresponding A-matrices were computed according to the procedure of Section 3.

Table 1
Candidate matrices

Number    λ1      λ2
1         1       2/3
2         1       1/2
3         1       1/4
4         1       0
5         2/3     2/3
6         2/3     1/2
7         2/3     1/4   (stipulated true case)
8         2/3     0
9         1/2     1/2
10        1/2     1/4
11        1/2     0
12        1/4     1/4
13        1/4     0
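The data-generation step described above can be sketched as follows, as an illustrative reconstruction rather than the authors' code: $v$ is drawn by inverting the cdf $G(v) = (e^{av} - 1)/(e^a - 1)$, the segment endpoints are found by minimizing and maximizing $y_2$ subject to feasibility and (5.15), and $\mathbf{y}$ is drawn uniformly on that segment; the segment length is the measure $m(S^t(v, \mathbf{x}^t))$ needed later.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

def simulate_observation(pi, A, x, z_star, a=8.0):
    """Draw v from g_a(v) = C_a * exp(a v) on [0, 1] by inverting its cdf,
    then draw y (two-dimensional case) uniformly on the segment
    {y >= 0 : A y <= x, pi'y = v z*}."""
    u = rng.uniform()
    v = np.log(1.0 + u * (np.exp(a) - 1.0)) / a          # inverse of G(v)
    c = v * z_star
    # Endpoints of the segment: minimize and maximize y2 subject to feasibility.
    endpoints = []
    for sign in (1.0, -1.0):
        res = linprog(c=[0.0, sign], A_ub=A, b_ub=x,
                      A_eq=[pi], b_eq=[c],
                      bounds=[(0, None), (0, None)], method="highs")
        endpoints.append(res.x)
    e1, e2 = endpoints
    t = rng.uniform()
    y = e1 + t * (e2 - e1)                               # uniform on the segment
    length = np.linalg.norm(e2 - e1)                     # m(S^t(v, x^t))
    return v, y, length

pi = np.array([5.0, 3.0]); A = np.array([[2.0, 1.0], [1.0, 3.0]])
x = np.array([40.0, 60.0]); z_star = 108.0
print(simulate_observation(pi, A, x, z_star))
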
Twenty-five such data sets, of 30 observations each, were simulated and the estimation steps were completed for each data set. The estimated $\boldsymbol{\pi}$-vector (multiple optimal solutions were resolved by selecting the first solution reported by the LP algorithm) and the mean, $\bar{v}$, of the $v^t$-values were obtained. The pdf model $g_a(v) = C_a e^{av}$ was fitted to the observed $v^t$-values by the method of maximum likelihood estimation (MLE). The MLE estimate of $a$ can be verified to be the solution of the equation

$$a^{-1} - e^a (e^a - 1)^{-1} + \bar{v} = 0, \tag{5.16}$$

which was solved by Newton's method using a starting value of $a_0 = 2$.
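A sketch of the Newton iteration for the MLE equation (5.16), written as $h(a) = a^{-1} - e^a(e^a - 1)^{-1} + \bar{v} = 0$ with derivative $h'(a) = -a^{-2} + e^a(e^a - 1)^{-2}$, and the same starting value $a_0 = 2$ as in the text.

import numpy as np

def mle_a(v_bar, a0=2.0, tol=1e-10, max_iter=100):
    """Solve (5.16): 1/a - e^a/(e^a - 1) + v_bar = 0 by Newton's method."""
    a = a0
    for _ in range(max_iter):
        ea = np.exp(a)
        h = 1.0 / a - ea / (ea - 1.0) + v_bar
        dh = -1.0 / a**2 + ea / (ea - 1.0) ** 2     # derivative of h
        step = h / dh
        a -= step
        if abs(step) < tol:
            break
    return a

print(mle_a(v_bar=0.875))   # roughly 8 for a mean efficiency near 0.875
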

Finally, the likelihood value for each observation was calculated using (4.9) and (4.10). The results were obtained using SAS/IML (1989) and are shown in Table 2.

Table 2
Estimation results for 25 consecutive data set simulations based on the stipulated assumptions

Trial   Indicated best       First       Second      π1        π2        Ratio     Estimated
        candidate matrix     runner-up   runner-up                       π1/π2     a
1       #7 (*)                                       0.0385    0.0337    1.142     7.962
2       #8                   #7 (*)                  0.0321    0.0379    0.847     4.572
3       #1                   #5          #6          0.0501    0.0250    2.004     7.932
4       #7 (*)                                       0.0318    0.0387    0.822     7.554
5       #7 (*)                                       0.0500    0.0250    2.000     8.926
6       #8                   #7 (*)                  0.0463    0.0278    1.665     8.241
7       #8                   #7 (*)                  0.0452    0.0289    1.564     7.989
8       #7 (*)                                       0.0420    0.0311    1.350     5.827
9       #7 (*)                                       0.0306    0.0396    0.773     4.824
10      #7 (*)                                       0.0472    0.0272    1.735     10.950
11      #8                   #7 (*)                  0.0500    0.0250    2.000     8.138
12      #7 (*)                                       0.0435    0.0300    1.450     5.979
13      #8                   #7 (*)                  0.0272    0.0422    0.645     4.183
14      #10                  #7 (*)                  0.0232    0.0453    0.512     3.303
15      #8                   #7 (*)                  0.0500    0.0250    2.000     6.813
16      #7 (*)                                       0.0347    0.0367    0.946     6.029
17      #7 (*)                                       0.0472    0.0272    1.735     13.450
18      #7 (*)                                       0.0309    0.0394    0.784     4.419
19      #7 (*)                                       0.0398    0.0328    1.213     5.977
20      #7 (*)                                       0.0500    0.0250    2.000     10.298
21      #7 (*)                                       0.0422    0.0309    1.366     7.414
22      #7 (*)                                       0.0500    0.0250    2.000     7.009
23      #7 (*)                                       0.0375    0.0347    1.081     8.927
24      #7 (*)                                       0.0330    0.0377    0.875     7.141
25      #7 (*)                                       0.0449    0.0289    1.554     6.971

Mean                                                 0.0407    0.0320    1.363     7.233
Standard deviation                                   0.0083    0.0062    0.497     2.269
Standard error of the mean                           0.0017    0.0012    0.099     0.454
Frequency correct    0.68            0.28

(*) indicates correct identification.

From Table 2, the stipulated true matrix, number seven, was identified for 68% of these randomly simulated data sets. For those trials in which the maximum likelihood candidate was incorrect, the correct matrix was identified as having next highest likelihood in all but one trial. In addition, in all but one of the misidentified trials, matrix number eight was the one actually identified. It can be noted that matrix number eight is close to matrix number seven in the sense that one constraint is the same and the other has slope near to that of the other constraint of matrix number seven.
The mean of the estimated ratio $\pi_1/\pi_2$ was 1.363, as compared to 1.6 for the stipulated model. This is a difference of 2.39 standard errors of the mean. It should be noted that the stipulated model (5.11)–(5.14) has a large range of optimality that yields ratios in the range (0.333, 2.000). All trials but one, number three, yielded ratios in this range. Perhaps less estimation precision for the objective function vector should be expected due to the wide range of optimality for this example.

The reader is cautioned that these results are merely illustrative and do not constitute an adequate simulation study of the method. However, we believe they suggest that the method might be reasonably effective and is worthy of a more comprehensive simulation study. For this example, the method appeared less effective in identifying the correct distribution parameter $a$. However, accuracy of this estimate may be neither necessary nor sufficient for correct identification of the matrix. See, for example, trial nine (good matrix identification but poor $a$-estimate) and trial three (good $a$-estimate but poor matrix identification). Similar observations hold with respect to the $a$-estimate and the $\pi_1/\pi_2$ ratio estimates. This is evidenced by trials one, four, ten, eleven and fifteen in Table 2.

6. Limitations and potential extensions
6.1. Validation of the optimization goal
The major theoretical assumptions are that: (i) linear programming models with the same $\boldsymbol{\pi}$ and $\mathbf{A}$ were appropriate for each observation, and (ii) the organization or decision maker(s) have chosen or otherwise produced outputs which approximate optimal solutions to these models. The appropriateness of the second assumption can be judged to some degree by the magnitude of the likelihood scores and by the degree of concentration of the fitted $g(v)$ pdf toward the ideal value of unity for $v$. In Troutt et al. (2000), a criterion called normal-like-or-better performance effectiveness has been proposed as a test for the optimization goal.

6.2. Other decision model orientations
The basic decision model class assumed here is that of the $P_t$, which may be described as output-oriented. Namely, output values were to be approximately maximized subject to given resources. A cost-minimization, input-oriented model could be developed along similar lines. That is, suppose the data are regarded as attempts to minimize input costs subject to required outputs. The basic models $Q_t$, analogous to the $P_t$, can be given as

$$Q_t: \quad \min\ \boldsymbol{\xi}'\mathbf{x}, \quad \text{s.t.}\ \mathbf{B}\mathbf{x} \ge \mathbf{y}^t,\ \ \mathbf{x} \ge \mathbf{0}\ \text{componentwise}. \tag{6.1}$$

The model analogous to MMA can be easily constructed for this case.
Zeleny (1986) has called attention to the LP
modeling flexibilities afforded by regarding resource levels as additional decision variables in
what are called De Novo LP models. Such constructs adapted to the present setting clearly provide an avenue for further research on mixed
input–output orientations.
6.3. The unbounded pdf heuristic
For the positive exponential $g(v)$ that we used for the numerical illustration, the pdf value at the upper end-point of the interval, $g(1)$, is bounded and greater than zero. If $\mathbf{y}^t = \mathbf{y}^{t*}$ and $\mathbf{y}^{t*}$ is the unique optimal solution of problem $P_t$, then $m(S^t(v, \mathbf{x}^t)) = 0$ in (4.9), and thus (4.9) tends to the indeterminate form $g(1)/0$ as $v \to 1$, causing (4.10) to be unbounded. For that case, we proposed the indented likelihood heuristic. More generally, $g(1)$ may be unbounded or zero. If $g(1)$ is unbounded then (4.9) tends to the indeterminate form $\infty/0$, so that the indented likelihood heuristic should still apply. However, if $g(1)$ is zero then (4.9) has the indeterminate form $0/0$. In this case, the limit as $v \to 1$ of the right-hand side of (4.9), if finite, would appear to be the most appropriate value assignment at $v = 1$. Such a limit might be estimated by extrapolating the trend of the values of $f^t(\mathbf{y}^t)$ associated with the ordered $v^t$-values.


Other heuristics could be proposed. For example, the $\mathbf{x}^t$-vectors might be stretched by a factor such as $1 + \varepsilon$ in order to prevent any data vector $\mathbf{y}^t$ from being an exactly optimal solution to its respective problem $P_t$. However, we have not observed the occurrence of these problem cases so far in our experiments with the method.
6.4. Use of other aggregators in the MDE technique
The sum or average aggregator could be used to
develop a model like the MMA model above. The
resulting model would require maximization of the
sum of the efficiency ratios. While the problem of
optimizing the sum of linear fractional functions
subject to linear constraints is challenging, some
algorithms have been proposed. Schaible and Shi
(2003) provide a recent survey.

7. Conclusions
We have defined a version of the Inverse LP
problem that we call Linear Programming System
Identification. This problem seeks to identify both
the objective function coefficient vector and the
constraint matrix of a linear programming problem that best fits a set of observed vector pairs.
One vector of each pair is considered an approximately optimal decision vector for the linear programming problem when the other vector is the
resource vector of that problem. An algorithm was
proposed for approximating a maximum likelihood solution. Results were illustrated for an
example in two dimensions. Some preliminary
simulation results for that example suggest that the
method may be promising and is worthy of a more
comprehensive simulation study for future research.

Acknowledgements
We wish to thank an anonymous referee for
many helpful suggestions. We are also indebted to
Dr. L.F. Cheung for stimulating discussions.

References
Ahuja, R.K., Orlin, J.B., 2001. Inverse optimization. Operations Research 49 (5), 771–783.
Barros, A.I., Frenk, J.B.G., Schaible, S., Zhang, S., 1996a. A
new algorithm for generalized fractional programs. Mathematical Programming 72, 147–175.
Barros, A.I., Frenk, J.B.G., Schaible, S., Zhang, S., 1996b.
Using duality to solve generalized fractional programming
problems. Journal of Global Optimization 8, 139–170.
Büeler, B., Enge, A., Fukuda, K., 1998. Exact volume computation for convex polytopes: A practical study. In: Kalai, G., Ziegler, G. (Eds.), Polytopes––Combinatorics and Computation. DMV-Seminars. Birkhäuser Verlag.
Cohen, J., Hickey, T., 1979. Two algorithms for determining
volumes of convex polyhedra. Journal of the Association for
Computing Machinery 26 (3), 401–414.
Crouzeix, J.J., Ferland, J.A., 1991. Algorithms for generalized
fractional programming. Mathematical Programming 52,
191–207.
Crouzeix, J.J., Ferland, J.A., Schaible, S., 1985. An algorithm
for generalized fractional programs. Journal of Optimization Theory and Applications 47, 35–49.
Gugat, M., 1996. A fast algorithm for a class of generalized
fractional programs. Management Science 42 (10), 1493–
1499.
Lawrence, J., 1991. Polytope volume computation. Mathematics of Computation 57 (196), 259–271.
Pardalos, P.U., Phillips, A.T., 1991. Global optimization of
fractional programs. Journal of Global Optimization 1,
173–182.
SAS/IML, 1989. Software: Usage and Reference. Version 6,
first ed. SAS Institute, Inc., Cary, NC.
Schaible, S., Shi, J., 2003. Fractional programming: The sum-of-ratios case. Optimization Methods and Software 18 (2),
219–229.
Tarantola, A., 1987. Inverse Problem Theory: Methods for
Data Fitting and Model Parameter Estimation. Elsevier,
Amsterdam.
Troutt, M.D., 1995. A maximum decisional efficiency estimation principle. Management Science 41, 76–82.
Troutt, M.D., Gribbin, D.W., Shanker, M., Zhang, A., 2000.
Cost efficiency benchmarking for operational units with
multiple cost drivers. Decision Sciences 31 (4), 813–832.
Troutt, M.D., Pang, W.-K., Hou, S.-H., 2003. Vertical Density
Representation and Its Applications. World Scientific Publishing Co. Pte. Ltd., Singapore.
Verschelde, J., Verlinden, P., Cools, R., 1994. Homotopies
exploiting Newton polytopes for solving sparse polynomial
systems. SIAM Journal on Numerical Analysis 31 (3), 915–
930.
Zhang, J., Liu, Z., 1996. Calculating some inverse linear
programming problems. Journal of Computational and
Applied Mathematics 72, 261–273.
Zeleny, M., 1986. Optimal system design with multiple criteria:
De novo programming approach. Engineering Costs and
Production Economics 10, 89–94.