Chapter 5
OPTIMIZATION OVER SETS DEFINED BY INEQUALITY
CONSTRAINTS: NONLINEAR PROGRAMMING
In many decision-making situations the assumption of linearity of the constraint inequalities in LP is quite restrictive. The linearity of the objective function is not restrictive, as the first exercise below shows. In Section 1 we present the general nonlinear programming problem NP and prove the Kuhn-Tucker theorem. Section 2 deals with duality theory for the case where appropriate convexity conditions are satisfied; two applications are given. Section 3 is devoted to the important special case of quadratic programming, and the last section to computational considerations.
5.1 Qualitative Theory of Nonlinear Programming
5.1.1 The problem and elementary results.
The general NP is a decision problem of the form:

Maximize $f_0(x)$
subject to $f_i(x) \le 0\,, \quad i = 1, \ldots, m,$ \hfill (5.1)

where $x \in R^n$ and $f_i : R^n \to R$, $i = 0, 1, \ldots, m$, are differentiable functions. As in Chapter 4, $x \in R^n$ is said to be a feasible solution if it satisfies the constraints of (5.1), and $\Omega \subset R^n$ is the subset of all feasible solutions; $x^* \in \Omega$ is said to be an optimal decision or optimal solution if $f_0(x^*) \ge f_0(x)$ for all $x \in \Omega$. From the discussion in 4.1.2 it is clear that equality constraints and sign constraints on some of the components of $x$ can all be transformed into the form (5.1). The next exercise shows that we could restrict ourselves to objective functions which are linear; however, we will not do this.
Exercise 1: Show that (5.2), with variables $y \in R$, $x \in R^n$, is equivalent to (5.1):

Maximize $y$
subject to $f_i(x) \le 0\,, \quad 1 \le i \le m\,, \quad \text{and} \quad y - f_0(x) \le 0\,.$ \hfill (5.2)
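The equivalence asserted in Exercise 1 can be checked by brute force on a small instance. In the sketch below the functions $f_0$, $f_1$ and the grids are hypothetical choices of mine, not from the text: both problems are discretized and their optimal values compared.

```python
# Brute-force comparison of (5.1) and (5.2) on a one-dimensional instance:
# maximize f0(x) = 1 - (x - 2)^2 subject to f1(x) = x - 3 <= 0.

def f0(x):
    return 1.0 - (x - 2.0) ** 2

def f1(x):
    return x - 3.0

xs = [i / 100.0 for i in range(-500, 501)]   # grid for x in [-5, 5]
ys = [j / 100.0 for j in range(-500, 501)]   # grid for y in [-5, 5]

# Problem (5.1): maximize f0 over the feasible x.
v1 = max(f0(x) for x in xs if f1(x) <= 0)

# Problem (5.2): maximize y over feasible (y, x); the constraint
# y - f0(x) <= 0 pins the best y at f0(x), so the two values agree.
v2 = max(y for x in xs if f1(x) <= 0
           for y in ys if y - f0(x) <= 0)

print(v1, v2)  # both equal 1.0, attained at x = 2
```

For each feasible $x$ the largest feasible $y$ in (5.2) is $y = f_0(x)$, which is exactly why the two optimal values coincide.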
Returning to problem (5.1), we are interested in obtaining conditions which any optimal decision must satisfy. The argument parallels very closely that developed in Exercise 1 of 4.1 and Exercise 1 of 4.2. The basic idea is to linearize the functions $f_i$ in a neighborhood of an optimal decision $x^*$.
Definition: Let $x$ be a feasible solution, and let $I(x) \subset \{1, \ldots, m\}$ be such that $f_i(x) = 0$ for $i \in I(x)$ and $f_i(x) < 0$ for $i \notin I(x)$. The set $I(x)$ is called the set of active constraints at $x$.

Definition:
(i) Let $x \in \Omega$. A vector $h \in R^n$ is said to be an admissible direction for $\Omega$ at $x$ if there exists a sequence $x^k$, $k = 1, 2, \ldots$, in $\Omega$ and a sequence of numbers $\varepsilon^k > 0$, $k = 1, 2, \ldots$, such that

$$\lim_{k \to \infty} x^k = x\,, \qquad \lim_{k \to \infty} \frac{1}{\varepsilon^k}\,(x^k - x) = h\,.$$

(ii) Let $C(\Omega, x) = \{h \mid h \text{ is an admissible direction for } \Omega \text{ at } x\}$. $C(\Omega, x)$ is called the tangent cone of $\Omega$ at $x$. Let $K(\Omega, x) = \{x + h \mid h \in C(\Omega, x)\}$. (See Figures 5.1 and 5.2 and compare them with Figures 4.1 and 4.2.) If we take $x^k = x$ and $\varepsilon^k = 1$ for all $k$, we see that $0 \in C(\Omega, x)$, so the tangent cone is always nonempty. Two more properties are stated below.
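The sequences required by the definition can be exhibited concretely. The following sketch uses a set $\Omega$ and a point of my own choosing (not from the text): the unit disk, a boundary point, and a tangential direction.

```python
import math

# For Omega = {x in R^2 : x1^2 + x2^2 - 1 <= 0} and the boundary point
# x = (1, 0), the direction h = (0, 1) is admissible: the sequence
# x^k = (cos(eps_k), sin(eps_k)) lies in Omega, x^k -> x, and
# (x^k - x)/eps_k -> h as eps_k -> 0.

x = (1.0, 0.0)
for k in range(1, 6):
    eps = 10.0 ** (-k)
    xk = (math.cos(eps), math.sin(eps))            # on the circle, hence in Omega
    q = ((xk[0] - x[0]) / eps, (xk[1] - x[1]) / eps)
    print(k, q)                                    # q approaches h = (0, 1)
```

Note that the straight ray $x + t h = (1, t)$ leaves $\Omega$, so the sequence must curve along the boundary; the definition only constrains the limit of the difference quotients.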
Exercise 2: (i) Show that $C(\Omega, x)$ is a cone, i.e., if $h \in C(\Omega, x)$ and $\theta \ge 0$, then $\theta h \in C(\Omega, x)$. (ii) Show that $C(\Omega, x)$ is a closed subset of $R^n$. (Hint for (ii): For $m = 1, 2, \ldots$, let $h^m$ and $\{x^{mk}, \varepsilon^{mk} > 0\}_{k=1}^{\infty}$ be such that $x^{mk} \to x$ and $\frac{1}{\varepsilon^{mk}}(x^{mk} - x) \to h^m$ as $k \to \infty$. Suppose that $h^m \to h$ as $m \to \infty$. Show that there exist subsequences $\{x^{mk_m}, \varepsilon^{mk_m}\}_{m=1}^{\infty}$ such that $x^{mk_m} \to x$ and $\frac{1}{\varepsilon^{mk_m}}(x^{mk_m} - x) \to h$ as $m \to \infty$.)

In the definition of $C(\Omega, x)$ we made no use of the particular functional description of $\Omega$. The following elementary result is more interesting in this light and should be compared with (2.18) in Chapter 2 and Exercise 1 of 4.1.
Lemma 1: Suppose $x^* \in \Omega$ is an optimum decision for (5.1). Then

$$f_{0x}(x^*)\, h \le 0 \quad \text{for all } h \in C(\Omega, x^*)\,. \eqno{(5.3)}$$

Proof: Let $x^k \in \Omega$, $\varepsilon^k > 0$, $k = 1, 2, 3, \ldots$, be such that
[Figure 5.1: $\Omega = PQR$. The figure shows the feasible set $\Omega$ with corners $P$, $Q$, $R$ bounded by the curves $\{x \mid f_i(x) = 0\}$, $i = 1, 2, 3$, together with the level sets $\pi_k = \{x \mid f_0(x) = k\}$ and the direction of increasing payoff.]
$$\lim_{k \to \infty} x^k = x^*\,, \qquad \lim_{k \to \infty} \frac{1}{\varepsilon^k}\,(x^k - x^*) = h\,. \eqno{(5.4)}$$

Note that in particular (5.4) implies

$$\lim_{k \to \infty} \frac{1}{\varepsilon^k}\, |x^k - x^*| = |h|\,. \eqno{(5.5)}$$

Since $f_0$ is differentiable, by Taylor's theorem we have

$$f_0(x^k) = f_0(x^* + (x^k - x^*)) = f_0(x^*) + f_{0x}(x^*)(x^k - x^*) + o(|x^k - x^*|)\,. \eqno{(5.6)}$$

Since $x^k \in \Omega$ and $x^*$ is optimal, we have $f_0(x^k) \le f_0(x^*)$, so that

$$0 \ge f_{0x}(x^*)\, \frac{x^k - x^*}{\varepsilon^k} + \frac{o(|x^k - x^*|)}{\varepsilon^k}\,.$$

Taking limits as $k \to \infty$, using (5.4) and (5.5), we can see that

$$0 \ge \lim_{k \to \infty} f_{0x}(x^*)\, \frac{x^k - x^*}{\varepsilon^k} + \lim_{k \to \infty} \frac{o(|x^k - x^*|)}{|x^k - x^*|}\, \lim_{k \to \infty} \frac{|x^k - x^*|}{\varepsilon^k} = f_{0x}(x^*)\, h\,. \quad \diamondsuit$$
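Inequality (5.3) can be spot-checked numerically. The sketch below uses an example of my own construction, not from the text: maximize $f_0(x) = x_1 + x_2$ over the unit disk, where the optimum is $x^* = (1/\sqrt{2}, 1/\sqrt{2})$ and the tangent cone is the half-plane $\{h \mid \langle h, x^* \rangle \le 0\}$.

```python
import math
import random

# f0(x) = x1 + x2 on Omega = {x : x1^2 + x2^2 <= 1}; the maximizer is
# x* = (1/sqrt(2), 1/sqrt(2)) with gradient f0x(x*) = (1, 1).  The tangent
# cone at x* is {h : <h, x*> <= 0}.  Lemma 1 predicts f0x(x*) h <= 0 there.

xstar = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
grad = (1.0, 1.0)

random.seed(0)
violations = 0
for _ in range(10_000):
    h = (random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0))
    if h[0] * xstar[0] + h[1] * xstar[1] <= 0.0:      # h in C(Omega, x*)
        if grad[0] * h[0] + grad[1] * h[1] > 1e-12:   # would violate (5.3)
            violations += 1

print("violations of (5.3):", violations)  # prints 0
```

Here the check succeeds exactly, because $\langle h, x^* \rangle = (h_1 + h_2)/\sqrt{2}$ has the same sign as the directional derivative $h_1 + h_2$.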
[Figure 5.2: $C(\Omega, x^*)$ is the tangent cone of $\Omega$ at $x^*$, and $K(\Omega, x^*) = \{x^* + h \mid h \in C(\Omega, x^*)\}$.]

The basic problem that remains is to characterize the set $C(\Omega, x^*)$ in terms of the derivatives of the functions $f_i$. Then we can apply Farkas' Lemma just as in Exercise 1 of 4.2.

Lemma 2: Let $x^* \in \Omega$. Then

$$C(\Omega, x^*) \subset \{h \mid f_{ix}(x^*)\, h \le 0 \text{ for all } i \in I(x^*)\}\,. \eqno{(5.7)}$$
Proof: Let $h \in R^n$ and $x^k \in \Omega$, $\varepsilon^k > 0$, $k = 1, 2, \ldots$, satisfy (5.4). Since $f_i$ is differentiable, by Taylor's theorem we have

$$f_i(x^k) = f_i(x^*) + f_{ix}(x^*)(x^k - x^*) + o(|x^k - x^*|)\,.$$

Since $x^k \in \Omega$, $f_i(x^k) \le 0$, and if $i \in I(x^*)$, $f_i(x^*) = 0$, so that $f_i(x^k) \le f_i(x^*)$. Following the proof of Lemma 1 we can conclude that $0 \ge f_{ix}(x^*)\, h$. $\diamondsuit$
Lemma 2 gives us a partial characterization of $C(\Omega, x^*)$. Unfortunately, in general the inclusion sign in (5.7) cannot be reversed. The main reason for this is that the set $\{f_{ix}(x^*) \mid i \in I(x^*)\}$ is not in general linearly independent.
Exercise 3: Let $x \in R^2$, $f_1(x_1, x_2) = (x_1 - 1)^3 + x_2$, and $f_2(x_1, x_2) = -x_2$. Let $(x_1^*, x_2^*) = (1, 0)$. Then $I(x^*) = \{1, 2\}$. Show that

$$C(\Omega, x^*) \ne \{h \mid f_{ix}(x^*)\, h \le 0\,, \ i = 1, 2\}\,.$$

Note that $\{f_{1x}(x^*), f_{2x}(x^*)\}$ is not a linearly independent set; see Lemma 4 below.
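The failure of equality in Exercise 3 can be sketched numerically (the finite-difference step and test values below are my own choices): the two gradients at $x^*$ are $(0, 1)$ and $(0, -1)$, so the right-hand side is the whole line $\{(h_1, 0)\}$, yet the direction $h = (1, 0)$ points immediately out of $\Omega$.

```python
def f1(x1, x2):
    return (x1 - 1.0) ** 3 + x2

def f2(x1, x2):
    return -x2

# Central-difference gradients at x* = (1, 0): expect (0, 1) and (0, -1),
# a linearly dependent pair.
def grad(f, x1, x2, d=1e-6):
    return ((f(x1 + d, x2) - f(x1 - d, x2)) / (2.0 * d),
            (f(x1, x2 + d) - f(x1, x2 - d)) / (2.0 * d))

g1 = grad(f1, 1.0, 0.0)
g2 = grad(f2, 1.0, 0.0)

# h = (1, 0) satisfies g1 . h = g2 . h = 0, so it lies in the right-hand
# side of (5.7); but f1(1 + t, 0) = t^3 > 0 for t > 0, and in fact no
# feasible point has x1 > 1 (feasibility needs 0 <= x2 <= -(x1 - 1)^3),
# so h cannot be an admissible direction.
ray_infeasible = all(f1(1.0 + t, 0.0) > 0.0 for t in (0.1, 0.01, 0.001))
print(g1, g2, ray_infeasible)
```

So the tangent cone is only the half-line $\{(h_1, 0) \mid h_1 \le 0\}$, strictly smaller than the set defined by the gradient inequalities.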
5.1.2 Kuhn-Tucker Theorem.