
Chapter 5

OPTIMIZATION OVER SETS DEFINED BY INEQUALITY CONSTRAINTS: NONLINEAR PROGRAMMING

In many decision-making situations the assumption of linearity of the constraint inequalities in LP is quite restrictive. The linearity of the objective function is not restrictive, as shown in the first exercise below. In Section 1 we present the general nonlinear programming problem (NP) and prove the Kuhn-Tucker theorem. Section 2 deals with duality theory for the case where appropriate convexity conditions are satisfied; two applications are given. Section 3 is devoted to the important special case of quadratic programming. The last section is devoted to computational considerations.

5.1 Qualitative Theory of Nonlinear Programming

5.1.1 The problem and elementary results.

The general NP is a decision problem of the form:

Maximize $f_0(x)$,
subject to $f_i(x) \le 0$, $i = 1, \ldots, m$,   (5.1)

where $x \in R^n$ and $f_i : R^n \to R$, $i = 0, 1, \ldots, m$, are differentiable functions. As in Chapter 4, $x \in R^n$ is said to be a feasible solution if it satisfies the constraints of (5.1), and $\Omega \subset R^n$ is the subset of all feasible solutions; $x^* \in \Omega$ is said to be an optimal decision or optimal solution if $f_0(x^*) \ge f_0(x)$ for all $x \in \Omega$. From the discussion in 4.1.2 it is clear that equality constraints and sign constraints on some of the components of $x$ can all be transformed into the form (5.1). The next exercise shows that we could restrict ourselves to objective functions which are linear; however, we will not do this.

Exercise 1: Show that (5.2), with variables $y \in R$, $x \in R^n$, is equivalent to (5.1):

Maximize $y$,
subject to $f_i(x) \le 0$, $1 \le i \le m$, and $y - f_0(x) \le 0$.   (5.2)

Returning to problem (5.1), we are interested in obtaining conditions which any optimal decision must satisfy. The argument parallels very closely that developed in Exercise 1 of 4.1 and Exercise 1 of 4.2. The basic idea is to linearize the functions $f_i$ in a neighborhood of an optimal decision $x^*$.

Definition: Let $x$ be a feasible solution, and let $I(x) \subset \{1, \ldots, m\}$ be such that $f_i(x) = 0$ for $i \in I(x)$ and $f_i(x) < 0$ for $i \notin I(x)$. The set $I(x)$ is called the set of active constraints at $x$.

Definition: (i) Let $x \in \Omega$. A vector $h \in R^n$ is said to be an admissible direction for $\Omega$ at $x$ if there exist a sequence $x^k$, $k = 1, 2, \ldots$, in $\Omega$ and a sequence of numbers $\varepsilon^k > 0$, $k = 1, 2, \ldots$, such that

$\lim_{k\to\infty} x^k = x$, $\quad \lim_{k\to\infty} \frac{1}{\varepsilon^k}(x^k - x) = h$.

(ii) Let $C(\Omega, x) = \{h \mid h$ is an admissible direction for $\Omega$ at $x\}$. $C(\Omega, x)$ is called the tangent cone of $\Omega$ at $x$. Let $K(\Omega, x) = \{x + h \mid h \in C(\Omega, x)\}$.

See Figures 5.1 and 5.2 and compare them with Figures 4.1 and 4.2. If we take $x^k = x$ and $\varepsilon^k = 1$ for all $k$, we see that $0 \in C(\Omega, x)$, so that the tangent cone is always nonempty.
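The definition of an admissible direction can be checked numerically on a concrete instance. The sketch below is illustrative only (the set $\Omega$, the point $x$, and the direction $h$ are my choices, not from the text): taking $\Omega$ to be the closed unit disk and $x = (1, 0)$ on its boundary, it exhibits sequences $x^k \in \Omega$ and $\varepsilon^k > 0$ for which the difference quotients $\frac{1}{\varepsilon^k}(x^k - x)$ approach $h = (0, 1)$.

```python
import math

# Illustrative instance (not from the text):
#   Omega = {x in R^2 : f1(x) = x1^2 + x2^2 - 1 <= 0}  (closed unit disk),
#   x = (1, 0) on the boundary, candidate direction h = (0, 1).
f1 = lambda x: x[0] ** 2 + x[1] ** 2 - 1.0
x = (1.0, 0.0)
h = (0.0, 1.0)

# Sequences required by the definition: x^k on the boundary arc of Omega,
# eps^k = 1/k > 0.  The quotient (x^k - x)/eps^k should converge to h.
for k in (10, 1000, 100000):
    t = 1.0 / k
    xk = (math.cos(t), math.sin(t))   # x^k lies in Omega (on the circle)
    assert f1(xk) <= 1e-12            # feasibility of x^k
    q = ((xk[0] - x[0]) / t, (xk[1] - x[1]) / t)

print(q)  # approximately (0.0, 1.0) for the largest k
```

Any curve in $\Omega$ leaving $x$ with one-sided tangent $h$ would serve equally well; the circular arc is merely a convenient parametrization.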
Two more properties are stated below.

Exercise 2: (i) Show that $C(\Omega, x)$ is a cone, i.e., if $h \in C(\Omega, x)$ and $\theta \ge 0$, then $\theta h \in C(\Omega, x)$. (ii) Show that $C(\Omega, x)$ is a closed subset of $R^n$. (Hint for (ii): For $m = 1, 2, \ldots$, let $h^m$ and $\{x^{mk}, \varepsilon^{mk} > 0\}_{k=1}^{\infty}$ be such that $x^{mk} \to x$ and $\frac{1}{\varepsilon^{mk}}(x^{mk} - x) \to h^m$ as $k \to \infty$. Suppose that $h^m \to h$ as $m \to \infty$. Show that there exist subsequences $\{x^{mk_m}, \varepsilon^{mk_m}\}_{m=1}^{\infty}$ such that $x^{mk_m} \to x$ and $\frac{1}{\varepsilon^{mk_m}}(x^{mk_m} - x) \to h$ as $m \to \infty$.)

In the definition of $C(\Omega, x)$ we made no use of the particular functional description of $\Omega$. The following elementary result is more interesting in this light and should be compared with (2.18) in Chapter 2 and Exercise 1 of 4.1.

[Figure 5.1: $\Omega = PQR$, bounded by the curves $\{x \mid f_i(x) = 0\}$, $i = 1, 2, 3$; also shown are the level sets $\pi_k = \{x \mid f_0(x) = k\}$ and the direction of increasing payoff at $x^*$.]

Lemma 1: Suppose $x^* \in \Omega$ is an optimal decision for (5.1). Then

$f_{0x}(x^*) h \le 0$ for all $h \in C(\Omega, x^*)$.   (5.3)

Proof: Let $h \in C(\Omega, x^*)$, and let $x^k \in \Omega$, $\varepsilon^k > 0$, $k = 1, 2, 3, \ldots$, be such that

$\lim_{k\to\infty} x^k = x^*$, $\quad \lim_{k\to\infty} \frac{1}{\varepsilon^k}(x^k - x^*) = h$.   (5.4)

Note that in particular (5.4) implies

$\lim_{k\to\infty} \frac{1}{\varepsilon^k} |x^k - x^*| = |h|$.   (5.5)

Since $f_0$ is differentiable, by Taylor's theorem we have

$f_0(x^k) = f_0(x^* + (x^k - x^*)) = f_0(x^*) + f_{0x}(x^*)(x^k - x^*) + o(|x^k - x^*|)$.   (5.6)

Since $x^k \in \Omega$ and $x^*$ is optimal, we have $f_0(x^k) \le f_0(x^*)$, so that

$0 \ge f_{0x}(x^*) \frac{x^k - x^*}{\varepsilon^k} + \frac{o(|x^k - x^*|)}{\varepsilon^k}$.

Taking limits as $k \to \infty$, using (5.4) and (5.5), we can see that

$0 \ge \lim_{k\to\infty} f_{0x}(x^*) \frac{x^k - x^*}{\varepsilon^k} + \lim_{k\to\infty} \frac{o(|x^k - x^*|)}{|x^k - x^*|} \lim_{k\to\infty} \frac{|x^k - x^*|}{\varepsilon^k} = f_{0x}(x^*) h$. ♦

[Figure 5.2: $C(\Omega, x^*)$ is the tangent cone of $\Omega$ at $x^*$, and $K(\Omega, x^*) = \{x^* + h \mid h \in C(\Omega, x^*)\}$.]

The basic problem that remains is to characterize the set $C(\Omega, x^*)$ in terms of the derivatives of the functions $f_i$. Then we can apply Farkas' Lemma just as in Exercise 1 of 4.2.

Lemma 2: Let $x^* \in \Omega$. Then

$C(\Omega, x^*) \subset \{h \mid f_{ix}(x^*) h \le 0$ for all $i \in I(x^*)\}$.   (5.7)

Proof: Let $h \in R^n$ and $x^k \in \Omega$, $\varepsilon^k > 0$, $k = 1, 2, \ldots$
, satisfy (5.4). Since $f_i$ is differentiable, by Taylor's theorem we have

$f_i(x^k) = f_i(x^*) + f_{ix}(x^*)(x^k - x^*) + o(|x^k - x^*|)$.

Since $x^k \in \Omega$, $f_i(x^k) \le 0$; and if $i \in I(x^*)$, then $f_i(x^*) = 0$, so that $f_i(x^k) \le f_i(x^*)$. Following the proof of Lemma 1 we can conclude that $0 \ge f_{ix}(x^*) h$. ♦

Lemma 2 gives us a partial characterization of $C(\Omega, x^*)$. Unfortunately, in general the inclusion sign in (5.7) cannot be reversed. The main reason for this is that the set $\{f_{ix}(x^*) \mid i \in I(x^*)\}$ is not in general linearly independent.

Exercise 3: Let $x \in R^2$, $f_1(x_1, x_2) = (x_1 - 1)^3 + x_2$, and $f_2(x_1, x_2) = -x_2$. Let $(x_1^*, x_2^*) = (1, 0)$. Then $I(x^*) = \{1, 2\}$. Show that

$C(\Omega, x^*) \ne \{h \mid f_{ix}(x^*) h \le 0, \; i = 1, 2\}$.

(Note that $\{f_{1x}(x^*), f_{2x}(x^*)\}$ is not a linearly independent set; see Lemma 4 below.)
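The failure of equality in (5.7) for Exercise 3 can be seen numerically. In this example every feasible point satisfies $x_2 \ge 0$ and $x_2 \le -(x_1 - 1)^3$, hence $x_1 \le 1$, so no admissible direction at $x^* = (1, 0)$ can have $h_1 > 0$. The sketch below (helper names are mine) checks that $h = (1, 0)$ nevertheless satisfies both linearized inequalities, while every point $x^* + \varepsilon h$ with $\varepsilon > 0$ violates $f_1 \le 0$; the opposite direction $h = (-1, 0)$ stays feasible along its ray.

```python
# Numeric illustration of Exercise 3: at x* = (1, 0) the linearized cone
# {h : f_ix(x*) h <= 0, i = 1, 2} strictly contains the tangent cone.
f1 = lambda x: (x[0] - 1.0) ** 3 + x[1]
f2 = lambda x: -x[1]
grad_f1 = (0.0, 1.0)    # f_1x(x*) = (3(x1 - 1)^2, 1) evaluated at x* = (1, 0)
grad_f2 = (0.0, -1.0)   # f_2x(x*); note grad_f1 = -grad_f2 (linearly dependent)

def in_linearized_cone(h):
    return (grad_f1[0] * h[0] + grad_f1[1] * h[1] <= 0
            and grad_f2[0] * h[0] + grad_f2[1] * h[1] <= 0)

def feasible(x):
    return f1(x) <= 0 and f2(x) <= 0

h_plus, h_minus = (1.0, 0.0), (-1.0, 0.0)
# Both directions satisfy the linearized constraints...
assert in_linearized_cone(h_plus) and in_linearized_cone(h_minus)
# ...but every point x* + eps * h_plus with eps > 0 is infeasible
# (f1 = eps^3 > 0), consistent with h_plus not being admissible,
for eps in (1e-1, 1e-3, 1e-6):
    assert not feasible((1.0 + eps, 0.0))
    assert feasible((1.0 - eps, 0.0))   # while the ray along h_minus is feasible.
```

The ray check alone is only suggestive (admissible directions are defined via arbitrary sequences in $\Omega$, not rays); the decisive fact is the constraint $x_1 \le 1$ noted above.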

5.1.2 Kuhn-Tucker Theorem.