
and since $f_i(x^*) = 0$ whereas $f_{ix}(x^*)(h + \delta h^*) < 0$, we can conclude that $f_i(x^k) < 0$ for sufficiently large $k$. Thus, $x^k \in \Omega$ for sufficiently large $k$. Hence, $h + \delta h^* \in C(\Omega, x^*)$. To finish the proof we note that $\delta > 0$ can be made arbitrarily small, and $C(\Omega, x^*)$ is closed by Exercise 2, so that $h \in C(\Omega, x^*)$. ♦

The next lemma applies to the formulation (5.9). Its proof is left as an exercise since it is very similar to the proof of Lemma 4.

Lemma 5: Suppose $x^*$ is feasible for (5.9) and suppose there exists $h^* \in R^n$ such that the set $\{f_{ix}(x^*) \mid i \in I(x^*),\ f_{ix}(x^*)h^* = 0\} \cup \{r_{jx}(x^*) \mid j = 1, \ldots, k\}$ is linearly independent, and $f_{ix}(x^*)h^* \le 0$ for $i \in I(x^*)$, $r_{jx}(x^*)h^* = 0$ for $1 \le j \le k$. Then CQ is satisfied at $x^*$.

Exercise 10: Prove Lemma 5.

5.2 Duality Theory

Duality theory is perhaps the most beautiful part of nonlinear programming. It has resulted in many applications within nonlinear programming, in terms of suggesting important computational algorithms, and it has provided many unifying conceptual insights into economics and management science. We can only present some of the basic results here, and even so some of the proofs are relegated to the Appendix at the end of this chapter since they depend on advanced material. However, we will give some geometric insight. In 2.3 we give some applications of duality theory, and in 2.2 we refer to some of the important generalizations. The results in 2.1 should be compared with Theorems 1 and 4 of 4.2.1 and the results in 4.2.3.

It may be useful to note in the following discussion that most of the results do not require differentiability of the various functions.

5.2.1 Basic results.

Consider problem (5.17), which we call the primal problem:

$$\text{Maximize } f_0(x) \text{ subject to } f_i(x) \le \hat b_i,\ 1 \le i \le m,\ x \in X, \qquad (5.17)$$

where $x \in R^n$, $f_i : R^n \to R$, $1 \le i \le m$, are given convex functions, $f_0 : R^n \to R$ is a given concave function, $X$ is a given convex subset of $R^n$, and $\hat b = (\hat b_1, \ldots, \hat b_m)'$ is a given vector. For convenience, let $f = (f_1, \ldots, f_m)' : R^n \to R^m$.

We wish to examine the behavior of the maximum value of (5.17) as $\hat b$ varies. So we define

$$\Omega(b) = \{x \mid x \in X,\ f(x) \le b\}, \quad B = \{b \mid \Omega(b) \ne \emptyset\},$$

and $M : B \to R \cup \{+\infty\}$ by

$$M(b) = \sup\{f_0(x) \mid x \in X,\ f(x) \le b\} = \sup\{f_0(x) \mid x \in \Omega(b)\},$$

so that in particular if $x^*$ is an optimal solution of (5.17) then $M(\hat b) = f_0(x^*)$.

We also need to consider the following problem. Let $\lambda \in R^m$, $\lambda \ge 0$, be fixed:

$$\text{Maximize } f_0(x) - \lambda'(f(x) - \hat b) \text{ subject to } x \in X, \qquad (5.18)$$

and define

$$m(\lambda) = \sup\{f_0(x) - \lambda'(f(x) - \hat b) \mid x \in X\}.$$

Problem (5.19) is called the dual problem:

$$\text{Minimize } m(\lambda) \text{ subject to } \lambda \ge 0. \qquad (5.19)$$

Let $m^* = \inf\{m(\lambda) \mid \lambda \ge 0\}$.

Remark 1: The set $X$ in (5.17) is usually equal to $R^n$, and then, of course, there is no reason to separate it out. However, it is sometimes possible to include some of the constraints in $X$ in such a way that the calculation of $m(\lambda)$ by (5.18) and the solution of the dual problem (5.19) become simple. For examples see the problems discussed in Sections 2.3.1 and 2.3.2 below.

Remark 2: It is sometimes useful to know that Lemmas 1 and 2 below hold without any convexity conditions on $f_0$, $f$, $X$. Lemma 1 shows that the cost function of the dual problem is convex, which is useful information since there are computational techniques which apply to convex cost functions but not to arbitrary nonlinear cost functions. Lemma 2 shows that the optimum value of the dual problem is always an upper bound for the optimum value of the primal.

Lemma 1: $m : R^m_+ \to R \cup \{+\infty\}$ is a convex function. Here $R^m_+ = \{\lambda \in R^m \mid \lambda \ge 0\}$.

Exercise 1: Prove Lemma 1.
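To make the primal-dual pair concrete, here is a minimal numerical sketch on an instance invented for illustration (it does not appear in the text): maximize the concave $f_0(x) = -(x-2)^2$ subject to the convex constraint $f_1(x) = x \le \hat b = 1$, with $X = R$. The code approximates $m(\lambda)$ by maximizing over a grid standing in for $X$, and compares $m^* = \inf\{m(\lambda) \mid \lambda \ge 0\}$ with $M(\hat b)$.

```python
import numpy as np

# Illustrative instance (not from the text): maximize f0(x) = -(x-2)^2
# subject to f1(x) = x <= b_hat, with X = R (truncated to a grid here).
b_hat = 1.0
xs = np.linspace(-5.0, 5.0, 20001)        # grid standing in for X = R
f0 = -(xs - 2.0) ** 2                     # concave objective
f1 = xs                                   # convex constraint function

# Primal optimal value M(b_hat) = sup{f0(x) | f1(x) <= b_hat}.
M_hat = f0[f1 <= b_hat].max()             # analytically -(1-2)^2 = -1

# Dual function m(lambda) = sup{f0(x) - lambda*(f1(x) - b_hat) | x in X}.
def m(lam):
    return (f0 - lam * (f1 - b_hat)).max()

lams = np.linspace(0.0, 10.0, 2001)
m_vals = np.array([m(l) for l in lams])
m_star = m_vals.min()                     # attained near lambda = 2

# Weak duality (5.20): M(b_hat) <= m(lambda) for every lambda >= 0.
assert np.all(m_vals >= M_hat - 1e-9)
print(M_hat, m_star)                      # both approximately -1
```

For this instance $m(\lambda) = \lambda^2/4 - \lambda$, minimized at $\lambda = 2$ with $m^* = -1 = M(\hat b)$, so the duality gap is zero.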
Lemma 2: (Weak duality) If $x$ is feasible for (5.17), i.e., $x \in \Omega(\hat b)$, and if $\lambda \ge 0$, then

$$f_0(x) \le M(\hat b) \le m^* \le m(\lambda). \qquad (5.20)$$

Proof: Since $f(x) - \hat b \le 0$ and $\lambda \ge 0$, we have $\lambda'(f(x) - \hat b) \le 0$. So,

$$f_0(x) \le f_0(x) - \lambda'(f(x) - \hat b), \text{ for } x \in \Omega(\hat b),\ \lambda \ge 0.$$

Hence

$$f_0(x) \le \sup\{f_0(x) \mid x \in \Omega(\hat b)\} = M(\hat b) \le \sup\{f_0(x) - \lambda'(f(x) - \hat b) \mid x \in \Omega(\hat b)\},$$

and since $\Omega(\hat b) \subset X$,

$$\le \sup\{f_0(x) - \lambda'(f(x) - \hat b) \mid x \in X\} = m(\lambda).$$

Thus, we have

$$f_0(x) \le M(\hat b) \le m(\lambda) \text{ for } x \in \Omega(\hat b),\ \lambda \ge 0,$$

and since $M(\hat b)$ is independent of $\lambda$, if we take the infimum with respect to $\lambda \ge 0$ in the right-hand inequality we get (5.20). ♦

The basic problem of Duality Theory is to determine conditions under which $M(\hat b) = m^*$ in (5.20). We first give a simple sufficiency condition.

Definition: A pair $(\hat x, \hat\lambda)$ with $\hat x \in X$ and $\hat\lambda \ge 0$ is said to satisfy the optimality conditions if

$\hat x$ is an optimal solution of (5.18) with $\lambda = \hat\lambda$, (5.21)

$\hat x$ is feasible for (5.17), i.e., $f_i(\hat x) \le \hat b_i$ for $i = 1, \ldots, m$, (5.22)

$\hat\lambda_i = 0$ when $f_i(\hat x) < \hat b_i$; equivalently, $\hat\lambda'(f(\hat x) - \hat b) = 0$. (5.23)

$\hat\lambda \ge 0$ is said to be an optimal price vector if there is $\hat x \in X$ such that $(\hat x, \hat\lambda)$ satisfy the optimality conditions. Note that in this case $\hat x \in \Omega(\hat b)$ by virtue of (5.22).

The next result is equivalent to Theorem 4(ii) of Section 1 if $X = R^n$ and $f_i$, $0 \le i \le m$, are differentiable.

Theorem 1: (Sufficiency) If $(\hat x, \hat\lambda)$ satisfy the optimality conditions, then $\hat x$ is an optimal solution to the primal, $\hat\lambda$ is an optimal solution to the dual, and $M(\hat b) = m^*$.

Proof: Let $x \in \Omega(\hat b)$, so that $\hat\lambda'(f(x) - \hat b) \le 0$. Then

$$f_0(x) \le f_0(x) - \hat\lambda'(f(x) - \hat b)$$
$$\le \sup\{f_0(x) - \hat\lambda'(f(x) - \hat b) \mid x \in X\}$$
$$= f_0(\hat x) - \hat\lambda'(f(\hat x) - \hat b) \quad \text{by (5.21)}$$
$$= f_0(\hat x) \quad \text{by (5.23)},$$

so that $\hat x$ is optimal for the primal, and hence by definition $f_0(\hat x) = M(\hat b)$. Also,

$$m(\hat\lambda) = f_0(\hat x) - \hat\lambda'(f(\hat x) - \hat b) = f_0(\hat x) = M(\hat b),$$

so that by weak duality $\hat\lambda$ is optimal for the dual. ♦

We now proceed to a much more detailed investigation.

Lemma 3: $B$ is a convex subset of $R^m$, and $M : B \to R \cup \{+\infty\}$ is a concave function.

Proof: Let $b$, $\tilde b$ belong to $B$, let $x \in \Omega(b)$, $\tilde x \in \Omega(\tilde b)$, and let $0 \le \theta \le 1$.
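The optimality conditions (5.21)-(5.23) can be checked mechanically. Here is a minimal sketch on an invented instance (maximize $f_0(x) = -(x-2)^2$ subject to $f_1(x) = x \le \hat b = 1$, $X = R$), with the hypothetical candidate pair $(\hat x, \hat\lambda) = (1, 2)$:

```python
import numpy as np

# Hypothetical instance (invented for illustration): maximize f0(x) = -(x-2)^2
# subject to f1(x) = x <= b_hat = 1, X = R, with candidate pair (x_hat, lam_hat).
b_hat = 1.0
x_hat, lam_hat = 1.0, 2.0
f0 = lambda x: -(x - 2.0) ** 2
f1 = lambda x: x

xs = np.linspace(-5.0, 5.0, 20001)        # grid standing in for X = R

# (5.21): x_hat maximizes the Lagrangian f0(x) - lam_hat*(f1(x) - b_hat) over X.
lagr = f0(xs) - lam_hat * (f1(xs) - b_hat)
assert f0(x_hat) - lam_hat * (f1(x_hat) - b_hat) >= lagr.max() - 1e-9

# (5.22): feasibility.
assert f1(x_hat) <= b_hat

# (5.23): complementary slackness.
assert lam_hat * (f1(x_hat) - b_hat) == 0.0

# Theorem 1 then guarantees x_hat is primal optimal: f0(x_hat) = M(b_hat).
M_hat = f0(xs[f1(xs) <= b_hat]).max()
assert abs(f0(x_hat) - M_hat) < 1e-9
```

All three conditions hold, so by Theorem 1 the pair is optimal for both problems without any further search.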
Then $\theta x + (1-\theta)\tilde x \in X$ since $X$ is convex, and $f_i(\theta x + (1-\theta)\tilde x) \le \theta f_i(x) + (1-\theta) f_i(\tilde x)$ since $f_i$ is convex, so that

$$f_i(\theta x + (1-\theta)\tilde x) \le \theta b_i + (1-\theta)\tilde b_i, \qquad (5.24)$$

hence $\theta x + (1-\theta)\tilde x \in \Omega(\theta b + (1-\theta)\tilde b)$, and therefore $B$ is convex. Also, since $f_0$ is concave,

$$f_0(\theta x + (1-\theta)\tilde x) \ge \theta f_0(x) + (1-\theta) f_0(\tilde x).$$

Since (5.24) holds for all $x \in \Omega(b)$ and $\tilde x \in \Omega(\tilde b)$, it follows that

$$M(\theta b + (1-\theta)\tilde b) \ge \sup\{f_0(\theta x + (1-\theta)\tilde x) \mid x \in \Omega(b),\ \tilde x \in \Omega(\tilde b)\}$$
$$\ge \theta \sup\{f_0(x) \mid x \in \Omega(b)\} + (1-\theta)\sup\{f_0(\tilde x) \mid \tilde x \in \Omega(\tilde b)\} = \theta M(b) + (1-\theta)M(\tilde b). ♦$$

Definition: Let $X \subset R^n$ and let $g : X \to R \cup \{\infty, -\infty\}$. A vector $\lambda \in R^n$ is said to be a supergradient (subgradient) of $g$ at $\hat x \in X$ if

$$g(x) \le g(\hat x) + \lambda'(x - \hat x) \text{ for } x \in X$$

(respectively, $g(x) \ge g(\hat x) + \lambda'(x - \hat x)$ for $x \in X$). See Figure 5.3.

Figure 5.3: Illustration of supergradient and of stability. (Three panels: $M$ is not stable at $\hat b$; $M$ is stable at $\hat b$; $\lambda$ is a supergradient at $\hat b$, the line $M(\hat b) + \lambda'(b - \hat b)$ lying above the graph of $M$.)

Definition: The function $M : B \to R \cup \{\infty\}$ is said to be stable at $\hat b \in B$ if there exists a real number $K$ such that

$$M(b) \le M(\hat b) + K|b - \hat b| \text{ for } b \in B.$$

In words, $M$ is stable at $\hat b$ if $M$ does not increase infinitely steeply in a neighborhood of $\hat b$. See Figure 5.3.

A more geometric way of thinking about supergradients is the following. Define the subset $A \subset R^{1+m}$ by

$$A = \{(r, b) \mid b \in B,\ r \le M(b)\}.$$

Thus $A$ is the set lying "below" the graph of $M$. We call $A$ the hypograph¹ of $M$. Since $M$ is concave, it follows immediately that $A$ is convex (in fact these are equivalent statements).

Definition: A vector $(\lambda_0, \lambda_1, \ldots, \lambda_m)$ is said to be the normal to a hyperplane supporting $A$ at a point $(\hat r, \hat b)$ if

$$\lambda_0 \hat r + \sum_{i=1}^m \lambda_i \hat b_i \ge \lambda_0 r + \sum_{i=1}^m \lambda_i b_i \text{ for all } (r, b) \in A. \qquad (5.25)$$

In words, $A$ lies below the hyperplane $\hat\pi = \{(r, b) \mid \lambda_0 r + \sum \lambda_i b_i = \lambda_0 \hat r + \sum \lambda_i \hat b_i\}$. The supporting hyperplane is said to be non-vertical if $\lambda_0 \ne 0$. See Figure 5.4.

Exercise 2: Show that if $\hat b \in B$, $\tilde b \ge \hat b$, and $\tilde r \le M(\hat b)$, then $\tilde b \in B$, $M(\tilde b) \ge M(\hat b)$, and $(\tilde r, \tilde b) \in A$.
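Here is a minimal numerical sketch of the supergradient and stability definitions, on an invented instance: for the problem of maximizing $-(x-2)^2$ subject to $x \le b$, the value function is $M(b) = -(b-2)^2$ for $b \le 2$ and $M(b) = 0$ for $b > 2$; at $\hat b = 1$ the slope $\lambda = 2$ is a supergradient, and it also supplies the constant $K = |\lambda|$ in the stability bound:

```python
import numpy as np

# Value function of a hypothetical instance (invented for illustration):
# M(b) = sup{-(x-2)^2 | x <= b} = -(b-2)^2 for b <= 2, and 0 for b > 2.
def M(b):
    return np.where(b <= 2.0, -(b - 2.0) ** 2, 0.0)

b_hat, lam = 1.0, 2.0            # lam = M'(b_hat) is a supergradient at b_hat

bs = np.linspace(-10.0, 10.0, 4001)

# Supergradient inequality: M(b) <= M(b_hat) + lam*(b - b_hat) for all b in B.
assert np.all(M(bs) <= M(b_hat) + lam * (bs - b_hat) + 1e-12)

# The supergradient also certifies stability: M(b) <= M(b_hat) + K|b - b_hat|
# with K = |lam|, since lam*(b - b_hat) <= |lam|*|b - b_hat|.
assert np.all(M(bs) <= M(b_hat) + abs(lam) * np.abs(bs - b_hat) + 1e-12)
```

Geometrically, the line $b \mapsto M(\hat b) + \lambda(b - \hat b)$ is a non-vertical hyperplane supporting the hypograph $A$ at $(M(\hat b), \hat b)$.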
Exercise 3: Assume that $\hat b \in B$ and $M(\hat b) < \infty$. Show that (i) if $\lambda = (\lambda_1, \ldots, \lambda_m)'$ is a supergradient of $M$ at $\hat b$, then $\lambda \ge 0$ and $(1, -\lambda_1, \ldots, -\lambda_m)'$ defines a non-vertical hyperplane supporting $A$ at $(M(\hat b), \hat b)$; (ii) if $(\lambda_0, -\lambda_1, \ldots, -\lambda_m)'$ defines a hyperplane supporting $A$ at $(M(\hat b), \hat b)$, then $\lambda_0 \ge 0$, $\lambda_i \ge 0$ for $1 \le i \le m$; furthermore, if the hyperplane is non-vertical then $(\lambda_1/\lambda_0, \ldots, \lambda_m/\lambda_0)'$ is a supergradient of $M$ at $\hat b$.

We will prove only one part of the next crucial result. The reader who is familiar with the Separation Theorem of convex sets should be able to construct a proof for the second part based on Figure 5.4, or see the Appendix at the end of this chapter.

Lemma 4: (Gale [1967]) $M$ is stable at $\hat b$ iff $M$ has a supergradient at $\hat b$.

Proof: (Sufficiency only.) Let $\lambda$ be a supergradient at $\hat b$; then

$$M(b) \le M(\hat b) + \lambda'(b - \hat b) \le M(\hat b) + |\lambda||b - \hat b|. ♦$$

The next two results give important alternative interpretations of supergradients.

Lemma 5: Suppose that $\hat x$ is optimal for (5.17). Then $\hat\lambda$ is a supergradient of $M$ at $\hat b$ iff $\hat\lambda$ is an optimal price vector, and then $(\hat x, \hat\lambda)$ satisfy the optimality conditions.

Proof: By hypothesis, $f_0(\hat x) = M(\hat b)$, $\hat x \in X$, and $f(\hat x) \le \hat b$. Let $\hat\lambda$ be a supergradient of $M$ at $\hat b$. By Exercise 2, $(M(\hat b), f(\hat x)) \in A$, and by Exercise 3, $\hat\lambda \ge 0$ and

$$M(\hat b) - \hat\lambda'\hat b \ge M(\hat b) - \hat\lambda'f(\hat x),$$

so that $\hat\lambda'(f(\hat x) - \hat b) \ge 0$. But then $\hat\lambda'(\hat b - f(\hat x)) = 0$, giving (5.23). Next let $x \in X$. Then $(f_0(x), f(x)) \in A$, hence again by Exercise 3,

$$M(\hat b) - \hat\lambda'\hat b \ge f_0(x) - \hat\lambda'f(x).$$

Since $f_0(\hat x) = M(\hat b)$ and $\hat\lambda'(f(\hat x) - \hat b) = 0$, we can rewrite the inequality above as

¹From the Greek "hypo" meaning below or under. This neologism contrasts with the epigraph of a function, which is the set lying above the graph of the function.

Figure 5.4: Hypograph and supporting hyperplane. (Left panel: no non-vertical hyperplane supports $A$ at $(M(\hat b), \hat b)$. Right panel: $\hat\pi$ is a non-vertical hyperplane supporting $A$ at $(M(\hat b), \hat b)$.)
$$f_0(\hat x) - \hat\lambda'(f(\hat x) - \hat b) \ge f_0(x) - \hat\lambda'(f(x) - \hat b),$$

so that (5.21) holds. It follows that $(\hat x, \hat\lambda)$ satisfy the optimality conditions.

Conversely, suppose $\hat x \in X$, $\hat\lambda \ge 0$ satisfy (5.21), (5.22), and (5.23). Let $x \in \Omega(b)$, i.e., $x \in X$, $f(x) \le b$. Then $\hat\lambda'(f(x) - b) \le 0$, so that

$$f_0(x) \le f_0(x) - \hat\lambda'(f(x) - b) = f_0(x) - \hat\lambda'(f(x) - \hat b) + \hat\lambda'(b - \hat b)$$
$$\le f_0(\hat x) - \hat\lambda'(f(\hat x) - \hat b) + \hat\lambda'(b - \hat b) \quad \text{by (5.21)}$$
$$= f_0(\hat x) + \hat\lambda'(b - \hat b) \quad \text{by (5.23)}$$
$$= M(\hat b) + \hat\lambda'(b - \hat b).$$

Hence

$$M(b) = \sup\{f_0(x) \mid x \in \Omega(b)\} \le M(\hat b) + \hat\lambda'(b - \hat b),$$

so that $\hat\lambda$ is a supergradient of $M$ at $\hat b$. ♦

Lemma 6: Suppose that $\hat b \in B$ and $M(\hat b) < \infty$. Then $\hat\lambda$ is a supergradient of $M$ at $\hat b$ iff $\hat\lambda$ is an optimal solution of the dual (5.19) and $m(\hat\lambda) = M(\hat b)$.

Proof: Let $\hat\lambda$ be a supergradient of $M$ at $\hat b$. Let $x \in X$. By Exercises 2 and 3,

$$M(\hat b) - \hat\lambda'\hat b \ge f_0(x) - \hat\lambda'f(x), \text{ or } M(\hat b) \ge f_0(x) - \hat\lambda'(f(x) - \hat b),$$

so that

$$M(\hat b) \ge \sup\{f_0(x) - \hat\lambda'(f(x) - \hat b) \mid x \in X\} = m(\hat\lambda).$$

By weak duality (Lemma 2) it follows that $M(\hat b) = m(\hat\lambda)$ and $\hat\lambda$ is optimal for (5.19).

Conversely, suppose $\hat\lambda \ge 0$ and $m(\hat\lambda) = M(\hat b)$. Then for any $x \in X$,

$$M(\hat b) \ge f_0(x) - \hat\lambda'(f(x) - \hat b),$$

and if moreover $f(x) \le b$, then $\hat\lambda'(f(x) - b) \le 0$, so that

$$M(\hat b) \ge f_0(x) - \hat\lambda'(f(x) - \hat b) + \hat\lambda'(f(x) - b) = f_0(x) - \hat\lambda'(b - \hat b) \text{ for } x \in \Omega(b).$$

Hence

$$M(b) = \sup\{f_0(x) \mid x \in \Omega(b)\} \le M(\hat b) + \hat\lambda'(b - \hat b),$$

so that $\hat\lambda$ is a supergradient. ♦

We can now summarize our results as follows.

Theorem 2: (Duality) Suppose $\hat b \in B$, $M(\hat b) < \infty$, and $M$ is stable at $\hat b$. Then

(i) there exists an optimal solution $\hat\lambda$ for the dual, and $m(\hat\lambda) = M(\hat b)$,

(ii) $\hat\lambda$ is optimal for the dual iff $\hat\lambda$ is a supergradient of $M$ at $\hat b$,

(iii) if $\hat\lambda$ is any optimal solution for the dual, then $\hat x$ is optimal for the primal iff $(\hat x, \hat\lambda)$ satisfy the optimality conditions (5.21), (5.22), and (5.23).

Proof: (i) follows from Lemmas 4 and 6. (ii) is implied by Lemma 6. The "if" part of (iii) follows from Theorem 1, whereas the "only if" part of (iii) follows from Lemma 5. ♦

Corollary 1: Under the hypothesis of Theorem 2, if $\hat\lambda$ is an optimal solution to the dual, then

$$\frac{\partial M^+}{\partial b_i}(\hat b) \le \hat\lambda_i \le \frac{\partial M^-}{\partial b_i}(\hat b).$$
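The stability hypothesis in Theorem 2 cannot be dropped. A minimal sketch on an invented instance (not from the text): maximize $f_0(x) = x$ subject to $f_1(x) = x^2 \le \hat b = 0$, with $X = R$. Here $M(b) = \sqrt{b}$ on $B = [0, \infty)$, which rises infinitely steeply at $\hat b = 0$, so $M$ is not stable there and has no supergradient. Weak duality still gives $m^* = M(0) = 0$, but no $\lambda \ge 0$ attains it: $m(\lambda) = 1/(4\lambda) > 0$ for $\lambda > 0$, and $m(0) = +\infty$.

```python
import numpy as np

# Unstable instance (invented for illustration): maximize f0(x) = x
# subject to f1(x) = x^2 <= b_hat = 0, X = R.  Here M(b) = sqrt(b), which
# is infinitely steep at b_hat = 0, so M is not stable there.
xs = np.linspace(-100.0, 100.0, 400001)   # grid standing in for X = R

def m(lam):
    # Dual function m(lambda) = sup{x - lam*x^2 | x in X}; equals 1/(4*lam)
    # for lam > 0 (attained at x = 1/(2*lam)).
    return (xs - lam * xs ** 2).max()

lams = np.array([0.1, 1.0, 10.0, 100.0, 1000.0])
m_vals = np.array([m(l) for l in lams])

# m(lambda) stays strictly positive, yet inf m(lambda) = 0 = M(0):
# the dual infimum is not attained by any lambda >= 0.
assert np.all(m_vals > 0.0)
assert np.allclose(m_vals, 1.0 / (4.0 * lams), atol=1e-6)
```

So the duality gap is zero here, but part (i) of Theorem 2 fails: there is no optimal dual solution, exactly because stability fails at $\hat b$.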
Exercise 4: Prove Corollary 1. Hint: See Theorem 5 of 4.2.3.
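Corollary 1 brackets each optimal dual variable between the one-sided sensitivities of $M$. A quick finite-difference sketch on an invented instance (value function $M(b) = -(b-2)^2$ near $\hat b = 1$, with optimal dual $\hat\lambda = 2$):

```python
# Corollary 1 on a hypothetical instance (invented for illustration):
# M(b) = -(b-2)^2 for b <= 2 (and 0 beyond), b_hat = 1, optimal dual lam_hat = 2.
def M(b):
    return -(b - 2.0) ** 2 if b <= 2.0 else 0.0

b_hat, lam_hat, h = 1.0, 2.0, 1e-6

# One-sided difference quotients approximating dM+/db and dM-/db at b_hat.
dM_plus = (M(b_hat + h) - M(b_hat)) / h     # right derivative, about 2
dM_minus = (M(b_hat) - M(b_hat - h)) / h    # left derivative, about 2

# Corollary 1: dM+/db(b_hat) <= lam_hat <= dM-/db(b_hat).
# Here M happens to be differentiable at b_hat, so both bounds collapse
# onto lam_hat = 2 and the bracket is tight.
assert dM_plus <= lam_hat <= dM_minus
assert abs(dM_plus - lam_hat) < 1e-4 and abs(dM_minus - lam_hat) < 1e-4
```

When $M$ has a kink at $\hat b$, the two one-sided derivatives differ and the corollary only pins $\hat\lambda_i$ to the interval between them.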

5.2.2 Interpretation and extensions.