8.3 Variable Final Time

which is a contradiction. It is trivial to verify that p̃∗(·) satisfies (8.31); on the other hand, (8.37) and (8.38) respectively imply (8.32) and (8.33). Next, (8.39) is equivalent to

λ∗₀(s) f₀(t∗(s), z∗(s), v∗(s)) + λ∗(s)′ f(t∗(s), z∗(s), v∗(s)) + λ∗_{n+1}(s) = 0   (8.41)

and

λ∗₀(s) f₀(t∗(s), z∗(s), v∗(s)) + λ∗(s)′ f(t∗(s), z∗(s), v∗(s)) = sup {λ∗₀(s) f₀(t∗(s), z∗(s), w) + λ∗(s)′ f(t∗(s), z∗(s), w) | w ∈ Ω}.   (8.42)

Evidently (8.42) is equivalent to (8.34), and (8.35) follows from (8.41) and the fact that λ∗_{n+1}(1) = 0. Finally, the last assertion of the Theorem follows from (8.35) and the fact that M̃(t, x∗(t), p̃∗(t)) ≡ constant if f₀, f are not explicitly dependent on t. ♦

8.3.2 Minimum-time problems

We consider the following special case of (8.27):

Maximize ∫_{t₀}^{t_f} (−1) dt
subject to
dynamics: ẋ(t) = f(t, x(t), u(t)), t₀ ≤ t ≤ t_f,
initial condition: x(t₀) = x₀,
final condition: x(t_f) = x_f,
control constraint: u(·) ∈ U,
final-time constraint: t_f ∈ (t₀, ∞).   (8.43)

In (8.43), x₀ and x_f are fixed, so the optimal control problem consists of finding a control which transfers the system from state x₀ at time t₀ to state x_f in minimum time. Applying Theorem 1 to this problem gives Theorem 2.

Theorem 2: Let t∗_f ∈ (t₀, ∞) and let u∗ : [t₀, t∗_f] → Ω be optimal. Let x∗(·) be the corresponding trajectory. Then there exists a function p∗ : [t₀, t∗_f] → Rⁿ, not identically zero, satisfying

adjoint equation: ṗ∗(t) = −[(∂f/∂x)(t, x∗(t), u∗(t))]′ p∗(t), t₀ ≤ t ≤ t∗_f,
initial condition: p∗(t₀) ∈ Rⁿ,
final condition: p∗(t∗_f) ∈ Rⁿ.

Also the maximum principle

H(t, x∗(t), p∗(t), u∗(t)) = M(t, x∗(t), p∗(t))   (8.44)

holds for all t ∈ [t₀, t∗_f] except possibly for a finite set. Finally,

M(t∗_f, x∗(t∗_f), p∗(t∗_f)) ≥ 0   (8.45)

and if f does not depend explicitly on t, then

M(t, x∗(t), p∗(t)) ≡ constant.   (8.46)

CHAPTER 8. CONTINUOUS-TIME OPTIMAL CONTROL

Exercise 2: Prove Theorem 2.

We now study a simple example illustrating Theorem 2.

Example 1: The motion of a particle is described by

m ẍ(t) + σ ẋ(t) = u(t),

where m = mass, σ = coefficient of friction, u = applied force, and x = position of the particle. For simplicity we suppose that x ∈ R, u ∈ R, and u(t) is constrained by |u(t)| ≤ 1. Starting with the initial condition x(0) = x₁₀, ẋ(0) = x₂₀, we wish to find an admissible control which brings the particle to the state x = 0, ẋ = 0 in minimum time.

Solution: Taking x₁ = x, x₂ = ẋ, we rewrite the particle dynamics as

[ẋ₁(t); ẋ₂(t)] = [0, 1; 0, −α] [x₁(t); x₂(t)] + [0; b] u(t),   (8.47)

where α = σ/m > 0 and b = 1/m > 0. The control constraint set is Ω = [−1, 1]. Suppose that u∗(·) is optimal and x∗(·) is the corresponding trajectory.
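As a concrete sketch of the model, the dynamics (8.47) can be simulated directly. The parameter values below are illustrative assumptions (the text leaves m and σ general):

```python
import numpy as np

# Illustrative parameter values (the text leaves m and sigma general);
# alpha = sigma/m and b = 1/m as in (8.47).
m, sigma = 1.0, 0.5
alpha, b = sigma / m, 1.0 / m

def f(x, u):
    """Right-hand side of (8.47): x1' = x2, x2' = -alpha*x2 + b*u."""
    return np.array([x[1], -alpha * x[1] + b * u])

def simulate(x0, control, T, dt=1e-3):
    """Forward-Euler integration of (8.47) under a control signal u(t)."""
    x, t = np.array(x0, dtype=float), 0.0
    while t < T:
        u = np.clip(control(t), -1.0, 1.0)  # enforce |u(t)| <= 1
        x = x + dt * f(x, u)
        t += dt
    return x

# Under the constant admissible control u = +1, the friction term caps
# the velocity x2 at the terminal value b/alpha.
x = simulate([0.0, 0.0], lambda t: 1.0, T=20.0)
```

For these assumed values the velocity saturates at b/α = 2, so no constant control can beat a well-chosen switching strategy on speed.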
By Theorem 2 there exists a non-zero solution p∗(·) of

[ṗ∗₁(t); ṗ∗₂(t)] = −[0, 0; 1, −α] [p∗₁(t); p∗₂(t)]   (8.48)

such that (8.44), (8.45), and (8.46) hold. Now the transition matrix function of the homogeneous part of (8.47) is

Φ(t, τ) = [1, (1/α)(1 − e^{−α(t−τ)}); 0, e^{−α(t−τ)}],

so that the solution of (8.48) is

[p∗₁(t); p∗₂(t)] = [1, 0; (1/α)(1 − e^{αt}), e^{αt}] [p∗₁(0); p∗₂(0)],

or

p∗₁(t) ≡ p∗₁(0), and p∗₂(t) = (1/α)p∗₁(0) + e^{αt}(−(1/α)p∗₁(0) + p∗₂(0)).   (8.49)

The Hamiltonian H is given by

H(x∗(t), p∗(t), v) = (p∗₁(t) − αp∗₂(t)) x∗₂(t) + b p∗₂(t) v
 = e^{αt}(p∗₁(0) − αp∗₂(0)) x∗₂(t) + b p∗₂(t) v,

so that from the maximum principle we can immediately conclude that

u∗(t) = +1 if p∗₂(t) > 0, −1 if p∗₂(t) < 0, ? if p∗₂(t) = 0.   (8.50)

Furthermore, since the right-hand side of (8.47) does not depend on t explicitly, we must also have

e^{αt}(p∗₁(0) − αp∗₂(0)) x∗₂(t) + b p∗₂(t) u∗(t) ≡ constant.   (8.51)

We now proceed to analyze the consequences of (8.49) and (8.50). First of all, since p∗₁(t) ≡ p∗₁(0), p∗₂(·) can have three qualitatively different forms.

Case 1. −p∗₁(0) + αp∗₂(0) > 0: Evidently then, from (8.49) we see that p∗₂(t) must be a strictly monotonically increasing function, so that from (8.50) u∗(·) can behave in one of two ways: either

u∗(t) = −1 for t < t̂, with p∗₂(t) < 0 for t < t̂, and u∗(t) = +1 for t > t̂, with p∗₂(t) > 0 for t > t̂,

or

u∗(t) ≡ +1 and p∗₂(t) > 0 for all t.

Case 2. −p∗₁(0) + αp∗₂(0) < 0: Then p∗₂(t) is strictly monotonically decreasing, and u∗(·) can behave in one of two ways: either

u∗(t) = +1 for t < t̂, with p∗₂(t) > 0 for t < t̂, and u∗(t) = −1 for t > t̂, with p∗₂(t) < 0 for t > t̂,

or

u∗(t) ≡ −1 and p∗₂(t) < 0 for all t.

Case 3. −p∗₁(0) + αp∗₂(0) = 0: In this case p∗₂(t) ≡ (1/α)p∗₁(0). Also, since p∗(t) ≢ 0, we must have p∗₁(0) ≠ 0. Hence u∗(·) can behave in one of two ways: either

u∗(t) ≡ +1 and p∗₂(t) ≡ (1/α)p∗₁(0) > 0,

or

u∗(t) ≡ −1 and p∗₂(t) ≡ (1/α)p∗₁(0) < 0.

Thus, the optimal control u∗ is always equal to +1 or −1, and it can switch at most once between these two values.
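The bang-bang conclusion can be checked numerically: p∗₂(·) in (8.49) is strictly monotone, so its sign changes at most once. A minimal sketch with assumed Case 1 values (α, p∗₁(0), p∗₂(0) are all illustrative, not fixed by the text):

```python
import numpy as np

alpha = 0.5            # assumed alpha = sigma/m
p10, p20 = -1.0, -0.5  # assumed costates with -p10 + alpha*p20 > 0 (Case 1)

def p2(t):
    """Closed-form p2*(t) from (8.49)."""
    return (1.0 / alpha) * p10 + np.exp(alpha * t) * (p20 - (1.0 / alpha) * p10)

# p2 is strictly monotone: its derivative carries the constant sign of
# -p10 + alpha*p20.  Hence u*(t) = sgn p2(t) switches at most once.
t = np.linspace(0.0, 5.0, 2001)
switches = np.count_nonzero(np.diff(np.sign(p2(t))) != 0)

# In Case 1 with p2(0) < 0, the unique switch time solves p2(t_hat) = 0:
t_hat = (1.0 / alpha) * np.log(-(1.0 / alpha) * p10 / (p20 - (1.0 / alpha) * p10))
```

For these values u∗ = −1 before t̂ ≈ 0.58 and u∗ = +1 after, matching Case 1.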
The optimal control is given by

u∗(t) = sgn p∗₂(t) = sgn [(1/α)p∗₁(0) + e^{αt}(−(1/α)p∗₁(0) + p∗₂(0))].

Thus the search for the optimal control reduces to finding p∗₁(0), p∗₂(0) such that the solution of the differential equation

ẋ₁ = x₂,
ẋ₂ = −αx₂ + b sgn [(1/α)p∗₁(0) + e^{αt}(−(1/α)p∗₁(0) + p∗₂(0))],   (8.52)

with initial condition

x₁(0) = x₁₀, x₂(0) = x₂₀,   (8.53)

also satisfies the final condition

x₁(t∗_f) = 0, x₂(t∗_f) = 0,   (8.54)

for some t∗_f > 0; and then t∗_f is the minimum time.

There are at least two ways of solving the two-point boundary value problem (8.52), (8.53), (8.54). One way is to guess the value of p∗(0), integrate (8.52) with (8.53) forward in time, and check whether (8.54) is satisfied; if not, modify p∗(0) and repeat. An alternative is to guess the value of p∗(0), integrate (8.52) with (8.54) backward in time, and check whether (8.53) is satisfied. The latter approach is more advantageous, because we know that any trajectory obtained by this procedure is optimal for initial conditions which lie on the trajectory. Let us follow this procedure.

Suppose we choose p∗(0) such that −p∗₁(0) + αp∗₂(0) = 0 and p∗₂(0) > 0. Then we must have u∗(t) ≡ 1. Integrating (8.52) and (8.54) backward in time gives us a trajectory ξ(t) where

ξ̇₁(t) = −ξ₂(t),
ξ̇₂(t) = αξ₂(t) − b,

with ξ₁(0) = ξ₂(0) = 0. This gives

ξ₁(t) = (b/α)(−t + (e^{αt} − 1)/α), ξ₂(t) = (b/α)(1 − e^{αt}),

which is the curve OA in Figure 8.3. On the other hand, if p∗(0) is such that −p∗₁(0) + αp∗₂(0) = 0 and p∗₂(0) < 0, then u∗(t) ≡ −1 and we get

ξ₁(t) = −(b/α)(−t + (e^{αt} − 1)/α), ξ₂(t) = −(b/α)(1 − e^{αt}),

which is the curve OB.

Figure 8.3: Backward integration of (8.52) and (8.54). (Figure not reproduced: it shows the curves OA with u∗ ≡ 1, OB with u∗ ≡ −1, and the composite curves OCD and OEF, in the (ξ₁, ξ₂) plane.)

Next suppose p∗(0) is such that −p∗₁(0) + αp∗₂(0) > 0 and p∗₂(0) < 0. Then [(1/α)p∗₁(0) + e^{αt}(−(1/α)p∗₁(0) + p∗₂(0))] will have a negative value for t ∈ (0, t̂) and a positive value for t ∈ (t̂, ∞).
Hence, if we integrate (8.52) and (8.54) backward in time we get a trajectory ξ(t) where

ξ̇₁(t) = −ξ₂(t),
ξ̇₂(t) = αξ₂(t) − b for t < t̂,  ξ̇₂(t) = αξ₂(t) + b for t > t̂,

with ξ₁(0) = 0, ξ₂(0) = 0. This gives us the curve OCD. Finally, if p∗(0) is such that −p∗₁(0) + αp∗₂(0) < 0 and p∗₂(0) > 0, then u∗(t) = 1 for t < t̂ and u∗(t) = −1 for t > t̂, and we get the curve OEF.

We see then that the optimal control u∗(·) has the following characterizing properties:

u∗(t) = 1 if x∗(t) is above BOA or on OA,
u∗(t) = −1 if x∗(t) is below BOA or on OB.

Hence we can synthesize the optimal control in feedback form: u∗(t) = ψ(x∗(t)), where the function ψ : R² → {1, −1} is given by (see Figure 8.4)

ψ(x₁, x₂) = 1 if (x₁, x₂) is above BOA or on OA,
ψ(x₁, x₂) = −1 if (x₁, x₂) is below BOA or on OB.

Figure 8.4: Optimal trajectories of Example 1. (Figure not reproduced: it shows the switching curve BOA in the (x₁, x₂) plane and the regions on either side where u∗ ≡ 1 and u∗ ≡ −1.)
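The feedback synthesis can be sketched in code. Eliminating t from the closed-form expressions for OA and OB above gives the x₁-coordinate of the switching curve BOA as a function of x₂; the parameter values and initial state below are illustrative assumptions, and the code decides sides of BOA by comparing x₁ with that curve value rather than by the figure's orientation:

```python
import numpy as np

alpha, b = 0.5, 1.0  # assumed values of alpha = sigma/m and b = 1/m

def switch_curve(x2):
    """x1-coordinate of the switching curve BOA at velocity x2.

    Obtained by eliminating t from the closed forms for OA (u* = +1,
    x2 < 0) and OB (u* = -1, x2 > 0); the two branches combine into one
    expression by symmetry through the origin."""
    return -x2 / alpha + np.sign(x2) * (b / alpha**2) * np.log(1.0 + alpha * abs(x2) / b)

def psi(x1, x2):
    """Feedback law u* = psi(x1, x2): +1 on the side of BOA containing OA
    (ties resolved to +1 for simplicity), -1 on the other side."""
    return 1.0 if x1 <= switch_curve(x2) else -1.0

# Closed-loop simulation (forward Euler) from an assumed initial state;
# the trajectory switches once and reaches a neighborhood of the origin.
x = np.array([-1.0, 0.0])
dt, t = 1e-3, 0.0
while np.hypot(x[0], x[1]) > 5e-2 and t < 20.0:
    u = psi(x[0], x[1])
    x = x + dt * np.array([x[1], -alpha * x[1] + b * u])
    t += dt
```

With these values the state first accelerates under u = +1, crosses BOA near the curve OB, and brakes under u = −1 into the origin; the elapsed t at loop exit roughly approximates the minimum time to the small terminal ball.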

8.4 Linear System, Quadratic Cost