Chapter 7
SEQUENTIAL DECISION PROBLEMS: CONTINUOUS-TIME OPTIMAL
CONTROL OF LINEAR SYSTEMS
We will investigate decision problems similar to those studied in the last chapter, with one mathematically crucial difference: a choice of control has to be made at each instant of time t, where t varies continuously over a finite interval. The evolution in time of the state of the system to be controlled is governed by a differential equation of the form

ẋ(t) = f(t, x(t), u(t)),

where x(t) ∈ Rⁿ and u(t) ∈ Rᵖ are respectively the state and control of the system at time t.
To understand the main ideas and techniques of analysis it will prove profitable to study the linear case first. The general nonlinear case is deferred to the next chapter. In Section 1 we present the
general linear problem and study the case where the initial and final conditions are particularly simple. In Section 2 we study more general boundary conditions.
7.1 The Linear Optimal Control Problem
We consider a dynamical system governed by the linear differential equation (7.1):

ẋ(t) = A(t)x(t) + B(t)u(t),  t ≥ t₀.  (7.1)

Here A(·) and B(·) are n × n- and n × p-matrix-valued functions of time; we assume that they are piecewise continuous functions. The control u(·) is constrained to take values in a fixed set Ω ⊂ Rᵖ, and to be piecewise continuous.

Definition: A piecewise continuous function u : [t₀, ∞) → Ω will be called an admissible control. U denotes the set of all admissible controls.

Let c ∈ Rⁿ, x₀ ∈ Rⁿ be fixed and let t_f ≥ t₀ be a fixed time. We are concerned with the
decision problem (7.2):

Maximize c′x(t_f),
subject to
  dynamics:           ẋ(t) = A(t)x(t) + B(t)u(t),  t₀ ≤ t ≤ t_f,
  initial condition:  x(t₀) = x₀,
  final condition:    x(t_f) ∈ Rⁿ,
  control constraint: u(·) ∈ U.  (7.2)
Definition: (i) For any piecewise continuous function u(·) : [t₀, t_f] → Rᵖ, any z ∈ Rⁿ, and any t₀ ≤ t₁ ≤ t₂ ≤ t_f, let φ(t₂, t₁, z, u) denote the state of (7.1) at time t₂, if at time t₁ it is in state z and the control u(·) is applied.
(ii) Let

K(t₂, t₁, z) = {φ(t₂, t₁, z, u) | u ∈ U}.

Thus, K(t₂, t₁, z) is the set of states reachable at time t₂ starting at time t₁ in state z and using admissible controls. We call K the reachable set.
Definition: Let Φ(t, τ), t₀ ≤ τ ≤ t ≤ t_f, be the transition-matrix function of the homogeneous part of (7.1), i.e., Φ satisfies the differential equation

∂Φ/∂t (t, τ) = A(t)Φ(t, τ),

and the boundary condition Φ(t, t) ≡ Iₙ.

The next result is well known. (See Desoer [1970].)
Lemma 1: φ(t₂, t₁, z, u) = Φ(t₂, t₁)z + ∫_{t₁}^{t₂} Φ(t₂, τ)B(τ)u(τ) dτ.
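As a numerical sanity check, the variation-of-constants formula of Lemma 1 can be compared against direct integration of (7.1) in the scalar, constant-coefficient case, where Φ(t₂, t₁) = exp(a(t₂ − t₁)). This is only an illustrative sketch: the values of a, b, the control u, and the step counts below are all assumed for the example.

```python
import math

# Scalar constant-coefficient case of (7.1): xdot = a*x + b*u(t).
# Here Phi(t2, t1) = exp(a*(t2 - t1)); all parameter values are assumed.

def phi_simulated(t2, t1, z, u, a, b, steps=20000):
    """State at t2 by forward-Euler integration of xdot = a*x + b*u(t)."""
    dt = (t2 - t1) / steps
    x, t = z, t1
    for _ in range(steps):
        x += dt * (a * x + b * u(t))
        t += dt
    return x

def phi_formula(t2, t1, z, u, a, b, steps=20000):
    """Lemma 1: Phi(t2,t1)*z + integral over [t1,t2] of Phi(t2,tau)*b*u(tau)."""
    dt = (t2 - t1) / steps
    integral, tau = 0.0, t1
    for _ in range(steps):
        integral += dt * math.exp(a * (t2 - tau)) * b * u(tau)
        tau += dt
    return math.exp(a * (t2 - t1)) * z + integral

a, b = -0.5, 1.0
u = lambda t: 1.0 if t < 0.5 else -1.0   # an admissible (piecewise continuous) control
x_sim = phi_simulated(1.0, 0.0, z=2.0, u=u, a=a, b=b)
x_frm = phi_formula(1.0, 0.0, z=2.0, u=u, a=a, b=b)
# The two values agree up to the discretization error of the step size.
```

Both routes compute the same state, which is the content of the lemma; the closed-form answer for these particular numbers is 4e^{−1/4} − 2.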
Exercise 1: (i) Assuming that Ω is convex, show that U is a convex set. (ii) Assuming that U is convex, show that K(t₂, t₁, z) is a convex set. (It is a deep result that K(t₂, t₁, z) is convex even if Ω is not convex (see Neustadt [1963]), provided we include in U every measurable function u : [t₀, ∞) → Ω.)
Definition: Let K ⊂ Rⁿ, and let x* ∈ K. We say that c is the outward normal to a hyperplane supporting K at x* if c ≠ 0 and

c′x* ≥ c′x for all x ∈ K.

The next result gives a geometric characterization of the optimal solutions of (7.2).
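The supporting-hyperplane condition can be checked mechanically on a finite sample of points. A minimal sketch, in which the unit square and the choices of c and x* are assumed purely for illustration:

```python
# Check the supporting-hyperplane condition c'x* >= c'x over a finite
# sample of K. The square and the choices of c, x* are assumed examples.

def supports(c, x_star, points):
    """True iff c'x* >= c'x for every sampled x, i.e. the hyperplane
    through x_star with normal c supports the sampled set."""
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    return all(dot(c, x_star) >= dot(c, x) for x in points)

corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # unit square K
ok = supports((1.0, 1.0), (1.0, 1.0), corners)    # c = (1,1) supports K at (1,1)
bad = supports((1.0, 1.0), (0.0, 0.0), corners)   # but not at (0,0)
```

The condition holds only at points of the boundary that are extreme in the direction c, which is exactly the geometry exploited below.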
Lemma 2: Suppose c ≠ 0. Let u*(·) ∈ U and let x*(t) = φ(t, t₀, x₀, u*). Then u* is an optimal solution of (7.2) iff (i) x*(t_f) is on the boundary of K = K(t_f, t₀, x₀), and (ii) c is the outward normal to a hyperplane supporting K at x*(t_f). (See Figure 7.1.)

Proof: Clearly (i) is implied by (ii), because if x*(t_f) is in the interior of K there is δ > 0 such that x*(t_f) + δc ∈ K; but then
[Figure 7.1: c is the outward normal to π* = {x | c′x = c′x*(t_f)} supporting K at x*(t_f).]
c′(x*(t_f) + δc) = c′x*(t_f) + δ|c|² > c′x*(t_f).

Finally, from the definition of K it follows immediately that u* is optimal iff

c′x*(t_f) ≥ c′x for all x ∈ K.
♦

The result above characterizes the optimal control u* in terms of the final state x*(t_f). The beauty and utility of the theory lie in the following result, which translates this characterization directly in terms of u*.

Theorem 1:
Let u*(·) ∈ U and let x*(t) = φ(t, t₀, x₀, u*), t₀ ≤ t ≤ t_f. Let p*(t) be the solution of (7.3) and (7.4):

adjoint equation:  ṗ*(t) = −A′(t)p*(t),  t₀ ≤ t ≤ t_f,  (7.3)
final condition:   p*(t_f) = c.  (7.4)

Then u*(·) is optimal iff

p*(t)′B(t)u*(t) = sup{p*(t)′B(t)v | v ∈ Ω},  (7.5)

for all t ∈ [t₀, t_f], except possibly for a finite set.

Proof:
u*(·) is optimal iff for every u(·) ∈ U

p*(t_f)′[Φ(t_f, t₀)x₀ + ∫_{t₀}^{t_f} Φ(t_f, τ)B(τ)u*(τ) dτ] ≥ p*(t_f)′[Φ(t_f, t₀)x₀ + ∫_{t₀}^{t_f} Φ(t_f, τ)B(τ)u(τ) dτ],

which is equivalent to (7.6):

∫_{t₀}^{t_f} p*(t_f)′Φ(t_f, τ)B(τ)u*(τ) dτ ≥ ∫_{t₀}^{t_f} p*(t_f)′Φ(t_f, τ)B(τ)u(τ) dτ.  (7.6)
Now by properties of the adjoint equation we know that p*(t)′ = p*(t_f)′Φ(t_f, t), so that (7.6) is equivalent to (7.7):

∫_{t₀}^{t_f} p*(τ)′B(τ)u*(τ) dτ ≥ ∫_{t₀}^{t_f} p*(τ)′B(τ)u(τ) dτ,  (7.7)
and the sufficiency of (7.5) is immediate.

To prove the necessity, let D be the finite set of points where the function B(·) or u*(·) is discontinuous. We shall show that if u*(·) is optimal then (7.5) is satisfied for t ∉ D. Indeed, if this is not the case, then there exist t* ∈ [t₀, t_f], t* ∉ D, and v ∈ Ω such that

p*(t*)′B(t*)u*(t*) < p*(t*)′B(t*)v,

and since t* is a point of continuity of B(·) and u*(·), it follows that there exists δ > 0 such that

p*(t)′B(t)u*(t) < p*(t)′B(t)v,  for |t − t*| < δ.  (7.8)
Define ũ(·) ∈ U by

ũ(t) = v       if |t − t*| < δ, t ∈ [t₀, t_f],
ũ(t) = u*(t)   otherwise.

Then (7.8) implies that

∫_{t₀}^{t_f} p*(t)′B(t)ũ(t) dt > ∫_{t₀}^{t_f} p*(t)′B(t)u*(t) dt.

But then from (7.7) we see that u*(·) cannot be optimal, giving a contradiction. ♦
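Condition (7.5) determines u*(t) pointwise once p*(t) is known. In the scalar constant-coefficient case the adjoint equation (7.3)–(7.4) has the closed form p*(t) = e^{a(t_f − t)}c, and for Ω = [α, β] the supremum in (7.5) is attained at an endpoint of the interval. The following sketch uses assumed parameter values throughout:

```python
import math

# Scalar case: xdot = a*x + b*u, Omega = [alpha, beta].
# Adjoint (7.3)-(7.4): pdot = -a*p, p(tf) = c, so p*(t) = exp(a*(tf - t))*c.
# All numbers here are assumed for illustration.

def p_star(t, tf, a, c):
    return math.exp(a * (tf - t)) * c

def u_star(t, tf, a, b, c, alpha, beta):
    # (7.5): maximize p*(t)*b*v over v in [alpha, beta] -> pick an endpoint.
    return beta if p_star(t, tf, a, c) * b > 0 else alpha

# With c > 0 and b > 0 the switching function p*(t)*b stays positive,
# so the control prescribed by (7.5) sits at beta on the whole interval.
controls = [u_star(k / 10, tf=1.0, a=-0.5, b=1.0, c=1.0, alpha=-1.0, beta=1.0)
            for k in range(11)]
```

The design point is that (7.5) is an instantaneous, static maximization at each t; no knowledge of the future trajectory is needed once p* has been transported back from t_f.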
Corollary 1: For t₀ ≤ t₁ ≤ t₂ ≤ t_f,

p*(t₂)′x*(t₂) ≥ p*(t₂)′x for all x ∈ K(t₂, t₁, x*(t₁)).  (7.9)
Exercise 2: Prove Corollary 1.
Remark 1: The geometric meaning of (7.9) is the following. Taking t₁ = t₀ in (7.9), we see that if u*(·) is optimal, i.e., if c = p*(t_f) is the outward normal to a hyperplane supporting K(t_f, t₀, x₀) at x*(t_f), then x*(t) is on the boundary of K(t, t₀, x₀) and p*(t) is the normal to a hyperplane supporting K(t, t₀, x₀) at x*(t). This normal is obtained by transporting backwards in time, via the adjoint differential equation, the outward normal p*(t_f) at time t_f. The situation is illustrated in Figure 7.2.
Remark 2: If we define the Hamiltonian function H by

H(t, x, u, p) = p′(A(t)x + B(t)u),

and we define M by

M(t, x, p) = sup{H(t, x, u, p) | u ∈ Ω},

then (7.5) can be rewritten as

H(t, x*(t), u*(t), p*(t)) = M(t, x*(t), p*(t)).  (7.10)

This condition is known as the maximum principle.
Exercise 3: (i) Show that m(t) = M(t, x*(t), p*(t)) is a Lipschitz function of t. (ii) If A(t), B(t) are constant, show that m(t) is constant. (Hint: show that dm/dt ≡ 0.)

The next two exercises show how we can obtain important qualitative properties of an optimal control.
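The constancy claim of Exercise 3(ii) can be checked in a scalar constant-coefficient example. With assumed values of a, b, c, with x₀ = 0 and Ω = [−1, 1], the adjoint, the optimal control, and the optimal state all have closed forms, and the maximized Hamiltonian m(t) = p*(t)(a x*(t) + b u*(t)) comes out constant:

```python
import math

# Scalar example with assumed constants; x0 = 0 and Omega = [-1, 1].
a, b, c, tf = -0.5, 1.0, 1.0, 1.0

def p_star(t):
    return math.exp(a * (tf - t)) * c          # solves (7.3)-(7.4)

def u_star(t):
    return 1.0 if p_star(t) * b > 0 else -1.0  # rule (7.5); here u* = 1 throughout

def x_star(t):
    # Closed-form state from Lemma 1 with x0 = 0 and constant u* = 1:
    # x*(t) = (b/a) * (exp(a*t) - 1).
    return (b / a) * (math.exp(a * t) - 1.0)

ts = [k * 0.05 for k in range(21)]             # grid on [t0, tf] = [0, 1]
m = [p_star(t) * (a * x_star(t) + b * u_star(t)) for t in ts]
# Every entry of m equals b * exp(a * tf): the maximized Hamiltonian is constant.
```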
Exercise 4: Suppose that Ω is bounded and closed. Show that there exists an optimal control u*(·) such that u*(t) belongs to the boundary of Ω for all t.
Exercise 5: Suppose Ω = [α, β], so that B(t) is an n × 1 matrix. Suppose that A(t) ≡ A and B(t) ≡ B are constant matrices and A has n real eigenvalues. Show that there is an optimal control u*(·) and times t₀ ≤ t₁ ≤ t₂ ≤ … ≤ tₙ ≤ t_f such that u*(t) ≡ α or β on [tᵢ, tᵢ₊₁), 0 ≤ i ≤ n. (Hint: first show that p*(t)′B = γ₁ exp(δ₁t) + … + γₙ exp(δₙt) for some γᵢ, δᵢ in R.)
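The hint in Exercise 5 can be visualized numerically: a sum of real exponentials has few sign changes, so the control prescribed by (7.5) switches between α and β only a few times. The coefficients below are assumed for illustration, chosen so that the switching function is s(t) = 2 − eᵗ, with a single zero at t = ln 2:

```python
import math

# Switching function s(t) = gamma_1*exp(delta_1*t) + gamma_2*exp(delta_2*t),
# with assumed coefficients giving s(t) = 2 - e^t (one zero, at ln 2).

def switching_function(t, gammas, deltas):
    return sum(g * math.exp(d * t) for g, d in zip(gammas, deltas))

def bang_bang(t, gammas, deltas, alpha, beta):
    # (7.5) with Omega = [alpha, beta]: take the endpoint maximizing s(t)*v.
    return beta if switching_function(t, gammas, deltas) > 0 else alpha

gammas, deltas = (2.0, -1.0), (0.0, 1.0)
ts = [k * 0.01 for k in range(201)]                        # grid on [0, 2]
us = [bang_bang(t, gammas, deltas, alpha=-1.0, beta=1.0) for t in ts]
switches = sum(1 for u1, u2 in zip(us, us[1:]) if u1 != u2)
# s changes sign once, so the control switches from beta to alpha exactly once.
```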
Exercise 6: Assume that K(t_f, t₀, x₀) is convex (see the remark in Exercise 1 above). Let f : Rⁿ → R be a differentiable function and suppose that the objective function in (7.2) is f(x(t_f)) instead of c′x(t_f). Suppose u*(·) is an optimal control. Show that u*(·) satisfies the maximum principle (7.10) where p*(·) is the solution of the adjoint equation (7.3) with the final condition

p*(t_f) = ∇f(x*(t_f)).

Also show that this condition is sufficient for optimality if f is concave. (Hint: use Lemma 1 of 5.1.1 to show that if u*(·) is optimal, then ∇f(x*(t_f))′(x*(t_f) − x) ≥ 0 for all x ∈ K(t_f, t₀, x₀).)
7.2 More General Boundary Conditions