Chapter 2
OPTIMIZATION OVER AN OPEN SET
In this chapter we study in detail the first example of Chapter 1. We first establish some notation which will be in force throughout these Notes. Then we study our example. This will generalize
to a canonical problem, the properties of whose solution are stated as a theorem. Some additional properties are mentioned in the last section.
2.1 Notation
2.1.1
All vectors are column vectors, with two consistent exceptions mentioned in 2.1.3 and 2.1.5 below and some other minor and convenient exceptions in the text. Prime denotes transpose, so that if $x \in R^n$ then $x'$ is the row vector $x' = (x_1, \ldots, x_n)$, and $x = (x_1, \ldots, x_n)'$. Vectors are normally denoted by lower case letters, the $i$th component of a vector $x \in R^n$ is denoted $x_i$, and different vectors denoted by the same symbol are distinguished by superscripts as in $x^j$ and $x^k$. 0 denotes both the zero vector and the real number zero, but no confusion will result. Thus if $x = (x_1, \ldots, x_n)'$ and $y = (y_1, \ldots, y_n)'$ then $x'y = x_1 y_1 + \ldots + x_n y_n$ as in ordinary matrix multiplication. If $x \in R^n$ we define $|x| = +\sqrt{x'x}$.
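The inner product $x'y$ and the norm $|x|$ can be checked numerically. The following is a minimal sketch, assuming vectors are stored as plain Python lists; the helper names `dot` and `norm` are illustrative and do not come from the Notes.

```python
import math

def dot(x, y):
    """Return x'y = x_1*y_1 + ... + x_n*y_n for two vectors given as lists."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Return |x| = +sqrt(x'x), the nonnegative square root."""
    return math.sqrt(dot(x, x))

x = [3.0, 4.0]
y = [1.0, 2.0]
print(dot(x, y))  # 3*1 + 4*2 = 11.0
print(norm(x))    # sqrt(9 + 16) = 5.0
```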
2.1.2
If $x = (x_1, \ldots, x_n)'$ and $y = (y_1, \ldots, y_n)'$ then $x \geq y$ means $x_i \geq y_i$, $i = 1, \ldots, n$. In particular, if $x \in R^n$, then $x \geq 0$ if $x_i \geq 0$, $i = 1, \ldots, n$.
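The relation $\geq$ on vectors is componentwise, so for $n > 1$ two vectors need not be comparable at all. The sketch below illustrates this, assuming vectors are stored as Python lists; the function names `geq` and `is_nonnegative` are hypothetical.

```python
def geq(x, y):
    """Return True when x >= y componentwise, i.e. x_i >= y_i for every i."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return all(xi >= yi for xi, yi in zip(x, y))

def is_nonnegative(x):
    """Return True when x >= 0, where 0 is the zero vector of matching size."""
    return geq(x, [0] * len(x))

print(geq([2, 3], [1, 3]))      # True: 2 >= 1 and 3 >= 3
print(geq([1, 2], [2, 1]))      # False: the first component fails
print(geq([2, 1], [1, 2]))      # False: the second component fails
print(is_nonnegative([0, 5]))   # True
```

The second and third calls show a pair of vectors for which neither $x \geq y$ nor $y \geq x$ holds.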
2.1.3
Matrices are normally denoted by capital letters. If $A$ is an $m \times n$ matrix, then $A^j$ denotes the $j$th column of $A$, and $A_i$ denotes the $i$th row of $A$. Note that $A_i$ is a row vector. $A_i^j$ denotes the entry of $A$ in the $i$th row and $j$th column; this entry is sometimes also denoted by the lower case letter $a_{ij}$, and then we also write $A = \{a_{ij}\}$. $I$ denotes the identity matrix; its size will be clear from the context. If confusion is likely, we write $I_n$ to denote the $n \times n$ identity matrix.
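To make the row, column, and entry conventions concrete, here is a minimal sketch that stores a matrix as a list of rows. This storage choice and the helper names `row`, `col`, `entry`, and `identity` are assumptions made for illustration; indices are 1-based to match the text.

```python
def row(A, i):
    """Return the i-th row of A (1-based), a row vector."""
    return list(A[i - 1])

def col(A, j):
    """Return the j-th column of A (1-based) as a list of its entries."""
    return [r[j - 1] for r in A]

def entry(A, i, j):
    """Return the entry of A in row i and column j (both 1-based)."""
    return A[i - 1][j - 1]

def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 3],
     [4, 5, 6]]          # a 2 x 3 matrix (m = 2, n = 3)

print(row(A, 2))         # [4, 5, 6]
print(col(A, 3))         # [3, 6]
print(entry(A, 2, 1))    # 4
print(identity(2))       # [[1, 0], [0, 1]]
```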
2.1.4
If $f : R^n \to R^m$ is a function, its $i$th component is written $f_i$, $i = 1, \ldots, m$. Note that $f_i : R^n \to R$. Sometimes we describe a function by specifying a rule to calculate $f(x)$ for every $x$. In this case we write $f : x \mapsto f(x)$. For example, if $A$ is an $m \times n$ matrix, we can write $f : x \mapsto Ax$ to denote the function $f : R^n \to R^m$ whose value at a point $x \in R^n$ is $Ax$.
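The rule $x \mapsto Ax$ and its component functions can be sketched in code as follows. This is an illustration only, assuming the matrix is stored as a list of rows; `linear_map` and `component` are hypothetical helper names.

```python
def linear_map(A):
    """Return the function x -> Ax for an m x n matrix A given as a list of rows."""
    def f(x):
        # Each output component is the product of one row of A with x.
        return [sum(a * xj for a, xj in zip(r, x)) for r in A]
    return f

def component(f, i):
    """Return f_i, the function x -> i-th component of f(x) (1-based)."""
    return lambda x: f(x)[i - 1]

A = [[1, 2, 3],
     [4, 5, 6]]
f = linear_map(A)            # f maps R^3 into R^2
f2 = component(f, 2)         # f2 maps R^3 into R

print(f([1, 0, -1]))         # [1 - 3, 4 - 6] = [-2, -2]
print(f2([1, 1, 1]))         # 4 + 5 + 6 = 15
```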
2.1.5
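If $f : R^n \to R$ is a differentiable function, the derivative of $f$ at $\hat{x}$ is the row vector $\left(\frac{\partial f}{\partial x_1}(\hat{x}), \ldots, \frac{\partial f}{\partial x_n}(\hat{x})\right)$. This derivative is denoted by $\frac{\partial f}{\partial x}(\hat{x})$ or $f_x(\hat{x})$ or $\left.\frac{\partial f}{\partial x}\right|_{x=\hat{x}}$ or $f_x|_{x=\hat{x}}$, and if the argument $\hat{x}$ is clear from the context it may be dropped. The column vector $(f_x(\hat{x}))'$ is also denoted $\nabla_x f(\hat{x})$, and is called the gradient of $f$ at $\hat{x}$.

If $f : (x, y) \mapsto f(x, y)$ is a differentiable function from $R^n \times R^m$ into $R$, the partial derivative of $f$ with respect to $x$ at the point $(\hat{x}, \hat{y})$ is the $n$-dimensional row vector $f_x(\hat{x}, \hat{y}) = \frac{\partial f}{\partial x}(\hat{x}, \hat{y}) = \left(\frac{\partial f}{\partial x_1}(\hat{x}, \hat{y}), \ldots, \frac{\partial f}{\partial x_n}(\hat{x}, \hat{y})\right)$, and similarly $f_y(\hat{x}, \hat{y}) = \frac{\partial f}{\partial y}(\hat{x}, \hat{y}) = \left(\frac{\partial f}{\partial y_1}(\hat{x}, \hat{y}), \ldots, \frac{\partial f}{\partial y_m}(\hat{x}, \hat{y})\right)$.

Finally, if $f : R^n \to R^m$ is a differentiable function with components $f_1, \ldots, f_m$, then its derivative at $\hat{x}$ is the $m \times n$ matrix

$$
\frac{\partial f}{\partial x}(\hat{x}) = f_x(\hat{x}) =
\begin{bmatrix} f_{1x}(\hat{x}) \\ \vdots \\ f_{mx}(\hat{x}) \end{bmatrix}
=
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1}(\hat{x}) & \cdots & \frac{\partial f_1}{\partial x_n}(\hat{x}) \\
\vdots & & \vdots \\
\frac{\partial f_m}{\partial x_1}(\hat{x}) & \cdots & \frac{\partial f_m}{\partial x_n}(\hat{x})
\end{bmatrix}.
$$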
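The layout of the $m \times n$ derivative matrix (row $i$ holds the derivative of the component $f_i$) can be checked numerically. The sketch below approximates the matrix by central differences. This is a standard numerical technique and not a method from the Notes; the function name `jacobian`, the step size `h`, and the example function are all assumptions chosen for illustration.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the m x n derivative matrix of f at x by central differences.

    f maps a list of n floats to a list of m floats.
    """
    n = len(x)
    cols = []
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        # cols[j][i] approximates the partial derivative of f_i w.r.t. x_j.
        cols.append([(a - b) / (2 * h) for a, b in zip(f(xp), f(xm))])
    m = len(cols[0])
    # Transpose so that row i corresponds to the component f_i.
    return [[cols[j][i] for j in range(n)] for i in range(m)]

def f(x):
    # Example function from R^2 into R^2: f_1 = x_1^2 + x_2, f_2 = x_1 * x_2.
    return [x[0] ** 2 + x[1], x[0] * x[1]]

J = jacobian(f, [1.0, 2.0])
print(J)  # approximately [[2, 1], [2, 1]]
```

At $\hat{x} = (1, 2)'$ the exact matrix has rows $(2\hat{x}_1, 1) = (2, 1)$ and $(\hat{x}_2, \hat{x}_1) = (2, 1)$, which the approximation should match up to rounding error.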
2.1.6
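If $f : R^n \to R$ is twice differentiable, its second derivative at $\hat{x}$ is the $n \times n$ matrix $\frac{\partial^2 f}{\partial x \partial x}(\hat{x}) = f_{xx}(\hat{x})$, where $(f_{xx}(\hat{x}))_i^j = \frac{\partial^2 f}{\partial x_j \partial x_i}(\hat{x})$. Thus, in terms of the notation in Section 2.1.5 above, $f_{xx}(\hat{x}) = \frac{\partial}{\partial x}(f_x)'(\hat{x})$.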
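As a small worked illustration, take $f : R^2 \to R$ with $f(x) = x_1^2 x_2$. This function is chosen here for illustration and does not appear in the Notes. Differentiating once gives the row vector $f_x$, transposing gives the gradient, and differentiating the gradient gives the second derivative:

```latex
f_x(\hat{x}) = \begin{bmatrix} 2\hat{x}_1\hat{x}_2 & \hat{x}_1^2 \end{bmatrix},
\qquad
\nabla_x f(\hat{x}) = (f_x(\hat{x}))' = \begin{bmatrix} 2\hat{x}_1\hat{x}_2 \\ \hat{x}_1^2 \end{bmatrix},
\qquad
f_{xx}(\hat{x}) = \frac{\partial}{\partial x}(f_x)'(\hat{x}) =
\begin{bmatrix} 2\hat{x}_2 & 2\hat{x}_1 \\ 2\hat{x}_1 & 0 \end{bmatrix}.
```

The entry in row $i$ and column $j$ of the last matrix is $\frac{\partial^2 f}{\partial x_j \partial x_i}(\hat{x})$. In this example the matrix is symmetric, as it is for any twice continuously differentiable function.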
2.2 Example