APPLICATIONS OF DIFFERENTIAL CALCULUS
9.1 Partial differential equations
The theorems of differential calculus developed in Chapter 8 have a wide variety of applications. This chapter illustrates their use in some examples related to partial differential equations, implicit functions, and extremum problems. We begin with some elementary remarks concerning partial differential equations.
An equation involving a scalar field f and its partial derivatives is called a partial differential equation. Two simple examples in which f is a function of two variables are the first-order equation

(9.1)    ∂f(x, y)/∂x = 0

and the second-order equation

(9.2)    ∂²f(x, y)/∂x² + ∂²f(x, y)/∂y² = 0.

Each of these is a homogeneous linear partial differential equation. That is, each has the form L(f) = 0, where L is a linear differential operator involving one or more partial derivatives. Equation (9.2) is called the two-dimensional Laplace equation.
Some of the theory of linear ordinary differential equations can be extended to partial differential equations. For example, it is easy to verify that for each of Equations (9.1) and (9.2) the set of solutions is a linear space. However, there is an important difference between ordinary and partial linear differential equations that should be realized at the outset. We illustrate this difference by comparing the partial differential equation (9.1)
with the ordinary differential equation

(9.3)    f′(x) = 0.

The most general function satisfying (9.3) is

f(x) = C,

where C is an arbitrary constant. In other words, the solution-space of (9.3) is one-dimensional. But the most general function satisfying (9.1) is

f(x, y) = g(y),

where g is any function of y. Since g is arbitrary we can easily obtain an infinite set of independent solutions. For example, we can take g(y) = e^{cy} and let c vary over all real numbers. Thus, the solution-space of (9.1) is infinite-dimensional.
In some respects this example is typical of what happens in general. Somewhere in the process of solving a first-order partial differential equation, an integration is required to remove each partial derivative. At this step an arbitrary function is introduced in the solution. This results in an infinite-dimensional solution space.
In many problems involving partial differential equations it is necessary to select from the wealth of solutions a particular solution satisfying one or more auxiliary conditions. As might be expected, the nature of these conditions has a profound effect on the existence or uniqueness of solutions. A systematic study of such problems will not be attempted in this book. Instead, we will treat some special cases to illustrate
the methods introduced in Chapter 8.
9.2 A first-order partial differential equation with constant coefficients
Consider the first-order partial differential equation
(9.4)    3 ∂f/∂x(x, y) + 2 ∂f/∂y(x, y) = 0.

All the solutions of this equation can be found by geometric considerations. We express the left member as a dot product, and write the equation in the form

(3i + 2j) · ∇f(x, y) = 0.

This tells us that the gradient vector ∇f(x, y) is orthogonal to the vector 3i + 2j at each point (x, y). But we also know that ∇f(x, y) is orthogonal to the level curves of f. Hence these level curves must be straight lines parallel to 3i + 2j. In other words, the level curves of f are the lines

2x − 3y = c.

Therefore f(x, y) is constant when 2x − 3y is constant. This suggests that

(9.5)    f(x, y) = g(2x − 3y)

for some function g.
Now we verify that, for each differentiable function g, the scalar field f defined by (9.5) does, indeed, satisfy (9.4). Using the chain rule to compute the partial derivatives of f we find

∂f/∂x = 2g′(2x − 3y),    ∂f/∂y = −3g′(2x − 3y),

so that

3 ∂f/∂x + 2 ∂f/∂y = 6g′(2x − 3y) − 6g′(2x − 3y) = 0.

Therefore, f satisfies (9.4).
Conversely, we can show that every differentiable f which satisfies (9.4) must necessarily have the form (9.5) for some g. To do this, we introduce a linear change of variables,

(9.6)    x = Au + Bv,    y = Cu + Dv.

This converts f into a function of u and v, say

h(u, v) = f(Au + Bv, Cu + Dv).

We shall choose the constants A, B, C, D so that h satisfies the simpler equation

(9.7)    ∂h/∂u(u, v) = 0.

Then we shall solve this equation and show that f has the required form. Using the chain rule we find

∂h/∂u = ∂f/∂x A + ∂f/∂y C.

Since f satisfies (9.4) we have 3 ∂f/∂x + 2 ∂f/∂y = 0, so ∂h/∂u will vanish identically if we choose A and C proportional to 3 and 2. Taking A = 3 and C = 2 we find

x = 3u + Bv,    y = 2u + Dv.

For this choice of A and C, the function h satisfies

∂h/∂u = 3 ∂f/∂x + 2 ∂f/∂y = 0,

so h is a function of v alone, say

(9.8)    h(u, v) = g(v)

for some function g. To express v in terms of x and y we eliminate u from the equations x = 3u + Bv and y = 2u + Dv and obtain

2x − 3y = (2B − 3D)v.

Now we choose B and D so that 2B − 3D = 1, say B = 2, D = 1. For this choice the transformation (9.6) is nonsingular; we have v = 2x − 3y and hence

f(x, y) = h(u, v) = g(v) = g(2x − 3y).

This shows that every differentiable solution f of (9.4) has the form (9.5).
Exactly the same type of argument proves the following theorem for first-order equations with constant coefficients.

THEOREM 9.1. Let g be differentiable on R¹, and let f be the scalar field defined on R² by the equation

(9.9)    f(x, y) = g(bx − ay),

where a and b are constants, not both zero. Then f satisfies the first-order partial differential equation

(9.10)    a ∂f/∂x(x, y) + b ∂f/∂y(x, y) = 0

everywhere in R². Conversely, every differentiable solution of (9.10) necessarily has the form (9.9) for some g.
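Theorem 9.1 is easy to test numerically. The following sketch (an illustration, not part of the text; the choice of g, the sample points, the step size, and the tolerance are all arbitrary) checks that f(x, y) = g(2x − 3y) satisfies 3 ∂f/∂x + 2 ∂f/∂y = 0, the case a = 3, b = 2 of (9.10):

```python
import math

# Sample differentiable g; by Theorem 9.1, f(x, y) = g(2x - 3y)
# should satisfy 3 df/dx + 2 df/dy = 0 everywhere.
def g(t):
    return math.sin(t) + t**2

def f(x, y):
    return g(2*x - 3*y)

def partial(fn, x, y, var, h=1e-6):
    # central finite difference approximation to df/dx or df/dy
    if var == 'x':
        return (fn(x + h, y) - fn(x - h, y)) / (2*h)
    return (fn(x, y + h) - fn(x, y - h)) / (2*h)

for (x, y) in [(0.0, 0.0), (1.3, -0.7), (-2.1, 0.4)]:
    residual = 3*partial(f, x, y, 'x') + 2*partial(f, x, y, 'y')
    assert abs(residual) < 1e-4
```

Any other differentiable g would do; the residual vanishes up to finite-difference error.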
9.3 Exercises
In this set of exercises you may assume differentiability of all functions under consideration.
1. Determine that solution of the partial differential equation
which satisfies the condition f(x, 0) = sin x for all x.
2. Determine that solution of the partial differential equation
which satisfies the condition ⋯.
3. If g(x, y) = ⋯, prove that g satisfies the partial differential equation ⋯. Find a solution such that ⋯.
4. If f(u, v) satisfies the partial differential equation

∂²f/∂u∂v = 0,

prove that

f(u, v) = φ(u) + ψ(v),

where φ is a function of u alone and ψ is a function of v alone. Find a solution satisfying f(x, 1) = 2x and f(x, 1/x) = 1/x for all x ≠ 0.
5. Assume f satisfies a second-order partial differential equation with constant coefficients. Introduce the linear change of variables x = Au + Bv, y = Cu + Dv, where A, B, C, D are constant, and let g(u, v) = f(Au + Bv, Cu + Dv). Compute integer values of A, B, C, D such that g satisfies ∂²g/∂u∂v = 0. Solve this equation for g and thereby determine f. (Assume equality of the mixed partials.)
6. A function z is defined by an equation of the form

z = xy f((x + y)/(xy)).

Show that z satisfies a partial differential equation of the form

x² ∂z/∂x − y² ∂z/∂y = G(x, y) z,

and find G(x, y).
7. The substitution x = e^s, y = e^t converts f(x, y) into g(s, t), where g(s, t) = f(e^s, e^t). If f is known to satisfy the partial differential equation

x² ∂²f/∂x² + y² ∂²f/∂y² + x ∂f/∂x + y ∂f/∂y = 0,

show that g satisfies the partial differential equation

∂²g/∂s² + ∂²g/∂t² = 0.
8. Let f be a scalar field that is differentiable on an open set S in Rⁿ. We say that f is homogeneous of degree p over S if

f(tx) = tᵖ f(x)

for every t > 0 and every x in S for which tx ∈ S. For a homogeneous scalar field of degree p show that we have

x · ∇f(x) = p f(x)

for each x in S. This is known as Euler's theorem for homogeneous functions. If x = (x₁, …, xₙ) it can be expressed as

x₁ ∂f/∂x₁ + ⋯ + xₙ ∂f/∂xₙ = p f(x₁, …, xₙ).

[Hint: For fixed x, define g(t) = f(tx) and compute g′(1).]
9. Prove the converse of Euler's theorem. That is, if f satisfies x · ∇f(x) = p f(x) for all x in an open set S, then f must be homogeneous of degree p over S. [Hint: For fixed x, define g(t) = f(tx)/tᵖ and show that g′(t) = 0.]
10. Prove the following extension of Euler's theorem for homogeneous functions of degree p in the 2-dimensional case. (Assume equality of the mixed partials.)

x² ∂²f/∂x² + 2xy ∂²f/∂x∂y + y² ∂²f/∂y² = p(p − 1) f(x, y).
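Euler's theorem and the second-order extension in Exercise 10 can be checked on a concrete homogeneous function. In the sketch below (an illustration, not part of the exercises; the sample function and test points are arbitrary) f(x, y) = x³ + xy² is homogeneous of degree p = 3, and all derivatives are computed exactly:

```python
# Sample homogeneous function of degree p = 3 and its exact derivatives.
def f(x, y):
    return x**3 + x*y**2

p = 3

def fx(x, y):  return 3*x**2 + y**2
def fy(x, y):  return 2*x*y
def fxx(x, y): return 6*x
def fxy(x, y): return 2*y
def fyy(x, y): return 2*x

for (x, y) in [(1.0, 2.0), (-0.5, 1.5)]:
    # Euler's theorem: x f_x + y f_y = p f
    assert abs(x*fx(x, y) + y*fy(x, y) - p*f(x, y)) < 1e-9
    # Exercise 10: x^2 f_xx + 2xy f_xy + y^2 f_yy = p(p - 1) f
    lhs = x**2*fxx(x, y) + 2*x*y*fxy(x, y) + y**2*fyy(x, y)
    assert abs(lhs - p*(p - 1)*f(x, y)) < 1e-9
```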
9.4 The one-dimensional wave equation
Imagine a string of infinite length stretched along the x-axis and allowed to vibrate in the xy-plane. We denote by y = f(x, t) the vertical displacement of the string at the point x at time t. We assume that, at time t = 0, the string is displaced along a prescribed curve, y = F(x). An example is shown in Figure 9.1(a); Figures 9.1(b) and 9.1(c) show possible displacement curves for later values of t. We regard the displacement f(x, t) as an unknown function of x and t to be determined. A mathematical model for this problem (suggested by physical considerations which we shall not discuss here) is the partial differential equation

∂²f/∂t² = c² ∂²f/∂x²,

where c is a positive constant depending on the physical characteristics of the string. This equation is called the one-dimensional wave equation. We will solve this equation subject to certain auxiliary conditions.

FIGURE 9.1 The displacement curve y = f(x, t) shown for various values of t.

Since the initial displacement is the prescribed curve y = F(x), we seek a solution satisfying the condition

f(x, 0) = F(x).

We also assume that ∂f/∂t(x, t), the velocity of the vertical displacement, is prescribed at time t = 0, say

∂f/∂t(x, 0) = G(x),

where G is a given function. It seems reasonable to expect that this information should suffice to determine the subsequent motion of the string. We will show that, indeed, this is true by determining the function f in terms of F and G. The solution is expressed in a form given by Jean d'Alembert (1717–1783), a French mathematician and philosopher.
THEOREM 9.2. D'ALEMBERT'S SOLUTION OF THE WAVE EQUATION. Let F and G be given functions such that G is differentiable and F is twice differentiable on R¹. Then the function f given by the formula

(9.11)    f(x, t) = [F(x + ct) + F(x − ct)]/2 + (1/(2c)) ∫_{x−ct}^{x+ct} G(s) ds

satisfies the one-dimensional wave equation

(9.12)    ∂²f/∂t² = c² ∂²f/∂x²

and the initial conditions

(9.13)    f(x, 0) = F(x),    ∂f/∂t(x, 0) = G(x).

Conversely, any function f with equal mixed partials which satisfies (9.12) and (9.13) necessarily has the form (9.11).
Proof. It is a straightforward exercise to verify that the function given by (9.11) satisfies the wave equation and the given initial conditions. This verification is left to the reader. We shall prove the converse.
One way to proceed is to assume that f is a solution of the wave equation, introduce a linear change of variables,

x = Au + Bv,    t = Cu + Dv,

which converts f into a function of u and v, say

g(u, v) = f(Au + Bv, Cu + Dv),

and choose the constants A, B, C, D so that g satisfies the simpler equation

∂²g/∂u∂v = 0.

Solving this equation for g we find that

g(u, v) = φ₁(u) + φ₂(v),

where φ₁ is a function of u alone and φ₂ is a function of v alone. The constants A, B, C, D can be chosen so that u = x + ct, v = x − ct, from which we obtain

(9.14)    f(x, t) = φ₁(x + ct) + φ₂(x − ct).

Then we use the initial conditions (9.13) to determine the functions φ₁ and φ₂ in terms of the given functions F and G.
We will obtain (9.14) by another method which makes use of Theorem 9.1 and avoids the change of variables. First we rewrite the wave equation in the form

(9.15)    L₁(L₂(f)) = 0,

where L₁ and L₂ are the first-order linear differential operators given by

L₁ = ∂/∂t − c ∂/∂x,    L₂ = ∂/∂t + c ∂/∂x.

Let f be a solution of (9.15) and let

u(x, t) = L₂(f)(x, t).

Equation (9.15) states that u satisfies the first-order equation L₁(u) = 0. Hence, by Theorem 9.1 we have

u(x, t) = φ(x + ct)

for some function φ. Let Φ be any primitive of φ, say Φ(y) = ∫₀ʸ φ(s) ds, and let

v(x, t) = (1/(2c)) Φ(x + ct).

We will show that L₂(v) = L₂(f). We have

∂v/∂x = (1/(2c)) φ(x + ct),    ∂v/∂t = (1/2) φ(x + ct),

so L₂(v) = ∂v/∂t + c ∂v/∂x = φ(x + ct) = L₂(f). In other words, the difference w = f − v satisfies the first-order equation

L₂(w) = 0.

By Theorem 9.1 we must have

w(x, t) = ψ(x − ct)

for some function ψ. Therefore

f(x, t) = v(x, t) + ψ(x − ct) = (1/(2c)) Φ(x + ct) + ψ(x − ct).

This proves (9.14) with φ₁(u) = (1/(2c)) Φ(u) and φ₂ = ψ.
Now we use the initial conditions (9.13) to determine the functions φ₁ and φ₂ in terms of the given functions F and G. The relation f(x, 0) = F(x) implies

(9.16)    φ₁(x) + φ₂(x) = F(x).

The other initial condition, ∂f/∂t(x, 0) = G(x), implies

(9.17)    c φ₁′(x) − c φ₂′(x) = G(x).

Differentiating (9.16) we obtain

(9.18)    φ₁′(x) + φ₂′(x) = F′(x).

Solving (9.17) and (9.18) for φ₁′ and φ₂′ we find

φ₁′(x) = (1/2) F′(x) + (1/(2c)) G(x),    φ₂′(x) = (1/2) F′(x) − (1/(2c)) G(x).

Integrating these relations we get

φ₁(x) = φ₁(0) + [F(x) − F(0)]/2 + (1/(2c)) ∫₀ˣ G(s) ds,
φ₂(x) = φ₂(0) + [F(x) − F(0)]/2 − (1/(2c)) ∫₀ˣ G(s) ds.

In the first equation we replace x by x + ct; in the second equation we replace x by x − ct. Then we add the two resulting equations and use the fact that φ₁(0) + φ₂(0) = F(0) to obtain

f(x, t) = [F(x + ct) + F(x − ct)]/2 + (1/(2c)) ∫_{x−ct}^{x+ct} G(s) ds.

This completes the proof.
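D'Alembert's formula (9.11) can be verified numerically for particular data. The sketch below (an illustration, not part of the text; F, G, c, the sample points, and the tolerances are arbitrary choices) uses F(x) = sin x and G(x) = cos x, for which the integral in (9.11) has the closed form [sin(x + ct) − sin(x − ct)]/(2c):

```python
import math

c = 2.0

def F(x): return math.sin(x)
def G(x): return math.cos(x)

def f(x, t):
    # d'Alembert's formula (9.11) with the integral of G evaluated exactly
    return (F(x + c*t) + F(x - c*t))/2 + (math.sin(x + c*t) - math.sin(x - c*t))/(2*c)

def d2(fn, x, t, var, h=1e-4):
    # central second difference in x or in t
    if var == 'x':
        return (fn(x + h, t) - 2*fn(x, t) + fn(x - h, t)) / h**2
    return (fn(x, t + h) - 2*fn(x, t) + fn(x, t - h)) / h**2

for (x, t) in [(0.3, 0.7), (-1.2, 0.25)]:
    # wave equation (9.12): f_tt = c^2 f_xx
    assert abs(d2(f, x, t, 't') - c**2 * d2(f, x, t, 'x')) < 1e-4
    # initial displacement (9.13): f(x, 0) = F(x)
    assert abs(f(x, 0.0) - F(x)) < 1e-12
    # initial velocity (9.13): f_t(x, 0) = G(x)
    ft0 = (f(x, 1e-6) - f(x, -1e-6)) / 2e-6
    assert abs(ft0 - G(x)) < 1e-4
```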
EXAMPLE. Assume the initial displacement is given by the formula

F(x) = 1 + cos x    if −π ≤ x ≤ π,        F(x) = 0    if |x| ≥ π.

The graph of F is shown in Figures 9.1(a) and 9.2(a). Suppose that the initial velocity G(x) = 0 for all x. Then the resulting solution of the wave equation is given by the formula

f(x, t) = [F(x + ct) + F(x − ct)]/2.

FIGURE 9.2 A solution of the wave equation shown for t = 0 and t = 2.

Figures 9.1 and 9.2 show the curve y = f(x, t) for various values of t. The figures illustrate that the solution is a superposition of two waves, one traveling to the right and the other to the left, each with speed c. Further examples illustrating the use of the chain rule in the study of partial differential equations are given in the next set of exercises.
9.5 Exercises
In this set of exercises you may assume differentiability of all functions under consideration.
1. If c is a positive constant and t > 0, let

f(x, t) = ∫₀^{x/(2√(ct))} e^{−u²} du.

(a) Show that

∂f/∂x = (1/(2√(ct))) e^{−x²/(4ct)}    and    ∂f/∂t = −(x/(4t√(ct))) e^{−x²/(4ct)}.

(b) Show that f satisfies the partial differential equation

∂f/∂t = c ∂²f/∂x²    (the heat equation).
2. Consider a scalar field f defined in R² whose value at (x, y) depends only on the distance r of (x, y) from the origin, say f(x, y) = g(r), where r = (x² + y²)^{1/2}.
(a) Prove that for (x, y) ≠ (0, 0) we have

∂²f/∂x² + ∂²f/∂y² = (1/r) g′(r) + g″(r).

(b) Now assume further that f satisfies Laplace's equation,

∂²f/∂x² + ∂²f/∂y² = 0,

for all (x, y) ≠ (0, 0). Use part (a) to prove that

f(x, y) = a log(x² + y²) + b

for (x, y) ≠ (0, 0), where a and b are constants.
3. Repeat Exercise 2 for the n-dimensional case, where n ≥ 3. That is, assume that f(x) = f(x₁, …, xₙ) = g(r), where r = ‖x‖. Show that

∇²f(x) = ((n − 1)/r) g′(r) + g″(r)

for x ≠ O. If f satisfies the n-dimensional Laplace equation,

∇²f(x) = 0

for all x ≠ O, deduce that f(x) = a ‖x‖^{2−n} + b for x ≠ O, where a and b are constants.
Note: The linear operator ∇², defined by the equation

∇²f = ∂²f/∂x₁² + ⋯ + ∂²f/∂xₙ²,

is called the n-dimensional Laplacian.
4. Two-dimensional Laplacian in polar coordinates. The introduction of polar coordinates x = r cos θ, y = r sin θ, converts f(x, y) into g(r, θ). Verify the following formulas:
(a) ‖∇f(x, y)‖² = (∂g/∂r)² + (1/r²)(∂g/∂θ)².
(b) ∂²f/∂x² + ∂²f/∂y² = ∂²g/∂r² + (1/r) ∂g/∂r + (1/r²) ∂²g/∂θ².
5. Three-dimensional Laplacian in spherical coordinates. The introduction of spherical coordinates

x = ρ cos θ sin φ,    y = ρ sin θ sin φ,    z = ρ cos φ,

converts f(x, y, z) to F(ρ, θ, φ). This exercise shows how to express the Laplacian ∇²f in terms of partial derivatives of F.
(a) First introduce polar coordinates x = r cos θ, y = r sin θ to transform f(x, y, z) to g(r, θ, z). Use Exercise 4 to show that

∇²f = ∂²g/∂r² + (1/r) ∂g/∂r + (1/r²) ∂²g/∂θ² + ∂²g/∂z².

(b) Now transform g(r, θ, z) to F(ρ, θ, φ) by taking z = ρ cos φ, r = ρ sin φ. Note that, except for a change in notation, this transformation is the same as that used in part (a). Deduce that

∇²f = ∂²F/∂ρ² + (2/ρ) ∂F/∂ρ + (1/(ρ² sin²φ)) ∂²F/∂θ² + (1/ρ²) ∂²F/∂φ² + (cot φ/ρ²) ∂F/∂φ.
6. This exercise shows how Legendre's differential equation arises when we seek solutions of Laplace's equation having a special form. Let f be a scalar field satisfying the three-dimensional Laplace equation, ∇²f = 0. Introduce spherical coordinates as in Exercise 5 and let F(ρ, θ, φ) denote the corresponding function of ρ, θ, and φ.
(a) Suppose we seek solutions f of Laplace's equation for which F is independent of θ and has the special form F(ρ, θ, φ) = ρⁿ G(φ). Show that f satisfies Laplace's equation if G satisfies the second-order equation

G″(φ) + (cot φ) G′(φ) + n(n + 1) G(φ) = 0.

(b) The change of variable x = cos φ (φ = arccos x, −1 < x < 1) transforms G(φ) to g(x). Show that g satisfies the Legendre equation

(1 − x²) g″(x) − 2x g′(x) + n(n + 1) g(x) = 0.
7. Two-dimensional wave equation. A thin flexible membrane is stretched over the xy-plane and allowed to vibrate. Let z = f(x, y, t) denote the vertical displacement of the membrane at the point (x, y) at time t. Physical considerations suggest that f satisfies the two-dimensional wave equation,

∂²f/∂t² = c² (∂²f/∂x² + ∂²f/∂y²),

where c is a positive constant depending on the physical characteristics of the membrane. This exercise reveals a connection between this equation and Bessel's differential equation.
(a) Introduce polar coordinates x = r cos θ, y = r sin θ, and let F(r, θ, t) = f(r cos θ, r sin θ, t). If f satisfies the wave equation show that F satisfies the equation

∂²F/∂t² = c² (∂²F/∂r² + (1/r) ∂F/∂r + (1/r²) ∂²F/∂θ²).

(b) If F is independent of θ, say F(r, θ, t) = φ(r, t), the equation in (a) simplifies to

∂²φ/∂t² = c² (∂²φ/∂r² + (1/r) ∂φ/∂r).

Now let φ be a solution such that φ(r, t) factors into a function of r times a function of t, say φ(r, t) = R(r)T(t). Show that each of the functions R and T satisfies an ordinary linear differential equation of second order.
(c) If the function T in part (b) is periodic with period 2π/c, show that R satisfies the Bessel equation

r² R″(r) + r R′(r) + r² R(r) = 0.
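The radial harmonic functions of Exercises 2 and 3 can be checked with a finite-difference Laplacian. In this sketch (an illustration, not part of the exercises; the sample points, step size, and tolerances are arbitrary) the 2-dimensional solution a log(x² + y²) + b and the 3-dimensional solution r^{2−n} = 1/r are tested away from the origin:

```python
import math

def laplacian(fn, point, h=1e-4):
    # sum of central second differences over all coordinates
    total = 0.0
    for i in range(len(point)):
        up = list(point); up[i] += h
        dn = list(point); dn[i] -= h
        total += (fn(up) - 2.0*fn(point) + fn(dn)) / h**2
    return total

def f2(p):
    # 2-dimensional case: a log(x^2 + y^2) + b, here with a = b = 1
    x, y = p
    return math.log(x*x + y*y) + 1.0

def f3(p):
    # 3-dimensional case: r^(2-n) with n = 3, i.e. 1/r
    return 1.0 / math.sqrt(sum(t*t for t in p))

assert abs(laplacian(f2, [0.8, -0.6])) < 1e-5
assert abs(laplacian(f3, [0.5, 1.0, -0.7])) < 1e-5
```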
9.6 Derivatives of functions defined implicitly
Some surfaces in R³ are described by Cartesian equations of the form

F(x, y, z) = 0.

An equation like this is said to provide an implicit representation of the surface. For example, the equation x² + y² + z² − 1 = 0 represents the surface of a unit sphere with center at the origin. Sometimes it is possible to solve the equation F(x, y, z) = 0 for one of the variables in terms of the other two, say for z in terms of x and y. This leads to one or more equations of the form

z = f(x, y).

For the sphere we have two solutions,

z = √(1 − x² − y²)    and    z = −√(1 − x² − y²),

one representing the upper hemisphere, the other the lower hemisphere.
In the general case it may not be an easy matter to obtain an explicit formula for z in terms of x and y. For example, there is no easy method of solving for z in the equation

y + xz + e^z − 4 = 0.

Nevertheless, a judicious use of the chain rule makes it possible to deduce various properties of the partial derivatives ∂z/∂x and ∂z/∂y without an explicit knowledge of f(x, y). The procedure is described in this section.
We assume that there is a function f(x, y) such that

(9.19)    F[x, y, f(x, y)] = 0

for all (x, y) in some open set S, although we may not have explicit formulas for calculating f(x, y). We describe this by saying that the equation F(x, y, z) = 0 defines z implicitly as a function of x and y, and we write

z = f(x, y).
Now we introduce an auxiliary function g defined on S as follows:

g(x, y) = F[x, y, f(x, y)].

Equation (9.19) states that g = 0 on S; hence the partial derivatives ∂g/∂x and ∂g/∂y are also 0 on S. But we can also compute these partial derivatives by the chain rule. To do this we write

g(x, y) = F[u₁(x, y), u₂(x, y), u₃(x, y)],

where u₁(x, y) = x, u₂(x, y) = y, and u₃(x, y) = f(x, y). The chain rule gives us the formulas

∂g/∂x = D₁F ∂u₁/∂x + D₂F ∂u₂/∂x + D₃F ∂u₃/∂x,
∂g/∂y = D₁F ∂u₁/∂y + D₂F ∂u₂/∂y + D₃F ∂u₃/∂y,

where each partial derivative DₖF is to be evaluated at (x, y, f(x, y)). Since we have

∂u₁/∂x = 1,    ∂u₂/∂x = 0,    ∂u₃/∂x = ∂f/∂x,

the first of the foregoing equations becomes

D₁F + D₃F ∂f/∂x = 0.

Solving this for ∂f/∂x we obtain

(9.20)    ∂f/∂x = −D₁F[x, y, f(x, y)] / D₃F[x, y, f(x, y)]

at those points at which D₃F[x, y, f(x, y)] ≠ 0. By a similar argument we obtain a corresponding formula for ∂f/∂y:

(9.21)    ∂f/∂y = −D₂F[x, y, f(x, y)] / D₃F[x, y, f(x, y)]

at those points at which D₃F[x, y, f(x, y)] ≠ 0. These formulas are usually written more briefly as follows:

∂z/∂x = −(∂F/∂x)/(∂F/∂z),    ∂z/∂y = −(∂F/∂y)/(∂F/∂z),

where it is understood that the equation F(x, y, z) = 0 defines z as a function of x and y.
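Formulas (9.20) and (9.21) can be tested on an equation that is awkward to solve symbolically but easy to solve numerically. In the sketch below (an illustration, not part of the text; the choice of F and the sample point are arbitrary) F(x, y, z) = z³ + z − xy is strictly increasing in z, so z = f(x, y) can be found by bisection and the implicit-derivative formulas compared with finite differences:

```python
# F and its exact partial derivatives.
def F(x, y, z):  return z**3 + z - x*y
def Fx(x, y, z): return -y
def Fy(x, y, z): return -x
def Fz(x, y, z): return 3*z**2 + 1

def f(x, y):
    # solve F(x, y, z) = 0 for z by bisection (F is increasing in z)
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if F(x, y, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x, y = 1.2, 0.7
z = f(x, y)
h = 1e-5
dzdx = (f(x + h, y) - f(x - h, y)) / (2*h)
dzdy = (f(x, y + h) - f(x, y - h)) / (2*h)
# formulas (9.20) and (9.21): dz/dx = -Fx/Fz, dz/dy = -Fy/Fz
assert abs(dzdx - (-Fx(x, y, z) / Fz(x, y, z))) < 1e-6
assert abs(dzdy - (-Fy(x, y, z) / Fz(x, y, z))) < 1e-6
```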
EXAMPLE. Assume that the equation

z² log y + xz = c

defines z implicitly as a function of x and y, say z = f(x, y). Find a value of the constant c such that f(0, e) = 2, and compute the partial derivatives ∂z/∂x and ∂z/∂y at the point (x, y) = (0, e).

Solution. When x = 0, y = e, and z = 2 the equation becomes 4 + 0 = c, and this is satisfied by c = 4. Let F(x, y, z) = z² log y + xz − 4. From (9.20) and (9.21) we have

∂z/∂x = −z/(2z log y + x)    and    ∂z/∂y = −(z²/y)/(2z log y + x).

When x = 0, y = e, and z = 2 we find ∂z/∂x = −1/2 and ∂z/∂y = −1/e. Note that we were able to compute the partial derivatives ∂z/∂x and ∂z/∂y using only the value of f(x, y) at the single point (0, e).
The foregoing discussion can be extended to functions of more than two variables.
THEOREM 9.3. Let F be a differentiable scalar field on an open set T in Rⁿ. Assume that the equation

F(x₁, …, xₙ) = 0

defines xₙ implicitly as a differentiable function of x₁, …, x_{n−1}, say

xₙ = f(x₁, …, x_{n−1}),

for all points (x₁, …, x_{n−1}) in some open set S in R^{n−1}. Then for each k = 1, 2, …, n − 1, the partial derivative ∂f/∂xₖ is given by the formula

(9.22)    ∂f/∂xₖ = −(∂F/∂xₖ)/(∂F/∂xₙ)

at those points at which ∂F/∂xₙ ≠ 0. The partial derivatives ∂F/∂xₖ and ∂F/∂xₙ which appear in (9.22) are to be evaluated at the point (x₁, …, x_{n−1}, f(x₁, …, x_{n−1})).

The proof is a direct extension of the argument used to derive Equations (9.20) and (9.21) and is left to the reader.
The discussion can be generalized in another way. Suppose we have two surfaces with the following implicit representations:

(9.23)    F(x, y, z) = 0,    G(x, y, z) = 0.
If these surfaces intersect along a curve C, it may be possible to obtain a parametric representation of C by solving the two equations in (9.23) simultaneously for two of the variables in terms of the third, say for x and y in terms of z. Let us suppose that it is possible to solve for x and y and that solutions are given by the equations
x = X(z),    y = Y(z),
for all z in some open interval (a, b). Then when x and y are replaced by X(z) and Y(z), respectively, the two equations in (9.23) are identically satisfied. That is, we can write
F[X(z), Y(z), z] = 0    and    G[X(z), Y(z), z] = 0

for all z in (a, b). Again, by using the chain rule, we can compute the derivatives X′(z) and Y′(z) without an explicit knowledge of X(z) and Y(z). To do this we introduce new functions f and g by means of the equations

f(z) = F[X(z), Y(z), z]    and    g(z) = G[X(z), Y(z), z].

Then f(z) = g(z) = 0 for every z in (a, b), and hence the derivatives f′(z) and g′(z) are also zero on (a, b). By the chain rule these derivatives are given by the formulas

f′(z) = ∂F/∂x X′(z) + ∂F/∂y Y′(z) + ∂F/∂z,    g′(z) = ∂G/∂x X′(z) + ∂G/∂y Y′(z) + ∂G/∂z.

Since f′(z) and g′(z) are both zero we can determine X′(z) and Y′(z) by solving the following pair of simultaneous linear equations:

∂F/∂x X′(z) + ∂F/∂y Y′(z) = −∂F/∂z,
∂G/∂x X′(z) + ∂G/∂y Y′(z) = −∂G/∂z.

At those points at which the determinant of the system is not zero, these equations have a unique solution which can be expressed as follows, using Cramer's rule:

(9.24)    X′(z) = [(∂F/∂y)(∂G/∂z) − (∂F/∂z)(∂G/∂y)] / J,    Y′(z) = [(∂F/∂z)(∂G/∂x) − (∂F/∂x)(∂G/∂z)] / J,

where J = (∂F/∂x)(∂G/∂y) − (∂F/∂y)(∂G/∂x).

The determinants which appear in (9.24) are determinants of Jacobian matrices and are called Jacobian determinants. A special notation is often used to denote Jacobian determinants. We write

∂(F, G)/∂(x, y) = (∂F/∂x)(∂G/∂y) − (∂F/∂y)(∂G/∂x).
In this notation, the formulas in (9.24) can be expressed more briefly in the form

(9.25)    X′(z) = [∂(F, G)/∂(y, z)] / [∂(F, G)/∂(x, y)],    Y′(z) = [∂(F, G)/∂(z, x)] / [∂(F, G)/∂(x, y)].

(The minus sign has been incorporated into the numerators by interchanging the columns.)
The method can be extended to treat more general situations in which m equations in n variables are given, where n > m, and we solve for m of the variables in terms of the remaining n − m variables. The partial derivatives of the new functions so defined can be expressed as quotients of Jacobian determinants, generalizing (9.25). An example with m = 2 and n = 4 is described in Exercise 3 of Section 9.8.
9.7 Worked examples
In this section we illustrate some of the concepts of the foregoing section by solving various types of problems dealing with functions defined implicitly.
EXAMPLE 1. Assume that the equation g(x, y) = 0 determines y as a differentiable function of x, say y = Y(x), for all x in some open interval (a, b). Express the derivative Y′(x) in terms of the partial derivatives of g.
Solution. Let G(x) = g[x, Y(x)] for x in (a, b). Then the equation g(x, y) = 0 implies G(x) = 0 in (a, b). By the chain rule we have

G′(x) = ∂g/∂x + ∂g/∂y Y′(x),

from which we obtain

(9.26)    Y′(x) = −(∂g/∂x)/(∂g/∂y)

at those points x in (a, b) at which ∂g/∂y ≠ 0. The partial derivatives ∂g/∂x and ∂g/∂y are given by the formulas

∂g/∂x = D₁g[x, Y(x)],    ∂g/∂y = D₂g[x, Y(x)].

EXAMPLE 2. When y is eliminated from the two equations z = f(x, y) and g(x, y) = 0, the result can be expressed in the form z = h(x). Express the derivative h′(x) in terms of the partial derivatives of f and g.
Solution. Let us assume that the equation g(x, y) = 0 may be solved for y in terms of x and that a solution is given by y = Y(x) for all x in some open interval (a, b). Then the function h is given by the formula

h(x) = f[x, Y(x)]    if x ∈ (a, b).
Applying the chain rule we have

h′(x) = ∂f/∂x + ∂f/∂y Y′(x).

Using Equation (9.26) of Example 1 we obtain the formula

h′(x) = [(∂f/∂x)(∂g/∂y) − (∂f/∂y)(∂g/∂x)] / (∂g/∂y).

The partial derivatives on the right are to be evaluated at the point (x, Y(x)). Note that the numerator can also be expressed as a Jacobian determinant, giving us

h′(x) = [∂(f, g)/∂(x, y)] / (∂g/∂y).
EXAMPLE 3. The two equations 2x = u² − v² and y = uv define u and v as functions of x and y. Find formulas for ∂u/∂x, ∂u/∂y, ∂v/∂x, ∂v/∂y.

Solution. If we hold y fixed and differentiate the two equations in question with respect to x, remembering that u and v are functions of x and y, we obtain

2 = 2u ∂u/∂x − 2v ∂v/∂x    and    0 = v ∂u/∂x + u ∂v/∂x.

Solving these simultaneously for ∂u/∂x and ∂v/∂x we find

∂u/∂x = u/(u² + v²)    and    ∂v/∂x = −v/(u² + v²).

On the other hand, if we hold x fixed and differentiate the two given equations with respect to y we obtain the equations

0 = 2u ∂u/∂y − 2v ∂v/∂y    and    1 = v ∂u/∂y + u ∂v/∂y.

Solving these simultaneously we find

∂u/∂y = v/(u² + v²)    and    ∂v/∂y = u/(u² + v²).
EXAMPLE 4. Let u be defined as a function of x and y by means of the equation

u = F(x + u, yu).

Find ∂u/∂x and ∂u/∂y in terms of the partial derivatives of F.

Solution. Suppose that u = g(x, y) for all (x, y) in some open set S. Substituting g(x, y) for u in the original equation we must have

(9.27)    g(x, y) = F[u₁(x, y), u₂(x, y)],

where u₁(x, y) = x + g(x, y) and u₂(x, y) = y g(x, y). Now we hold y fixed and differentiate both sides of (9.27) with respect to x, using the chain rule on the right, to obtain

(9.28)    ∂g/∂x = D₁F ∂u₁/∂x + D₂F ∂u₂/∂x.

But ∂u₁/∂x = 1 + ∂g/∂x and ∂u₂/∂x = y ∂g/∂x. Hence (9.28) becomes

∂g/∂x = D₁F (1 + ∂g/∂x) + D₂F y ∂g/∂x.

Solving this equation for ∂g/∂x (and writing ∂u/∂x for ∂g/∂x) we obtain

∂u/∂x = D₁F / (1 − D₁F − y D₂F).

In a similar way we find ∂u₁/∂y = ∂g/∂y and ∂u₂/∂y = g(x, y) + y ∂g/∂y. This leads to the equation

∂g/∂y = D₁F ∂g/∂y + D₂F (u + y ∂g/∂y),

from which we obtain

∂u/∂y = u D₂F / (1 − D₁F − y D₂F).

The partial derivatives D₁F and D₂F are to be evaluated at the point (x + u, yu).

EXAMPLE 5. When u is eliminated from the two equations x = u + v and y = uv², we get an equation of the form F(x, y, v) = 0 which defines v implicitly as a function of x and y, say v = h(x, y). Prove that

∂v/∂x = −v/(2x − 3v),

and find a similar formula for ∂v/∂y.

Solution. Eliminating u from the two given equations, we obtain the relation

xv² − v³ − y = 0.

Let F be the function defined by the equation

F(x, y, v) = xv² − v³ − y.

The discussion in Section 9.6 is now applicable and we can write

(9.29)    ∂v/∂x = −(∂F/∂x)/(∂F/∂v)    and    ∂v/∂y = −(∂F/∂y)/(∂F/∂v).

For this F we have ∂F/∂x = v², ∂F/∂v = 2xv − 3v², and ∂F/∂y = −1. Hence the equations in (9.29) become

∂v/∂x = −v²/(2xv − 3v²) = −v/(2x − 3v)    and    ∂v/∂y = 1/(2xv − 3v²).
EXAMPLE 6. The equation F(x, y, z) = 0 defines z implicitly as a function of x and y, say z = f(x, y). Assuming that ∂F/∂z ≠ 0, show that

(9.30)    ∂²z/∂x² = −[(∂²F/∂x²)(∂F/∂z)² − 2(∂²F/∂x∂z)(∂F/∂x)(∂F/∂z) + (∂²F/∂z²)(∂F/∂x)²] / (∂F/∂z)³,

where the partial derivatives on the right are to be evaluated at (x, y, f(x, y)).

Solution. By Equation (9.20) of Section 9.6 we have

(9.31)    ∂z/∂x = −(∂F/∂x)/(∂F/∂z).

We must remember that this quotient really means

−D₁F[x, y, f(x, y)] / D₃F[x, y, f(x, y)].

Let us introduce

G(x, y) = D₁F[x, y, f(x, y)]    and    H(x, y) = D₃F[x, y, f(x, y)].

Our object is to evaluate the partial derivative with respect to x of the quotient −G/H, holding y fixed. The rule for differentiating quotients gives us

(9.32)    ∂²z/∂x² = −[H ∂G/∂x − G ∂H/∂x] / H².

Since G and H are composite functions, we use the chain rule to compute the partial derivatives ∂G/∂x and ∂H/∂x. For ∂G/∂x we have

∂G/∂x = ∂²F/∂x² + (∂²F/∂z∂x) ∂f/∂x.

Similarly, we find

∂H/∂x = ∂²F/∂x∂z + (∂²F/∂z²) ∂f/∂x.

Substituting these in (9.32) and replacing ∂f/∂x by the quotient in (9.31) we obtain the formula in (9.30).
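Formula (9.30) can be checked against an explicitly solvable case. In this sketch (an illustration, not part of the text; the surface and the sample point are arbitrary choices) the unit sphere F(x, y, z) = x² + y² + z² − 1 is used, where z = f(x, y) is known and ∂²z/∂x² can be approximated by a second difference:

```python
import math

# Upper hemisphere: explicit solution of x^2 + y^2 + z^2 - 1 = 0.
def f(x, y):
    return math.sqrt(1 - x*x - y*y)

x, y = 0.3, 0.4
z = f(x, y)
# Exact partial derivatives of F(x, y, z) = x^2 + y^2 + z^2 - 1.
Fx, Fz = 2*x, 2*z
Fxx, Fxz, Fzz = 2.0, 0.0, 2.0
# Formula (9.30) for d2z/dx2.
formula = -(Fxx*Fz**2 - 2*Fxz*Fx*Fz + Fzz*Fx**2) / Fz**3
h = 1e-4
second_diff = (f(x + h, y) - 2*f(x, y) + f(x - h, y)) / h**2
assert abs(formula - second_diff) < 1e-5
```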
9.8 Exercises
In the exercises in this section you may assume the existence and continuity of all derivatives under consideration.
1. The two equations x = u + v and y = uv determine u and v implicitly as functions of x and y, say u = U(x, y) and v = V(x, y). Show that ∂U/∂x = U/(U − V) if U ≠ V, and find similar formulas for ∂U/∂y, ∂V/∂x, and ∂V/∂y.
2. The two equations x + y = u and xy = v determine x and y implicitly as functions of u and v, say x = X(u, v) and y = Y(u, v). Show that ∂X/∂u = X/(X − Y) if X ≠ Y, and find similar formulas for ∂X/∂v, ∂Y/∂u, and ∂Y/∂v.
3. The two equations F(x, y, u, v) = 0 and G(x, y, u, v) = 0 determine x and y implicitly as functions of u and v, say x = X(u, v) and y = Y(u, v). Show that

∂X/∂u = −[∂(F, G)/∂(u, y)] / [∂(F, G)/∂(x, y)]

at points at which the Jacobian ∂(F, G)/∂(x, y) ≠ 0, and find similar formulas for the partial derivatives ∂X/∂v, ∂Y/∂u, and ∂Y/∂v.
4. The intersection of the two surfaces given by the Cartesian equations 2x² + 3y² − z² = 25 and x² + y² = z² contains a curve C passing through the point P = (√7, 3, 4). These equations may be solved for x and y in terms of z to give a parametric representation of C with z as parameter.
(a) Find a unit tangent vector T to C at the point P without using an explicit knowledge of the parametric representation.
(b) Check the result in part (a) by determining a parametric representation of C with z as parameter.
5. The three equations F(u, v) = 0, u = xy, and v = √(x² + z²) define a surface in xyz-space. Find a normal vector to this surface at the point x = 1, y = 1, z = √3 if it is known that D₁F(1, 2) = 1 and D₂F(1, 2) = 2.
6. The three equations

x² − y cos(uv) + z² = 0,
x² + y² − sin(uv) + 2z² = 2,
xy − sin u cos v + z = 0

define x, y, and z as functions of u and v. Compute the partial derivatives ∂x/∂u and ∂x/∂v at the point x = y = 1, u = π/2, v = 0, z = 0.
7. The equation f(y/x, z/x) = 0 defines z implicitly as a function of x and y, say z = g(x, y). Show that

x ∂g/∂x + y ∂g/∂y = g(x, y)

at those points at which D₂f[y/x, g(x, y)/x] is not zero.
8. Let F be a real-valued function of two real variables and assume that the partial derivatives ∂F/∂x and ∂F/∂y are never zero. Let u be another real-valued function of two real variables such that the partial derivatives ∂u/∂x and ∂u/∂y are related by the equation F(∂u/∂x, ∂u/∂y) = 0. Prove that a constant n exists such that

(∂²u/∂x²)(∂²u/∂y²) = n (∂²u/∂x∂y)²,

and find n. Assume that ∂²u/∂x∂y = ∂²u/∂y∂x.
9. The equation x + z + (y + z)² = 6 defines z implicitly as a function of x and y, say z = f(x, y). Compute the partial derivatives ∂z/∂x, ∂z/∂y, and ∂²z/∂x∂y in terms of x, y, and z.
10. The equation sin(x + y) + sin(y + z) = 1 defines z implicitly as a function of x and y, say z = f(x, y). Compute the second derivative ∂²z/∂x∂y in terms of x, y, and z.
11. The equation F(x + y + z, x² + y² + z²) = 0 defines z implicitly as a function of x and y, say z = f(x, y). Determine the partial derivatives ∂z/∂x and ∂z/∂y in terms of x, y, z and the partial derivatives D₁F and D₂F.
12. Let f and g be functions of one real variable and define F(x, y) = f[x + g(y)]. Find formulas for all the partial derivatives of F of first and second order, expressed in terms of the derivatives of f and g. Verify the relation

(∂F/∂x)(∂²F/∂x∂y) = (∂F/∂y)(∂²F/∂x²).
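The identity in Exercise 12 can be tested numerically. The sketch below (an illustration, not part of the exercises; the choices f(t) = sin t and g(y) = y² are arbitrary) approximates the required partial derivatives of F(x, y) = f[x + g(y)] by central differences:

```python
import math

def F(x, y):
    # F(x, y) = f[x + g(y)] with f(t) = sin t and g(y) = y^2
    return math.sin(x + y*y)

def d(fn, x, y, var, h=1e-5):
    # central difference approximation to dF/dx or dF/dy
    if var == 'x':
        return (fn(x + h, y) - fn(x - h, y)) / (2*h)
    return (fn(x, y + h) - fn(x, y - h)) / (2*h)

x, y = 0.4, 0.9
Fx = d(F, x, y, 'x')
Fy = d(F, x, y, 'y')
Fxx = (F(x + 1e-4, y) - 2*F(x, y) + F(x - 1e-4, y)) / 1e-8
Fxy = d(lambda a, b: d(F, a, b, 'x'), x, y, 'y', h=1e-4)
# Exercise 12: (dF/dx)(d2F/dxdy) = (dF/dy)(d2F/dx2)
assert abs(Fx*Fxy - Fy*Fxx) < 1e-4
```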
9.9 Maxima, minima, and saddle points
A surface that is described explicitly by an equation of the form z = f(x, y) can be thought of as a level surface of the scalar field F defined by the equation

F(x, y, z) = f(x, y) − z.

If f is differentiable, the gradient of this field is given by the vector

∇F = ∂f/∂x i + ∂f/∂y j − k.

A linear equation for the tangent plane at a point (x₁, y₁, z₁) of the surface can be written in the form

z − z₁ = A(x − x₁) + B(y − y₁),

where

A = ∂f/∂x(x₁, y₁)    and    B = ∂f/∂y(x₁, y₁).

When both coefficients A and B are zero, the point (x₁, y₁, z₁) is called a stationary point of the surface. The tangent plane is horizontal at a stationary point. The stationary points of a surface are usually classified into three categories: maxima, minima, and saddle points. If the surface is thought of as a mountain landscape, these categories correspond, respectively, to mountain tops, bottoms of valleys, and mountain passes.
The concepts of maxima, minima, and saddle points can be introduced for arbitrary scalar fields defined on subsets of Rⁿ.
DEFINITION. A scalar field f is said to have an absolute maximum at a point a of a set S in Rⁿ if

(9.33)    f(x) ≤ f(a)

for all x in S. The number f(a) is called the absolute maximum value of f on S. The function f is said to have a relative maximum at a if the inequality in (9.33) is satisfied for every x in some n-ball B(a) lying in S.

In other words, a relative maximum at a is the absolute maximum in some neighborhood of a. The terms absolute minimum and relative minimum are defined in an analogous fashion, using the inequality opposite to that in (9.33). The adjectives global and local are sometimes used in place of absolute and relative, respectively.
DEFINITION. A number which is either a relative maximum or a relative minimum of f is called an extremum of f.

If f has an extremum at an interior point a and is differentiable there, then all first-order partial derivatives D₁f(a), …, Dₙf(a) must be zero. In other words, ∇f(a) = 0. (This is easily proved by holding each component fixed and reducing the problem to the one-dimensional case.) In the case n = 2, this means that there is a horizontal tangent plane to the surface z = f(x, y) at the point (a₁, a₂, f(a)). On the other hand, it is easy to find examples in which the vanishing of all partial derivatives at a does not necessarily imply an extremum at a. This occurs at the so-called saddle points which are defined as follows.
DEFINITION. Assume f is differentiable at a. If ∇f(a) = 0 the point a is called a stationary point of f. A stationary point is called a saddle point if every n-ball B(a) contains points x such that f(x) < f(a) and other points such that f(x) > f(a).
The situation is somewhat analogous to the one-dimensional case in which stationary points of a function are classified as maxima, minima, and points of inflection. The following examples illustrate several types of stationary points. In each case the stationary point in question is at the origin.
EXAMPLE 1. Relative maximum. z = f(x, y) = 2 − x² − y². This surface is a paraboloid of revolution. Its level curves are circles, some of which are shown in Figure 9.3(b). Since f(x, y) ≤ 2 = f(0, 0) for all (x, y), it follows that f not only has a relative maximum at (0, 0), but also an absolute maximum there. Both partial derivatives ∂f/∂x and ∂f/∂y vanish at the origin.

FIGURE 9.3 Examples 1 and 2. (a) z = 2 − x² − y², relative maximum at the origin; (b) level curves x² + y² = c; (c) z = x² + y², relative minimum at the origin.

EXAMPLE 2. Relative minimum. z = f(x, y) = x² + y². This example, another paraboloid of revolution, is essentially the same as Example 1, except that there is a minimum at the origin rather than a maximum. The appearance of the surface near the origin is illustrated in Figure 9.3(c), and some of the level curves are shown in Figure 9.3(b).
EXAMPLE 3. Saddle point. z = f(x, y) = xy. This surface is a hyperbolic paraboloid. Near the origin the surface is saddle shaped, as shown in Figure 9.4(a). Both partial derivatives ∂f/∂x and ∂f/∂y are zero at the origin but there is neither a relative maximum nor a relative minimum there. In fact, for points (x, y) in the first or third quadrants, x and y have the same sign, giving us f(x, y) > 0 = f(0, 0), whereas for points in the second and fourth quadrants x and y have opposite signs, giving us f(x, y) < 0 = f(0, 0). Therefore, in every neighborhood of the origin there are points at which the function is less than f(0, 0) and points at which the function exceeds f(0, 0), so the origin is a saddle point. The presence of the saddle point is also revealed by Figure 9.4(b), which shows some of the level curves near (0, 0). These are hyperbolas having the x- and y-axes as asymptotes.

FIGURE 9.4 Example 3. Saddle point at the origin. (a) z = xy; (b) level curves xy = c.
EXAMPLE 4. Saddle point. z = f(x, y) = x³ − 3xy². Near the origin, this surface has the appearance of a mountain pass in the vicinity of three peaks. This surface, sometimes referred to as a "monkey saddle," is shown in Figure 9.5(a). Some of the level curves are illustrated in Figure 9.5(b). It is clear that there is a saddle point at the origin.

EXAMPLE 5. Relative minimum. z = f(x, y) = x²y². This surface has the appearance of a valley surrounded by four mountains, as suggested by Figure 9.6(a). There is an absolute minimum at the origin, because f(x, y) ≥ 0 = f(0, 0) for all (x, y). The level curves [shown in Figure 9.6(b)] are hyperbolas having the x- and y-axes as asymptotes. Note that these level curves are similar to those in Example 3. In this case, however, the function assumes only nonnegative values on all its level curves.

EXAMPLE 6. Relative maximum. z = f(x, y) = 1 − x². In this case the surface is a cylinder with generators parallel to the y-axis, as shown in Figure 9.7(a). Cross sections cut by planes parallel to the x-axis are parabolas. There is obviously an absolute maximum at each point of the y-axis, since f(x, y) = 1 − x² ≤ 1 = f(0, y).

FIGURE 9.5 Example 4. Saddle point at the origin. (a) z = x³ − 3xy²; (b) level curves x³ − 3xy² = c.
FIGURE 9.6 Example 5. Relative minimum at the origin. (a) z = x²y²; (b) level curves x²y² = c.
FIGURE 9.7 Example 6. Relative maximum at the origin; the tangent plane is horizontal along the y-axis. (a) z = 1 − x²; (b) level curves 1 − x² = c.
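The classification of the stationary point at the origin in Examples 1 through 5 can be reproduced by brute force: sample f − f(0, 0) on a small grid around the origin and inspect the signs. This sketch (an illustration, not part of the text; the grid size is arbitrary, and the test only detects the behaviour on the sampled neighbourhood) agrees with the discussion above:

```python
def classify(f, h=1e-3):
    # compare f near the origin with f(0, 0) on a small grid
    vals = [f(h*i/4, h*j/4) - f(0.0, 0.0)
            for i in range(-4, 5) for j in range(-4, 5) if (i, j) != (0, 0)]
    if all(v <= 0 for v in vals):
        return 'maximum'
    if all(v >= 0 for v in vals):
        return 'minimum'
    return 'saddle'

assert classify(lambda x, y: 2 - x*x - y*y) == 'maximum'    # Example 1
assert classify(lambda x, y: x*x + y*y) == 'minimum'        # Example 2
assert classify(lambda x, y: x*y) == 'saddle'               # Example 3
assert classify(lambda x, y: x**3 - 3*x*y**2) == 'saddle'   # Example 4 (monkey saddle)
assert classify(lambda x, y: x*x*y*y) == 'minimum'          # Example 5
```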
9.10 Second-order Taylor formula for scalar fields
If a differentiable scalar field f has a stationary point at a, the nature of the stationary point is determined by the algebraic sign of the difference f(x) − f(a) for x near a. If x = a + y, we have the first-order Taylor formula

f(a + y) = f(a) + ∇f(a) · y + ‖y‖ E(a, y),

where E(a, y) → 0 as y → 0. At a stationary point, ∇f(a) = 0 and the Taylor formula becomes

f(a + y) − f(a) = ‖y‖ E(a, y).

To determine the algebraic sign of f(a + y) − f(a) we need more information about the error term ‖y‖ E(a, y). The next theorem shows that if f has continuous second-order partial derivatives at a, the error term is equal to a quadratic form,

(1/2) Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ D_{ij}f(a) yᵢ yⱼ,

plus a term of smaller order than ‖y‖². The coefficients of the quadratic form are the second-order partial derivatives D_{ij}f = Dᵢ(Dⱼf), evaluated at a. The n × n matrix of second-order derivatives D_{ij}f(x) is called the Hessian matrix† and is denoted by H(x). Thus, we have

H(x) = [D_{ij}f(x)]  (i, j = 1, …, n),

whenever the derivatives exist. The quadratic form can be written more simply in matrix notation as follows:

Σᵢ Σⱼ D_{ij}f(a) yᵢ yⱼ = y H(a) yᵗ,

where y = (y₁, …, yₙ) is considered as a 1 × n row matrix, and yᵗ is its transpose, an n × 1 column matrix. When the partial derivatives D_{ij}f are continuous we have D_{ij}f = D_{ji}f and the matrix H(a) is symmetric. Taylor's formula, giving a quadratic approximation to f(a + y) − f(a), now takes the following form.
THEOREM 9.4. SECOND-ORDER TAYLOR FORMULA FOR SCALAR FIELDS. Let f be a scalar field with continuous second-order partial derivatives D_ij f in an n-ball B(a). Then for all y in Rⁿ such that a + y ∈ B(a) we have

(9.34)   f(a + y) − f(a) = ∇f(a)·y + (1/2!) y H(a + cy) yᵗ,   where 0 < c < 1.

This can also be written in the form

(9.35)   f(a + y) − f(a) = ∇f(a)·y + (1/2!) y H(a) yᵗ + ‖y‖² E₂(a, y),

where E₂(a, y) → 0 as y → 0.

† Named for Ludwig Otto Hesse (1811–1874), a German mathematician who made many contributions to the theory of surfaces.
Proof. Keep y fixed and define g(u) for real u by the equation

g(u) = f(a + uy).

We will prove the theorem by applying the second-order Taylor formula to g on the interval [0, 1]. We obtain

(9.36)   g(1) = g(0) + g′(0) + (1/2!) g″(c),   where 0 < c < 1.

Here we have used Lagrange's form of the remainder (see Section 7.7 of Volume I). Since g is a composite function given by g(u) = f[r(u)], where r(u) = a + uy, we can compute its derivative by the chain rule. We have r′(u) = y, so the chain rule gives us

g′(u) = ∇f[r(u)]·r′(u) = ∇f[r(u)]·y = Σ_{i=1}^n D_i f[r(u)] y_i,

provided r(u) ∈ B(a). In particular, g′(0) = ∇f(a)·y. Using the chain rule once more we find

g″(u) = Σ_{j=1}^n Σ_{i=1}^n D_ij f[r(u)] y_i y_j = y H(r(u)) yᵗ.

Hence g″(c) = y H(a + cy) yᵗ, so Equation (9.36) becomes (9.34).

To prove (9.35) we define E₂(a, y) by the equation

(9.37)   ‖y‖² E₂(a, y) = (1/2!) y {H(a + cy) − H(a)} yᵗ   if y ≠ 0,

and let E₂(a, 0) = 0. Then Equation (9.34) takes the form

f(a + y) − f(a) = ∇f(a)·y + (1/2!) y H(a) yᵗ + ‖y‖² E₂(a, y).

To complete the proof we need to show that E₂(a, y) → 0 as y → 0. From (9.37) we find that

‖y‖² |E₂(a, y)| ≤ (1/2) Σ_{i=1}^n Σ_{j=1}^n |D_ij f(a + cy) − D_ij f(a)| |y_i| |y_j|.
Since |y_i| |y_j| ≤ ‖y‖², dividing by ‖y‖² we obtain the inequality

|E₂(a, y)| ≤ (1/2) Σ_{i=1}^n Σ_{j=1}^n |D_ij f(a + cy) − D_ij f(a)|

for y ≠ 0. Since each second-order partial derivative D_ij f is continuous at a, we have D_ij f(a + cy) → D_ij f(a) as y → 0, so E₂(a, y) → 0 as y → 0. This completes the proof.
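Formula (9.35) can be illustrated numerically. The sketch below is our own construction, not from the text: it takes f(x, y) = eˣ cos y at a = (0, 0), where ∇f(0, 0) = (1, 0) and H(0, 0) = diag(1, −1), and checks that the error term E₂(a, y) tends to zero as y → 0.

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y)

grad = (1.0, 0.0)                      # gradient of f at (0, 0)
H = [[1.0, 0.0], [0.0, -1.0]]          # Hessian matrix H(0, 0)

def quadratic_error(y1, y2):
    """E2(a, y) = (f(a+y) - f(a) - grad.y - (1/2) y H y^t) / ||y||^2."""
    quad = 0.5 * (H[0][0]*y1*y1 + 2*H[0][1]*y1*y2 + H[1][1]*y2*y2)
    return (f(y1, y2) - f(0, 0) - (grad[0]*y1 + grad[1]*y2) - quad) / (y1*y1 + y2*y2)

# The error term E2(a, y) should tend to 0 as y -> 0.
errors = [abs(quadratic_error(t, t)) for t in (0.1, 0.01, 0.001)]
assert errors[0] > errors[1] > errors[2]
assert errors[2] < 1e-3
```
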
9.11 The nature of a stationary point determined by the eigenvalues of the Hessian matrix
At a stationary point we have ∇f(a) = 0, so the Taylor formula in Equation (9.35) becomes

f(a + y) − f(a) = (1/2!) y H(a) yᵗ + ‖y‖² E₂(a, y).

Since the error term ‖y‖² E₂(a, y) tends to zero faster than ‖y‖², it seems reasonable to expect that for small y the algebraic sign of f(a + y) − f(a) is the same as that of the quadratic form y H(a) yᵗ; hence the nature of the stationary point should be determined by the algebraic sign of the quadratic form. This section is devoted to a proof of this fact. First we give a connection between the algebraic sign of a quadratic form and its eigenvalues.
THEOREM 9.5. Let A = [a_ij] be an n × n real symmetric matrix, and let

Q(y) = y A yᵗ = Σ_{i=1}^n Σ_{j=1}^n a_ij y_i y_j.

Then we have:
(a) Q(y) > 0 for all y ≠ 0 if and only if all the eigenvalues of A are positive.
(b) Q(y) < 0 for all y ≠ 0 if and only if all the eigenvalues of A are negative.

Note: In case (a), the quadratic form is called positive definite; in case (b) it is called negative definite.
Proof. According to Theorem 5.11 there is an orthogonal matrix C that reduces the quadratic form y A yᵗ to a diagonal form. That is,

(9.38)   Q(y) = y A yᵗ = Σ_{i=1}^n λ_i x_i²,

where x = (x₁, ..., x_n) is the row matrix x = yC, and λ₁, ..., λ_n are the eigenvalues of A. The eigenvalues are real since A is symmetric.

If all the eigenvalues are positive, Equation (9.38) shows that Q(y) > 0 whenever x ≠ 0. But since x = yC we have y = xC⁻¹, so x ≠ 0 if and only if y ≠ 0. Therefore Q(y) > 0 for all y ≠ 0.

Conversely, if Q(y) > 0 for all y ≠ 0 we can choose y so that x = yC is the kth coordinate vector e_k. For this y, Equation (9.38) gives us Q(y) = λ_k, so each λ_k > 0. This proves part (a). The proof of (b) is entirely analogous.
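Theorem 9.5 is easy to test numerically; the illustration below is ours, not part of the text. It compares the eigenvalues of a symmetric matrix against the sign of Q(y) = yAyᵗ on random nonzero vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

def is_positive_definite_by_values(A, trials=1000):
    """Check Q(y) = y A y^t > 0 on many random nonzero vectors y."""
    n = A.shape[0]
    for _ in range(trials):
        y = rng.standard_normal(n)
        if y @ A @ y <= 0:
            return False
    return True

# A symmetric matrix with all eigenvalues positive ...
A = np.array([[2.0, -1.0], [-1.0, 2.0]])       # eigenvalues 1 and 3
assert np.all(np.linalg.eigvalsh(A) > 0)
assert is_positive_definite_by_values(A)

# ... and one with eigenvalues of both signs, so Q takes both signs.
B = np.array([[0.0, 1.0], [1.0, 0.0]])         # eigenvalues -1 and 1
assert np.any(np.linalg.eigvalsh(B) < 0)
assert not is_positive_definite_by_values(B)
```
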
The next theorem describes the nature of a stationary point in terms of the algebraic sign of the quadratic form y H(a) yᵗ.

THEOREM 9.6. Let f be a scalar field with continuous second-order partial derivatives D_ij f in an n-ball B(a), and let H(a) denote the Hessian matrix at a stationary point a. Then we have:
(a) If all the eigenvalues of H(a) are positive, f has a relative minimum at a.
(b) If all the eigenvalues of H(a) are negative, f has a relative maximum at a.
(c) If H(a) has both positive and negative eigenvalues, then f has a saddle point at a.
Proof. Let Q(y) = y H(a) yᵗ. The Taylor formula (9.35) gives us

(9.39)   f(a + y) − f(a) = (1/2) Q(y) + ‖y‖² E₂(a, y),

where E₂(a, y) → 0 as y → 0. We will prove that there is a positive number r such that, if 0 < ‖y‖ < r, the algebraic sign of f(a + y) − f(a) is the same as that of Q(y).

Assume first that all the eigenvalues λ₁, ..., λ_n of H(a) are positive. Let h be the smallest eigenvalue. If t < h, the numbers λ₁ − t, ..., λ_n − t are also positive. These numbers are the eigenvalues of the real symmetric matrix H(a) − tI, where I is the n × n identity matrix. By Theorem 9.5, the quadratic form y{H(a) − tI}yᵗ is positive definite, and hence y H(a) yᵗ > t‖y‖² for all y ≠ 0 and all real t < h. Taking t = (1/2)h we obtain the inequality

Q(y) > (1/2) h ‖y‖²   for all y ≠ 0.

Since E₂(a, y) → 0 as y → 0, there is a positive number r such that |E₂(a, y)| < (1/4)h whenever 0 < ‖y‖ < r. For such y we have

‖y‖² |E₂(a, y)| < (1/4) h ‖y‖² < (1/2) Q(y),

and Taylor's formula (9.39) shows that f(a + y) − f(a) > 0. Therefore f has a relative minimum at a, which proves part (a). To prove (b) we can use a similar argument, or simply apply part (a) to −f.

To prove (c), let λ₁ and λ₂ be two eigenvalues of H(a) of opposite signs. Let h = min{|λ₁|, |λ₂|}. Then for each real t satisfying −h < t < h the numbers λ₁ − t and λ₂ − t are eigenvalues of opposite sign for the matrix H(a) − tI. Therefore, if 0 < |t| < h, the quadratic form y{H(a) − tI}yᵗ takes both positive and negative values in every neighborhood of y = 0. Choose r > 0 as above so that |E₂(a, y)| < (1/4)h whenever 0 < ‖y‖ < r. Then, arguing as above, we see that for such y the algebraic sign of f(a + y) − f(a) is the same as that of Q(y). Since both positive and negative values of Q(y) occur as y → 0, f has a saddle point at a. This completes the proof.

Note: If all the eigenvalues of H(a) are zero, Theorem 9.6 gives no information concerning the stationary point. Tests involving higher order derivatives can be used to treat such examples, but we shall not discuss them here.
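Theorem 9.6, together with the note above, translates directly into a small classifier. This is a sketch of ours, not part of the text; the function name and the tolerance are our own choices:

```python
import numpy as np

def classify(hessian_eigenvalues, tol=1e-12):
    """Classify a stationary point from the eigenvalues of H(a)."""
    lam = np.asarray(hessian_eigenvalues)
    if np.all(lam > tol):
        return "relative minimum"
    if np.all(lam < -tol):
        return "relative maximum"
    if np.any(lam > tol) and np.any(lam < -tol):
        return "saddle point"
    return "no information"          # some eigenvalue is (numerically) zero

# f(x, y) = x^2 + y^2 at the origin: H = 2I, eigenvalues 2, 2.
assert classify(np.linalg.eigvalsh(np.diag([2.0, 2.0]))) == "relative minimum"

# f(x, y) = xy at the origin: H = [[0, 1], [1, 0]], eigenvalues -1, 1.
H = np.array([[0.0, 1.0], [1.0, 0.0]])
assert classify(np.linalg.eigvalsh(H)) == "saddle point"

# f(x, y) = x^2*y^2 at the origin: H = 0, so the theorem gives no information.
assert classify(np.linalg.eigvalsh(np.zeros((2, 2)))) == "no information"
```
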
9.12 Second-derivative test for extrema of functions of two variables
In the case n = 2 the nature of the stationary point can also be determined by the algebraic sign of the second derivative D₁₁f(a) and the determinant of the Hessian matrix.

THEOREM 9.7. Let a be a stationary point of a scalar field f(x₁, x₂) with continuous second-order partial derivatives in a 2-ball B(a). Let

A = D₁₁f(a),   B = D₁₂f(a),   C = D₂₂f(a),

and let

Δ = det H(a) = AC − B².

Then we have:
(a) If Δ < 0, f has a saddle point at a.
(b) If Δ > 0 and A > 0, f has a relative minimum at a.
(c) If Δ > 0 and A < 0, f has a relative maximum at a.
(d) If Δ = 0, the test is inconclusive.
Proof. In this case the characteristic equation det[λI − H(a)] = 0 is a quadratic equation,

λ² − (A + C)λ + Δ = 0.

The eigenvalues λ₁, λ₂ are related to the coefficients by the equations

λ₁ + λ₂ = A + C,   λ₁λ₂ = Δ.

If Δ < 0 the eigenvalues have opposite signs, so f has a saddle point at a, which proves (a). If Δ > 0 the eigenvalues have the same sign. In this case AC > B² ≥ 0, so A and C have the same sign. This sign must be that of λ₁ and λ₂ since A + C = λ₁ + λ₂. This proves (b) and (c).

To prove (d) we refer to Examples 4 and 5 of Section 9.9. In both these examples we have Δ = 0 at the origin. In Example 4 the origin is a saddle point, and in Example 5 it is a relative minimum.
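Theorem 9.7 can be packaged as a short routine. The sketch below is our own illustration, not part of the text:

```python
def second_derivative_test(A, B, C):
    """Apply Theorem 9.7 given A = D11 f(a), B = D12 f(a), C = D22 f(a)."""
    delta = A * C - B * B
    if delta < 0:
        return "saddle point"
    if delta > 0:
        return "relative minimum" if A > 0 else "relative maximum"
    return "inconclusive"

# f(x, y) = xy at the origin: A = 0, B = 1, C = 0, delta = -1.
assert second_derivative_test(0, 1, 0) == "saddle point"

# f(x, y) = x^2 + y^2 at the origin: A = 2, B = 0, C = 2, delta = 4.
assert second_derivative_test(2, 0, 2) == "relative minimum"

# Examples 4 and 5 of Section 9.9 both give delta = 0 at the origin.
assert second_derivative_test(0, 0, 0) == "inconclusive"
```
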
Even when Theorem 9.7 is applicable it may not be the simplest way to determine the nature of a stationary point. For example, when

f(x, y) = 2x sin y + 2 cos y − x² − 2,

the test is applicable, but the computations are lengthy. In this case we may express −f(x, y) as a sum of squares by writing

−f(x, y) = (x − sin y)² + (1 − cos y)².

We see at once that f has relative maxima at the points at which x − sin y = 0 and 1 − cos y = 0. These are the points (0, 2nπ), when n is any integer.
9.13 Exercises
In Exercises 1 through 15, locate and classify the stationary points (if any) of the surfaces having the Cartesian equations given.
13. z = sin x sin y sin(x + y).
16. Let f(x, y) = (y − x²)(y − 2x²). Show that on every line y = mx the function has a minimum at (0, 0), but that there is no relative minimum in any two-dimensional neighborhood of the origin. Make a sketch indicating the set of points (x, y) at which f(x, y) > 0 and the set at which f(x, y) < 0.
17. Let f(x, y) = (3 − x)(3 − y)(x + y − 3).
(a) Make a sketch indicating the set of points (x, y) at which f(x, y) ≥ 0.
(b) Find all points (x, y) in the plane at which D₁f(x, y) = D₂f(x, y) = 0. [Hint: D₁f(x, y) has (3 − y) as a factor.]
(c) Which of the stationary points are relative maxima? Which are relative minima? Which are neither? Give reasons for your answers.
(d) Does f have an absolute minimum or an absolute maximum on the whole plane? Give reasons for your answers.
18. Determine all the relative and absolute extreme values and the saddle points for the function f(x, y) = xy(1 − x² − y²) on the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.
19. Determine constants a and b such that the integral

∫₀¹ {ax + b − f(x)}² dx

will be as small as possible if (a) f(x) = x², (b) f(x) = (x² + 1)⁻¹.
20. Let f(x, y) = Ax² + 2Bxy + Cy² + 2Dx + 2Ey + F, where A > 0 and B² < AC.
(a) Prove that a point exists at which f has a minimum. [Hint: Transform the quadratic part to a sum of squares.]
(b) Prove that f(x, y) = Dx + Ey + F at this minimum.
21. Method of least squares. Given n distinct numbers x₁, ..., x_n and n further numbers y₁, ..., y_n (not necessarily distinct), it is generally impossible to find a straight line f(x) = ax + b which passes through all the points (x_i, y_i), that is, such that f(x_i) = y_i for each i. However, we can try a linear function which makes the "total square error"

E(a, b) = Σ_{i=1}^n [f(x_i) − y_i]²

a minimum. Determine values of a and b which do this.
22. Extend the method of least squares to 3-space. That is, find a linear function f(x, y) = ax + by + c which minimizes the total square error

E(a, b, c) = Σ_{i=1}^n [f(x_i, y_i) − z_i]²,

where (x₁, y₁), ..., (x_n, y_n) are n given distinct points and z₁, ..., z_n are n given real numbers.
23. Let p₁, ..., p_n be n distinct points in m-space. If x ∈ Rᵐ, define

f(x) = Σ_{k=1}^n ‖x − p_k‖².

Prove that f has a minimum at the point a = (1/n) Σ_{k=1}^n p_k (the centroid).
24. Let a be a stationary point of a scalar field f with continuous second-order partial derivatives in an n-ball B(a). Prove that f has a saddle point at a if at least two of the diagonal entries of the Hessian matrix H(a) have opposite signs.
25. Verify that the scalar field f(x, y, z) = x⁴ + y⁴ + z⁴ − 4xyz has a stationary point at (1, 1, 1), and determine the nature of this stationary point by computing the eigenvalues of its Hessian matrix.
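As a check on the least-squares method of Exercise 21, setting the partial derivatives of E(a, b) to zero gives the standard normal equations. The closed form used below is this standard solution; the sketch is our own illustration, not part of the text:

```python
def least_squares_line(xs, ys):
    """Return (a, b) minimizing sum((a*x + b - y)^2) over the data."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Points lying exactly on y = 2x + 1 are recovered exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2 * x + 1 for x in xs]
a, b = least_squares_line(xs, ys)
assert abs(a - 2.0) < 1e-12 and abs(b - 1.0) < 1e-12
```
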
9.14 Extrema with constraints. Lagrange’s multipliers
We begin this section with two examples of extremum problems with constraints.

EXAMPLE 1. Given a surface S not passing through the origin, determine those points of S which are nearest to the origin.

EXAMPLE 2. If f(x, y, z) denotes the temperature at (x, y, z), determine the maximum and minimum values of the temperature on a given curve C in 3-space.

Both these examples are special cases of the following general problem: Determine the extreme values of a scalar field f(x) when x is restricted to lie in a given subset of the domain of f. In Example 1 the scalar field to be minimized is the distance function,

f(x, y, z) = (x² + y² + z²)^(1/2);

the constraining subset is the given surface S. In Example 2 the constraining subset is the given curve C.

Constrained extremum problems are often very difficult; no general method is known for attacking them in their fullest generality. Special methods are available when the constraining subset has a fairly simple structure, for instance, if it is a surface as in Example 1, or a curve as in Example 2. This section discusses the method of Lagrange's multipliers for solving such problems. First we describe the method in its general form, and then we give geometric arguments to show why it works in the two examples mentioned above.
The method of Lagrange's multipliers. If a scalar field f(x₁, ..., x_n) has a relative extremum when it is subject to m constraints, say

(9.40)   g₁(x₁, ..., x_n) = 0, ..., g_m(x₁, ..., x_n) = 0,

where m < n, then there exist m scalars λ₁, ..., λ_m such that

(9.41)   ∇f = λ₁∇g₁ + ··· + λ_m∇g_m

at each extremum point.

To determine the extremum points in practice we consider the system of n + m equations obtained by taking the m constraint equations in (9.40) along with the n scalar equations determined by the vector relation (9.41). These equations are to be solved (if possible) for the n + m unknowns x₁, ..., x_n and λ₁, ..., λ_m. The points (x₁, ..., x_n) at which relative extrema occur are found among the solutions to these equations.

The scalars λ₁, ..., λ_m which are introduced to help us solve this type of problem are called Lagrange's multipliers. One multiplier is introduced for each constraint. The scalar field f and the constraint functions g₁, ..., g_m are assumed to be differentiable. The method is valid if the number of constraints, m, is less than the number of variables, n, and if not all the Jacobian determinants of the constraint functions with respect to m of the variables x₁, ..., x_n are zero at the extreme value in question. The proof of the validity of the method is an important result in advanced calculus and will not be discussed here. (See Chapter 7 of the author's Mathematical Analysis, Addison-Wesley, Reading, Mass., 1957.) Instead we give geometric arguments to show why the method works in the two examples described at the beginning of this section.
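For a concrete instance of the system (9.40) and (9.41), take f(x, y) = xy subject to the single constraint x + y − 1 = 0 (Exercise 1 of Section 9.15). Here (9.41) reads (y, x) = λ(1, 1), so x = y = λ, and the constraint forces x = y = 1/2. The sketch below, our own illustration, records this hand solution and confirms it by brute force:

```python
# Hand solution of the Lagrange system: y = lam, x = lam, x + y = 1.
x = y = lam = 0.5
assert abs(y - lam) < 1e-12 and abs(x - lam) < 1e-12      # equation (9.41)
assert abs(x + y - 1) < 1e-12                             # constraint (9.40)

# Brute-force check that f = xy really is maximal at x = 1/2 on the line x + y = 1.
candidates = [(t, 1 - t) for t in [i / 1000 for i in range(1001)]]
best = max(candidates, key=lambda p: p[0] * p[1])
assert abs(best[0] - 0.5) < 1e-3
```
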
Geometric solution of Example 1. We wish to determine those points on a given surface S which are nearest to the origin. A point (x, y, z) in 3-space lies at a distance r from the origin if and only if it lies on the sphere

x² + y² + z² = r².

This sphere is a level surface of the function f(x, y, z) = (x² + y² + z²)^(1/2) which is being minimized. If we start with r = 0 and let r increase until the corresponding level surface first touches the given surface S, each point of contact will be a point of S nearest to the origin.

To determine the coordinates of the contact points we assume that S is described by a Cartesian equation g(x, y, z) = 0. If S has a tangent plane at a point of contact, this plane must also be tangent to the contacting level surface. Therefore the gradient vector of the surface g(x, y, z) = 0 must be parallel to the gradient vector of the contacting level surface f(x, y, z) = r. Hence there is a constant λ such that

∇f = λ∇g

at each contact point. This is the vector equation (9.41) provided by Lagrange's method when there is one constraint.
Geometric solution of Example 2. We seek the extreme values of a temperature function f(x, y, z) on a given curve C. If we regard the curve C as the intersection of two surfaces, say

g₁(x, y, z) = 0   and   g₂(x, y, z) = 0,

we have an extremum problem with two constraints. The two gradient vectors ∇g₁ and ∇g₂ are normals to these surfaces, hence they are also normal to C, the curve of intersection. (See Figure 9.8.) We show next that the gradient vector ∇f of the temperature function is also normal to C at each relative extremum on C. This implies that ∇f lies in the same plane as ∇g₁ and ∇g₂; hence if ∇g₁ and ∇g₂ are independent we can express ∇f as a linear combination of ∇g₁ and ∇g₂, say

∇f = λ₁∇g₁ + λ₂∇g₂.

This is the vector equation (9.41) provided by Lagrange's method when there are two constraints.

FIGURE 9.8 The vectors ∇g₁, ∇g₂, and ∇f shown lying in the same plane.

FIGURE 9.9 The gradient vector ∇f lies in a plane normal to C.
To show that ∇f is normal to C at an extremum we imagine C as being described by a vector-valued function α(t), where t varies over an interval [a, b]. On the curve C the temperature becomes a function of t, say φ(t) = f[α(t)]. If φ has a relative extremum at an interior point t₁ of [a, b] we must have φ′(t₁) = 0. On the other hand, the chain rule tells us that φ′(t) is given by the dot product

φ′(t) = ∇f[α(t)] · α′(t).

This dot product is zero at t₁, hence ∇f is perpendicular to α′(t₁). But α′(t₁) is tangent to C, so ∇f lies in the plane normal to C, as shown in Figure 9.9.

The two gradient vectors ∇g₁ and ∇g₂ are independent if and only if their cross product is nonzero. The cross product is given by

∇g₁ × ∇g₂ = ∂(g₁, g₂)/∂(y, z) i + ∂(g₁, g₂)/∂(z, x) j + ∂(g₁, g₂)/∂(x, y) k.

Therefore, independence of ∇g₁ and ∇g₂ means that not all three of the Jacobian determinants on the right are zero. As remarked earlier, Lagrange's method is applicable whenever this condition is satisfied.
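The relation between the cross product and the three Jacobian determinants can be spot-checked numerically. The illustration below is ours, not from the text; the surfaces chosen here, a sphere and a plane, are our own example:

```python
import numpy as np

def cross_of_gradients(grad_g1, grad_g2):
    """Return grad g1 x grad g2; its components are the Jacobians
    d(g1,g2)/d(y,z), d(g1,g2)/d(z,x), d(g1,g2)/d(x,y)."""
    return np.cross(grad_g1, grad_g2)

# Two surfaces through (1, 0, 0): the sphere g1 = x^2 + y^2 + z^2 - 1
# and the plane g2 = z.
grad_g1 = np.array([2.0, 0.0, 0.0])    # grad g1 at (1, 0, 0)
grad_g2 = np.array([0.0, 0.0, 1.0])    # grad g2 at (1, 0, 0)

c = cross_of_gradients(grad_g1, grad_g2)
assert np.any(c != 0)                  # gradients independent: method applies

# First component equals the Jacobian d(g1, g2)/d(y, z):
J_yz = grad_g1[1] * grad_g2[2] - grad_g1[2] * grad_g2[1]
assert c[0] == J_yz
```
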
If ∇g₁ and ∇g₂ are dependent the method may fail. For example, suppose we try to apply Lagrange's method to find the extreme values of f(x, y, z) = x² + y² on the curve of intersection of the two surfaces g₁(x, y, z) = 0 and g₂(x, y, z) = 0, where g₁(x, y, z) = z and g₂(x, y, z) = z² − (y − 1)³. The two surfaces, a plane and a cylinder, intersect along the straight line C shown in Figure 9.10. The problem obviously has a solution, because f(x, y, z) represents the square of the distance of the point (x, y, z) from the z-axis, and this distance is a minimum on C when the point is at (0, 1, 0). However, at this point the gradient vectors are

∇g₁ = k,   ∇g₂ = 0,   and   ∇f = 2j,

and it is clear that there are no scalars λ₁ and λ₂ that satisfy Equation (9.41).

FIGURE 9.10 An example where Lagrange's method is not applicable.
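The degenerate gradients in this example can be confirmed with a central-difference approximation. This sketch is our own construction, not part of the text:

```python
def grad(g, p, h=1e-6):
    """Central-difference gradient of g at the point p in 3-space."""
    return tuple(
        (g(*[pi + h * (i == k) for i, pi in enumerate(p)])
         - g(*[pi - h * (i == k) for i, pi in enumerate(p)])) / (2 * h)
        for k in range(3)
    )

g1 = lambda x, y, z: z
g2 = lambda x, y, z: z * z - (y - 1) ** 3
f  = lambda x, y, z: x * x + y * y

p = (0.0, 1.0, 0.0)
assert max(abs(c) for c in grad(g2, p)) < 1e-6     # grad g2 = 0: dependent
assert grad(g1, p) == (0.0, 0.0, 1.0)              # grad g1 = k

gf = grad(f, p)                                    # grad f = 2j
assert abs(gf[0]) < 1e-6 and abs(gf[1] - 2.0) < 1e-6 and abs(gf[2]) < 1e-6
```
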
9.15 Exercises
1. Find the extreme values of z = xy subject to the condition x + y = 1.
2. Find the maximum and minimum distances from the origin to the curve 5x² + 6xy + 5y² = 8.
3. Assume a and b are fixed positive numbers.
(a) Find the extreme values of z = x/a + y/b subject to the condition x² + y² = 1.
(b) Find the extreme values of z = x² + y² subject to the condition x/a + y/b = 1.
In each case, interpret the problem geometrically.
4. Find the extreme values of z = cos²x + cos²y subject to the side condition x − y = π/4.
5. Find the extreme values of the scalar field f(x, y, z) = x − 2y + 2z on the sphere x² + y² + z² = 1.
6. Find the points of the surface z² − xy = 1 nearest to the origin.
7. Find the shortest distance from the point (1, 0) to the parabola y² = 4x.
8. Find the points on the curve of intersection of the two surfaces x² − xy + y² − z² = 1 and x² + y² = 1 which are nearest to the origin.
9. If a, b, and c are positive numbers, find the maximum value of f(x, y, z) = x^a y^b z^c subject to the side condition x + y + z = 1.
10. Find the minimum volume bounded by the planes x = 0, y = 0, z = 0, and a plane which is tangent to the ellipsoid

x²/a² + y²/b² + z²/c² = 1

at a point in the octant x > 0, y > 0, z > 0.
11. Find the maximum of log x + log y + 3 log z on that portion of the sphere x² + y² + z² = 5r² where x > 0, y > 0, z > 0. Use the result to prove that for real positive numbers a, b, c we have

abc³ ≤ 27 ((a + b + c)/5)⁵.

12. Given the conic section Ax² + 2Bxy + Cy² = 1, where A > 0 and B² < AC. Let m and M denote the distances from the origin to the nearest and furthest points of the conic. Show that

m² = 2 / (A + C + √((A − C)² + 4B²)),

and find a companion formula for M².
13. Use the method of Lagrange's multipliers to find the greatest and least distances of a point on the ellipse x² + 4y² = 4 from the straight line x + y = 4.
14. The cross section of a trough is an isosceles trapezoid. If the trough is made by bending up the sides of a strip of metal c inches wide, what should be the angle of inclination of the sides and the width across the bottom if the cross-sectional area is to be a maximum?
9.16 The extreme-value theorem for continuous scalar fields
The extreme-value theorem for real-valued functions continuous on a closed and bounded interval can be extended to scalar fields. We consider scalar fields continuous on a closed n-dimensional interval. Such an interval is defined as the Cartesian product of n one-dimensional closed intervals. If a = (a₁, ..., a_n) and b = (b₁, ..., b_n) we write

[a, b] = [a₁, b₁] × ⋯ × [a_n, b_n].

For example, when n = 2 the Cartesian product [a, b] is a rectangle. The proof of the extreme-value theorem parallels the proof given in Volume I for the 1-dimensional case. First we prove that continuity of f implies boundedness, then we prove that f actually attains its maximum and minimum values somewhere in [a, b].
THEOREM 9.8. BOUNDEDNESS THEOREM FOR CONTINUOUS SCALAR FIELDS. If f is a scalar field continuous at each point of a closed interval [a, b] in Rⁿ, then f is bounded on [a, b]. That is, there is a number C ≥ 0 such that |f(x)| ≤ C for all x in [a, b].
Proof. We argue by contradiction, using the method of successive bisection. Figure 9.11 illustrates the method for the case n = 2.

Assume f is unbounded on [a, b]. Let I^(1) = [a, b] and let I_k^(1) = [a_k, b_k], so that

I^(1) = I_1^(1) × ⋯ × I_n^(1).

Bisect each one-dimensional interval I_k^(1) to form two subintervals, a left half and a right half. Now consider all possible Cartesian products of the form

J_1 × J_2 × ⋯ × J_n,

where each J_k is either the left half or the right half of I_k^(1). There are exactly 2ⁿ such products. Each product is an n-dimensional subinterval of [a, b], and their union is equal to [a, b].

FIGURE 9.11 Illustrating the method of successive bisection in the plane.

The function f is unbounded in at least one of these subintervals (if it were bounded in each of them it would also be bounded on [a, b]). One of these we denote by I^(2), which we express as

I^(2) = I_1^(2) × ⋯ × I_n^(2),

where each I_k^(2) is one of the one-dimensional subintervals of I_k^(1), of length (b_k − a_k)/2. We now proceed with I^(2) as we did with I^(1), bisecting each one-dimensional component interval I_k^(2) and arriving at an n-dimensional interval I^(3) in which f is unbounded. We continue the process, obtaining an infinite set of n-dimensional intervals I^(1), I^(2), I^(3), ..., in each of which f is unbounded. The mth interval I^(m) can be expressed in the form

I^(m) = I_1^(m) × ⋯ × I_n^(m).

Since each one-dimensional interval I_k^(m) is obtained by m − 1 successive bisections of [a_k, b_k], if we write I_k^(m) = [a_k^(m), b_k^(m)] we have

(9.42)   b_k^(m) − a_k^(m) = (b_k − a_k)/2^(m−1)   for k = 1, 2, ..., n.

For each fixed k, the supremum of all left endpoints a_k^(m) (m = 1, 2, ...) must therefore be equal to the infimum of all right endpoints b_k^(m) (m = 1, 2, ...); their common value we denote by t_k. The point t = (t₁, ..., t_n) lies in [a, b]. By continuity of f at t there is an n-ball B(t; r) in which we have

|f(x) − f(t)| < 1.

This inequality implies

|f(x)| < 1 + |f(t)|   for all x in B(t; r) ∩ [a, b],

so f is bounded on the set B(t; r) ∩ [a, b]. But this set contains the entire interval I^(m) when m is large enough so that each of the n numbers in (9.42) is less than r/√n; therefore for such m the function f is bounded on I^(m), contradicting the fact that f is unbounded on I^(m).
This contradiction completes the proof.

If f is bounded on [a, b], the set of all function values f(x) is a set of real numbers bounded above and below. Therefore this set has a supremum and an infimum which we denote by sup f and inf f, respectively. That is, we write

sup f = sup {f(x) : x ∈ [a, b]},   inf f = inf {f(x) : x ∈ [a, b]}.

Now we prove that a continuous function takes on both values inf f and sup f somewhere in [a, b].
THEOREM 9.9. EXTREME-VALUE THEOREM FOR CONTINUOUS SCALAR FIELDS. If f is continuous on a closed interval [a, b] in Rⁿ, then there exist points c and d in [a, b] such that

f(c) = sup f   and   f(d) = inf f.
Proof. It suffices to prove that f attains its supremum in [a, b]. The result for the infimum then follows as a consequence because the infimum of f is the supremum of −f.

Let M = sup f. We shall assume that there is no x in [a, b] for which f(x) = M and obtain a contradiction. Let g(x) = M − f(x). Then g(x) > 0 for all x in [a, b], so the reciprocal 1/g is continuous on [a, b]. By the boundedness theorem, 1/g is bounded on [a, b], say 1/g(x) < C for all x in [a, b], where C > 0. This implies M − f(x) > 1/C, so f(x) < M − 1/C for all x in [a, b]. This contradicts the fact that M is the least upper bound of f on [a, b]. Hence f(x) = M for at least one x in [a, b].
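The method of successive bisection used in the proof of Theorem 9.8 can be imitated in code. In this sketch (our own construction, not part of the text) each stage keeps the half of each component interval containing a fixed point t, so the side lengths shrink by a factor of 2 at every stage, exactly as in (9.42):

```python
# Successive bisection of the square [0, 1] x [0, 1] around a point t.
t = (0.3, 0.7)
lo, hi = [0.0, 0.0], [1.0, 1.0]
for _ in range(40):
    for k in range(2):
        mid = (lo[k] + hi[k]) / 2
        if t[k] <= mid:
            hi[k] = mid        # keep the left half
        else:
            lo[k] = mid        # keep the right half

# After 40 bisections each side has length exactly 2^(-40), and the
# nested intervals still contain t, pinning down its coordinates.
assert all(abs(hi[k] - lo[k] - 2.0**-40) < 1e-15 for k in range(2))
assert all(lo[k] <= t[k] <= hi[k] for k in range(2))
```
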
9.17 The small-span theorem for continuous scalar fields (uniform continuity)
Let f be continuous on a bounded closed interval [a, b] in Rⁿ, and let M(f) and m(f) denote, respectively, the maximum and minimum values of f on [a, b]. The difference

M(f) − m(f)

is called the span of f on [a, b]. As in the one-dimensional case we have a small-span theorem for continuous functions which tells us that the interval [a, b] can be partitioned so that the span of f in each subinterval is arbitrarily small.

Write [a, b] = [a₁, b₁] × ⋯ × [a_n, b_n] and let P_k be a partition of the kth component interval [a_k, b_k]. That is, P_k is a set of points x_0^(k), x_1^(k), ..., x_r^(k) such that

a_k = x_0^(k) ≤ x_1^(k) ≤ ⋯ ≤ x_r^(k) = b_k.

The Cartesian product

P = P₁ × ⋯ × P_n

is called a partition of the interval [a, b]. The small-span theorem, also called the theorem on uniform continuity, now takes the following form.
THEOREM 9.10. Let f be a scalar field continuous on a closed interval [a, b] in Rⁿ. Then for every ε > 0 there is a partition of [a, b] into a finite number of subintervals such that the span of f in every subinterval is less than ε.

Proof. The proof is entirely analogous to the one-dimensional case, so we only outline the principal steps. We argue by contradiction, using the method of successive bisection. We assume the theorem is false; that is, we assume that for some ε₀ the interval [a, b] cannot be partitioned into a finite number of subintervals in each of which the span of f is less than ε₀. By successive bisection we obtain an infinite set of subintervals I^(1), I^(2), ..., in each of which the span of f is at least ε₀. By considering the least upper bound of the leftmost endpoints of the component intervals of I^(1), I^(2), ..., we obtain a point t in [a, b] lying in all these intervals. By continuity of f at t there is an n-ball B(t; r) such that the span of f is less than ε₀ in B(t; r) ∩ [a, b]. But, when m is sufficiently large, the interval I^(m) lies in the set B(t; r) ∩ [a, b], so the span of f in I^(m) is also less than ε₀, contradicting the fact that the span of f is at least ε₀ in I^(m).