APPLICATIONS OF DIFFERENTIAL CALCULUS

9.1 Partial differential equations

The theorems of differential calculus developed in Chapter 8 have a wide variety of applications. This chapter illustrates their use in some examples related to partial differential equations, implicit functions, and extremum problems. We begin with some elementary remarks concerning partial differential equations.

An equation involving a scalar field f and its partial derivatives is called a partial differential equation. Two simple examples in which f is a function of two variables are the first-order equation

(9.1)  ∂f/∂x (x, y) = 0

and the second-order equation

(9.2)  ∂²f/∂x² (x, y) + ∂²f/∂y² (x, y) = 0.

Each of these is a homogeneous linear partial differential equation. That is, each has the form L(f) = 0, where L is a linear differential operator involving one or more partial derivatives. Equation (9.2) is called the two-dimensional Laplace equation.

Some of the theory of linear ordinary differential equations can be extended to partial differential equations. For example, it is easy to verify that for each of Equations (9.1) and (9.2) the set of solutions is a linear space. However, there is an important difference between ordinary and partial linear differential equations that should be realized at the outset. We illustrate this difference by comparing the partial differential equation (9.1)

with the ordinary differential equation

(9.3)  f′(x) = 0.

The most general function satisfying (9.3) is f(x) = C, where C is an arbitrary constant. In other words, the solution-space of (9.3) is one-dimensional. But the most general function satisfying (9.1) is

f(x, y) = g(y),


where g is any function of y. Since g is arbitrary we can easily obtain an infinite set of independent solutions. For example, we can take g(y) = e^{cy} and let c vary over all real numbers. Thus, the solution-space of (9.1) is infinite-dimensional.

In some respects this example is typical of what happens in general. Somewhere in the process of solving a first-order partial differential equation, an integration is required to remove each partial derivative. At this step an arbitrary function is introduced in the solution. This results in an infinite-dimensional solution space.

In many problems involving partial differential equations it is necessary to select from the wealth of solutions a particular solution satisfying one or more auxiliary conditions. As might be expected, the nature of these conditions has a profound effect on the existence or uniqueness of solutions. A systematic study of such problems will not be attempted in this book. Instead, we will treat some special cases to illustrate the use of the concepts introduced in Chapter 8.

9.2 A first-order partial differential equation with constant coefficients

Consider the first-order partial differential equation

(9.4)  3 ∂f/∂x (x, y) + 2 ∂f/∂y (x, y) = 0.

All the solutions of this equation can be found by geometric considerations. We express the left member as a dot product, and write the equation in the form

(3i + 2j) · ∇f(x, y) = 0.

This tells us that the gradient vector ∇f(x, y) is orthogonal to the vector 3i + 2j at each point (x, y). But we also know that ∇f(x, y) is orthogonal to the level curves of f. Hence these level curves must be straight lines parallel to 3i + 2j. In other words, the level curves of f are the lines

2x − 3y = c.

Therefore f(x, y) is constant when 2x − 3y is constant. This suggests that

(9.5)  f(x, y) = g(2x − 3y)

for some function g.

Now we verify that, for each differentiable function g, the scalar field f defined by (9.5) does, indeed, satisfy (9.4). Using the chain rule to compute the partial derivatives of f we find

∂f/∂x = 2g′(2x − 3y),   ∂f/∂y = −3g′(2x − 3y),

so that

3 ∂f/∂x + 2 ∂f/∂y = 6g′(2x − 3y) − 6g′(2x − 3y) = 0.

Therefore, f satisfies (9.4).
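The verification can also be carried out symbolically. The following sketch (using Python's sympy library; illustrative only, not part of the text) checks that f(x, y) = g(2x − 3y) satisfies (9.4) for an arbitrary differentiable g:

    # Check that f(x, y) = g(2x - 3y) satisfies 3*f_x + 2*f_y = 0
    # for an arbitrary differentiable function g.
    import sympy as sp

    x, y = sp.symbols('x y')
    g = sp.Function('g')            # arbitrary differentiable function
    f = g(2*x - 3*y)

    lhs = 3*sp.diff(f, x) + 2*sp.diff(f, y)
    print(sp.simplify(lhs))         # prints 0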

Conversely, we can show that every differentiable f which satisfies (9.4) must necessarily have the form (9.5) for some g. To do this, we introduce a linear change of variables,

(9.6)  x = Au + Bv,   y = Cu + Dv.

This transforms f(x, y) into a function of u and v, say

h(u, v) = f(Au + Bv, Cu + Dv).

We shall choose the constants A, B, C, D so that h satisfies the simpler equation

(9.7)  ∂h/∂u (u, v) = 0.

Then we shall solve this equation and show that f has the required form. Using the chain rule we find

∂h/∂u = ∂f/∂x ∂x/∂u + ∂f/∂y ∂y/∂u = A ∂f/∂x + C ∂f/∂y.

Since f satisfies (9.4) we have 2 ∂f/∂y = −3 ∂f/∂x, so the equation for ∂h/∂u becomes

∂h/∂u = (A − (3/2) C) ∂f/∂x.

Therefore, h will satisfy (9.7) if we choose 2A = 3C. Taking A = 3 and C = 2 we find

(9.8)  x = 3u + Bv,   y = 2u + Dv.

For this choice of A and C, the function h satisfies ∂h/∂u = 0, so h(u, v) is a function of v alone, say

h(u, v) = g(v)

for some function g. To express v in terms of x and y we eliminate u from (9.8) and obtain

2x − 3y = (2B − 3D)v.

Now we choose B and D to make 2B − 3D = 1, say B = 2, D = 1. For this choice the transformation (9.6) is nonsingular; we have v = 2x − 3y, and hence

f(x, y) = h(u, v) = g(v) = g(2x − 3y).

This shows that every differentiable solution f of (9.4) has the form (9.5).

Exactly the same type of argument proves the following theorem for first-order equations with constant coefficients.

THEOREM 9.1. Let g be differentiable on R¹, and let f be the scalar field defined on R² by the equation

(9.9)  f(x, y) = g(bx − ay),

where a and b are constants, not both zero. Then f satisfies the first-order partial differential equation

(9.10)  a ∂f/∂x (x, y) + b ∂f/∂y (x, y) = 0

everywhere in R². Conversely, every differentiable solution of (9.10) necessarily has the form (9.9) for some g.
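As a quick sanity check of Theorem 9.1 (again a sketch, not part of the text), the same symbolic computation works with arbitrary constants a and b:

    # f(x, y) = g(b*x - a*y) satisfies a*f_x + b*f_y = 0 (Theorem 9.1).
    import sympy as sp

    x, y, a, b = sp.symbols('x y a b')
    g = sp.Function('g')
    f = g(b*x - a*y)

    print(sp.simplify(a*sp.diff(f, x) + b*sp.diff(f, y)))   # prints 0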

9.3 Exercises

In this set of exercises you may assume differentiability of all functions under consideration.

1. Determine that solution of the partial differential equation

⋯

which satisfies the condition f(x, 0) = sin x for all x.

2. Determine that solution of the partial differential equation

⋯

which satisfies the condition ⋯.

3. (a) If u(x, y) = ⋯, prove that u satisfies the partial differential equation ⋯. Find a solution such that ⋯.
(b) If u(x, y) = ⋯, prove that u satisfies the partial differential equation ⋯. Find a solution such that ⋯.

4. If f(x, y) satisfies the partial differential equation ⋯, and if f(x, 1) = 2 and f(x, 1/x) = 1/x for all x ≠ 0, prove that f(x, y) = ⋯, where φ is a function of x alone and ψ is a function of y alone.

5. Assume f satisfies the partial differential equation

⋯.

Introduce the linear change of variables x = Au + Bv, y = Cu + Dv, where A, B, C, D are constant, and let g(u, v) = f(Au + Bv, Cu + Dv). Compute integer values of A, B, C, D such that g satisfies ∂²g/(∂u ∂v) = 0. Solve this equation for g and thereby determine f. (Assume equality of the mixed partials.)

6. A function u is defined by an equation of the form

u = ⋯.

Show that u satisfies a partial differential equation of the form

⋯,

and find ⋯.

7. The substitution x = eˢ, y = eᵗ converts f(x, y) into g(s, t), where g(s, t) = f(eˢ, eᵗ). If f is known to satisfy the partial differential equation

x² ∂²f/∂x² + y² ∂²f/∂y² + x ∂f/∂x + y ∂f/∂y = 0,

show that g satisfies the partial differential equation

∂²g/∂s² + ∂²g/∂t² = 0.

8. Let f be a scalar field that is differentiable on an open set S in Rⁿ. We say that f is homogeneous of degree p over S if

f(tx) = tᵖ f(x)

for every t > 0 and every x in S for which tx ∈ S. For a homogeneous scalar field of degree p show that we have

x · ∇f(x) = p f(x)

for each x in S. This is known as Euler's theorem for homogeneous functions. If x = (x₁, …, xₙ) it can be expressed as

x₁ ∂f/∂x₁ + ⋯ + xₙ ∂f/∂xₙ = pf.

[Hint: For fixed x, define g(t) = f(tx) and compute g′(1).]

9. Prove the converse of Euler's theorem. That is, if f satisfies x · ∇f(x) = p f(x) for all x in an open set S, then f must be homogeneous of degree p over S. [Hint: For fixed x, define g(t) = f(tx)/tᵖ and show that g′(t) = 0.]

10. Prove the following extension of Euler's theorem for homogeneous functions of degree p in the 2-dimensional case. (Assume equality of the mixed partials.)

x² ∂²f/∂x² + 2xy ∂²f/(∂x ∂y) + y² ∂²f/∂y² = p(p − 1) f(x, y).


9.4 The one-dimensional wave equation

Imagine a string of infinite length stretched along the x-axis and allowed to vibrate in the xy-plane. We denote by y = f(x, t) the vertical displacement of the string at the point x at time t. We assume that, at time t = 0, the string is displaced along a prescribed curve, y = F(x). An example is shown in Figure 9.1(a); Figures 9.1(b) and (c) show possible displacement curves for later values of t. We regard the displacement f(x, t) as an unknown function of x and t to be determined. A mathematical model for this problem (suggested by physical considerations which we shall not discuss here) is the partial differential equation

∂²f/∂t² = c² ∂²f/∂x²,

where c is a positive constant depending on the physical characteristics of the string. This equation is called the one-dimensional wave equation. We will solve this equation subject to certain auxiliary conditions.

FIGURE 9.1 The displacement curve y = f(x, t) shown for various values of t.

Since the initial displacement is the prescribed curve y = F(x), we seek a solution satisfying the condition

f(x, 0) = F(x).

We also assume that ∂f/∂t, the velocity of the vertical displacement, is prescribed at time t = 0, say

∂f/∂t (x, 0) = G(x),

where G is a given function. It seems reasonable to expect that this information should suffice to determine the subsequent motion of the string. We will show that, indeed, this is true by determining the function f in terms of F and G. The solution is expressed in a form given by Jean d'Alembert (1717–1783), a French mathematician and philosopher.

THEOREM 9.2. SOLUTION OF THE WAVE EQUATION. Let F and G be given functions such that G is differentiable and F is twice differentiable on R¹. Then the function f given by the formula

(9.11)  f(x, t) = [F(x + ct) + F(x − ct)]/2 + (1/(2c)) ∫_{x−ct}^{x+ct} G(s) ds

satisfies the one-dimensional wave equation

(9.12)  ∂²f/∂t² = c² ∂²f/∂x²

and the initial conditions

(9.13)  f(x, 0) = F(x),   ∂f/∂t (x, 0) = G(x).

Conversely, any function with equal mixed partials which satisfies (9.12) and (9.13) necessarily has the form (9.11).

Proof. It is a straightforward exercise to verify that the function given by (9.11) satisfies the wave equation and the given initial conditions. This verification is left to the reader. We shall prove the converse.
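Before doing so, here is a symbolic sketch of that direct verification (sympy; illustrative only) showing that d'Alembert's formula (9.11) satisfies both (9.12) and (9.13) for arbitrary F and G:

    # d'Alembert's formula satisfies the wave equation and the
    # initial conditions, for arbitrary F and G.
    import sympy as sp

    x, t, c, s = sp.symbols('x t c s')
    F, G = sp.Function('F'), sp.Function('G')

    f = (F(x + c*t) + F(x - c*t))/2 \
        + sp.Integral(G(s), (s, x - c*t, x + c*t))/(2*c)

    wave = sp.diff(f, t, 2) - c**2*sp.diff(f, x, 2)
    print(sp.simplify(wave.doit()))                             # 0
    print(sp.simplify(f.subs(t, 0).doit() - F(x)))              # 0
    print(sp.simplify(sp.diff(f, t).subs(t, 0).doit() - G(x)))  # 0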

One way to proceed is to assume that f is a solution of the wave equation, introduce a linear change of variables,

x = Au + Bv,   t = Cu + Dv,

which transforms f(x, t) into a function of u and v, say

g(u, v) = f(Au + Bv, Cu + Dv),

and choose the constants A, B, C, D so that g satisfies the simpler equation

∂²g/(∂u ∂v) = 0.

Solving this equation for g we find that

g(u, v) = φ₁(u) + φ₂(v),

where φ₁ is a function of u alone and φ₂ is a function of v alone. The constants A, B, C, D can be chosen so that u = x + ct, v = x − ct, from which we obtain

(9.14)  f(x, t) = φ₁(x + ct) + φ₂(x − ct).

Then we use the initial conditions (9.13) to determine the functions φ₁ and φ₂ in terms of the given functions F and G.

We will obtain (9.14) by another method which makes use of Theorem 9.1 and avoids the change of variables. First we rewrite the wave equation in the form

(9.15)  L₁(L₂(f)) = 0,

where L₁ and L₂ are the first-order linear differential operators given by

L₁ = c ∂/∂x − ∂/∂t,   L₂ = c ∂/∂x + ∂/∂t.

Let f be a solution of (9.15) and let

u(x, t) = L₂(f)(x, t).

Equation (9.15) states that u satisfies the first-order equation L₁(u) = 0. Hence, by Theorem 9.1 we have

u(x, t) = φ(x + ct)

for some function φ. Let Φ be any primitive of φ, say Φ(y) = ∫₀ʸ φ(s) ds, and let

v(x, t) = (1/(2c)) Φ(x + ct).

We will show that L₂(v) = u. We have

∂v/∂x = (1/(2c)) φ(x + ct),   ∂v/∂t = (1/2) φ(x + ct),

so that L₂(v) = c ∂v/∂x + ∂v/∂t = φ(x + ct) = u. In other words, the difference w = f − v satisfies the first-order equation

L₂(w) = 0.

By Theorem 9.1 we must have

w(x, t) = ψ(x − ct)

for some function ψ. Therefore

f(x, t) = v(x, t) + ψ(x − ct) = (1/(2c)) Φ(x + ct) + ψ(x − ct).

This proves (9.14) with φ₁(s) = (1/(2c)) Φ(s) and φ₂ = ψ.

Now we use the initial conditions (9.13) to determine the functions φ₁ and φ₂ in terms of the given functions F and G. The relation f(x, 0) = F(x) implies

(9.16)  φ₁(x) + φ₂(x) = F(x).

The other initial condition, ∂f/∂t (x, 0) = G(x), implies

(9.17)  c φ₁′(x) − c φ₂′(x) = G(x).

Differentiating (9.16) we obtain

(9.18)  φ₁′(x) + φ₂′(x) = F′(x).

Solving (9.17) and (9.18) for φ₁′(x) and φ₂′(x) we find

φ₁′(x) = (1/2) F′(x) + (1/(2c)) G(x),   φ₂′(x) = (1/2) F′(x) − (1/(2c)) G(x).

Integrating these relations we get

φ₁(x) = φ₁(0) + [F(x) − F(0)]/2 + (1/(2c)) ∫₀ˣ G(s) ds,
φ₂(x) = φ₂(0) + [F(x) − F(0)]/2 − (1/(2c)) ∫₀ˣ G(s) ds.

In the first equation we replace x by x + ct; in the second equation we replace x by x − ct. Then we add the two resulting equations and use the fact that φ₁(0) + φ₂(0) = F(0) to obtain

f(x, t) = [F(x + ct) + F(x − ct)]/2 + (1/(2c)) ∫_{x−ct}^{x+ct} G(s) ds.

This completes the proof.

EXAMPLE. Assume the initial displacement is given by the formula

F(x) = 1 + cos πx   for −1 ≤ x ≤ 1,
F(x) = 0   for |x| ≥ 1.

FIGURE 9.2 A solution of the wave equation shown for t = 0 and t = 2.

The graph of F is shown in Figure 9.2(a). Suppose that the initial velocity G(x) = 0 for all x. Then the resulting solution of the wave equation is given by the formula

f(x, t) = [F(x + ct) + F(x − ct)]/2.

Figures 9.1 and 9.2 show the curve y = f(x, t) for various values of t. The figures illustrate that the solution is a combination of two traveling waves, one moving to the right, the other to the left, each with speed c.

Further examples illustrating the use of the chain rule in the study of partial differential equations are given in the next set of exercises.

9.5 Exercises

In this set of exercises you may assume differentiability of all functions under consideration.

1. If c is a positive constant and t > 0, let

f(x, t) = ∫₀^{x/(2c√t)} e^{−u²} du.

(a) Show that

∂f/∂x = (1/(2c√t)) e^{−x²/(4c²t)}   and   ∂f/∂t = −(x/(4c t^{3/2})) e^{−x²/(4c²t)}.

(b) Show that f satisfies the partial differential equation

c² ∂²f/∂x² = ∂f/∂t   (the heat equation).

2. Consider a scalar field f defined in R² such that f(x, y) depends only on the distance r of (x, y) from the origin, say f(x, y) = g(r), where r = (x² + y²)^{1/2}.

(a) Prove that for (x, y) ≠ (0, 0) we have

∂²f/∂x² + ∂²f/∂y² = (1/r) g′(r) + g″(r).

(b) Now assume further that f satisfies Laplace's equation,

∂²f/∂x² + ∂²f/∂y² = 0,

for all (x, y) ≠ (0, 0). Use part (a) to prove that

f(x, y) = a log (x² + y²) + b

for (x, y) ≠ (0, 0), where a and b are constants.

3. Repeat Exercise 2 for the n-dimensional case, where n ≥ 3. That is, assume that f(x) = f(x₁, …, xₙ) = g(r), where r = ‖x‖. Show that

∇²f(x) = g″(r) + ((n − 1)/r) g′(r)

for x ≠ 0. If f satisfies the n-dimensional Laplace equation,

∇²f = 0,

for all x ≠ 0, deduce that f(x) = a ‖x‖^{2−n} + b for x ≠ 0, where a and b are constants.

Note: The linear operator ∇², defined by the equation

∇²f = ∂²f/∂x₁² + ⋯ + ∂²f/∂xₙ²,

is called the n-dimensional Laplacian.

4. Two-dimensional Laplacian in polar coordinates. The introduction of polar coordinates x = r cos θ, y = r sin θ converts f(x, y) into g(r, θ), where g(r, θ) = f(r cos θ, r sin θ). Verify the following formulas:

(a) ‖∇f(r cos θ, r sin θ)‖² = (∂g/∂r)² + (1/r²)(∂g/∂θ)².
(b) ∇²f = ∂²g/∂r² + (1/r) ∂g/∂r + (1/r²) ∂²g/∂θ².

5. Three-dimensional Laplacian in spherical coordinates. The introduction of spherical coordinates

x = ρ cos θ sin φ,   y = ρ sin θ sin φ,   z = ρ cos φ,

converts f(x, y, z) to F(ρ, θ, φ). This exercise shows how to express the Laplacian ∇²f in terms of partial derivatives of F.

(a) First introduce polar coordinates x = r cos θ, y = r sin θ to transform f(x, y, z) to g(r, θ, z). Use Exercise 4 to show that

∇²f = ∂²g/∂r² + (1/r) ∂g/∂r + (1/r²) ∂²g/∂θ² + ∂²g/∂z².

(b) Now transform g(r, θ, z) to F(ρ, θ, φ) by taking z = ρ cos φ, r = ρ sin φ. Note that, except for a change in notation, this transformation is the same as that used in part (a). Deduce that

∇²f = ∂²F/∂ρ² + (2/ρ) ∂F/∂ρ + (1/(ρ² sin²φ)) ∂²F/∂θ² + (1/ρ²) ∂²F/∂φ² + (cot φ/ρ²) ∂F/∂φ.

6. This exercise shows how Legendre's differential equation arises when we seek solutions of Laplace's equation having a special form. Let f be a scalar field satisfying the three-dimensional Laplace equation, ∇²f = 0. Introduce spherical coordinates as in Exercise 5 and let F(ρ, θ, φ) denote the transformed field.

(a) Suppose we seek solutions f of Laplace's equation such that F is independent of θ and has the special form F(ρ, θ, φ) = ρⁿ G(φ). Show that f satisfies Laplace's equation if G satisfies the second-order equation

G″(φ) + (cot φ) G′(φ) + n(n + 1) G(φ) = 0.

(b) The change of variable x = cos φ (φ = arccos x, −1 ≤ x ≤ 1) transforms G(φ) to g(x). Show that g satisfies the Legendre equation

(1 − x²) g″(x) − 2x g′(x) + n(n + 1) g(x) = 0.

7. Two-dimensional wave equation. A thin flexible membrane is stretched over the xy-plane and allowed to vibrate. Let z = f(x, y, t) denote the vertical displacement of the membrane at the point (x, y) at time t. Physical considerations suggest that f satisfies the two-dimensional wave equation,

∂²f/∂t² = c² (∂²f/∂x² + ∂²f/∂y²),

where c is a positive constant depending on the physical characteristics of the membrane. This exercise reveals a connection between this equation and Bessel's differential equation.

(a) Introduce polar coordinates x = r cos θ, y = r sin θ, and let F(r, θ, t) = f(r cos θ, r sin θ, t). If f satisfies the wave equation show that F satisfies the equation

∂²F/∂t² = c² (∂²F/∂r² + (1/r) ∂F/∂r + (1/r²) ∂²F/∂θ²).

(b) If F is independent of θ, say F(r, θ, t) = φ(r, t), the equation in (a) simplifies to

∂²φ/∂t² = c² (∂²φ/∂r² + (1/r) ∂φ/∂r).

Now let φ be a solution such that φ(r, t) factors into a function of r times a function of t, say φ(r, t) = R(r)T(t). Show that each of the functions R and T satisfies an ordinary linear differential equation of second order.

(c) If the function T in part (b) is periodic with period 2π/c, show that R satisfies the Bessel equation

r² R″(r) + r R′(r) + r² R(r) = 0.

9.6 Derivatives of functions defined implicitly

Some surfaces in 3-space are described by Cartesian equations of the form

F(x, y, z) = 0.

An equation like this is said to provide an implicit representation of the surface. For example, the equation x² + y² + z² − 1 = 0 represents the surface of a unit sphere with center at the origin. Sometimes it is possible to solve the equation F(x, y, z) = 0 for one of the variables in terms of the other two, say for z in terms of x and y. This leads to one or more equations of the form

z = f(x, y).

For the sphere we have two solutions,

z = (1 − x² − y²)^{1/2}   and   z = −(1 − x² − y²)^{1/2},

one representing the upper hemisphere, the other the lower hemisphere.

In the general case it may not be an easy matter to obtain an explicit formula for z in terms of x and y. For example, there is no easy method for solving for z in an equation such as

⋯ + xz + ⋯ − 4 = 0.

Nevertheless, a judicious use of the chain rule makes it possible to deduce various properties of the partial derivatives ∂z/∂x and ∂z/∂y without an explicit knowledge of f(x, y). The procedure is described in this section.

We assume that there is a function f(x, y) such that

(9.19)  F[x, y, f(x, y)] = 0

for all (x, y) in some open set S, although we may not have explicit formulas for calculating f(x, y). We describe this by saying that the equation F(x, y, z) = 0 defines z implicitly as a function of x and y, and we write

z = f(x, y).

Now we introduce an auxiliary function g defined on S as follows:

g(x, y) = F[x, y, f(x, y)].

Equation (9.19) states that g = 0 on S; hence the partial derivatives ∂g/∂x and ∂g/∂y are also 0 on S. But we can also compute these partial derivatives by the chain rule. To do this we write

g(x, y) = F[u₁(x, y), u₂(x, y), u₃(x, y)],

where u₁(x, y) = x, u₂(x, y) = y, and u₃(x, y) = f(x, y). The chain rule gives us the formulas

∂g/∂x = ∂F/∂u₁ ∂u₁/∂x + ∂F/∂u₂ ∂u₂/∂x + ∂F/∂u₃ ∂u₃/∂x

and

∂g/∂y = ∂F/∂u₁ ∂u₁/∂y + ∂F/∂u₂ ∂u₂/∂y + ∂F/∂u₃ ∂u₃/∂y,

where each partial derivative ∂F/∂uₖ is to be evaluated at (x, y, f(x, y)). Since we have

∂u₁/∂x = 1,   ∂u₂/∂x = 0,   ∂u₃/∂x = ∂f/∂x,

the first of the foregoing equations becomes

∂F/∂u₁ + ∂F/∂u₃ ∂f/∂x = 0.

Solving this for ∂f/∂x we obtain

(9.20)  ∂f/∂x = − (∂F/∂u₁)/(∂F/∂u₃)

at those points at which ∂F/∂u₃ ≠ 0. By a similar argument we obtain a corresponding formula for ∂f/∂y:

(9.21)  ∂f/∂y = − (∂F/∂u₂)/(∂F/∂u₃)

at those points at which ∂F/∂u₃ ≠ 0. These formulas are usually written more briefly as follows:

∂z/∂x = − (∂F/∂x)/(∂F/∂z),   ∂z/∂y = − (∂F/∂y)/(∂F/∂z).

EXAMPLE. Assume that the equation

⋯ = 0

defines z as a function of x and y, say z = f(x, y). Find a value of the constant c such that f(0, e) = 2, and compute the partial derivatives ∂z/∂x and ∂z/∂y at the point (x, y) = (0, e).


Solution. When x = 0, y = e, and z = 2 the equation is satisfied by c = 4. Let F(x, y, z) denote the left member with c = 4. From (9.20) and (9.21) we have

∂z/∂x = ⋯   and   ∂z/∂y = ⋯.

When x = 0, y = e, and z = 2 we find ∂z/∂x = ⋯ and ∂z/∂y = ⋯. Note that we were able to compute the partial derivatives ∂z/∂x and ∂z/∂y using only the value of f(x, y) at the single point (0, e).
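To see formulas (9.20) and (9.21) at work on a surface where the explicit solution is known, one can use the unit sphere mentioned at the beginning of this section. The following sketch (sympy; illustrative only) computes ∂z/∂x and checks it against the upper hemisphere:

    # Implicit differentiation for F(x, y, z) = x^2 + y^2 + z^2 - 1 = 0.
    import sympy as sp

    x, y, z = sp.symbols('x y z')
    F = x**2 + y**2 + z**2 - 1

    dzdx = -sp.diff(F, x)/sp.diff(F, z)     # -x/z
    dzdy = -sp.diff(F, y)/sp.diff(F, z)     # -y/z

    # Cross-check against the explicit upper hemisphere z = f(x, y).
    f = sp.sqrt(1 - x**2 - y**2)
    print(sp.simplify(sp.diff(f, x) - dzdx.subs(z, f)))   # 0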

The foregoing discussion can be extended to functions of more than two variables.

THEOREM 9.3. Let F be a differentiable scalar field on an open set T in Rⁿ. Assume that the equation

F(x₁, …, xₙ) = 0

defines xₙ implicitly as a differentiable function of x₁, …, xₙ₋₁, say

xₙ = f(x₁, …, xₙ₋₁),

for all points (x₁, …, xₙ₋₁) in some open set S in Rⁿ⁻¹. Then for each k = 1, 2, …, n − 1, the partial derivative ∂f/∂xₖ is given by the formula

(9.22)  ∂f/∂xₖ = − (∂F/∂xₖ)/(∂F/∂xₙ)

at those points at which ∂F/∂xₙ ≠ 0. The partial derivatives ∂F/∂xₖ and ∂F/∂xₙ which appear in (9.22) are to be evaluated at the point (x₁, …, xₙ₋₁, f(x₁, …, xₙ₋₁)).

The proof is a direct extension of the argument used to derive Equations (9.20) and (9.21) and is left to the reader.

The discussion can be generalized in another way. Suppose we have two surfaces with the following implicit representations:

(9.23)  F(x, y, z) = 0,   G(x, y, z) = 0.

If these surfaces intersect along a curve C, it may be possible to obtain a parametric representation of C by solving the two equations in (9.23) simultaneously for two of the variables in terms of the third, say for x and y in terms of z. Let us suppose that it is possible to solve for x and y and that solutions are given by the equations

x = X(z),   y = Y(z),

for all z in some open interval (a, b). Then when x and y are replaced by X(z) and Y(z), respectively, the two equations in (9.23) are identically satisfied. That is, we can write F[X(z), Y(z), z] = 0 and G[X(z), Y(z), z] = 0 for all z in (a, b). Again, by using the chain rule, we can compute the derivatives X′(z) and Y′(z) without an explicit knowledge of X(z) and Y(z). To do this we introduce new functions f and g by means of the equations

f(z) = F[X(z), Y(z), z]   and   g(z) = G[X(z), Y(z), z].

Then f(z) = g(z) = 0 for every z in (a, b), and hence the derivatives f′(z) and g′(z) are also zero on (a, b). By the chain rule these derivatives are given by the formulas

f′(z) = ∂F/∂x X′(z) + ∂F/∂y Y′(z) + ∂F/∂z,   g′(z) = ∂G/∂x X′(z) + ∂G/∂y Y′(z) + ∂G/∂z.

Since f′(z) and g′(z) are both zero we can determine X′(z) and Y′(z) by solving the following pair of simultaneous linear equations:

∂F/∂x X′(z) + ∂F/∂y Y′(z) = − ∂F/∂z,
∂G/∂x X′(z) + ∂G/∂y Y′(z) = − ∂G/∂z.

At those points at which the determinant of the system is not zero, these equations have a unique solution which can be expressed as follows, using Cramer's rule:

(9.24)  X′(z) = − |∂F/∂z  ∂F/∂y; ∂G/∂z  ∂G/∂y| / |∂F/∂x  ∂F/∂y; ∂G/∂x  ∂G/∂y|,   Y′(z) = − |∂F/∂x  ∂F/∂z; ∂G/∂x  ∂G/∂z| / |∂F/∂x  ∂F/∂y; ∂G/∂x  ∂G/∂y|.

The determinants which appear in (9.24) are determinants of Jacobian matrices and are called Jacobian determinants. A special notation is often used to denote Jacobian determinants. We write

∂(F, G)/∂(x, y) = |∂F/∂x  ∂F/∂y; ∂G/∂x  ∂G/∂y|.

In this notation, the formulas in (9.24) can be expressed more briefly in the form

(9.25)  X′(z) = [∂(F, G)/∂(y, z)] / [∂(F, G)/∂(x, y)],   Y′(z) = [∂(F, G)/∂(z, x)] / [∂(F, G)/∂(x, y)].

(The minus sign has been incorporated into the numerators by interchanging the columns.)
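The quotients in (9.25) are easy to evaluate symbolically. As a sketch (with a hypothetical pair of surfaces, a sphere and a plane, chosen only for illustration):

    # Tangent direction of the curve F = G = 0 via the Jacobian
    # quotients (9.25).
    import sympy as sp

    x, y, z = sp.symbols('x y z')
    F = x**2 + y**2 + z**2 - 1        # sphere
    G = x + y + z                     # plane

    def jac(P, Q, u, v):
        # Jacobian determinant d(P, Q)/d(u, v)
        return sp.diff(P, u)*sp.diff(Q, v) - sp.diff(P, v)*sp.diff(Q, u)

    J = jac(F, G, x, y)
    Xp = sp.simplify(jac(F, G, y, z)/J)   # X'(z)
    Yp = sp.simplify(jac(F, G, z, x)/J)   # Y'(z)
    print(Xp, Yp)                         # (y - z)/(x - y), (z - x)/(x - y)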

The method can be extended to treat more general situations in which m equations in n variables are given, where n > m, and we solve for m of the variables in terms of the remaining n − m variables. The partial derivatives of the new functions so defined can be expressed as quotients of Jacobian determinants, generalizing (9.25). An example with m = 2 and n = 4 is described in Exercise 3 of Section 9.8.

9.7 Worked examples

In this section we illustrate some of the concepts of the foregoing section by solving various types of problems dealing with functions defined implicitly.

EXAMPLE 1. Assume that the equation g(x, y) = 0 determines y as a differentiable function of x, say y = Y(x), for all x in some open interval (a, b). Express the derivative Y′(x) in terms of the partial derivatives of g.

Solution. Let G(x) = g[x, Y(x)] for x in (a, b). Then the equation g(x, y) = 0 implies G(x) = 0 in (a, b). By the chain rule we have

G′(x) = ∂g/∂x + ∂g/∂y Y′(x),

from which we obtain

(9.26)  Y′(x) = − (∂g/∂x)/(∂g/∂y)

at those points x in (a, b) at which ∂g/∂y ≠ 0. The partial derivatives ∂g/∂x and ∂g/∂y are to be evaluated at [x, Y(x)].

EXAMPLE 2. When y is eliminated from the two equations z = f(x, y) and g(x, y) = 0, the result can be expressed in the form z = h(x). Express the derivative h′(x) in terms of the partial derivatives of f and g.

Solution. Let us assume that the equation g(x, y) = 0 may be solved for y in terms of x and that a solution is given by y = Y(x) for all x in some open interval (a, b). Then the function h is given by the formula

h(x) = f[x, Y(x)]   if x ∈ (a, b).

Applying the chain rule we have

h′(x) = ∂f/∂x + ∂f/∂y Y′(x).

Using Equation (9.26) of Example 1 we obtain the formula

h′(x) = ∂f/∂x − (∂f/∂y)(∂g/∂x)/(∂g/∂y).

The partial derivatives on the right are to be evaluated at the point (x, Y(x)). Note that the numerator can also be expressed as a Jacobian determinant, giving us

h′(x) = [∂(f, g)/∂(x, y)] / (∂g/∂y).

EXAMPLE 3. The two equations 2x = u² − v² and y = uv define u and v as functions of x and y. Find formulas for ∂u/∂x, ∂u/∂y, ∂v/∂x, ∂v/∂y.

Solution. If we hold y fixed and differentiate the two equations in question with respect to x, remembering that u and v are functions of x and y, we obtain

2 = 2u ∂u/∂x − 2v ∂v/∂x   and   0 = v ∂u/∂x + u ∂v/∂x.

Solving these simultaneously for ∂u/∂x and ∂v/∂x we find

∂u/∂x = u/(u² + v²)   and   ∂v/∂x = − v/(u² + v²).

On the other hand, if we hold x fixed and differentiate the two given equations with respect to y we obtain the equations

0 = 2u ∂u/∂y − 2v ∂v/∂y   and   1 = v ∂u/∂y + u ∂v/∂y.

Solving these simultaneously we find

∂u/∂y = v/(u² + v²)   and   ∂v/∂y = u/(u² + v²).

EXAMPLE 4. Let u be defined as a function of x and y by means of the equation

u = F(x + u, uy).

Find ∂u/∂x and ∂u/∂y in terms of the partial derivatives of F.

Solution. Suppose that u = g(x, y) for all (x, y) in some open set S. Substituting g(x, y) for u in the original equation we must have

(9.27)  g(x, y) = F[u₁(x, y), u₂(x, y)],

where u₁(x, y) = x + g(x, y) and u₂(x, y) = g(x, y) y. Now we hold y fixed and differentiate both sides of (9.27) with respect to x, using the chain rule on the right, to obtain

(9.28)  ∂g/∂x = ∂F/∂u₁ ∂u₁/∂x + ∂F/∂u₂ ∂u₂/∂x.

But ∂u₁/∂x = 1 + ∂g/∂x and ∂u₂/∂x = y ∂g/∂x. Hence (9.28) becomes

∂g/∂x = ∂F/∂u₁ (1 + ∂g/∂x) + ∂F/∂u₂ y ∂g/∂x.

Solving this equation for ∂g/∂x (and writing ∂u/∂x for ∂g/∂x) we obtain

∂u/∂x = (∂F/∂u₁) / (1 − ∂F/∂u₁ − y ∂F/∂u₂).

In a similar way we find

∂g/∂y = ∂F/∂u₁ ∂g/∂y + ∂F/∂u₂ (g(x, y) + y ∂g/∂y).

This leads to the equation

∂u/∂y = (u ∂F/∂u₂) / (1 − ∂F/∂u₁ − y ∂F/∂u₂).

The partial derivatives ∂F/∂u₁ and ∂F/∂u₂ are to be evaluated at the point (x + g(x, y), g(x, y) y).

EXAMPLE 5. When u is eliminated from the two equations x = u + v and y = uv², we get an equation of the form F(x, y, v) = 0 which defines v implicitly as a function of x and y, say v = h(x, y). Prove that

∂h/∂x = v/(3v − 2x),

and find a similar formula for ∂h/∂y.

Solution. Eliminating u from the two given equations, we obtain the relation

xv² − v³ − y = 0.

Let F be the function defined by the equation

F(x, y, v) = xv² − v³ − y.

The discussion in Section 9.6 is now applicable and we can write

(9.29)  ∂h/∂x = − (∂F/∂x)/(∂F/∂v)   and   ∂h/∂y = − (∂F/∂y)/(∂F/∂v).

For this F we have ∂F/∂x = v², ∂F/∂v = 2xv − 3v², and ∂F/∂y = −1. Hence the equations in (9.29) become

∂h/∂x = − v²/(2xv − 3v²) = v/(3v − 2x)   and   ∂h/∂y = 1/(2xv − 3v²).

EXAMPLE 6. The equation F(x, y, z) = 0 defines z implicitly as a function of x and y, say z = f(x, y). Assuming that ∂F/∂z ≠ 0, show that

(9.30)  ∂²f/∂x² = − [ (∂²F/∂x²)(∂F/∂z)² − 2 (∂²F/∂x∂z)(∂F/∂x)(∂F/∂z) + (∂²F/∂z²)(∂F/∂x)² ] / (∂F/∂z)³,

where the partial derivatives on the right are to be evaluated at (x, y, f(x, y)).

Solution. By Equation (9.20) of Section 9.6 we have

(9.31)  ∂f/∂x = − (∂F/∂x)/(∂F/∂z).

We must remember that this quotient really means

− D₁F[x, y, f(x, y)] / D₃F[x, y, f(x, y)].

Let us introduce

G(x, y) = D₁F[x, y, f(x, y)]   and   H(x, y) = D₃F[x, y, f(x, y)].

Our object is to evaluate the partial derivative with respect to x of the quotient −G/H, holding y fixed. The rule for differentiating quotients gives us

(9.32)  ∂²f/∂x² = − [H ∂G/∂x − G ∂H/∂x] / H².

Since G and H are composite functions, we use the chain rule to compute the partial derivatives ∂G/∂x and ∂H/∂x. For ∂G/∂x we have

∂G/∂x = ∂²F/∂x² + ∂²F/∂z∂x ∂f/∂x.

Similarly, we find

∂H/∂x = ∂²F/∂x∂z + ∂²F/∂z² ∂f/∂x.

Substituting these in (9.32) and replacing ∂f/∂x by the quotient in (9.31) we obtain the formula in (9.30).

9.8 Exercises

In the exercises in this section you may assume the existence and continuity of all derivatives under consideration.

1. The two equations x = ⋯ and y = ⋯ determine x and y implicitly as functions of u and v, say x = X(u, v) and y = Y(u, v). Show that

∂X/∂u = ⋯

at points at which ⋯ ≠ 0, and find similar formulas for ∂X/∂v, ∂Y/∂u, and ∂Y/∂v.

2. The two equations x + y = ⋯ and xy = ⋯ determine x and v as functions of u and y, say x = X(u, y) and v = V(u, y). Show that

∂X/∂u = ⋯

at points at which ⋯ ≠ 0, and find similar formulas for ∂X/∂y, ∂V/∂u, and ∂V/∂y.

3. The two equations F(x, y, u, v) = 0 and G(x, y, u, v) = 0 determine x and y implicitly as functions of u and v, say x = X(u, v) and y = Y(u, v). Show that

∂X/∂u = − [∂(F, G)/∂(u, y)] / [∂(F, G)/∂(x, y)]

at points at which the Jacobian ∂(F, G)/∂(x, y) ≠ 0, and find similar formulas for the partial derivatives ∂X/∂v, ∂Y/∂u, and ∂Y/∂v.

4. The intersection of the two surfaces given by the Cartesian equations 2x² + 3y² − z² = 25 and x² + y² = z² contains a curve C passing through the point P = (√7, 3, 4). These equations may be solved for x and y in terms of z to give a parametric representation of C with z as parameter.

(a) Find a unit tangent vector T to C at the point P without an explicit knowledge of the parametric representation.
(b) Check the result in part (a) by determining a parametric representation of C with z as parameter.

5. The three equations F(u, v) = 0, u = xy, and v = (x² + z²)^{1/2} define a surface in xyz-space. Find a normal vector to this surface at the point x = 1, y = 1, z = √3 if it is known that D₁F(1, 2) = 1 and D₂F(1, 2) = 2.

6. The three equations

x² − y cos(uv) + z² = 0,
x² + y² − sin(uv) + 2z² = 2,
xy − sin u cos v + z = 0

define x, y, and z as functions of u and v. Compute the partial derivatives ∂x/∂u and ∂x/∂v at the point x = y = 1, u = π/2, v = 0, z = 0.


7. The equation f(y/x, z/x) = 0 defines z implicitly as a function of x and y, say z = g(x, y). Show that

x ∂g/∂x + y ∂g/∂y = g(x, y)

at those points at which D₂f[y/x, g(x, y)/x] is not zero.

8. Let F be a real-valued function of two real variables and assume that the partial derivatives D₁F and D₂F are never zero. Let u be another real-valued function of two real variables such that the partial derivatives ∂u/∂x and ∂u/∂y are related by the equation F(∂u/∂x, ∂u/∂y) = 0. Prove that a constant n exists such that

(∂²u/∂x²)(∂²u/∂y²) = n (∂²u/∂x ∂y)²,

and find n. Assume that ∂²u/(∂x ∂y) ≠ 0.

9. The equation x + z + (y + z)² = 6 defines z implicitly as a function of x and y, say z = f(x, y). Compute the partial derivatives ∂z/∂x, ∂z/∂y, and ∂²z/(∂x ∂y) in terms of x, y, and z.

10. The equation sin(x + y) + sin(y + z) = 1 defines z implicitly as a function of x and y, say z = f(x, y). Compute the second derivative ∂²z/(∂x ∂y) in terms of x, y, and z.

11. The equation F(x + y + z, x² + y² + z²) = 0 defines z implicitly as a function of x and y, say z = f(x, y). Determine the partial derivatives ∂z/∂x and ∂z/∂y in terms of x, y, z and the partial derivatives D₁F and D₂F.

12. Let f and g be functions of one real variable and define F(x, y) = f[x + g(y)]. Find formulas for all the partial derivatives of F of first and second order, expressed in terms of the derivatives of f and g. Verify the relation

(∂F/∂x)(∂²F/∂x ∂y) = (∂F/∂y)(∂²F/∂x²).

9.9 Maxima, minima, and saddle points

A surface that is described explicitly by an equation of the form z = f(x, y) can also be thought of as a level surface of the scalar field F defined by the equation

F(x, y, z) = f(x, y) − z.

If f is differentiable, the gradient of this field is given by the vector

∇F = ∂f/∂x i + ∂f/∂y j − k.

A linear equation for the tangent plane at a point (x₀, y₀, f(x₀, y₀)) can be written in the form

z = f(x₀, y₀) + A(x − x₀) + B(y − y₀),

where

A = ∂f/∂x (x₀, y₀)   and   B = ∂f/∂y (x₀, y₀).

When both coefficients A and B are zero, the point (x₀, y₀) is called a stationary point of f. The tangent plane is horizontal at a stationary point. The stationary points of a surface are usually classified into three categories: maxima, minima, and saddle points. If the surface is thought of as a mountain landscape, these categories correspond, respectively, to mountain tops, bottoms of valleys, and mountain passes.

The concepts of maxima, minima, and saddle points can be introduced for arbitrary scalar fields defined on subsets of Rⁿ.

DEFINITION. A scalar field f is said to have an absolute maximum at a point a of a set S in Rⁿ if

(9.33)  f(x) ≤ f(a)

for all x in S. The number f(a) is called the absolute maximum value of f on S. The function f is said to have a relative maximum at a if the inequality in (9.33) is satisfied for every x in some n-ball B(a) lying in S.

In other words, a relative maximum at a is the absolute maximum in some neighborhood of a. The terms absolute minimum and relative minimum are defined in an analogous fashion, using the inequality opposite to that in (9.33). The adjectives global and local are sometimes used in place of absolute and relative, respectively.

A number which is either a relative maximum or a relative minimum of f is called an extremum of f.

If f has an extremum at an interior point a and is differentiable there, then all first-order partial derivatives must be zero. In other words, ∇f(a) = 0. (This is easily proved by holding each component fixed and reducing the problem to the one-dimensional case.) In the case n = 2, this means that there is a horizontal tangent plane to the surface z = f(x, y) at the point (a, f(a)). On the other hand, it is easy to find examples in which the vanishing of all partial derivatives at a does not necessarily imply an extremum at a. This occurs at the so-called saddle points, which are defined as follows.

DEFINITION. Assume f is differentiable at a. If ∇f(a) = 0 the point a is called a stationary point of f. A stationary point is called a saddle point if every n-ball B(a) contains points x such that f(x) < f(a) and other points such that f(x) > f(a).

The situation is somewhat analogous to the one-dimensional case in which stationary points of a function are classified as maxima, minima, and points of inflection. The following examples illustrate several types of stationary points. In each case the stationary point in question is at the origin.

EXAMPLE 1. Relative maximum. z = f(x, y) = 2 − x² − y². This surface is a paraboloid of revolution. In the vicinity of the origin it has the shape shown in Figure 9.3(a); its level curves are circles, some of which are shown in Figure 9.3(b). Since f(x, y) ≤ 2 = f(0, 0) for all (x, y), it follows that f not only has a relative maximum at (0, 0), but also an absolute maximum there. Both partial derivatives ∂f/∂x and ∂f/∂y vanish at the origin.

FIGURE 9.3 Examples 1 and 2: (a) z = 2 − x² − y², relative maximum at the origin; (b) level curves x² + y² = c; (c) z = x² + y², relative minimum at the origin.

EXAMPLE 2. Relative minimum. z = f(x, y) = x² + y². This example, another paraboloid of revolution, is essentially the same as Example 1, except that there is a minimum at the origin rather than a maximum. The appearance of the surface near the origin is illustrated in Figure 9.3(c) and some of the level curves are shown in Figure 9.3(b).


EXAMPLE 3. Saddle point. z = f(x, y) = xy. This surface is a hyperbolic paraboloid. Near the origin the surface is saddle shaped, as shown in Figure 9.4(a). Both partial derivatives ∂f/∂x and ∂f/∂y are zero at the origin but there is neither a relative maximum nor a relative minimum there. In fact, for points (x, y) in the first or third quadrants, x and y have the same sign, giving us f(x, y) > 0 = f(0, 0), whereas for points in the second and fourth quadrants x and y have opposite signs, giving us f(x, y) < 0 = f(0, 0). Therefore, in every neighborhood of the origin there are points at which the function is less than f(0, 0) and points at which the function exceeds f(0, 0), so the origin is a saddle point. The presence of the saddle point is also revealed by Figure 9.4(b), which shows some of the level curves near (0, 0). These are hyperbolas having the x- and y-axes as asymptotes.

FIGURE 9.4 Example 3. Saddle point at the origin: (a) z = xy; (b) level curves xy = c.

EXAMPLE 4. Saddle point. z = f(x, y) = x³ − 3xy². Near the origin, this surface has the appearance of a mountain pass in the vicinity of three peaks. This surface, sometimes referred to as a "monkey saddle," is shown in Figure 9.5(a). Some of the level curves are illustrated in Figure 9.5(b). It is clear that there is a saddle point at the origin.

FIGURE 9.5 Example 4. Saddle point at the origin: (a) z = x³ − 3xy²; (b) level curves x³ − 3xy² = c.

EXAMPLE 5. Relative minimum. z = f(x, y) = x²y². This surface has the appearance of a valley surrounded by four mountains, as suggested by Figure 9.6(a). There is an absolute minimum at the origin, since f(x, y) ≥ 0 = f(0, 0) for all (x, y). The level curves [shown in Figure 9.6(b)] are hyperbolas having the x- and y-axes as asymptotes. Note that these level curves are similar to those in Example 3. In this case, however, the function assumes only nonnegative values on all its level curves.

FIGURE 9.6 Example 5. Relative minimum at the origin: (a) z = x²y²; (b) level curves x²y² = c.

EXAMPLE 6. Relative maximum. z = f(x, y) = 1 − x². In this case the surface is a cylinder with generators parallel to the y-axis, as shown in Figure 9.7(a). Cross sections cut by planes parallel to the xz-plane are parabolas. There is obviously an absolute maximum at every point of the y-axis, since f(x, y) = 1 − x² ≤ 1 = f(0, y) for all (x, y).

FIGURE 9.7 Example 6. Relative maximum at the origin: (a) z = 1 − x²; (b) level curves 1 − x² = c.


9.10 Second-order Taylor formula for scalar fields

If a differentiable scalar field f has a stationary point at a, the nature of the stationary point is determined by the algebraic sign of the difference f(x) − f(a) for x near a. If x = a + y, we have the first-order Taylor formula

f(a + y) = f(a) + ∇f(a) · y + ‖y‖ E(a, y),

where E(a, y) → 0 as y → 0. At a stationary point, ∇f(a) = 0 and the Taylor formula becomes

f(a + y) − f(a) = ‖y‖ E(a, y).

To determine the algebraic sign of f(a + y) − f(a) we need more information about the error term ‖y‖ E(a, y). The next theorem shows that if f has continuous second-order partial derivatives at a, the error term is equal to a quadratic form,

(1/2) Σᵢ Σⱼ Dᵢⱼf(a) yᵢ yⱼ,

plus a term of smaller order than ‖y‖². The coefficients of the quadratic form are the second-order partial derivatives Dᵢⱼf = Dᵢ(Dⱼf), evaluated at a. The n × n matrix of second-order derivatives Dᵢⱼf(x) is called the Hessian matrix† and is denoted by H(x). Thus, we have

H(x) = [Dᵢⱼf(x)],   i, j = 1, 2, …, n,

whenever the derivatives exist. The quadratic form can be written more simply in matrix notation as follows:

Σᵢ Σⱼ Dᵢⱼf(a) yᵢ yⱼ = y H(a) yᵗ,

where y = (y₁, …, yₙ) is considered as a 1 × n row matrix, and yᵗ is its transpose, an n × 1 column matrix. When the partial derivatives Dᵢⱼf are continuous we have Dᵢⱼf = Dⱼᵢf and the matrix H(a) is symmetric. Taylor's formula, giving a quadratic approximation to f(a + y) − f(a), now takes the following form.

THEOREM 9.4. SECOND-ORDER TAYLOR FORMULA FOR SCALAR FIELDS. Let f be a scalar field with continuous second-order partial derivatives Dᵢⱼf in an n-ball B(a). Then for all y in Rⁿ such that a + y ∈ B(a) we have

(9.34)  f(a + y) − f(a) = ∇f(a) · y + (1/2) y H(a + cy) yᵗ,   where 0 < c < 1.

† Named for Ludwig Otto Hesse (1811–1874), a German mathematician who made many contributions to the theory of surfaces.

This can also be written in the form

(9.35)  f(a + y) − f(a) = ∇f(a) · y + (1/2) y H(a) yᵗ + ‖y‖² E₂(a, y),

where E₂(a, y) → 0 as y → 0.

Proof. Keep y fixed and define g(u) for real u by the equation

g(u) = f(a + uy).

We will prove the theorem by applying the second-order Taylor formula to g on the interval [0, 1]. We obtain

(9.36)  g(1) = g(0) + g′(0) + (1/2) g″(c),   where 0 < c < 1.

Here we have used Lagrange's form of the remainder (see Section 7.7 of Volume I).

Since g is a composite function given by g(u) = f[r(u)], where r(u) = a + uy, we can compute its derivative by the chain rule. We have r′(u) = y, so the chain rule gives us

g′(u) = ∇f[r(u)] · r′(u) = Σᵢ Dᵢf(a + uy) yᵢ,

provided r(u) ∈ B(a). In particular, g′(0) = ∇f(a) · y. Using the chain rule once more we find

g″(u) = Σⱼ Σᵢ Dⱼ(Dᵢf)(a + uy) yⱼ yᵢ = y H(a + uy) yᵗ.

Hence g″(c) = y H(a + cy) yᵗ, so Equation (9.36) becomes (9.34).

To prove (9.35) we define E₂(a, y) by the equation

(9.37)  ‖y‖² E₂(a, y) = (1/2) y [H(a + cy) − H(a)] yᵗ   if y ≠ 0,

and let E₂(a, 0) = 0. Then Equation (9.34) takes the form (9.35). To complete the proof we need to show that E₂(a, y) → 0 as y → 0. From (9.37) we find that

‖y‖² |E₂(a, y)| ≤ (1/2) Σᵢ Σⱼ |Dᵢⱼf(a + cy) − Dᵢⱼf(a)| |yᵢ| |yⱼ| ≤ (‖y‖²/2) Σᵢ Σⱼ |Dᵢⱼf(a + cy) − Dᵢⱼf(a)|.

Dividing by ‖y‖² we obtain the inequality

|E₂(a, y)| ≤ (1/2) Σᵢ Σⱼ |Dᵢⱼf(a + cy) − Dᵢⱼf(a)|

for y ≠ 0. Since each second-order partial derivative Dᵢⱼf is continuous at a, we have Dᵢⱼf(a + cy) → Dᵢⱼf(a) as y → 0, so E₂(a, y) → 0 as y → 0. This completes the proof.
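The content of Theorem 9.4 can be observed numerically: the ratio of the error to ‖y‖² tends to zero. A sketch with a hypothetical field f(x, y) = eˣ cos y (illustrative only):

    # The error in the quadratic Taylor approximation (9.35) is o(||y||^2).
    import numpy as np

    def f(p):
        x, y = p
        return np.exp(x)*np.cos(y)

    def grad(p):
        x, y = p
        return np.array([np.exp(x)*np.cos(y), -np.exp(x)*np.sin(y)])

    def hess(p):
        x, y = p
        return np.array([[ np.exp(x)*np.cos(y), -np.exp(x)*np.sin(y)],
                         [-np.exp(x)*np.sin(y), -np.exp(x)*np.cos(y)]])

    a = np.array([0.3, -0.2])
    for h in (1e-1, 1e-2, 1e-3):
        y = h*np.array([1.0, 2.0])
        quad = f(a) + grad(a) @ y + 0.5*(y @ hess(a) @ y)
        print(h, (f(a + y) - quad)/np.dot(y, y))   # ratio tends to 0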

9.11 The nature of a stationary point determined by the eigenvalues of the Hessian matrix

At a stationary point we have ∇f(a) = 0, so the Taylor formula in Equation (9.35) becomes

f(a + y) − f(a) = (1/2) y H(a) yᵗ + ‖y‖² E₂(a, y).

Since the error term ‖y‖² E₂(a, y) tends to zero faster than ‖y‖², it seems reasonable to expect that for small y the algebraic sign of f(a + y) − f(a) is the same as that of the quadratic form y H(a) yᵗ; hence the nature of the stationary point should be determined by the algebraic sign of the quadratic form. This section is devoted to a proof of this fact. First we give a connection between the algebraic sign of a quadratic form and the eigenvalues of its matrix.

THEOREM 9.5. Let A = [aᵢⱼ] be an n × n real symmetric matrix, and let

Q(y) = y A yᵗ = Σᵢ Σⱼ aᵢⱼ yᵢ yⱼ.

Then we have:
(a) Q(y) > 0 for all y ≠ 0 if and only if all the eigenvalues of A are positive.
(b) Q(y) < 0 for all y ≠ 0 if and only if all the eigenvalues of A are negative.

Note: In case (a), the quadratic form is called positive definite; in case (b) it is called negative definite.

Proof. According to Theorem 5.11 there is an orthogonal matrix C that reduces the quadratic form y A yᵗ to a diagonal form. That is,

(9.38)  Q(y) = y A yᵗ = Σᵢ λᵢ xᵢ²,

where x = (x₁, …, xₙ) is the row matrix x = yC, and λ₁, …, λₙ are the eigenvalues of A. The eigenvalues are real since A is symmetric.

If all the eigenvalues are positive, Equation (9.38) shows that Q(y) > 0 whenever x ≠ 0. But since x = yC we have y = xCᵗ, so x ≠ 0 if and only if y ≠ 0. Therefore Q(y) > 0 for all y ≠ 0. Conversely, if Q(y) > 0 for all y ≠ 0 we can choose y so that x = yC is the kth coordinate vector. For this y, Equation (9.38) gives us Q(y) = λₖ, so each λₖ > 0. This proves part (a). The proof of (b) is entirely analogous.


The next theorem describes the nature of a stationary point in terms of the algebraic sign of the quadratic form y H(a) yᵗ.

THEOREM 9.6. Let f be a scalar field with continuous second-order partial derivatives Dᵢⱼf in an n-ball B(a), and let H(a) denote the Hessian matrix at a stationary point a. Then we have:
(a) If all the eigenvalues of H(a) are positive, f has a relative minimum at a.
(b) If all the eigenvalues of H(a) are negative, f has a relative maximum at a.
(c) If H(a) has both positive and negative eigenvalues, then f has a saddle point at a.

Proof. Let Q(y) = y H(a) yᵗ. The Taylor formula gives us

(9.39)  f(a + y) − f(a) = (1/2) Q(y) + ‖y‖² E₂(a, y),

where E₂(a, y) → 0 as y → 0. We will prove that there is a positive number r such that, if 0 < ‖y‖ < r, the algebraic sign of f(a + y) − f(a) is the same as that of Q(y).

Assume first that all the eigenvalues λ₁, …, λₙ of H(a) are positive. Let h be the smallest eigenvalue. If 0 < t < h, the n numbers

λ₁ − t, …, λₙ − t

are also positive. These numbers are the eigenvalues of the real symmetric matrix H(a) − tI, where I is the n × n identity matrix. By Theorem 9.5, the quadratic form y [H(a) − tI] yᵗ is positive definite, and hence

Q(y) > t ‖y‖²   for all y ≠ 0

and for all real t satisfying 0 < t < h. Taking t = h/2 we obtain the inequality

Q(y) > (h/2) ‖y‖²   for all y ≠ 0.

Since E₂(a, y) → 0 as y → 0, there is a positive number r such that |E₂(a, y)| < h/4 whenever 0 < ‖y‖ < r. For such y we have

‖y‖² |E₂(a, y)| < (h/4) ‖y‖² < (1/2) Q(y),

and Taylor's formula (9.39) shows that

f(a + y) − f(a) ≥ (1/2) Q(y) − ‖y‖² |E₂(a, y)| > 0.

Therefore f has a relative minimum at a, which proves part (a). To prove (b) we can use a similar argument, or simply apply part (a) to −f.

To prove (c), let λ₁ and λ₂ be two eigenvalues of H(a) of opposite signs. Let h = min {|λ₁|, |λ₂|}. Then for each real t satisfying −h < t < h the numbers

λ₁ − t   and   λ₂ − t

are eigenvalues of opposite sign for the matrix H(a) − tI. Therefore, if 0 < |t| < h, the quadratic form y [H(a) − tI] yᵗ takes both positive and negative values in every neighborhood of y = 0. Choose r > 0 as above so that |E₂(a, y)| < h/4 whenever 0 < ‖y‖ < r. Then, arguing as above, we see that for such y the algebraic sign of f(a + y) − f(a) is the same as that of Q(y). Since both positive and negative values occur as y → 0, f has a saddle point at a. This completes the proof.

Note: If all the eigenvalues of H(a) are zero, Theorem 9.6 gives no information concerning the stationary point. Tests involving higher order derivatives can be used to treat such examples, but we shall not discuss them here.
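Theorem 9.6 translates directly into a small numerical test. The following sketch (numpy; the classify helper is hypothetical, not from the text) applies it to the Hessian of f(x, y) = xy at the origin:

    # Classify a stationary point by the eigenvalue signs of its Hessian.
    import numpy as np

    def classify(H, tol=1e-12):
        eig = np.linalg.eigvalsh(H)      # eigenvalues of a symmetric matrix
        if np.all(eig > tol):
            return "relative minimum"
        if np.all(eig < -tol):
            return "relative maximum"
        if np.any(eig > tol) and np.any(eig < -tol):
            return "saddle point"
        return "inconclusive (zero eigenvalue)"

    H = np.array([[0.0, 1.0],
                  [1.0, 0.0]])           # Hessian of f(x, y) = x*y at (0, 0)
    print(classify(H))                   # saddle point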

9.12 Second-derivative test for extrema of functions of two variables

In the case n = 2 the nature of the stationary point can also be determined by the algebraic sign of the second derivative D₁,₁f(a) and the determinant of the Hessian matrix.

THEOREM 9.7. Let a be a stationary point of a scalar field f(x₁, x₂) with continuous second-order partial derivatives in a 2-ball B(a). Let

A = D₁,₁f(a),   B = D₁,₂f(a),   C = D₂,₂f(a),

and let

Δ = det H(a) = AC − B².

Then we have:
(a) If Δ < 0, f has a saddle point at a.
(b) If Δ > 0 and A > 0, f has a relative minimum at a.
(c) If Δ > 0 and A < 0, f has a relative maximum at a.
(d) If Δ = 0, the test is inconclusive.

Proof. In this case the characteristic equation det [λI − H(a)] = 0 is a quadratic equation,

λ² − (A + C)λ + Δ = 0.

The eigenvalues λ₁, λ₂ are related to the coefficients by the equations

λ₁ + λ₂ = A + C,   λ₁λ₂ = Δ.

If Δ < 0 the eigenvalues have opposite signs, so f has a saddle point at a, which proves (a). If Δ > 0 the eigenvalues have the same sign. In this case AC > B² ≥ 0, so A and C have the same sign. This sign must be that of λ₁ and λ₂ since A + C = λ₁ + λ₂. This proves (b) and (c).

To prove (d) we refer to Examples 4 and 5 of Section 9.9. In both these examples we have Δ = 0 at the origin. In Example 4 the origin is a saddle point, and in Example 5 it is a relative minimum.

Even when Theorem 9.7 is applicable it may not be the simplest way to determine the nature of a stationary point. For example, when

f(x, y) = ⋯,

the test is applicable, but the computations are lengthy. In this case we may express f(x, y) as a sum of squares by writing

f(x, y) = ⋯ + (1 − cos ⋯)².

We see at once that f has relative maxima at the points at which ⋯ = 0 and 1 − cos ⋯ = 0. These are the points ⋯, where n is any integer.
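When the computations are manageable, the test of Theorem 9.7 is mechanical. A sympy sketch (illustrative only) applied to the hyperbolic paraboloid of Example 3, Section 9.9:

    # Second-derivative test for f(x, y) = x*y.
    import sympy as sp

    x, y = sp.symbols('x y')
    f = x*y

    stationary = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
    for p in stationary:
        A = sp.diff(f, x, 2).subs(p)
        B = sp.diff(f, x, y).subs(p)
        C = sp.diff(f, y, 2).subs(p)
        delta = A*C - B**2
        if delta < 0:
            kind = "saddle point"
        elif delta > 0:
            kind = "relative minimum" if A > 0 else "relative maximum"
        else:
            kind = "inconclusive"
        print(p, kind)                   # {x: 0, y: 0} saddle point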

9.13 Exercises

In Exercises 1 through 15, locate and classify the stationary points (if any) of the surfaces having the Cartesian equations given.

13. z = sin x sin y sin(x + y).

16. Let f(x, y) = 3x⁴ − 4x²y + y². Show that on every line y = mx the function has a minimum at (0, 0), but that there is no relative minimum in any two-dimensional neighborhood of the origin. Make a sketch indicating the set of points (x, y) at which f(x, y) > 0 and the set at which f(x, y) < 0.

17. Let f(x, y) = (3 − x)(3 − y)(x + y − 3).

(a) Make a sketch indicating the set of points (x, y) at which f(x, y) ≥ 0.
(b) Find all points (x, y) in the plane at which D₁f(x, y) = D₂f(x, y) = 0. [Hint: D₁f(x, y) has (3 − y) as a factor.]
(c) Which of the stationary points are relative maxima? Which are relative minima? Which are neither? Give reasons for your answers.
(d) Does f have an absolute minimum or an absolute maximum on the whole plane? Give reasons for your answers.

18. Determine all the relative and absolute extreme values and the saddle points for the function f(x, y) = ⋯.

19. Determine constants a and b such that the integral

∫ {ax + b − f(x)}² dx

will be as small as possible if (a) f(x) = ⋯; (b) f(x) = ⋯.

20. Let f(x, y) = Ax² + 2Bxy + Cy² + Dx + Ey + F, where A > 0 and B² < AC.

(a) Prove that a point (x₁, y₁) exists at which f has a minimum. [Hint: Transform the quadratic part to a sum of squares.]
(b) Prove that f(x₁, y₁) = (1/2) Dx₁ + (1/2) Ey₁ + F at this minimum.
(c) Show that ⋯ = AC − B².

21. Method of least squares. Given n distinct numbers x₁, …, xₙ and n further numbers y₁, …, yₙ (not necessarily distinct), it is generally impossible to find a straight line f(x) = ax + b which passes through all the points (xᵢ, yᵢ), that is, such that f(xᵢ) = yᵢ for each i. However, we can try a linear function which makes the "total square error"

E(a, b) = Σᵢ₌₁ⁿ [f(xᵢ) − yᵢ]²

a minimum. Determine values of a and b which do this.

22. Extend the method of least squares to 3-space. That is, find a linear function f(x, y) = ax + by + c which minimizes the total square error

E(a, b, c) = Σᵢ₌₁ⁿ [f(xᵢ, yᵢ) − zᵢ]²,

where (x₁, y₁), …, (xₙ, yₙ) are n given distinct points and z₁, …, zₙ are n given real numbers.

23. Let z₁, …, zₙ be n distinct points in m-space. If x ∈ Rᵐ, define

f(x) = Σₖ₌₁ⁿ ‖x − zₖ‖².

Prove that f has a minimum at the point a = (1/n) Σₖ₌₁ⁿ zₖ (the centroid).

24. Let a be a stationary point of a scalar field f with continuous second-order partial derivatives in an n-ball B(a). Prove that f has a saddle point at a if at least two of the diagonal entries of the Hessian matrix H(a) have opposite signs.

25. Verify that the scalar field f(x, y, z) = x⁴ + y⁴ + z⁴ − 4xyz has a stationary point at (1, 1, 1), and determine the nature of this stationary point by computing the eigenvalues of its Hessian matrix.

9.14 Extrema with constraints. Lagrange’s multipliers

We begin this section with two examples of extremum problems with constraints.

EXAMPLE 1. Given a surface S not passing through the origin, determine those points of S which are nearest to the origin.

EXAMPLE 2. If f(x, y, z) denotes the temperature at (x, y, z), determine the maximum and minimum values of the temperature on a given curve C in 3-space.

Both these examples are special cases of the following general problem: Determine the extreme values of a scalar field f(x) when x is restricted to lie in a given subset of the domain of f.

In Example 1 the scalar field to be minimized is the distance function,

f(x, y, z) = (x² + y² + z²)^{1/2};

the constraining subset is the given surface S. In Example 2 the constraining subset is the given curve C. Constrained extremum problems are often very difficult; no general method is known for attacking them in their fullest generality. Special methods are available when the constraining subset has a fairly simple structure, for instance, if it is a surface as in Example 1, or a curve as in Example 2. This section discusses the method of Lagrange's multipliers for solving such problems. First we describe the method in its general form, and then we give geometric arguments to show why it works in the two examples mentioned above.

The method of Lagrange's multipliers. If a scalar field f(x₁, …, xₙ) has a relative extremum when it is subject to m constraints, say

(9.40)  g₁(x₁, …, xₙ) = 0,  …,  gₘ(x₁, …, xₙ) = 0,

where m < n, then there exist m scalars λ₁, …, λₘ such that

(9.41)  ∇f = λ₁ ∇g₁ + ⋯ + λₘ ∇gₘ

at each extremum point.

To determine the extremum points in practice we consider the system of n + m equations obtained by taking the m constraint equations in (9.40) together with the n scalar equations determined by the vector relation (9.41). These equations are to be solved (if possible) for the n + m unknowns x₁, …, xₙ and λ₁, …, λₘ. The points (x₁, …, xₙ) at which relative extrema occur are found among the solutions to these equations.

The scalars λ₁, …, λₘ which are introduced to help us solve this type of problem are called Lagrange's multipliers. One multiplier is introduced for each constraint. The scalar field f and the constraint functions g₁, …, gₘ are assumed to be differentiable. The method is valid if the number of constraints, m, is less than the number of variables, n, and if not all the Jacobian determinants of the constraint functions with respect to m of the variables x₁, …, xₙ are zero at the extreme value in question. The proof of the validity of the method is an important result in advanced calculus and will not be discussed here. (See Chapter 7 of the author's Mathematical Analysis, Addison-Wesley, Reading, Mass., 1957.) Instead we give geometric arguments to show why the method works in the two examples described at the beginning of this section.
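In practice the system (9.40)-(9.41) is solved directly. A sketch with one constraint (a hypothetical problem, not one of the text's exercises): minimize f(x, y) = x² + y² subject to x + y = 1.

    # Lagrange's method: grad f = lam * grad g together with g = 0.
    import sympy as sp

    x, y, lam = sp.symbols('x y lam')
    f = x**2 + y**2
    g = x + y - 1

    eqs = [sp.diff(f, x) - lam*sp.diff(g, x),
           sp.diff(f, y) - lam*sp.diff(g, y),
           g]
    print(sp.solve(eqs, [x, y, lam], dict=True))
    # [{lam: 1, x: 1/2, y: 1/2}] -> the minimum occurs at (1/2, 1/2)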

Geometric solution of Example 1. We wish to determine those points on a given surface S which are nearest to the origin. A point (x, y, z) in 3-space lies at a distance r from the origin if and only if it lies on the sphere

x² + y² + z² = r².

This sphere is a level surface of the function f(x, y, z) = (x² + y² + z²)^{1/2} which is being minimized. If we start with r = 0 and let r increase until the corresponding level surface first touches the given surface S, each point of contact will be a point of S nearest to the origin.

To determine the coordinates of the contact points we assume that S is described by a Cartesian equation g(x, y, z) = 0. If S has a tangent plane at a point of contact, this plane must also be tangent to the contacting level surface. Therefore the gradient vector of the surface g(x, y, z) = 0 must be parallel to the gradient vector ∇f of the contacting level surface. Hence there is a constant λ such that

∇f = λ ∇g

at each contact point. This is the vector equation (9.41) provided by Lagrange's method when there is one constraint.

Geometric solution of Example 2. We seek the extreme values of a temperature function f(x, y, z) on a given curve C. If we regard the curve C as the intersection of two surfaces, say

g₁(x, y, z) = 0   and   g₂(x, y, z) = 0,

we have an extremum problem with two constraints. The two gradient vectors ∇g₁ and ∇g₂ are normals to these surfaces, hence they are also normal to C, the curve of intersection. (See Figure 9.8.) We show next that the gradient vector ∇f of the temperature function is also normal to C at each relative extremum on C. This implies that ∇f lies in the same plane as ∇g₁ and ∇g₂; hence if ∇g₁ and ∇g₂ are independent we can express ∇f as a linear combination of ∇g₁ and ∇g₂, say

∇f = λ₁ ∇g₁ + λ₂ ∇g₂.

This is the vector equation (9.41) provided by Lagrange's method when there are two constraints.

FIGURE 9.8 The vectors ∇g₁, ∇g₂, and ∇f shown lying in the same plane.
FIGURE 9.9 The gradient vector ∇f lies in a plane normal to C.

To show that ∇f is normal to C at an extremum we imagine C as being described by a vector-valued function α(t), where t varies over an interval [a, b]. On the curve C the temperature becomes a function of t, say φ(t) = f[α(t)]. If φ has a relative extremum at an interior point t₁ of [a, b] we must have φ′(t₁) = 0. On the other hand, the chain rule tells us that φ′(t) is given by the dot product

φ′(t) = ∇f[α(t)] · α′(t).

This dot product is zero at t₁; hence ∇f is perpendicular to α′(t₁). But α′(t₁) is tangent to C, so ∇f[α(t₁)] lies in the plane normal to C, as shown in Figure 9.9.

The two gradient vectors ∇g₁ and ∇g₂ are independent if and only if their cross product is nonzero. The cross product is given by

∇g₁ × ∇g₂ = ∂(g₁, g₂)/∂(y, z) i + ∂(g₁, g₂)/∂(z, x) j + ∂(g₁, g₂)/∂(x, y) k.

Therefore, independence of ∇g₁ and ∇g₂ means that not all three of the Jacobian determinants on the right are zero. As remarked earlier, Lagrange's method is applicable whenever this condition is satisfied.

If ∇g₁ and ∇g₂ are dependent the method may fail. For example, suppose we try to apply Lagrange's method to find the extreme values of f(x, y, z) = x² + y² on the curve of intersection of the two surfaces g₁(x, y, z) = 0 and g₂(x, y, z) = 0, where g₁(x, y, z) = z and g₂(x, y, z) = z² − (y − 1)³. The two surfaces, a plane and a cylinder, intersect along the straight line C shown in Figure 9.10. The problem obviously has a solution, because f(x, y, z) represents the square of the distance of the point (x, y, z) from the z-axis and this distance is a minimum on C when the point is at (0, 1, 0). However, at this point the gradient vectors are ∇g₁ = k, ∇g₂ = 0, and ∇f = 2j, and it is clear that there are no scalars λ₁ and λ₂ that satisfy Equation (9.41).

FIGURE 9.10 An example where Lagrange's method is not applicable.

9.15 Exercises

1. Find the extreme values of z = xy subject to the condition x + y = 1.

2. Find the maximum and minimum distances from the origin to the curve 5x² + 6xy + 5y² = 8.

3. Assume a and b are fixed positive numbers.
(a) Find the extreme values of z = x/a + y/b subject to the condition x² + y² = 1.
(b) Find the extreme values of z = x² + y² subject to the condition x/a + y/b = 1.
In each case, interpret the problem geometrically.

4. Find the extreme values of z = cos²x + cos²y subject to the side condition x − y = π/4.

5. Find the extreme values of the scalar field f(x, y, z) = x − 2y + 2z on the sphere x² + y² + z² = 1.

6. Find the points of the surface z² − xy = 1 nearest to the origin.

7. Find the shortest distance from the point ⋯ to the parabola y² = 4x.

8. Find the points on the curve of intersection of the two surfaces

x² − xy + y² − z² = 1   and   x² + y² = 1

which are nearest to the origin.

9. If a, b, and c are positive numbers, find the maximum value of f(x, y, z) = xᵃ yᵇ zᶜ subject to the side condition x + y + z = 1.

10. Find the minimum volume bounded by the planes x = 0, y = 0, z = 0, and a plane which is tangent to the ellipsoid x²/a² + y²/b² + z²/c² = 1 at a point in the octant x > 0, y > 0, z > 0.

11. Find the maximum of log x + log y + 3 log z on that portion of the sphere x² + y² + z² = 5r² where x > 0, y > 0, z > 0. Use the result to prove that for real positive numbers a, b, c we have

abc³ ≤ 27 ((a + b + c)/5)⁵.

12. Given the conic section Ax² + 2Bxy + Cy² = 1, where A > 0 and B² < AC. Let m and M denote the distances from the origin to the nearest and furthest points of the conic. Show that

m² = 2 / (A + C + [(A − C)² + 4B²]^{1/2}),

and find a companion formula for M².

13. Use the method of Lagrange's multipliers to find the greatest and least distances of a point on the ellipse x² + 4y² = 4 from the straight line x + y = 4.

14. The cross section of a trough is an isosceles trapezoid. If the trough is made by bending up the sides of a strip of metal c inches wide, what should be the angle of inclination of the sides and the width across the bottom if the cross-sectional area is to be a maximum?


9.16 The extreme-value theorem for continuous scalar fields

The extreme-value theorem for real-valued functions continuous on a closed and bounded interval can be extended to scalar fields. We consider scalar fields continuous on a closed n-dimensional interval. Such an interval is defined as the Cartesian product of n one-dimensional closed intervals. If a = (a₁, …, aₙ) and b = (b₁, …, bₙ) we write

[a, b] = [a₁, b₁] × ⋯ × [aₙ, bₙ].

For example, when n = 2 the Cartesian product [a, b] is a rectangle. The proof of the extreme-value theorem parallels the proof given in Volume I for the 1-dimensional case. First we prove that continuity of f implies boundedness, then we prove that f actually attains its maximum and minimum values somewhere in [a, b].

THEOREM 9.8. BOUNDEDNESS THEOREM FOR CONTINUOUS SCALAR FIELDS. If f is continuous at each point of a closed interval [a, b] in Rⁿ, then f is bounded on [a, b]. That is, there is a number C ≥ 0 such that |f(x)| ≤ C for all x in [a, b].

Proof. We argue by contradiction, using the method of successive bisection. Figure 9.11 illustrates the method for the case n = 2.

Assume f is unbounded on [a, b]. Let I⁽¹⁾ = [a, b] and let Iₖ⁽¹⁾ = [aₖ, bₖ], so that

I⁽¹⁾ = I₁⁽¹⁾ × ⋯ × Iₙ⁽¹⁾.

Bisect each one-dimensional interval Iₖ⁽¹⁾ to form two subintervals, a left half and a right half. Now consider all possible Cartesian products formed by choosing, for each k, either the left half or the right half of Iₖ⁽¹⁾.

FIGURE 9.11 Illustrating the method of successive bisection in the plane.


There are exactly 2ⁿ such products. Each product is an n-dimensional subinterval of [a, b], and their union is equal to [a, b]. The function f is unbounded in at least one of these subintervals (if it were bounded in each of them it would also be bounded on [a, b]). One of these we denote by I⁽²⁾, which we express as

I⁽²⁾ = I₁⁽²⁾ × ⋯ × Iₙ⁽²⁾,

where each Iₖ⁽²⁾ is one of the one-dimensional subintervals of Iₖ⁽¹⁾, of length (bₖ − aₖ)/2. We now proceed with I⁽²⁾ as we did with I⁽¹⁾, bisecting each one-dimensional component interval Iₖ⁽²⁾ and arriving at an n-dimensional interval I⁽³⁾ in which f is unbounded. We continue the process, obtaining an infinite set of n-dimensional intervals

I⁽¹⁾ ⊇ I⁽²⁾ ⊇ I⁽³⁾ ⊇ ⋯,

in each of which f is unbounded. The mth interval I⁽ᵐ⁾ can be expressed in the form

I⁽ᵐ⁾ = I₁⁽ᵐ⁾ × ⋯ × Iₙ⁽ᵐ⁾.

Since each one-dimensional interval Iₖ⁽ᵐ⁾ is obtained by m − 1 successive bisections of [aₖ, bₖ], if we write Iₖ⁽ᵐ⁾ = [aₖ⁽ᵐ⁾, bₖ⁽ᵐ⁾] we have

(9.42)  bₖ⁽ᵐ⁾ − aₖ⁽ᵐ⁾ = (bₖ − aₖ)/2^{m−1}   for k = 1, 2, …, n.

For each fixed k, the supremum of all left endpoints aₖ⁽ᵐ⁾ (m = 1, 2, …) must therefore be equal to the infimum of all right endpoints bₖ⁽ᵐ⁾ (m = 1, 2, …). Their common value we denote by tₖ. The point t = (t₁, …, tₙ) lies in [a, b]. By continuity of f at t there is an n-ball B(t; r) in which we have

|f(x) − f(t)| < 1.

This inequality implies

|f(x)| < 1 + |f(t)|

for all x in B(t; r) ∩ [a, b], so f is bounded on the set B(t; r) ∩ [a, b]. But this set contains the entire interval I⁽ᵐ⁾ when m is large enough so that each of the n numbers in (9.42) is less than r/√n; therefore for such m the function f is bounded on I⁽ᵐ⁾, contradicting the fact that f is unbounded on I⁽ᵐ⁾.

This contradiction completes the proof.

If f is bounded on [a, b], the set of all function values f(x) is a set of real numbers bounded above and below. Therefore this set has a supremum and an infimum which we denote by sup f and inf f, respectively. That is, we write

sup f = sup {f(x) : x ∈ [a, b]},   inf f = inf {f(x) : x ∈ [a, b]}.

Now we prove that a continuous function takes on both values inf f and sup f somewhere in [a, b].

THEOREM 9.9. EXTREME-VALUE THEOREM FOR CONTINUOUS SCALAR FIELDS. If f is continuous on a closed interval [a, b] in Rⁿ, then there exist points c and d in [a, b] such that

f(c) = sup f   and   f(d) = inf f.

Proof. It suffices to prove that f attains its supremum in [a, b]. The result for the infimum then follows as a consequence because the infimum of f is the supremum of −f.

Let M = sup f. We shall assume that there is no x in [a, b] for which f(x) = M and obtain a contradiction. Let g(x) = M − f(x). Then g(x) > 0 for all x in [a, b], so the reciprocal 1/g is continuous on [a, b]. By the boundedness theorem, 1/g is bounded on [a, b], say 1/g(x) < C for all x in [a, b], where C > 0. This implies M − f(x) > 1/C, so that

f(x) < M − 1/C

for all x in [a, b]. This contradicts the fact that M is the least upper bound of f on [a, b]. Hence f(x) = M for at least one x in [a, b].

9.17 The small-span theorem for continuous scalar fields (uniform continuity)

Let f be continuous on a bounded closed interval [a, b] in Rⁿ, and let M(f) and m(f) denote, respectively, the maximum and minimum values of f on [a, b]. The difference

M(f) − m(f)

is called the span of f on [a, b]. As in the one-dimensional case we have a small-span theorem for continuous functions which tells us that the interval [a, b] can be partitioned so that the span of f in each subinterval is arbitrarily small.

Write [a, b] = [a₁, b₁] × ⋯ × [aₙ, bₙ] and let Pₖ be a partition of the interval [aₖ, bₖ]. That is, Pₖ is a set of points

Pₖ = {xₖ₀, xₖ₁, …, xₖᵣ}

such that xₖ₀ = aₖ and xₖᵣ = bₖ. The Cartesian product

P = P₁ × ⋯ × Pₙ

is called a partition of the interval [a, b]. The small-span theorem, also called the theorem on uniform continuity, now takes the following form.

THEOREM 9.10. Let f be a scalar field continuous on a closed interval [a, b] in Rⁿ. Then for every ε > 0 there is a partition of [a, b] into a finite number of subintervals such that the span of f in every subinterval is less than ε.

Proof. The proof is entirely analogous to the one-dimensional case so we only outline the principal steps. We argue by contradiction, using the method of successive bisection.

We assume the theorem is false; that is, we assume that for some ε₀ the interval [a, b] cannot be partitioned into a finite number of subintervals in each of which the span of f is less than ε₀. By successive bisection we obtain an infinite set of subintervals I⁽¹⁾, I⁽²⁾, …, in each of which the span of f is at least ε₀. By considering the least upper bound of the leftmost endpoints of the component intervals of I⁽¹⁾, I⁽²⁾, …, we obtain a point t in [a, b] lying in all these intervals. By continuity of f at t, there is an n-ball B(t; r) such that the span of f is less than ε₀ in B(t; r) ∩ [a, b]. But, when m is sufficiently large, the interval I⁽ᵐ⁾ lies in the set B(t; r) ∩ [a, b], so the span of f is no larger than ε₀ in I⁽ᵐ⁾, contradicting the fact that the span of f is at least ε₀ in I⁽ᵐ⁾.