SYSTEMS OF DIFFERENTIAL EQUATIONS
7.1 Introduction
Although the study of differential equations began in the 17th century, it was not until the 19th century that mathematicians realized that relatively few differential equations could be solved by elementary means. The work of Liouville and others showed the importance of establishing general theorems to guarantee the existence of solutions to certain specific classes of differential equations. Chapter 6 illustrated the use of a uniqueness theorem in the study of linear differential equations. This chapter is concerned with a proof of this theorem and related topics.
Existence theory for differential equations of higher order can be reduced to the first-order case by the introduction of systems of equations. For example, the second-order equation

(7.1)  y'' + a₁(t)y' + a₂(t)y = R(t)

can be transformed to a system of two first-order equations by introducing two unknown functions y₁ and y₂, where

y₁ = y,  y₂ = y₁'.

Then we have y₂' = y₁'' = y'', so (7.1) can be written as a system of two first-order equations:

(7.2)  y₁' = y₂,
       y₂' = -a₂(t)y₁ - a₁(t)y₂ + R(t).

We cannot solve the equations separately by the methods of Chapter 6 because each of them involves two unknown functions.
In this chapter we consider systems consisting of n linear differential equations of first order involving n unknown functions y₁, …, y_n. These systems have the form

(7.3)  y_i'(t) = Σ_{j=1}^{n} p_{ij}(t) y_j(t) + q_i(t)  (i = 1, 2, …, n).

The functions p_{ij} and q_i which appear in (7.3) are considered as given functions defined on a given interval J. The functions y₁, …, y_n are unknown functions to be determined. Systems of this type are called linear systems. In general, each equation in the system involves more than one unknown function, so the equations cannot be solved separately.
A linear differential equation of order n can always be transformed to a linear system. Suppose the given nth-order equation is

(7.4)  y^(n) + a₁(t)y^(n-1) + ⋯ + a_n(t)y = R(t),

where the coefficients a₁, …, a_n and R are given functions. To transform this to a system we write y₁ = y and introduce a new unknown function for each of the successive derivatives of y. That is, we put

y₁ = y,  y₂ = y₁',  y₃ = y₂',  …,  y_n = y_{n-1}',

and rewrite (7.4) as the system

(7.5)  y₁' = y₂,
       y₂' = y₃,
       ⋮
       y_{n-1}' = y_n,
       y_n' = -a_n(t)y₁ - a_{n-1}(t)y₂ - ⋯ - a₁(t)y_n + R(t).

The discussion of systems may be simplified considerably by the use of vector and matrix
notation. Consider the general system (7.3) and introduce vector-valued functions Y = (y₁, …, y_n), Q = (q₁, …, q_n), and a matrix-valued function P = [p_{ij}], defined by the equations

Y(t) = (y₁(t), …, y_n(t)),  Q(t) = (q₁(t), …, q_n(t)),  P(t) = [p_{ij}(t)]

for each t in J. We regard the vectors as n × 1 column matrices and write the system (7.3) in the simpler form

(7.6)  Y' = P(t)Y + Q(t).

For example, in system (7.2) we have

Y = [y₁; y₂],  P(t) = [0  1; -a₂(t)  -a₁(t)],  Q(t) = [0; R(t)].

In system (7.5) we have Y = [y₁; … ; y_n], Q(t) = [0; … ; 0; R(t)], and P(t) is the n × n matrix

P(t) =
[  0        1        0      ⋯    0   ]
[  0        0        1      ⋯    0   ]
[  ⋮                              ⋮   ]
[  0        0        0      ⋯    1   ]
[ -a_n(t)  -a_{n-1}(t)  -a_{n-2}(t)  ⋯  -a₁(t) ].
An initial-value problem for system (7.6) is to find a vector-valued function Y which satisfies (7.6) on J and which also satisfies an initial condition of the form Y(a) = B, where a is a point in J and B = (b₁, …, b_n) is a given n-dimensional vector.

In the case n = 1 (the scalar case) we know from Theorem 6.1 that, if P and Q are continuous on J, all solutions of (7.6) are given by the explicit formula

(7.7)  Y(x) = e^{A(x)}Y(a) + e^{A(x)} ∫_a^x e^{-A(t)}Q(t) dt,

where A(x) = ∫_a^x P(t) dt, and a is any point in J. We will show that this formula can be suitably generalized for systems, that is, when P(t) is an n × n matrix function and Q(t) is an n-dimensional vector function. To do this we must assign a meaning to integrals of matrices and to exponentials of matrices. Therefore we digress briefly to discuss the calculus of matrix functions.
7.2 Calculus of matrix functions
The generalization of the concepts of integral and derivative for matrix functions is
straightforward. If P(t) =
we define the integral
P(t) dt by the equation
That is, the integral of matrix P(t) is the matrix obtained by integrating each entry of P(t), assuming of course, that each entry is integrable on [a, b]. The reader can verify that the linearity property for integrals generalizes to matrix functions.
Continuity and differentiability of matrix functions are also defined in terms of the entries. We say that a matrix function P = [p_{ij}] is continuous at t if each entry p_{ij} is continuous at t. The derivative P' is defined by differentiating each entry,

P'(t) = [p_{ij}'(t)],

whenever all the derivatives p_{ij}'(t) exist. It is easy to verify the basic differentiation rules for sums and products. For example, if P and Q are differentiable matrix functions, we have

(P + Q)' = P' + Q'

if P and Q are of the same size, and we also have

(PQ)' = PQ' + P'Q

if the product PQ is defined. The chain rule also holds. That is, if F(t) = P[g(t)], where P is a differentiable matrix function and g is a differentiable scalar function, then F'(t) = g'(t)P'[g(t)]. The zero-derivative theorem and the first and second fundamental theorems of calculus are also valid for matrix functions. Proofs of these properties are requested in the next set of exercises.
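Because derivatives of matrix functions are taken entry by entry, rules like the product rule are easy to check numerically. The following sketch is our own illustration (the matrix functions P and Q are arbitrary choices, not from the text); it compares a central difference quotient of PQ with PQ' + P'Q:

```python
import numpy as np

# Two arbitrary differentiable 2x2 matrix functions and their derivatives.
def P(t):
    return np.array([[t, t**2], [1.0, np.sin(t)]])

def P_prime(t):
    return np.array([[1.0, 2.0*t], [0.0, np.cos(t)]])

def Q(t):
    return np.array([[np.exp(t), 0.0], [t, 1.0]])

def Q_prime(t):
    return np.array([[np.exp(t), 0.0], [1.0, 0.0]])

def numeric_derivative(F, t, h=1e-6):
    # derivative of a matrix function, taken entry by entry
    return (F(t + h) - F(t - h)) / (2.0 * h)

t0 = 0.7
lhs = numeric_derivative(lambda t: P(t) @ Q(t), t0)   # (PQ)'(t0)
rhs = P(t0) @ Q_prime(t0) + P_prime(t0) @ Q(t0)       # PQ' + P'Q at t0
```

The two sides agree to the accuracy of the difference quotient; note that the order of the factors matters, since matrix multiplication is not commutative.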
The definition of the exponential of a matrix is not so simple and requires further preparation. This is discussed in the next section.
7.3 Infinite series of matrices. Norms of matrices
Let A = [a_{ij}] be an n × n matrix of real or complex entries. We wish to define the exponential e^A in such a way that it possesses some of the fundamental properties of the ordinary real- or complex-valued exponential. In particular, we shall require the law of exponents in the form

(7.8)  e^{sA} e^{tA} = e^{(s+t)A}  for all real s and t,

and the relation

(7.9)  e^O = I,

where O and I are the n × n zero and identity matrices, respectively. It might seem natural to define e^A to be the matrix [e^{a_{ij}}]. However, this is unacceptable since it satisfies neither of the properties (7.8) and (7.9). Instead, we shall define e^A by means of a power series expansion,

e^A = Σ_{k=0}^{∞} A^k/k!.

We know that this formula holds if A is a real or complex number, and we will prove that it implies properties (7.8) and (7.9) if A is a matrix. Before we can do this we need to explain what is meant by a convergent series of matrices.
DEFINITION OF A CONVERGENT SERIES OF MATRICES. Given a sequence {C_k} of m × n matrices whose entries are real or complex numbers, denote the ij-entry of C_k by c_{ij}^{(k)}. If all mn series

(7.10)  Σ_{k=1}^{∞} c_{ij}^{(k)}  (i = 1, …, m; j = 1, …, n)

are convergent, then we say the series of matrices Σ_{k=1}^{∞} C_k is convergent, and its sum is defined to be the m × n matrix whose ij-entry is the sum of the series in (7.10).
A simple and useful test for convergence of a series of matrices can be given in terms of the norm of a matrix, a generalization of the absolute value of a number.
DEFINITION OF THE NORM OF A MATRIX. If A = [a_{ij}] is an m × n matrix of real or complex entries, the norm of A, denoted by ‖A‖, is defined to be the nonnegative number given by the formula

(7.11)  ‖A‖ = Σ_{i=1}^{m} Σ_{j=1}^{n} |a_{ij}|.

In other words, the norm of A is the sum of the absolute values of all its entries. There are other definitions of norms that are sometimes used, but we have chosen this one because of the ease with which we can prove the following properties.
THEOREM 7.1. FUNDAMENTAL PROPERTIES OF NORMS. For rectangular matrices A and B and all real or complex scalars c we have

‖A + B‖ ≤ ‖A‖ + ‖B‖,  ‖AB‖ ≤ ‖A‖ ‖B‖,  ‖cA‖ = |c| ‖A‖.

Proof. We prove only the result for ‖AB‖, assuming that A is m × n and B is n × p. The proofs of the others are simpler and are left as exercises. Writing A = [a_{ik}], B = [b_{kj}], we have AB = [Σ_{k=1}^{n} a_{ik}b_{kj}], so from (7.11) we obtain

‖AB‖ = Σ_{i} Σ_{j} | Σ_{k} a_{ik}b_{kj} | ≤ Σ_{i} Σ_{j} Σ_{k} |a_{ik}| |b_{kj}| ≤ ( Σ_{i} Σ_{k} |a_{ik}| )( Σ_{k} Σ_{j} |b_{kj}| ) = ‖A‖ ‖B‖.

Note that in the special case B = A the inequality for ‖AB‖ becomes ‖A²‖ ≤ ‖A‖². By induction we also have

‖A^k‖ ≤ ‖A‖^k  for k = 1, 2, 3, ….

These inequalities will be useful in the discussion of the exponential matrix.
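The norm (7.11) and the inequalities of Theorem 7.1 are straightforward to check numerically. Here is a small sketch of ours, using two arbitrary 2 × 2 matrices:

```python
import numpy as np

def matrix_norm(A):
    # the norm (7.11): sum of the absolute values of all entries
    return float(np.abs(A).sum())

A = np.array([[1.0, -2.0], [3.0, 0.5]])
B = np.array([[0.0, 4.0], [-1.0, 2.0]])
c = -3.0

triangle   = matrix_norm(A + B) <= matrix_norm(A) + matrix_norm(B)
submult    = matrix_norm(A @ B) <= matrix_norm(A) * matrix_norm(B)
homogen    = matrix_norm(c * A) == abs(c) * matrix_norm(A)
power_ineq = matrix_norm(np.linalg.matrix_power(A, 3)) <= matrix_norm(A) ** 3
```

All four relations hold, the last being the induction consequence ‖A^k‖ ≤ ‖A‖^k noted above.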
The next theorem gives a useful sufficient condition for convergence of a series of matrices.
THEOREM 7.2. TEST FOR CONVERGENCE OF A MATRIX SERIES. If {C_k} is a sequence of m × n matrices such that Σ_{k=1}^{∞} ‖C_k‖ converges, then the matrix series Σ_{k=1}^{∞} C_k also converges.

Proof. Let the ij-entry of C_k be denoted by c_{ij}^{(k)}. Since |c_{ij}^{(k)}| ≤ ‖C_k‖, convergence of Σ_k ‖C_k‖ implies absolute convergence of each series Σ_k c_{ij}^{(k)}. Hence each series Σ_k c_{ij}^{(k)} is convergent, so the matrix series Σ_k C_k is convergent.
7.4 Exercises
1. Verify that the linearity property of integrals also holds for integrals of matrix functions.
2. Verify each of the following differentiation rules for matrix functions, assuming P and Q are differentiable. In (a), P and Q must be of the same size so that P + Q is meaningful. In (b), they need not be of the same size provided the products are meaningful. In (c) and (d), Q is assumed to be nonsingular.

(a) (P + Q)' = P' + Q'.
(b) (PQ)' = PQ' + P'Q.
(c) (Q⁻¹)' = -Q⁻¹Q'Q⁻¹.
(d) (PQ⁻¹)' = P'Q⁻¹ - PQ⁻¹Q'Q⁻¹.
3. (a) Let P be a differentiable matrix function. Prove that the derivatives of P² and P³ are given by the formulas

(P²)' = PP' + P'P,  (P³)' = P²P' + PP'P + P'P².

(b) Guess a general formula for the derivative of P^n and prove it by induction.

4. Let P be a differentiable matrix function and let g be a differentiable scalar function whose range is a subset of the domain of P. Define the composite function F(t) = P[g(t)] and prove the chain rule, F'(t) = g'(t)P'[g(t)].
5. Prove the zero-derivative theorem for matrix functions: If P’(t) = 0 for every t in an open
interval (a, b), then the matrix function P is constant on (a, b).
6. State and prove generalizations of the first and second fundamental theorems of calculus for matrix functions.
7. State and prove a formula for integration by parts in which the integrands are matrix functions.
8. Prove the following properties of matrix norms: ‖A + B‖ ≤ ‖A‖ + ‖B‖ and ‖cA‖ = |c| ‖A‖.

9. If a matrix function P is integrable on an interval [a, b], prove that

‖ ∫_a^b P(t) dt ‖ ≤ ∫_a^b ‖P(t)‖ dt.
10. Let D be an n × n diagonal matrix, say D = diag(λ₁, …, λ_n). Prove that the matrix series Σ_{k=0}^{∞} D^k/k! converges and is also a diagonal matrix,

Σ_{k=0}^{∞} D^k/k! = diag(e^{λ₁}, …, e^{λ_n}).

(The term corresponding to k = 0 is understood to be the identity matrix I.)

11. Let D be an n × n diagonal matrix, D = diag(λ₁, …, λ_n). If the matrix series Σ_k c_k D^k converges, prove that

Σ_k c_k D^k = diag( Σ_k c_k λ₁^k, …, Σ_k c_k λ_n^k ).

12. Assume that the matrix series Σ_k C_k converges, where each C_k is an n × n matrix. Prove that the matrix series Σ_k A C_k B also converges and that its sum is the matrix A( Σ_k C_k )B. Here A and B are matrices such that the products A C_k B are meaningful.
7.5 The exponential matrix
Using Theorem 7.2 it is easy to prove that the matrix series

(7.12)  Σ_{k=0}^{∞} A^k/k!

converges for every square matrix A with real or complex entries. (The term corresponding to k = 0 is understood to be the identity matrix I.) The norm of each term satisfies the inequality

‖A^k/k!‖ ≤ ‖A‖^k/k!.

Since the series Σ_k a^k/k! converges for every real a, Theorem 7.2 implies that the series in (7.12) converges for every square matrix A.

DEFINITION OF THE EXPONENTIAL MATRIX. For any n × n matrix A with real or complex entries we define the exponential e^A to be the n × n matrix given by the convergent series in (7.12). That is,

e^A = Σ_{k=0}^{∞} A^k/k!.

Note that this definition implies e^O = I, where O is the zero matrix. Further properties of the exponential will be developed with the help of differential equations.
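The series (7.12) can be summed numerically by accumulating partial sums. The sketch below is our own illustration: it truncates the series and checks it against the closed form e^D = diag(e^{λ₁}, …, e^{λ_n}), which holds for a diagonal matrix D (compare Exercise 10 of Section 7.4):

```python
import math
import numpy as np

def exp_series(A, terms=30):
    # partial sum of e^A = I + A + A^2/2! + ... through the given number of terms
    n = A.shape[0]
    total = np.zeros((n, n))
    term = np.eye(n)               # the k = 0 term, A^0/0! = I
    for k in range(terms):
        total = total + term
        term = term @ A / (k + 1)  # obtain A^(k+1)/(k+1)! from A^k/k!
    return total

# For a diagonal matrix the sum is known exactly:
# e^{diag(l1,...,ln)} = diag(e^{l1}, ..., e^{ln}).
D = np.diag([1.0, -2.0, 0.5])
closed_form = np.diag([math.exp(1.0), math.exp(-2.0), math.exp(0.5)])
approx = exp_series(D)
```

Thirty terms already agree with the closed form to machine precision here, since the factorials in the denominators grow much faster than the powers ‖A‖^k in the numerators.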
7.6 The differential equation satisfied by e^{tA}

Let t be a real number, let A be an n × n matrix, and let E(t) be the n × n matrix given by

E(t) = e^{tA}.

We shall keep A fixed and study this matrix as a function of t. First we obtain a differential equation satisfied by E.

THEOREM 7.3. For every real t the matrix function E defined by E(t) = e^{tA} satisfies the matrix differential equation

E'(t) = E(t)A = AE(t).

Proof. From the definition of the exponential matrix we have

E(t) = Σ_{k=0}^{∞} (t^k/k!) A^k.

Let a_{ij}^{(k)} denote the ij-entry of A^k. Then the ij-entry of (t^k/k!)A^k is (t^k/k!)a_{ij}^{(k)}. Hence, from the definition of a matrix series, we have

(7.13)  E(t) = [ Σ_{k=0}^{∞} (t^k/k!) a_{ij}^{(k)} ].

Each entry on the right of (7.13) is a power series in t, convergent for all t. Therefore its derivative exists for all t and is given by the differentiated series

Σ_{k=1}^{∞} (t^{k-1}/(k-1)!) a_{ij}^{(k)}.

This shows that the derivative E'(t) exists and is given by the matrix series

E'(t) = Σ_{k=1}^{∞} (t^{k-1}/(k-1)!) A^k = ( Σ_{k=1}^{∞} (t^{k-1}/(k-1)!) A^{k-1} ) A = E(t)A.

In the last equation we used the property A^k = A^{k-1}A. Since A commutes with A^{k-1} we could also have written A^k = AA^{k-1} to obtain the relation E'(t) = AE(t). This completes the proof.

Note: The foregoing proof also shows that A commutes with e^{tA}.
7.7 Uniqueness theorem for the matrix differential equation F'(t) = AF(t)

In this section we prove a uniqueness theorem which characterizes all solutions of the matrix differential equation F'(t) = AF(t). The proof makes use of the following theorem.

THEOREM 7.4. NONSINGULARITY OF e^{tA}. For any n × n matrix A and any scalar t we have

(7.14)  e^{tA} e^{-tA} = I.

Hence e^{tA} is nonsingular, and its inverse is e^{-tA}.

Proof. Let F be the matrix function defined for all real t by the equation

F(t) = e^{tA} e^{-tA}.

We shall prove that F(t) is the identity matrix I by showing that the derivative F'(t) is the zero matrix. Differentiating F as a product, using the result of Theorem 7.3, we find

F'(t) = e^{tA}(-Ae^{-tA}) + Ae^{tA}e^{-tA} = -Ae^{tA}e^{-tA} + Ae^{tA}e^{-tA} = O,

since A commutes with e^{tA}. Therefore, by the zero-derivative theorem, F is a constant matrix. But F(0) = I, so F(t) = I for all t. This proves (7.14).
THEOREM 7.5. UNIQUENESS THEOREM. Let A and B be given n × n constant matrices. Then the only n × n matrix function F satisfying the initial-value problem

F'(t) = AF(t),  F(0) = B,

for -∞ < t < +∞ is

(7.15)  F(t) = e^{tA}B.

Proof. First we note that e^{tA}B is a solution. Now let F be any solution and consider the matrix function

G(t) = e^{-tA}F(t).

Differentiating this product we obtain

G'(t) = e^{-tA}F'(t) - Ae^{-tA}F(t) = e^{-tA}AF(t) - Ae^{-tA}F(t) = O,

since A commutes with e^{-tA}. Therefore G(t) is a constant matrix,

G(t) = G(0) = F(0) = B.

In other words, e^{-tA}F(t) = B. Multiplying by e^{tA} and using (7.14) we obtain (7.15).

Note: The same type of proof shows that F(t) = Be^{tA} is the only solution of the initial-value problem

F'(t) = F(t)A,  F(0) = B.
7.8 The law of exponents for exponential matrices
The law of exponents e^{A+B} = e^A e^B is not always true for matrix exponentials. A counterexample is given in Exercise 13 of Section 7.12. However, it is not difficult to prove that the formula is true for matrices A and B which commute.

THEOREM 7.6. Let A and B be two n × n matrices which commute, AB = BA. Then we have

(7.16)  e^{A+B} = e^A e^B.

Proof. From the equation AB = BA we find that

A²B = A(BA) = (AB)A = (BA)A = BA²,

so B commutes with A². By induction, B commutes with every power of A. By writing e^{tA} as a power series we find that B also commutes with e^{tA} for every real t.

Now let F be the matrix function defined by the equation

F(t) = e^{t(A+B)} - e^{tA}e^{tB}.

Differentiating F(t) and using the fact that B commutes with e^{tA} we find

F'(t) = (A + B)e^{t(A+B)} - Ae^{tA}e^{tB} - e^{tA}Be^{tB} = (A + B)e^{t(A+B)} - (A + B)e^{tA}e^{tB} = (A + B)F(t).

By the uniqueness theorem we have F(t) = e^{t(A+B)}F(0). But F(0) = O, so F(t) = O for all t. Hence e^{t(A+B)} = e^{tA}e^{tB}. When t = 1 we obtain (7.16).

EXAMPLE. The matrices λtI and tA commute for all scalars λ and t. Hence we have

e^{t(λI+A)} = e^{λtI}e^{tA} = e^{λt}e^{tA}.
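Theorem 7.6, and the failure of the law of exponents for non-commuting matrices, can both be seen numerically. In this sketch of ours, B = 2A + I is a polynomial in A and therefore commutes with A, so the law holds; the second pair of matrices does not commute, and the law fails:

```python
import numpy as np

def expm(M, terms=40):
    # e^M summed from its power series; adequate for the small matrices here
    total = np.zeros_like(M)
    term = np.eye(M.shape[0])
    for k in range(terms):
        total = total + term
        term = term @ M / (k + 1)
    return total

# B is a polynomial in A, so A and B commute and the law of exponents holds.
A = np.array([[0.5, 1.0], [0.0, -0.5]])
B = 2.0 * A + np.eye(2)
law_holds = np.allclose(expm(A + B), expm(A) @ expm(B))

# Two matrices that do not commute; the law fails, as Exercise 13 anticipates.
C = np.array([[0.0, 1.0], [0.0, 0.0]])
D = np.array([[0.0, 0.0], [1.0, 0.0]])
law_fails = not np.allclose(expm(C + D), expm(C) @ expm(D))
```

For the second pair, e^C e^D, e^D e^C, and e^{C+D} are in fact three distinct matrices, which is exactly the phenomenon Exercise 13 of Section 7.12 asks the reader to compute.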
7.9 Existence and uniqueness theorems for homogeneous linear systems with constant coefficients
The vector differential equation

Y'(t) = AY(t),

where A is an n × n constant matrix and Y is an n-dimensional vector function (regarded as an n × 1 column matrix), is called a homogeneous linear system with constant coefficients. We shall use the exponential matrix to give an explicit formula for the solution of such a system.

THEOREM 7.7. Let A be a given n × n constant matrix and let B be a given n-dimensional vector. Then the initial-value problem

(7.17)  Y'(t) = AY(t),  Y(0) = B,

has a unique solution on the interval -∞ < t < +∞. This solution is given by the formula

(7.18)  Y(t) = e^{tA}B.

More generally, the unique solution of the initial-value problem

Y'(t) = AY(t),  Y(a) = B,

is Y(t) = e^{(t-a)A}B.

Proof. Differentiation of (7.18) gives us Y'(t) = Ae^{tA}B = AY(t). Since Y(0) = B, this is a solution of the initial-value problem (7.17).

To prove that it is the only solution we argue as in the proof of Theorem 7.5. Let Z(t) be another vector function satisfying Z'(t) = AZ(t) with Z(0) = B, and let G(t) = e^{-tA}Z(t). Then we easily verify that G'(t) = O, so

G(t) = G(0) = Z(0) = B.

In other words, e^{-tA}Z(t) = B, so Z(t) = e^{tA}B = Y(t). The more general case with initial value Y(a) = B is treated in exactly the same way.
7.10 The problem of calculating e^{tA}

Although Theorem 7.7 gives an explicit formula for the solution of a homogeneous system with constant coefficients, there still remains the problem of actually computing the exponential matrix e^{tA}. If we were to calculate e^{tA} directly from the series definition we would have to compute all the powers A^k for k = 0, 1, 2, …, and then compute the sum of each of the series Σ_{k=0}^{∞} (t^k/k!) a_{ij}^{(k)}. In general this is a hopeless task unless A is a matrix whose powers may be readily calculated. For example, if A is a diagonal matrix, say

A = diag(λ₁, …, λ_n),

then every power of A is also a diagonal matrix; in fact,

A^k = diag(λ₁^k, …, λ_n^k).

Therefore in this case e^{tA} is a diagonal matrix given by

e^{tA} = Σ_{k=0}^{∞} (t^k/k!) diag(λ₁^k, …, λ_n^k) = diag(e^{tλ₁}, …, e^{tλ_n}).

Another easy case to handle is when A is a matrix which can be diagonalized. For example, if there is a nonsingular matrix C such that C⁻¹AC is a diagonal matrix, say C⁻¹AC = D, then we have A = CDC⁻¹, from which we find

A² = (CDC⁻¹)(CDC⁻¹) = CD²C⁻¹

and, more generally, A^k = CD^kC⁻¹. Therefore in this case we have

e^{tA} = Σ_{k=0}^{∞} (t^k/k!) A^k = Σ_{k=0}^{∞} (t^k/k!) CD^kC⁻¹ = C ( Σ_{k=0}^{∞} (t^k/k!) D^k ) C⁻¹ = C e^{tD} C⁻¹.

Here the difficulty lies in determining C and its inverse. Once these are known, e^{tA} is easily calculated. Of course, not every matrix can be diagonalized, so the usefulness of the foregoing remarks is limited.
EXAMPLE 1. Calculate e^{tA} for the 2 × 2 matrix A = [5 4; 1 2].

Solution. This matrix has distinct eigenvalues λ₁ = 6, λ₂ = 1, so there is a nonsingular matrix C = [a b; c d] such that C⁻¹AC = D, where D = diag(6, 1). To determine C we can write AC = CD, or

[5 4; 1 2][a b; c d] = [a b; c d][6 0; 0 1].

Multiplying the matrices, we find that this equation is satisfied for any scalars a, b, c, d with a = 4c and b = -d. Taking c = d = 1 we may choose

C = [4 -1; 1 1],  C⁻¹ = (1/5)[1 1; -1 4].

Therefore

e^{tA} = Ce^{tD}C⁻¹ = [4 -1; 1 1][e^{6t} 0; 0 e^{t}](1/5)[1 1; -1 4]
       = (1/5)[4e^{6t}+e^{t}  4e^{6t}-4e^{t};  e^{6t}-e^{t}  e^{6t}+4e^{t}].
EXAMPLE 2. Solve the linear system

y₁' = 5y₁ + 4y₂,  y₂' = y₁ + 2y₂,

subject to the initial conditions y₁(0) = b₁, y₂(0) = b₂.

Solution. In matrix form the system can be written as

Y'(t) = AY(t),  Y(0) = [b₁; b₂],  where A = [5 4; 1 2].

By Theorem 7.7 the solution is Y(t) = e^{tA}Y(0). Using the matrix e^{tA} calculated in Example 1 we find

[y₁; y₂] = (1/5)[4e^{6t}+e^{t}  4e^{6t}-4e^{t};  e^{6t}-e^{t}  e^{6t}+4e^{t}][b₁; b₂],

from which we obtain

y₁ = (1/5){(4b₁ + 4b₂)e^{6t} + (b₁ - 4b₂)e^{t}},  y₂ = (1/5){(b₁ + b₂)e^{6t} + (4b₂ - b₁)e^{t}}.
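The closed form for e^{tA} obtained in Example 1 can be checked against a truncated partial sum of the defining series. This is a sketch of ours, not part of the text, and it assumes the entries of e^{tA} worked out above:

```python
import numpy as np

A = np.array([[5.0, 4.0], [1.0, 2.0]])   # the matrix of Examples 1 and 2

def expm_series(M, terms=60):
    # reference value of e^M from the defining power series
    total = np.zeros_like(M)
    term = np.eye(M.shape[0])
    for k in range(terms):
        total = total + term
        term = term @ M / (k + 1)
    return total

def expm_closed(t):
    # e^{tA} = (1/5) [[4e^{6t} + e^t,  4e^{6t} - 4e^t],
    #                 [ e^{6t} - e^t,   e^{6t} + 4e^t]]
    e6, e1 = np.exp(6.0 * t), np.exp(t)
    return np.array([[4*e6 + e1, 4*e6 - 4*e1],
                     [  e6 - e1,   e6 + 4*e1]]) / 5.0
```

At t = 0 the closed form reduces to the identity matrix, as it must, and for moderate t it agrees with the partial sums of the series.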
There are many methods known for calculating e^{tA} when A cannot be diagonalized. Most of these methods are rather complicated and require preliminary matrix transformations, the nature of which depends on the multiplicities of the eigenvalues of A. In a later section we shall discuss a practical and straightforward method for calculating e^{tA} which can be used whether or not A can be diagonalized. It is valid for all matrices A and requires no preliminary transformations of any kind. This method was developed by E. J. Putzer in a paper in the American Mathematical Monthly, Vol. 73 (1966), pp. 2-7. It is based on a famous theorem attributed to Arthur Cayley (1821-1895) and William Rowan Hamilton (1805-1865) which states that every square matrix satisfies its characteristic equation. First we shall prove the Cayley-Hamilton theorem and then we shall use it to obtain Putzer's formulas for calculating e^{tA}.
7.11 The Cayley-Hamilton theorem
THEOREM 7.8. CAYLEY-HAMILTON THEOREM. Let A be an n × n matrix and let

(7.19)  f(λ) = det(λI - A) = λⁿ + c_{n-1}λ^{n-1} + ⋯ + c₁λ + c₀

be its characteristic polynomial. Then f(A) = O. In other words, A satisfies the equation

(7.20)  Aⁿ + c_{n-1}A^{n-1} + ⋯ + c₁A + c₀I = O.

Proof. The proof is based on Theorem 3.12, which states that for any square matrix A we have

(7.21)  A (cof A)ᵗ = (det A) I.

We apply this formula with A replaced by λI - A. Since det(λI - A) = f(λ), Equation (7.21) becomes

(7.22)  (λI - A){cof (λI - A)}ᵗ = f(λ) I.

This equation is valid for all real λ. The idea of the proof is to show that it is also valid when λ is replaced by A.

The entries of the matrix cof (λI - A) are the cofactors of λI - A. Except for a factor ±1, each such cofactor is the determinant of a minor of λI - A of order n - 1. Therefore each entry of cof (λI - A), and hence of {cof (λI - A)}ᵗ, is a polynomial in λ of degree ≤ n - 1. Therefore

(7.23)  {cof (λI - A)}ᵗ = Σ_{k=0}^{n-1} λ^k B_k,

where each coefficient B_k is an n × n matrix with scalar entries. Using this in (7.22) we obtain the relation

(λI - A) Σ_{k=0}^{n-1} λ^k B_k = f(λ) I,

which can be rewritten in the form

(7.24)  λⁿB_{n-1} + Σ_{k=1}^{n-1} λ^k (B_{k-1} - AB_k) - AB₀ = λⁿI + Σ_{k=1}^{n-1} c_k λ^k I + c₀I.

At this stage we equate coefficients of like powers of λ in (7.24) to obtain the equations

(7.25)  B_{n-1} = I,  B_{k-1} - AB_k = c_kI  (k = 1, 2, …, n-1),  -AB₀ = c₀I.

Equating coefficients is permissible because (7.24) is equivalent to n² scalar equations, in each of which we may equate coefficients of like powers of λ. Now we multiply the equations in (7.25) in succession by Aⁿ, A^{n-1}, …, A, I and add the results. The terms on the left cancel and we obtain

O = Aⁿ + c_{n-1}A^{n-1} + ⋯ + c₁A + c₀I.

This proves the Cayley-Hamilton theorem.

Note: Hamilton proved the theorem in 1853 for a special class of matrices. A few years later, Cayley announced that the theorem is true for all matrices, but gave no proof.
EXAMPLE. Suppose A is a 3 × 3 matrix with characteristic polynomial

f(λ) = λ³ - 9λ² + 20λ - 12.

The Cayley-Hamilton theorem states that A satisfies the equation

(7.26)  A³ - 9A² + 20A - 12I = O.

This equation can be used to express A³ and all higher powers of A in terms of I, A, and A². For example, we have

A³ = 9A² - 20A + 12I,  A⁴ = 9A³ - 20A² + 12A = 61A² - 168A + 108I.

It can also be used to express A⁻¹ as a polynomial in A. From (7.26) we write A(A² - 9A + 20I) = 12I, and we obtain

A⁻¹ = (1/12)(A² - 9A + 20I).
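The Cayley-Hamilton theorem is easy to test numerically for any particular matrix: compute the coefficients of the characteristic polynomial and evaluate that polynomial at the matrix itself by Horner's rule. The 3 × 3 matrix below is a sample of our own, not the matrix of the example above:

```python
import numpy as np

# A sample 3x3 matrix (our own choice, not the matrix of the example above).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, -1.0],
              [1.0, 0.0, 3.0]])

# np.poly gives the coefficients of det(lambda*I - A), leading power first.
coeffs = np.poly(A)

# Evaluate f(A) by Horner's rule on matrices:
# f(A) = A^3 + c2 A^2 + c1 A + c0 I.
n = A.shape[0]
f_of_A = np.zeros((n, n))
for c in coeffs:
    f_of_A = f_of_A @ A + c * np.eye(n)
```

Up to rounding error, f(A) comes out as the zero matrix, exactly as the theorem asserts.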
7.12 Exercises
In each of Exercises 1 through 4, (a) express A⁻¹, A², and all higher powers of A as a linear combination of I and A (the Cayley-Hamilton theorem can be of help), and (b) calculate e^{tA}.
5. (a) If A = [a b; 0 a], prove that

e^{tA} = e^{at}[1 bt; 0 1].

(b) Find a corresponding formula for e^{tA} when

A = [a b; -b a],  a, b real.
6. If F(t) = … , prove that … .

7. If A(t) is a scalar function of t, the derivative of e^{A(t)} is A'(t)e^{A(t)}. Compute the derivative of e^{A(t)} when

A(t) = [1 t; 0 0],

and show that the result is not equal to either of the two products A'(t)e^{A(t)} or e^{A(t)}A'(t).
In each of Exercises 8 through 10, (a) calculate A² and express e^{tA} in terms of I, A, and A², and (b) calculate e^{tA}, for the given matrix A = … .

In each of Exercises 11 and 12, (a) express e^{tA} as a linear combination of I, A, and A², and (b) calculate e^{tA}, for the given matrix A = … .

13. This example shows that the equation e^{A+B} = e^A e^B is not always true for matrix exponentials. Compute each of the matrices e^{A+B}, e^A e^B, e^B e^A when

A = … ,  B = … ,

and note that the three results are distinct.
7.13 Putzer's method for calculating e^{tA}

The Cayley-Hamilton theorem shows that the nth power of any n × n matrix A can be expressed as a linear combination of the lower powers I, A, …, A^{n-1}. It follows that every power A^k with k ≥ n is a linear combination of I, A, …, A^{n-1}. Therefore, in the infinite series defining e^{tA}, each term with k ≥ n is a linear combination of I, A, …, A^{n-1}. Hence we can expect that e^{tA} should be expressible as a polynomial in A of the form

(7.27)  e^{tA} = Σ_{k=0}^{n-1} q_k(t) A^k,

where the scalar coefficients q_k(t) depend on t. Putzer developed two useful methods for expressing e^{tA} as a polynomial in A. The next theorem describes the simpler of the two methods.
THEOREM 7.9. Let λ₁, …, λ_n be the eigenvalues of an n × n matrix A, and define a sequence of polynomials in A as follows:

(7.28)  P₀(A) = I,  P_k(A) = Π_{m=1}^{k} (A - λ_m I),  for k = 1, 2, …, n.

Then we have

(7.29)  e^{tA} = Σ_{k=0}^{n-1} r_{k+1}(t) P_k(A),

where the scalar coefficients r₁(t), …, r_n(t) are determined recursively from the system of linear differential equations

(7.30)  r₁'(t) = λ₁r₁(t),  r₁(0) = 1,
        r_{k+1}'(t) = λ_{k+1}r_{k+1}(t) + r_k(t),  r_{k+1}(0) = 0  (k = 1, 2, …, n-1).

Note: Equation (7.29) does not express e^{tA} directly in powers of A as indicated in (7.27), but as a linear combination of the polynomials P₀(A), P₁(A), …, P_{n-1}(A). These polynomials are easily calculated once the eigenvalues of A are determined. Also the multipliers r₁(t), …, r_n(t) in (7.30) are easily calculated. Although this requires solving a system of linear differential equations, this particular system has a triangular matrix and the solutions can be determined in succession.

Proof. Let r₁(t), …, r_n(t) be the scalar functions determined by (7.30) and define a matrix function F by the equation

(7.31)  F(t) = Σ_{k=0}^{n-1} r_{k+1}(t) P_k(A).

Note that F(0) = r₁(0)P₀(A) = I. We will prove that F(t) = e^{tA} by showing that F satisfies the same differential equation as e^{tA}, namely F'(t) = AF(t). Differentiating (7.31) and using the recursion formulas (7.30) we obtain

F'(t) = Σ_{k=0}^{n-1} r_{k+1}'(t) P_k(A) = Σ_{k=0}^{n-1} {λ_{k+1}r_{k+1}(t) + r_k(t)} P_k(A),

where r₀(t) is defined to be 0. We rewrite this in the form

F'(t) = Σ_{k=0}^{n-2} r_{k+1}(t) P_{k+1}(A) + Σ_{k=0}^{n-1} λ_{k+1}r_{k+1}(t) P_k(A),

then subtract λ_nF(t) = Σ_{k=0}^{n-1} λ_n r_{k+1}(t) P_k(A) to obtain the relation

(7.32)  F'(t) - λ_nF(t) = Σ_{k=0}^{n-2} r_{k+1}(t){P_{k+1}(A) + (λ_{k+1} - λ_n)P_k(A)}.

But from (7.28) we see that P_{k+1}(A) = (A - λ_{k+1}I)P_k(A), so

P_{k+1}(A) + (λ_{k+1} - λ_n)P_k(A) = (A - λ_nI)P_k(A).

Therefore Equation (7.32) becomes

F'(t) - λ_nF(t) = (A - λ_nI) Σ_{k=0}^{n-2} r_{k+1}(t)P_k(A) = (A - λ_nI){F(t) - r_n(t)P_{n-1}(A)}.

The Cayley-Hamilton theorem implies that (A - λ_nI)P_{n-1}(A) = P_n(A) = O, so the last equation becomes

F'(t) - λ_nF(t) = (A - λ_nI)F(t),

from which we find F'(t) = AF(t). Since F(0) = I, the uniqueness theorem (Theorem 7.7) shows that F(t) = e^{tA}.
EXAMPLE 1. Express e^{tA} as a linear combination of I and A if A is a 2 × 2 matrix with both its eigenvalues equal to λ.

Solution. Writing λ₁ = λ₂ = λ, we are to solve the system of differential equations

r₁'(t) = λr₁(t),  r₁(0) = 1;  r₂'(t) = λr₂(t) + r₁(t),  r₂(0) = 0.

Solving these first-order equations in succession we find

r₁(t) = e^{λt},  r₂(t) = te^{λt}.

Since P₀(A) = I and P₁(A) = A - λI, the required formula for e^{tA} is

(7.33)  e^{tA} = e^{λt}I + te^{λt}(A - λI) = e^{λt}{I + t(A - λI)}.
EXAMPLE 2. Solve Example 1 if the eigenvalues of A are λ and μ, where λ ≠ μ.

Solution. In this case the system of differential equations is

r₁'(t) = λr₁(t),  r₁(0) = 1;  r₂'(t) = μr₂(t) + r₁(t),  r₂(0) = 0.

Its solutions are given by

r₁(t) = e^{λt},  r₂(t) = (e^{λt} - e^{μt})/(λ - μ).

Since P₀(A) = I and P₁(A) = A - λI, the required formula for e^{tA} is

(7.34)  e^{tA} = e^{λt}I + (e^{λt} - e^{μt})/(λ - μ) (A - λI).

If the eigenvalues λ and μ are complex numbers, the exponentials e^{λt} and e^{μt} will also be complex numbers. But if λ and μ are complex conjugates, the scalars multiplying I and A in (7.34) will be real. For example, suppose

λ = α + iβ,  μ = α - iβ,  β ≠ 0.

Then λ - μ = 2iβ, so Equation (7.34) becomes

e^{tA} = e^{(α+iβ)t}I + e^{αt} (e^{iβt} - e^{-iβt})/(2iβ) {A - (α + iβ)I}
       = e^{αt}{(cos βt + i sin βt)I + (sin βt)/β (A - αI - iβI)}.

The terms involving i cancel and we get

e^{tA} = (e^{αt}/β){(β cos βt - α sin βt)I + sin βt A}.
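Putzer's recursion is straightforward to implement. The sketch below is our own: it integrates the triangular system (7.30) with the classical Runge-Kutta method, assembles (7.29), and checks the result against the closed form (7.33) on a 2 × 2 matrix with a double eigenvalue:

```python
import numpy as np

def putzer_expm(A, t, steps=2000):
    """e^{tA} assembled as in Theorem 7.9.

    The scalars r_1(t), ..., r_n(t) solve the triangular system (7.30);
    here they are integrated numerically with the classical Runge-Kutta method.
    """
    lam = np.linalg.eigvals(A)            # eigenvalues, in any fixed order
    n = A.shape[0]

    def rhs(r):
        # r_1' = lam_1 r_1 ;  r_{k+1}' = lam_{k+1} r_{k+1} + r_k
        out = lam * r
        out[1:] = out[1:] + r[:-1]
        return out

    r = np.zeros(n, dtype=complex)
    r[0] = 1.0                            # r_1(0) = 1, all others 0
    h = t / steps
    for _ in range(steps):
        k1 = rhs(r)
        k2 = rhs(r + h/2 * k1)
        k3 = rhs(r + h/2 * k2)
        k4 = rhs(r + h * k3)
        r = r + h/6 * (k1 + 2*k2 + 2*k3 + k4)

    # e^{tA} = sum_{k=0}^{n-1} r_{k+1}(t) P_k(A), with P_0 = I and
    # P_k(A) = (A - lam_k I) P_{k-1}(A).
    P = np.eye(n, dtype=complex)
    E = np.zeros((n, n), dtype=complex)
    for k in range(n):
        E = E + r[k] * P
        P = (A - lam[k] * np.eye(n)) @ P
    return E.real if np.allclose(E.imag, 0.0) else E

# Check against formula (7.33) for a 2x2 matrix with a double eigenvalue 3:
A = np.array([[3.0, 1.0], [0.0, 3.0]])
t = 0.5
closed = np.exp(3.0 * t) * (np.eye(2) + t * (A - 3.0 * np.eye(2)))
```

Because the system (7.30) is triangular, each r_k could also be found in succession by the scalar integrating-factor method, as was done by hand in the two examples above.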
7.14 Alternate methods for calculating e^{tA} in special cases

Putzer's method for expressing e^{tA} as a polynomial in A is completely general because it is valid for all square matrices A. But a general method is not always the simplest method to use in certain special cases. In this section we give simpler methods for computing e^{tA} in three special cases: (a) when all the eigenvalues of A are equal, (b) when all the eigenvalues of A are distinct, and (c) when A has two distinct eigenvalues, exactly one of which has multiplicity 1.
THEOREM 7.10. If A is an n × n matrix with all its eigenvalues equal to λ, then we have

(7.35)  e^{tA} = e^{λt} Σ_{k=0}^{n-1} (t^k/k!)(A - λI)^k.

Proof. Since the matrices λtI and t(A - λI) commute, we have

e^{tA} = e^{λtI} e^{t(A-λI)} = e^{λt} Σ_{k=0}^{∞} (t^k/k!)(A - λI)^k.

The Cayley-Hamilton theorem implies that (A - λI)^k = O for every k ≥ n, so the theorem is proved.
THEOREM 7.11. If A is an n × n matrix with n distinct eigenvalues λ₁, …, λ_n, then we have

(7.36)  e^{tA} = Σ_{k=1}^{n} e^{λ_k t} L_k(A),

where L_k(A) is a polynomial in A of degree n - 1 given by the formula

(7.37)  L_k(A) = Π_{j=1, j≠k}^{n} (A - λ_jI)/(λ_k - λ_j),  for k = 1, 2, …, n.

Note: The polynomials L_k(A) are called Lagrange interpolation coefficients.

Proof. We define a matrix function F by the equation

F(t) = Σ_{k=1}^{n} e^{λ_k t} L_k(A)

and verify that F satisfies the differential equation F'(t) = AF(t) and the initial condition F(0) = I. From (7.37) we see that (A - λ_kI)L_k(A) = O for each k, since this product contains all n factors A - λ_jI and therefore vanishes by the Cayley-Hamilton theorem. Hence AL_k(A) = λ_kL_k(A), and we find

F'(t) = Σ_{k=1}^{n} λ_k e^{λ_k t} L_k(A) = Σ_{k=1}^{n} e^{λ_k t} AL_k(A) = AF(t),

so F satisfies the differential equation F'(t) = AF(t). To complete the proof we need to show that F satisfies the initial condition F(0) = I, which becomes

(7.38)  Σ_{k=1}^{n} L_k(A) = I.

A proof of (7.38) is outlined in Exercise 16 of Section 7.15.

The next theorem treats the case when A has two distinct eigenvalues, exactly one of which has multiplicity 1.
THEOREM 7.12. Let A be an n × n matrix (n ≥ 3) with two distinct eigenvalues λ and μ, where λ has multiplicity n - 1 and μ has multiplicity 1. Then we have

e^{tA} = e^{λt} Σ_{k=0}^{n-2} (t^k/k!)(A - λI)^k + { e^{μt} - e^{λt} Σ_{k=0}^{n-2} t^k(μ - λ)^k/k! } / (μ - λ)^{n-1} · (A - λI)^{n-1}.

Proof. As in the proof of Theorem 7.10 we begin by writing

e^{tA} = e^{λt} Σ_{k=0}^{∞} (t^k/k!)(A - λI)^k = e^{λt} Σ_{k=0}^{n-2} (t^k/k!)(A - λI)^k + e^{λt} Σ_{r=0}^{∞} (t^{n-1+r}/(n-1+r)!)(A - λI)^{n-1+r}.

Now we evaluate the series over r in closed form by using the Cayley-Hamilton theorem. Since

A - μI = (A - λI) - (μ - λ)I,

we find

(A - λI)^{n-1}(A - μI) = (A - λI)^n - (μ - λ)(A - λI)^{n-1}.

The left member is O by the Cayley-Hamilton theorem, so

(A - λI)^n = (μ - λ)(A - λI)^{n-1}.

Using this relation repeatedly we find

(A - λI)^{n-1+r} = (μ - λ)^r (A - λI)^{n-1}.

Therefore the series over r becomes

Σ_{r=0}^{∞} (t^{n-1+r}(μ - λ)^r/(n-1+r)!) (A - λI)^{n-1} = (μ - λ)^{1-n} { Σ_{r=0}^{∞} t^{n-1+r}(μ - λ)^{n-1+r}/(n-1+r)! } (A - λI)^{n-1}
= (μ - λ)^{1-n} { e^{t(μ-λ)} - Σ_{k=0}^{n-2} t^k(μ - λ)^k/k! } (A - λI)^{n-1}.

Multiplying by e^{λt} and using e^{λt}e^{t(μ-λ)} = e^{μt}, we obtain the stated formula. This completes the proof.

The explicit formula in Theorem 7.12 can also be deduced by applying Putzer's method, but the details are more complicated.
The explicit formulas in Theorems 7.10, 7.11 and 7.12 cover all matrices of order n ≤ 3. Since the 3 × 3 case often arises in practice, the formulas in this case are listed below for easy reference.

CASE 1. If a 3 × 3 matrix A has eigenvalues λ, λ, λ, then

e^{tA} = e^{λt}{I + t(A - λI) + (t²/2)(A - λI)²}.

CASE 2. If a 3 × 3 matrix A has distinct eigenvalues λ, μ, ν, then

e^{tA} = e^{λt} (A - μI)(A - νI)/((λ - μ)(λ - ν)) + e^{μt} (A - λI)(A - νI)/((μ - λ)(μ - ν)) + e^{νt} (A - λI)(A - μI)/((ν - λ)(ν - μ)).

CASE 3. If a 3 × 3 matrix A has eigenvalues λ, λ, μ with λ ≠ μ, then

e^{tA} = e^{λt}{I + t(A - λI)} + (e^{μt} - e^{λt} - t(μ - λ)e^{λt})/(μ - λ)² (A - λI)².

EXAMPLE. Compute e^{tA} when A is the 3 × 3 matrix A = … .

Solution. The eigenvalues of A are 1, 1, 2, so the formula of Case 3 (with λ = 1, μ = 2) gives us

(7.39)  e^{tA} = e^{t}{I + t(A - I)} + (e^{2t} - e^{t} - te^{t})(A - I)².

By collecting powers of A we can also write this as follows:

(7.40)  e^{tA} = (e^{2t} - 2te^{t})I + (2e^{t} + 3te^{t} - 2e^{2t})A + (e^{2t} - e^{t} - te^{t})A².

At this stage we can calculate (A - I)² or A² and perform the indicated operations in (7.39) or (7.40) to write the result as a 3 × 3 matrix.
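The Case 3 formula can be sanity-checked numerically. In this sketch of ours, A is a sample 3 × 3 matrix of our own choosing with eigenvalues 1, 1, 2, and the formula is compared with partial sums of the defining series:

```python
import numpy as np

def expm_series(M, terms=60):
    # reference value of e^M from the defining power series
    total = np.zeros_like(M)
    term = np.eye(M.shape[0])
    for k in range(terms):
        total = total + term
        term = term @ M / (k + 1)
    return total

# A sample matrix of our own choosing with eigenvalues 1, 1, 2.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])
lam, mu = 1.0, 2.0
I = np.eye(3)

def expm_case3(t):
    # e^{tA} = e^{lam t}{I + t(A - lam I)}
    #        + (e^{mu t} - e^{lam t} - t(mu - lam)e^{lam t})/(mu - lam)^2 (A - lam I)^2
    N = A - lam * I
    c = (np.exp(mu*t) - np.exp(lam*t) - t*(mu - lam)*np.exp(lam*t)) / (mu - lam)**2
    return np.exp(lam*t) * (I + t * N) + c * (N @ N)
```

At t = 0 the formula reduces to the identity, and for moderate t it agrees with the series to machine precision.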
7.15 Exercises
For each of the matrices in Exercises 1 through 6, express e^{tA} as a polynomial in A.

7. (a) A 3 × 3 matrix A is known to have all its eigenvalues equal to 1. Prove that

e^{tA} = (e^{t}/2){(t² - 2t + 2)I + (2t - 2t²)A + t²A²}.

(b) Find a corresponding formula if A is a 4 × 4 matrix with all its eigenvalues equal to 1.

In each of Exercises 8 through 15, solve the system Y' = AY subject to the given initial condition.
16. This exercise outlines a proof of Equation (7.38), used in the proof of Theorem 7.11. Let L_k(λ) be the polynomial in λ of degree n - 1 defined by the equation

L_k(λ) = Π_{j=1, j≠k}^{n} (λ - λ_j)/(λ_k - λ_j),

where λ₁, …, λ_n are n distinct scalars.

(a) Prove that L_k(λ_m) = 1 if m = k, and L_k(λ_m) = 0 if m ≠ k.

(b) Let y₁, …, y_n be arbitrary scalars, and let

p(λ) = Σ_{k=1}^{n} y_k L_k(λ).

Prove that p is the only polynomial of degree ≤ n - 1 which satisfies the equations p(λ_m) = y_m for m = 1, 2, …, n.

(c) Prove that Σ_{k=1}^{n} L_k(λ) = 1 for every λ, and deduce that for every square matrix A we have

Σ_{k=1}^{n} L_k(A) = I,

where I is the identity matrix.
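Part (c) of Exercise 16 can be illustrated numerically: for any distinct scalars λ₁, …, λ_n and any square matrix A whatsoever, the Lagrange coefficients sum to the identity. A sketch of ours:

```python
import numpy as np

def lagrange_coefficient(A, lam, k):
    # L_k(A) = prod_{j != k} (A - lam_j I) / (lam_k - lam_j)
    n = A.shape[0]
    L = np.eye(n)
    for j, l in enumerate(lam):
        if j != k:
            L = L @ (A - l * np.eye(n)) / (lam[k] - l)
    return L

lam = [1.0, 2.0, 4.0]                 # any distinct scalars
A = np.array([[0.0, 1.0, 2.0],        # an arbitrary 3x3 matrix; lam need
              [3.0, -1.0, 0.5],       # not be its eigenvalues
              [1.0, 1.0, 1.0]])
total = sum(lagrange_coefficient(A, lam, k) for k in range(len(lam)))
```

The sum is the identity matrix because Σ_k L_k(λ) is a polynomial of degree ≤ n − 1 taking the value 1 at the n distinct points λ₁, …, λ_n, hence identically 1.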
7.16 Nonhomogeneous linear systems with constant coefficients
We consider next the nonhomogeneous initial-value problem

(7.41)  Y'(t) = AY(t) + Q(t),  Y(a) = B,

on an interval J. Here A is an n × n constant matrix, Q is an n-dimensional vector function (regarded as an n × 1 column matrix) continuous on J, and a is a given point in J. We can obtain an explicit formula for the solution of this problem by the same process used to treat the scalar case.

First we multiply both members of (7.41) by the exponential matrix e^{-tA} and rewrite the differential equation in the form

(7.42)  e^{-tA}{Y'(t) - AY(t)} = e^{-tA}Q(t).

The left member of (7.42) is the derivative of the product e^{-tA}Y(t). Therefore, if we integrate both members of (7.42) from a to x, where x is in J, we obtain

e^{-xA}Y(x) - e^{-aA}B = ∫_a^x e^{-tA}Q(t) dt.

Multiplying by e^{xA}, we obtain the explicit formula (7.43) which appears in the following theorem.

THEOREM 7.13. Let A be an n × n constant matrix and let Q be an n-dimensional vector function continuous on an interval J. Then the initial-value problem

Y'(t) = AY(t) + Q(t),  Y(a) = B,

has a unique solution on J given by the explicit formula

(7.43)  Y(x) = e^{(x-a)A}B + e^{xA} ∫_a^x e^{-tA}Q(t) dt.

As in the homogeneous case, the difficulty in applying this formula in practice lies in the calculation of the exponential matrices. Note that the first term, e^{(x-a)A}B, is the solution of the homogeneous problem Y'(t) = AY(t), Y(a) = B. The second term is the solution of the nonhomogeneous problem

Y'(t) = AY(t) + Q(t),  Y(a) = O.

We illustrate Theorem 7.13 with an example.
EXAMPLE. Solve the initial-value problem

Y'(t) = AY(t) + Q(t),  Y(0) = B,

on the interval (-∞, +∞), where

A = … ,  Q(t) = … ,  B = … .

Solution. According to Theorem 7.13, the solution is given by

(7.44)  Y(x) = e^{xA}B + ∫_0^x e^{(x-t)A}Q(t) dt.

The eigenvalues of A are 2, 2, and 4. To calculate e^{xA} we use the formula of Case 3, Section 7.14, to obtain

e^{xA} = e^{2x}{I + x(A - 2I)} + ¼(e^{4x} - e^{2x} - 2xe^{2x})(A - 2I)².

We can replace x by x - t in this formula to obtain e^{(x-t)A}. Therefore the integrand in (7.44) is

e^{(x-t)A}Q(t) = e^{2(x-t)}{I + (x - t)(A - 2I)}Q(t) + ¼{e^{4(x-t)} - e^{2(x-t)} - 2(x - t)e^{2(x-t)}}(A - 2I)²Q(t).

Integrating each term, and using the explicit entries of A, Q, and B, we obtain the solution Y(x) = … .
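Formula (7.43) can also be checked numerically on a small sample problem (our own data, not the example above): evaluate the integral by the midpoint rule and verify that the resulting Y satisfies Y' = AY + Q and Y(0) = B.

```python
import numpy as np

def expm(M, terms=30):
    # e^M from its power series; adequate for the small arguments used here
    total = np.zeros_like(M)
    term = np.eye(M.shape[0])
    for k in range(terms):
        total = total + term
        term = term @ M / (k + 1)
    return total

# Sample data (our own choice): Y' = AY + Q(t), Y(0) = B.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([1.0, 0.0])

def Q(t):
    return np.array([np.exp(t), 0.0])

def solution(x, quad_steps=1000):
    # Formula (7.43) with a = 0; the integral of e^{-tA} Q(t) is
    # approximated by the midpoint rule.
    integral = np.zeros(2)
    dt = x / quad_steps
    for i in range(quad_steps):
        t = (i + 0.5) * dt
        integral = integral + expm(-t * A) @ Q(t) * dt
    return expm(x * A) @ (B + integral)

# Residual check: Y'(x) should equal A Y(x) + Q(x).
x0, h = 0.8, 1e-3
deriv = (solution(x0 + h) - solution(x0 - h)) / (2.0 * h)
residual = deriv - (A @ solution(x0) + Q(x0))
```

The residual vanishes to the accuracy of the quadrature and the difference quotient, confirming that (7.43) really does solve the nonhomogeneous problem.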
7.17 Exercises

1. Let Z be a solution of the nonhomogeneous system

Z'(t) = AZ(t) + Q(t)

on an interval J with initial value Z(a). Prove that there is only one solution of the nonhomogeneous system

Y'(t) = AY(t) + Q(t)

on J with initial value Y(a), and that it is given by the formula

Y(t) = Z(t) + e^{(t-a)A}{Y(a) - Z(a)}.

Special methods are often available for determining a particular solution Z(t) which resembles the given function Q(t). Exercises 2, 3, 5, and 7 indicate such methods for Q(t) = C, Q(t) = e^{αt}C, Q(t) = t^mC, and Q(t) = (cos αt)C + (sin αt)D, where C and D are constant vectors. If the particular solution Z(t) so obtained does not have the required initial value, we modify Z(t) as indicated in Exercise 1 to obtain another solution with the required initial value.
2. (a) Let A be a constant n × n matrix, and B, C constant n-dimensional vectors. Prove that the solution of the system

Y'(t) = AY(t) + C,  Y(a) = B,

on (-∞, +∞) is given by the formula

Y(x) = e^{(x-a)A}B + ( ∫_a^x e^{(x-t)A} dt ) C.

(b) If A is nonsingular, show that the integral in part (a) has the value A⁻¹{e^{(x-a)A} - I}.

(c) Compute Y(x) explicitly when A = … , B = … , C = … .
3. Let A be an n × n constant matrix, let B and C be n-dimensional constant vectors, and let α be a given scalar.

(a) Prove that the nonhomogeneous system Z'(t) = AZ(t) + e^{αt}C has a solution of the form Z(t) = e^{αt}B if, and only if,

(αI - A)B = C.

(b) If α is not an eigenvalue of A, prove that the vector B can always be chosen so that the system in (a) has a solution of the form Z(t) = e^{αt}B.

(c) If α is not an eigenvalue of A, prove that every solution of the system Y'(t) = AY(t) + e^{αt}C has the form

Y(t) = e^{tA}{Y(0) - B} + e^{αt}B,  where B = (αI - A)⁻¹C.

4. Use the method suggested by Exercise 3 to find a solution of the nonhomogeneous system Y'(t) = AY(t) + … , with Y(0) = … .
5. Let A be an n × n constant matrix, let B and C be n-dimensional constant vectors, and let m be a positive integer.

(a) Prove that the nonhomogeneous system Y'(t) = AY(t) + t^mC, Y(0) = B, has a particular solution of the form

Y(t) = B₀ + tB₁ + t²B₂ + ⋯ + t^mB_m,

where B₀, …, B_m are constant vectors, if and only if … . Determine the coefficients B₀, …, B_m for such a solution.

(b) If A is nonsingular, prove that the initial vector B can always be chosen so that the system in (a) has a solution of the specified form.
6. Consider the nonhomogeneous system Y'(t) = … .

(a) Find a particular solution of the form Y(t) = … .
(b) Find a solution of the system with Y(0) = … .
7. Let A be an n × n constant matrix, let B, C, D be n-dimensional constant vectors, and let α be a given real number. Prove that the nonhomogeneous system

Y'(t) = AY(t) + (cos αt)C + (sin αt)D,  Y(0) = B,

has a particular solution of the form

Y(t) = (cos αt)E + (sin αt)F,

where E and F are constant vectors, if and only if

(A² + α²I)E = -(AC + αD).

Determine E and F in terms of α, A, C, and D for such a solution. Note that if A² + α²I is nonsingular, the initial vector B can always be chosen so that the system has a solution of the specified form.
8. (a) Find a particular solution of the nonhomogeneous system Y'(t) = … .
(b) Find a solution of the system with Y(0) = … .

In each of Exercises 9 through 12, solve the nonhomogeneous system Y'(t) = AY(t) + Q(t) subject to the given initial condition.
7.18 The general linear system Y’(t) =
+ Q(t)
Theorem 7.13 gives an explicit formula for the solution of the linear system
=A
= B,
where A is a constant
1 column matrices. We turn now to the more general case
matrix and Q(t), Y(t) are
Y(a) = B,
where the x matrix
is not necessarily constant.
If and Q are continuous on an open interval J, a general existence-uniqueness theorem which we shall prove in a later section tells us that for each a in J and each initial vector B there is exactly one solution to the initial-value problem (7.45). In this section we use this result to obtain a formula for the solution, generalizing Theorem 7.13.
In the scalar case (n = 1) the differential equation (7.45) can be solved as follows. We let A(x) = ∫_a^x P(t) dt, then multiply both members of (7.45) by e^{-A(t)} to rewrite the differential equation in the form
(7.46)    e^{-A(t)}{Y'(t) - P(t)Y(t)} = e^{-A(t)}Q(t).
Now the left member is the derivative of the product e^{-A(t)}Y(t). Therefore, we can integrate both members from a to x, where a and x are points in J, to obtain
e^{-A(x)}Y(x) - Y(a) = ∫_a^x e^{-A(t)}Q(t) dt.
Multiplying by e^{A(x)} we obtain the explicit formula
(7.47)    Y(x) = e^{A(x)}Y(a) + e^{A(x)} ∫_a^x e^{-A(t)}Q(t) dt.
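As a quick numerical check of the scalar formula (7.47), the sketch below compares the formula against a direct Runge-Kutta solution of Y'(t) = P(t)Y(t) + Q(t). The data p(t) = 2t, q(t) = 1, a = 0, Y(0) = 1 are illustrative choices, not taken from the text.

```python
import math

# Hypothetical sample data: p(t) = 2t, q(t) = 1, a = 0, Y(a) = 1.
p = lambda t: 2.0 * t
q = lambda t: 1.0
a, y0 = 0.0, 1.0

def trapezoid(f, lo, hi, n=500):
    # Composite trapezoid rule for the integrals in (7.47).
    h = (hi - lo) / n
    return h * (0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n)))

def y_formula(x):
    # (7.47): Y(x) = e^{A(x)} Y(a) + e^{A(x)} * integral_a^x e^{-A(t)} q(t) dt,
    # where A(x) = integral_a^x p(t) dt.
    A = lambda t: trapezoid(p, a, t)
    integrand = lambda t: math.exp(-A(t)) * q(t)
    return math.exp(A(x)) * (y0 + trapezoid(integrand, a, x))

def y_rk4(x, steps=2000):
    # Direct numerical solution of y' = p(t) y + q(t) for comparison.
    f = lambda t, y: p(t) * y + q(t)
    h = (x - a) / steps
    t, y = a, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

print(y_formula(1.0), y_rk4(1.0))  # the two values should agree closely
```

The two routes agree to several decimal places, which is exactly what (7.47) predicts for the scalar case.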
The only part of this argument that does not apply immediately to matrix functions is the statement that the left-hand member of (7.46) is the derivative of the product e^{-A(t)}Y(t). At this stage we used the fact that the derivative of e^{-A(t)} is -P(t)e^{-A(t)}. In the scalar case this is a consequence of the following formula for differentiating exponential functions: If E(t) = e^{A(t)}, then
E'(t) = A'(t)e^{A(t)}.
Unfortunately, this differentiation formula is not always true when A is a matrix function. For example, it is false for the 2 x 2 matrix function
A(t) = [1  t]
       [0  0].
(See Exercise 7 of Section 7.12.) Therefore a modified argument is needed to extend Equation (7.47) to the n x n matrix case. Suppose we multiply each member of (7.45) by an unspecified n x n matrix function F(t).
This gives us the relation
F(t)Y'(t) = F(t)P(t)Y(t) + F(t)Q(t).
Now we add F'(t)Y(t) to both members in order to transform the left member to the derivative of the product F(t)Y(t). If we do this, the last equation gives us
{F(t)Y(t)}' = {F'(t) + F(t)P(t)}Y(t) + F(t)Q(t).
If we can choose F(t) so that the sum F'(t) + F(t)P(t) on the right is the zero matrix, the last equation simplifies to
{F(t)Y(t)}' = F(t)Q(t).
Integrating this from a to x we obtain
F(x)Y(x) - F(a)Y(a) = ∫_a^x F(t)Q(t) dt.
If, in addition, the matrix F(x) is nonsingular, we obtain the explicit formula
(7.48)    Y(x) = F(x)^{-1}F(a)Y(a) + F(x)^{-1} ∫_a^x F(t)Q(t) dt.
This is a generalization of the scalar formula (7.47). The process will work if we can find an n x n matrix function F which satisfies the matrix differential equation
(7.49)    F'(t) = -F(t)P(t)
and which is nonsingular.
Note that this differential equation is very much like the original differential equation (7.45) with Q(t) = O, except that the unknown function F is a square matrix instead of a column matrix. Also, the unknown function is multiplied on the right by -P(t) instead of on the left by P(t). We shall prove next that the differential equation for F always has a nonsingular solution. The proof will depend on the following existence theorem for homogeneous linear systems.
THEOREM 7.14. Assume A(t) is an n x n matrix function continuous on an open interval J. If a ∈ J and if B is a given n-dimensional vector, the homogeneous linear system
Y'(t) = A(t)Y(t),    Y(a) = B,
has an n-dimensional vector solution Y on J.
A proof of Theorem 7.14 appears in Section 7.21. With the help of this theorem we can prove the following.
THEOREM 7.15. Given an n x n matrix function P, continuous on an open interval J, and given any point a in J, there exists an n x n matrix function F which satisfies the matrix differential equation (7.49),
F'(x) = -F(x)P(x),
on J with initial value F(a) = I. Moreover, F(x) is nonsingular for each x in J.
Proof. Let Y_k(x) be a vector solution of the differential equation
Y'(x) = -P(x)^t Y(x)
on J with initial vector Y_k(a) = I_k, where I_k is the kth column of the n x n identity matrix I. Here P(x)^t denotes the transpose of P(x). Let G(x) be the n x n matrix whose kth column is Y_k(x). Then G satisfies the matrix differential equation
(7.50)    G'(x) = -P(x)^t G(x)
on J with initial value G(a) = I. Now take the transpose of each member of (7.50). Since the transpose of a product is the product of transposes in reverse order, we obtain
{G'(x)}^t = -G(x)^t P(x).
Also, the transpose of the derivative G' is the derivative of the transpose G^t. Therefore the matrix F(x) = G(x)^t satisfies the differential equation (7.49) with initial value F(a) = I.
Now we prove that F(x) is nonsingular by exhibiting its inverse. Let H be the n x n matrix function whose kth column is the solution of the differential equation
Y'(x) = P(x)Y(x)
with initial vector Y(a) = I_k, the kth column of I. Then H satisfies the initial-value problem
H'(x) = P(x)H(x),    H(a) = I,
on J. The product F(x)H(x) has derivative
F(x)H'(x) + F'(x)H(x) = F(x)P(x)H(x) - F(x)P(x)H(x) = O
for each x in J. Therefore the product F(x)H(x) is constant,
F(x)H(x) = F(a)H(a) = I,
so H(x) is the inverse of F(x). This completes the proof.
The results of this section are summarized in the following theorem.
THEOREM 7.16. Given an n x n matrix function P and an n-dimensional vector function Q, both continuous on an open interval J, the solution of the initial-value problem
Y'(x) = P(x)Y(x) + Q(x),    Y(a) = B,
on J is given by the formula
(7.52)    Y(x) = F(x)^{-1}Y(a) + F(x)^{-1} ∫_a^x F(t)Q(t) dt.
The n x n matrix F(x) is the transpose of the matrix whose kth column is the solution of the initial-value problem
(7.53)    Y'(x) = -P(x)^t Y(x),    Y(a) = I_k,
where I_k is the kth column of the identity matrix I.
Although Theorem 7.16 provides an explicit formula for the solution of the general linear system
the formula is not always a useful one for calculating the solution because of the difficulty involved in determining the matrix function F. The determination of F requires the solution of n homogeneous linear systems (7.53). The next section describes a power-series method that is sometimes used to solve homogeneous linear systems.
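To make the construction in Theorem 7.16 concrete, the following sketch carries it out numerically in the homogeneous case Q = O: the rows of F are obtained by integrating the systems (7.53), and Y(x) = F(x)^{-1}B is compared with a direct numerical solution of Y'(x) = P(x)Y(x). The 2 x 2 matrix P(x) and all numerical settings are illustrative assumptions, not data from the text.

```python
def P(x):
    # Illustrative nonconstant coefficient matrix (an assumed example).
    return [[0.0, x], [-x, 0.0]]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def rk4(f, v, a, x, steps=400):
    # Classical Runge-Kutta integration of v' = f(t, v) from t = a to t = x.
    h = (x - a) / steps
    t = a
    for _ in range(steps):
        k1 = f(t, v)
        k2 = f(t + h / 2, [vi + h / 2 * ki for vi, ki in zip(v, k1)])
        k3 = f(t + h / 2, [vi + h / 2 * ki for vi, ki in zip(v, k2)])
        k4 = f(t + h, [vi + h * ki for vi, ki in zip(v, k3)])
        v = [vi + h / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
             for vi, c1, c2, c3, c4 in zip(v, k1, k2, k3, k4)]
        t += h
    return v

a, x, B = 0.0, 1.0, [1.0, 0.0]

# Each column of G solves Z' = -P(t)^t Z, Z(a) = I_k, as in (7.53).
# Since F = G^t, row i of F is the i-th computed column.
neg_Pt = lambda t, z: [-sum(P(t)[j][i] * z[j] for j in range(2)) for i in range(2)]
F = [rk4(neg_Pt, e, a, x) for e in ([1.0, 0.0], [0.0, 1.0])]

# For Q = O, formula (7.52) reduces to Y(x) = F(x)^{-1} B  (F(a) = I).
det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
Finv = [[F[1][1] / det, -F[0][1] / det], [-F[1][0] / det, F[0][0] / det]]
Y_formula = mat_vec(Finv, B)

# Direct numerical solution of Y' = P(t)Y for comparison.
Y_direct = rk4(lambda t, y: mat_vec(P(t), y), B, a, x)
print(Y_formula, Y_direct)  # the two vectors should agree closely
```

For this particular P(x) the exact solution is (cos(x^2/2), -sin(x^2/2)), and both computed vectors match it to high accuracy.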
We remind the reader once more that the proof of Theorem 7.16 was based on Theorem
7.14, the existence theorem for homogeneous linear systems, which we have not yet proved.
7.19 A power-series method for solving homogeneous linear systems
Consider a homogeneous linear system
(7.54)    Y'(x) = A(x)Y(x),    Y(0) = B,
in which the given n x n matrix A(x) has a power-series expansion in x convergent in some open interval containing the origin, say
A(x) = A_0 + A_1 x + A_2 x^2 + ...    for |x| < r_1,
where the coefficients A_0, A_1, A_2, ... are given n x n matrices. Let us try to find a power-series solution of the form
Y(x) = B_0 + B_1 x + B_2 x^2 + ...
with vector coefficients B_0, B_1, B_2, .... Since Y(0) = B_0, the initial condition will be satisfied by taking B_0 = B, the prescribed initial vector. To determine the remaining coefficients we substitute the power series for Y(x) in the differential equation and equate coefficients of like powers of x to obtain the following system of equations:
(7.55)    B_1 = A_0 B_0,    (k + 1)B_{k+1} = Σ_{r=0}^{k} A_r B_{k-r}    for k = 1, 2, 3, ....
These equations can be solved in succession for the vectors B_1, B_2, B_3, .... If the resulting power series for Y(x) converges in some interval |x| < r_2, then Y(x) will be a solution of the initial-value problem (7.54) in the interval |x| < r, where r = min (r_1, r_2).
For example, if A(x) is a constant matrix A, then A_0 = A and A_r = O for r ≥ 1, so the system of equations in (7.55) becomes
B_1 = AB_0,    (k + 1)B_{k+1} = AB_k    for k = 1, 2, 3, ....
Solving these equations in succession we find
B_k = (A^k / k!)B    for k = 1, 2, 3, ....
Therefore the series solution in this case becomes
Y(x) = B + Σ_{k=1}^{∞} (x^k A^k / k!)B = e^{xA}B.
This agrees with the result obtained earlier for homogeneous linear systems with constant coefficients.
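The recursion (7.55) is easy to run numerically. The sketch below does so for a hypothetical constant 2 x 2 matrix A (chosen so the exact solution is known), where the series sums to e^{xA}B.

```python
import math

# Assumed constant coefficient matrix; for this A the exact solution of
# Y' = AY, Y(0) = B = (1, 0) is Y(x) = (cos x, -sin x).
A = [[0.0, 1.0], [-1.0, 0.0]]
B = [1.0, 0.0]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def series_solution(x, terms=30):
    # Recursion from (7.55) with constant A: B_{k+1} = A B_k / (k + 1).
    coeff = B[:]          # B_0 = B
    total = B[:]
    for k in range(terms):
        coeff = [c / (k + 1) for c in mat_vec(A, coeff)]
        total = [t + c * x ** (k + 1) for t, c in zip(total, coeff)]
    return total

print(series_solution(1.0), (math.cos(1.0), -math.sin(1.0)))
```

With 30 terms the partial sum agrees with (cos 1, -sin 1) to well beyond double precision rounding, illustrating the convergence of the series solution.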
7.20 Exercises
1. Let p be a real-valued function and Q an n x 1 matrix function, both continuous on an interval J. Let A be an n x n constant matrix. Prove that the initial-value problem
Y'(x) = p(x)AY(x) + Q(x),    Y(a) = B,
has the solution
Y(x) = e^{q(x)A}B + e^{q(x)A} ∫_a^x e^{-q(t)A}Q(t) dt
on J, where q(x) = ∫_a^x p(t) dt.
2. Consider the special case of Exercise 1 in which A is nonsingular, a = 0, p(x) = 2x, and Q(x) = xC, where C is a constant vector. Show that the solution becomes
Y(x) = e^{x^2 A}B + (1/2)A^{-1}(e^{x^2 A} - I)C.
3. Let A be an n x n matrix function and let E(t) = e^{∫_a^t A(u) du}. Let Q(t), Y(t), and B be n x 1 column matrices. Assume that
E'(t) = A(t)E(t)
on an open interval J. If a ∈ J and if A and Q are continuous on J, prove that the initial-value problem
Y'(t) = A(t)Y(t) + Q(t),    Y(a) = B,
has the following solution on J:
Y(t) = E(t)B + E(t) ∫_a^t E(u)^{-1}Q(u) du.
4. Let E(t) = e^{A(t)}. This exercise describes examples of matrix functions A(t) for which E'(t) = A'(t)E(t).
(a) Let A(t) = t^r A, where A is an n x n constant matrix and r is a positive integer. Prove that E'(t) = A'(t)E(t) on (-∞, +∞).
(b) Let A(t) be a polynomial in t with matrix coefficients, say
A(t) = Σ_{r=0}^{m} t^r A_r,
where the coefficients commute, A_r A_s = A_s A_r for all r and s. Prove that E'(t) = A'(t)E(t) on (-∞, +∞).
(c) Solve the homogeneous linear system
Y'(t) = (I + tA)Y(t),    Y(0) = B
on the interval (-∞, +∞), where A is an n x n constant matrix.
5. Assume that the n x n matrix function A(x) has a power-series expansion convergent for |x| < r. Develop a power-series procedure for solving the following homogeneous linear system of second order:
Y''(x) = A(x)Y(x),    Y(0) = B,    Y'(0) = C.
6. Consider the second-order system Y''(x) + AY(x) = O, with Y(0) = B, Y'(0) = C, where A is a constant n x n matrix. Prove that the system has the power-series solution
Y(x) = Σ_{k=0}^{∞} ((-1)^k x^{2k} / (2k)!) A^k B + Σ_{k=0}^{∞} ((-1)^k x^{2k+1} / (2k + 1)!) A^k C,
convergent for -∞ < x < +∞.
7.21 Proof of the existence theorem by the method of successive approximations
In this section we prove the existence and uniqueness of a solution for any homogeneous linear system
(7.56)    Y'(t) = A(t)Y(t),
where A(t) is an n x n matrix function, continuous on an open interval J. We shall prove that for any point a in J and any given initial vector B there exists exactly one solution Y(t) on J satisfying the initial condition Y(a) = B.
We shall use the method of successive approximations, an iterative method which also has applications in many other problems. The method was first published by Liouville in 1838 in connection with the study of linear differential equations of second order. It was later extended by J. Caqué in 1864, L. Fuchs in 1870, and G. Peano in 1888 to the study of linear equations of order n. In 1890 Émile Picard extended the method to encompass nonlinear differential equations as well. In recognition of his fundamental contributions, some writers refer to the method as Picard's method. The method is not only of theoretical interest but can also be used to obtain numerical approximations to solutions in some cases.
The method begins with an initial guess at a solution of the equation (7.56). We take as initial guess the given initial vector B, although this is not essential. We then substitute this guess in the right-hand member of the equation and obtain a new differential equation,
Y'(t) = A(t)B.
In this equation the right-hand member no longer contains the unknown function, so the equation can be solved immediately by integrating both members from a to x, where x is an arbitrary point in J. This equation has exactly one solution Y_1 on J satisfying the initial condition Y_1(a) = B, namely
Y_1(x) = B + ∫_a^x A(t)B dt.
Now we replace Y(t) by Y_1(t) in the right-hand member of the original differential equation (7.56) to obtain a new differential equation
Y'(t) = A(t)Y_1(t).
This equation has a unique solution Y_2 on J with Y_2(a) = B, namely
(7.57)    Y_2(x) = B + ∫_a^x A(t)Y_1(t) dt.
We then substitute Y_2 in the right-hand member of (7.56) and solve the resulting equation to determine Y_3 with Y_3(a) = B, and so on. This process generates a sequence of functions Y_0, Y_1, Y_2, ..., where Y_0 = B and where Y_{k+1} is determined from Y_k by the recursion formula
(7.58)    Y_{k+1}(x) = B + ∫_a^x A(t)Y_k(t) dt    for k = 0, 1, 2, ....
Our goal is to prove that the sequence of functions so defined converges to a limit function Y which is a solution of the differential equation (7.56) on J and which also satisfies the initial condition Y(a) = B. The functions Y_0, Y_1, Y_2, ... are called successive approximations to Y. Before we investigate the convergence of the process we illustrate the method with an example.
EXAMPLE. Consider the initial-value problem Y'(t) = AY(t), Y(0) = B, where A is a constant n x n matrix. We know that the solution is given by the formula Y(x) = e^{xA}B for all real x. We will show how this solution can be obtained by the method of successive approximations. The initial guess is Y_0(x) = B. The recursion formula (7.58) gives us
Y_1(x) = B + ∫_0^x AB dt = (I + xA)B.
By induction we find
Y_k(x) = (I + xA + (x^2/2!)A^2 + ... + (x^k/k!)A^k)B.
The sum on the right is a partial sum of the series for e^{xA}. Therefore when k → ∞ we find
lim_{k→∞} Y_k(x) = e^{xA}B
for all x. Thus, in this example we can show directly that the successive approximations converge to a solution of the initial-value problem on (-∞, +∞).
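The iterates in this example can be generated exactly by integrating polynomial coefficient lists. The sketch below does this for the scalar instance n = 1, A = 1, B = 1 (an illustrative special case), where Y_k(x) is the kth partial sum of the series for e^x.

```python
from fractions import Fraction

def picard_step(coeffs):
    # y_{k+1}(x) = 1 + integral_0^x y_k(t) dt, acting on coefficient lists:
    # coefficient i of y_k becomes coefficient i + 1, divided by i + 1.
    return [Fraction(1)] + [c / (i + 1) for i, c in enumerate(coeffs)]

y = [Fraction(1)]            # y_0(x) = 1
for _ in range(5):
    y = picard_step(y)
print(y)  # coefficients 1, 1, 1/2, 1/6, 1/24, 1/120: the 5th partial sum of e^x
```

Each iteration appends exactly one new term of the exponential series, mirroring the induction step above.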
Proof of convergence of the sequence of successive approximations. We return now to the general sequence defined by the recursion formula (7.58). To prove that the sequence converges we write each term Y_k(x) as a telescoping sum,
(7.59)    Y_k(x) = B + Σ_{m=0}^{k-1} {Y_{m+1}(x) - Y_m(x)}.
To prove that Y_k(x) tends to a limit as k → ∞ we shall prove that the infinite series
(7.60)    Σ_{m=0}^{∞} {Y_{m+1}(x) - Y_m(x)}
converges for each x in J. For this purpose it suffices to prove that the series
(7.61)    Σ_{m=0}^{∞} ||Y_{m+1}(x) - Y_m(x)||
converges. In this series we use the matrix norm introduced in Section 7.3; the norm of a matrix is the sum of the absolute values of all its entries.
Consider a closed and bounded subinterval J_1 of J containing a. We shall prove that for every x in J_1 the series in (7.61) is dominated by a convergent series of constants independent of x. This implies that the series converges uniformly on J_1.
To estimate the size of the terms in (7.61) we use the recursion formula repeatedly. Initially, we have
Y_1(x) - Y_0(x) = ∫_a^x A(t)B dt.
For simplicity, we assume that a < x. Then we can write
(7.62)    ||Y_1(x) - Y_0(x)|| ≤ ∫_a^x ||A(t)|| ||B|| dt.
Since each entry of A(t) is continuous on J, each entry is bounded on the closed bounded interval J_1. Therefore ||A(t)|| ≤ M, where M is the sum of the bounds of all the entries of A(t) on the interval J_1. (The number M depends on J_1.) Therefore the integrand in (7.62) is bounded by M ||B||, so we have
||Y_1(x) - Y_0(x)|| ≤ M ||B|| (x - a)
for all x > a in J_1.
for all x > a in Now we use the recursion formula once more to express the difference
in terms of
and then use the estimate just obtained for to obtain
a)
for all x > a in By induction we find
for m = 0,
and for all x > a in If x < a a similar argument gives the same inequality with appearing instead of (x
a). If we denote by L the length of the interval then we have L for all in
so we obtain the estimate
for
and for all in Therefore the series in (7.61) is dominated by the convergent series
This proves that the series in (7.61) converges uniformly on
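In the scalar instance y' = y, a = 0, B = 1 (so M = 1 on any subinterval), the difference of consecutive iterates is exactly x^{m+1}/(m+1)!, so it attains the bound just derived. A small numerical check of this illustrative case:

```python
import math

x, M, normB = 0.8, 1.0, 1.0   # scalar case y' = y, a = 0, B = 1, hence M = 1

def iterate_value(m, x):
    # Y_m(x) = sum_{j=0}^{m} x^j / j!, the m-th Picard iterate from the example.
    return sum(x ** j / math.factorial(j) for j in range(m + 1))

for m in range(5):
    diff = iterate_value(m + 1, x) - iterate_value(m, x)
    bound = M ** (m + 1) * normB * x ** (m + 1) / math.factorial(m + 1)
    print(m, diff, bound)   # here the difference equals the bound
```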
The foregoing argument shows that the sequence of successive approximations always converges and the convergence is uniform on J_1. Let Y denote the limit function. That is, define Y(x) for each x in J_1 by the equation
Y(x) = lim_{k→∞} Y_k(x).
We shall prove that Y has the following properties:
(a) Y is continuous on J_1.
(b) Y(x) = B + ∫_a^x A(t)Y(t) dt for all x in J_1.
(c) Y(a) = B and Y'(x) = A(x)Y(x) for all x in J_1.
Part (c) shows that Y is a solution of the initial-value problem on J_1.
Proof of (a). Each function Y_k is a column matrix whose entries are scalar functions, continuous on J_1. Each entry of the limit function Y is the limit of a uniformly convergent sequence of continuous functions so, by Theorem 11.1 of Volume I, each entry of Y is also continuous on J_1. Therefore Y itself is continuous on J_1.
Proof of (b). The recursion formula (7.58) states that
Y_{k+1}(x) = B + ∫_a^x A(t)Y_k(t) dt.
Therefore
Y(x) = lim_{k→∞} Y_{k+1}(x) = B + lim_{k→∞} ∫_a^x A(t)Y_k(t) dt = B + ∫_a^x A(t)Y(t) dt.
The interchange of the limit symbol with the integral sign is valid because of the uniform convergence of the sequence {Y_k} on J_1.
Proof of (c). The equation Y(a) = B follows at once from (b). Because of (a), the integrand in (b) is continuous on J_1 so, by the first fundamental theorem of calculus, Y'(x) exists and equals A(x)Y(x) on J_1.
The interval J_1 was any closed and bounded subinterval of J containing a. If J_1 is enlarged, the process for obtaining Y(x) doesn't change because it only involves integration from a to x. Since for every x in J there is a closed bounded subinterval of J containing a and x, a solution exists over the full interval J.
THEOREM 7.17. UNIQUENESS THEOREM FOR HOMOGENEOUS LINEAR SYSTEMS. If A(t) is continuous on an open interval J, the differential equation
Y'(t) = A(t)Y(t)
has at most one solution on J satisfying a given initial condition Y(a) = B.
Proof. Let Y and Z be two solutions on J. Let J_1 be any closed and bounded subinterval of J containing a. We will prove that Z(x) = Y(x) for every x in J_1. This implies that Z = Y on the full interval J. Since both Y and Z are solutions we have
Z'(t) - Y'(t) = A(t){Z(t) - Y(t)}.
Choose x in J_1 and integrate this equation from a to x to obtain
Z(x) - Y(x) = ∫_a^x A(t){Z(t) - Y(t)} dt.
This implies the inequality
(7.63)    ||Z(x) - Y(x)|| ≤ M |∫_a^x ||Z(t) - Y(t)|| dt|,
where M is an upper bound for ||A(t)|| on J_1. Let M_1 be an upper bound for the continuous function ||Z(t) - Y(t)|| on J_1. Then the inequality (7.63) gives us
(7.64)    ||Z(x) - Y(x)|| ≤ M M_1 |x - a|.
Using (7.64) in the right-hand member of (7.63) we obtain
||Z(x) - Y(x)|| ≤ M^2 M_1 |∫_a^x |t - a| dt| = M^2 M_1 |x - a|^2 / 2!.
By induction we find
(7.65)    ||Z(x) - Y(x)|| ≤ M^m M_1 |x - a|^m / m!.
When m → ∞ the right-hand member approaches 0, so Z(x) = Y(x). This completes the proof.
The results of this section can be summarized in the following existence-uniqueness theorem.
THEOREM 7.18. Let A be an n x n matrix function continuous on an open interval J. If a ∈ J and if B is any n-dimensional vector, the homogeneous linear system
Y'(t) = A(t)Y(t),    Y(a) = B,
has one and only one n-dimensional vector solution on J.
7.22 The method of successive approximations applied to first-order nonlinear systems
The method of successive approximations can also be applied to some nonlinear systems. Consider a first-order system of the form
Y' = F(x, Y),
where F is a given n-dimensional vector-valued function, and Y is an unknown n-dimensional vector-valued function to be determined. We seek a solution Y which satisfies the equation
Y'(x) = F[x, Y(x)]
for each x in some interval J and which also satisfies a given initial condition, say Y(a) = B, where a ∈ J and B is a given n-dimensional vector.
In a manner parallel to the linear case, we construct a sequence of successive approximations Y_0, Y_1, Y_2, ..., by taking Y_0 = B and defining Y_{k+1} in terms of Y_k by the recursion formula
(7.66)    Y_{k+1}(x) = B + ∫_a^x F[t, Y_k(t)] dt    for k = 0, 1, 2, ....
Under certain conditions on F, this sequence will converge to a limit function Y which will satisfy the given differential equation and the given initial condition.
Before we investigate the convergence of the process we discuss some one-dimensional examples chosen to illustrate some of the difficulties that can arise in practice.
EXAMPLE 1. Consider the nonlinear initial-value problem y' = x^2 + y^2, with y = 0 when x = 0. We shall compute a few approximations to the solution. We choose y_0(x) = 0 and determine the next three approximations as follows:
y_1(x) = ∫_0^x t^2 dt = x^3/3,
y_2(x) = ∫_0^x [t^2 + y_1(t)^2] dt = ∫_0^x (t^2 + t^6/9) dt = x^3/3 + x^7/63,
y_3(x) = ∫_0^x [t^2 + y_2(t)^2] dt = x^3/3 + x^7/63 + 2x^{11}/2079 + x^{15}/59535.
It is now apparent that a great deal of labor will be needed to compute further approximations. For example, the next two approximations y_4 and y_5 will be polynomials of degrees 31 and 63, respectively.
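Since each iterate is a polynomial, the computation can be automated with exact rational arithmetic. The sketch below runs the recursion for the equation y' = x^2 + y^2 with y(0) = 0 (the problem of this example) and reproduces y_3.

```python
from fractions import Fraction

def poly_mul(p, q):
    # Product of two polynomials given as coefficient lists.
    out = [Fraction(0)] * (len(p) + len(q) - 1) if p and q else []
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else Fraction(0)) +
            (q[i] if i < len(q) else Fraction(0)) for i in range(n)]

def integrate(p):
    # Integral from 0 to x, on coefficient lists.
    return [Fraction(0)] + [c / (i + 1) for i, c in enumerate(p)]

t_squared = [Fraction(0), Fraction(0), Fraction(1)]   # the term t^2
y = []                                                # y_0 = 0
for _ in range(3):
    # y_{k+1}(x) = integral_0^x (t^2 + y_k(t)^2) dt
    y = integrate(poly_add(t_squared, poly_mul(y, y)))

print({i: c for i, c in enumerate(y) if c})
# {3: Fraction(1, 3), 7: Fraction(1, 63), 11: Fraction(2, 2079), 15: Fraction(1, 59535)}
```

The nonzero coefficients are exactly those of y_3 = x^3/3 + x^7/63 + 2x^{11}/2079 + x^{15}/59535, and two more iterations would already produce polynomials of degrees 31 and 63.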
The next example exhibits a further difficulty that can arise in the computation of the successive approximations.
EXAMPLE 2. Consider the nonlinear initial-value problem y' = 2x + e^y, with y = 0 when x = 0. We begin with the initial guess y_0(x) = 0 and we find
y_1(x) = ∫_0^x (2t + 1) dt = x^2 + x,
y_2(x) = ∫_0^x (2t + e^{t^2+t}) dt = x^2 + ∫_0^x e^{t^2+t} dt.
Here further progress is impeded by the fact that the last integral cannot be evaluated in terms of elementary functions. However, for a given x it is possible to calculate a numerical approximation to the integral and thereby obtain an approximation to y_2(x).
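Following that remark, the non-elementary integral can be handled by quadrature. The sketch below uses Simpson's rule, taking the equation of this example to be y' = 2x + e^y with y(0) = 0, so that the second iterate is y_2(x) = x^2 + ∫_0^x e^{t^2+t} dt.

```python
import math

def simpson(f, lo, hi, n=1000):
    # Composite Simpson rule; n must be even.
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3

def y2(x):
    # y_2(x) = x^2 + integral_0^x e^{t^2 + t} dt, evaluated numerically.
    return x * x + simpson(lambda t: math.exp(t * t + t), 0.0, x)

print(y2(0.5))
```

Even though y_2 has no elementary closed form, its values are as accessible numerically as those of any tabulated function.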
Because of the difficulties displayed in the last two examples, the method of successive approximations is sometimes not very useful for the explicit determination of solutions in practice. The real value of the method is its use in establishing existence theorems.
7.23 Proof of an existence-uniqueness theorem for first-order nonlinear systems
We turn now to an existence-uniqueness theorem for first-order nonlinear systems. By placing suitable restrictions on the function F which appears in the right-hand member of the differential equation
Y'(x) = F[x, Y(x)],
we can extend the method of proof used for the linear case in Section 7.21. Let J denote the open interval over which we seek a solution. Assume a ∈ J and let B be a given n-dimensional vector. Let S denote a set in (n + 1)-space given by
S = {(x, Y) : |x - a| ≤ h, ||Y - B|| ≤ k},
where h > 0 and k > 0. [If n = 1 this is a rectangle with center at (a, B), with base 2h and altitude 2k.] We assume that the domain of F includes a set S of this type and that F is bounded on S, say
(7.68)    ||F(x, Y)|| ≤ M    for all (x, Y) in S,
where M is a positive constant.
Next, we assume that the composite function G(x) = F[x, Y(x)] is continuous on the interval (a - h, a + h) for every function Y which is continuous on (a - h, a + h) and which has the property that (x, Y(x)) ∈ S for all x in (a - h, a + h). This assumption guarantees the existence of the integrals that occur in the method of successive approximations, and it also implies continuity of the functions so constructed.
Finally, we assume that F satisfies a condition of the form