B.6 VECTORS AND MATRICES

An entity specified by n numbers in a certain order (ordered n-tuple) is an n-dimensional vector. Thus, an ordered n-tuple (x_1, x_2, ..., x_n) represents an n-dimensional vector x. A vector may be represented as a row (row vector):

or as a column (column vector):

Simultaneous linear equations can be viewed as the transformation of one vector into another. Consider, for example, the m simultaneous linear equations

If we define two column vectors x and y as

then Eqs. (B.48) may be viewed as the relationship, or the function, that transforms vector x into vector y. Such a transformation is called a linear transformation of vectors. To perform a linear transformation, we need to define the array of coefficients a_{ij} appearing in Eqs. (B.48). This array is called a matrix and is denoted by A for convenience:

A matrix with m rows and n columns is called a matrix of the order (m, n) or an (m × n) matrix. For the special case of m = n, the matrix is called a square matrix of order n.

It should be stressed at this point that a matrix is not a number such as a determinant, but an array of numbers arranged in a particular order. It is convenient to abbreviate the representation of matrix A in Eq. (B.50) with the form (a_{ij})_{m×n}, implying a matrix of order m × n with a_{ij} as its ijth element. In practice, when the order m × n is understood or need not be specified, the notation can be abbreviated to (a_{ij}). Note that the first index i of a_{ij} indicates the row and the second index j indicates the column of the element a_{ij} in matrix A.

The simultaneous equations (B.48) may now be expressed in a symbolic form as

Equation (B.51) is the symbolic representation of Eq. (B.48) . Yet we have not defined the operation of the multiplication of a matrix by a vector. The quantity Ax is not meaningful until such an operation has been defined.

B.6-1 Some Definitions and Properties

A square matrix whose elements are zero everywhere except on the main diagonal is a diagonal matrix. An example of a diagonal matrix is

A diagonal matrix with unity for all its diagonal elements is called an identity matrix or a unit matrix, denoted by I. This is a square matrix:

The order of the unit matrix is sometimes indicated by a subscript. Thus, I_n represents the n × n unit matrix (or identity matrix). However, we shall omit the subscript. The order of the unit matrix will be understood from the context.

A matrix having all its elements zero is a zero matrix. A square matrix A is a symmetric matrix if a_{ij} = a_{ji} (symmetry about the main diagonal). Two matrices of the same order are said to be equal if they are equal element by element. Thus, if

then A = B only if a_{ij} = b_{ij} for all i and j. If the rows and columns of an m × n matrix A are interchanged so that the elements in the ith row now become the elements of the ith column (for i = 1, 2, ..., m), the resulting matrix is called the transpose of A and is denoted by A^T. It is evident that A^T is an n × m matrix. For example, if

Thus, if then

Note that
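
As a quick numerical illustration of the transpose, here is a short NumPy sketch; the matrix used is arbitrary and is not the matrix of the text's example.

```python
import numpy as np

# An arbitrary 2 x 3 matrix (not the matrix of the text's example).
A = np.array([[1, 2, 3],
              [4, 5, 6]])

A_T = A.T                          # transpose: rows and columns interchanged
print(A_T.shape)                   # (3, 2) -- an n x m matrix, as stated
print(np.array_equal(A_T.T, A))    # True: transposing twice recovers A
```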

B.6-2 Matrix Algebra

We shall now define matrix operations, such as addition, subtraction, multiplication, and division of matrices. The definitions should be formulated so that they are useful in the manipulation of matrices.

ADDITION OF MATRICES

For two matrices A and B, both of the same order (m × n),

we define the sum A + B as

MULTIPLICATION OF A MATRIX BY A SCALAR

We multiply a matrix A by a scalar c as follows:

We also observe that the scalar c and the matrix A commute, that is,
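
The following brief NumPy sketch illustrates matrix addition and multiplication by a scalar; the matrices and the scalar are arbitrary choices, not values taken from the text.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Addition is element by element (both matrices must have the same order).
print(A + B)                          # [[ 6  8] [10 12]]

# Multiplying by a scalar c multiplies every element; c and A commute.
c = 3
print(np.array_equal(c * A, A * c))   # True
```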

MATRIX MULTIPLICATION

We define the product C = AB, in which c_{ij}, the element of C in the ith row and jth column, is found by adding the products of the elements of A in the ith row with the corresponding elements of B in the jth column. Thus,

This result is expressed as follows:

Note carefully that if this procedure is to work, the number of columns of A must be equal to the number of rows of B. In other words, AB, the product of matrices A and B, is defined only if the number of columns of A is equal to the number of rows of B. If this condition is not satisfied, the product AB is not defined and is meaningless. When the number of columns of A is equal to the number of rows of B, matrix A is said to be conformable to matrix B for the product AB. Observe that if A is an m × n matrix and B is an n × p matrix, A and B are conformable for the product, and C is an m × p matrix. We demonstrate the use of the rule in Eq. (B.56) with the following examples.

In both cases, the two matrices are conformable. However, if we interchange the order of the matrices as follows,

the matrices are no longer conformable for the product. It is evident that in general,

Indeed, AB may exist and BA may not exist, or vice versa, as in our examples. We shall see later that for some special matrices, AB = BA. When the matrices are conformable for the indicated products, matrix multiplication satisfies the following relationships:

We can verify that any matrix A premultiplied or postmultiplied by the identity matrix I remains unchanged:

Of course, we must make sure that the order of I is such that the matrices are conformable for the corresponding product.
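
A short NumPy sketch of these multiplication rules, using arbitrary matrices (not those of the preceding examples), shows conformability, the fact that AB and BA generally differ, and the effect of the identity matrix:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # 2 x 3
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])           # 3 x 2

C = A @ B                        # conformable: (2 x 3)(3 x 2) gives a 2 x 2 matrix
D = B @ A                        # also conformable here, but gives a 3 x 3 matrix
print(C.shape, D.shape)          # (2, 2) (3, 3) -- AB and BA differ in general

I = np.eye(2)
print(np.array_equal(I @ C, C) and np.array_equal(C @ I, C))   # True: IC = CI = C
```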

We give here, without proof, another important property of matrices:

where |A| and |B| represent the determinants of matrices A and B.
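
As a numerical check (not a proof) of this determinant property, with arbitrary matrices:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
B = np.array([[1.0, 4.0],
              [2.0, 5.0]])

# |AB| = |A| |B|, verified here numerically for one pair of matrices.
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))   # True
```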

MULTIPLICATION OF A MATRIX BY A VECTOR

Consider the matrix Eq. (B.52), which represents Eq. (B.48). The right-hand side of Eq. (B.52) is a product of the m × n matrix A and a vector x. If, for the time being, we treat the vector x as if it were an n × 1 matrix, then the product Ax, according to the matrix multiplication rule, yields the right-hand side of Eq. (B.48). Thus, we may multiply a matrix by a vector by treating the vector as if it were an n × 1 matrix. Note that the constraint of conformability still applies. Thus, in this case, xA is not defined and is meaningless.
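
A minimal NumPy sketch of this matrix-vector product, using a hypothetical coefficient matrix standing in for Eq. (B.48):

```python
import numpy as np

# Hypothetical coefficient matrix of a 2 x 3 system (stand-in for Eq. (B.48)).
A = np.array([[1, 2, 0],
              [3, 1, 4]])
x = np.array([1, 2, 3])      # treated as a 3 x 1 column vector

y = A @ x                    # y = Ax: each y_i is the ith equation's right-hand side
print(y)                     # [ 5 17]
```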

MATRIX INVERSION

To define the inverse of a matrix, let us consider the set of equations represented by Eq. (B.52):

We can solve this set of equations for x_1, x_2, ..., x_n in terms of y_1, y_2, ..., y_n by using Cramer's rule [see Eq. (B.31)]. This yields

in which |A| is the determinant of the matrix A and |D_{ij}| is the cofactor of element a_{ij} in the matrix A. The cofactor of element a_{ij} is given by (−1)^{i+j} times the determinant of the (n − 1) × (n − 1) matrix that is obtained when the ith row and the jth column in matrix A are deleted.

We can express Eq. (B.62a) in matrix form as

We now define A^{−1}, the inverse of a square matrix A, with the property

Then, premultiplying both sides of Eq. (B.63) by A^{−1}, we obtain

or

A comparison of Eq. (B.65) with Eq. (B.62b) shows that

One of the conditions necessary for a unique solution of Eq. (B.62a) is that the number of equations must equal the number of unknowns. This implies that the matrix A must be a square matrix.

Postmultiplying this equation by A^{−1} and then premultiplying by A, we can show that

Clearly, the matrices A and A^{−1} commute. The operation of matrix division can be accomplished through matrix inversion.
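
A short NumPy sketch of inversion, using a hypothetical nonsingular matrix (not the matrix of Example B.12 below), verifies these properties and shows how "division" by A is carried out:

```python
import numpy as np

# A hypothetical nonsingular 2 x 2 matrix (not the matrix of Example B.12).
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])

A_inv = np.linalg.inv(A)
I = np.eye(2)

# A^{-1}A = AA^{-1} = I: A and its inverse commute.
print(np.allclose(A_inv @ A, I), np.allclose(A @ A_inv, I))   # True True

# "Dividing" by A amounts to multiplying by A^{-1}; solving Ax = y:
y = np.array([3.0, 2.0])
print(np.linalg.solve(A, y), A_inv @ y)   # same result both ways: [1. 1.] [1. 1.]
```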

EXAMPLE B.12

Let us find A^{−1} if

Here

and |A| = −4. Therefore,

B.6-3 Derivatives and Integrals of a Matrix

Elements of a matrix need not be constants; they may be functions of a variable. For example, if

then the matrix elements are functions of t. Here, it is helpful to denote A by A(t). Also, it would be helpful to define the derivative and integral of A(t).

The derivative of a matrix A(t) (with respect to t) is defined as a matrix whose ijth element is the derivative (with respect to t) of the ijth element of the matrix A. Thus, if A(t) = (a_{ij}(t)), then

or

Thus, the derivative of the matrix in Eq. (B.68) is given by

Similarly, we define the integral of A(t) (with respect to t) as a matrix whose ijth element is the integral (with respect to t) of the ijth element of the matrix A:

Thus, for the matrix A in Eq. (B.68) , we have
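
Since the matrix of Eq. (B.68) is not reproduced here, the following SymPy sketch applies the element-by-element derivative and integral to an assumed A(t):

```python
import sympy as sp

t = sp.symbols('t')

# A hypothetical A(t); the matrix of Eq. (B.68) is not reproduced here.
A = sp.Matrix([[sp.sin(t), t**2],
               [sp.exp(-t), 1]])

dA_dt = A.applyfunc(lambda a: sp.diff(a, t))        # differentiate element by element
A_int = A.applyfunc(lambda a: sp.integrate(a, t))   # integrate element by element

print(dA_dt)   # Matrix([[cos(t), 2*t], [-exp(-t), 0]])
print(A_int)   # Matrix([[-cos(t), t**3/3], [-exp(-t), t]])
```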

We can readily prove the following identities:

The proofs of identities (B.71a) and (B.71b) are trivial. We can prove Eq. (B.71c) as follows. Let A be an m × n matrix and B an n × p matrix. Then, if C = AB, from Eq. (B.56) we have

and

or

Equation (B.72), along with the multiplication rule, clearly indicates that d_{ik} is the ikth element of matrix (dA/dt)B and e_{ik} is the ikth element of matrix A(dB/dt). Equation (B.71c) then follows.

If we let B = A^{−1} in Eq. (B.71c), we obtain

But since

we have

B.6-4 The Characteristic Equation of a Matrix: The Cayley-Hamilton Theorem

For an (n × n) square matrix A, any vector x (x ≠ 0) that satisfies the equation Ax = λx [Eq. (B.74)] is an eigenvector (or characteristic vector), and λ is the corresponding eigenvalue (or characteristic value) of A. Equation (B.74) can be expressed as

The solution for this set of homogeneous equations exists if and only if

or

Equation (B.76a) [or (B.76b)] is known as the characteristic equation of the matrix A and can be expressed as

Q(λ) is called the characteristic polynomial of the matrix A. The n zeros of the characteristic polynomial are the eigenvalues of A and, corresponding to each eigenvalue, there is an eigenvector that satisfies Eq. (B.74).
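
A brief NumPy sketch of Eq. (B.74), using a hypothetical 2 × 2 matrix, computes the eigenvalues and checks that each eigenvector satisfies Ax = λx:

```python
import numpy as np

# A hypothetical 2 x 2 matrix used only to illustrate Eq. (B.74).
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

lam, V = np.linalg.eig(A)        # eigenvalues and eigenvectors (columns of V)
print(lam)                       # eigenvalues -1 and -2 (order may vary)

# Each column of V satisfies Ax = lambda * x.
for i in range(len(lam)):
    print(np.allclose(A @ V[:, i], lam[i] * V[:, i]))   # True, True
```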

The Cayley-Hamilton theorem states that every n × n matrix A satisfies its own characteristic equation. In other words, Eq. (B.77) is valid if λ is replaced by A:
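
The theorem can be checked numerically for the same hypothetical matrix used above; np.poly returns the coefficients of the characteristic polynomial Q(λ):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

# For this A, Q(lambda) = lambda^2 + 3*lambda + 2; np.poly gives [1., 3., 2.].
coeffs = np.poly(A)

# Cayley-Hamilton: Q(A) = A^2 + 3A + 2I should be the zero matrix.
Q_A = coeffs[0] * A @ A + coeffs[1] * A + coeffs[2] * np.eye(2)
print(np.allclose(Q_A, np.zeros((2, 2))))    # True
```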

FUNCTIONS OF A MATRIX

We now demonstrate the use of the Cayley-Hamilton theorem to evaluate functions of a square matrix A.

Consider a function f(λ) in the form of an infinite power series:

Since λ, being an eigenvalue (characteristic root) of A, satisfies the characteristic equation [Eq. (B.77)], we can write

If we multiply both sides by λ, the left-hand side is λ^{n+1}, and the right-hand side contains the terms λ^n, λ^{n−1}, ..., λ. Using Eq. (B.80), we substitute λ^n in terms of λ^{n−1}, λ^{n−2}, ..., λ so that the highest power on the right-hand side is reduced to n − 1. Continuing in this way, we see that λ^{n+k} can be expressed in terms of λ^{n−1}, λ^{n−2}, ..., λ for any k. Hence, the infinite series on the right-hand side of Eq. (B.79) can always be expressed in terms of λ^{n−1}, λ^{n−2}, ..., λ, and a constant as

If we assume that there are n distinct eigenvalues λ_1, λ_2, ..., λ_n, then Eq. (B.81) holds for these n values of λ. The substitution of these values in Eq. (B.81) yields n simultaneous equations

and

Since A also satisfies Eq. (B.80), we may advance a similar argument to show that if f(A) is a function of a square matrix A expressed as an infinite power series in A, then

and, as argued earlier, the right-hand side can be expressed using terms of power less than or equal to n − 1, in which the coefficients β_i are found from Eq. (B.82b). If some of the eigenvalues are repeated (multiple roots), the results are somewhat modified.

We shall demonstrate the utility of this result with the following two examples.

B.6-5 Computation of an Exponential and a Power of a Matrix

Let us compute e^{At} defined by

From Eq. (B.83b) , we can express

in which the β_i are given by Eq. (B.82b), with f(λ_i) = e^{λ_i t}.
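
A numerical sketch of this procedure for a hypothetical 2 × 2 matrix with eigenvalues −1 and −2 follows (the matrix is an assumption, chosen only to have those eigenvalues; Example B.13 below treats a case analytically):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 2 x 2 matrix with eigenvalues -1 and -2 (not the matrix of
# Example B.13; it is chosen only to have those eigenvalues).
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
lam1, lam2 = -1.0, -2.0
t = 0.5

# Conditions in the style of Eq. (B.82b): beta0 + beta1*lam_i = exp(lam_i * t).
beta1 = (np.exp(lam1 * t) - np.exp(lam2 * t)) / (lam1 - lam2)
beta0 = np.exp(lam1 * t) - beta1 * lam1

eAt = beta0 * np.eye(2) + beta1 * A      # e^{At} = beta0*I + beta1*A  (n = 2)
print(np.allclose(eAt, expm(A * t)))     # True: matches the direct matrix exponential
```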

EXAMPLE B.13

Let us consider

The characteristic equation is

Hence, the eigenvalues are λ_1 = −1, λ_2 = −2, and

in which

and

COMPUTATION OF A^k

As Eq. (B.83b) indicates, we can express A^k as

in which the β_i are given by Eq. (B.82b) with f(λ_i) = λ_i^k. For a complete example of the computation of A^k by this method, see Example 10.12.

[†] These two conditions imply that the number of equations is equal to the number of unknowns and that all the equations are independent.
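
As a numerical sketch of this method, the following code computes A^k for the same hypothetical matrix used earlier (eigenvalues −1 and −2) and compares the result with direct repeated multiplication:

```python
import numpy as np

# Same hypothetical matrix as before, with eigenvalues -1 and -2.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
lam1, lam2 = -1.0, -2.0
k = 5

# f(lambda) = lambda^k; solve beta0 + beta1*lam_i = lam_i**k for beta0, beta1.
beta1 = (lam1**k - lam2**k) / (lam1 - lam2)
beta0 = lam1**k - beta1 * lam1

A_k = beta0 * np.eye(2) + beta1 * A                      # A^k = beta0*I + beta1*A
print(np.allclose(A_k, np.linalg.matrix_power(A, k)))    # True
```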