1.3.3 Sparse Matrices and Iterative Methods
Following Wilkinson and Reinsch [381], a matrix A will be called sparse if the percentage of zero elements is large and their distribution is such that it is economical to take advantage of their presence. The nonzero elements of a sparse matrix may be concentrated in a narrow band centered on the diagonal. Alternatively, they may be distributed in a less systematic manner.
Large sparse linear systems arise in numerous areas of application, such as the numerical solution of partial differential equations, mathematical programming, structural analysis, chemical engineering, electrical circuits, and networks. "Large" could imply a value of n in the range 1000–1,000,000. Figure 1.3.1 shows a sparse matrix of order n = 479 with 1887 nonzero elements (or 0.9%) that arises from a model of an eight-stage chemical distillation column.
The first task in solving a sparse system by Gaussian elimination is to permute the rows and columns so that not too many new nonzero elements are created during the elimination.
13 André-Louis Cholesky (1875–1918), a French military officer involved in geodesy and surveying in Crete and North Africa just before World War One.
1.3. Matrix Computations
Figure 1.3.1. Nonzero pattern of a sparse matrix from an eight-stage chemical distillation column.
Equivalently, we want to choose permutation matrices P and Q such that the LU factors of P AQ are as sparse as possible. Such a reordering will usually nearly minimize the number of arithmetic operations.
To find an optimal ordering which minimizes the number of nonzero elements in L and U is unfortunately an intractable problem, since the number of possible orderings of rows and columns is (n!)^2. Fortunately, there are heuristic ordering algorithms which do a good job. In Figure 1.3.2 we show the reordered matrix P AQ and its LU factors. Here L and U together contain 5904 nonzero elements, or about 2.6%. The column ordering was obtained using a MATLAB version of the so-called column minimum degree ordering.
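The effect of the ordering on fill-in can be seen in a small experiment (a sketch of the phenomenon, not of the column minimum degree algorithm itself). For an "arrowhead" matrix, eliminating the dense row and column first fills the LU factors completely, while the reversed ordering creates no fill-in at all. The helper `lu_fill_in` below is a plain dense LU factorization without pivoting, written only to count nonzeros; the matrix and its size are our illustrative choices.

```python
def lu_fill_in(A, tol=1e-12):
    """Dense LU without pivoting; return the number of nonzeros in L + U."""
    n = len(A)
    A = [row[:] for row in A]              # work on a copy
    for k in range(n):
        for i in range(k + 1, n):
            if A[i][k] != 0:
                m = A[i][k] / A[k][k]      # multiplier, stored in place of L
                A[i][k] = m
                for j in range(k + 1, n):
                    A[i][j] -= m * A[k][j]
    return sum(abs(x) > tol for row in A for x in row)

n = 6
# "Arrowhead" matrix: dense first row and column, diagonal elsewhere.
bad = [[4.0 if i == j else (1.0 if (i == 0 or j == 0) else 0.0)
        for j in range(n)] for i in range(n)]
# Reversed ordering puts the dense row and column last: no fill-in occurs.
good = [[bad[n - 1 - i][n - 1 - j] for j in range(n)] for i in range(n)]

print(lu_fill_in(bad), lu_fill_in(good))   # the reversal avoids all fill-in
```

The bad ordering produces a completely dense factorization (36 nonzeros for n = 6), while the reversed ordering preserves the original sparsity pattern (16 nonzeros).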
Figure 1.3.2. Nonzero structure of the matrix A (left) and L + U (right).
For the origin and details of this code we refer to Gilbert, Moler, and Schreiber [156].
We remark that, in general, some kind of stability check on the pivot elements must be performed during the factorization.
For many classes of sparse linear systems, iterative methods are more efficient than direct methods such as Gaussian elimination. Typical examples are those arising when a differential equation in two or three dimensions is discretized. In iterative methods a sequence of approximate solutions is computed, which in the limit converges to the exact solution x. Basic iterative methods work directly with the original matrix A and therefore have the added advantage of requiring only extra storage for a few vectors.
In a classical iterative method due to Richardson [302], starting from x^(0) = 0, a sequence x^(k) is defined by

    x^(k+1) = x^(k) + ω(b − Ax^(k)),    k = 0, 1, 2, . . . ,    (1.3.22)

where ω > 0 is a parameter to be chosen. It follows easily from (1.3.22) that the error in x^(k) satisfies

    x^(k+1) − x = (I − ωA)(x^(k) − x),

and hence

    x^(k) − x = (I − ωA)^k (x^(0) − x).
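As a minimal illustration of (1.3.22), the sketch below applies Richardson's iteration to a small symmetric positive definite system; the test matrix, right-hand side, and the choice ω = 0.4 are our assumptions, picked so that 0 < ω < 2/b for an eigenvalue bound b.

```python
def richardson(A, b, omega, iters):
    """Richardson's iteration (1.3.22): x^(k+1) = x^(k) + omega*(b - A x^(k))."""
    n = len(b)
    x = [0.0] * n                      # x^(0) = 0
    for _ in range(iters):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [xi + omega * ri for xi, ri in zip(x, r)]
    return x

# A small symmetric positive definite example; Gershgorin's theorem places the
# eigenvalues in (0, 4], so any 0 < omega < 2/4 guarantees convergence.
A = [[ 2.0, -1.0,  0.0],
     [-1.0,  2.0, -1.0],
     [ 0.0, -1.0,  2.0]]
b = [1.0, 0.0, 1.0]                    # exact solution is x = (1, 1, 1)
x = richardson(A, b, omega=0.4, iters=200)
```

After 200 iterations the error factor (I − ωA)^k has shrunk far below rounding level, so x agrees with the exact solution to working accuracy.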
It can be shown that, if all the eigenvalues λ_i of A are real and satisfy

    0 < a ≤ λ_i ≤ b,
then x^(k) will converge to the solution, as k → ∞, for 0 < ω < 2/b. Iterative methods are used most often for the solution of very large linear systems, which typically arise in the solution of boundary value problems for partial differential equations by finite difference or finite element methods. The matrices involved can be huge, sometimes involving several million unknowns. The LU factors of matrices arising in such applications typically contain orders of magnitude more nonzero elements than A itself. Hence, because of the storage and number of arithmetic operations required, Gaussian elimination may be far too costly to use. In a typical problem for the Poisson equation (1.1.21) the function u is to be determined in a plane domain D, when the values of u are given on the boundary ∂D. Such boundary value problems occur in the study of steady states in most branches of physics, such as electricity, elasticity, heat flow, and fluid mechanics (including meteorology). Cover D with a square grid of grid size h, i.e.,
    x_i = x_0 + ih,    y_j = y_0 + jh,    0 ≤ i ≤ N + 1,    0 ≤ j ≤ N + 1.

Then the difference approximation yields

    u_{i,j+1} + u_{i-1,j} + u_{i+1,j} + u_{i,j-1} − 4u_{i,j} = h^2 f(x_i, y_j)    (1 ≤ i, j ≤ N).

This is a huge system of linear algebraic equations: one equation for each interior gridpoint, altogether N^2 unknowns and equations. (Note that u_{i,0}, u_{i,N+1}, u_{0,j}, u_{N+1,j} are known boundary values.) To write the equations in matrix–vector form we order the unknowns in a vector,

    u = (u_{1,1}, . . . , u_{1,N}, u_{2,1}, . . . , u_{2,N}, . . . , u_{N,1}, . . . , u_{N,N}),

the so-called natural ordering. If the equations are ordered in the same way, we get a system

    Au = b,

where A is symmetric with all nonzero elements located in five diagonals; see Figure 1.3.3 (left).
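For concreteness, the sketch below assembles A in the natural ordering, with the conventional sign change that puts +4 on the diagonal so that A is positive definite; the function name and the choice N = 4 are ours.

```python
def poisson_matrix(N):
    """Five-point Laplacian for an N x N interior grid, natural ordering.

    Unknown (i, j) maps to index k = i*N + j (0-based); the sign convention
    puts +4 on the diagonal and -1 on the four neighbor positions."""
    n = N * N
    A = [[0.0] * n for _ in range(n)]
    for i in range(N):
        for j in range(N):
            k = i * N + j
            A[k][k] = 4.0
            if i > 0:     A[k][k - N] = -1.0   # neighbor (i-1, j)
            if i < N - 1: A[k][k + N] = -1.0   # neighbor (i+1, j)
            if j > 0:     A[k][k - 1] = -1.0   # neighbor (i, j-1)
            if j < N - 1: A[k][k + 1] = -1.0   # neighbor (i, j+1)
    return A

A = poisson_matrix(4)                          # N^2 = 16 unknowns
nnz = sum(x != 0 for row in A for x in row)
print(len(A), nnz)
```

The nonzero entries of the assembled matrix lie on exactly five diagonals (the main diagonal and offsets ±1, ±N), matching the structure in Figure 1.3.3 (left).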
Figure 1.3.3. Structure of the matrix A (left) and L + U (right) for the Poisson problem, N = 20 (rowwise ordering of the unknowns).
In principle Gaussian elimination can be used to solve such systems. But even taking symmetry and the banded structure into account, this would require about (1/2)N^4 multiplications, since in the LU factors the zero elements inside the outer diagonals fill in during the elimination, as shown in Figure 1.3.3 (right).
The linear system arising from the Poisson equation has several features common to boundary value problems for other linear partial differential equations. One of these is that only a tiny fraction of the elements in each row of A are nonzero, here at most five. Therefore, each iteration in Richardson's method requires only about 5N^2 multiplications, i.e., about five multiplications per unknown. Using iterative methods which take advantage of the sparsity and other features does allow the efficient solution of such systems. This becomes even more essential for three-dimensional problems.
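In practice one need not even form A: the sketch below applies Richardson's iteration directly on the grid, touching only the five stencil values per unknown in each sweep. The function name, grid size, right-hand side f ≡ 1, and the choice ω = 0.2 (safely inside 0 < ω < 2/b, since Gershgorin bounds the eigenvalues of A by 8) are our illustrative assumptions.

```python
def poisson_richardson(N, f, omega=0.2, sweeps=500):
    """Matrix-free Richardson iteration for the five-point Poisson stencil.

    The unknowns are stored as an (N+2) x (N+2) grid u whose boundary entries
    stay at the known value 0; A is never formed, and each sweep costs about
    five multiplications per unknown."""
    h = 1.0 / (N + 1)
    u = [[0.0] * (N + 2) for _ in range(N + 2)]
    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, N + 1):
            for j in range(1, N + 1):
                # Residual of  u_{i,j+1}+u_{i-1,j}+u_{i+1,j}+u_{i,j-1}-4u_{i,j} = h^2 f
                r = (u[i][j + 1] + u[i - 1][j] + u[i + 1][j] + u[i][j - 1]
                     - 4.0 * u[i][j]) - h * h * f(i * h, j * h)
                new[i][j] = u[i][j] + omega * r
        u = new
    return u

# Hypothetical test problem: f = 1 on the unit square, zero boundary values.
u = poisson_richardson(4, lambda x, y: 1.0)
```

After enough sweeps the residual of every interior difference equation is reduced to rounding level, without A ever being stored.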
As early as 1954, a simple atmospheric model was used for weather forecasting on an electronic computer. The net covered most of North America and Europe. During a 48-hour forecast, the computer solved (among other things) 48 Poisson equations (with different right-hand sides). This would have been impossible at that time if the special features of the system had not been used.