Journal of Computational and Applied Mathematics 101 (1999) 61–85
A Lanczos-type method for solving nonsymmetric linear
systems with multiple right-hand sides – matrix
and polynomial interpretation
C. Musschoot
Laboratoire d'Analyse Numérique et d'Optimisation, UFR IEEA – M3, Université des Sciences et
Technologies de Lille, 59655 Villeneuve d'Ascq Cedex, France
Received 9 June 1998
Abstract
In this paper, we propose a method with a finite termination property for solving the linear system AX = B, where A is a nonsymmetric complex n×n matrix and B is an arbitrary n×s rectangular matrix; s does not have to be small. The method is based on a single Krylov subspace from which all the systems draw their information. A polynomial and a single-matrix interpretation is given, which seems to be new from a theoretical point of view. Numerical experiments show that the convergence is usually quite good even if s is relatively large. The memory requirements and the computational costs seem to be interesting too.
Keywords: Nonsymmetric systems; Krylov subspace; Multiple right-hand sides; Lanczos’ method; Bi-orthogonality;
Hankel matrix; Orthogonal polynomials; Transpose-free algorithm; BiCGSTAB
1. Introduction
Two main techniques can be found for solving

AX = B,   (1)

where A is an n×n square nonsymmetric matrix and B is an arbitrary n×s rectangular matrix.
The first one considers block versions of existing methods such as block GCR (see [12]), block BiCG (see [11]), block GMRES (see [16, 17]) or block QMR (see [14]). All these methods need the inversion of an s×s matrix per iteration, so s must be relatively small.
The second one avoids this problem. Generally a seed system is considered and the results obtained for solving this single system are used to solve the other ones (see [7]). Other methods have recently been proposed. An iterative method is considered in [15] and requires a generalized eigenvalue problem to be solved. This method shares the information obtained for solving each system, in order to get a faster convergence.
© 1999 Elsevier Science B.V. All rights reserved.
0377-0427/99/$ – see front matter
PII: S0377-0427(98)00195-2
In this paper, we use a single seed system to solve (1), based on a modified Lanczos' method that will lead to transpose-free algorithms and to a modified BiCGSTAB, inspired by van der Vorst [19]. All these methods will allow a single-matrix and a polynomial interpretation.
Section 2 briefly recalls Lanczos' method for a single right-hand side. Section 3 describes the modifications that have to be made for several right-hand sides and considers the BiCGSTAB with these modifications. Section 4 presents some numerical results, and the conclusions stand in Section 5.
2. Lanczos’ method and its implementations
As the new method will be based on Lanczos' method, we will first introduce this method [10] and its implementations by means of orthogonal polynomials.
2.1. Lanczos’ method
Let us consider the system

Ax = b,

where A is an n×n square matrix and b ∈ Cⁿ. Lanczos' method constructs two sequences r_k and x_k such that

x_k − x_0 ∈ K_k(A, r_0) = span(r_0, Ar_0, …, A^{k−1} r_0),   (2)

r_k = b − Ax_k ⊥ K_k(A*, y) = span(y, A*y, …, A*^{k−1} y),   (3)

where x_0 and y are arbitrarily chosen. Usually, y = r_0.
From that, we have the

Property 2.1. Let us assume that the vectors y, A*y, …, A*^{n−1} y are linearly independent. Then

∃k ≤ n,  r_k = 0.
By the definitions of x_k and r_k, we obtain

x_k − x_0 = −α_1 r_0 − α_2 Ar_0 − ⋯ − α_k A^{k−1} r_0.

So,

r_k = r_0 + α_1 Ar_0 + ⋯ + α_k A^k r_0 = P_k(A) r_0,

where P_k is the polynomial of degree k at most, defined by

P_k(x) = 1 + α_1 x + ⋯ + α_k x^k.

Thus, the orthogonality conditions (3) can be written as

α_1 (y, A^{i+1} r_0) + ⋯ + α_k (y, A^{i+k} r_0) = −(y, A^i r_0)  for i = 0, …, k − 1.   (4)
The vector (α_1, …, α_k)^T is the solution of the Hankel system of order k generated by ((r_0, A*y), …, (r_0, A*^{2k−1} y))^T with the right-hand side (−(r_0, y), …, −(r_0, A*^{k−1} y))^T, that is, the solution of the system

    [ (r_0, A*y)     (r_0, A*^2 y)  ⋯  (r_0, A*^k y)      ] [ α_1 ]       [ (r_0, y)          ]
    [ (r_0, A*^2 y)                     ⋮                  ] [  ⋮  ]  = −  [  ⋮                ]
    [  ⋮                                ⋮                  ] [  ⋮  ]       [  ⋮                ]
    [ (r_0, A*^k y)  ⋯               (r_0, A*^{2k−1} y)   ] [ α_k ]       [ (r_0, A*^{k−1} y) ]
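To make this construction concrete, here is a small numerical sketch (the function name and the dense `numpy.linalg.solve` call are our own choices, not part of the paper; building and solving the Hankel system explicitly is numerically unstable and is done only for illustration):

```python
import numpy as np

def lanczos_via_hankel(A, b, x0, y, k):
    """Sketch: recover x_k of Lanczos' method from the Hankel system above.
    Unstable for moderate k; illustration only."""
    r0 = b - A @ x0
    # Krylov vectors r0, A r0, ..., A^{2k-1} r0 and moments (y, A^i r0)
    V = [r0]
    for _ in range(2 * k - 1):
        V.append(A @ V[-1])
    c = np.array([y @ v for v in V])
    # Hankel system (4): sum_m alpha_m c_{i+m} = -c_i, i = 0..k-1
    H = np.array([[c[i + m + 1] for m in range(k)] for i in range(k)])
    alpha = np.linalg.solve(H, -c[:k])
    xk = x0 - sum(alpha[i] * V[i] for i in range(k))
    return xk, b - A @ xk
```

With k = n (and the independence assumption of Property 2.1), the returned residual vanishes, which the sketch reproduces numerically on small well-conditioned examples.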
2.2. Formal orthogonal polynomials
We now have to introduce the notion of formal orthogonal polynomials, fully studied by Draux in [8]. Let c be the linear functional defined on the space of polynomials by

c(x^i) = c_i = (y, A^i r_0)  for all i ≥ 0.

Then, for every polynomial P, we have c(P) = (y, P(A) r_0), and the conditions (4), and thus the orthogonality conditions (3), become

c(x^i P_k) = 0  for i = 0, …, k − 1.   (5)

A family of polynomials satisfying the conditions (5) is called the family of orthogonal polynomials with respect to the functional c. These polynomials are defined up to a multiplicative factor, chosen in our case so that P_k(0) = 1.
Such polynomials satisfy, if they exist for all k, a three-term recurrence relationship. Moreover, we can show that P_k exists if and only if

            | c_1  ⋯  c_k      |
H_k^{(1)} = |  ⋮       ⋮       | ≠ 0
            | c_k  ⋯  c_{2k−1} |

and that it is of the exact degree k if and only if

            | c_0      ⋯  c_{k−1}  |
H_k^{(0)} = |  ⋮           ⋮       | ≠ 0.
            | c_{k−1}  ⋯  c_{2k−2} |

In the sequel, we will assume these two conditions hold for all k.
Then, we can define the family of adjacent polynomials P_k^{(1)}, orthogonal with respect to the functional c^{(1)} defined by

c^{(1)}(x^i) = c(x^{i+1}) = c_{i+1}.

These polynomials are chosen to be monic and, so, they exist if and only if H_k^{(1)} ≠ 0. Thus, the condition for the existence of P_k^{(1)} and P_k is the same.
2.3. Implementations of Lanczos’ method
All the possible algorithms for implementing Lanczos' method come out from the recurrence relations between the families {P_k} and {P_k^{(1)}}. The three main algorithms are named Lanczos/Orthodir, Lanczos/Orthomin and Lanczos/Orthores, and we will now recall them.
2.3.1. Lanczos/Orthodir
The orthogonal polynomials P_k^{(1)} satisfy a three-term relation of the form

P_{k+1}^{(1)}(x) = (x − α_k) P_k^{(1)}(x) − β_k P_{k−1}^{(1)}(x).

In the same way, we can prove that the polynomials P_k satisfy the relation

P_{k+1}(x) = P_k(x) − λ_k x P_k^{(1)}(x).

Setting q_k = P_k^{(1)}(A) r_0, the first algorithm follows.

Algorithm 2.2 (Lanczos/Orthodir).

q_{k+1} = (A − α_k) q_k − β_k q_{k−1},
r_{k+1} = r_k − λ_k A q_k,
x_{k+1} = x_k + λ_k q_k,

with

α_k = [(y, A² U_k(A) q_k) − β_k (y, A U_k(A) q_{k−1})] / (y, A U_k(A) q_k),

β_k = (y, A² U_{k−1}(A) q_k) / (y, A U_{k−1}(A) q_{k−1}),

λ_k = (y, U_k(A) r_k) / (y, A U_k(A) q_k),

where U_k and U_{k−1} are two arbitrary polynomials of respective degrees k and k − 1. This algorithm is called Lanczos/Orthodir if U_k and U_{k−1} are, respectively, P_k^{(1)} and P_{k−1}^{(1)}.
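The recurrences of Algorithm 2.2 can be sketched as follows in the real case (our own transcription, not from the paper; the choice U_k = P_k^{(1)} is realised with shadow vectors z_k = P_k^{(1)}(Aᵀ)y, which satisfy the same three-term recurrence with Aᵀ, so that (y, A U_k(A) q_k) = (z_k, A q_k)):

```python
import numpy as np

def lanczos_orthodir(A, b, x0=None, y=None, tol=1e-10):
    """Sketch of Lanczos/Orthodir for a real matrix A; the shadow
    vectors z_k = P_k^{(1)}(A^T) y make the coefficients computable."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    r = b - A @ x
    y = r.copy() if y is None else y
    q, q_old = r.copy(), np.zeros(n)     # q_0 = r_0 since P_0^{(1)} = 1
    z, z_old = y.copy(), np.zeros(n)
    beta = 0.0
    for k in range(n):
        Aq = A @ q
        if k > 0:
            Aq_old = A @ q_old
            beta = (z_old @ (A @ Aq)) / (z_old @ Aq_old)
            alpha = (z @ (A @ Aq) - beta * (z @ Aq_old)) / (z @ Aq)
        else:
            alpha = (z @ (A @ Aq)) / (z @ Aq)   # alpha_0 = (y, A^2 r_0)/(y, A r_0)
        lam = (z @ r) / (z @ Aq)
        x = x + lam * q
        r = r - lam * Aq
        if np.linalg.norm(r) < tol:
            break
        q, q_old = Aq - alpha * q - beta * q_old, q
        z, z_old = A.T @ z - alpha * z - beta * z_old, z
    return x, r
```

The matrix–vector counts here are not optimised; the sketch only mirrors the recurrences.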
2.3.2. Lanczos/Orthomin
Setting Q_k = ω̃_k P_k^{(1)} with ω̃_k such that P_k and Q_k have the same leading coefficient, we can show that the polynomials Q_k satisfy the relation

Q_{k+1}(x) = P_{k+1}(x) − μ_k Q_k(x).

So, setting q̃_k = Q_k(A) r_0, we obtain the

Algorithm 2.3 (Lanczos/Orthomin).

r_{k+1} = r_k − λ_k A q̃_k,
x_{k+1} = x_k + λ_k q̃_k,
q̃_{k+1} = r_{k+1} − μ_k q̃_k,

with

μ_k = (y, U_{k+1}(A) r_{k+1}) / (y, A U_k(A) q̃_k),

λ_k = (y, U_k(A) r_k) / (y, A U_k(A) q̃_k).

This algorithm is called Lanczos/Orthomin for the choices U_k ≡ P_k and U_{k+1} ≡ P_{k+1}.
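With these choices, Lanczos/Orthomin is the classical BiCG algorithm. A standard real-arithmetic BiCG sketch (textbook formulas, not copied from this paper) reads:

```python
import numpy as np

def bicg(A, b, x0=None, y=None, tol=1e-10):
    """Standard BiCG sketch: Lanczos/Orthomin in its usual form, for a
    real matrix A, with shadow vectors built from A^T. Given here only
    to make the recurrences concrete."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    r = b - A @ x
    z = r.copy() if y is None else y.copy()   # shadow residual
    p, w = r.copy(), z.copy()                 # direction and shadow direction
    rho = z @ r
    for _ in range(n):
        Ap = A @ p
        alpha = rho / (w @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        z = z - alpha * (A.T @ w)
        if np.linalg.norm(r) < tol:
            break
        rho, rho_old = z @ r, rho
        beta = rho / rho_old
        p = r + beta * p
        w = z + beta * w
    return x, r
```

With A symmetric and y = r_0 this collapses to the conjugate gradient method, which makes it easy to check.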
2.3.3. Lanczos/Orthores
The polynomials P_k satisfy a three-term relation that can always be written as

P_{k+1}(x) = ω_k [(x + γ_k) P_k(x) − η_k P_{k−1}(x)].

This relation gives us the

Algorithm 2.4 (Lanczos/Orthores).

r_{k+1} = ω_k [(A + γ_k) r_k − η_k r_{k−1}],
x_{k+1} = ω_k [γ_k x_k − η_k x_{k−1} − r_k],

with

ω_k = 1 / (γ_k − η_k),

γ_k = [η_k (y, U_k(A) r_{k−1}) − (y, A U_k(A) r_k)] / (y, U_k(A) r_k),

η_k = (y, A U_{k−1}(A) r_k) / (y, U_{k−1}(A) r_{k−1}).

This algorithm is called Lanczos/Orthores for the choices U_k ≡ P_k and U_{k−1} ≡ P_{k−1}.
3. Several right-hand sides
Let us now consider

AX = B,

where B = [b^{(1)}, …, b^{(s)}] is a matrix of dimension n × s.
3.1. Description of the method
Let us consider 2s sequences r_k^{(j)} and x_k^{(j)}, for j = 1, …, s, defined by

x_k^{(j)} ∈ K_k(A, z),   (6)

r_k^{(j)} = b^{(j)} − A x_k^{(j)} ⊥ K_k(A*, y),   (7)

where z and y are arbitrarily chosen vectors.
Then, we find from (6)

x_k^{(j)} = ξ_1^{(j)} z + ⋯ + ξ_k^{(j)} A^{k−1} z.

Thus,

r_k^{(j)} = b^{(j)} − A x_k^{(j)} = b^{(j)} − ξ_1^{(j)} A z − ⋯ − ξ_k^{(j)} A^k z = b^{(j)} − A Θ_{k−1}^{(j)}(A) z,

where Θ_{k−1}^{(j)} is the polynomial of degree k − 1 at most, defined by

Θ_{k−1}^{(j)}(x) = ξ_1^{(j)} + ξ_2^{(j)} x + ⋯ + ξ_k^{(j)} x^{k−1}.

The orthogonality conditions (7) can be written as

ξ_1^{(j)} (A^{i+1} z, y) + ⋯ + ξ_k^{(j)} (A^{i+k} z, y) = (A^i b^{(j)}, y)  for i = 0, …, k − 1,  j = 1, …, s.   (8)
The unknown vectors (ξ_1^{(j)}, …, ξ_k^{(j)})^T are the solutions of the Hankel systems generated by ((Az, y), …, (A^{2k−1} z, y))^T with the right-hand sides ((b^{(j)}, y), …, (A^{k−1} b^{(j)}, y))^T for j = 1, …, s.
This matrix interpretation would not be possible with the original Lanczos' method: here we only have one Hankel matrix with several right-hand sides, whereas in the original Lanczos' method the Hankel matrix depends on r_0^{(j)}.
3.2. Finite convergence
As in the original Lanczos' method and under a certain assumption, we can show that the new method gives the exact solution in a finite number of iterations.
The condition is the same as in the original Lanczos' method and we have the

Proposition 3.1. Let us assume that the vectors y, A*y, …, A*^{n−1} y are linearly independent. Then

∃k ≤ n,  r_k^{(j)} = 0  for j = 1, …, s.

Proof. By the orthogonality conditions, we have r_n^{(j)} ⊥ span(y, A*y, …, A*^{n−1} y), which are n linearly independent vectors. Then the result is obvious and the exact solutions are obtained in n iterations at most.
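The matrix interpretation of Section 3.1 can be sketched directly: one Hankel matrix serves all s systems, and only the small right-hand sides depend on j (the function name and the dense solve are our own choices; the explicit Hankel approach is unstable and shown only for illustration):

```python
import numpy as np

def multi_rhs_hankel(A, B, z, y, k):
    """Sketch of Section 3.1: a single k-by-k Hankel matrix, generated
    by the moments (A^i z, y), serves every right-hand side; only the
    small right-hand sides (A^i b^(j), y) depend on j."""
    n, s = B.shape
    Z = [z]
    for _ in range(2 * k - 1):
        Z.append(A @ Z[-1])
    H = np.array([[Z[i + j + 1] @ y for j in range(k)] for i in range(k)])
    Rhs = np.empty((k, s))
    Bi = B.copy()
    for i in range(k):                    # (A^i b^(j), y), i = 0..k-1
        Rhs[i] = Bi.T @ y
        Bi = A @ Bi
    Xi = np.linalg.solve(H, Rhs)          # all coefficient vectors at once
    X = np.column_stack([sum(Xi[i, j] * Z[i] for i in range(k))
                         for j in range(s)])
    return X, B - A @ X
```

With k = n and the assumption of Proposition 3.1, all s residuals vanish simultaneously.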
3.3. Implementation of the new method
Let us now explore the possibilities for implementing this method. We will first define the functionals that have to be used. Then we will construct the sequences that lead to the solution of the systems.
3.3.1. Associated functionals – polynomial expression
Let us consider the n × s linear functionals defined on the space of polynomials by

L_i^{(j)}(x^l) = (A^{i+l} z, y)  if l > 0,   (9)

L_i^{(j)}(1) = −(A^i b^{(j)}, y)   (10)

for i = 1, …, n and j = 1, …, s.
Let us define, as in the original Lanczos' method, the functional c^{(1)} by

c^{(1)}(x^i) = c_{i+1} = (A^{i+1} z, y)  for i ≥ 0.

Let P̃_k be the orthogonal polynomials with respect to the functional c^{(1)}. These polynomials satisfy

c^{(1)}(x^i P̃_k) = 0  for i = 0, …, k − 1.   (11)
We have the

Proposition 3.2. The functionals L_i^{(j)} and c^{(1)} are related by

c^{(1)}(x^i) = L_l^{(j)}(x^{i−l+1})  for i − l + 1 > 0.

Proof. By definition, L_l^{(j)}(x^{i−l+1}) = (A^{l+i−l+1} z, y) since i − l + 1 > 0. Thus, we have L_l^{(j)}(x^{i−l+1}) = (A^{i+1} z, y) = c^{(1)}(x^i).
Let the polynomials P̃_k be monic. Then, from [8], we have the

Property 3.3. The monic orthogonal polynomials P̃_k with respect to the functional c^{(1)} satisfy the relation

P̃_{k+1}(x) = (x − α_k) P̃_k(x) − β_k P̃_{k−1}(x),   (12)

where

β_0 = 0,  β_k = c^{(1)}(x^k P̃_k) / c^{(1)}(x^{k−1} P̃_{k−1})  for k > 0,

α_0 = c^{(1)}(x) / c^{(1)}(1),  α_k = [c^{(1)}(x^{k+1} P̃_k) − β_k c^{(1)}(x^k P̃_{k−1})] / c^{(1)}(x^k P̃_k)  for k > 0.
The expression of the coefficients is due to the orthogonality of the polynomials P̃_k.
Multiplying both sides in (12) by a polynomial U_{k−1} of exact degree k − 1 and applying c^{(1)} gives us

β_k = c^{(1)}(x U_{k−1} P̃_k) / c^{(1)}(U_{k−1} P̃_{k−1})  for k > 0.   (13)

Moreover, multiplying both sides in (12) by a polynomial U_k of exact degree k and applying c^{(1)} leads to

α_k = [c^{(1)}(x U_k P̃_k) − β_k c^{(1)}(U_k P̃_{k−1})] / c^{(1)}(U_k P̃_k)  for k > 0.   (14)

Now, setting

P_k^{(j)}(x) = 1 + ξ_1^{(j)} x + ⋯ + ξ_k^{(j)} x^k,

we have, for j = 1, …, s,

P_k^{(j)}(0) = 1,  L_i^{(j)}(P_k^{(j)}) = 0  for i = 0, …, k − 1.   (15)
(15)
Thus, the polynomials P_k^{(j)} are the bi-orthogonal polynomials introduced by Brezinski in [3]. So, they are of degree k at most and satisfy

P_{k+1}^{(j)}(x) = P_k^{(j)}(x) + λ_k^{(j)} x P̃_k(x)   (16)

with

λ_k^{(j)} = − L_k^{(j)}(P_k^{(j)}) / L_k^{(j)}(x P̃_k).
The polynomials P_k^{(j)} can be written, by definition,

P_k^{(j)}(x) = 1 + x Θ_{k−1}^{(j)}(x)

and thus the polynomials Θ_k^{(j)} verify

Θ_k^{(j)}(x) = Θ_{k−1}^{(j)}(x) + λ_k^{(j)} P̃_k(x).

And, as x_k^{(j)} = Θ_{k−1}^{(j)}(A) z and r_k^{(j)} = b^{(j)} − A x_k^{(j)}, we obtain, setting q_k = P̃_k(A) z,

x_{k+1}^{(j)} = x_k^{(j)} + λ_k^{(j)} q_k,
r_{k+1}^{(j)} = r_k^{(j)} − λ_k^{(j)} A q_k.   (17)
Remark 3.4. Unlike in the original Lanczos' method, the polynomials P_k^{(j)} are not orthogonal polynomials. Indeed, we do not usually have L_i^{(j)}(1) = L_{i−1}^{(j)}(x). Moreover, we do not have a polynomial relation of the form r_k^{(j)} = P_k^{(j)}(A) r_0.
The polynomials P̃_k and P_k^{(j)} can be written in terms of determinants as

           | c_1  ⋯  c_{k+1} |
           |  ⋮       ⋮      |
           | c_k  ⋯  c_{2k}  |
           | 1  x  ⋯  x^k    |
P̃_k(x) = ─────────────────────── .
           | c_1  ⋯  c_k      |
           |  ⋮       ⋮       |
           | c_k  ⋯  c_{2k−1} |

We can easily check that these polynomials are monic and satisfy the conditions (11).
Similarly, for P_k^{(j)}, we have

               | 1                      x    ⋯  x^k      |
               | −(b^{(j)}, y)          c_1  ⋯  c_k      |
               |  ⋮                     ⋮        ⋮       |
               | −(A^{k−1} b^{(j)}, y)  c_k  ⋯  c_{2k−1} |
P_k^{(j)}(x) = ─────────────────────────────────────────── .
               | c_1  ⋯  c_k      |
               |  ⋮       ⋮       |
               | c_k  ⋯  c_{2k−1} |

These polynomials satisfy the conditions (15) and we obviously have P_k^{(j)}(0) = 1.
Thus, the polynomials P̃_k and P_k^{(j)} exist if and only if H_k^{(1)} ≠ 0, and the polynomials P_k^{(j)} are of the exact degree k if and only if

                | (b^{(j)}, y)           c_1  ⋯  c_{k−1}  |
H_{k,j}^{(0)} = |  ⋮                     ⋮        ⋮       | ≠ 0.
                | (b^{(j)}, A*^{k−1} y)  c_k  ⋯  c_{2k−2} |

If the polynomials P̃_k and P_k^{(j)} do not exist, a breakdown occurs. Such a situation can be treated by a look-ahead technique as developed by Brezinski and Redivo Zaglia in [2].
3.3.2. Analogy with Lanczos/Orthodir
We can now obtain the relations required for solving the systems.
The Multiple Lanczos/Orthodir algorithm will be similar to Lanczos/Orthodir (for one single system).

Algorithm 3.5 (M-Lanczos/Orthodir).

Initializations
  q_0 = z
  q_1 = (A − α_0) z
  for j = 1, …, s do
    r_1^{(j)} = b^{(j)} − δ^{(j)} A z
    x_1^{(j)} = δ^{(j)} z
  end for
for k = 1, …, n − 1 do
  for j = 1, …, s do
    r_{k+1}^{(j)} = r_k^{(j)} − λ_k^{(j)} A q_k
    x_{k+1}^{(j)} = x_k^{(j)} + λ_k^{(j)} q_k
  end for
  q_{k+1} = (A − α_k) q_k − β_k q_{k−1}
end for

where z is an arbitrary nonzero vector, the scalars λ_k^{(j)}, α_k and β_k are defined above, and

δ^{(j)} = − L_0^{(j)}(1) / L_0^{(j)}(x) = (b^{(j)}, y) / (Az, y)

for the conditions (15) with k = 0.
Let us now give some useful expressions of the coefficients δ^{(j)}, α_k, β_k and λ_k^{(j)}. By definition of the functionals L_i^{(j)} and c^{(1)} and using their linearity, we easily prove, setting q_k = P̃_k(A) z, that

c^{(1)}(x^i P̃_k) = (A^{i+1} P̃_k(A) z, y) = (A^{i+1} q_k, y),   (18)

L_i^{(j)}(P_k^{(j)}) = −(A^i b^{(j)}, y) + (A^{i+1} Θ_{k−1}^{(j)}(A) z, y) = −(A^i r_k^{(j)}, y).   (19)

Thus, we can deduce the

Proposition 3.6. The scalars δ^{(j)}, λ_k^{(j)}, α_k and β_k are given by

δ^{(j)} = (b^{(j)}, y) / (Az, y),

λ_k^{(j)} = (A^k r_k^{(j)}, y) / (A^{k+1} q_k, y) = (V_k^{(j)}(A) r_k^{(j)}, y) / (A V_k^{(j)}(A) q_k, y),   (20)

β_k = (A^{k+1} q_k, y) / (A^k q_{k−1}, y) = (A² U_{k−1}(A) q_k, y) / (A U_{k−1}(A) q_{k−1}, y),

α_k = [(A^{k+2} q_k, y) − β_k (A^{k+1} q_{k−1}, y)] / (A^{k+1} q_k, y)
    = [(A² U_k(A) q_k, y) − β_k (A U_k(A) q_{k−1}, y)] / (A U_k(A) q_k, y)

for j = 1, …, s and for any polynomial V_k^{(j)} of exact degree k.

Proof. It is a consequence of (13), (14), (18), (19), Proposition 3.2 and the orthogonality conditions (7).
This algorithm can be directly implemented using U_k(x) = x^k and V_k^{(j)}(x) = x^k, but it then requires the computation of successive powers of the matrix A, which is known to be numerically unstable.
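For small, well-conditioned problems this direct implementation can nevertheless be sketched (our own transcription; the coefficients are those of Proposition 3.6 with U_k(x) = V_k^{(j)}(x) = x^k, and the instability warned about above is simply accepted):

```python
import numpy as np

def m_lanczos_orthodir(A, B, z, y, m=None):
    """Naive sketch of Algorithm 3.5 with U_k(x) = V_k^(j)(x) = x^k,
    i.e. coefficients computed from powers of A. Numerically unstable
    for large k; small illustrative problems only."""
    def pw(k, v):                          # A^k v
        for _ in range(k):
            v = A @ v
        return v
    n = A.shape[0]
    m = n if m is None else m
    Az = A @ z
    delta = (B.T @ y) / (Az @ y)           # delta^(j)
    R = B - np.outer(Az, delta)            # r_1^(j)
    X = np.outer(z, delta)                 # x_1^(j)
    alpha0 = ((A @ Az) @ y) / (Az @ y)     # alpha_0 = (A^2 z, y)/(A z, y)
    q_old, q = z.copy(), Az - alpha0 * z   # q_0, q_1
    for k in range(1, m):
        d = pw(k + 1, q) @ y                         # (A^{k+1} q_k, y)
        lam = np.array([pw(k, R[:, j]) @ y for j in range(R.shape[1])]) / d
        X = X + np.outer(q, lam)
        R = R - np.outer(A @ q, lam)
        beta = d / (pw(k, q_old) @ y)
        alpha = (pw(k + 2, q) @ y - beta * (pw(k + 1, q_old) @ y)) / d
        q, q_old = A @ q - alpha * q - beta * q_old, q
    return X, R
```

On a small matrix with distinct eigenvalues, m = n reproduces the finite termination of Proposition 3.1.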
To avoid computing several vectors V_k^{(j)}(A) q_k, we choose V_k^{(j)} ≡ U_k. We thus only have to compute U_k(A) q_k, U_{k−1}(A) q_k, U_k(A) q_{k−1} and U_k(A) r_k^{(j)}. A natural choice for U_k is U_k ≡ P̃_k, since then

U_{k−1}(A) q_k = P̃_{k−1}(A) P̃_k(A) z = P̃_k(A) P̃_{k−1}(A) z = U_k(A) q_{k−1}.

So we must compute the vectors P̃_k(A)² z, P̃_k(A) P̃_{k−1}(A) z and P̃_k(A) r_k^{(j)} using the recurrence relationships the polynomials P̃_k and P_k^{(j)} satisfy.
Using the technique found in [4], we obtain the

Proposition 3.7. Setting r̄_k^{(j)} = P̃_k(A) r_k^{(j)}, q̄_k = P̃_k(A) q_k, q̂_k = P̃_{k−1}(A) q_k and r̂_k^{(j)} = P̃_{k−1}(A) r_k^{(j)}, we have

q̄_{k+1} = (A − α_k)² q̄_k − 2β_k (A − α_k) q̂_k + β_k² q̄_{k−1},
q̂_{k+1} = (A − α_k) q̄_k − β_k q̂_k,
r̂_{k+1}^{(j)} = r̄_k^{(j)} − λ_k^{(j)} A q̄_k,
r̄_{k+1}^{(j)} = (A − α_k) r̂_{k+1}^{(j)} − β_k r̂_k^{(j)} + λ_k^{(j)} β_k A q̂_k
            = (A − α_k) r̄_k^{(j)} − β_k r̂_k^{(j)} − λ_k^{(j)} A q̂_{k+1},

with

λ_k^{(j)} = (r̄_k^{(j)}, y) / (A q̄_k, y),

β_k = (A² q̂_k, y) / (A q̄_{k−1}, y),

α_k = [(A² q̄_k, y) − β_k (A q̂_k, y)] / (A q̄_k, y).

Proof. Using the relations (12) and (16), we find the expressions of q̄_{k+1}, q̂_{k+1}, r̂_{k+1}^{(j)} and r̄_{k+1}^{(j)}. Then, replacing U_k and V_k^{(j)} in Proposition 3.6, the result is obvious for λ_k^{(j)}, α_k and β_k.
We thus obtain a transpose-free algorithm.

Algorithm 3.8 (TFM-Lanczos/Orthodir).

Initializations
  q̄_0 = q_0 = z
  q_1 = (A − α_0) q_0
  q̄_1 = (A − α_0) q_1
  q̂_1 = q_1
  q̂_0 = 0
  for j = 1, …, s do
    r̂_1^{(j)} = b^{(j)} − δ^{(j)} A z
    r̄_1^{(j)} = (A − α_0) r̂_1^{(j)}
    x_1^{(j)} = δ^{(j)} z
  end for
for k = 1, …, n − 1 do
  for j = 1, …, s do
    r̂_{k+1}^{(j)} = r̄_k^{(j)} − λ_k^{(j)} A q̄_k
    r̄_{k+1}^{(j)} = (A − α_k) r̂_{k+1}^{(j)} − β_k r̂_k^{(j)} + λ_k^{(j)} β_k A q̂_k
    x_{k+1}^{(j)} = x_k^{(j)} + λ_k^{(j)} q_k
  end for
  q̄_{k+1} = (A − α_k)² q̄_k − 2β_k (A − α_k) q̂_k + β_k² q̄_{k−1}
  q̂_{k+1} = (A − α_k) q̄_k − β_k q̂_k
  q_{k+1} = (A − α_k) q_k − β_k q_{k−1}
end for

with α_k, β_k, λ_k^{(j)} and δ^{(j)} defined as above.
Note that the vectors x_k^{(j)} are the same as in the M-Lanczos/Orthodir implementation.
3.3.3. Analogy with Lanczos/Orthomin
The algorithm Lanczos/Orthomin uses the polynomials P_{k+1} and P̃_k to compute the polynomial P̃_{k+1}.
To obtain similar relations, we need to introduce the polynomial P_k^{(0)}, which satisfies

P̃_{k+1}(x) = P_{k+1}^{(0)}(x) + μ_k P̃_k(x).   (21)

We can easily prove that the polynomial P_k^{(0)} is orthogonal with respect to the functional c defined by

c(x^i) = c_i = c^{(1)}(x^{i−1}) = (A^i z, y)

and will be chosen such that P_k^{(0)}(0) = 1.
As P̃_k is orthogonal with respect to c^{(1)}, the polynomials P_k^{(0)} satisfy a relation of the form

P_{k+1}^{(0)}(x) = P_k^{(0)}(x) + λ_k^{(0)} x P̃_k(x).

The polynomial P_{k+1}^{(0)} is not necessarily monic, but it has the same leading coefficient as P̃_{k+1}.
We define r_k^{(0)} = b^{(0)} − A x_k^{(0)} = b^{(0)} − A Θ_{k−1}^{(0)}(A) z with b^{(0)} = −z, so that we have L_i^{(0)}(1) = c_i.
The polynomial P_k^{(0)} thus verifies r_k^{(0)} = −P_k^{(0)}(A) z and we have, setting q_k = P̃_k(A) z,

q_{k+1} = μ_k q_k − r_{k+1}^{(0)}  and  r_{k+1}^{(0)} = r_k^{(0)} − λ_k^{(0)} A q_k.
The scalar μ_k can be written as

μ_k = − c(x^{k+1} P_{k+1}^{(0)}) / c^{(1)}(x^k P̃_k) = (A^{k+1} r_{k+1}^{(0)}, y) / (A^{k+1} q_k, y)

    = − c(x U_k P_{k+1}^{(0)}) / c^{(1)}(U_k P̃_k) = (A U_k(A) r_{k+1}^{(0)}, y) / (A U_k(A) q_k, y).   (22)
As in the previous case, if we choose U_k(x) = x^k and V_k^{(j)}(x) = x^k, then this can be directly implemented. Thus, the Multiple Lanczos/Orthomin algorithm is as follows.

Algorithm 3.9 (M-Lanczos/Orthomin).

Initializations
  q_0 = z
  for j = 0, …, s do
    r_1^{(j)} = b^{(j)} − δ^{(j)} A z
    x_1^{(j)} = δ^{(j)} z
  end for
  q_1 = μ_0 q_0 − r_1^{(0)}
for k = 1, …, n − 1 do
  for j = 0, …, s do
    r_{k+1}^{(j)} = r_k^{(j)} − λ_k^{(j)} A q_k
    x_{k+1}^{(j)} = x_k^{(j)} + λ_k^{(j)} q_k
  end for
  q_{k+1} = μ_k q_k − r_{k+1}^{(0)}
end for

where the scalars λ_k^{(j)}, δ^{(j)} and μ_k are defined above.
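A direct sketch of Algorithm 3.9 with U_k(x) = V_k^{(j)}(x) = x^k (our own transcription; unstable for large k, small problems only; column 0 of the internal block carries the auxiliary system b^{(0)} = −z):

```python
import numpy as np

def m_lanczos_orthomin(A, B, z, y, m=None):
    """Naive sketch of Algorithm 3.9 with coefficients computed from
    powers of A. Column 0 of the internal block is the auxiliary
    system b^(0) = -z; the returned blocks drop that column."""
    def pw(k, v):                          # A^k v
        for _ in range(k):
            v = A @ v
        return v
    n = A.shape[0]
    m = n if m is None else m
    Bh = np.column_stack([-z, B])          # b^(0), b^(1), ..., b^(s)
    Az = A @ z
    delta = (Bh.T @ y) / (Az @ y)          # delta^(j), j = 0..s
    R = Bh - np.outer(Az, delta)           # r_1^(j)
    X = np.outer(z, delta)                 # x_1^(j)
    mu = ((A @ R[:, 0]) @ y) / (Az @ y)    # mu_0 = (A r_1^(0), y)/(A q_0, y)
    q = mu * z - R[:, 0]                   # q_1 = mu_0 q_0 - r_1^(0)
    for k in range(1, m):
        d = pw(k + 1, q) @ y               # (A^{k+1} q_k, y)
        lam = np.array([pw(k, R[:, j]) @ y for j in range(R.shape[1])]) / d
        X = X + np.outer(q, lam)
        R = R - np.outer(A @ q, lam)
        mu = (pw(k + 1, R[:, 0]) @ y) / d  # mu_k
        q = mu * q - R[:, 0]               # q_{k+1}
    return X[:, 1:], R[:, 1:]
```

Since both Algorithms 3.5 and 3.9 build the same q_k = P̃_k(A) z, the iterates agree with those of the sketch of Algorithm 3.5.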
If we set, again, U_k ≡ P̃_k and V_k^{(j)} ≡ P̃_k, then we obtain the

Proposition 3.10. Setting r̄_k^{(j)} = P̃_k(A) r_k^{(j)}, q̄_k = P̃_k(A) q_k, r̂_k^{(j)} = P̃_{k−1}(A) r_k^{(j)} and r̆_k^{(j)} = P_k^{(0)}(A) r_k^{(j)}, we have

r̆_{k+1}^{(j)} = r̆_k^{(j)} + λ_k^{(j)} A r̄_k^{(0)} + λ_k^{(0)} A r̄_k^{(j)} − λ_k^{(0)} λ_k^{(j)} A² q̄_k,
r̂_{k+1}^{(j)} = r̄_k^{(j)} − λ_k^{(j)} A q̄_k,
r̄_{k+1}^{(j)} = r̆_{k+1}^{(j)} + μ_k r̂_{k+1}^{(j)},
q̄_{k+1} = μ_k² q̄_k − r̆_{k+1}^{(0)} − 2μ_k r̂_{k+1}^{(0)},

with

λ_k^{(j)} = (r̄_k^{(j)}, y) / (A q̄_k, y),

μ_k = (A r̂_{k+1}^{(0)}, y) / (A q̄_k, y).

Proof. Those relations are easily obtained from (21) and (12). Then, replacing U_k and V_k^{(j)} by P̃_k in (20) and (22), the coefficients λ_k^{(j)} and μ_k are obtained.
Proposition 3.11. As P_k^{(0)}(0) = 1, we can write r̆_k^{(j)} = b^{(j)} − A x̆_k^{(j)} and

r̆_k^{(j)} = 0 ⇒ A x̆_k^{(j)} = b^{(j)}.

Then

x̆_{k+1}^{(j)} = x̆_k^{(j)} − λ_k^{(j)} r̄_k^{(0)} − λ_k^{(0)} r̄_k^{(j)} + λ_k^{(0)} λ_k^{(j)} A q̄_k.
Thus the transpose-free algorithm follows.

Algorithm 3.12 (TFM-Lanczos/Orthomin).

Initializations
  q̄_0 = z
  for j = 0, …, s do
    r̂_1^{(j)} = b^{(j)} − δ^{(j)} A z
    r̆_1^{(j)} = (I + δ^{(0)} A) r̂_1^{(j)}
    r̄_1^{(j)} = r̆_1^{(j)} + μ_0 r̂_1^{(j)}
    x̆_1^{(j)} = δ^{(j)} z − δ^{(0)} b^{(j)} + δ^{(0)} δ^{(j)} A z
  end for
  q̄_1 = μ_0² q̄_0 − r̆_1^{(0)} − 2μ_0 r̂_1^{(0)}
for k = 1, …, n − 1 do
  for j = 0, …, s do
    r̆_{k+1}^{(j)} = r̆_k^{(j)} + λ_k^{(j)} A r̄_k^{(0)} + λ_k^{(0)} A r̄_k^{(j)} − λ_k^{(0)} λ_k^{(j)} A² q̄_k
    r̂_{k+1}^{(j)} = r̄_k^{(j)} − λ_k^{(j)} A q̄_k
    r̄_{k+1}^{(j)} = r̆_{k+1}^{(j)} + μ_k r̂_{k+1}^{(j)}
    x̆_{k+1}^{(j)} = x̆_k^{(j)} − λ_k^{(j)} r̄_k^{(0)} − λ_k^{(0)} r̄_k^{(j)} + λ_k^{(0)} λ_k^{(j)} A q̄_k
  end for
  q̄_{k+1} = μ_k² q̄_k − r̆_{k+1}^{(0)} − 2μ_k r̂_{k+1}^{(0)}
end for

with δ^{(j)}, λ_k^{(j)} and μ_k defined as above.
Note that the vectors x̆_k^{(j)} are not the same as in the M-Lanczos/Orthomin implementation.
An analogy with Lanczos=Orthores cannot be obtained since, usually, the polynomials Pk( j) are not
orthogonal.
The following table shows the computational cost per iteration of each algorithm. The number of n-vectors required is displayed in the column Memory.

                         n-vector DOT   Matrix-vector products   Memory
M-Lanczos/Orthodir       s + 2          2                        2s + 1
M-Lanczos/Orthomin       s + 2          2                        2s + 2
TFM-Lanczos/Orthodir     s + 4          s + 4                    3s + 3
TFM-Lanczos/Orthomin     s + 2          s + 4                    4s + 1

This shows that the number of matrix-vector products does not depend on s for the first two algorithms, while this is not the case for the transpose-free ones. It must be noticed that the transpose-free algorithms require much more memory; they can be seen as a generalization of the algorithms found in [4, 6].
We thus obtained a polynomial and a matrix interpretation, with one Hankel matrix, of linear systems with several right-hand sides, based on orthogonal polynomials. The method proposed only needs, in the basic implementations (M-Lanczos/Orthodir and M-Lanczos/Orthomin), two matrix-vector products per iteration (this does not depend on s, the number of right-hand sides considered).
Unfortunately, the greater n is, the worse the numerical results are for the two basic implementations (see below). Unless improved, those two methods only seem to have a theoretical interest. We will now study a modification of the BiCGSTAB. The BiCGSTAB of van der Vorst gives good numerical results in the single right-hand side case, so we can hope for an acceleration of the convergence (due to a certain minimization) with multiple right-hand sides.
3.3.4. Modification of BiCGSTAB for several right-hand sides
The algorithm with the smallest computational cost for the modified BiCGSTAB is M-Lanczos/Orthomin.
In the BiCGSTAB, the polynomials V_k defined by

V_{k+1}(x) = (1 + a_k x) V_k(x)

are considered and the sequence r̃_k defined by

r̃_k = V_k(A) r_k

is constructed. The scalar a_k is chosen such that ‖r̃_{k+1}‖² is minimum.
Let us set

r̃_k^{(j)} = V_k(A) r_k^{(j)}.

From (17), we obtain

V_k(A) r_{k+1}^{(j)} = r̃_k^{(j)} − λ_k^{(j)} A V_k(A) q_k.

Thus, to avoid computing several vectors V_k(A) q_k, the polynomial V_k must not depend on j. This is why we now choose the a_k which minimizes

Σ_{j=1}^{s} ‖r̃_{k+1}^{(j)}‖².

Setting q̃_k = V_k(A) q_k and s̃_k^{(j)} = r̃_k^{(j)} − λ_k^{(j)} A q̃_k, we easily find

a_k = − Σ_{j=1}^{s} (s̃_k^{(j)}, A s̃_k^{(j)}) / Σ_{j=1}^{s} ‖A s̃_k^{(j)}‖².
Then we obtain the algorithm Multiple BiCGSTAB/Orthomin.

Algorithm 3.13 (M-BiCGSTAB/Orthomin).

Initializations
  q̃_0 = z
  for j = 0, …, s do
    r̃_1^{(j)} = (I + a_0 A)(b^{(j)} − δ^{(j)} A z)
    x̃_1^{(j)} = δ^{(j)} z − a_0 (b^{(j)} − δ^{(j)} A z)
  end for
  q̃_1 = r̃_1^{(0)} − μ_0 (I + a_0 A) q̃_0
for k = 1, …, n − 1 do
  for j = 0, …, s do
    s̃_k^{(j)} = r̃_k^{(j)} − λ_k^{(j)} A q̃_k
    r̃_{k+1}^{(j)} = (I + a_k A) s̃_k^{(j)}
    x̃_{k+1}^{(j)} = x̃_k^{(j)} + λ_k^{(j)} q̃_k − a_k s̃_k^{(j)}
  end for
  q̃_{k+1} = r̃_{k+1}^{(0)} − μ_k (I + a_k A) q̃_k
end for

The scalars λ_k^{(j)} and μ_k can be expressed by

λ_k^{(j)} = (r̃_k^{(j)}, y) / (A q̃_k, y),

μ_k = (1 / a_k) (r̃_{k+1}^{(0)}, y) / (A q̃_k, y).
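A sketch of Algorithm 3.13 in real arithmetic (our own transcription; the updates of x̃ are reconstructed from the invariant r̃^{(j)} = b^{(j)} − A x̃^{(j)}, which is an assumption of this sketch rather than a formula spelled out in the text):

```python
import numpy as np

def m_bicgstab(A, B, z, y, tol=1e-10, maxit=None):
    """Sketch of Algorithm 3.13 (M-BiCGSTAB/Orthomin), real arithmetic.
    Column 0 of the internal block is the auxiliary system b^(0) = -z;
    a_k is chosen to minimise the summed residual norms of the true
    systems j = 1..s, as in the text."""
    n, s = B.shape
    maxit = n if maxit is None else maxit
    Bh = np.column_stack([-z, B])          # b^(0), b^(1), ..., b^(s)
    Az = A @ z
    delta = (Bh.T @ y) / (Az @ y)
    R = Bh - np.outer(Az, delta)           # r_1^(j)
    X = np.outer(z, delta)
    AR = A @ R
    a = -np.sum(R[:, 1:] * AR[:, 1:]) / np.sum(AR[:, 1:] ** 2)
    X = X - a * R                          # keeps r̃ = b - A x̃
    R = R + a * AR                         # r̃_1 = (I + a_0 A) r_1
    q = z.copy()                           # q̃_0
    mu = (R[:, 0] @ y) / (a * (Az @ y))    # mu_0
    q = R[:, 0] - mu * (q + a * Az)        # q̃_1
    for k in range(1, maxit):
        Aq = A @ q
        lam = (R.T @ y) / (Aq @ y)         # lambda_k^(j)
        S = R - np.outer(Aq, lam)          # s̃_k^(j)
        AS = A @ S
        a = -np.sum(S[:, 1:] * AS[:, 1:]) / np.sum(AS[:, 1:] ** 2)
        X = X + np.outer(q, lam) - a * S   # x̃_{k+1}
        R = S + a * AS                     # r̃_{k+1} = (I + a_k A) s̃_k
        if np.linalg.norm(R[:, 1:]) < tol:
            break
        mu = (R[:, 0] @ y) / (a * (Aq @ y))
        q = R[:, 0] - mu * (q + a * Aq)    # q̃_{k+1}
    return X[:, 1:], R[:, 1:]
```

The invariant r̃ = b − A x̃ holds exactly through every update, so a small returned residual certifies the returned solutions.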
Let us see the computational cost of such an algorithm.

                       n-vector DOT   Matrix-vector products   Memory
M-BiCGSTAB/Orthomin    3s + 2         s + 2                    3s + 1

This method ought to be more numerically accurate, but it requires s more matrix-vector products per iteration (to compute the A s̃_k^{(j)}).
Remark 3.14. The BiCGSTAB seems to be the most efficient of the Lanczos-type product methods (LTPM). The Conjugate Gradient Squared (CGS) [18], in particular, cannot be used here since the V_k must be independent of j.
4. Numerical examples

Before considering the examples, let us remark that the M-Lanczos/Orthomin and the M-Lanczos/Orthodir only seem to be efficient on small matrices (dimension less than 20), even if their computational cost is theoretically lower than for the M-BiCGSTAB or the other transpose-free algorithms. Secondly, the TFM-Lanczos/Orthodir only seems to give good numerical results on matrices of dimension less than 100. This is why we will only focus the numerical study on the M-BiCGSTAB and on the TFM-Lanczos/Orthomin.
Every routine was written in Matlab 4.2c.1. All the matrices we considered are of order n = 500. All the right-hand sides were randomly chosen, using the RAND function in Matlab. The stopping criterion used is (1/s) Σ_{i=1}^{s} ‖r_k^{(i)}‖² < 10^{−16} (unless the matrix dimension is reached).
We used three symmetric matrices and three nonsymmetric matrices to point out how fast the convergence was in each case. For each matrix, we computed the condition number using the COND function in Matlab. Then, we considered s = 1, 10, 20, 30, 40 and 50 right-hand sides to see the behaviour of the M-BiCGSTAB when increasing the number of right-hand sides (since the coefficient a_k in the M-BiCGSTAB depends on every residual). There is no need to do such a comparison for the TFM-Lanczos/Orthomin since each residual is considered independently from the other ones. All the results are presented in a table for the M-BiCGSTAB.
On each figure, we show the results for s = 1 and s = 50 for the M-BiCGSTAB, as well as the results for s = 50 for the TFM-Lanczos/Orthomin. This will allow us to compare the behaviour of the M-BiCGSTAB step by step when increasing the number of right-hand sides. The M-BiCGSTAB and the TFM-Lanczos/Orthomin can be compared too. The graphs represent the norms of the residuals, in logarithmic scale, versus the iterations.
4.1. Symmetric matrices

We will first study the implementation of the M-BiCGSTAB and of the TFM-Lanczos/Orthomin on symmetric matrices, since such matrices generally give better results than nonsymmetric ones.
The first matrix considered is the tridiagonal matrix

       | 20  −1              |
       | −1  20   ⋱          |
M1 =   |      ⋱   ⋱   −1     |
       |          −1  20     |

whose condition number is 1.22.
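M1 is easy to reproduce, and the stated condition number can be checked (a sketch with `numpy`, where `numpy.linalg.cond` plays the role of Matlab's COND function):

```python
import numpy as np

n = 500
# M1: 20 on the diagonal, -1 on the sub- and superdiagonals
M1 = 20 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
cond = np.linalg.cond(M1)
print(round(cond, 2))   # close to the 1.22 reported in the text
```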
We obtained the following results (iteration where the stopping criterion is satisfied) for the M-BiCGSTAB.

M1:  s          1   10  20  30  40  50
     Iteration  16  16  16  17  18  16

The convergence for s = 1 and s = 50 is as follows:
As we could expect, the convergence is fast whatever the method. This might be due to two different factors. First, the matrix considered is symmetric. Secondly, the problem is very well conditioned. We note that the number of right-hand sides is not important for the M-BiCGSTAB, and the shape of convergence is the same for s = 1 and s = 50. The TFM-Lanczos/Orthomin gives quite a good result too.
The next matrix is

       | B   −I          |
       | −I  B   ⋱       |
M2 =   |     ⋱   ⋱   −I  |
       |         −I  B   |

where I is the identity matrix of order 20 and B is the diagonal matrix of order 20 with 4 on the diagonal. The condition number is 2.97.
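M2 can be written compactly with Kronecker products, and the stated condition number checked (a `numpy` sketch; the construction is our own):

```python
import numpy as np

# M2: 25 diagonal blocks B = 4 I of order 20, with -I on the
# off-diagonal blocks, written with Kronecker products
T = np.eye(25, k=1) + np.eye(25, k=-1)
M2 = np.kron(np.eye(25), 4 * np.eye(20)) - np.kron(T, np.eye(20))
cond = np.linalg.cond(M2)
print(round(cond, 2))   # close to the 2.97 reported in the text
```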
We obtained the following results when applying the M-BiCGSTAB

M2:  s          1   10  20  30  40  50
     Iteration  40  45  41  41  49  43

and the convergence is as follows:

In this example, we see that the number of right-hand sides does not really affect the convergence of the M-BiCGSTAB. The smallest number of iterations needed is 40 for s = 1, while the largest is 49 for s = 40. The two curves for the M-BiCGSTAB are close again and the convergence is quite good. The TFM-Lanczos/Orthomin gives a better result here.
The last symmetric matrix considered is the diagonal matrix

M3 = diag(1, 2, …, 500)

used in [7] with size 200. The condition number is obviously 500. Even if this matrix is particular, let us see what the results are.
We obtained the following results with the M-BiCGSTAB

M3:  s          1    10   20   30   40   50
     Iteration  189  192  192  198  198  205

and the graph for s = 1 and s = 50 is

The convergence is good, despite a condition number equal to 500 (which is not very big, but not so small either). The two graphs are again very close for the M-BiCGSTAB. We can see that the smallest number of iterations is 189 for s = 1, and the largest is 205 for s = 50. Thus, the number of right-hand sides does not seem to be very important here. The M-BiCGSTAB is better than the TFM-Lanczos/Orthomin for s = 50, but the two methods show the same behaviour.
4.2. Nonsymmetric matrices

As the M-BiCGSTAB and the TFM-Lanczos/Orthomin seemed to be efficient on symmetric matrices, we can wonder whether it would be the same with nonsymmetric ones. (We already know that, theoretically, the methods converge.)
The first matrix we considered is

       | B   −I          |
       | −I  B   ⋱       |
M4 =   |     ⋱   ⋱   −I  |
       |         −I  B   |
with B the upper bidiagonal matrix of order 20

       | 4   −2          |
       |     4   ⋱       |
B =    |         ⋱   −2  |
       |             4   |

The condition number of this matrix is 1.01 × 10³. We obtained the following results for the M-BiCGSTAB

M4:  s          1    10   20   30   40   50
     Iteration  311  304  307  315  311  316

The convergence behaviour is as follows:
Even if the matrix is not well conditioned, we can see that the method gives quite good convergence. It required 304 iterations for s = 10 and 316 for s = 50. From the graph, we can see that the behaviour of each curve is roughly the same, with stagnation until the 100th iteration for the M-BiCGSTAB. For the TFM-Lanczos/Orthomin, the convergence is much slower and, at iteration 500, the stopping criterion is not reached.
The next matrix used is the matrix

       | 2   0   1               |
       | 1   2   0   1           |
M5 =   |     1   2   ⋱   ⋱       |
       |         ⋱   ⋱   ⋱   1   |
       |             ⋱   ⋱   0   |
       |                 1   2   |
considered in [4]. The condition number is 2.91 and the results we obtained for the M-BiCGSTAB are

M5:  s          1    10   20   30   40   50
     Iteration  274  285  284  279  286  294

The convergence behaves as follows:

Even if the condition number is small, we have a linear but not fast convergence for the M-BiCGSTAB. This might partly be due to the fact that the matrix is nonsymmetric. However, for the TFM-Lanczos/Orthomin, the convergence for s = 50 is much faster. The number of iterations needed for each number of right-hand sides s for the M-BiCGSTAB is very close (from 274 for s = 1 to 294 for s = 50), as we can see in the table.
The last nonsymmetric matrix we considered is the matrix REDHEFF of the MATLAB Test Matrix Toolbox of Higham [9], also considered in [4]. If we write this matrix M6 = (m_{i,j}), then the coefficients satisfy

m_{i,j} = 1 if j = 1,  m_{i,j} = 1 if i divides j,  m_{i,j} = 0 otherwise.

Its condition number is 2.41 × 10³.
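The definition of REDHEFF is straightforward to reproduce (our own construction, with structural checks rather than the toolbox routine):

```python
import numpy as np

def redheff(n):
    """The Redheffer matrix as defined above: ones in the first
    column, and M[i-1, j-1] = 1 whenever i divides j (1-based)."""
    M = np.zeros((n, n))
    M[:, 0] = 1                      # m_{i,1} = 1
    for i in range(1, n + 1):
        M[i - 1, i - 1::i] = 1       # columns j = i, 2i, 3i, ...
    return M

M6 = redheff(500)
```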
We obtained the following results for the M-BiCGSTAB

M6:  s          1   10  20  30  40  50
     Iteration  50  45  47  46  48  46
Then, for s = 1 and s = 50, we obtained the following graph:

Again, even with a badly conditioned matrix, the M-BiCGSTAB behaves quite well for all s, with a smallest iteration number of 45 (s = 10) and a largest of 50 (s = 1). In this last example, we can see that the two curves are again very close. The TFM-Lanczos/Orthomin reached the stopping criterion in many more iterations for s = 50.
5. Conclusion
Studying the examples, several remarks can be made.
Firstly, the M-Lanczos/Orthomin and the M-Lanczos/Orthodir do not give good convergence.
The TFM-Lanczos/Orthodir does not give good convergence either. This might be due to the
fact that the sequence x_k^(j) is the same for M-Lanczos/Orthodir and TFM-Lanczos/Orthodir (the
difference is the way it is computed).
Secondly, the M-BiCGSTAB and the TFM-Lanczos/Orthomin gave better convergence with
symmetric matrices, even if nonsymmetric matrices behave well too. In the examples, neither the
M-BiCGSTAB nor the TFM-Lanczos/Orthomin stands out as the better method. From the
memory point of view, we can only say that the TFM-Lanczos/Orthomin requires one more vector
to be stored, whereas, from the computational cost point of view, the M-BiCGSTAB needs many
more dot products.
Thirdly, the number of right-hand sides does not seem to matter much in any of the examples
considered (except, of course, for the computational cost and the memory requirements) for the M-BiCGSTAB.
Fourthly, the behaviour of the M-BiCGSTAB seems to be the same whatever the number of right-hand
sides is: it only depends on the matrix considered, even though the coefficient a_k is computed
from all the different right-hand sides. Thus, the computation of the orthogonal polynomials we
consider must be accurate. Following [1], one may use quasi-orthogonality (in a way, a numerical
orthogonality) instead of orthogonality to improve the stability of the algorithms, if it can be applied
to these methods.
Finally, even with its minimization property, the M-BiCGSTAB does not seem to converge better
for that reason (it should then be better than the TFM-Lanczos/Orthomin, but this is not really
the case).
From a theoretical point of view, we obtained a matrix and a polynomial interpretation of
the methods (depending on only one Hankel matrix), which seems to be new.
The main drawback of the methods, as we can see in the graphs, is that the residuals do not
decrease monotonically, as in most Lanczos-type algorithms. So we must see whether the methods
can be modified to obtain this property while conserving the finite termination property. In particular,
we now have to study whether the M-Lanczos/Orthodir and M-Lanczos/Orthomin can be improved,
since they require a very small computational cost. The method also has to be compared with
Lanczos-type ones, particularly with the Global Lanczos process [13]. This is under consideration.
Acknowledgements
I would like to thank V. Simoncini for her helpful suggestions and advice that helped to improve
this paper.
References
[1] B. Beckermann, The stable computation of formal orthogonal polynomials, Numer. Algorithms 11 (1996) 1–23.
[2] C. Brezinski, M. Redivo Zaglia, Breakdowns in the computation of orthogonal polynomials, Numer. Algorithms 1
(1991) 207–221.
[3] C. Brezinski, Biorthogonality and its Applications to Numerical Analysis, Marcel Dekker, New York, 1992.
[4] C. Brezinski, M. Redivo Zaglia, Transpose-free Lanczos-type algorithms for nonsymmetric linear systems, Numer.
Algorithms, to appear.
[5] C. Brezinski, CGM: a whole class of Lanczos-type solvers for linear systems, Note ANO 253, Laboratoire d’Analyse
Numerique et d’Optimisation, Universite des Sciences et Technologies de Lille, November 1991.
[6] T.F. Chan, L. De Pillis, H.A. van der Vorst, Transpose-free formulations of Lanczos-type methods for nonsymmetric
linear systems, Numer. Algorithms, to appear.
[7] T.F. Chan, W.L. Wan, Analysis of projection methods for solving linear systems with multiple right-hand sides,
SIAM J. Sci. Comput. 18 (1997) 1698–1721.
[8] A. Draux, Polynômes Orthogonaux Formels. Applications, Lecture Notes in Mathematics, vol. 974, Springer,
Berlin, 1983.
[9] N.J. Higham, The test matrix Toolbox for Matlab (Version 3.0), Numerical Analysis Report No. 276, Department
of Mathematics, The University of Manchester, September 1995.
[10] C. Lanczos, Solution of systems of linear equations by minimized iterations, J. Res. Natl. Bur. Stand. 49 (1952)
33–53.
[11] D.P. O'Leary, The block conjugate gradient algorithm and related methods, Linear Algebra Appl. 29 (1980)
293–322.
[12] Y. Saad, Iterative Methods for Sparse Linear Systems, PWS, Boston, 1995.
[13] H. Sadok, K. Jbilou, Global Lanczos-type methods with applications, Appl. Linear Algebra, submitted.
[14] V. Simoncini, A stabilized QMR version of block BiCG, SIAM J. Matrix Anal. Appl. 18 (1997) 419–434.
[15] V. Simoncini, E. Gallopoulos, An iterative method for nonsymmetric systems with multiple right-hand sides, SIAM
J. Sci. Comput. 16 (1995) 917–933.
[16] V. Simoncini, E. Gallopoulos, A hybrid block GMRES method for nonsymmetric systems with multiple right-hand
sides, J. Comput. Appl. Math. 66 (1–2) (1996) 457–469.
[17] V. Simoncini, E. Gallopoulos, Convergence properties of block GMRES for solving systems with multiple right-hand
sides, Tech. Rep. 1316, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Oct. 1993.
[18] P. Sonneveld, CGS, a fast Lanczos-type solver for nonsymmetric linear systems, SIAM J. Sci. Stat. Comput. 10
(1989) 35–52.
[19] H.A. van der Vorst, Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric
linear systems, SIAM J. Sci. Stat. Comput. 13 (1992) 631–644.
[20] H.A. van der Vorst, An iterative solution method for solving f(A)x = b, using Krylov subspace information obtained
for the symmetric positive definite matrix A, J. Comput. Appl. Math. 18 (1987) 249–263.
In this paper, we use a single seed system to solve (1), based on a modified Lanczos' method that
will lead to transpose-free algorithms and to a modified BiCGSTAB, inspired from van der Vorst
[19]. All these methods will allow a single matrix and a polynomial interpretation.
Section 2 briefly recalls Lanczos' method for a single right-hand side. Section 3 describes
the modifications that have to be made for several right-hand sides and considers the BiCGSTAB
with these modifications. Section 4 presents some numerical results and the conclusions stand in
Section 5.
2. Lanczos’ method and its implementations
As the new method will be based on Lanczos' method, we will first introduce this method
[10] and its implementations by means of orthogonal polynomials.
2.1. Lanczos’ method
Let us consider the system

    Ax = b,

where A is an n × n square matrix and b ∈ C^n. Lanczos' method constructs two sequences r_k and
x_k such that

    x_k − x_0 ∈ K_k(A, r_0) = span(r_0, A r_0, ..., A^{k−1} r_0),                (2)
    r_k = b − A x_k ⊥ K_k(A*, y) = span(y, A* y, ..., A*^{k−1} y),               (3)

where x_0 and y are arbitrarily chosen. Usually, y = r_0.
From that, we have the

Property 2.1. Let us assume that the vectors y, A* y, ..., A*^{n−1} y are linearly independent. Then

    ∃ k ≤ n,  r_k = 0.

By the definitions of x_k and r_k, we obtain

    x_k − x_0 = −α_1 r_0 − α_2 A r_0 − ... − α_k A^{k−1} r_0.

So,

    r_k = r_0 + α_1 A r_0 + ... + α_k A^k r_0 = P_k(A) r_0,

where P_k is the polynomial of degree k at most, defined by

    P_k(x) = 1 + α_1 x + ... + α_k x^k.

Thus, the orthogonality conditions (3) can be written as

    α_1 (y, A^{i+1} r_0) + ... + α_k (y, A^{i+k} r_0) = −(y, A^i r_0)  for i = 0, ..., k − 1.    (4)
The vector (α_1, ..., α_k)^T is the solution of the Hankel system of order k generated by ((r_0, A* y), ...,
(r_0, A*^{2k−1} y))^T with the right-hand side (−(r_0, y), ..., −(r_0, A*^{k−1} y))^T, that is, the solution of
the system

    [ (r_0, A* y)     (r_0, A*^2 y)   ...   (r_0, A*^k y)      ] [ α_1 ]        [ (r_0, y)          ]
    [ (r_0, A*^2 y)       .                      .             ] [  .  ]  =  −  [   .               ]
    [     .               .                      .             ] [  .  ]        [   .               ]
    [ (r_0, A*^k y)    ...            ...  (r_0, A*^{2k−1} y)  ] [ α_k ]        [ (r_0, A*^{k−1} y) ]
2.2. Formal orthogonal polynomials
We now have to introduce the notion of formal orthogonal polynomials, fully studied by Draux
in [8]. Let c be the linear functional defined on the space of polynomials by

    c(x^i) = c_i = (y, A^i r_0)  for all i ≥ 0.

Then, for every polynomial P, we have c(P) = (y, P(A) r_0) and the conditions (4), and thus the
orthogonality conditions (3), become

    c(x^i P_k) = 0  for i = 0, ..., k − 1.                                        (5)

A family of polynomials satisfying the conditions (5) is called the family of orthogonal polynomials
with respect to the functional c. These polynomials are defined apart from a multiplying factor,
chosen in our case so that P_k(0) = 1.
Such polynomials satisfy, if they exist for all k, a three-term recurrence relationship. Moreover,
we can show that P_k exists if and only if

    H_k^(1) = | c_1  ...  c_k      |
              | ...       ...      |  ≠ 0
              | c_k  ...  c_{2k−1} |

and that it is of exact degree k if and only if

    H_k^(0) = | c_0      ...  c_{k−1}  |
              | ...           ...      |  ≠ 0.
              | c_{k−1}  ...  c_{2k−2} |

In the sequel, we will assume these two conditions hold for all k.
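For a given matrix and starting vectors, these existence conditions can be checked numerically by forming the moments c_i = (y, A^i r_0) and testing the Hankel determinants. A small sketch in plain Python (all helper names are ours), using exact rational arithmetic so the determinant test is free of rounding:

```python
from fractions import Fraction

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def hankel_det(c, k, shift):
    """det of the k x k Hankel matrix [c_{i+j+shift}], 0 <= i, j < k:
    H_k^(0) for shift = 0 and H_k^(1) for shift = 1."""
    M = [[c[i + j + shift] for j in range(k)] for i in range(k)]
    det = Fraction(1)
    for p in range(k):                      # exact Gaussian elimination
        piv = next((r for r in range(p, k) if M[r][p] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != p:
            M[p], M[piv] = M[piv], M[p]
            det = -det
        det *= M[p][p]
        for r in range(p + 1, k):
            f = M[r][p] / M[p][p]
            for col in range(p, k):
                M[r][col] -= f * M[p][col]
    return det

A = [[Fraction(4), Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(3), Fraction(1)],
     [Fraction(0), Fraction(1), Fraction(2)]]
r0 = [Fraction(1), Fraction(2), Fraction(3)]
y = r0[:]                                   # usual choice y = r0

c, v = [], r0[:]                            # moments c_i = (y, A^i r0)
for _ in range(6):
    c.append(sum(u * w for u, w in zip(y, v)))
    v = matvec(A, v)

ok = all(hankel_det(c, k, 1) != 0 and hankel_det(c, k, 0) != 0
         for k in range(1, 4))
```

With this symmetric positive definite example all the determinants are nonzero, so no breakdown occurs.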
Then, we can define the family of adjacent polynomials P_k^(1), orthogonal with respect to the
functional c^(1) defined by

    c^(1)(x^i) = c(x^{i+1}) = c_{i+1}.

These polynomials are chosen to be monic and, so, they exist if and only if H_k^(1) ≠ 0. Thus, the
condition for the existence of P_k^(1) and P_k is the same.
2.3. Implementations of Lanczos’ method
All the possible algorithms for implementing Lanczos' method follow from the recurrence relations
between the families {P_k} and {P_k^(1)}. The three main algorithms are named Lanczos/Orthodir,
Lanczos/Orthomin and Lanczos/Orthores, and we now recall them.
2.3.1. Lanczos/Orthodir
The orthogonal polynomials P_k^(1) satisfy a three-term relation of the form

    P_{k+1}^(1)(x) = (x − α_k) P_k^(1)(x) − β_k P_{k−1}^(1)(x).

In the same way, we can prove that the polynomials P_k satisfy the relation

    P_{k+1}(x) = P_k(x) − λ_k x P_k^(1)(x).

Setting q_k = P_k^(1)(A) r_0, the first algorithm follows.

Algorithm 2.2 (Lanczos/Orthodir).

    q_{k+1} = (A − α_k) q_k − β_k q_{k−1},
    r_{k+1} = r_k − λ_k A q_k,
    x_{k+1} = x_k + λ_k q_k,

with

    α_k = [(y, A² U_k(A) q_k) − β_k (y, A U_k(A) q_{k−1})] / (y, A U_k(A) q_k),
    β_k = (y, A² U_{k−1}(A) q_k) / (y, A U_{k−1}(A) q_{k−1}),
    λ_k = (y, U_k(A) r_k) / (y, A U_k(A) q_k),

where U_k and U_{k−1} are two arbitrary polynomials of respective degrees k and k − 1. This algorithm
is called Lanczos/Orthodir if U_k and U_{k−1} are, respectively, P_k^(1) and P_{k−1}^(1).
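With the "power" choice U_k(x) = x^k, the coefficients above only involve inner products (y, A^j v), so the single-system algorithm can be sketched directly. A toy dense implementation in plain Python (helper names are ours; fine for tiny n, but repeated powers of A are numerically unstable for large n):

```python
def dot(u, v): return sum(a * b for a, b in zip(u, v))
def matvec(A, v): return [dot(row, v) for row in A]
def powvec(A, v, j):            # A^j v by repeated products
    for _ in range(j):
        v = matvec(A, v)
    return v

def lanczos_orthodir(A, b):
    """Lanczos/Orthodir for one system, with U_k(x) = x^k and x0 = 0."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                    # r0 = b - A x0
    y = r[:]                    # usual choice y = r0
    q_prev, q = None, r[:]      # q0 = r0
    for k in range(n):
        Aq = matvec(A, q)
        lam = dot(y, powvec(A, r, k)) / dot(y, powvec(A, q, k + 1))
        x = [xi + lam * qi for xi, qi in zip(x, q)]
        r = [ri - lam * ai for ri, ai in zip(r, Aq)]
        if k == n - 1:
            break
        if q_prev is None:      # beta_0 = 0
            alpha = dot(y, powvec(A, q, k + 2)) / dot(y, powvec(A, q, k + 1))
            nxt = [ai - alpha * qi for ai, qi in zip(Aq, q)]
        else:
            beta = dot(y, powvec(A, q, k + 1)) / dot(y, powvec(A, q_prev, k))
            alpha = (dot(y, powvec(A, q, k + 2))
                     - beta * dot(y, powvec(A, q_prev, k + 1))
                     ) / dot(y, powvec(A, q, k + 1))
            nxt = [ai - alpha * qi - beta * pi
                   for ai, qi, pi in zip(Aq, q, q_prev)]
        q_prev, q = q, nxt
    return x

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = lanczos_orthodir(A, b)   # finite termination after n = 3 steps
```

On this small symmetric positive definite example the functional c is definite, so no breakdown occurs and the residual vanishes (up to rounding) after n steps.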
2.3.2. Lanczos/Orthomin
Setting Q_k = δ_k P_k^(1) with δ_k such that P_k and Q_k have the same leading coefficient, we can show
that the polynomials Q_k satisfy the relation

    Q_{k+1}(x) = P_{k+1}(x) − γ_k Q_k(x).

So, setting q̃_k = Q_k(A) r_0, we obtain the

Algorithm 2.3 (Lanczos/Orthomin).

    r_{k+1} = r_k − λ_k A q̃_k,
    x_{k+1} = x_k + λ_k q̃_k,
    q̃_{k+1} = r_{k+1} − γ_k q̃_k,

with

    γ_k = (y, U_{k+1}(A) r_{k+1}) / (y, A U_k(A) q̃_k),
    λ_k = (y, U_k(A) r_k) / (y, A U_k(A) q̃_k).

This algorithm is called Lanczos/Orthomin for the choices U_k ≡ P_k and U_{k+1} ≡ P_{k+1}.
2.3.3. Lanczos/Orthores
The polynomials P_k satisfy a three-term relation that can always be written as

    P_{k+1}(x) = ζ_k [(x + η_k) P_k(x) − θ_k P_{k−1}(x)].

This relation gives us the

Algorithm 2.4 (Lanczos/Orthores).

    r_{k+1} = ζ_k [(A + η_k) r_k − θ_k r_{k−1}],
    x_{k+1} = ζ_k [η_k x_k − θ_k x_{k−1} − r_k],

with

    ζ_k = 1 / (η_k − θ_k),
    η_k = [θ_k (y, U_k(A) r_{k−1}) − (y, A U_k(A) r_k)] / (y, U_k(A) r_k),
    θ_k = (y, A U_{k−1}(A) r_k) / (y, U_{k−1}(A) r_{k−1}).

This algorithm is called Lanczos/Orthores for the choices U_k ≡ P_k and U_{k−1} ≡ P_{k−1}.
3. Several right-hand sides
Let us now consider

    AX = B,

where B = [b^(1), ..., b^(s)] is a matrix of dimension n × s.
3.1. Description of the method
Let us consider 2s sequences r_k^(j) and x_k^(j) for j = 1, ..., s defined by

    x_k^(j) ∈ K_k(A, z),                                                        (6)
    r_k^(j) = b^(j) − A x_k^(j) ⊥ K_k(A*, y),                                   (7)

where z and y are arbitrarily chosen vectors.
Then, we find from (6)

    x_k^(j) = ξ_1^(j) z + ... + ξ_k^(j) A^{k−1} z.

Thus,

    r_k^(j) = b^(j) − A x_k^(j)
            = b^(j) − ξ_1^(j) A z − ... − ξ_k^(j) A^k z
            = b^(j) − A π_{k−1}^(j)(A) z,

where π_{k−1}^(j) is the polynomial of degree k − 1 at most, defined by

    π_{k−1}^(j)(x) = ξ_1^(j) + ξ_2^(j) x + ... + ξ_k^(j) x^{k−1}.

The orthogonality conditions (7) can be written as

    ξ_1^(j) (A^{i+1} z, y) + ... + ξ_k^(j) (A^{i+k} z, y) = (A^i b^(j), y)
                                    for i = 0, ..., k − 1 and j = 1, ..., s.    (8)
The unknown vectors (ξ_1^(j), ..., ξ_k^(j))^T are the solutions of the Hankel systems generated by
((Az, y), ..., (A^{2k−1} z, y))^T with the right-hand sides ((b^(j), y), ..., (A^{k−1} b^(j), y))^T for j = 1, ..., s.
This matrix interpretation would not be possible with the original Lanczos' method (here we
have only one Hankel matrix with several right-hand sides). In the original Lanczos' method,
the Hankel matrix depends on r_0^(j).
3.2. Finite convergence

As in the original Lanczos' method and under a certain assumption, we can show that the new
method gives the exact solution in a finite number of iterations.
The condition is the same as in the original Lanczos' method and we have the

Proposition 3.1. Let us assume that the vectors y, A* y, ..., A*^{n−1} y are linearly independent. Then

    ∃ k ≤ n,  r_k^(j) = 0  for j = 1, ..., s.

Proof. By the orthogonality conditions, we have r_k^(j) ⊥ span(y, A* y, ..., A*^{n−1} y), which contains
n linearly independent vectors. Then the result is obvious and the exact solutions are obtained in
n iterations at most.
3.3. Implementation of the new method
Let us now explore the possibilities for implementing this method. We will first define the
functionals that have to be used. Then we will construct the sequences that lead to the solution of
the systems.
3.3.1. Associated functionals – polynomial expression
Let us consider, for j = 1, ..., s, the linear functionals L_i^(j) defined on the space of polynomials by

    L_i^(j)(x^l) = (A^{i+l} z, y)  if l > 0,                                    (9)
    L_i^(j)(1) = −(A^i b^(j), y)                                                (10)

for i = 0, ..., n.
Let us define, as in the original Lanczos' method, the functional c^(1) by

    c^(1)(x^i) = c_{i+1} = (A^{i+1} z, y)  for i ≥ 0.

Let P̃_k be the orthogonal polynomials with respect to the functional c^(1). These polynomials satisfy

    c^(1)(x^i P̃_k) = 0  for i = 0, ..., k − 1.                                  (11)
We have the

Proposition 3.2. The functionals L_i^(j) and c^(1) are related by

    c^(1)(x^i) = L_l^(j)(x^{i−l+1})  for i − l + 1 > 0.

Proof. By definition, L_l^(j)(x^{i−l+1}) = (A^{l+i−l+1} z, y) since i − l + 1 ≠ 0. Thus, we have
L_l^(j)(x^{i−l+1}) = (A^{i+1} z, y) = c^(1)(x^i).
Let the polynomials P̃_k be monic. Then, from [8], we have the

Property 3.3. The monic orthogonal polynomials P̃_k with respect to the functional c^(1) satisfy the
relation

    P̃_{k+1}(x) = (x − α_k) P̃_k(x) − β_k P̃_{k−1}(x),                           (12)

where

    β_0 = 0,    β_k = c^(1)(x^k P̃_k) / c^(1)(x^{k−1} P̃_{k−1})  for k > 0,
    α_0 = c^(1)(x) / c^(1)(1),
    α_k = [c^(1)(x^{k+1} P̃_k) − β_k c^(1)(x^k P̃_{k−1})] / c^(1)(x^k P̃_k)  for k > 0.
The expression of the coefficients is due to the orthogonality of the polynomials P̃_k.
Multiplying both sides in (12) by a polynomial U_{k−1} of exact degree k − 1 and applying c^(1)
gives us

    β_k = c^(1)(x U_{k−1} P̃_k) / c^(1)(U_{k−1} P̃_{k−1})  for k > 0.            (13)

Moreover, multiplying both sides in (12) by a polynomial U_k of exact degree k and applying c^(1)
leads to

    α_k = [c^(1)(x U_k P̃_k) − β_k c^(1)(U_k P̃_{k−1})] / c^(1)(U_k P̃_k)  for k > 0.    (14)
Now, setting

    P_k^(j)(x) = 1 + ξ_1^(j) x + ... + ξ_k^(j) x^k,

we have, for j = 1, ..., s,

    P_k^(j)(0) = 1,    L_i^(j)(P_k^(j)) = 0  for i = 0, ..., k − 1.             (15)

Thus, the polynomials P_k^(j) are the bi-orthogonal polynomials introduced by Brezinski in [3]. So,
they are of degree k at most and satisfy

    P_{k+1}^(j)(x) = P_k^(j)(x) + λ_k^(j) x P̃_k(x)                             (16)

with

    λ_k^(j) = − L_k^(j)(P_k^(j)) / L_k^(j)(x P̃_k).

The polynomials P_k^(j) can be written, by definition,

    P_k^(j)(x) = 1 + x π_{k−1}^(j)(x),

and thus the polynomials π_k^(j) verify

    π_{k+1}^(j)(x) = π_k^(j)(x) + λ_k^(j) P̃_k(x).

And, as x_k^(j) = π_{k−1}^(j)(A) z and r_k^(j) = b^(j) − A x_k^(j), we obtain, setting q_k = P̃_k(A) z,

    x_{k+1}^(j) = x_k^(j) + λ_k^(j) q_k,
    r_{k+1}^(j) = r_k^(j) − λ_k^(j) A q_k.                                      (17)
Remark 3.4. Unlike in the original Lanczos' method, the polynomials P_k^(j) are not orthogonal
polynomials. Indeed, we do not usually have L_i^(j)(1) = L_{i−1}^(j)(x). Moreover, we do not have a
polynomial relation of the form r_k^(j) = P_k^(j)(A) r_0.
The polynomials P̃_k and P_k^(j) can be written in terms of determinants as

             | c_1  c_2  ...  c_{k+1} |
             | ...            ...     |
             | c_k  ...       c_{2k}  |
             | 1    x    ...  x^k     |
    P̃_k(x) = -------------------------- .
             | c_1  ...  c_k      |
             | ...       ...      |
             | c_k  ...  c_{2k−1} |

We can easily check that these polynomials are monic and satisfy the conditions (11).
Similarly, for P_k^(j), we have

               | 1                    x    ...  x^k      |
               | −(b^(j), y)          c_1  ...  c_k      |
               | ...                  ...       ...      |
               | −(A^{k−1} b^(j), y)  c_k  ...  c_{2k−1} |
    P_k^(j)(x) = ------------------------------------------ .
               | c_1  ...  c_k      |
               | ...       ...      |
               | c_k  ...  c_{2k−1} |

These polynomials satisfy the conditions (15) and we obviously have P_k^(j)(0) = 1.
Thus, the polynomials P̃_k and P_k^(j) exist if and only if H_k^(1) ≠ 0, and the polynomials P_k^(j) are
of exact degree k if and only if

                  | (b^(j), y)           c_1  ...  c_{k−1}  |
    H_{k,j}^(0) = | ...                  ...       ...      |  ≠ 0.
                  | (b^(j), A*^{k−1} y)  c_k  ...  c_{2k−2} |

If the polynomials P̃_k and P_k^(j) do not exist, a breakdown occurs. Such a situation can be treated by
a look-ahead technique as developed by Brezinski and Redivo Zaglia in [2].
3.3.2. Analogy with Lanczos/Orthodir
We can now obtain the relations required for solving the systems.
The Multiple Lanczos/Orthodir algorithm will be similar to Lanczos/Orthodir (for one single
system).

Algorithm 3.5 (M-Lanczos/Orthodir).
    Initializations
        q_0 = z
        q_1 = (A − α_0) z
        for j = 1, ..., s do
            r_1^(j) = b^(j) − μ^(j) A z
            x_1^(j) = μ^(j) z
        end for
    for k = 1, ..., n − 1 do
        for j = 1, ..., s do
            r_{k+1}^(j) = r_k^(j) − λ_k^(j) A q_k
            x_{k+1}^(j) = x_k^(j) + λ_k^(j) q_k
        end for
        q_{k+1} = (A − α_k) q_k − β_k q_{k−1}
    end for

where z is an arbitrary nonzero vector, the scalars λ_k^(j), α_k and β_k are defined above and

    μ^(j) = − L_0^(j)(1) / L_0^(j)(x) = (b^(j), y) / (A z, y),

so that the conditions (15) hold with k = 0.
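As an illustration of the shared-Krylov structure of Algorithm 3.5, here is a toy dense implementation in plain Python using the "power" expressions of the coefficients given in Proposition 3.6 below (U_k(x) = V_k^(j)(x) = x^k). All helper names are ours; as noted in the text, the power form is numerically unstable for large n, so this is a sketch for tiny examples only:

```python
def dot(u, v): return sum(a * b for a, b in zip(u, v))
def matvec(A, v): return [dot(row, v) for row in A]
def powvec(A, v, j):
    for _ in range(j):
        v = matvec(A, v)
    return v

def m_lanczos_orthodir(A, B, z, y):
    """Sketch of Algorithm 3.5: every right-hand side b^(j) in B shares
    the single direction sequence q_k built from the seed vector z."""
    n, s = len(z), len(B)
    Az = matvec(A, z)
    mus = [dot(b, y) / dot(Az, y) for b in B]              # mu^(j)
    X = [[m * zi for zi in z] for m in mus]                # x_1^(j)
    R = [[bi - m * ai for bi, ai in zip(b, Az)]            # r_1^(j)
         for b, m in zip(B, mus)]
    alpha0 = dot(powvec(A, z, 2), y) / dot(Az, y)
    q_prev, q = z[:], [ai - alpha0 * zi for ai, zi in zip(Az, z)]
    for k in range(1, n):
        Aq = matvec(A, q)
        denom = dot(powvec(A, q, k + 1), y)                # (A^{k+1} q_k, y)
        for j in range(s):
            lam = dot(powvec(A, R[j], k), y) / denom       # lambda_k^(j)
            X[j] = [xi + lam * qi for xi, qi in zip(X[j], q)]
            R[j] = [ri - lam * ai for ri, ai in zip(R[j], Aq)]
        if k == n - 1:
            break
        beta = denom / dot(powvec(A, q_prev, k), y)
        alpha = (dot(powvec(A, q, k + 2), y)
                 - beta * dot(powvec(A, q_prev, k + 1), y)) / denom
        q_prev, q = q, [ai - alpha * qi - beta * pi
                        for ai, qi, pi in zip(Aq, q, q_prev)]
    return X
```

Only the scalars λ_k^(j) depend on j; the matrix-vector work on q_k is done once per iteration, which is exactly the point of the method.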
Let us now give some useful expressions of the coefficients μ^(j), α_k, β_k and λ_k^(j). By definition
of the functionals L_i^(j) and c^(1) and using their linearity, we easily prove, setting q_k = P̃_k(A) z,
that

    c^(1)(x^i P̃_k) = (A^{i+1} P̃_k(A) z, y) = (A^{i+1} q_k, y),                 (18)
    L_i^(j)(P_k^(j)) = −(A^i b^(j), y) + (A^{i+1} π_{k−1}^(j)(A) z, y) = −(A^i r_k^(j), y).    (19)

Thus, we can deduce the

Proposition 3.6. The scalars μ^(j), λ_k^(j), α_k and β_k are given by

    μ^(j) = (b^(j), y) / (A z, y),

    λ_k^(j) = (A^k r_k^(j), y) / (A^{k+1} q_k, y) = (V_k^(j)(A) r_k^(j), y) / (A V_k^(j)(A) q_k, y),

    β_k = (A^{k+1} q_k, y) / (A^k q_{k−1}, y) = (A² U_{k−1}(A) q_k, y) / (A U_{k−1}(A) q_{k−1}, y),

    α_k = [(A^{k+2} q_k, y) − β_k (A^{k+1} q_{k−1}, y)] / (A^{k+1} q_k, y)
        = [(A² U_k(A) q_k, y) − β_k (A U_k(A) q_{k−1}, y)] / (A U_k(A) q_k, y)  (20)

for j = 1, ..., s and for any polynomial V_k^(j) of exact degree k.
Proof. It is a consequence of (13), (14), (18), (19), Proposition 3.2 and the orthogonality conditions (7).

This algorithm can be directly implemented using U_k(x) = x^k and V_k^(j)(x) = x^k, but this uses the
computation of successive powers of the matrix A, which is known to be numerically unstable.
To avoid computing many vectors V_k^(j)(A) q_k, we choose V_k^(j) ≡ U_k. We thus only have to compute
U_k(A) q_k, U_{k−1}(A) q_k, U_k(A) q_{k−1} and U_k(A) r_k^(j). A natural choice for U_k is U_k ≡ P̃_k (since then

    U_{k−1}(A) q_k = P̃_{k−1}(A) P̃_k(A) z = P̃_k(A) P̃_{k−1}(A) z = U_k(A) q_{k−1}).

So we must compute the vectors P̃_k(A)² z, P̃_k(A) P̃_{k−1}(A) z and P̃_k(A) P_k^(j)(A) z using the recurrence
relationships the polynomials P̃_k and P_k^(j) satisfy.
Using the technique that can be found in [4], we obtain the
Proposition 3.7. Setting r̄_k^(j) = P̃_k(A) r_k^(j), q̄_k = P̃_k(A) q_k, q̂_k = P̃_{k−1}(A) q_k and
r̂_k^(j) = P̃_{k−1}(A) r_k^(j), we have

    q̄_{k+1} = (A − α_k)² q̄_k − 2 β_k (A − α_k) q̂_k + β_k² q̄_{k−1},
    q̂_{k+1} = (A − α_k) q̄_k − β_k q̂_k,
    r̂_{k+1}^(j) = r̄_k^(j) − λ_k^(j) A q̄_k,
    r̄_{k+1}^(j) = (A − α_k) r̂_{k+1}^(j) − β_k r̂_k^(j) + λ_k^(j) β_k A q̂_k
                = (A − α_k) r̄_k^(j) − β_k r̂_k^(j) − λ_k^(j) A q̂_{k+1},

with

    λ_k^(j) = (r̄_k^(j), y) / (A q̄_k, y),
    β_k = (A² q̂_k, y) / (A q̄_{k−1}, y),
    α_k = [(A² q̄_k, y) − β_k (A q̂_k, y)] / (A q̄_k, y).

Proof. Using the relations (12) and (16), we find the expressions of q̄_{k+1}, r̄_{k+1}^(j), q̂_{k+1} and
r̂_{k+1}^(j). Then, replacing U_k and V_k^(j) in Proposition 3.6, the result is obvious for λ_k^(j), α_k
and β_k.
We thus obtain a transpose-free algorithm.
Algorithm 3.8 (TFM-Lanczos/Orthodir).
    Initializations
        q_0 = q̄_0 = z
        q_1 = (A − α_0) q_0
        q̄_1 = (A − α_0) q_1
        q̂_1 = q_1
        q̂_0 = 0
        for j = 1, ..., s do
            r̂_1^(j) = b^(j) − μ^(j) A z
            r̄_1^(j) = (A − α_0) r̂_1^(j)
            x_1^(j) = μ^(j) z
        end for
    for k = 1, ..., n − 1 do
        for j = 1, ..., s do
            r̂_{k+1}^(j) = r̄_k^(j) − λ_k^(j) A q̄_k
            r̄_{k+1}^(j) = (A − α_k) r̂_{k+1}^(j) − β_k r̂_k^(j) + λ_k^(j) β_k A q̂_k
            x_{k+1}^(j) = x_k^(j) + λ_k^(j) q_k
        end for
        q̄_{k+1} = (A − α_k)² q̄_k − 2 β_k (A − α_k) q̂_k + β_k² q̄_{k−1}
        q̂_{k+1} = (A − α_k) q̄_k − β_k q̂_k
        q_{k+1} = (A − α_k) q_k − β_k q_{k−1}
    end for

with α_k, β_k, λ_k^(j) and μ^(j) defined as above.
Note that the vectors x_k^(j) are the same as in the M-Lanczos/Orthodir implementation.
3.3.3. Analogy with Lanczos/Orthomin
The algorithm Lanczos/Orthomin uses the polynomials P_{k+1} and P̃_k to compute the polynomial
P̃_{k+1}. To obtain similar relations, we need to introduce the polynomial P_k^(0), which satisfies

    P̃_{k+1}(x) = P_{k+1}^(0)(x) + γ_k P̃_k(x).                                  (21)

We can easily prove that the polynomial P_k^(0) is orthogonal with respect to the functional c defined
by

    c(x^i) = c_i = c^(1)(x^{i−1}) = (A^i z, y)

and will be chosen such that P_k^(0)(0) = 1.
As P̃_k is orthogonal with respect to c^(1), the polynomials P_k^(0) satisfy a relation of the form

    P_{k+1}^(0)(x) = P_k^(0)(x) + λ_k^(0) x P̃_k(x).

The polynomial P̃_{k+1} is not necessarily monic but has the same leading coefficient as P_{k+1}^(0).
We define r_k^(0) = b^(0) − A x_k^(0) = b^(0) − A π_{k−1}^(0)(A) z with b^(0) = −z, so that we have
L_i^(0)(1) = c_i. The polynomial P_k^(0) thus verifies r_k^(0) = −P_k^(0)(A) z and we have, setting
q_k = P̃_k(A) z,

    q_{k+1} = γ_k q_k − r_{k+1}^(0)   and   r_{k+1}^(0) = r_k^(0) − λ_k^(0) A q_k.

The scalar γ_k can be written as

    γ_k = − c(x^{k+1} P_{k+1}^(0)) / c^(1)(x^k P̃_k) = (A^{k+1} r_{k+1}^(0), y) / (A^{k+1} q_k, y)
        = − c(x U_k P_{k+1}^(0)) / c^(1)(U_k P̃_k) = (A U_k(A) r_{k+1}^(0), y) / (A U_k(A) q_k, y).    (22)
As in the previous case, if we choose U_k(x) = x^k and V_k^(j)(x) = x^k, then this can be directly
implemented. Thus, the Multiple Lanczos/Orthomin algorithm is as follows:

Algorithm 3.9 (M-Lanczos/Orthomin).
    Initializations
        q_0 = z
        for j = 0, ..., s do
            r_1^(j) = b^(j) − μ^(j) A z
            x_1^(j) = μ^(j) z
        end for
        q_1 = γ_0 q_0 − r_1^(0)
    for k = 1, ..., n − 1 do
        for j = 0, ..., s do
            r_{k+1}^(j) = r_k^(j) − λ_k^(j) A q_k
            x_{k+1}^(j) = x_k^(j) + λ_k^(j) q_k
        end for
        q_{k+1} = γ_k q_k − r_{k+1}^(0)
    end for

where the scalars λ_k^(j), μ^(j) and γ_k are defined above.
If we set, again, U_k ≡ P̃_k and V_k^(j) ≡ P̃_k, then we obtain the

Proposition 3.10. Setting r̄_k^(j) = P̃_k(A) r_k^(j), q̄_k = P̃_k(A) q_k, r̂_k^(j) = P̃_{k−1}(A) r_k^(j) and
ř_k^(j) = P_k^(0)(A) r_k^(j), we have

    ř_{k+1}^(j) = ř_k^(j) + λ_k^(j) A r̄_k^(0) + λ_k^(0) A r̄_k^(j) − λ_k^(0) λ_k^(j) A² q̄_k,
    r̂_{k+1}^(j) = r̄_k^(j) − λ_k^(j) A q̄_k,
    r̄_{k+1}^(j) = ř_{k+1}^(j) + γ_k r̂_{k+1}^(j),
    q̄_{k+1} = γ_k² q̄_k − ř_{k+1}^(0) − 2 γ_k r̂_{k+1}^(0),

with

    λ_k^(j) = (r̄_k^(j), y) / (A q̄_k, y),
    γ_k = (A r̂_{k+1}^(0), y) / (A q̄_k, y).
Proof. These relations are easily obtained from (21) and (12). Then, replacing U_k and V_k^(j) by P̃_k
in (20) and (22), the coefficients λ_k^(j) and γ_k are obtained.
Proposition 3.11. As P_k^(0)(0) = 1, we can write ř_k^(j) = b^(j) − A x̌_k^(j) and

    r_k^(j) = 0 ⇒ ř_k^(j) = 0 ⇒ A x̌_k^(j) = b^(j).

Then

    x̌_{k+1}^(j) = x̌_k^(j) − λ_k^(j) r̄_k^(0) − λ_k^(0) r̄_k^(j) + λ_k^(0) λ_k^(j) A q̄_k.
Thus the transpose-free algorithm follows:

Algorithm 3.12 (TFM-Lanczos/Orthomin).
    Initializations
        q̄_0 = z
        for j = 0, ..., s do
            r̂_1^(j) = b^(j) − μ^(j) A z
            ř_1^(j) = (I + λ_0^(0) A) r̂_1^(j)
            r̄_1^(j) = ř_1^(j) + γ_0 r̂_1^(j)
            x̌_1^(j) = μ^(j) z − λ_0^(0) b^(j) + λ_0^(0) μ^(j) A z
        end for
        q̄_1 = γ_0² q̄_0 − ř_1^(0) − 2 γ_0 r̂_1^(0)
    for k = 1, ..., n − 1 do
        for j = 0, ..., s do
            ř_{k+1}^(j) = ř_k^(j) + λ_k^(j) A r̄_k^(0) + λ_k^(0) A r̄_k^(j) − λ_k^(0) λ_k^(j) A² q̄_k
            r̂_{k+1}^(j) = r̄_k^(j) − λ_k^(j) A q̄_k
            r̄_{k+1}^(j) = ř_{k+1}^(j) + γ_k r̂_{k+1}^(j)
            x̌_{k+1}^(j) = x̌_k^(j) − λ_k^(j) r̄_k^(0) − λ_k^(0) r̄_k^(j) + λ_k^(0) λ_k^(j) A q̄_k
        end for
        q̄_{k+1} = γ_k² q̄_k − ř_{k+1}^(0) − 2 γ_k r̂_{k+1}^(0)
    end for

with μ^(j), λ_k^(j) and γ_k defined as above.
Note that the vectors x̌_k^(j) are not the same as in the M-Lanczos/Orthomin implementation.
An analogy with Lanczos/Orthores cannot be obtained since, usually, the polynomials P_k^(j) are not
orthogonal.
The following table shows the computational cost per iteration of each algorithm. The number of
n-vectors required is displayed in the column Memory.

                            n-vector DOT    Matrix-vector products    Memory
    M-Lanczos/Orthodir      s + 2           2                         2s + 1
    M-Lanczos/Orthomin      s + 2           2                         2s + 2
    TFM-Lanczos/Orthodir    s + 4           s + 4                     3s + 3
    TFM-Lanczos/Orthomin    s + 2           s + 4                     4s + 1

This shows that the number of matrix-vector products does not depend on s for the first two
algorithms, while this is not the case for the transpose-free ones. It must be noticed that the
transpose-free algorithms require much more memory; they can be seen as a generalization of the
algorithms found in [4, 6].
We thus obtained a polynomial and a matrix interpretation, with one Hankel matrix, of linear
systems with several right-hand sides, based on orthogonal polynomials. The proposed method only
needs, in the basic implementations (M-Lanczos/Orthodir and M-Lanczos/Orthomin),
two matrix-vector products per iteration (this does not depend on s, the number of right-hand sides
considered).
Unfortunately, the greater n is, the worse the numerical results are for the two basic implementations
(see below). Unless improved, those two methods only seem to have a theoretical interest. We
will now study a modification of the BiCGSTAB. The BiCGSTAB of van der Vorst gives good
numerical results in the single right-hand side case, so we can hope for an acceleration of the
convergence (due to a certain minimization) with multiple right-hand sides.
3.3.4. Modification of BiCGSTAB for several right-hand sides
The algorithm with the smallest computational cost for the Modified BiCGSTAB is the
M-Lanczos/Orthomin.
In the BiCGSTAB, the polynomials V_k defined by

    V_{k+1}(x) = (1 + a_k x) V_k(x)

are considered and the sequence r̃_k defined by

    r̃_k = V_k(A) r_k

is constructed. The scalar a_k is chosen such that ‖r̃_{k+1}‖₂ is minimum.
Let us set

    r̃_k^(j) = V_k(A) r_k^(j).
From (17), we obtain

    V_k(A) r_{k+1}^(j) = r̃_k^(j) − λ_k^(j) A V_k(A) q_k.

Thus, to avoid computing several vectors V_k(A) q_k, the polynomial V_k must not depend on j.
This is why we now choose a_k which minimizes

    Σ_{j=1}^{s} ‖r̃_{k+1}^(j)‖².

Setting q̃_k = V_k(A) q_k and s̃_k^(j) = r̃_k^(j) − λ_k^(j) A q̃_k, we easily find

    a_k = − [ Σ_{j=1}^{s} (s̃_k^(j), A s̃_k^(j)) ] / [ Σ_{j=1}^{s} ‖A s̃_k^(j)‖² ].

Then we obtain the algorithm Multiple BiCGSTAB/Orthomin.
Algorithm 3.13 (M-BiCGSTAB/Orthomin).
    Initializations
        q̃_0 = z
        for j = 0, ..., s do
            r̃_1^(j) = (I + a_0 A)(b^(j) − μ^(j) A z)
            x̃_1^(j) = μ^(j) z − a_0 (b^(j) − μ^(j) A z)
        end for
    for k = 1, ..., n − 1 do
        for j = 0, ..., s do
            s̃_k^(j) = r̃_k^(j) − λ_k^(j) A q̃_k
            r̃_{k+1}^(j) = (I + a_k A) s̃_k^(j)
            x̃_{k+1}^(j) = x̃_k^(j) + λ_k^(j) q̃_k − a_k s̃_k^(j)
        end for
        q̃_{k+1} = r̃_{k+1}^(0) − γ_k (I + a_k A) q̃_k
    end for

where each a_k is given by the least-squares formula above. The scalars λ_k^(j) and γ_k can be
expressed by

    λ_k^(j) = (r̃_k^(j), y) / (A q̃_k, y),
    γ_k = (1 / a_k) (r̃_{k+1}^(0), y) / (A q̃_k, y).
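The shared coefficient a_k is simply the least-squares minimizer of the summed squared residual norms: the objective a ↦ Σ_j ‖(I + a A) s_j‖² is a quadratic in a, and setting its derivative to zero gives the formula above. A quick numerical check in plain Python (all names are ours):

```python
def dot(u, v): return sum(a * b for a, b in zip(u, v))
def matvec(A, v): return [dot(row, v) for row in A]

def best_a(A, S):
    """a = -sum_j (s_j, A s_j) / sum_j ||A s_j||^2, the critical point
    of the quadratic a -> sum_j ||(I + a A) s_j||^2."""
    AS = [matvec(A, s) for s in S]
    return -sum(dot(s, w) for s, w in zip(S, AS)) / sum(dot(w, w) for w in AS)

def objective(A, S, a):
    tot = 0.0
    for s in S:
        w = matvec(A, s)
        v = [si + a * wi for si, wi in zip(s, w)]
        tot += dot(v, v)
    return tot

A = [[2.0, 1.0], [0.0, 3.0]]
S = [[1.0, 2.0], [3.0, -1.0]]   # two residuals standing in for the s~_k^(j)
a = best_a(A, S)
```

Since the leading coefficient Σ_j ‖A s_j‖² is positive, this critical point is the global minimum of the objective.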
Let us see the computational cost of such an algorithm.

                           n-vector DOT    Matrix-vector products    Memory
    M-BiCGSTAB/Orthomin    3s + 2          s + 2                     3s + 1

This method ought to be more numerically accurate, but it requires s more matrix-vector products
per iteration (to compute the A s̃_k^(j)).

Remark 3.14. The BiCGSTAB seems to be the most efficient of the Lanczos-type product
methods (LTPM). The Conjugate Gradient Squared (CGS) in particular [18] cannot be used here,
since the V_k must be independent of j.
4. Numerical examples

Before considering the examples, let us remark that the M-Lanczos/Orthomin and the
M-Lanczos/Orthodir only seem to be efficient on small matrices (dimension less than 20), even if
their computational cost is theoretically lower than for the M-BiCGSTAB or the other transpose-free
algorithms. Secondly, the TFM-Lanczos/Orthodir only seems to give good numerical results on
matrices of dimension less than 100. This is why we will focus the numerical study on the
M-BiCGSTAB and on the TFM-Lanczos/Orthomin.
Every routine was written in Matlab 4.2c.l. All the matrices we considered are of order n = 500.
All the right-hand sides were randomly chosen, using the RAND function in Matlab. The stopping
criterion used is

    (1/s) Σ_{i=1}^{s} ‖r_k^(i)‖₂ < 10^{−16}

(unless the matrix dimension is reached first).
We used three symmetric matrices and three nonsymmetric matrices to point out how fast the
convergence is in each case. For each matrix, we computed the condition number using the COND
function in Matlab. Then, we considered s = 1, 10, 20, 30, 40 and 50 right-hand sides to see the
behaviour of the M-BiCGSTAB when increasing the number of right-hand sides (since the coefficient
a_k in the M-BiCGSTAB depends on every residual). There is no need to make such a comparison
for the TFM-Lanczos/Orthomin since each residual is treated independently of the other ones. All
the results are presented in a table for the M-BiCGSTAB.
On each figure, we show the results for s = 1 and s = 50 for the M-BiCGSTAB, as well as the
results for s = 50 for the TFM-Lanczos/Orthomin. This allows us to follow the behaviour of the
M-BiCGSTAB step by step when increasing the number of right-hand sides, and to compare the
M-BiCGSTAB with the TFM-Lanczos/Orthomin. The graphs represent the norms of the residuals,
in logarithmic scale, versus the iterations.
4.1. Symmetric matrices
We will first study the implementation of the M-BiCGSTAB and of the TFM-Lanczos/Orthomin on symmetric matrices, since such matrices generally give better results than nonsymmetric ones.
The first matrix considered is the tridiagonal matrix

M1 = tridiag(-1, 20, -1),

with 20 on the main diagonal and -1 on the sub- and superdiagonals, whose condition number is 1.22.
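The construction can be sketched in NumPy (the paper's experiments were run in Matlab; this is only an illustrative reconstruction):

```python
import numpy as np

n = 500
# Tridiagonal matrix with 20 on the main diagonal and -1 on the
# sub- and superdiagonals, as in the paper's first test matrix.
M1 = 20.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

cond = np.linalg.cond(M1)   # 2-norm condition number
print(round(cond, 2))       # close to the 1.22 reported in the paper
```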
We obtained the following results (iteration where the stopping criterion is satisfied) for the M-BiCGSTAB:

M1
s          1   10   20   30   40   50
Iteration  16  16  16  17  18  16
The convergence for s = 1 and s = 50 is as follows:
As we could expect, the convergence is fast whatever the method. This might be due to two different factors. First, the matrix considered is symmetric. Secondly, the problem is very well conditioned. We note that the number of right-hand sides is not important for the M-BiCGSTAB, and the shape of the convergence is the same for s = 1 and s = 50. The TFM-Lanczos/Orthomin gives quite a good result too.
The next matrix is
B
−I
M2 =
−I
B
..
.
..
.
..
.
−I
;
−I
B
where the matrix I is the identity matrix of order 20 and B is the diagonal matrix of order 20 with 4 on its diagonal. The condition number is 2.97.
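A block tridiagonal matrix of this shape can be assembled with Kronecker products; the following NumPy sketch (an illustration, not the paper's code) reproduces the stated condition number:

```python
import numpy as np

p, q = 25, 20          # 25 diagonal blocks of order 20, so n = 500
B = 4.0 * np.eye(q)    # diagonal block with 4 on its diagonal
Iq = np.eye(q)

# S has 1 on its sub- and superdiagonals; kron places the -I blocks
# off the block diagonal and the B blocks on it.
S = np.eye(p, k=1) + np.eye(p, k=-1)
M2 = np.kron(np.eye(p), B) - np.kron(S, Iq)

cond = np.linalg.cond(M2)
print(round(cond, 2))   # close to the 2.97 reported in the paper
```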
We obtained the following results when applying the M-BiCGSTAB:

M2
s          1   10   20   30   40   50
Iteration  40  45  41  41  49  43
and the convergence is as follows:
In this example, we see that the number of right-hand sides does not really affect the convergence of the M-BiCGSTAB. The smallest number of iterations needed is 40 for s = 1, while the largest is 49 for s = 40. The two curves for the M-BiCGSTAB are close again and the convergence is quite good. The TFM-Lanczos/Orthomin gives a better result here.
The last symmetric matrix considered is the diagonal matrix

M3 = diag(1, 2, ..., 500),
used in [7] with size 200. The condition number is obviously 500. Even if this matrix is particular,
let us see what the results are.
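In NumPy (again only an illustration of the test matrix, not the original Matlab), this diagonal matrix and its condition number are immediate:

```python
import numpy as np

# Diagonal matrix diag(1, 2, ..., 500); its 2-norm condition number
# is the ratio of its extreme diagonal entries, 500/1 = 500.
M3 = np.diag(np.arange(1.0, 501.0))
c = np.linalg.cond(M3)
print(c)   # -> 500.0
```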
We obtained the following results with the M-BiCGSTAB:

M3
s          1    10   20   30   40   50
Iteration  189  192  192  198  198  205
and the graph for s = 1 and s = 50 is
The convergence is good, despite a condition number equal to 500 (which is not very big, but not so small either). The two graphs are again very close for the M-BiCGSTAB. We can see that the smallest number of iterations is 189 for s = 1, while the largest is 205 for s = 50. Thus, the number of right-hand sides does not seem to be very important here. The M-BiCGSTAB is better than the TFM-Lanczos/Orthomin for s = 50, but the two methods show the same behaviour.
4.2. Nonsymmetric matrices
As the M-BiCGSTAB and the TFM-Lanczos/Orthomin seemed to be efficient on symmetric matrices, we can wonder whether the same holds for nonsymmetric ones. (We already know that, theoretically, the methods converge.)
The first matrix we considered is the block tridiagonal matrix

M4 = tridiag(-I, B, -I)
where B is the upper bidiagonal matrix with 4 on its diagonal and -2 on its superdiagonal.
The condition number of this matrix is 1.01 × 10^3. We obtained the following results for the M-BiCGSTAB:
M4
s          1    10   20   30   40   50
Iteration  311  304  307  315  311  316
The convergence behaviour is as follows:
Even if the matrix is not well conditioned, we can see that the method gives quite a good convergence. It required 304 iterations for s = 10 and 316 for s = 50. From the graph, we can see that the behaviour of each curve is roughly the same, with stagnation until the 100th iteration for the M-BiCGSTAB. For the TFM-Lanczos/Orthomin, the convergence is much slower and, at iteration 500, the stopping criterion is not reached.
The next matrix used is the banded matrix M5, with 2 on the main diagonal, 1 on the subdiagonal, 0 on the first superdiagonal and 1 on the second superdiagonal,
considered in [4]. The condition number is 2.91 and the results we obtained are, for the
M-BiCGSTAB,
M5
s          1    10   20   30   40   50
Iteration  274  285  284  279  286  294
The convergence behaves as follows:
Even if the condition number is small, the convergence of the M-BiCGSTAB is linear but not fast. This might partly be due to the fact that the matrix is nonsymmetric. However, for the TFM-Lanczos/Orthomin, the convergence for s = 50 is much faster. The numbers of iterations needed by the M-BiCGSTAB for the different numbers of right-hand sides s are very close (from 274 for s = 1 to 294 for s = 50), as we can see in the table.
The last nonsymmetric matrix we considered is the matrix REDHEFF of the MATLAB Test Matrix Toolbox of Higham [9], also considered in [4]. If we write this matrix M6 = (m_{i,j}), then the coefficients satisfy

m(i, j) = 1 if j = 1; 1 if i divides j; 0 otherwise.

Its condition number is 2.41 × 10^3.
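The coefficient rule above translates directly into code; the following Python sketch (the paper used the Toolbox's REDHEFF routine in Matlab, so this helper is only an illustration) builds the matrix with 1-based indices:

```python
def redheffer(n):
    """Redheffer matrix: m(i, j) = 1 if j == 1 or i divides j, else 0 (1-based)."""
    return [[1 if (j == 1 or j % i == 0) else 0 for j in range(1, n + 1)]
            for i in range(1, n + 1)]

for row in redheffer(4):
    print(row)
# [1, 1, 1, 1]
# [1, 1, 0, 1]
# [1, 0, 1, 0]
# [1, 0, 0, 1]
```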
We obtained the following results for the M-BiCGSTAB:

M6
s          1   10   20   30   40   50
Iteration  50  45  47  46  48  46
Then, for s = 1 and s = 50, we obtained the following graph:
Again, even with a badly conditioned matrix, the M-BiCGSTAB behaves quite well for all s, with a smallest iteration number of 45 (s = 10) and a largest one of 50 (s = 1). In this last example, we can see that the two curves are again very close. The TFM-Lanczos/Orthomin reached the stopping criterion in many more iterations for s = 50.
5. Conclusion
Studying the examples, several remarks can be made.
Firstly, the M-Lanczos/Orthomin and the M-Lanczos/Orthodir do not give a good convergence. The TFM-Lanczos/Orthodir does not give a good convergence either. This might be due to the fact that the sequence x_k^(j) is the same for the M-Lanczos/Orthodir and the TFM-Lanczos/Orthodir (the difference is the way it is computed).
Secondly, the M-BiCGSTAB and the TFM-Lanczos/Orthomin gave us better convergence with symmetric matrices, even if nonsymmetric matrices behave well too. From the examples, we can see that neither the M-BiCGSTAB nor the TFM-Lanczos/Orthomin is clearly the better method. From the memory point of view, we can only say that the TFM-Lanczos/Orthomin requires one more vector to be stored, whereas from the computational cost point of view, the M-BiCGSTAB needs many more dot products.
Thirdly, the number of right-hand sides does not seem to be very important in any of the examples considered (except, of course, for the computational cost and memory requirements) for the M-BiCGSTAB.
Fourthly, the behaviour of the M-BiCGSTAB seems to be the same whatever the number of right-hand sides: it only depends on the matrix considered, even though the coefficient a_k is computed from all the different right-hand sides. Thus, the computation of the orthogonal polynomials we consider must be accurate. So, following [1], we may use quasi-orthogonality (in a way, a numerical orthogonality) instead of orthogonality to improve the stability of the algorithms, if it is possible to apply this to the methods.
Finally, even with the minimization property of the M-BiCGSTAB, this property does not seem to be a criterion of better convergence (since the M-BiCGSTAB should then be better than the TFM-Lanczos/Orthomin, but this is not really the case).
From a theoretical point of view, we obtained a matrix and a polynomial interpretation of the methods (depending only on one Hankel matrix), which seems to be new.
The main drawback of the methods, as we can see in the graphs, is that the residuals are not decreasing, as they are in most Lanczos-type algorithms. So we must see whether the methods can be modified to obtain this property while conserving the finite termination property. In particular, we now have to study whether the M-Lanczos/Orthodir and the M-Lanczos/Orthomin can be improved, since they require a very small computational cost. These methods have to be compared with Lanczos-type ones, and particularly with the Global Lanczos process [13]. This is under consideration.
Acknowledgements
I would like to thank V. Simoncini for her helpful suggestions and advice that helped to improve this paper.
References
[1] B. Beckermann, The stable computation of formal orthogonal polynomials, Numer. Algorithms 11 (1996) 1–23.
[2] C. Brezinski, M. Redivo Zaglia, Breakdowns in the computation of orthogonal polynomials, Numer. Algorithms 1
(1991) 207–221.
[3] C. Brezinski, Biorthogonality and its Applications to Numerical Analysis, Marcel Dekker, New York, 1992.
[4] C. Brezinski, M. Redivo Zaglia, Transpose-free Lanczos-type algorithms for nonsymmetric linear systems, Numer. Algorithms, to appear.
[5] C. Brezinski, CGM: a whole class of Lanczos-type solvers for linear systems, Note ANO 253, Laboratoire d’Analyse
Numerique et d’Optimisation, Universite des Sciences et Technologies de Lille, November 1991.
[6] T.F. Chan, L. De Pillis, H.A. van der Vorst, Transpose-free formulations of Lanczos-type methods for nonsymmetric
linear systems, Numer. Algorithms, to appear.
[7] T.F. Chan, W.L. Wan, Analysis of projection methods for solving linear systems with multiple right-hand sides,
SIAM J. Sci. Comput. 18 (1997) 1698–1721.
[8] A. Draux, Polynômes Orthogonaux Formels. Applications, Lecture Notes in Mathematics, vol. 974, Springer, Berlin, 1983.
[9] N.J. Higham, The test matrix Toolbox for Matlab (Version 3.0), Numerical Analysis Report No. 276, Department
of Mathematics, The University of Manchester, September 1995.
[10] C. Lanczos, Solution of systems of linear equations by minimized iterations, J. Res. Natl. Bur. Stand. 49 (1952)
33–53.
[11] D.P. O'Leary, The block conjugate gradient algorithm and related methods, Linear Algebra Appl. 29 (1980) 293–322.
[12] Y. Saad, Iterative Methods for Sparse Linear Systems, PWS, Boston, 1995.
[13] H. Sadok, K. Jbilou, Global Lanczos-type methods with applications, Appl. Linear Algebra, submitted.
[14] V. Simoncini, A stabilized QMR version of block BiCG, SIAM J. Matrix Anal. Appl. 18 (1997) 419– 434.
[15] V. Simoncini, E. Gallopoulos, An iterative method for nonsymmetric systems with multiple right-hand sides, SIAM
J. Sci. Comput. 16 (1995) 917–933.
[16] V. Simoncini, E. Gallopoulos, A hybrid block GMRES method for nonsymmetric systems with multiple right hand
sides, J. Comput. Appl. Math. 66 (1–2) (1996) 457– 469.
[17] V. Simoncini, E. Gallopoulos, Convergence properties of block GMRES for solving systems with multiple right-hand sides, Tech. Rep. 1316, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Oct. 1993.
[18] P. Sonneveld, CGS, a fast Lanczos-type solver for nonsymmetric linear systems, SIAM J. Sci. Stat. Comput. 10
(1989) 35–52.
[19] H.A. van der Vorst, Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric
linear systems, SIAM J. Sci. Stat. Comput. 13 (1992) 631–644.
[20] H.A. van der Vorst, An iterative solution method for solving f(A)x = b, using Krylov subspace information obtained for the symmetric positive definite matrix A, J. Comput. Appl. Math. 18 (1987) 249–263.