Journal of Computational and Applied Mathematics 100 (1998) 207–224
On preconditioned Uzawa methods and SOR methods for
saddle-point problems
Xiaojun Chen ∗
Department of Mathematics and Computer Science, Shimane University, Matsue 690-8504, Japan
Received 20 February 1998
Abstract
This paper studies convergence analysis of a preconditioned inexact Uzawa method for nondifferentiable saddle-point problems. The SOR–Newton method and the SOR–BFGS method are special cases of this method. We relax the Bramble–Pasciak–Vassilev condition on preconditioners for convergence of the inexact Uzawa method for linear saddle-point problems. The relaxed condition is used to determine the relaxation parameters in the SOR–Newton method and the SOR–BFGS method. Furthermore, we study global convergence of the multistep inexact Uzawa method for nondifferentiable saddle-point problems. © 1998 Elsevier Science B.V. All rights reserved.
AMS classification: 65H10
Keywords: Saddle-point problem; Nonsmooth equation; Uzawa method; Precondition; SOR method
1. Introduction
Saddle-point problems arise, for example, in the mixed finite element discretization of the Stokes equations, coupled finite element/boundary element computations for interface problems, and the minimization of a convex function subject to linear constraints [2–7, 10, 12, 21, 23–27]. In this
paper we consider the nonlinear saddle-point problem

H(x, y) ≡ [F(x) + B^T y − p; Bx − G(y) − q] = 0,   (1.1)

where B ∈ ℜ^{m×n}, p ∈ ℜ^n, q ∈ ℜ^m, and F : ℜ^n → ℜ^n is a strongly monotone mapping with modulus μ, i.e.,

(F(x) − F(x̃))^T (x − x̃) ≥ μ‖x − x̃‖²  for x, x̃ ∈ ℜ^n,   (1.2)
E-mail: [email protected]. This work is supported by the Australian Research Council while the author
worked at the School of Mathematics, University of New South Wales.
c 1998 Elsevier Science B.V. All rights reserved.
0377-0427/98/$ – see front matter
PII: S 0 3 7 7 - 0 4 2 7 ( 9 8 ) 0 0 1 9 7 - 6
and G : ℜ^m → ℜ^m is a monotone mapping, i.e.,

(G(y) − G(ỹ))^T (y − ỹ) ≥ 0  for y, ỹ ∈ ℜ^m.   (1.3)
If F and G are symmetric affine functions, problem (1.1) reduces to the linear saddle-point problem [2, 3, 10, 24, 27]

[A B^T; B −C] [x; y] = [p; q],   (1.4)

where A is an n × n symmetric positive-definite matrix and C is an m × m symmetric positive-semidefinite matrix. A version of the preconditioned inexact Uzawa methods for solving (1.4) is

x_{k+1} = x_k + P^{-1}(p − Ax_k − B^T y_k),
y_{k+1} = y_k + Q^{-1}(Bx_{k+1} − Cy_k − q),   (1.5)
where P ∈ ℜ^{n×n} and Q ∈ ℜ^{m×m} are symmetric positive-definite preconditioners [3, 10, 27]. The inexact Uzawa method (1.5) is simple and has minimal computer memory requirements. Furthermore, it involves no inner products in the iteration. These features make the method very well suited for implementation on modern computing architectures. Bramble et al. [3] showed that method (1.5) for solving (1.4) with C = 0 always converges provided that the preconditioners satisfy
0 ≤ ((P − A)x, x) ≤ α(Px, x)  for all x ∈ ℜ^n   (1.6)

and

0 ≤ ((Q − BA^{-1}B^T)y, y) ≤ β(Qy, y)  for all y ∈ ℜ^m,   (1.7)

where α, β ∈ [0, 1).
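As a concrete illustration, iteration (1.5) can be coded in a few lines. The following Python sketch is not part of the original paper: it assumes NumPy, forms P^{-1} and Q^{-1} explicitly only for brevity (in practice one would apply the preconditioner inverses by solves), and takes symmetric positive-definite P and Q as inputs.

import numpy as np

def inexact_uzawa(A, B, C, P, Q, p, q, x, y, iters=100):
    # Iteration (1.5): preconditioned inexact Uzawa for the linear problem (1.4)
    Pinv = np.linalg.inv(P)  # for brevity only; normally solve with P instead
    Qinv = np.linalg.inv(Q)
    for _ in range(iters):
        x = x + Pinv @ (p - A @ x - B.T @ y)   # x-update with preconditioner P
        y = y + Qinv @ (B @ x - C @ y - q)     # y-update with preconditioner Q
    return x, y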
In this paper, we consider the case C ≠ 0 and relax conditions (1.6) and (1.7) to

0 ≤ ((P − A)x, x) ≤ α(Px, x)  or  −α̂(Px, x) ≤ ((P − A)x, x) ≤ 0,   (1.8)

for all x ∈ ℜ^n, and

−β̂(Qy, y) ≤ ((Q − BA^{-1}B^T − C)y, y) ≤ β(Qy, y)  for all y ∈ ℜ^m,   (1.9)

where P − A and Q − BA^{-1}B^T − C are positive semi-definite or negative semi-definite, and α̂ and β̂ are small positive numbers. Furthermore, we use the relaxed Bramble–Pasciak–Vassilev condition to study convergence of the inexact Uzawa method for nonlinear saddle-point problems.
A direct generalization of (1.5) for solving nonlinear saddle-point problems (1.1) is

x_{k+1} = x_k + P_k^{-1}(p − F(x_k) − B^T y_k),
y_{k+1} = y_k + Q_k^{-1}(Bx_{k+1} − G(y_k) − q),   (1.10)

where P_k ∈ ℜ^{n×n} and Q_k ∈ ℜ^{m×m} are positive definite.
Some accelerated Newton-type methods are particular cases of method (1.10).
Example 1 (SOR–Newton method). In this case P_k = (1/ω)F′(x_k) and Q_k = (1/ω)G′(y_k), where ω > 0. The positive-definite property of P_k is guaranteed by the strong monotonicity of F. To ensure the positive-definite property of Q_k, we can use the modification Q_k = (1/ω)(G′(y_k) + εI_m), where ε is a positive number and I_m is the identity matrix in ℜ^{m×m}.
Example 2 (SOR–BFGS method). Let A_0 ∈ ℜ^{n×n} and C_0 ∈ ℜ^{m×m} be arbitrary positive-definite matrices. For k ≥ 0, we define

s_k = x_{k+1} − x_k,  t_k = F(x_{k+1}) − F(x_k),
u_k = y_{k+1} − y_k,  v_k = G(y_{k+1}) − G(y_k).

We set

A_{k+1} = A_k − (A_k s_k s_k^T A_k)/(s_k^T A_k s_k) + (t_k t_k^T)/(t_k^T s_k)   (1.11)

and

C_{k+1} = C_k − (C_k u_k u_k^T C_k)/(u_k^T C_k u_k) + (v_k v_k^T)/(v_k^T u_k)  if v_k^T u_k ≠ 0,
C_{k+1} = C_k  otherwise.   (1.12)

Since F is strongly monotone and G is monotone, t_k^T s_k > 0 and v_k^T u_k ≥ 0 for all k ≥ 0. By the BFGS update rule [11], A_k and C_k are positive definite. Taking P_k = (1/ω)A_k and Q_k = (1/ω)C_k with ω > 0, method (1.10) reduces to the SOR–BFGS method.
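As an illustrative sketch (not from the paper itself), the updates (1.11) and (1.12) translate directly into Python with NumPy; the guard on v^T u follows (1.12).

import numpy as np

def update_A(Ak, s, t):
    # BFGS update (1.11); t^T s > 0 keeps Ak symmetric positive definite
    As = Ak @ s
    return Ak - np.outer(As, As) / (s @ As) + np.outer(t, t) / (t @ s)

def update_C(Ck, u, v):
    # Update (1.12); skipped when v^T u = 0 so Ck stays positive definite
    if v @ u == 0:
        return Ck
    Cu = Ck @ u
    return Ck - np.outer(Cu, Cu) / (u @ Cu) + np.outer(v, v) / (v @ u)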
In this paper we are concerned with the case in which F and/or G are possibly nondifferentiable. Such problems arise from LC¹ convex programming problems [4, 6, 21, 23, 25], nondifferentiable interface problems [5, 12], and some possible extensions of nondifferentiable problems [1, 15, 20]. A globally and superlinearly convergent inexact Uzawa method for solving (1.1) was studied in [5], in which the component x_{k+1} is generated by a nonlinear iterative process. In particular, x_{k+1} satisfies

F(x_{k+1}) + B^T y_k = p + δ_k,   (1.13)

where δ_k is the residual of the approximate solution x_{k+1} to the system F(x) + B^T y_k = p. In this paper we show that the nonlinear process (1.13) can be replaced by a multistep linear process. Precisely, we prove global convergence of the following multistep inexact Uzawa method:

x_{k+1} = x_{k,l_k},  x_{k,0} = x_k,
x_{k,i+1} = x_{k,i} + P_k^{-1}(p − F(x_{k,i}) − B^T y_k),  i = 0, 1, ..., l_k − 1,
y_{k+1} = y_k + Q_k^{-1}(Bx_{k+1} − G(y_k) − q).   (1.14)
This paper is organized as follows. In Section 2 we rewrite the preconditioned inexact Uzawa method (1.10) as a fixed-point method, and generalize local convergence theory [16–18, 28] to nondifferentiable problems. Moreover, we relax the Bramble–Pasciak–Vassilev condition on P and Q for convergence of (1.5). In Section 3 we use the local convergence theory and the relaxed condition to determine the relaxation parameter in the SOR–Newton method and the SOR–BFGS method for the nonsmooth saddle-point problem (1.1). Furthermore, we study global convergence of the multistep inexact Uzawa method (1.14).
Throughout this paper we denote the identity matrices in ℜ^{n×n}, ℜ^{m×m} and ℜ^{(n+m)×(n+m)} by I_n, I_m and I, respectively. The spectral radius of a matrix J is denoted by ρ(J). For simplicity, we use z for the column vector (x^T, y^T)^T and E for the pair (P, Q).
2. A fixed-point method and its preconditioners
Since F is strongly monotone, F has a single-valued inverse operator F^{-1} defined by F^{-1}(v) = {x | v = F(x)}. Furthermore, the inverse operator F^{-1} is also a strongly monotone mapping. Hence system (1.1) is equivalent to

H_2(y) = −BF^{-1}(p − B^T y) + G(y) + q = 0.   (2.1)

By the monotone property of G, we have that for any y, ỹ ∈ ℜ^m, there exists a positive scalar μ̃ such that

(B(F^{-1}(p − B^T ỹ) − F^{-1}(p − B^T y)) + G(y) − G(ỹ))^T (y − ỹ) ≥ μ̃‖B^T (y − ỹ)‖².   (2.2)

If B has full row rank, (2.2) implies that H_2 is a strongly monotone mapping and so system (2.1) has a unique solution y∗ ∈ ℜ^m. Therefore, (1.1) has a unique solution z∗ ∈ ℜ^{n+m}. In the remainder of this paper, we assume that there exists a solution z∗ of (1.1).
Let us denote

Φ_1(z, P) = x + P^{-1}(p − F(x) − B^T y),
Φ_2(z, E) = y + Q^{-1}(B(x + P^{-1}(p − F(x) − B^T y)) − G(y) − q)

and

Φ(z, E) = [Φ_1(z, P); Φ_2(z, E)].

Obviously z∗ is a solution of (1.1) if and only if z∗ = Φ(z∗, E). Furthermore, method (1.10) has the form

z_{k+1} = Φ(z_k, E_k),   (2.3)

which defines a fixed-point method [16].
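For concreteness, one evaluation of Φ is just one sweep of method (1.10). A minimal Python sketch follows (illustrative only, assuming NumPy and user-supplied callables F and G).

import numpy as np

def phi(x, y, F, G, B, P, Q, p, q):
    # Fixed-point map (2.3): returns (Phi_1(z, P), Phi_2(z, E)) for z = (x, y)
    x1 = x + np.linalg.solve(P, p - F(x) - B.T @ y)   # Phi_1
    y1 = y + np.linalg.solve(Q, B @ x1 - G(y) - q)    # Phi_2
    return x1, y1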
Assumption 1. F and G are Lipschitz continuous, i.e., there exist positive numbers γ_1, γ_2 such that

‖F(x) − F(x̃)‖ ≤ γ_1‖x − x̃‖  for x, x̃ ∈ ℜ^n

and

‖G(y) − G(ỹ)‖ ≤ γ_2‖y − ỹ‖  for y, ỹ ∈ ℜ^m.

By the Rademacher theorem, Assumption 1 implies that F and G are differentiable almost everywhere in ℜ^n and ℜ^m, respectively. The generalized Jacobian in the sense of Clarke [8] is defined by

∂F(x) = conv{ lim_{x̃→x} F′(x̃) : F is differentiable at x̃ }

and

∂G(y) = conv{ lim_{ỹ→y} G′(ỹ) : G is differentiable at ỹ }.
By the structure of H and Proposition 2.6.2 in [8], the generalized Jacobian of H at z ∈ ℜ^{n+m} satisfies

∂H(z) ⊆ { [A B^T; B −C] : A ∈ ∂F(x), C ∈ ∂G(y) }.

By the monotone property of F and G, for any z ∈ ℜ^{n+m}, all A ∈ ∂F(x) are positive definite and all C ∈ ∂G(y) are positive semi-definite. Moreover, by Proposition 2.1.2 of [8], Assumption 1 and (1.2) imply that for A ∈ ∂F(x) and C ∈ ∂G(y),

‖A‖ ≤ γ_1,  ‖A^{-1}‖ ≤ μ^{-1},  ‖C‖ ≤ γ_2.

Hence the mapping H is Lipschitz continuous, and there exists κ > 0 such that for any z ∈ ℜ^{n+m}, all J ∈ ∂H(z) satisfy ‖J‖ ≤ κ.
Let ζ̃ be a large positive number and let

D = { E | P ∈ ℜ^{n×n}, Q ∈ ℜ^{m×m} are nonsingular and ‖P^{-1}‖ + ‖Q^{-1}‖ ≤ ζ̃ }.
Lemma 2.1. Suppose that Assumption 1 holds. Then there exists an L̃ > 0 such that

‖Φ(z, E) − Φ(z′, E)‖ ≤ L̃‖z − z′‖   (2.4)

for any z, z′ ∈ ℜ^{n+m} and any E ∈ D.
Proof. By the mean-value theorem (Proposition 2.6.5 in [8]), for any z, z′ ∈ ℜ^{n+m}, there exist A ∈ conv ∂F(xx′) and C ∈ conv ∂G(yy′) such that

F(x) − F(x′) = A(x − x′)   (2.5)

and

G(y) − G(y′) = C(y − y′).   (2.6)

Here conv ∂F(xx′) denotes the convex hull of all points W ∈ ∂F(u) for u in the line segment xx′, and conv ∂G(yy′) is defined similarly.
By the definition of Φ, we have

Φ_1(z, P) − Φ_1(z′, P) = (I_n − P^{-1}A)(x − x′) − P^{-1}B^T(y − y′)   (2.7)

and

Φ_2(z, E) − Φ_2(z′, E) = Q^{-1}B(I_n − P^{-1}A)(x − x′) + (I_m − Q^{-1}(BP^{-1}B^T + C))(y − y′).   (2.8)

By a straightforward calculation, we obtain that

[P 0; B −Q]^{-1} = [P^{-1} 0; Q^{-1}BP^{-1} −Q^{-1}].   (2.9)
Hence from (2.7) and (2.8), we have

Φ(z, E) − Φ(z′, E) = ( I − [P 0; B −Q]^{-1}[A B^T; B −C] )(z − z′).   (2.10)

Since H is Lipschitz continuous and E ∈ D, the matrix on the right-hand side of (2.10) is bounded. Hence there exists an L̃ > 0 such that (2.4) holds.

The following assumption is a key condition to ensure that the inexact Uzawa method (1.10) locally converges.
Assumption 2. There exist nonsingular matrices P∗ ∈ ℜ^{n×n} and Q∗ ∈ ℜ^{m×m} and a constant r∗ ∈ [0, 1) such that

max_{A∗ ∈ ∂F(x∗), C∗ ∈ ∂G(y∗)} ρ( I − [P∗ 0; B −Q∗]^{-1}[A∗ B^T; B −C∗] ) ≤ r∗ < 1.   (2.11)
The Lipschitz continuity of H implies that H is Fréchet differentiable if and only if H is Gâteaux differentiable. Furthermore, if H is strongly differentiable at z∗, then ∂F(x∗) and ∂G(y∗) reduce to singletons [8]. In this case, if we choose P∗ = (1/ω)F′(x∗) and Q∗ = (1/ω)G′(y∗), Assumption 2 reduces to the assumption of the local convergence theorems for the SOR–Newton method [18, 28] and the SOR-secant methods [16, 17]. It is notable that a Lipschitz continuous function H can be strongly differentiable at a single point but fail to be differentiable at arbitrarily close neighbouring points (cf. [19]). Hence Assumption 2 with the strong differentiability of H at z∗ is weaker than the assumptions that H is continuously differentiable in a neighborhood of z∗ and

ρ( I − [P∗ 0; B −Q∗]^{-1}[F′(x∗) B^T; B −G′(y∗)] ) ≤ r∗ < 1

(cf. [16–18, 28]).
Lemma 2.2. Under Assumptions 1 and 2, we have

lim sup_{z→z∗} ‖Φ(z, E∗) − z∗‖ / ‖z − z∗‖ ≤ r̂∗ < 1.   (2.12)
Proof. Let ε ∈ (0, 1 − r∗). By Theorem 2.2.8 in [18], for any A∗ ∈ ∂F(x∗) and any C∗ ∈ ∂G(y∗), there is a norm on ℜ^{n+m} such that

‖ I − [P∗ 0; B −Q∗]^{-1}[A∗ B^T; B −C∗] ‖ ≤ r∗ + ε < 1.

Since ∂F(x∗) and ∂G(y∗) are closed sets, maximizing the norms over ∂F(x∗) × ∂G(y∗) gives

max_{A∗ ∈ ∂F(x∗), C∗ ∈ ∂G(y∗)} ‖ I − [P∗ 0; B −Q∗]^{-1}[A∗ B^T; B −C∗] ‖ ≤ r∗ + ε < 1.
Now, from the mean-value theorem (Proposition 2.6.5 in [8]) and the Carathéodory theorem (Theorem 17.1 in [22]), for any z ∈ ℜ^{n+m} there exist t_i, s_j ∈ [0, 1], λ_i, μ_j ∈ [0, 1], A_i ∈ ∂F(x∗ + t_i(x − x∗)) and C_j ∈ ∂G(y∗ + s_j(y − y∗)), with Σ_{i=1}^{n+1} λ_i = 1 and Σ_{j=1}^{m+1} μ_j = 1, for i = 1, 2, ..., n+1 and j = 1, ..., m+1, such that

F(x) − F(x∗) = Σ_{i=1}^{n+1} λ_i A_i (x − x∗)

and

G(y) − G(y∗) = Σ_{j=1}^{m+1} μ_j C_j (y − y∗).

Let A = Σ_{i=1}^{n+1} λ_i A_i and C = Σ_{j=1}^{m+1} μ_j C_j. By (2.10),

Φ(z, E∗) − z∗ = Φ(z, E∗) − Φ(z∗, E∗) = ( I − [P∗ 0; B −Q∗]^{-1}[A B^T; B −C] )(z − z∗).

Notice that ∂F(x) and ∂G(y) are closed sets at any point z ∈ ℜ^{n+m}. By passing to a subsequence, we can assume that λ_i → λ_i∗, μ_j → μ_j∗, A_i → A_i∗ and C_j → C_j∗ as z → z∗. By the convexity of the generalized Jacobian, we have A∗ = Σ_{i=1}^{n+1} λ_i∗ A_i∗ ∈ ∂F(x∗) and C∗ = Σ_{j=1}^{m+1} μ_j∗ C_j∗ ∈ ∂G(y∗). Hence (2.11) implies (2.12).
Now we give the local convergence theorem for the inexact Uzawa method (1.10).

Theorem 2.1. Suppose that H, P∗ and Q∗ satisfy Assumptions 1 and 2. Then there exist δ_1 > 0 and δ_2 > 0 such that if ‖z_0 − z∗‖ ≤ δ_1, ‖P_k − P∗‖ ≤ δ_2 and ‖Q_k − Q∗‖ ≤ δ_2 for all k ≥ 0, then method (1.10) is well-defined and satisfies

‖z_{k+1} − z∗‖ ≤ r‖z_k − z∗‖,   (2.13)

where r ∈ (r∗, 1).
Assume further that

lim_{k→∞} ‖(P_k − A∗)(x_{k+1} − x_k)‖ / ‖z_{k+1} − z_k‖ = 0   (2.14)

and

lim_{k→∞} ‖(Q_k − C∗)(y_{k+1} − y_k)‖ / ‖z_{k+1} − z_k‖ = 0.   (2.15)

Then

lim sup_{k→∞} ‖z_{k+1} − z∗‖ / ‖z_k − z∗‖ ≤ r∗.   (2.16)
Proof. The first part of Theorem 2.1 is straightforward and follows from Lemma 2.2 and Theorem 3.1 of [16]. The proof of the second part can be given by following the pattern of the proof of Theorem 3.3 of [17].

An important problem remains to be studied: how to choose preconditioners satisfying Assumption 2. Bramble et al. [3] provided a family of preconditioners satisfying Assumption 2 for the linear saddle-point problem (1.4) with C = 0. The following theorem is a generalization of Theorem 1 of [3], which includes the case C ≠ 0 and expands the Bramble–Pasciak–Vassilev family of preconditioners.
Theorem 2.2. Let A ∈ ℜ^{n×n}, P ∈ ℜ^{n×n} and Q ∈ ℜ^{m×m} be symmetric positive definite and C ∈ ℜ^{m×m} be symmetric positive semi-definite. Let

M = I − [P 0; B −Q]^{-1}[A B^T; B −C].   (2.17)

Then ρ(M) < 1 if there exist α, β ∈ [0, 1) and α̂ ∈ (0, 1/3) such that P satisfies

0 ≤ ((P − A)x, x) ≤ α(Px, x)  for all x ∈ ℜ^n   (2.18)

or

−α̂(Px, x) ≤ ((P − A)x, x) ≤ 0  for all x ∈ ℜ^n   (2.19)

and Q satisfies

0 ≤ ((Q − BA^{-1}B^T − C)y, y) ≤ β(Qy, y)  for all y ∈ ℜ^m.   (2.20)

In addition, we assume that there is ω ∈ (0, 2) such that

(Cy, y) ≥ ω(Qy, y)  for all y ∈ ℜ^m.   (2.21)

Then ρ(M) < 1 if P satisfies (2.18) or (2.19) and Q satisfies

−(ω/2)(Qy, y) < ((Q − BA^{-1}B^T − C)y, y) ≤ 0  for all y ∈ ℜ^m.   (2.22)
Proof. By a straightforward calculation, we obtain

M = [−I_n 0; 0 I_m] [P^{-1}A − I_n  P^{-1}B^T; Q^{-1}B(I_n − P^{-1}A)  I_m − Q^{-1}(BP^{-1}B^T + C)] ≡ DV.   (2.23)
Case 1: (2.18) and (2.20) hold.
Obviously, (2.23) implies ρ(M) = 0 if P = A and Q = BA^{-1}B^T + C. Since ρ(M) is a continuous function of M, it suffices to consider the case where ((P − A)x, x) in (2.18) is strictly positive for nonzero vectors x ∈ ℜ^n, i.e., P − A is symmetric positive definite. Then V is symmetric with respect to the inner product induced by the block diagonal matrix diag(P − A, Q). The symmetric property of V implies that all eigenvalues of V are real. Furthermore, we have

ρ(M) ≤ σ(DV) = σ(V) = ρ(V),

where σ(DV) and σ(V) are the largest singular values of DV and V, and the first inequality is from Theorem 3.3.2 in [13].
Hence, to estimate ρ(M), it suffices to bound the positive and negative eigenvalues of V separately. Let λ be an eigenvalue of V, and (x, y) ∈ ℜ^n × ℜ^m be the corresponding eigenvector. Then

(P^{-1}A − I_n)x + P^{-1}B^T y = λx,   (2.24)
Q^{-1}B(I_n − P^{-1}A)x + (I_m − Q^{-1}(BP^{-1}B^T + C))y = λy.   (2.25)

We first provide an upper bound for all positive eigenvalues λ > 0. Eliminating (I_n − P^{-1}A)x in (2.25) by using (2.24) gives

−λQ^{-1}Bx + (I_m − Q^{-1}C)y = λy.   (2.26)

Then, taking an inner product of (2.26) with Qy, we have

(λ − 1)(Qy, y) = −λ(Bx, y) − (Cy, y)
 ≤ −λ(x, B^T y)
 = −λ(x, λPx + (P − A)x)
 ≤ −λ²(Px, x),

where the second equality follows from (2.24), and the second inequality follows from (2.18). If x ≠ 0, this, together with the positive-definite property of P, gives λ < 1. If x = 0, (2.24) implies B^T y = 0 and so (2.26) gives

λ(Qy, y) = ((Q − C)y, y) = ((Q − BA^{-1}B^T − C)y, y) ≤ β(Qy, y),   (2.27)

where the inequality follows from (2.20). Notice that y cannot be zero, since (x, y) is an eigenvector. This provides λ ≤ β. Hence all positive eigenvalues satisfy λ < 1.
Now we provide a lower bound for negative eigenvalues λ < 0. By (2.20), (1 − λ)Q − C is nonsingular for λ < 0. Eliminating y in (2.24) by (2.26) yields

(P^{-1}A − I_n)x + λP^{-1}B^T((1 − λ)Q − C)^{-1}Bx = λx.   (2.28)
Multiplying (2.28) by 1 − λ and taking an inner product with Px gives

λ(B^T(Q − (1/(1 − λ))C)^{-1}Bx, x) = λ(1 − λ)(Px, x) + (1 − λ)((P − A)x, x)
 = −λ²(Px, x) + ((P − A)x, x) + λ(Ax, x)
 ≤ −λ²(Px, x) + α(Px, x) + λ(Ax, x),   (2.29)

where the inequality follows from (2.18).
Let θ = 1/(1 − λ) and u = (Q − θC)^{-1}Bx. Then we have

(B^T(Q − θC)^{-1}Bx, x) = (B^T u, x)² / ((Q − θC)u, u)
 = (A^{1/2}x, A^{-1/2}B^T u)² / ((Q − θC)u, u)
 ≤ (Ax, x)(BA^{-1}B^T u, u) / ((Q − θC)u, u)
 ≤ (Ax, x),   (2.30)

where the last inequality is from (2.20). Since λ < 0, this, together with (2.29), gives

0 ≤ (α − λ²)(Px, x).   (2.31)

If x ≠ 0, then λ ≥ −√α. Moreover, (2.24) and (2.25) imply that if x = 0, then (Q − C)y = λQy. However, by (2.20) and the positive-definite property of Q, y must be zero for λ < 0. Consequently, x cannot be a zero vector. Hence, we obtain −√α ≤ λ < 0. This completes the proof for Case 1.
Case 2: (2.19) and (2.20) hold.
In this case A − P is symmetric positive definite, and M is symmetric in the inner product induced by diag(A − P, Q). Hence all eigenvalues of M are real. Following the pattern of the proof for Case 1, we can show ρ(M) < 1. Here we give a brief proof. Let λ be an eigenvalue of M and (x, y) be the corresponding eigenvector. Then

(I_n − P^{-1}A)x − P^{-1}B^T y = λx,   (2.24)′
Q^{-1}B(I_n − P^{-1}A)x + (I_m − Q^{-1}(BP^{-1}B^T + C))y = λy.   (2.25)′

Eliminating (I_n − P^{-1}A)x in (2.25)′ by (2.24)′ gives

λQ^{-1}Bx + (I_m − Q^{-1}C)y = λy.   (2.26)′

Then, taking an inner product of (2.26)′ with Qy, we have for λ > 0

(λ − 1)(Qy, y) = λ(Bx, y) − (Cy, y)
 ≤ λ(x, B^T y)
 = λ(x, −λPx + (P − A)x)
 ≤ −λ²(Px, x).

Hence all positive eigenvalues of M are strictly less than 1.
Now we provide a lower bound for nonpositive eigenvalues λ ≤ 0. Eliminating y in (2.24)′ by (2.26)′ yields

(I_n − P^{-1}A)x + λP^{-1}B^T((1 − λ)Q − C)^{-1}Bx = λx.   (2.28)′

Multiplying (2.28)′ by 1 − λ and taking an inner product with Px gives

λ(B^T(Q − (1/(1 − λ))C)^{-1}Bx, x) = λ(1 − λ)(Px, x) + (1 − λ)((A − P)x, x)
 = −λ²(Px, x) + ((A − P)x, x) + λ(Px, x) − λ((A − P)x, x)
 ≤ −λ²(Px, x) + α̂(Px, x) + 2λ(Px, x) − λ(Ax, x).

Condition (2.20) ensures that (2.30) holds for Case 2. Hence, we have

0 ≤ (−λ² + α̂)(Px, x) + 2λ((P − A)x, x).

By (2.19), we have

0 ≤ −λ² + α̂ − 2λα̂.

Since α̂ < 1/3, we have λ > −1. Therefore in Case 2, we have ρ(M) < 1.
Case 3: (2.18) and (2.22) hold.
The first part of the proof for Case 1 remains the same in this case and shows that all positive eigenvalues satisfy λ < 1 if x ≠ 0. If x = 0, then from (2.27) and (2.22) we obtain λ(Qy, y) ≤ 0, which cannot hold for λ > 0. Hence in Case 3 we have λ < 1.
Now we give a lower bound for negative eigenvalues λ < 0.
If (1 − λ)Q − C is singular, then there is a nonzero vector v ∈ ℜ^m such that

((1 − λ)Q − C)v = 0.

By (2.22), this implies that

λ(Qv, v) = ((Q − C)v, v) > −(ω/2)(Qv, v) > −(Qv, v),

that is, λ > −1.
Assume that (1 − λ)Q − C is nonsingular. Then (2.28) and (2.29) hold. If

((Q − (1/(1 − λ))C)u, u) ≥ (BA^{-1}B^T u, u),

then (2.31) holds. If x ≠ 0, then λ ≥ −√α. If x = 0, then (2.24), (2.25) and (2.22) provide

λ(Qy, y) = ((Q − C)y, y) > −(ω/2)(Qy, y).

This shows λ > −1.
Now, we consider the case where

(((1 − λ)Q − C)u, u) ≤ (1 − λ)(BA^{-1}B^T u, u).   (2.32)

In this case, (2.21), (2.32) and (2.22) imply

−λω(Qu, u) ≤ −λ(Cu, u) ≤ −(1 − λ)((Q − BA^{-1}B^T − C)u, u) < (ω/2)(1 − λ)(Qu, u).

Hence we have −λω < (ω/2)(1 − λ), and so λ > −1. Therefore, we obtain ρ(M) < 1 for Case 3.
Case 4: (2.19) and (2.22) hold.
Following the proof for Case 2, we can show that all positive eigenvalues satisfy λ < 1. By an argument similar to the proof for Case 3, we can show that all negative eigenvalues satisfy −1 < λ. Hence in Case 4, we have ρ(M) < 1.
This completes the proof.
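The conclusion ρ(M) < 1 is easy to probe numerically. The following sketch is an independent check, not from the paper: it assumes NumPy and builds P and Q so that (2.18) and (2.20) hold by construction with α = β = 0.2, then forms M from (2.17) and prints its spectral radius.

import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 5
G = rng.standard_normal((n, n)); A = G @ G.T + n * np.eye(n)  # SPD A
B = rng.standard_normal((m, n))
C = np.zeros((m, m))                    # semi-definite C is allowed
P = 1.25 * A                            # (P - A) = 0.25 A, so (2.18) holds with alpha = 0.2
S = B @ np.linalg.solve(A, B.T) + C     # B A^{-1} B^T + C
Q = 1.25 * S                            # (2.20) holds with beta = 0.2
K = np.block([[P, np.zeros((n, m))], [B, -Q]])
J = np.block([[A, B.T], [B, -C]])
M = np.eye(n + m) - np.linalg.solve(K, J)
print(max(abs(np.linalg.eigvals(M))))   # spectral radius, < 1 by Theorem 2.2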
Corollary 2.1. Assume that ∂F(x∗) = {A} and ∂G(y∗) = {C} are singletons, and A and C are symmetric positive definite. Let P and Q satisfy the conditions of Theorem 2.2. Then there exist δ_1 > 0 and δ_2 > 0 such that if ‖z_0 − z∗‖ ≤ δ_1, ‖P_k − P‖ ≤ δ_2 and ‖Q_k − Q‖ ≤ δ_2 for all k ≥ 0, then method (1.10) is well defined and locally linearly converges to z∗.

Proof. The result is straightforward, using Theorems 2.1 and 2.2.
3. SOR methods and a multistep Uzawa method

In this section we assume that H is strongly differentiable at the solution z∗, F′(x∗) and G′(y∗) are positive definite, and all elements in ∂F(x) and ∂G(y) for x ∈ ℜ^n and y ∈ ℜ^m are symmetric. We will use Theorems 2.1 and 2.2 to determine the relaxation parameter ω in the SOR–Newton method and the SOR–BFGS method. Furthermore, we study global convergence of the multistep inexact Uzawa method (1.14).
To simplify the notation, we let

R = G′(y∗)^{-1}(BF′(x∗)^{-1}B^T + G′(y∗)).

We use the notation λ_min(R) and λ_max(R) for the minimum and maximum eigenvalues of R, respectively. From the similarity transform involving G′(y∗)^{1/2}, we have λ_max(R) = ρ(R).
Welfert [27] gave a sufficient condition on P, Q and τ for convergence of the following inexact Uzawa method:

x_{k+1} = x_k + τP^{-1}(p − Ax_k − B^T y_k),
y_{k+1} = y_k + τQ^{-1}(Bx_{k+1} − Cy_k − q),   (3.1)

where τ is a positive stepsize. Welfert's condition is

0 < τ < ( 2 / λ_max(Q^{-1}(BA^{-1}B^T + C)) ) min{ 2/λ_max(P^{-1}A) − 1, λ_min(P^{-1}A)/λ_max(P^{-1}A) }.   (3.2)
It is easy to see that neither of the Bramble–Pasciak–Vassilev condition (1.6)–(1.7) and the Welfert condition (3.2) implies the other. Now we use conditions (2.19)–(2.22) and (3.2) to determine the relaxation parameter in the SOR–Newton method and the SOR–BFGS method.
Lemma 3.1. Let A∗ = F′(x∗), P∗ = (1/ω)A∗, C∗ = G′(y∗) and Q∗ = (1/ω)C∗. Then Assumption 2 holds if

0 < ω < (2/ρ(R)) min{ 2/ω − 1, 1 }.   (3.3)

Furthermore, if

1/λ_min(R) ≤ min{ 2/(2ρ(R) − 1), 4/3 },   (3.4)

then Assumption 2 holds if

1/λ_min(R) ≤ ω ≤ min{ 2/(2ρ(R) − 1), 4/3 }.   (3.5)
Proof. Obviously, the Welfert condition (3.2) provides (3.3).
Now we show (3.5) by using Theorem 2.2. It is easy to verify that (2.19) holds if 1 ≤ ω < 4/3. Moreover, (2.22) holds if, for all y ∈ ℜ^m,

(C∗y, y) − ω((BA∗^{-1}B^T + C∗)y, y) ≤ 0   (3.6)

and

−(ω/2)(C∗y, y) < (C∗y, y) − ω((BA∗^{-1}B^T + C∗)y, y) ≤ 0.   (3.7)

Inequality (3.6) requires that ω satisfies

ω ≥ max_{y∈ℜ^m} (C∗y, y) / ((BA∗^{-1}B^T + C∗)y, y)
 = max_{y∈ℜ^m} (y, y) / ((C∗^{-1/2}(BA∗^{-1}B^T + C∗)C∗^{-1/2})y, y)
 = 1 / λ_min(C∗^{-1/2}(BA∗^{-1}B^T + C∗)C∗^{-1/2})
 = 1 / λ_min(R),

where the last equality is from the similarity transform involving C∗^{1/2}.
Similarly, we can show that (3.7) requires

−ω/2 < 1 − ωλ_max(R),

i.e., ω < 2/(2ρ(R) − 1). Summarizing these choices of ω, we obtain (3.5).
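The bounds in (3.3)–(3.5) are computable once the eigenvalues of R are known. A small numerical sketch follows (illustrative only, not from the paper; it assumes NumPy, and A∗ and C∗ are random SPD matrices) that evaluates the interval (3.5).

import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 5
G1 = rng.standard_normal((n, n)); Astar = G1 @ G1.T + n * np.eye(n)
G2 = rng.standard_normal((m, m)); Cstar = G2 @ G2.T + m * np.eye(m)
B = rng.standard_normal((m, n))
R = np.linalg.solve(Cstar, B @ np.linalg.solve(Astar, B.T) + Cstar)
lam = np.linalg.eigvals(R).real          # R is similar to a symmetric matrix
lmin, rho = lam.min(), lam.max()         # lambda_min(R) and rho(R)
lo = 1.0 / lmin                          # lower end of (3.5)
hi = min(2.0 / (2.0 * rho - 1.0), 4.0 / 3.0)
if lo <= hi:                             # compatibility condition (3.4)
    print("admissible omega in [%.4f, %.4f]" % (lo, hi))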
Now, we are ready to give the local convergence theorems for the SOR–Newton method and the SOR–BFGS method.
The SOR–Newton method for nonsmooth saddle-point problems is defined by

x_{k+1} = x_k + ωA_k^{-1}(p − F(x_k) − B^T y_k),
y_{k+1} = y_k + ωC_k^{-1}(Bx_{k+1} − G(y_k) − q),   (3.8)

where A_k ∈ ∂F(x_k) and C_k ∈ ∂G(y_k).
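For illustration, one sweep of (3.8) can be written as follows (a Python sketch, not from the paper; F and G are user-supplied callables, and dF and dG are assumed to return some element of ∂F(x) and ∂G(y)).

import numpy as np

def sor_newton_step(F, G, dF, dG, B, p, q, x, y, omega):
    # One sweep of (3.8); dF(x) and dG(y) are generalized Jacobian elements
    x_new = x + omega * np.linalg.solve(dF(x), p - F(x) - B.T @ y)
    y_new = y + omega * np.linalg.solve(dG(y), B @ x_new - G(y) - q)
    return x_new, y_new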
Theorem 3.1. Under Assumption 1, if ω satisfies the conditions of Lemma 3.1, then the SOR–Newton method (3.8) for the saddle-point problem (1.1) is locally convergent.

Proof. By Lemma 3.1, Assumption 2 holds. Since ∂F(x∗) = {F′(x∗)} and ∂G(y∗) = {G′(y∗)} are singletons, for any given ε > 0 there is a neighborhood N of z∗ such that for any z ∈ N,

‖A − A∗‖ ≤ ε  for all A ∈ ∂F(x)

and

‖C − C∗‖ ≤ ε  for all C ∈ ∂G(y).

Hence, there exist δ_1 and δ_2 such that if ‖z − z∗‖ ≤ δ_1, then ‖A − A∗‖ ≤ δ_2 and ‖C − C∗‖ ≤ δ_2 for all A ∈ ∂F(x) and C ∈ ∂G(y). By Theorem 2.1, the SOR–Newton method (3.8) locally converges.
Theorem 3.2. Under Assumption 1, if ω satisfies the conditions of Lemma 3.1, then the SOR–BFGS method for (1.1) locally converges.

Proof. By Lemma 3.1, Assumption 2 holds. Furthermore, by Theorem 2.1 there exist positive constants δ_1 and δ_2 such that whenever ‖z_0 − z∗‖ ≤ δ_1, ‖A_k − A∗‖ ≤ δ_2 and ‖C_k − C∗‖ ≤ δ_2, the SOR–BFGS method locally converges to z∗. Hence it is sufficient to show that there exist δ̂_1 ∈ (0, δ_1] and δ̂_2 ∈ (0, δ_2] such that if ‖z_0 − z∗‖ ≤ δ̂_1, ‖A_0 − A∗‖ ≤ δ̂_2 and ‖C_0 − C∗‖ ≤ δ̂_2, then ‖z_k − z∗‖ ≤ δ_1, ‖A_k − A∗‖ ≤ δ_2 and ‖C_k − C∗‖ ≤ δ_2 for all k ≥ 0.
Since G′(y∗) is symmetric positive definite, by the Lipschitz continuity, G is strongly monotone in a neighborhood of y∗. Then for all y_k in the neighborhood, v_k^T u_k > 0 and so C_k is updated at every step k. In this case, the SOR–BFGS method satisfies the SOR-secant equations

A_{k+1}s_k = F(x_{k+1}) + B^T y_k − F(x_k) − B^T y_k,
−C_{k+1}u_k = Bx_{k+1} − G(y_{k+1}) − Bx_{k+1} + G(y_k).

According to the results of [9, 16, 17], the strong differentiability of H at z∗, together with Assumption 1, implies that there exist ε_1 ∈ (0, δ_1] and ε_2 ∈ (0, δ_2] such that whenever ‖z − z∗‖ ≤ ε_1, ‖A − A∗‖ ≤ ε_2, ‖C − C∗‖ ≤ ε_2 and ‖Φ(z, E) − z∗‖ ≤ ‖z − z∗‖, we have

‖A − A∗ − (Ass^T A)/(s^T As) + (tt^T)/(t^T s)‖ ≤ ‖A − A∗‖ + c‖z − z∗‖^θ

and

‖C − C∗ − (Cuu^T C)/(u^T Cu) + (vv^T)/(v^T u)‖ ≤ ‖C − C∗‖ + c‖z − z∗‖^θ,

where c ≥ 0, θ ∈ (0, 1], s = Φ_1(z, P) − x, t = F(Φ_1(z, P)) − F(x), u = Φ_2(z, E) − y and v = G(Φ_2(z, E)) − G(y).
Let δ̂_1 ≤ ε_1, r ∈ (0, 1) and δ̂_2 + cδ̂_1^θ/(1 − r^θ) ≤ ε_2. We choose z_0, A_0 and C_0 such that ‖z_0 − z∗‖ ≤ δ̂_1, ‖A_0 − A∗‖ ≤ δ̂_2 and ‖C_0 − C∗‖ ≤ δ̂_2.
Following a standard induction argument (cf. [9, 16, 17]), we can show that for all k ≥ 0, the sequence {z_k, A_k, C_k} satisfies

‖z_{k+1} − z∗‖ ≤ r‖z_k − z∗‖ ≤ ε_1 ≤ δ_1,

‖A_{k+1} − A∗‖ ≤ δ̂_2 + cδ̂_1^θ Σ_{i=0}^{k} r^{iθ} ≤ ε_2 ≤ δ_2

and

‖C_{k+1} − C∗‖ ≤ δ̂_2 + cδ̂_1^θ Σ_{i=0}^{k} r^{iθ} ≤ ε_2 ≤ δ_2.

This completes the proof.
The linear version of the inexact Uzawa method (1.10) involves no inner products, but it only guarantees local convergence. A possible way to obtain global convergence while keeping the linear feature is to use a multistep technique. In what follows, we study global convergence of the multistep Uzawa method (1.14).
We consider the case G(y) ≡ Cy, where C is an m × m symmetric positive definite matrix.
Algorithm 3.1 (Multistep inexact Uzawa algorithm).
If p − F(x_k) − B^T y_k = 0, let x_{k+1} = x_k + ε_kη_k e. Otherwise let x_{k,0} = x_k and let l_k be the minimum nonnegative integer i = 0, 1, 2, ... such that

x_{k,i+1} = x_{k,i} + α_kP_k^{-1}(p − F(x_{k,i}) − B^T y_k)

and

‖F(x_{k,l_k+1}) + B^T y_k − p‖ ≤ ε_k.

Set x_{k+1} = x_{k,l_k+1} and

y_{k+1} = y_k + β_kQ_k^{-1}(Bx_{k+1} − Cy_k − q).

Here e ∈ ℜ^n is the vector with all entries equal to 1, α_k, β_k, ε_k, η_k are positive numbers, and P_k ∈ ℜ^{n×n} and Q_k ∈ ℜ^{m×m} are symmetric positive-definite matrices.
When p − F(x_k) − B^T y_k = 0, we check whether Bx_k − Cy_k − q = 0. If both of them are equal to zero, then (x_k, y_k) is the exact solution of (1.1) and we stop the algorithm. Hence, without loss of generality, we assume that H(z_k) ≠ 0.
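A direct transcription of Algorithm 3.1 might look as follows (a Python sketch under the paper's setting G(y) ≡ Cy, not from the paper itself; the scalars alpha, beta, eps, eta stand for α_k, β_k, ε_k, η_k and must be chosen as in Theorem 3.3 below).

import numpy as np

def multistep_uzawa_step(F, B, C, Pk, p, q, x, y, alpha, beta, eps, eta):
    # One outer step of Algorithm 3.1 with Qk = C
    if np.linalg.norm(p - F(x) - B.T @ y) == 0.0:
        x = x + eps * eta * np.ones_like(x)   # degenerate branch: x_k + eps*eta*e
    else:
        while True:                           # inner sweeps: i = 0, 1, ..., l_k
            x = x + alpha * np.linalg.solve(Pk, p - F(x) - B.T @ y)
            if np.linalg.norm(F(x) + B.T @ y - p) <= eps:
                break
    y = y + beta * np.linalg.solve(C, B @ x - C @ y - q)
    return x, y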
Theorem 3.3. Suppose that Assumption 1 holds. Let Q_k ≡ C, η_k ≤ 1/(γ_1‖e‖), β_k ≡ β ≤ min(1, μ/(1 + ‖C^{-1}‖‖B‖²)), 0 < ε_k ≤ η_k‖F(x_k) − F(x_{k−1})‖ for k ≥ 1, and α_k < 2/(γ_1‖P_k^{-1}‖). Then from any (x_0, y_0) ∈ ℜ^{n+m} the sequence {(x_k, y_k)} generated by Algorithm 3.1 converges to the unique solution (x∗, y∗) of (1.1).
Proof. Since F is strongly monotone and C is positive definite, (1.1) has a unique solution z∗ = (x∗, y∗).
Now we show that Algorithm 3.1 is well defined. Assume ε_k > 0.
If F(x_k) + B^T y_k − p = 0, then

‖F(x_k + ε_kη_k e) + B^T y_k − p‖ ≤ ‖F(x_k) + B^T y_k − p‖ + γ_1ε_kη_k‖e‖ ≤ ε_k,

where the first inequality follows from Assumption 1.
Assume that F(x_k) + B^T y_k − p ≠ 0. Let

φ(x) ≡ F(x) + B^T y_k − p.

Following the proof of the symmetry principle theorem in [18] and using the mean-value theorem for nonsmooth functions in [8], we can show that φ is a gradient mapping of

g(x) = ∫_0^1 (x − x_k)^T φ(x_k + t(x − x_k)) dt,

and

g(x_{k,i+1}) − g(x_{k,i}) = ∫_0^1 −α_k(P_k^{-1}φ(x_{k,i}))^T φ(x_{k,i} − sα_kP_k^{-1}φ(x_{k,i})) ds
 = ∫_0^1 −α_k(P_k^{-1}φ(x_{k,i}))^T (φ(x_{k,i}) − sα_kÃ(s)P_k^{-1}φ(x_{k,i})) ds
 ≤ −α_kφ(x_{k,i})^T P_k^{-1}φ(x_{k,i}) + (1/2)α_k²γ_1‖P_k^{-1}‖φ(x_{k,i})^T P_k^{-1}φ(x_{k,i}),

where Ã(s) ∈ conv ∂F(x̃) and x̃ is in the line segment between x_{k,i} − sα_kP_k^{-1}φ(x_{k,i}) and x_{k,i}. By Assumption 1, ‖Ã(s)‖ ≤ γ_1.
The strongly monotone property of F implies that g is a strongly convex function and the level sets of g are bounded. Moreover, the solution x_k∗ of φ(x) = 0 is the unique minimizer of g.
Let ν_k = α_k − (α_k²/2)γ_1‖P_k^{-1}‖. Then for α_k ∈ (0, 2/(γ_1‖P_k^{-1}‖)), we have

Σ_{i=0}^{∞} (P_k^{-1}φ(x_{k,i}), φ(x_{k,i})) ≤ (1/ν_k)(g(x_{k,0}) − g(x_k∗)) < ∞.

Since P_k is symmetric positive definite and {φ(x_{k,i})^T P_k^{-1}φ(x_{k,i})}_{i≥0} is a monotonically decreasing sequence, we have

lim_{i→∞} φ(x_{k,i}) = 0.

Since ε_k > 0, this implies that there is a finite number l_k such that ‖φ(x_{k,l_k})‖ ≤ ε_k.
Moreover, from ‖F(x_{k+1}) − F(x_k)‖ ≥ μ‖x_{k+1} − x_k‖ > 0, we can choose ε_{k+1} > 0. Therefore, Algorithm 3.1 is well defined.
By Theorem 2.1 of [5], {(x_k, y_k)} converges to (x∗, y∗).
Acknowledgements
The author is grateful to T. Yamamoto for his helpful comments.
References
[1] O. Axelsson, E. Kaporin, On the solution of nonlinear equations for nondifferentiable mappings, Preprint, Department of Mathematics, University of Nijmegen, Nijmegen, 1994.
[2] R.E. Bank, B.D. Welfert, H. Yserentant, A class of iterative methods for solving saddle-point problems, Numer. Math. 56 (1990) 645–666.
[3] J. Bramble, J. Pasciak, A. Vassilev, Analysis of the inexact Uzawa algorithm for saddle-point problems, SIAM J. Numer. Anal. 34 (1997) 1072–1092.
[4] X. Chen, Convergence of the BFGS method for LC¹ convex constrained optimization, SIAM J. Control Optim. 34 (1996) 2051–2063.
[5] X. Chen, Global and superlinear convergence of inexact Uzawa methods for saddle-point problems with nondifferentiable mappings, SIAM J. Numer. Anal. 35 (1998) 1130–1148.
[6] X. Chen, R. Womersley, A parallel inexact Newton method for stochastic programs with recourse, Ann. Oper. Res. 64 (1996) 113–141.
[7] P.G. Ciarlet, Introduction to Numerical Linear Algebra and Optimization, Cambridge University Press, Cambridge, 1989.
[8] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
[9] J.E. Dennis, J.J. Moré, A characterization of superlinear convergence and its application to quasi-Newton methods, Math. Comp. 28 (1974) 549–560.
[10] H.C. Elman, G.H. Golub, Inexact and preconditioned Uzawa algorithms for saddle-point problems, SIAM J. Numer. Anal. 31 (1994) 1645–1661.
[11] R. Fletcher, Practical Methods of Optimization, 2nd ed., Wiley, Chichester, 1987.
[12] S.A. Funken, E.P. Stephan, Fast solvers for nonlinear FEM–BEM equations, Preprint, Institut für Angewandte Mathematik, University of Hannover, Hannover, 1995.
[13] R.A. Horn, C.R. Johnson, Topics in Matrix Analysis, Cambridge University Press, New York, 1991.
[14] J. Janssen, S. Vandewalle, On SOR waveform relaxation methods, SIAM J. Numer. Anal. 34 (1997) 2456–2481.
[15] M. Heinkenschloss, C.T. Kelley, H.T. Tran, Fast algorithms for nonsmooth compact fixed point problems, SIAM J. Numer. Anal. 29 (1992) 1769–1792.
[16] J.M. Martínez, Fixed-point quasi-Newton methods, SIAM J. Numer. Anal. 29 (1992) 1413–1434.
[17] J.M. Martínez, SOR-secant methods, SIAM J. Numer. Anal. 31 (1994) 217–226.
[18] J.M. Ortega, W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[19] J.S. Pang, Newton's method for B-differentiable equations, Math. Oper. Res. 15 (1990) 311–341.
[20] J.S. Pang, L. Qi, Nonsmooth equations: motivation and algorithms, SIAM J. Optim. 3 (1993) 443–465.
[21] L. Qi, Superlinearly convergent approximate Newton methods for LC¹ optimization problems, Math. Programming 64 (1994) 277–294.
[22] R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, 1970.
[23] R.T. Rockafellar, R.J.-B. Wets, Linear-quadratic problems with stochastic penalties: the finite generation algorithm, in: V.I. Arkin, A. Shiraev, R.J.-B. Wets (Eds.), Stochastic Optimization, Lecture Notes in Control and Information Sciences, vol. 81, Springer, Berlin, 1987, pp. 545–560.
[24] T. Rusten, R. Winther, A preconditioned iterative method for saddle-point problems, SIAM J. Matrix Anal. Appl. 13 (1992) 887–904.
[25] P. Tseng, Applications of a splitting algorithm to decomposition in convex programming and variational inequalities, SIAM J. Control Optim. 29 (1991) 119–138.
[26] A.J. Wathen, E.P. Stephan, Convergence of preconditioned minimum residual iteration for coupled finite element/boundary element computations, Math. Res. Report AM-94-03, University of Bristol, 1994.
[27] B.D. Welfert, Convergence of inexact Uzawa algorithms for saddle-point problems, Preprint, Department of Mathematics, Arizona State University, 1995.
[28] T. Yamamoto, On nonlinear SOR-like methods I: Applications to simultaneous methods for polynomial zeros, Japan J. Indust. Appl. Math. 14 (1997) 87–97.
On preconditioned Uzawa methods and SOR methods for
saddle-point problems
Xiaojun Chen ∗
Department of Mathematics and Computer Science, Shimane University, Matsue 690-8504, Japan
Received 20 February 1998
Abstract
This paper studies convergence analysis of a preconditioned inexact Uzawa method for nondierentiable saddle-point
problems. The SOR-Newton method and the SOR-BFGS method are special cases of this method. We relax the Bramble–
Pasciak–Vassilev condition on preconditioners for convergence of the inexact Uzawa method for linear saddle-point problems. The relaxed condition is used to determine the relaxation parameters in the SOR–Newton method and the SOR–
BFGS method. Furthermore, we study global convergence of the multistep inexact Uzawa method for nondierentiable
c 1998 Elsevier Science B.V. All rights reserved.
saddle-point problems.
AMS classication: 65H10
Keywords: Saddle-point problem; Nonsmooth equation; Uzawa method; Precondition; SOR method
1. Introduction
Saddle-point problems arise, for example, in the mixed nite element discretization of the Stokes
equations, coupled nite element/boundary element computations for interface problems, and the
minimization of a convex function subject to linear constraints [2–7, 10, 12, 21, 23–27] In this
paper we consider the nonlinear saddle-point problem
H (x; y) ≡
F (x ) + B T y − p
Bx − G (y) − q
= 0;
(1.1)
where B ∈ Rm×n , p ∈ ℜn , q ∈ ℜm , F : ℜn → ℜn is a strongly monotone mapping with modulus ,
i.e.,
(F (x) − F (x˜))T (x − x˜)¿kx − xk
˜ 2;
for x; x˜ ∈ ℜn
∗
(1.2)
E-mail: [email protected]. This work is supported by the Australian Research Council while the author
worked at the School of Mathematics, University of New South Wales.
c 1998 Elsevier Science B.V. All rights reserved.
0377-0427/98/$ – see front matter
PII: S 0 3 7 7 - 0 4 2 7 ( 9 8 ) 0 0 1 9 7 - 6
208
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
and G : ℜm → ℜm is a monotone mapping, i.e.,
(G (y) − G (y˜ ))T (y − y˜ )¿0;
for y; y˜ ∈ ℜm
(1.3)
If F and G are symmetric ane functions, problem (1.1) reduces to the linear saddle problem [2,
3, 10, 24, 27]
A BT
B −C
x
y
=
p
q
(1.4)
;
where A is an n × n symmetric positive-denite matrix and C is an m × m symmetric positive-semidenite matrix. A version of preconditioned inexact Uzawa methods for solving (1.4) is
xk+1 = xk + P −1 (p − Axk − BT yk );
yk+1 = yk + Q−1 (Bxk+1 − Cyk − q);
(1.5)
where P ∈ ℜn×n and Q ∈ ℜm×m are symmetric positive-denite preconditioners [3, 10, 27]. This
inexact Uzawa method (1.5) is simple and has minimal computer memory requirements. Furthermore,
it has no inner products involved in the iteration. These features make this method very well suited
for implementation on modern computing architectures. Bramble, et al. [3] showed that method (1.5)
for solving (1.4) with C = 0 always converges provided that the preconditioners satisfy
for all x ∈ ℜn
06((P − A)x; x)6(Px; x)
(1.6)
and
06((Q − BA−1 BT )y; y)6
(Qy; y)
for all y ∈ ℜm ;
(1.7)
where ;
∈ [0; 1):
In this paper, we consider the case C 6= 0 and relax conditions (1.6) and (1.7) to
06((P − A)x; x)6(Px; x)
or
− ˆ(Px; x)6((P − A)x; x)60;
(1.8)
for all x ∈ ℜn , and
−
ˆ(Qy; y)6((Q − BA−1 BT − C )y; y)6
(Qy; y)
for all y ∈ Rm ;
(1.9)
where P − A and Q − BA−1 BT − C are positive semi-denite or negative semi-denite, and ˆ and
ˆ are small positive numbers. Furthermore, we use the relaxed Bramble–Pasciak–Vassilev condition
to study convergence of the inexact Uzawa method for nonlinear saddle-point problems.
A direct generalization of (1.5) for solving nonlinear saddle-point problems (1.1) is
xk+1 = xk + Pk−1 (p − F (xk ) − BT yk );
yk+1 = yk + Qk−1 (Bxk+1 − G (yk ) − q);
(1.10)
where Pk ∈ ℜn×n and Qk ∈ ℜm×m are positive denite.
Some accelerated Newton-type methods are particular cases of method (1.10).
Example 1 (SOR–Newton method). In this case Pk = !1 F ′ (xk ) and Qk = !1 G ′ (yk ); where ! ¿ 0.
The positive-denite property of Pk is guaranteed by the strong monotonicity of F . To ensure the
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
209
positive-denite property of Qk , we can use a modication Qk = !1 (G ′ (yk )+ Im ) where is a positive
number and Im is the identity matrix in ℜm×m .
Example 2 (SOR-BFGS method). Let A0 ∈ ℜn×n and C0 ∈ ℜm×m be arbitrary positive denite
matrices. For k¿0, we dene
sk = xk+1 − xk ; tk = F (xk+1 ) − F (xk );
uk = yk+1 − yk ; vk = G (yk+1 ) − G (yk ):
We set
Ak+1 = Ak −
tk tkT
Ak sk skT Ak
+
;
skT Ak sk
tkT sk
(1.11)
and
Ck+1 =
Ck −
Ck
vk v T
Ck uk ukT Ck
+ T k if vkT uk 6= 0
T
uk Ck uk
vk uk
otherwise.
(1.12)
Since F is strongly monotone and G is monotone, tkT sk ¿ 0 and vkT uk ¿0 for all k¿0. By the BFGS
update rule [11], Ak and Ck are positive denite. Taking Pk = !1 Ak and Qk = !1 Ck with ! ¿ 0, method
(1.10) reduces to the SOR–BFGS method.
In this paper we are concerned with the case in which F and/or G are possibly nondierentiable.
Such problems arise from LC 1 convex programming problems [4, 6, 21, 23, 25], nondierentiable
interface problems [5, 12], and some possible extension of nondierentiable problems [1, 15, 20]. A
globally and superlinearly convergent inexact Uzawa method for solving 1.1 was studied in [5], in
which the component xk+1 is generated by a nonlinear iterative process. In particular, xk+1 satises
F (xk+1 ) + BT yk = p + k ;
(1.13)
where k is the residual of the approximation solution xk+1 to the system F (x) + BT yk = p. In
this paper we show that the nonlinear version (1.13) can be replaced by a multistep linear process.
Precisely, we prove global convergence of the following multistep inexact Uzawa method:
xk+1 = xk;lk ; xk;0 = xk ;
xk; i+1 = xk; i + Pk−1 (p − F (xk; i ) − BT yk ); i = 0; 1; : : : lk − 1;
yk+1 = yk + Qk−1 (Bxk+1 − G (yk ) − q):
(1.14)
This paper is organized as follows. In Section 2 we rewrite the preconditioned inexact Uzawa
method (1.10) as a xed-point method, and generalize local convergence theory [16–18, 28] to
nondierentiable problems. Moreover, we relax the Bramble–Pasciak–Vassilev condition on P and
Q for convergence of (1.5). In Section 3 we use the local convergence theory and the relaxed
condition to determine the relaxation parameter in the SOR–Newton method and the SOR–BFGS
method for the nonsmooth saddle-point problem (1.1). Furthermore, we study global convergence
of the multistep inexact Uzawa method (1.14).
Throughout this paper we denote the identity matrices in ℜn×n , ℜm×m and ℜ(n+m)×(n+m) by In , Im
and I , respectively. The spectral radius of a matrix J is denoted by (J ). For simplicity, we use z
for the column vector (xT ; yT )T and E for the matrix (P; Q).
210
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
2. A xed-point method and its preconditioners
Since F is strongly monotone, F has a single valued inverse operator F −1 dened by F −1 (v) =
{x | v = F (x)}: Furthermore, the inverse operator F −1 is also a strongly monotone mapping. Hence
system (1.1) is equivalent to
H2 (y) = −BF −1 (p − BT y) + G (y) + q = 0:
(2.1)
By the monotone property of G , we have that for any y; y˜ ∈ ℜm , there exists a positive scalar ˜
such that
˜ T (y − y˜ )k2 :
(B(F −1 (p − BT y˜ ) − F −1 (p − BT y)) + G (y) − G (y˜ ))T (y − y˜ )¿kB
(2.2)
If B has full row rank, (2.2) implies that H2 is a strongly monotone mapping and so system (2.1)
has a unique solution y∗ ∈ ℜm . Therefore, (1.1) has a unique solution z∗ ∈ ℜn+m . In the remainder
of this paper, we assume that there exists a solution z∗ of (1.1).
Let us denote
1 (z; P ) = x + P −1 (p − F (x) − BT y);
2 (z; E ) = y + Q−1 (B(x + P −1 (p − F (x) − BT y)) − G (y) − q)
and
(z; E ) =
1 (z; P )
2 (z; E )
:
Obviously z∗ is a solution of (1.1) if and only if z∗ = (z∗ ; E ): Furthermore, method (1.10) has
the form
zk+1 = (zk ; Ek );
(2.3)
which denes a xed-point method [16].
Assumption 1. F and G are Lipschitz continuous, i.e., there exist positive numbers ; such that
kF (x) − F (x˜)k6kx − xk
˜
for
x; x˜ ∈ ℜn
and
kG (y) − G (y˜ )k6ky − yk
˜
for
y; y˜ ∈ ℜm :
By the Rademacher theorem, Assumption 1 implies that F and G are dierentiable almost everywhere in ℜn and ℜm , respectively. The generalized Jacobian in the sense of Clarke [8] is dened
by
@F (x) = conv{lim F ′ (x˜);
x→x
˜
F is dierentiable at x}
˜
and
@G (y) = conv{ lim G ′ (y˜ );
y→y
˜
G is dierentiable at y}:
˜
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
211
By the structure of H and Proposition 2.6.2 in [8], the generalized Jacobian of H at z ∈ ℜn+m
satises
@H (z ) ⊆
A BT
B −C
A ∈ @F (x); C ∈ @G (y) :
;
By the monotone property of F and G , for any z ∈ ℜn+m , all A ∈ @F (x) are positive denite and
all C ∈ @G (y) are positive semi-denite. Moreover, by Proposition 2.1.2 of [8], Assumption 1 and
(1.2) imply that for A ∈ @F (x) and C ∈ @G (y),
kAk6;
kA−1 k6−1 ;
kCk6:
Hence the mapping H is Lipschitz continuous, and there exists
all J ∈ @H (z ) satisfy kJ k6 .
Let ˜ be a large positive number and let
¿ 0 such that for any z ∈ ℜn+m ,
D = {E | P ∈ ℜn×n ; Q ∈ ℜm×m are nonsingular and kP −1 k + kQ−1 k6 ˜ }:
Lemma 2.1. Suppose that Assumption 1 holds. Then there exists a L˜ ¿ 0 such that
˜ − z′ k
k(z; E ) − (z ′ ; E )k6Lkz
(2.4)
for any z; z ′ ∈ ℜn+m and any E ∈ D.
Proof. By the mean-value theorem (Proposition 2.6.5 in [8]), for any z; z ′ ∈ ℜn+m , there exist
A ∈ conv @F (xx′ ) and C ∈ conv @G (yy′ ) such that
F (x) − F (x′ ) = A(x − x′ )
(2.5)
G (y) − G (y′ ) = C (y − y′ ):
(2.6)
and
Here conv@F (xx′ ) denotes the convex hull of all points W ∈ @F (u) for u in the line segment xx′ ,
and conv@G (yy′ ) denotes similarly.
By the denition of , we have
1 (z; P ) − 1 (z ′ ; P ) = (In − P −1 A)(x − x′ ) − P −1 BT (y − y′ )
(2.7)
and
2 (z; E ) − 2 (z ′ ; E ) = Q−1 B(In − P −1 A)(x − x′ )
+(Im − Q−1 (BP −1 BT + C ))(y − y′ ):
(2.8)
By a straightforward calculation, we obtain that
P
B −Q
−1
=
P −1
−1
Q BP −1 −Q−1
:
(2.9)
212
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
Hence from (2.7) and (2.8), we have
′
(z; E ) − (z ; E ) =
I−
P
B −Q
−1
A BT
B −C
!
(z − z ′ ) :
(2.10)
Since H is Lipschitz continuous and E ∈ D, the matrix after the above equality is bounded. Hence
there exists a L˜ ¿ 0 such that (2.4) holds.
The following assumption is a key condition to ensure that the inexact Uzawa method (1.10)
locally converges.
Assumption 2. There exist nonsingular matrices P∗ ∈ ℜn×n and Q∗ ∈ ℜm×m and a constant r∗ ∈
[0; 1); such that
maximize
A∗ ∈@F(x∗ );C∗ ∈@G(y∗ )
I−
P∗
B −Q∗
−1
A∗ BT
B −C∗
!
6r∗ ¡ 1:
(2.11)
The Lipschitz continuity of H implies that H is Frechet dierentiable if and only if H is Gˆateaux
dierentiable. Furthermore, if H is strongly dierentiable at z∗ , then @F (x∗ ) and @G (y∗ ) reduce
to singletons [8]. In this case, if we choose P∗ = 1=!F ′ (x∗ ) and Q∗ = 1=!G ′ (y∗ ), Assumption 2
reduces to the assumption of local convergence theorem for the SOR–Newton method [18, 28] and
the SOR-secant methods [16, 17]. It is notable that a Lipschitz continuous function H can be strongly
dierentiable at a single point but can fail to be dierentiable at arbitrarily close neighbouring points
(cf. [19]). Hence Assumption 2 with the strong dierentiability of H at z ∗ is weaker than assumptions
that H is continuously dierentiable in a neighborhood of z ∗ and
I−
P∗
B −Q∗
−1
F ′ (x∗ )
BT
B −G ′ (y∗ )
!
6r∗ ¡ 1
(cf. [16–18, 28]).
Lemma 2.2. Under Assumptions 1 and 2; we have
lim∗
z→z
k(z; E∗ ) − z∗ k
6rˆ∗ ¡ 1:
kz − z∗ k
(2.12)
Proof. Let ∈ (0; 1 − r∗ ). By Theorem 2.2.8 in [18], for any A∗ ∈ @F (x∗ ) and any C∗ ∈ @G (y∗ ),
there is a norm on ℜn+m such that
−1
P∗
A∗ B T
6r∗ + ¡ 1:
I −
B −Q∗
B −C∗
Since @F (x∗ ) and @G (y∗ ) are closed sets, maximizing the norms over @F (x∗ ) × @G (y∗ ) gives
−1
P∗
A∗ B T
maximize
6r∗ + ¡ 1:
I −
B −Q∗
B −C∗
A∗ ∈@F(x∗ );C∗ ∈@G(y∗ )
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
213
Now, from the mean-value theorem (Proposition 2.6.5 in [8]) and the Caratheodory theorem
(Theorem 17.1 in [22]) for any z ∈ ℜn+m , there exist i ; j ; i ; j ∈ [0; 1]; Ai ∈ @F (x∗ + i (x−x∗ )); Cj ∈
P
Pm+1
@G (y∗ + j (y − y∗ )); n+1
i=1 i = 1;
j=1 j = 1; for i = 1; 2; :::; n + 1; j = 1; :::; m + 1 such that
F (x) − F (x∗ ) =
n+1
X
i=1
i A i ( x − x∗ )
and
G (y) − G (y∗ ) =
Let A =
Pn+1
i=1
m+1
X
j=1
j Cj (y − y∗ ):
i Ai and C =
Pm+1
j=1
j Cj : By (2.10),
(z; E∗ ) − z∗ = (z; E∗ ) − (z∗ ; E∗ )
= I−
P∗
B −Q∗
−1
A BT
B −C
!
(z − z∗ ):
Notice that @F (x) and @G (y) are closed sets at any point z ∈ ℜn+m . By passing to a subsequence,
we can assume that i → i∗ ; j → ∗j ; Ai → A∗i and Cj → Cj∗ as z → z∗ . By the convexity of the
P
Pm+1 ∗ ∗
∗ ∗
∗
∗
generalized Jacobian, we have A∗ = n+1
i=1 i Ai ∈ @F (x ) and C∗ =
j=1 j Cj ∈ @G (y ): Hence
(2.11) implies (2.12).
Now we give the local convergence theorem for the inexact Uzawa method (1.10).
Theorem 2.1. Suppose that H , P∗ and Q∗ satisfy Assumptions 1 and 2. Then there exist 1 ¿ 0;
2 ¿ 0 such that if kz0 − z∗ k61 ; kPk − P∗ k62 and kQk − Q∗ k62 for all k¿0, then method
(1:10) is well-dened and satises
kzk+1 − z∗ k6rkzk − z∗ k;
(2.13)
where r ∈ (r∗ ; 1):
Assume further that
lim
k(Pk − A∗ )(xk+1 − xk )k
=0
kzk+1 − zk k
(2.14)
lim
k(Qk − C∗ )(yk+1 − yk )k
= 0:
kzk+1 − zk k
(2.15)
k→∞
and
k→∞
Then
kzk+1 − z∗ k
6r∗ :
k→∞ kzk − z∗ k
lim
(2.16)
214
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
Proof. The rst part of Theorem 2.1 is straightforward and follows from Lemma 2.2 and Theorem
3.1 of [16]. The proof for the second part can be given by following the pattern of the proof of
Theorem 3.3 of [17].
An important problem remains to be studied: how to choose the preconditioners satisfying Assumption 2. Bramble et al. [3] provided a family of preconditioners satisfying Assumption 2 for the
linear saddle-point problem (1.4) with C = 0. The following theorem is a generalization of Theorem 1 of [3], which includes the case C 6= 0, and expands the Bramble–Pasciak–Vassilev family of
preconditioners.
Theorem 2.2. Let A ∈ ℜn×n ; P ∈ ℜn×n ; Q ∈ ℜm×m be symmetric positive denite and C ∈ ℜm×m be
symmetric positive semi-denite. Let
M =I−
P
B −Q
−1
A BT
B −C
(2.17)
:
Then (M ) ¡ 1; if there exist
; ∈ [0; 1) and ˆ ∈ (0; 1= 3) such that P satises
06((P − A)x; x)6(Px; x)
for all x ∈ ℜn
(2.18)
or
− ˆ(Px; x)6((P − A)x; x)60
for all x ∈ ℜn
(2.19)
and Q satises
06((Q − BA−1 BT − C )y; y)6
(Qy; y)
for all y ∈ ℜm .
(2.20)
In addition, we assume that there is ! ∈ (0; 2) such that
(Cy; y)¿!(Qy; y) for all y ∈ ℜm .
(2.21)
Then (M ) ¡ 1; if P satises (2.18) or (2.19) and Q satises
−
!
(Qy; y) ¡ ((Q − BA−1 BT − C )y; y)60
2
for all y ∈ ℜm :
(2.22)
Proof. By a straightforward calculation, we obtain
M=
−In
≡ DV:
Im
P −1 A − In
P −1 BT
−1
−1
−1
Q B(In − P A) Im − Q (BP −1 BT + C )
(2.23)
Case 1: (2.18) and (2.20) hold.
Obviously, (2.23) implies (M ) = 0, if P = A and Q = BA−1 BT + C . Since (M ) is a continuous
function of M , it suces to consider the case where the term in (2.18) is strictly positive for nonzero
vectors x ∈ ℜn , i.e., P − A is symmetric positive denite. Then V is symmetric with respect to the
inner product induced by the block diagonal matrix diag(P − A; Q). The symmetric property of V
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
215
implies that all eigenvalues of V are real. Furthermore, we have
(M ) 6 (DV )
= (V )
= (V );
where (DV ) and (V ) are the largest singular values of DV and V , and the rst inequality is from
Theorem 3.3.2 in [13].
Hence to estimate (M ), it suces to bound the positive and negative eigenvalues of V , separately.
Let be an eigenvalue of V , and (x; y) ∈ ℜn × ℜm be the corresponding eigenvector. Then
(P −1 A − In )x + P −1 BT y = x;
(2.24)
Q−1 B(In − P −1 A)x + (Im − Q−1 (BP −1 BT + C ))y = y:
(2.25)
We rst provide an upper bound for all positive eigenvalues ¿ 0.
Eliminating (In − P −1 A)x in (2.25) by using (2.24) gives
− Q−1 Bx + (Im − Q−1 C )y = y:
(2.26)
Then, taking an inner product of (2.26) with Qy, we have
( − 1)(Qy; y) = −(Bx; y) − (Cy; y)
6 −(x; BT y)
= −(x; Px + (P − A)x)
6 −2 (Px; x);
where the second equality follows from (2.24), and the second inequality follows from (2.18). If
x 6= 0, this, together with the positive-denite property of P , gives ¡ 1. If x = 0, (2.24) implies
BT y = 0 and so (2.26) gives
(Qy; y) = ((Q − C )y; y)
= ((Q − BA−1 BT − C )y; y)
(2.27)
6
(Qy; y);
where the inequality follows from (2.20). Notice that y cannot be zero, since (x; y) is an eigenvector.
This provides 6
. Hence all positive eigenvalues satisfy ¡ 1.
Now we provide a lower bound for negative eigenvalues ¡ 0:
By (2.20), ((1 − )Q − C ) is nonsingular for ¡ 0.
Eliminating y in (2.24) by (2.26) yields
(P −1 A − In )x + P −1 BT ((1 − )Q − C )−1 Bx = x:
(2.28)
216
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
Multiplying (2.28) by 1 − and taking an inner product with Px gives
(BT (Q −
1
C )−1 Bx; x)
1−
= (1 − )((P − A)x; x) + (1 − )(Px; x)
= −2 (Px; x) + ((P − A)x; x) + (Ax; x)
6 − 2 (Px; x) + (Px; x) + (Ax; x);
(2.29)
where the inequality follows from (2.18).
Let = 1= (1 − ) and u = (Q − C )−1 Bx. Then we have
(BT (Q − C )−1 Bx; x) =
(BT u; x)2
((Q − C )u; u)
=
(A1=2 x; A−1=2 BT u)2
((Q − C )u; u)
6
(Ax; x)(BA−1 BT u; u)
((Q − C )u; u)
6 (Ax; x);
(2.30)
where the last inequality is from (2.20). Hence this, together with (2.29), gives
06( − 2 )(Px; x):
(2.31)
√
If x =
6 0, then ¿ − . Moreover (2.24) and (2.25) imply that if x = 0, then (Q − C )y = Qy.
However, by (2.20), the positive-denite property of Q implies
√ that y must be zero for ¡ 0.
Consequently, x cannot be a zero vector. Hence, we obtain − 6 ¡ 0: This completes the proof
for Case 1.
Case 2: (2.19) and (2.20) hold.
In this case A−P is symmetric positive denite, and M is symmetric in the inner product induced
by diag(A − P; Q). Hence all eigenvalues of M are real. Following the pattern of the proof for Case
1, we can show (M ) ¡ 1. Here we give a brief proof. Let be an eigenvalue of M and (x; y) be
the corresponding eigenvector. Then
(In − P −1 A)x − P −1 BT y = x;
(2:24)′
Q−1 B(In − P −1 A)x + (Im − Q−1 (BP −1 BT + C ))y = y:
(2:25)′
Eliminating (In − P −1 A)x in (2.25)′ by (2.24)′ gives
Q−1 Bx + (Im − Q−1 C )y = y:
Then, taking an inner product of (2.26)′ with Qy, we have for ¿ 0
( − 1)(Qy; y) = (Bx; y) − (Cy; y)
6 (x; BT y)
(2:26)′
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
217
= (x; −Px + (P − A)x)
6 −2 (Px; x):
Hence all positive eigenvalues of M are strictly less than 1.
Now we provide a lower bound for nonpositive eigenvalues 60:
Eliminating y in (2.24)′ by (2.26)′ yields
(In − P −1 A)x + P −1 BT ((1 − )Q − C )−1 Bx = x:
(2:28)′
Multiplying (2.28)′ by 1 − and taking an inner product with Px gives
1
C )−1 Bx; x)
1−
= (1 − )((A − P )x; x) + (1 − )(Px; x)
(BT (Q −
= −2 (Px; x) + ((A − P )x; x) + (Px; x) − ((A − P )x; x)
6 − 2 (Px; x) + ˆ(Px; x) + 2(Px; x) − (Ax; x):
Condition (2.20) ensures (2.30) holds for Case 2. Hence, we have
06(−2 + ˆ)(Px; x) + 2((P − A)x; x):
By (2.19), we have
ˆ
06 − 2 + ˆ − 2:
Since ˆ ¡ 13 , we have ¿ −1: Therefore in Case 2, we have (M ) ¡ 1.
Case 3: (2.18) and (2.22) hold.
The rst part of the proof for Case 1 remains same for this case to show that all positive
eigenvalues satisfy 1 ¿ if x 6= 0. If x = 0, then from (2.27) and (2.22), we obtain (Qy; y)60.
This cannot hold for ¿ 0. Hence in Case 3, we have ¡ 1:
Now we give a lower bound for negative eigenvalues ¡ 0.
If Q − 1= (1 − )C is singular then there is a nonzero vector v ∈ ℜm such that
((1 − )Q − C )v = 0:
By (2.22), this implies that
(Qv; v) = ((Q − C )v; v)¿ −
!
(Qv; v) ¿ −(Qv; v);
2
that is, ¿ −1.
Assume that Q − 1= (1 − )C is nonsingular. Then (2.28) and (2.29) hold. If
1
C u; u ¿(BA−1 BT u; u);
Q−
1−
√
then (2.31) holds. If x 6= 0, then ¿ − . If x = 0, then (2.24), (2.25) and (2.22) provide
!
(Qy; y) = ((Q − C )y; y) ¿ − (Qy; y):
2
This shows ¿ −1:
218
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
Now, we consider the case where
(((1 − )Q − C )u; u)6(1 − )(BA−1 BT u; u):
(2.32)
In this case, (2.21), (2.32) and (2.22) imply
!(Qu; u)¿(Cu; u)¿(1 − )((Q − BA−1 BT − C )u; u)¿ −
!
(1 − )(Qu; u):
2
Hence we have ! + != 2(1 − ) ¿ 0, and so ¿ −1: Therefore, we obtain (M ) ¡ 1 for Case 3.
Case 4: (2.19) and (2.22) hold.
Following the proof for Case 2, we can show that all positive eigenvalues satisfy ¡ 1; By the
similar argument in the proof for Case 3, we can show all negative eigenvalues satisfy −1 ¡ :
Hence in Case 4, we have (M ) ¡ 1:
This completes the proof.
Corollary 2.1. Assume that @F (x∗ ) = A and @G (y∗ ) = C are singletons; and symmetric positive
denite. Let P and Q satisfy conditions of Theorem 2:2: Then there exist 1 ¿0; and 2 ¿0 such
that if kz −z0 k61 ; kPk −Pk62 and kQk −Qk62 for all k¿0; then method (1.10) is well dened
and locally linearly converges to z ∗ .
Proof. The result is straightforward, using Theorems 2.1 and 2.2.
3. SOR methods and a multistep Uzawa method
In this section we assume that H is strongly dierentiable at the solution z ∗ , F ′ (x∗ ) and G ′ (y∗ ) are
positive denite, and all elements in @F (x) and @G (y) for x ∈ ℜn and y ∈ ℜm are symmetric. We
will use Theorem 2.1 and Theorem 2.2 to determine the relaxation parameter ! in the SOR–Newton
method and the SOR–BFGS method. Furthermore, we study global convergence of the multistep
inexact Uzawa method (1.14).
To simplify the notation, we let
R = G ′ (y∗ )−1 (BF ′ (x∗ )−1 BT + G ′ (y∗ )):
We use the notation min (R) and max (R) for the minimum and maximum eigenvalues of R, respectively. From the similarity transform involving G ′ (y∗ )1=2 , we have max (R) = (R):
Welfert [27] gave a sucient condition on P , Q and for convergence of the following inexact
Uzawa method:
xk+1 = xk + P −1 (p − Axk − BT yk );
yk+1 = yk + Q−1 (Bxk+1 − Cyk − q);
(3.1)
where is a positive stepsize. Welfert’s condition is
0¡¡
2
(Q−1 (BA−1 BT + C ))
min
(
2
min (P −1 A)
−
1
;
max (P −1 A)
max (P −1 A)
)
:
(3.2)
X. Chen / Journal of Computational and Applied Mathematics 100 (1998) 207–224
219
It is easy to see that neither of the Bramble–Pasciak–Vassilev condition (1.6)–(1.7) and the
Welfert condition (3.2) implies other. Now we use conditions (2.19)–(2.22) and (3.2) to determine
the relaxation parameter in the SOR–Newton method and the SOR–BFGS method.
Lemma 3.1. Let A∗ = F ′ (x∗ ), P∗ = 1=!A∗ ; C∗ = G ′ (y∗ ) and Q∗ = 1=!C∗ : Then Assumption 2
holds; if
2
2
min
0¡!¡
− 1; 1 :
( R )
!
(3.3)
2
4
1
6 min
;
;
min (R)
2(R) − 1 3
(3.4)
Furthermore; if
then Assumption 2 holds if
2
4
1
6!6 min
;
:
min (R)
2(R) − 1 3
(3.5)
Proof. Obviously, the Welfert condition (3.2) provides (3.3).
Now we show (3.5) by using Theorem 2.2. It is easy to verify that (2.19) holds if $1 \le \omega < 4/3$. Moreover, (2.22) holds if, for all $y \in \Re^m$,
$$(C_*y,\ y) - \omega((BA_*^{-1}B^T + C_*)y,\ y) \le 0 \tag{3.6}$$
and
$$-\frac{\omega}{2}(C_*y,\ y) < (C_*y,\ y) - \omega((BA_*^{-1}B^T + C_*)y,\ y) \le 0. \tag{3.7}$$
Inequality (3.6) requires that $\omega$ satisfies
$$\omega \ge \max_{y\in\Re^m}\frac{(C_*y,\ y)}{((BA_*^{-1}B^T + C_*)y,\ y)} = \max_{y\in\Re^m}\frac{(y,\ y)}{(C_*^{-1/2}(BA_*^{-1}B^T + C_*)C_*^{-1/2}y,\ y)} = \frac{1}{\lambda_{\min}(C_*^{-1/2}(BA_*^{-1}B^T + C_*)C_*^{-1/2})} = \frac{1}{\lambda_{\min}(R)},$$
where the last equality is from the similarity transform involving $C_*^{1/2}$.
Similarly, we can show that (3.7) requires
$$-\frac{\omega}{2} < 1 - \omega\lambda_{\max}(R),$$
i.e., $\omega < 2/(2\rho(R) - 1)$.
Summarizing these choices of $\omega$, we obtain (3.5).
Now, we are ready to give the local convergence theorem for the SOR–Newton method and the
SOR–BFGS method.
The SOR–Newton method for nonsmooth saddle-point problems is defined by
$$x_{k+1} = x_k + \omega A_k^{-1}(p - F(x_k) - B^Ty_k),$$
$$y_{k+1} = y_k + \omega C_k^{-1}(Bx_{k+1} - G(y_k) - q), \tag{3.8}$$
where $A_k \in \partial F(x_k)$ and $C_k \in \partial G(y_k)$.
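Iteration (3.8) translates directly into code. The sketch below is ours, with assumed callables jac_F and jac_G returning elements $A_k \in \partial F(x_k)$ and $C_k \in \partial G(y_k)$, and a residual-based stopping test added for practicality:

```python
import numpy as np

def sor_newton(F, G, jac_F, jac_G, B, p, q, x, y, omega,
               tol=1e-10, max_iter=500):
    """A sketch of the SOR-Newton iteration (3.8).

    jac_F(x) and jac_G(y) return some A in dF(x) and C in dG(y);
    for smooth F and G these are the ordinary Jacobians.
    """
    for _ in range(max_iter):
        A_k = jac_F(x)                                 # A_k in dF(x_k)
        x = x + omega * np.linalg.solve(A_k, p - F(x) - B.T @ y)
        C_k = jac_G(y)                                 # C_k in dG(y_k)
        y = y + omega * np.linalg.solve(C_k, B @ x - G(y) - q)
        if max(np.linalg.norm(p - F(x) - B.T @ y),
               np.linalg.norm(B @ x - G(y) - q)) <= tol:
            break
    return x, y
```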
Theorem 3.1. Under Assumption 1, if $\omega$ satisfies the condition of Lemma 3.1, then the SOR–Newton method (3.8) for the saddle-point problem (1.1) is locally convergent.
Proof. By Lemma 3.1, Assumption 2 holds. Since $\partial F(x^*)$ and $\partial G(y^*)$ are the singletons $\{F'(x^*)\}$ and $\{G'(y^*)\}$, for any given $\epsilon > 0$ there is a neighborhood $N$ of $z^*$ such that for any $z \in N$,
$$\|A - A_*\| \le \epsilon \quad \text{for all } A \in \partial F(x)$$
and
$$\|C - C_*\| \le \epsilon \quad \text{for all } C \in \partial G(y).$$
Hence, there exist $\epsilon_1$ and $\epsilon_2$ such that if $\|z - z^*\| \le \epsilon_1$, then $\|A - A_*\| \le \epsilon_2$ and $\|C - C_*\| \le \epsilon_2$ for all $A \in \partial F(x)$ and $C \in \partial G(y)$. By Theorem 2.1, the SOR–Newton method (3.8) converges locally.
Theorem 3.2. Under Assumption 1, if $\omega$ satisfies the condition of Lemma 3.1, then the SOR–BFGS method for (1.1) converges locally.
Proof. By Lemma 3.1, Assumption 2 holds. Furthermore, by Theorem 2.1 there exist positive constants $\epsilon_1$ and $\epsilon_2$ such that whenever $\|z_0 - z^*\| \le \epsilon_1$, $\|A_k - A_*\| \le \epsilon_2$ and $\|C_k - C_*\| \le \epsilon_2$, the SOR–BFGS method converges locally to $z^*$. Hence it is sufficient to show that there exist $\hat\epsilon_1 \in (0, \epsilon_1]$ and $\hat\epsilon_2 \in (0, \epsilon_2]$ such that if $\|z_0 - z^*\| \le \hat\epsilon_1$, $\|A_0 - A_*\| \le \hat\epsilon_2$ and $\|C_0 - C_*\| \le \hat\epsilon_2$, then $\|z_k - z^*\| \le \epsilon_1$, $\|A_k - A_*\| \le \epsilon_2$ and $\|C_k - C_*\| \le \epsilon_2$ for all $k \ge 0$.
Since $G'(y^*)$ is symmetric positive definite, by Lipschitz continuity $G$ is strongly monotone in a neighborhood of $y^*$. Then for all $y_k$ in this neighborhood, $v_k^Tu_k > 0$ and so $C_k$ is updated at every step $k$. In this case, the SOR–BFGS method satisfies the SOR-secant equations
$$A_{k+1}s_k = F(x_{k+1}) + B^Ty_k - F(x_k) - B^Ty_k,$$
$$-C_{k+1}u_k = Bx_{k+1} - G(y_{k+1}) - Bx_{k+1} + G(y_k).$$
According to the results of [9, 16, 17], the strong differentiability of $H$ at $z^*$, together with Assumption 1, implies that there exist $\bar\epsilon_1 \in (0, \epsilon_1]$ and $\bar\epsilon_2 \in (0, \epsilon_2]$ such that whenever $\|z - z^*\| \le \bar\epsilon_1$, $\|A - A_*\| \le \bar\epsilon_2$, $\|C - C_*\| \le \bar\epsilon_2$ and $\|\Phi(z, E) - z^*\| \le \|z - z^*\|$, we have
$$\left\|A - A_* - \frac{Ass^TA}{s^TAs} + \frac{tt^T}{t^Ts}\right\| \le \|A - A_*\| + \alpha\|z - z^*\|^{\nu}$$
and
$$\left\|C - C_* - \frac{Cuu^TC}{u^TCu} + \frac{vv^T}{u^Tv}\right\| \le \|C - C_*\| + \alpha\|z - z^*\|^{\nu},$$
where $\alpha > 0$, $\nu \in (0, 1]$, $s = \Phi_1(z, E) - x$, $t = F(\Phi_1(z, E)) - F(x)$, $u = \Phi_2(z, E) - y$ and $v = G(\Phi_2(z, E)) - G(y)$.
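For reference, the corrections appearing in these two estimates are ordinary BFGS updates; a minimal sketch for the $A$-block follows (the $C$-block is analogous, with $u$ and $v$ in place of $s$ and $t$):

```python
import numpy as np

def bfgs_update(A, s, t):
    """One BFGS update of A along step s and yield t, as in the bound above.

    Assumes s.T @ A @ s > 0 and t.T @ s > 0, which is guaranteed near the
    solution by the local strong monotonicity used in this proof.
    """
    As = A @ s
    return A - np.outer(As, As) / (s @ As) + np.outer(t, t) / (t @ s)
```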
Let $\hat\epsilon_1 \le \bar\epsilon_1$, $r \in (0, 1)$ and $\hat\epsilon_2 + \alpha\hat\epsilon_1^{\nu}/(1 - r^{\nu}) \le \bar\epsilon_2$. We choose $z_0$, $A_0$ and $C_0$ such that $\|z_0 - z^*\| \le \hat\epsilon_1$, $\|A_0 - A_*\| \le \hat\epsilon_2$ and $\|C_0 - C_*\| \le \hat\epsilon_2$.
Following a standard induction argument (cf. [9, 16, 17]), we can show that for all $k \ge 0$ the sequence $\{(z_k, A_k, C_k)\}$ satisfies
$$\|z_{k+1} - z^*\| \le r\|z_k - z^*\| \le \bar\epsilon_1,$$
$$\|A_{k+1} - A_*\| \le \hat\epsilon_2 + \alpha\hat\epsilon_1^{\nu}\sum_{i=0}^{k} r^{\nu i} \le \bar\epsilon_2 \le \epsilon_2$$
and
$$\|C_{k+1} - C_*\| \le \hat\epsilon_2 + \alpha\hat\epsilon_1^{\nu}\sum_{i=0}^{k} r^{\nu i} \le \bar\epsilon_2 \le \epsilon_2.$$
This completes the proof.
The linear version of the inexact Uzawa method (1.10) involves no inner products, but it guarantees only local convergence. A possible way to obtain global convergence while keeping this linear feature is to use a multistep technique. In what follows, we study global convergence of the multistep Uzawa method (1.14).
We consider the case $G(y) \equiv Cy$, where $C$ is an $m \times m$ symmetric positive definite matrix.
Algorithm 3.1 (Multistep inexact Uzawa algorithm).
If $p - F(x_k) - B^Ty_k = 0$, let $x_{k+1} = x_k + \gamma_k\epsilon_k e$. Otherwise let $x_{k,0} = x_k$ and let $l_k$ be the minimum nonnegative integer of $i = 0, 1, 2, \ldots$ such that
$$x_{k,i+1} = x_{k,i} + \alpha_k P_k^{-1}(p - F(x_{k,i}) - B^Ty_k)$$
and
$$\|F(x_{k,l_k+1}) + B^Ty_k - p\| \le \epsilon_k.$$
Set $x_{k+1} = x_{k,l_k+1}$ and
$$y_{k+1} = y_k + \beta_k Q_k^{-1}(Bx_{k+1} - Cy_k - q).$$
Here $e \in \Re^n$ with all entries equal to 1, $\alpha_k$, $\beta_k$, $\gamma_k$, $\epsilon_k$ are positive numbers, and $P_k \in \Re^{n\times n}$ and $Q_k \in \Re^{m\times m}$ are symmetric positive-definite matrices.
When $p - F(x_k) - B^Ty_k = 0$, we check whether $Bx_k - Cy_k - q = 0$. If both are zero, then $(x_k, y_k)$ is the exact solution of (1.1) and we stop the algorithm. Hence, without loss of generality, we assume that $H(z_k) \ne 0$.
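The following is a minimal sketch of Algorithm 3.1 under the simplifying choices of Theorem 3.3 below ($Q_k \equiv C$, constant parameters and a fixed preconditioner $P$); the parameter names follow our reconstruction of the algorithm and are assumptions, not necessarily the paper's notation:

```python
import numpy as np

def multistep_uzawa(F, B, C, p, q, x, y, P, alpha, beta, gamma,
                    eps=1e-2, outer_iters=100, tol=1e-10):
    """Sketch of the multistep inexact Uzawa iteration (Algorithm 3.1)."""
    e = np.ones_like(x)
    F_old = F(x)
    for _ in range(outer_iters):
        if np.linalg.norm(p - F(x) - B.T @ y) == 0.0:
            x = x + gamma * eps * e            # perturbation step
        else:
            while True:                        # inner sweeps on the x-block
                x = x + alpha * np.linalg.solve(P, p - F(x) - B.T @ y)
                if np.linalg.norm(F(x) + B.T @ y - p) <= eps:
                    break
        y = y + beta * np.linalg.solve(C, B @ x - C @ y - q)
        if max(np.linalg.norm(p - F(x) - B.T @ y),
               np.linalg.norm(B @ x - C @ y - q)) <= tol:
            break
        # Tolerance update in the spirit of Theorem 3.3; strong monotonicity
        # of F keeps this positive whenever x has moved.
        eps = gamma * np.linalg.norm(F(x) - F_old)
        F_old = F(x)
    return x, y
```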
Theorem 3.3. Suppose that Assumption 1 holds. Let $Q_k \equiv C$, $\gamma_k \le 1/(\kappa\|e\|)$, $\beta_k \equiv \beta \le \min(1,\ \mu/(1 + \|C^{-1}\|\|B\|^2))$, $0 < \epsilon_k \le \gamma_k\|F(x_k) - F(x_{k-1})\|$, $\alpha_k \le 2/(\kappa\|P_k^{-1}\|)$ and $\epsilon_k \le \mu/\kappa$, where $\kappa$ and $\mu$ are the Lipschitz constant and the strong monotonicity modulus of $F$. Then from any $(x_0, y_0) \in \Re^{n+m}$ the sequence $\{(x_k, y_k)\}$ generated by Algorithm 3.1 converges to the unique solution $(x^*, y^*)$ of (1.1).
Proof. Since $F$ is strongly monotone and $C$ is positive definite, (1.1) has a unique solution $z^* = (x^*, y^*)$.
Now we show that Algorithm 3.1 is well defined. Assume $\epsilon_k > 0$.
If $F(x_k) + B^Ty_k - p = 0$, then
$$\|F(x_k + \gamma_k\epsilon_k e) + B^Ty_k - p\| \le \|F(x_k) + B^Ty_k - p\| + \kappa\gamma_k\epsilon_k\|e\| \le \epsilon_k,$$
where the first inequality follows from Assumption 1.
Assume now that $F(x_k) + B^Ty_k - p \ne 0$. Let
$$\phi(x) \equiv F(x) + B^Ty_k - p.$$
Following the proof of the symmetry principle theorem in [18] and using the mean value theorem for nonsmooth functions in [8], we can show that $\phi$ is a gradient mapping of
$$g(x) = \int_0^1 (x - x_k)^T\phi(x_k + t(x - x_k))\,\mathrm{d}t,$$
and
$$g(x_{k,i+1}) - g(x_{k,i}) = \int_0^1 -\alpha_k(P_k^{-1}\phi(x_{k,i}))^T\phi(x_{k,i} - s\alpha_k P_k^{-1}\phi(x_{k,i}))\,\mathrm{d}s$$
$$= \int_0^1 -\alpha_k(P_k^{-1}\phi(x_{k,i}))^T(\phi(x_{k,i}) - s\alpha_k\tilde A(\xi)P_k^{-1}\phi(x_{k,i}))\,\mathrm{d}s$$
$$\le -\alpha_k\phi(x_{k,i})^TP_k^{-1}\phi(x_{k,i}) + \tfrac{1}{2}\kappa\alpha_k^2\|P_k^{-1}\|\,\phi(x_{k,i})^TP_k^{-1}\phi(x_{k,i}),$$
where $\tilde A(\xi) \in \mathrm{conv}\,\partial F(\tilde x)$ and $\tilde x$ is in the line segment between $x_{k,i} - s\alpha_k P_k^{-1}\phi(x_{k,i})$ and $x_{k,i}$. By Assumption 1, $\|\tilde A(\xi)\| \le \kappa$.
The strong monotonicity of $F$ implies that $g$ is a strongly convex function and that the level sets of $g$ are bounded. Moreover, the solution $x_k^*$ of $\phi(x) = 0$ is the unique minimizer of $g$.
Let $\sigma = \alpha_k - \tfrac{1}{2}\kappa\alpha_k^2\|P_k^{-1}\|$. Then for $\alpha_k \in (0,\ 2/(\kappa\|P_k^{-1}\|))$, we have
$$\sum_{i=0}^{\infty}(P_k^{-1}\phi(x_{k,i}),\ \phi(x_{k,i})) \le \frac{1}{\sigma}(g(x_{k,0}) - g(x_k^*)) < \infty.$$
Since $P_k$ is symmetric positive definite and $\{\phi(x_{k,i})^TP_k^{-1}\phi(x_{k,i})\}_{i\ge 0}$ is a monotonically decreasing sequence, we have
$$\lim_{i\to\infty}\phi(x_{k,i}) = 0.$$
Since $\epsilon_k > 0$, this implies that there is a finite number $l_k$ such that $\|\phi(x_{k,l_k})\| \le \epsilon_k$.
Moreover, from $\|F(x_{k+1}) - F(x_k)\| \ge \mu\|x_{k+1} - x_k\| > 0$, we can choose $\epsilon_{k+1} > 0$.
Therefore, Algorithm 3.1 is well defined.
By Theorem 2.1 of [5], $\{(x_k, y_k)\}$ converges to $(x^*, y^*)$.
Acknowledgements
The author is grateful to T. Yamamoto for his helpful comments.
References
[1] O. Axelsson, E. Kaporin, On the solution of nonlinear equations for nondifferentiable mapping, Preprint, Department
of Mathematics, University of Nijmegen, Nijmegen, 1994.
[2] R.E. Bank, B.D. Welfert, H. Yserentant, A class of iterative methods for solving saddle-point problems, Numer.
Math. 56 (1990) 645–666.
[3] J. Bramble, J. Pasciak, A. Vassilev, Analysis of the inexact Uzawa algorithm for saddle-point problems, SIAM
J. Numer. Anal. 34 (1997) 1072–1092.
[4] X. Chen, Convergence of the BFGS method for LC¹ convex constrained optimization, SIAM J. Control Optim. 34
(1996) 2051–2063.
[5] X. Chen, Global and superlinear convergence of inexact Uzawa methods for saddle-point problems with
nondifferentiable mappings, SIAM J. Numer. Anal. 35 (1998) 1130–1148.
[6] X. Chen, R. Womersley, A parallel inexact Newton method for stochastic programs with recourse, Ann. Oper.
Res. 64 (1996) 113–141.
[7] P.G. Ciarlet, Introduction to Numerical Linear Algebra and Optimization, Cambridge University Press, Cambridge,
1989.
[8] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
[9] J.E. Dennis, J.J. Moré, A characterization of superlinear convergence and its application to quasi-Newton
methods, Math. Comp. 28 (1974) 549–560.
[10] H.C. Elman, G.H. Golub, Inexact and preconditioned Uzawa algorithms for saddle-point problems, SIAM J. Numer.
Anal. 31 (1994) 1645–1661.
[11] R. Fletcher, Practical Methods of Optimization, 2nd ed., Wiley, Chichester, 1987.
[12] S.A. Funken, E.P. Stephan, Fast solvers for nonlinear FEM–BEM equations, Preprint, Institut für Angewandte
Mathematik, University of Hannover, Hannover, 1995.
[13] R.A. Horn, C.R. Johnson, Topics in Matrix Analysis, Cambridge University Press, New York, 1991.
[14] J. Janssen, S. Vandewalle, On SOR waveform relaxation methods, SIAM J. Numer. Anal. 34 (1997) 2456–2481.
[15] M. Heinkenschloss, C.T. Kelley, H.T. Tran, Fast algorithms for nonsmooth compact fixed point problems, SIAM J.
Numer. Anal. 29 (1992) 1769–1792.
[16] J.M. Martínez, Fixed-point quasi-Newton methods, SIAM J. Numer. Anal. 29 (1992) 1413–1434.
[17] J.M. Martínez, SOR-secant methods, SIAM J. Numer. Anal. 31 (1994) 217–226.
[18] J.M. Ortega, W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New
York, 1970.
[19] J.S. Pang, Newton’s method for B-differentiable equations, Math. Oper. Res. 15 (1990) 311–341.
[20] J.S. Pang, L. Qi, Nonsmooth equations: motivation and algorithms, SIAM J. Optim. 3 (1993) 443–465.
[21] L. Qi, Superlinearly convergent approximate Newton methods for LC¹ optimization problems, Math. Programming 64
(1994) 277–294.
[22] R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, 1970.
[23] R.T. Rockafellar, R.J.-B. Wets, Linear-quadratic problems with stochastic penalties: the finite generation algorithm,
in: V.I. Arkin, A. Shiryaev and R.J.-B. Wets (eds.), Stochastic Optimization, Lecture Notes in Control and Information
Sciences, vol. 81, Springer, Berlin, 1987, pp. 545–560.
[24] T. Rusten, R. Winther, A preconditioned iterative method for saddle-point problems, SIAM J. Matrix Anal. Appl.
13 (1992) 887–904.
[25] P. Tseng, Applications of a splitting algorithm to decomposition in convex programming and variational inequalities,
SIAM J. Control Optim. 29 (1991) 119–138.
[26] A.J. Wathen, E.P. Stephan, Convergence of preconditioned minimum residual iteration for coupled finite
element/boundary element computations, Math. Res. Report, University of Bristol, AM-94-03, 1994.
[27] B.D. Welfert, Convergence of inexact Uzawa algorithms for saddle-point problems, Preprint, Department of
Mathematics, Arizona State University, 1995.
[28] T. Yamamoto, On nonlinear SOR-like methods, I: Applications to simultaneous methods for polynomial zeros, Japan
J. Indust. Appl. Math. 14 (1997) 87–97.