
where

$F_i(X, Y) = \frac{\hat{l}_i}{l_i v_i} - \frac{\hat{l}_{i+1}}{l_{i+1} v_{i+1}}, \quad i = 2, \ldots, 2n. \qquad (13)$

Since, in a 2D homogeneous medium, $\hat{l}_i / l_i = \partial l_i / \partial x_i$ and $\partial v_i / \partial x_i = 0$, we can establish the following result from the nonlinear system of Eqs. (12) and (13).

Theorem 2.1. In a 2D homogeneous isotropic medium, solving Snell's nonlinear system (12) and (13) is equivalent to forcing the gradient of the travel time function given by Eqs. (7)-(9) to be equal to zero.

Therefore, the first-order necessary conditions for the travel time function given by Eq. (6) are identical to Snell's nonlinear system of Eq. (12). Thus, Newton's method applied to Eq. (6) will generate iterates identical to those generated by Newton's method applied to Snell's nonlinear system of Eq. (12). Moreover, in a 2D homogeneous medium, finding a ray path with Snell's equations in order to obtain an initial iterate for solving problem (6) by an optimization scheme seems redundant. Therefore, an optimization technique for solving problem (6) that does not require a close initial ray path to converge, and that requires very inexpensive computations, will avoid the redundancy of using different approaches to solve the problem and will also guarantee fast convergence. In a 3D homogeneous medium, we can only establish that the first-order necessary conditions associated with the travel time function (6) imply that Snell's nonlinear system of equations is satisfied.

Based on the conclusions of the above paragraph, we propose to solve the unconstrained nonlinear optimization problem (6) with the global SG method (Raydan, 1993, 1997). This low storage optimization technique has recently been used in inversion tomography (Castillo et al., 2000), obtaining many computational advantages.
This low storage technique presents four advantages: it requires low computational storage; it requires few floating point operations compared with other optimization techniques; second-order information (the Hessian of the travel time function) is not needed; and it is a global technique, in the sense that it converges from any initial ray trajectory $(X_0, Y_0)$, which is in sharp contrast with the method proposed by Mao and Stuart (1997). A brief but concise description of the SG method applied to problem (6) can be found in the next section.
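The equivalence stated in Theorem 2.1 is easy to verify numerically in the simplest setting: two homogeneous layers separated by a flat interface, with one crossing point. The sketch below uses illustrative velocities and geometry (the variable names are ours, not those of Eqs. (6)-(13)); it locates the crossing point where the derivative of the travel time vanishes and then checks that Snell's law holds there.

```python
import numpy as np

# Source at (0, h1) above the flat interface y = 0, receiver at (d, -h2)
# below it; the ray crosses the interface at (x, 0). Illustrative values.
v1, v2, h1, h2, d = 1.5, 3.0, 1.0, 2.0, 4.0

def travel_time(x):
    return np.hypot(x, h1) / v1 + np.hypot(d - x, h2) / v2

def dtravel_dx(x):
    # First-order necessary condition: sin(theta1)/v1 - sin(theta2)/v2
    return x / (v1 * np.hypot(x, h1)) - (d - x) / (v2 * np.hypot(d - x, h2))

# The derivative is monotone increasing in x, so bisection finds its root
a, b = 0.0, d
for _ in range(80):
    m = 0.5 * (a + b)
    if dtravel_dx(m) > 0:
        b = m
    else:
        a = m
x_star = 0.5 * (a + b)

sin1 = x_star / np.hypot(x_star, h1)            # sine of incidence angle
sin2 = (d - x_star) / np.hypot(d - x_star, h2)  # sine of transmission angle
print(sin1 / v1 - sin2 / v2)                    # ~0: Snell's law holds at the minimizer
```

The stationary point of the travel time is exactly the point where Snell's law is satisfied, as the theorem asserts for the 2D homogeneous case.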

3. Global spectral gradient method

In this section we briefly discuss the Spectral Gradient (SG) method proposed by Raydan (1993, 1997) for solving problem (6). Assume that the travel time function $T_{X_s}^{X_r}: \mathbb{R}^{4n-2} \to \mathbb{R}$ is continuously differentiable in $\mathbb{R}^{4n-2}$, and let

$(X_k, Y_k)^T = (x_2^k, x_3^k, \ldots, x_{2n}^k, y_2^k, y_3^k, \ldots, y_{2n}^k) \qquad (14)$

be the $k$th iterate. Then the SG method is given by the iterative process

$(X_{k+1}, Y_{k+1}) = (X_k, Y_k) - \lambda_k G_k, \qquad (15)$

where $G_k$ is the gradient of $T_{X_s}^{X_r}$ evaluated at $(X_k, Y_k)$, i.e., $G_k := \nabla T_{X_s}^{X_r}(X_k, Y_k)$, and $\lambda_k$ is the steplength.

Notice that the search direction in Eq. (15) is the negative gradient direction, as in the steepest descent method. The SG is not a descent method, since the objective function does not decrease at every iteration. Unfortunately, forcing a decrease at every iteration would reduce the SG method to the steepest descent method, which is known to be slow. Therefore, the steplength $\lambda_k$ is chosen by the nonmonotone line search of Grippo et al. (1986), which imposes a much weaker decrease condition on the objective function and makes the SG method much faster than any globalization of steepest descent. At the $k$th iteration, for a given $\lambda_k = 1/\alpha_k$ with

$\alpha_k = -\frac{G_{k-1}^T L_{k-1}}{\lambda_{k-1} G_{k-1}^T G_{k-1}},$

where $L_{k-1} = G_k - G_{k-1}$, the $\lambda_k$ is obtained by the nonmonotone line search, which consists of verifying the following weak condition at each iteration:

$T_{X_s}^{X_r}(X_{k+1}, Y_{k+1}) \le \max_{0 \le j \le \min(k, M)} T_{X_s}^{X_r}(X_{k-j}, Y_{k-j}) + \gamma G_k^T \left( (X_{k+1}, Y_{k+1})^T - (X_k, Y_k)^T \right), \qquad (16)$

where $M$ is a non-negative integer and $\gamma$ is a small real number (for details see Luengo et al., 1998). This nonmonotone line search has recently been incorporated into many optimization algorithms. Condition (16) forces the objective function to decrease after $M$ iterations, which guarantees the fast global convergence of the method, i.e., fast convergence of
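The coefficient $\alpha_k$ above is the spectral (Barzilai-Borwein) coefficient. Its connection with the Hessian is easiest to see on a quadratic $f(x) = \frac{1}{2} x^T A x$: after one gradient step, $\alpha_k$ reduces to the Rayleigh quotient of $A$ at $G_{k-1}$, so it always lies between the extreme eigenvalues of the Hessian. A minimal numerical check, with an illustrative matrix and variable names of our choosing:

```python
import numpy as np

# Quadratic test problem f(x) = 0.5 x^T A x with Hessian eigenvalues 1, 4, 9
A = np.diag([1.0, 4.0, 9.0])
grad = lambda x: A @ x

x_prev = np.array([1.0, 1.0, 1.0])
lam_prev = 0.1                      # previous steplength lambda_{k-1}
g_prev = grad(x_prev)
x_k = x_prev - lam_prev * g_prev    # one gradient step, as in Eq. (15)
g_k = grad(x_k)

L_prev = g_k - g_prev               # L_{k-1} = G_k - G_{k-1}
alpha_k = -(g_prev @ L_prev) / (lam_prev * (g_prev @ g_prev))

# For a quadratic, alpha_k equals the Rayleigh quotient of the Hessian
rayleigh = (g_prev @ A @ g_prev) / (g_prev @ g_prev)
print(alpha_k, rayleigh)            # both print the same value (~8.102)
```

Since the Rayleigh quotient is bounded by the smallest and largest eigenvalues of $A$, the safeguarded $1/\alpha_k$ behaves like an inverse curvature estimate, which is what makes the step superior to a fixed steepest descent steplength.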
the SG method from any initial guess $(X_0, Y_0)$.

It is important to stress that numerical results comparing the behavior of the SG method with recent extensions of the conjugate gradient method for the non-quadratic case indicate that the SG method allows, in many cases, a significant reduction in the number of line searches and in the number of gradient evaluations (Raydan, 1997). Moreover, the SG method requires neither the evaluation of the Hessian matrix nor the solution of a linear system of equations at each iteration, as Newton's method does. The low computational work of the SG method, together with the properties mentioned above, significantly reduces the computational cost and the CPU time of tracing rays in a 3D medium. Next, we present the SG algorithm applied to tracing rays in a 3D medium.

3.1. Spectral gradient method (SG)

Given $(X_0, Y_0)$, $\alpha_0$, integer $M \ge 0$, $\gamma \in (0, 1)$, $\delta > 0$, $0 < \sigma_1 < \sigma_2 < 1$, $0 < \varepsilon < 1$. Set $k = 0$.

Step 1: If $\|G_k\|$ is sufficiently small, then stop.
Step 2: If $\alpha_k \le \varepsilon$ or $\alpha_k \ge 1/\varepsilon$, then set $\alpha_k = \delta$.
Step 3: Let $\lambda = 1/\alpha_k$.
Step 4 (nonmonotone line search): If
$T_{X_s}^{X_r}\left( (X_k, Y_k) - \lambda G_k \right) \le \max_{0 \le j \le \min(k, M)} T_{X_s}^{X_r}(X_{k-j}, Y_{k-j}) - \gamma \lambda G_k^T G_k,$
then set $\lambda_k = \lambda$, $(X_{k+1}, Y_{k+1}) = (X_k, Y_k) - \lambda_k G_k$, and
$Z_{k+1} = \left( f_2(x_2^{k+1}, y_2^{k+1}), f_3(x_3^{k+1}, y_3^{k+1}), \ldots, f_{2n}(x_{2n}^{k+1}, y_{2n}^{k+1}) \right),$
and go to Step 6.
Step 5: Choose $\sigma \in [\sigma_1, \sigma_2]$, set $\lambda = \sigma \lambda$, and go to Step 4.
Step 6: Set $\alpha_{k+1} = -(G_k^T L_k) / (\lambda_k G_k^T G_k)$, $k = k + 1$, and go to Step 1.

Observe that the choice of the steplength $\alpha_k$ is not the classical choice of the steepest descent method. This choice of steplength is related to the eigenvalues of the Hessian matrix at the minimizer, which makes the method faster than the steepest descent method (see Glunt et al., 1993).
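The algorithm above can be sketched compactly in Python. This is an illustrative implementation under our own simplifications: the travel time is replaced by a generic objective `f` with gradient `grad`, the parameter defaults are ours, Step 5 uses a fixed backtracking factor in place of choosing $\sigma \in [\sigma_1, \sigma_2]$, and Step 4's update of the depth vector $Z_{k+1}$ is omitted since it depends on the interface functions $f_i$.

```python
import numpy as np

def sg(f, grad, x0, alpha0=1.0, M=10, gamma=1e-4, delta=1.0,
       sigma=0.5, eps=1e-10, tol=1e-6, max_iter=1000):
    """Nonmonotone spectral gradient iteration (illustrative defaults)."""
    x, alpha = np.asarray(x0, dtype=float), alpha0
    g = grad(x)
    fvals = [f(x)]                              # history for condition (16)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:             # Step 1: stopping test
            break
        if alpha <= eps or alpha >= 1.0 / eps:  # Step 2: safeguard alpha_k
            alpha = delta
        lam = 1.0 / alpha                       # Step 3
        fmax = max(fvals[-(M + 1):])            # max over last min(k, M)+1 values
        while f(x - lam * g) > fmax - gamma * lam * (g @ g):
            lam *= sigma                        # Steps 4-5: backtrack until the
                                                # nonmonotone condition holds
        x_new = x - lam * g                     # accept the step, Eq. (15)
        g_new = grad(x_new)
        L = g_new - g                           # L_k = G_{k+1} - G_k
        alpha = -(g @ L) / (lam * (g @ g))      # Step 6: spectral coefficient
        x, g = x_new, g_new
        fvals.append(f(x))
    return x

# Usage on an ill-conditioned convex quadratic (a stand-in for the travel time)
A = np.diag([1.0, 10.0, 100.0])
xmin = sg(lambda x: 0.5 * x @ A @ x, lambda x: A @ x, [1.0, 1.0, 1.0])
print(np.linalg.norm(xmin))                     # close to the minimizer x* = 0
```

Note that the per-iteration storage is indeed a handful of vectors (`x`, `g`, `L`, plus the short function-value history), consistent with the low-storage claim.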
Also observe that the computational storage is very low (three vectors), and the number of floating point operations per iteration is much smaller than in Newton's method (of order $3(2n - 1)$ in 2D and $3(4n - 2)$ in 3D, plus the cost of evaluating the gradient vector).

4. Convergence properties