1.4.2 Characterization of Least Squares Solutions

Let y ∈ R^m be a vector of observations that is related to a parameter vector c ∈ R^n by the linear relation

y = Ac + ε,  A ∈ R^{m×n},  (1.4.2)

where A is a known matrix of full column rank and ε ∈ R^m is a vector of random errors. We assume here that ε_i, i = 1 : m, has zero mean and that ε_i and ε_j are uncorrelated for i ≠ j, i.e.,

E(ε) = 0,  V(ε) = E(εε^T) = σ²I.

The parameter c is then a random vector which we want to estimate in terms of the known quantities A and y. Let γ^T c be a linear functional of the parameter c in (1.4.2). We say that θ = θ(A, y) is an unbiased linear estimator of γ^T c if E(θ) = γ^T c. It is a best linear unbiased estimator if θ has the smallest variance among all such estimators. The following theorem¹⁵ places the method of least squares on a sound theoretical basis.

¹⁵ This theorem is originally due to C. F. Gauss (1821). His contribution was somewhat neglected until rediscovered by the Russian mathematician A. A. Markov in 1912.

Theorem 1.4.1 (Gauss–Markov Theorem).
Consider the linear model (1.4.2), where ε is an uncorrelated random vector with zero mean and covariance matrix V = σ²I. Then the best linear unbiased estimator of any linear functional γ^T c is γ^T ĉ, where

ĉ = (A^T A)^{-1} A^T y

is the least squares estimator, obtained by minimizing ‖y − Ac‖₂². Furthermore, the covariance matrix of the least squares estimate ĉ equals

V(ĉ) = σ² (A^T A)^{-1},

and

s² = ‖y − Aĉ‖₂² / (m − n)

is an unbiased estimate of σ², i.e., E(s²) = σ².

Proof. See Zelen [389, pp. 560–561].
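Before turning to the geometric characterization, the claims of the theorem can be checked numerically. The following Monte Carlo sketch is illustrative only (the model size, σ, and all variable names are made-up choices): it verifies that the least squares estimate is unbiased and that its sample covariance approaches σ²(A^T A)^{-1}.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, sigma = 50, 2, 0.1
A = rng.standard_normal((m, n))   # known model matrix, full column rank
c_true = np.array([1.0, -0.5])    # "unknown" parameter vector

estimates = []
for _ in range(20000):
    y = A @ c_true + sigma * rng.standard_normal(m)   # y = Ac + eps
    c_hat, *_ = np.linalg.lstsq(A, y, rcond=None)     # least squares estimate
    estimates.append(c_hat)
E = np.asarray(estimates)

print(E.mean(axis=0))                      # ~ c_true: unbiasedness
print(np.cov(E.T))                         # ~ sigma^2 (A^T A)^{-1}
print(sigma**2 * np.linalg.inv(A.T @ A))   # theoretical covariance
```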

The set of all least squares solutions can also be characterized geometrically. For this purpose we introduce two fundamental subspaces of R^m, the range of A and the null space of A^T, defined by

R(A) = {z ∈ R^m | z = Ax, x ∈ R^n},  N(A^T) = {y ∈ R^m | A^T y = 0}.  (1.4.7)

If z = Ax ∈ R(A) and y ∈ N(A^T), then z^T y = x^T A^T y = 0, which shows that N(A^T) is the orthogonal complement of R(A).

By the Gauss–Markov theorem any least squares solution x to an overdetermined linear system Ax = b satisfies the normal equations

A^T Ax = A^T b.  (1.4.8)

The normal equations are always consistent, since the right-hand side satisfies A^T b ∈ R(A^T) = R(A^T A). Therefore, a least squares solution always exists, although it may not be unique.

Theorem 1.4.2.
The vector x minimizes ‖b − Ax‖₂ if and only if the residual vector r = b − Ax is orthogonal to R(A) or, equivalently,

A^T (b − Ax) = 0.  (1.4.9)

Proof. Let x be a vector for which A^T (b − Ax) = 0. For any y ∈ R^n it holds that

b − Ay = (b − Ax) + A(x − y).

Squaring this and using (1.4.9), we obtain

‖b − Ay‖₂² = ‖b − Ax‖₂² + ‖A(x − y)‖₂² ≥ ‖b − Ax‖₂²,

where equality holds only if A(x − y) = 0.

Now assume that A^T (b − Ax) = z ≠ 0, and take y = x + εz with ε > 0. Then

‖b − Ay‖₂² = ‖b − Ax‖₂² − 2ε(Az)^T (b − Ax) + ε²‖Az‖₂² = ‖b − Ax‖₂² − 2ε‖z‖₂² + ε²‖Az‖₂²,

which is smaller than ‖b − Ax‖₂² for sufficiently small ε > 0. Hence x does not minimize ‖b − Ax‖₂.

From Theorem 1.4.2 it follows that a least squares solution x decomposes the right-hand side b into two orthogonal components

b = Ax + r,  r = b − Ax ∈ N(A^T),  Ax ∈ R(A).  (1.4.10)

This geometric interpretation is illustrated in Figure 1.4.1. Note that although the solution x to the least squares problem may not be unique, the decomposition (1.4.10) always is unique.
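This decomposition is easy to observe numerically. The sketch below uses made-up data and numpy.linalg.lstsq in place of the normal equations; it checks that the residual lies in N(A^T) and that the two components of b are orthogonal.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3))   # full column rank with probability 1
b = rng.standard_normal(8)

x, *_ = np.linalg.lstsq(A, b, rcond=None)
r = b - A @ x                     # residual vector

print(A.T @ r)                            # ~0: r in N(A^T), cf. (1.4.9)
print(np.allclose(b, A @ x + r))          # True: b = Ax + r, cf. (1.4.10)
print(b @ b, (A @ x) @ (A @ x) + r @ r)   # equal: Pythagorean identity
```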

Figure 1.4.1. Geometric characterization of the least squares solution.

We now give a necessary and sufficient condition for the least squares solution to be unique.

Theorem 1.4.3.
The matrix A^T A is positive definite, and hence nonsingular, if and only if the columns of A are linearly independent, that is, when rank(A) = n. In this case the least squares solution x is unique and given by

x = (A^T A)^{-1} A^T b.  (1.4.11)

Proof. If the columns of A are linearly independent, then x ≠ 0 implies Ax ≠ 0. Therefore x ≠ 0 implies x^T A^T Ax = ‖Ax‖₂² > 0, and hence A^T A is positive definite. On the other hand, if the columns are linearly dependent, then Ax₀ = 0 for some x₀ ≠ 0. Then x₀^T A^T Ax₀ = 0, and therefore A^T A is not positive definite. When A^T A is positive definite it is also nonsingular, and (1.4.11) follows.

When A has full column rank, A^T A is symmetric and positive definite, and the normal equations can be solved by computing the Cholesky factorization A^T A = R^T R, where R is upper triangular. The normal equations then become R^T Rx = A^T b, which decomposes into the two triangular systems

R^T z = A^T b,  Rx = z.

The first system is lower triangular and z is computed by forward substitution. Then x is computed from the second upper triangular system by back substitution. For many practical problems this method of normal equations is an adequate solution method, although its numerical stability is not the best.
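A minimal sketch of this procedure, assuming SciPy and a matrix A of full column rank (the function name is a hypothetical choice):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def lstsq_normal_equations(A, b):
    """Solve min ||b - Ax||_2 via the normal equations and a
    Cholesky factorization, assuming A has full column rank."""
    G = A.T @ A                    # form A^T A
    c = A.T @ b                    # form A^T b
    R = cholesky(G, lower=False)   # A^T A = R^T R, R upper triangular
    z = solve_triangular(R.T, c, lower=True)    # forward substitution: R^T z = A^T b
    return solve_triangular(R, z, lower=False)  # back substitution: R x = z
```

The stability caveat above stems from forming A^T A explicitly, which squares the condition number of the problem.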

Example 1.4.1.

The comet Tentax, discovered in 1968, is supposed to move within the solar system. The following observations of its position in a certain polar coordinate system have been made:

φ_i :  48°    67°    83°    108°   126°
r_i :  2.70   2.00   1.61   1.20   1.02

By Kepler's first law the comet should move in a plane orbit of elliptic or hyperbolic form, if the perturbations from planets are neglected. Then the coordinates satisfy

r = p/(1 − e cos φ),

where p is a parameter and e is the eccentricity. We want to estimate p and e by the method of least squares from the given observations.

We first note that if the relationship is rewritten as

1/p − (e/p) cos φ = 1/r,

it becomes linear in the parameters x₁ = 1/p and x₂ = e/p. We then get the linear system Ax = b with rows a_i = (1, −cos φ_i) and right-hand side b_i = 1/r_i, i = 1 : 5, i.e., b = (0.3704, 0.5000, 0.6211, 0.8333, 0.9804)^T. The least squares solution is x = (0.6886, 0.4839)^T, giving p = 1/x₁ = 1.4522 and finally e = p x₂ = 0.7027.
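The example can be reproduced with a few lines of code; this sketch assumes the observations tabulated above:

```python
import numpy as np

# Observations of the comet (angles in degrees), from the table above.
phi = np.radians([48.0, 67.0, 83.0, 108.0, 126.0])
r = np.array([2.70, 2.00, 1.61, 1.20, 1.02])

# Linearized model: x1 - x2*cos(phi) = 1/r.
A = np.column_stack([np.ones_like(phi), -np.cos(phi)])
b = 1.0 / r

# Solve the 2x2 normal equations A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)
p = 1.0 / x[0]
e = p * x[1]
print(x, p, e)   # expect x ~ (0.6886, 0.4839), p ~ 1.4522, e ~ 0.7027
```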

By (1.4.10), if x is a least squares solution, then Ax is the orthogonal projection of b onto R(A). Thus orthogonal projections play a central role in least squares problems. In general, a matrix P₁ ∈ R^{m×m} is called a projector onto a subspace S ⊂ R^m if and only if it holds that

P₁v = v  ∀v ∈ S,  P₁² = P₁.

An arbitrary vector v ∈ R^m can then be decomposed as

v = P₁v + P₂v ≡ v₁ + v₂,  P₂ = I − P₁.

In particular, if P₁ is symmetric, P₁^T = P₁, we have

P₂^T P₁v = (I − P₁)P₁v = (P₁ − P₁²)v = 0  ∀v ∈ R^m,

and it follows that P₂^T P₁ = 0. Hence v₂^T v₁ = v^T P₂^T P₁v = 0 for all v ∈ R^m, i.e., v₂ ⊥ v₁. In this case P₁ is the orthogonal projector onto S, and P₂ = I − P₁ is the orthogonal projector onto S^⊥.

In the full column rank case, rank(A) = n, of the least squares problem, the residual r = b − Ax can be written as r = b − P_{R(A)} b, where

P_{R(A)} = A(A^T A)^{-1} A^T  (1.4.13)

is the orthogonal projector onto R(A). If rank(A) < n, then A has a nontrivial null space. If x̂ minimizes ‖b − Ax‖₂, then the set of all least squares solutions is

S = {x = x̂ + y | y ∈ N(A)}.  (1.4.14)

In this set there is a unique solution of minimum norm, characterized by x ⊥ N(A), which is called the pseudoinverse solution.
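The minimum-norm property can also be observed numerically. In the sketch below (made-up data), the third column of A is the sum of the first two, so N(A) is spanned by (1, 1, −1)^T; numpy.linalg.lstsq, which computes the pseudoinverse solution via the SVD, returns a solution orthogonal to N(A).

```python
import numpy as np

# Rank-deficient A: column 3 = column 1 + column 2.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [1.0, 2.0, 3.0]])
b = np.array([1.0, 2.0, 2.0, 4.0])

x, _, rank, _ = np.linalg.lstsq(A, b, rcond=None)
null_vec = np.array([1.0, 1.0, -1.0])   # spans N(A)

print(rank)           # 2: A has a nontrivial null space
print(x @ null_vec)   # ~0: x is the pseudoinverse (minimum-norm) solution

# Every x + y with y in N(A) is also a least squares solution, cf. (1.4.14),
# but it has a larger norm.
y = 0.7 * null_vec
print(np.linalg.norm(b - A @ x), np.linalg.norm(b - A @ (x + y)))  # equal residuals
print(np.linalg.norm(x), np.linalg.norm(x + y))                    # norm(x) is smaller
```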
