
Proceedings of International Conference on Theory and Applications of Mathematics
and Informatics – ICTAMI 2003, Alba Iulia

ON THE SMOOTHING PARAMETER IN CASE OF DATA
FROM MULTIPLE SOURCES
by
Breaz Nicoleta
Abstract. In this paper we focus on data smoothing by spline functions. The smoothing
parameter λ involved in this kind of modeling is obtained here by the generalized
cross validation (GCV) procedure. For data from two sources with different weights, a
GCV formula for the parameter λ is already known in the particular case when the
smoothing function is a function on the circle. We extend this formula to a more general
case and, at the same time, to more than two sources.

1. INTRODUCTION
We consider a regression model written with n observational data

$$y_i = f(x_i) + \varepsilon_i, \quad i = \overline{1,n}$$

where $x_i \in [0,1]$ and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)' \sim N(0, \sigma^2 I)$ is a Gaussian n-dimensional vector with
zero mean and covariance matrix $\sigma^2 I$. About the regression function we only
know that f belongs to some space $W_m$ defined as

$$W_m = W_m[0,1] = \left\{ f : f, f', \ldots, f^{(m-1)} \text{ absolutely continuous}, \ f^{(m)} \in L^2 \right\}.$$
Then we can speak about a spline smoothing problem. So we obtain an
estimate of f by finding $f_\lambda \in W_m$ to minimize

$$\frac{1}{n}\sum_{i=1}^{n}\left[y_i - f(x_i)\right]^2 + \lambda \int_0^1 \left(f^{(m)}(u)\right)^2 du, \quad \lambda > 0. \qquad (1)$$

It is known that the solution of this problem is the natural polynomial spline of degree
2m − 1 with knots $x_i$, $i = \overline{1,n}$.
For the beginning we consider the smoothing parameter λ fixed. Then we can
search for a solution of (1) in a certain subspace of $W_m$, spanned by n appropriately
chosen basis functions. According to [2], such basis functions are related to B-splines, but
here we are not interested in this. We just consider f of the form

$$f = \sum_{k=1}^{n} c_k B_k$$

with $B_k$ basis functions.

Now we rewrite the problem in matrix form using the following notations:

$$y = (y_1, y_2, \ldots, y_n)', \quad c = (c_1, c_2, \ldots, c_n)', \quad B = (B_{ij})_{1 \le i \le n, \ 1 \le j \le n}, \quad B_{ij} = B_j(x_i).$$

Also, from [2] we know that we can write the seminorm $J(f) = \int_0^1 \left(f^{(m)}(u)\right)^2 du$ in matrix form as $c'\Sigma c$
for some matrix Σ. Then the problem is to find c to minimize $\|y - Bc\|^2 + \lambda c'\Sigma c$. The solution of this problem is given as
$c = (B'B + n\lambda\Sigma)^{-1} B'y$. When the smoothing parameter λ is too small we obtain a
function f which is close to the data at the expense of smoothness, and when λ is too big we
obtain a function f which is very smooth but not sufficiently close to the data.
Among the methods which provide an optimal λ from the data are the cross
validation (CV) and generalized cross validation (GCV) procedures described in [2].
According to the CV method, λ is the minimizer of the expression

$$CV(\lambda) = \frac{1}{n}\sum_{k=1}^{n}\left(y_k - f_\lambda^{[k]}(x_k)\right)^2$$
with $f_\lambda^{[k]}$ the spline estimate using all data but the k-th data point of y. As a
generalization, the GCV procedure uses a more general function

$$GCV(\lambda) = \frac{\frac{1}{n}\left\|(I - A(\lambda))y\right\|^2}{\left[\frac{1}{n}\mathrm{Tr}\left(I - A(\lambda)\right)\right]^2}$$

where A(λ) is the influence (or hat) matrix given by the relation

$$\hat{y} = \begin{pmatrix} f_\lambda(x_1) \\ \vdots \\ f_\lambda(x_n) \end{pmatrix} = A(\lambda)\, y.$$

From [2] we know that A(λ) has the form $A(\lambda) = B(B'B + n\lambda\Sigma)^{-1}B'$. Also we recall
that an estimate for σ² is given by

$$\hat{\sigma}^2 = \frac{\left\|(I - A(\lambda))y\right\|^2}{\mathrm{Tr}\left(I - A(\lambda)\right)}.$$
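The single-source estimate and the GCV score above are straightforward to transcribe numerically. The sketch below is illustrative only: it uses a monomial basis with a second-difference penalty as stand-ins for the B-spline basis and the matrix Σ of [2], which are not constructed here.

```python
import numpy as np

def smoothing_fit(B, Sigma, y, lam):
    """Solve c = (B'B + n*lam*Sigma)^(-1) B'y and return c with the hat matrix A(lam)."""
    n = len(y)
    M = B.T @ B + n * lam * Sigma
    c = np.linalg.solve(M, B.T @ y)
    A = B @ np.linalg.solve(M, B.T)   # A(lam) = B (B'B + n lam Sigma)^(-1) B'
    return c, A

def gcv(B, Sigma, y, lam):
    """GCV(lam) = (1/n)||(I - A)y||^2 / [(1/n) Tr(I - A)]^2."""
    n = len(y)
    _, A = smoothing_fit(B, Sigma, y, lam)
    resid = (np.eye(n) - A) @ y
    return (resid @ resid / n) / (np.trace(np.eye(n) - A) / n) ** 2

# Illustrative data: noisy samples of a smooth function on [0, 1].
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(40)

# Stand-in basis (monomials) and roughness penalty (second differences).
B = np.vander(x, 8, increasing=True)
D = np.diff(np.eye(8), 2, axis=0)
Sigma = D.T @ D

# Pick lambda by minimizing GCV over a coarse grid.
lams = [10.0 ** k for k in range(-12, 0)]
best = min(lams, key=lambda l: gcv(B, Sigma, y, l))
```

Running the grid search illustrates the trade-off described above: a very small λ tracks the data closely, a very large λ oversmooths, and the GCV minimizer sits in between.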

In the problem formulated here the data come from a single source. In [1],
Feng Gao formulates this problem for two sources and gives a similar GCV method for
λ. Gao considers the particular case when f is a function on the circle.
In this paper we extend the results of Gao to more than two sources and, at
the same time, we consider a more general domain for the function f.

2. MAIN RESULTS

We consider the case when the data come from l sources with unknown
weights, as in

$$y_{1i} = f(x_{1i}) + \varepsilon_{1i}, \quad i = \overline{1, N_1}$$
$$y_{2i} = f(x_{2i}) + \varepsilon_{2i}, \quad i = \overline{1, N_2}$$
$$\ldots$$
$$y_{li} = f(x_{li}) + \varepsilon_{li}, \quad i = \overline{1, N_l},$$

with $N_1 + N_2 + \ldots + N_l = n$ and residual Gaussian vectors defined as

$$\varepsilon_1 = (\varepsilon_{11}, \varepsilon_{12}, \ldots, \varepsilon_{1N_1})' \sim N(0, \sigma_1^2 I)$$
$$\varepsilon_2 = (\varepsilon_{21}, \varepsilon_{22}, \ldots, \varepsilon_{2N_2})' \sim N(0, \sigma_2^2 I)$$
$$\ldots$$
$$\varepsilon_l = (\varepsilon_{l1}, \varepsilon_{l2}, \ldots, \varepsilon_{lN_l})' \sim N(0, \sigma_l^2 I).$$

We assume that $\sigma_1^2, \sigma_2^2, \ldots, \sigma_l^2$ are unknown and $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_l$ are independent.

If we knew $\sigma_1^2, \sigma_2^2, \ldots, \sigma_l^2$, an estimate of the function f would be the solution of the
variational problem

$$\min_{f \in W_m} \frac{1}{N_1 + N_2 + \ldots + N_l}\left[\frac{1}{\sigma_1^2}\sum_{i=1}^{N_1}\left(y_{1i} - f(x_{1i})\right)^2 + \frac{1}{\sigma_2^2}\sum_{i=1}^{N_2}\left(y_{2i} - f(x_{2i})\right)^2 + \ldots + \frac{1}{\sigma_l^2}\sum_{i=1}^{N_l}\left(y_{li} - f(x_{li})\right)^2\right] + \lambda J(f).$$

We search for $f \in W_m$ of the form $f = \sum_{k=1}^{n} c_k B_k$ and we use the notations

$$y_1 = (y_{11}, y_{12}, \ldots, y_{1N_1})', \quad y_2 = (y_{21}, y_{22}, \ldots, y_{2N_2})', \quad \ldots, \quad y_l = (y_{l1}, y_{l2}, \ldots, y_{lN_l})',$$
$$c = (c_1, c_2, \ldots, c_n)',$$
$$B_1 = (B_{ij})_{1 \le i \le N_1, \ 1 \le j \le n}, \ B_{ij} = B_j(x_{1i}); \quad B_2 = (B_{ij})_{1 \le i \le N_2, \ 1 \le j \le n}, \ B_{ij} = B_j(x_{2i}); \quad \ldots; \quad B_l = (B_{ij})_{1 \le i \le N_l, \ 1 \le j \le n}, \ B_{ij} = B_j(x_{li}).$$
Now we have l matrices B. The model becomes

$$y_1 = B_1 c + \varepsilon_1, \quad \ldots, \quad y_l = B_l c + \varepsilon_l, \quad \varepsilon_i \sim N(0, \sigma_i^2 I), \ i = \overline{1,l}$$

and the variational problem is now equivalent to finding c, the minimizer of

$$\frac{1}{\theta \cdot n}\left(r_1\left\|y_1 - B_1 c\right\|^2 + r_2\left\|y_2 - B_2 c\right\|^2 + \ldots + r_l\left\|y_l - B_l c\right\|^2\right) + \alpha c'\Sigma c \qquad (2)$$

with Σ some known matrix, θ a nuisance parameter, α a smoothing parameter and
$r_i$ the weighting parameters given as

$$\theta = \sigma_1\sigma_2\ldots\sigma_l, \quad r_i = \frac{\sigma_1\sigma_2\ldots\sigma_{i-1}\sigma_{i+1}\ldots\sigma_l}{\sigma_i}, \ i = \overline{1,l} \ (r_1 r_2 \ldots r_l = 1), \quad \alpha = \sigma_1\sigma_2\ldots\sigma_l \cdot \lambda \cdot n.$$

We can prove the following proposition:

Proposition 1. For fixed $r = (r_1, r_2, \ldots, r_l)$ and α, the solution of the variational
problem (2) is

$$\hat{c}_{r,\alpha} = \left(r_1 B_1'B_1 + r_2 B_2'B_2 + \ldots + r_l B_l'B_l + \alpha\Sigma\right)^{-1}\left(r_1 B_1'y_1 + r_2 B_2'y_2 + \ldots + r_l B_l'y_l\right). \qquad (3)$$

Proof. We denote by E the expression that has to be minimized, so we have the condition

$$\frac{\partial E}{\partial c} = 0.$$

This condition is equivalent to

$$-r_1 B_1'y_1 + r_1 B_1'B_1 c - \ldots - r_l B_l'y_l + r_l B_l'B_l c + \alpha\Sigma c = 0$$

and the solution of this matrix equation is

$$c = \left(r_1 B_1'B_1 + r_2 B_2'B_2 + \ldots + r_l B_l'B_l + \alpha\Sigma\right)^{-1}\left(r_1 B_1'y_1 + r_2 B_2'y_2 + \ldots + r_l B_l'y_l\right).$$
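Formula (3) is a weighted ridge-type solve and can be sketched in a few lines. The basis matrices, penalty Σ, and data below are made-up stand-ins chosen only to exercise the formula, not the construction of [2].

```python
import numpy as np

def multi_source_fit(Bs, ys, rs, Sigma, alpha):
    """Formula (3): c = (sum_i r_i B_i'B_i + alpha*Sigma)^(-1) sum_i r_i B_i'y_i."""
    p = Bs[0].shape[1]
    M = alpha * Sigma
    rhs = np.zeros(p)
    for B_i, y_i, r_i in zip(Bs, ys, rs):
        M = M + r_i * B_i.T @ B_i
        rhs = rhs + r_i * B_i.T @ y_i
    return np.linalg.solve(M, rhs)

# Illustrative example: l = 3 sources observing the same curve with different noise.
rng = np.random.default_rng(1)
sigmas = [0.05, 0.1, 0.2]
Ns = [15, 20, 25]
xs = [np.sort(rng.uniform(0, 1, N)) for N in Ns]
ys = [np.sin(2 * np.pi * x) + s * rng.standard_normal(len(x)) for x, s in zip(xs, sigmas)]

p = 8
Bs = [np.vander(x, p, increasing=True) for x in xs]   # stand-in basis
D = np.diff(np.eye(p), 2, axis=0)
Sigma = D.T @ D                                       # stand-in penalty

# Weights r_i = (prod of the other sigmas) / sigma_i, as in the paper.
theta = np.prod(sigmas)
rs = [theta / s**2 for s in sigmas]

c_hat = multi_source_fit(Bs, ys, rs, Sigma, alpha=1e-6)
```

Note how the noisier sources receive smaller weights $r_i$, so the combined estimate leans on the more reliable measurements.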

According to formula (3), it is important to get some estimates for
$r = \left(r_1, r_2, \ldots, r_{l-1}, \ r_l = \frac{1}{r_1 r_2 \ldots r_{l-1}}\right)$ and α. Using methods similar to those in [1], we give
a GCV function that we can use for estimating r and α at the same time.
We introduce the notations

$$y = \left(y_1', y_2', \ldots, y_l'\right)', \quad B = \left(B_1', B_2', \ldots, B_l'\right)' \qquad (4)$$

and the matrix

$$I(r) = \begin{pmatrix} \frac{1}{\sqrt{r_1}} I_{N_1} & 0 & \ldots & 0 & 0 \\ 0 & \frac{1}{\sqrt{r_2}} I_{N_2} & \ldots & 0 & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & \ldots & \frac{1}{\sqrt{r_{l-1}}} I_{N_{l-1}} & 0 \\ 0 & 0 & \ldots & 0 & \sqrt{r_1 r_2 \ldots r_{l-1}}\, I_{N_l} \end{pmatrix} \qquad (5)$$

Also we use the notations

$$M = r_1 B_1'B_1 + r_2 B_2'B_2 + \ldots + r_l B_l'B_l + \alpha\Sigma, \quad y^r = I^{-1}(r) \cdot y, \quad B^r = I^{-1}(r) \cdot B. \qquad (6)$$

We can prove the following proposition:

Proposition 2. The influence matrix $A^r(r,\alpha)$ defined by

$$\hat{y}^r = A^r(r,\alpha)\, y^r \qquad (7)$$

has the form

$$A^r(r,\alpha) = \begin{pmatrix} r_1 B_1 M^{-1} B_1' & \sqrt{r_1 r_2}\, B_1 M^{-1} B_2' & \ldots & \sqrt{r_1 r_l}\, B_1 M^{-1} B_l' \\ \sqrt{r_1 r_2}\, B_2 M^{-1} B_1' & r_2 B_2 M^{-1} B_2' & \ldots & \sqrt{r_2 r_l}\, B_2 M^{-1} B_l' \\ \ldots & \ldots & \ldots & \ldots \\ \sqrt{r_l r_1}\, B_l M^{-1} B_1' & \sqrt{r_l r_2}\, B_l M^{-1} B_2' & \ldots & r_l B_l M^{-1} B_l' \end{pmatrix}$$

or

$$A^r(r,\alpha) = B^r M^{-1} B^{r\prime}.$$

Proof. The solution (3) of the variational problem (2), with the notations (4), (5), (6),
can be written as

$$\hat{c}_{r,\alpha} = \left(B^{r\prime} B^r + \alpha\Sigma\right)^{-1} B^{r\prime} y^r = M^{-1} B^{r\prime} y^r.$$

The condition (7) becomes

$$B^r \cdot \left(M^{-1} B^{r\prime} y^r\right) = A^r(r,\alpha) \cdot y^r$$

and further

$$A^r(r,\alpha) = B^r M^{-1} B^{r\prime}.$$
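The equivalence between the blockwise form of $A^r(r,\alpha)$ and the compact form $B^r M^{-1} B^{r\prime}$ is easy to check numerically on random matrices. All dimensions and values below are arbitrary, chosen only to exercise the identity.

```python
import numpy as np

rng = np.random.default_rng(2)
Ns, p, alpha = [3, 4, 5], 6, 0.5
Bs = [rng.standard_normal((N, p)) for N in Ns]
rs = [0.5, 2.0, 1.0]                      # r_1 r_2 r_3 = 1, as required
D = np.diff(np.eye(p), 1, axis=0)
Sigma = D.T @ D + 0.1 * np.eye(p)         # a positive definite stand-in penalty

M = alpha * Sigma + sum(r * B.T @ B for r, B in zip(rs, Bs))
Minv = np.linalg.inv(M)

# Blockwise form: the (i, j) block is sqrt(r_i r_j) * B_i M^(-1) B_j'.
A_blocks = np.block([[np.sqrt(ri * rj) * Bi @ Minv @ Bj.T
                      for rj, Bj in zip(rs, Bs)]
                     for ri, Bi in zip(rs, Bs)])

# Compact form: A^r = B^r M^(-1) B^r', with B^r stacking the blocks sqrt(r_i) B_i.
Br = np.vstack([np.sqrt(r) * B for r, B in zip(rs, Bs)])
A_compact = Br @ Minv @ Br.T

assert np.allclose(A_blocks, A_compact)
```

The check works because the i-th block of $B^r = I^{-1}(r)B$ is $\sqrt{r_i}\,B_i$, so each block of the product picks up exactly the factor $\sqrt{r_i r_j}$.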

In order to get a GCV formula for r and α, first we construct for our case a
CV-like formula.
Now we denote by $c_{r,\alpha}^{[k]}$ the spline estimate of c using all but the k-th data
point of y; also we denote by $y_k$ the k-th data point and by $B^k$ and $B^{k,r}$ the k-th rows
of B and $B^r$ respectively.
Then the cross validation (CV) function for our model would be

$$CV(r,\alpha) = \frac{1}{n}\sum_{k=1}^{n}\left(y_k - B^k c_{r,\alpha}^{[k]}\right)^2.$$
Next we prove the corresponding leaving-out-one lemma for our case, that is:

Lemma 3. Let $h_{r,\alpha}[k,z]$ be the solution to the variational problem

$$\min_c \frac{1}{\theta \cdot n}\left(r_1\left\|y_1 - B_1 c\right\|^2 + \ldots + r_l\left\|y_l - B_l c\right\|^2\right) + \alpha c'\Sigma c$$

with the k-th data point $y_k^r$ replaced by z. Then we have

$$h_{r,\alpha}\left[k, B^{k,r} c_{r,\alpha}^{[k]}\right] = c_{r,\alpha}^{[k]}.$$
Proof. We have

$$\left(B^{k,r} c_{r,\alpha}^{[k]} - B^{k,r} c_{r,\alpha}^{[k]}\right)^2 + \sum_{\substack{i=1 \\ i \ne k}}^{n}\left(y_i^r - B^{i,r} c_{r,\alpha}^{[k]}\right)^2 + \alpha \left(c_{r,\alpha}^{[k]}\right)'\Sigma c_{r,\alpha}^{[k]} = \sum_{\substack{i=1 \\ i \ne k}}^{n}\left(y_i^r - B^{i,r} c_{r,\alpha}^{[k]}\right)^2 + \alpha \left(c_{r,\alpha}^{[k]}\right)'\Sigma c_{r,\alpha}^{[k]} \le$$

$$\le \sum_{\substack{i=1 \\ i \ne k}}^{n}\left(y_i^r - B^{i,r} c\right)^2 + \alpha c'\Sigma c \le \left(B^{k,r} c_{r,\alpha}^{[k]} - B^{k,r} c\right)^2 + \sum_{\substack{i=1 \\ i \ne k}}^{n}\left(y_i^r - B^{i,r} c\right)^2 + \alpha c'\Sigma c, \quad (\forall)\, c \in \mathbb{R}^n.$$

Based on the leaving-out-one lemma, now we can prove the following theorem:

Theorem 4. We have the identity

$$CV(r,\alpha) = \frac{1}{n}\sum_{k=1}^{n}\frac{\left(I_k^{-1}(r) \cdot \left[I - A^r(r,\alpha)\right] I^{-1}(r)\, y\right)^2}{\left(I_k^{-1}(r) \cdot \left[I - A^r(r,\alpha)\right] I_k^{-1}(r)'\right)^2}$$

with $I_k^{-1}(r)$ the k-th row of the matrix $I^{-1}(r)$.
Proof. We consider the following identity

$$y_k - B^k c_{r,\alpha}^{[k]} = \left(y_k - B^k c_{r,\alpha}^{[k]}\right) \cdot \frac{y_k^r - B^{k,r} c_{r,\alpha}}{y_k^r - B^{k,r} c_{r,\alpha}}. \qquad (8)$$

Further we have

$$\frac{y_k^r - B^{k,r} c_{r,\alpha}}{y_k - B^k c_{r,\alpha}^{[k]}} = \frac{\sqrt{r_{s_k}}\, y_k - \sqrt{r_{s_k}}\, B^k c_{r,\alpha}}{y_k - B^k c_{r,\alpha}^{[k]}}$$

where

$$s_k = \begin{cases} 1 & \text{if } k \in \{1, 2, \ldots, N_1\} \\ 2 & \text{if } k \in \{N_1 + 1, \ldots, N_1 + N_2\} \\ \ldots \\ l & \text{if } k \in \{N_1 + \ldots + N_{l-1} + 1, \ldots, n\}. \end{cases}$$

Moreover, we can write

$$\frac{y_k^r - B^{k,r} c_{r,\alpha}}{y_k - B^k c_{r,\alpha}^{[k]}} = \sqrt{r_{s_k}} - \frac{B^{k,r} c_{r,\alpha} - B^{k,r} c_{r,\alpha}^{[k]}}{y_k - B^k c_{r,\alpha}^{[k]}}. \qquad (9)$$

By the leaving-out-one lemma we can write that

$$\frac{B^{k,r} c_{r,\alpha} - B^{k,r} c_{r,\alpha}^{[k]}}{y_k - B^k c_{r,\alpha}^{[k]}} = \frac{B^{k,r} h_{r,\alpha}\left[k, y_k^r\right] - B^{k,r} h_{r,\alpha}\left[k, B^{k,r} c_{r,\alpha}^{[k]}\right]}{y_k - B^k c_{r,\alpha}^{[k]}} = \frac{B^{k,r} h_{r,\alpha}\left[k, y_k^r\right] - B^{k,r} h_{r,\alpha}\left[k, B^{k,r} c_{r,\alpha}^{[k]}\right]}{y_k^r - B^{k,r} c_{r,\alpha}^{[k]}} \cdot \sqrt{r_{s_k}}.$$

Because $B^{k,r} c_{r,\alpha}$ is linear in each data point, we can replace the divided difference
above by a derivative. So we can write that

$$\frac{B^{k,r} c_{r,\alpha} - B^{k,r} c_{r,\alpha}^{[k]}}{y_k - B^k c_{r,\alpha}^{[k]}} = \frac{\partial B^{k,r} c_{r,\alpha}}{\partial y_k^r} \cdot \sqrt{r_{s_k}}. \qquad (10)$$

Moreover, we have

$$\frac{\partial B^{k,r} c_{r,\alpha}}{\partial y_k^r} = a_{kk}, \qquad (11)$$

the kk-th entry of the matrix $A^r(r,\alpha)$.
So, according to (8), (9), (10) and (11), we can write

$$y_k - B^k c_{r,\alpha}^{[k]} = \frac{y_k^r - B^{k,r} c_{r,\alpha}}{(1 - a_{kk}) \cdot I_{kk}^{-1}(r)}.$$

Moreover,

$$y_k^r - B^{k,r} c_{r,\alpha} = y_k^r - A^{k,r}(r,\alpha) \cdot y^r = \left[I_k - A^{k,r}(r,\alpha)\right] \cdot I^{-1}(r)\, y$$

where $A^{k,r}(r,\alpha)$ and $I_k$ are the k-th rows of the matrices $A^r(r,\alpha)$ and I respectively.
So we have

$$y_k - B^k c_{r,\alpha}^{[k]} = \frac{\left[I_k - A^{k,r}(r,\alpha)\right] \cdot I^{-1}(r)\, y}{(1 - a_{kk}) \cdot I_{kk}^{-1}(r)}.$$

After multiplying the numerator and the denominator by $I_{kk}^{-1}(r)$ we obtain the result.
In the same manner as in [1] or [2], we generate the GCV function as follows.
We replace the denominator $I_k^{-1}(r)\left[I - A^r(r,\alpha)\right] I_k^{-1}(r)'$ by the average

$$\frac{1}{n}\sum_{k=1}^{n} I_k^{-1}(r)\left(I - A^r(r,\alpha)\right) I_k^{-1}(r)' = \frac{1}{n}\mathrm{Tr}\left[I^{-1}(r)\left(I - A^r(r,\alpha)\right) \cdot I^{-1}(r)'\right]$$

and we have a GCV-like function

$$GCV(r,\alpha) = \frac{\frac{1}{n}\left\|I^{-1}(r)\left[I - A^r(r,\alpha)\right] I^{-1}(r)\, y\right\|^2}{\left[\frac{1}{n}\mathrm{Tr}\left(I^{-1}(r)\left(I - A^r(r,\alpha)\right) \cdot I^{-1}(r)'\right)\right]^2}.$$

This function is a generalization of the GCV(λ) function from [2] (one
source) and of the GCV(r,α) function from [1] (two sources).
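The multi-source GCV score can be transcribed directly: build the diagonal matrix I(r), form $A^r$, and evaluate the ratio, then minimize over a grid of (r, α). The two-source data, bases, and grids below are illustrative stand-ins, not the construction of [1] or [2].

```python
import numpy as np

def gcv_multi(Bs, ys, rs, Sigma, alpha):
    """GCV(r, alpha) = (1/n)||I^-1(r)(I - A^r)I^-1(r)y||^2
                       / [(1/n) Tr(I^-1(r)(I - A^r)I^-1(r)')]^2."""
    n = sum(len(y) for y in ys)
    # I^-1(r) is diagonal, scaling block i by sqrt(r_i).
    Iinv = np.diag(np.concatenate([np.full(len(y), np.sqrt(r))
                                   for y, r in zip(ys, rs)]))
    y = np.concatenate(ys)
    Br = Iinv @ np.vstack(Bs)                  # B^r = I^-1(r) B
    M = Br.T @ Br + alpha * Sigma
    Ar = Br @ np.linalg.solve(M, Br.T)         # A^r = B^r M^-1 B^r'
    resid = Iinv @ (np.eye(n) - Ar) @ Iinv @ y
    denom = np.trace(Iinv @ (np.eye(n) - Ar) @ Iinv.T) / n
    return (resid @ resid / n) / denom ** 2

# Two sources with different noise levels, so r reduces to one free ratio.
rng = np.random.default_rng(3)
x1, x2 = np.sort(rng.uniform(0, 1, 25)), np.sort(rng.uniform(0, 1, 25))
y1 = np.sin(2 * np.pi * x1) + 0.05 * rng.standard_normal(25)
y2 = np.sin(2 * np.pi * x2) + 0.30 * rng.standard_normal(25)

p = 8
Bs = [np.vander(x1, p, increasing=True), np.vander(x2, p, increasing=True)]
D = np.diff(np.eye(p), 2, axis=0)
Sigma = D.T @ D

# Grid search over r1 (with r2 = 1/r1) and alpha.
grid = [((r1, 1.0 / r1), a)
        for r1 in [0.25, 0.5, 1.0, 2.0, 4.0]
        for a in [1e-8, 1e-6, 1e-4]]
best_r, best_alpha = min(grid, key=lambda g: gcv_multi(Bs, [y1, y2], g[0], Sigma, g[1]))
```

The grid search estimates r and α simultaneously from the data, which is the point of the GCV(r,α) function: the relative weights of the sources need not be known in advance.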
REFERENCES

[1] Gao Feng, On combining data from multiple sources with unknown relative
weights, Technical Report no. 894, University of Wisconsin, Madison, 1993.
[2] Wahba Grace, Spline models for observational data, CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 59, SIAM, Philadelphia, 1990.
Author:

Nicoleta Breaz, "1 Decembrie 1918" University of Alba Iulia, Romania, Computer
Science and Mathematics Department, e-mail: nbreaz@uab.ro
