3.8.1 Maximum Likelihood Estimator
The idea behind maximum likelihood is to obtain the most likely values of the parameters of a likelihood function that best describes the data. In our case, the condition monitoring readings $y_i$ are the only data gathered from the observed information, which may depend on past observations, so the likelihood function for one component life cycle can be written as

$$
L(\theta \mid \mathcal{Y}_T) = \prod_{i=1}^{T} p(y_i \mid \mathcal{Y}_{i-1}, \theta)
= \prod_{i=1}^{T} \sum_{x_i=1}^{N} \sum_{x_{i-1}=1}^{N} p(y_i \mid x_i, \theta)\, P(x_i \mid x_{i-1}, \theta)\, P(x_{i-1} \mid \mathcal{Y}_{i-1}, \theta) \qquad (3\text{-}24)
$$
where $\theta = (\theta_1, \theta_2)$ is the set of unknown parameters, $T$ is the number of monitoring checks during the cycle, $N$ is the number of states, $y_i$ is the condition monitoring reading at time $t_i$, and $\mathcal{Y}_{i-1} = \{y_1, y_2, \ldots, y_{i-1}\}$
. Parameter estimates that maximize the likelihood of equation 3-24 are the most likely values for the parameters. However, the product in the likelihood can turn out to be a very small number, which is inconvenient for computation. For numerical convenience, the log of the likelihood is therefore used, and equation 3-24 can be rewritten as
$$
\log L(\theta \mid \mathcal{Y}_T) = \sum_{i=1}^{T} \log p(y_i \mid \mathcal{Y}_{i-1}, \theta)
= \sum_{i=1}^{T} \log \left[ \sum_{x_i=1}^{N} \sum_{x_{i-1}=1}^{N} p(y_i \mid x_i, \theta)\, P(x_i \mid x_{i-1}, \theta)\, P(x_{i-1} \mid \mathcal{Y}_{i-1}, \theta) \right] \qquad (3\text{-}25)
$$
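The per-point predictive probabilities $p(y_i \mid \mathcal{Y}_{i-1}, \theta)$ in equation 3-25 can be evaluated recursively by a standard HMM forward (filtering) pass. The sketch below is illustrative Python, not the FORTRAN95 code used in this work; the transition matrix, initial distribution and Gaussian emission densities are hypothetical stand-ins for the thesis model.

```python
import numpy as np

def log_likelihood(y, P, pi0, emission_pdf):
    """Log-likelihood of one life cycle, accumulated term by term as in
    equation 3-25: log L = sum_i log p(y_i | Y_{i-1}, theta).

    P            -- N x N transition matrix, P[a, b] = P(x_i = b | x_{i-1} = a)
    pi0          -- initial state distribution
    emission_pdf -- maps a reading y_i to the vector of p(y_i | x_i) over states
    """
    filt = np.asarray(pi0, dtype=float)     # P(x_{i-1} | Y_{i-1})
    loglik = 0.0
    for yi in y:
        prior = P.T @ filt                  # sum over x_{i-1}: P(x_i | Y_{i-1})
        joint = emission_pdf(yi) * prior    # p(y_i | x_i) P(x_i | Y_{i-1})
        pyi = joint.sum()                   # p(y_i | Y_{i-1}): the double sum
        loglik += np.log(pyi)
        filt = joint / pyi                  # P(x_i | Y_i), carried to the next step
    return loglik

# Hypothetical two-working-state example with Gaussian emission densities.
P = np.array([[0.95, 0.05],
              [0.00, 1.00]])               # assumed transition matrix
mu, sig = np.array([0.2, 0.5]), np.array([0.05, 0.05])
gauss = lambda yi: np.exp(-0.5 * ((yi - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))
ll = log_likelihood([0.21, 0.19, 0.48], P, np.array([1.0, 0.0]), gauss)
```

The recursion normalises the joint term at each step, so the filtered distribution needed by the next monitoring point comes out of the same pass that accumulates the log-likelihood.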
In our case, we could have more than one component life cycle; hence we simulated five life cycles, which are shown in Figure 3-9.
Figure 3-9: Simulation of five life cycles of data imitating the bearing case
Each set contains one complete life cycle with a variable number of monitoring checks $T_j$ in each cycle. Similarly, the likelihood for $m$ life cycles is given by
$$
L(\theta \mid \mathcal{Y}_{mT_m}) = \prod_{j=1}^{m} \prod_{i=1}^{T_j} p(y_{ji} \mid \mathcal{Y}_{j,i-1}, \theta)
= \prod_{j=1}^{m} \prod_{i=1}^{T_j} \sum_{x_{ji}=1}^{N} \sum_{x_{j,i-1}=1}^{N} p(y_{ji} \mid x_{ji}, \theta)\, P(x_{ji} \mid x_{j,i-1}, \theta)\, P(x_{j,i-1} \mid \mathcal{Y}_{j,i-1}, \theta) \qquad (3\text{-}26)
$$
where $m$ is the number of complete life cycles, $\theta = (\theta_1, \theta_2)$ is the set of unknown parameters, $T_j$ is the number of monitoring checks of the $j$th cycle, $y_{ji}$ is the condition monitoring reading at $t_i$ for the $j$th cycle and the $i$th monitoring, $x_{ji}$ and $x_{j,i-1}$ are the states of the system at $t_i$ and $t_{i-1}$ for the $j$th cycle and the $i$th monitoring respectively, and $\mathcal{Y}_{j,i-1} = \{y_{j1}, y_{j2}, \ldots, y_{j,i-1}\}$.
Similarly, for numerical convenience, the log of the likelihood is used. Therefore, equation 3-26 can be rewritten as
$$
\log L(\theta \mid \mathcal{Y}_{mT_m}) = \sum_{j=1}^{m} \sum_{i=1}^{T_j} \log p(y_{ji} \mid \mathcal{Y}_{j,i-1}, \theta)
= \sum_{j=1}^{m} \sum_{i=1}^{T_j} \log \left[ \sum_{x_{ji}=1}^{N} \sum_{x_{j,i-1}=1}^{N} p(y_{ji} \mid x_{ji}, \theta)\, P(x_{ji} \mid x_{j,i-1}, \theta)\, P(x_{j,i-1} \mid \mathcal{Y}_{j,i-1}, \theta) \right] \qquad (3\text{-}27)
$$
We coded equation 3-27 in FORTRAN95 and used the NAG E04JYF optimisation routine to obtain the values of the estimated parameters $\mu_1$, $\eta$, $b$, $\sigma_1$ and $\sigma_2$, as shown in Table 3-2.
Parameter     Estimated value    True value
$\mu_1$       0.2062             0.2176
$\eta$        3.1710             4.0
$b$           0.8109             0.8
$\sigma_1$    0.0231             0.05
$\sigma_2$    not available      0.025

Table 3-2: The estimated parameters and their true values
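For readers without access to the NAG library, the role of E04JYF (a bound-constrained quasi-Newton minimiser that needs no user-supplied derivatives) can be approximated with SciPy's L-BFGS-B routine. The sketch below is an assumed, simplified stand-in: it maximises a generic log-likelihood supplied by the caller rather than the thesis model of equation 3-27, and the normal-sample example is hypothetical data, not the bearing data of Figure 3-9.

```python
import numpy as np
from scipy.optimize import minimize

def fit_parameters(neg_loglik, theta0, bounds):
    """Maximise a log-likelihood by minimising its negative under simple box
    bounds -- the job E04JYF performs in the thesis, here done with SciPy's
    L-BFGS-B (gradients obtained by finite differences)."""
    res = minimize(neg_loglik, np.asarray(theta0, dtype=float),
                   method="L-BFGS-B", bounds=bounds)
    return res.x, -res.fun

# Illustrative use: recover the mean and standard deviation of simulated
# normally distributed readings.
rng = np.random.default_rng(0)
data = rng.normal(0.2, 0.05, size=500)

def nll(theta):
    mu, sig = theta
    # negative log-likelihood of N(mu, sig^2), additive constants dropped
    return 0.5 * np.sum(((data - mu) / sig) ** 2) + data.size * np.log(sig)

theta_hat, loglik = fit_parameters(nll, [0.0, 0.2], [(-1.0, 1.0), (1e-4, 1.0)])
```

The lower bound on the standard deviation plays the same role as the "lowest setting boundary" mentioned below: a parameter the data cannot identify will drift to such a boundary during optimisation.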
We obtained good estimated values for $\hat{b}$, $\hat{\eta}$, $\hat{\mu}_1$ and $\hat{\sigma}_1$, but had no success for $\hat{\sigma}_2$, as it tended to the lowest boundary set in the optimisation routine. To explain this, let us look at each monitoring point, where we have $p(y_i \mid \mathcal{Y}_{i-1})$ up to the last monitoring point, $p(y_{T_j} \mid \mathcal{Y}_{T_j-1})$, before failure occurs, as shown in Figure 3-10. Here the failure information is not used, and because of that the likelihood of equation 3-26 contains little information on the length $l_2$, which is the reason $\hat{\sigma}_2$ is "non-identifiable" from these data.
Figure 3-10: The likelihood function at each monitoring point
By incorporating the failure information, the probability of $X_{j,T_j+1} = 3$ given $\mathcal{Y}_{jT_j}$ is

$$
P(X_{j,T_j+1} = 3 \mid \mathcal{Y}_{jT_j}) = \sum_{x_{jT_j}=1}^{2} P(X_{j,T_j+1} = 3 \mid X_{jT_j} = x_{jT_j})\, P(X_{jT_j} = x_{jT_j} \mid \mathcal{Y}_{jT_j}) \qquad (3\text{-}28)
$$
Multiplying this term into equation 3-26, the likelihood function now becomes

$$
L(\theta \mid \mathcal{Y}_{mT_m}) = \prod_{j=1}^{m} \left\{ \prod_{i=1}^{T_j} \left[ \sum_{x_{ji}=1}^{N} \sum_{x_{j,i-1}=1}^{N} p(y_{ji} \mid x_{ji}, \theta)\, P(x_{ji} \mid x_{j,i-1}, \theta)\, P(x_{j,i-1} \mid \mathcal{Y}_{j,i-1}, \theta) \right] P(X_{j,T_j+1} = 3 \mid \mathcal{Y}_{jT_j}) \right\} \qquad (3\text{-}29)
$$
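As a small numerical illustration of the failure term of equation 3-28 (again an assumed Python sketch: the 3×3 transition matrix, with the third state absorbing and standing for failure, is hypothetical):

```python
import numpy as np

def failure_term(P, filt_last):
    """Equation 3-28: P(X_{T+1} = 3 | Y_T) = sum over the two working states
    x of P(X_{T+1} = 3 | X_T = x) * P(X_T = x | Y_T).  Indices are 0-based,
    so column 2 corresponds to the failed state 3 of the text."""
    return sum(P[x, 2] * filt_last[x] for x in range(2))

# Hypothetical transition matrix (failed state absorbing) and a final
# filtered state distribution from the last monitoring point of a cycle.
P = np.array([[0.90, 0.09, 0.01],
              [0.00, 0.80, 0.20],
              [0.00, 0.00, 1.00]])
filt_last = np.array([0.3, 0.7, 0.0])
prob_fail = failure_term(P, filt_last)   # 0.3*0.01 + 0.7*0.20 = 0.143
```

Per equation 3-29, this quantity multiplies each cycle's likelihood, so in log form its logarithm is simply added to that cycle's log-likelihood.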
By taking logs on both sides of equation 3-29 and maximizing the log-likelihood, coded with the same E04JYF routine, we obtained the results shown in Table 3-3 below.
Parameter     Estimated value    True value
$\mu_1$       0.2056             0.2176
$\eta$        3.1702             4.0
$b$           0.8126             0.8
$\sigma_1$    0.0232             0.05
$\sigma_2$    0.0362             0.025

Table 3-3: The estimated parameters and true values using the likelihood function with failure information
The results show a significant improvement in the estimated value of $\sigma_2$ obtained by incorporating the failure information, which is shown to be a vital piece of information for parameter estimation, although it may not be available in practice. Although the likelihood function with failure information was successful in estimating the parameters, and some of the results were very good, it was felt that the results might be better if another technique were used; hence we turned our attention to the Expectation-Maximization (EM) algorithm, which is particularly useful for HMMs (Rabiner, 1989).
3.8.2 Expectation-Maximization (EM) Algorithm