Maximum Likelihood Estimator Parameter Estimation

3.8.1 Maximum Likelihood Estimator

The idea behind maximum likelihood is to find the values of the parameters under which a given likelihood function best describes the data, that is, their most likely values. In our case, $y_i$ is the only data gathered from the observed information, and it may depend on past observations, so the likelihood function for one component life cycle can be written as

$$L(\Theta \mid \Im_T) = \prod_{i=1}^{T} p(y_i \mid \Im_{i-1}, \Theta) = \prod_{i=1}^{T} \sum_{x_{i-1}=1}^{N} \sum_{x_i=x_{i-1}}^{N} p(y_i \mid x_i, \Theta_2)\, P(x_i \mid x_{i-1}, \Theta_1)\, P(x_{i-1} \mid \Im_{i-1}, \Theta) \tag{3-24}$$

where $\Theta = \{\Theta_1, \Theta_2\}$ is a set of unknown parameters, $T$ is the number of monitoring checks during the cycle, $N$ is the number of states, $y_i$ is the condition monitoring reading at time $t_i$ and $\Im_{i-1} = \{y_1, y_2, \ldots, y_{i-1}\}$. The parameter estimates that maximize the likelihood in equation 3-24 are the most likely values of the parameters. However, the product in the likelihood can become a very small number, which is inconvenient for computation. For numerical convenience, the log of the likelihood is used instead, so equation 3-24 can be rewritten as

$$\log L(\Theta \mid \Im_T) = \sum_{i=1}^{T} \log p(y_i \mid \Im_{i-1}, \Theta) = \sum_{i=1}^{T} \log \left\{ \sum_{x_{i-1}=1}^{N} \sum_{x_i=x_{i-1}}^{N} p(y_i \mid x_i, \Theta_2)\, P(x_i \mid x_{i-1}, \Theta_1)\, P(x_{i-1} \mid \Im_{i-1}, \Theta) \right\} \tag{3-25}$$

In our case, we could have more than one component life cycle; hence we simulated five life cycles, which are shown in Figure 3-9.

Figure 3-9: Simulation of five life cycles of data imitating the bearing case

Each set contains one complete life cycle, and the number of monitoring checks $T$ varies from cycle to cycle.
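The log-likelihood of equation 3-25 can be evaluated one reading at a time, because each factor $p(y_i \mid \Im_{i-1}, \Theta)$ only needs the state probabilities carried forward from the previous reading. The following is a minimal sketch of that recursion, not the FORTRAN95 code used in this work. The two-state chain, the normal reading density, the names `TRANS`, `PRIOR`, `MEANS`, `SD` and `example_emis`, and all numerical values are assumptions made purely for illustration.

```python
import math


def normal_pdf(y, mean, sd):
    """Normal density, used here only as a stand-in reading model (assumption)."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def cycle_log_likelihood(ys, trans, prior, emis):
    """Evaluate the log-likelihood of one life cycle (equation 3-25).

    ys    : readings y_1..y_T
    trans : trans[a][b] = P(x_i = b | x_{i-1} = a)
    prior : prior[a] = P(x_0 = a), the state before the first reading
    emis  : emis(y, b) = p(y | x_i = b)
    """
    n = len(prior)
    post = list(prior)      # P(x_{i-1} | readings up to i-1)
    log_lik = 0.0
    for y in ys:
        # joint[b] = p(y_i | x_i = b) * sum_a P(x_i = b | x_{i-1} = a) * post[a]
        joint = [
            emis(y, b) * sum(trans[a][b] * post[a] for a in range(n))
            for b in range(n)
        ]
        p_y = sum(joint)    # p(y_i | past readings), the i-th factor of eq. 3-24
        log_lik += math.log(p_y)
        post = [v / p_y for v in joint]   # state probabilities after reading i
    return log_lik


# Hypothetical two-state example: state 0 = normal, state 1 = defective.
TRANS = [[0.9, 0.1],
         [0.0, 1.0]]        # no return from defective to normal
PRIOR = [1.0, 0.0]          # a new component starts in the normal state
MEANS, SD = [1.0, 3.0], 0.5 # assumed reading level in each state


def example_emis(y, state):
    return normal_pdf(y, MEANS[state], SD)


readings = [0.8, 1.1, 1.4, 2.9, 3.2]
print(cycle_log_likelihood(readings, TRANS, PRIOR, example_emis))
```

Summing logs reading by reading, rather than multiplying the factors and taking one log at the end, is what avoids the very small products mentioned above.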
Similarly, the likelihood for $m$ life cycles is given by

$$L(\Theta \mid \Im_{mT}) = \prod_{j=1}^{m} \prod_{i=1}^{T_j} p(y_{ji} \mid \Im_{j,i-1}, \Theta) = \prod_{j=1}^{m} \prod_{i=1}^{T_j} \sum_{x_{j,i-1}=1}^{N} \sum_{x_{ji}=x_{j,i-1}}^{N} p(y_{ji} \mid x_{ji}, \Theta_2)\, P(x_{ji} \mid x_{j,i-1}, \Theta_1)\, P(x_{j,i-1} \mid \Im_{j,i-1}, \Theta) \tag{3-26}$$

where $m$ is the number of complete life cycles, $\Theta = \{\Theta_1, \Theta_2\}$ is a set of unknown parameters, $T_j$ is the number of monitoring checks of the $j$th cycle, $y_{ji}$ is the condition monitoring reading at $t_i$ for the $j$th cycle and the $i$th monitoring, $x_{ji}$ and $x_{j,i-1}$ are the states of the system at $t_i$ and $t_{i-1}$ respectively for the $j$th cycle and $i$th monitoring, and $\Im_{j,i-1} = \{y_{j1}, y_{j2}, \ldots, y_{j,i-1}\}$. Again, for numerical convenience, the log of the likelihood is used, so equation 3-26 can be rewritten as

$$\log L(\Theta \mid \Im_{mT}) = \sum_{j=1}^{m} \sum_{i=1}^{T_j} \log p(y_{ji} \mid \Im_{j,i-1}, \Theta) = \sum_{j=1}^{m} \sum_{i=1}^{T_j} \log \left\{ \sum_{x_{j,i-1}=1}^{N} \sum_{x_{ji}=x_{j,i-1}}^{N} p(y_{ji} \mid x_{ji}, \Theta_2)\, P(x_{ji} \mid x_{j,i-1}, \Theta_1)\, P(x_{j,i-1} \mid \Im_{j,i-1}, \Theta) \right\} \tag{3-27}$$

We coded equation 3-27 in FORTRAN95 and used the NAG E04JYF optimisation routine to estimate the five unknown parameters, as shown in Table 3-2.

Table 3-2: The estimated parameters and their true values

Parameter                        Estimated value    True value
1st parameter (subscript 1)      0.2062             0.2176
2nd parameter (no subscript)     3.1710             4.0
b                                0.8109             0.8
4th parameter (subscript 1)      0.0231             0.05
5th parameter (subscript 2)      not available      0.025

We obtained good estimates for the first four parameters, but had no success with the fifth (subscript 2), whose estimate tended to the lowest boundary set in the optimisation routine. To explain this, let us look at each monitoring point, where we have the likelihood contribution of the reading $y_i$, up to the last monitoring point before failure occurs, with the contribution of the reading $y_{T_j}$, as shown in Figure 3-10. Here the failure information is not used, so the likelihood of equation 3-26 contains little information on the length $l_2$, which is why the fifth parameter is "non-identifiable" from these data.
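The structure of equation 3-27, one log-likelihood per cycle summed over cycles and then maximised within bounds, can be sketched as follows. This is an illustration under stated assumptions rather than the code used here. The two-state chain with a single per-check transition probability `q`, the normal reading density, the data in `cycles`, and the values of `MEANS` and `SD` are all hypothetical. A coarse grid search (`grid_maximise`) stands in for the NAG E04JYF routine and is not that routine's algorithm.

```python
import math


def normal_pdf(y, mean, sd):
    """Normal density, used here only as a stand-in reading model (assumption)."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def cycle_log_likelihood(ys, q, means, sd):
    """Log-likelihood of one cycle for a two-state chain (0 = normal,
    1 = defective) with per-check probability q of moving from 0 to 1."""
    post = [1.0, 0.0]                   # each cycle starts in the normal state
    log_lik = 0.0
    for y in ys:
        # One-step prediction of the state before seeing the reading.
        pred = [post[0] * (1.0 - q), post[0] * q + post[1]]
        joint = [normal_pdf(y, means[s], sd) * pred[s] for s in range(2)]
        p_y = sum(joint)                # p(y_ji | past readings of cycle j)
        log_lik += math.log(p_y)
        post = [v / p_y for v in joint]
    return log_lik


def total_log_likelihood(cycles, q, means, sd):
    """Equation 3-27: sum of the per-cycle log-likelihoods over all m cycles."""
    return sum(cycle_log_likelihood(ys, q, means, sd) for ys in cycles)


def grid_maximise(objective, lower, upper, steps=200):
    """Coarse bounded search: return the best grid point in [lower, upper]."""
    grid = [lower + (upper - lower) * k / steps for k in range(steps + 1)]
    return max(grid, key=objective)


# Hypothetical data: three cycles with different numbers of checks T_j.
cycles = [
    [1.0, 0.9, 1.2, 2.8, 3.1],
    [1.1, 1.0, 3.0, 2.9],
    [0.8, 1.2, 1.1, 0.9, 3.2, 3.0],
]
MEANS, SD = [1.0, 3.0], 0.5

q_hat = grid_maximise(
    lambda q: total_log_likelihood(cycles, q, MEANS, SD), 0.01, 0.99
)
print(q_hat)
```

Because the search is confined to `[lower, upper]`, a parameter that the data barely inform will tend to end up on one of the bounds, which is the behaviour described above for the fifth parameter.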
Figure 3-10: The likelihood function at each monitoring point

By incorporating the failure information, the probability of $X_{T_j+1} = 3$ given $\Im_{jT_j}$ is

$$P(X_{T_j+1} = 3 \mid \Im_{jT_j}) = P(X_{T_j+1} = 3 \mid X_{jT_j} = 1)\, P(X_{jT_j} = 1 \mid \Im_{jT_j}) + P(X_{T_j+1} = 3 \mid X_{jT_j} = 2)\, P(X_{jT_j} = 2 \mid \Im_{jT_j}) \tag{3-28}$$

Multiplying equation 3-26 by this term for each cycle, the likelihood function becomes

$$L(\Theta \mid \Im_{mT}) = \prod_{j=1}^{m} \left\{ \left[ \prod_{i=1}^{T_j} \sum_{x_{j,i-1}=1}^{N} \sum_{x_{ji}=x_{j,i-1}}^{N} p(y_{ji} \mid x_{ji}, \Theta_2)\, P(x_{ji} \mid x_{j,i-1}, \Theta_1)\, P(x_{j,i-1} \mid \Im_{j,i-1}, \Theta) \right] P(X_{T_j+1} = 3 \mid \Im_{jT_j}) \right\} \tag{3-29}$$

Taking logs on both sides of equation 3-29, we coded the log-likelihood and maximized it with the same E04JYF routine, obtaining the results shown in Table 3-3.

Table 3-3: The estimated parameters and true values using the likelihood function with failure information

Parameter                        Estimated value    True value
1st parameter (subscript 1)      0.2056             0.2176
2nd parameter (no subscript)     3.1702             4.0
b                                0.8126             0.8
4th parameter (subscript 1)      0.0232             0.05
5th parameter (subscript 2)      0.0362             0.025

The results show a significant improvement in the estimate of the fifth parameter (subscript 2) once the failure information is incorporated: an estimate of 0.0362 against a true value of 0.025, where previously none could be obtained. Failure information is therefore a vital input to parameter estimation, although it may not be available in practice. Although the likelihood function with failure information was successful for estimating the parameters, and some of the results were very good, it was felt that another technique might give better results; hence we turned our attention to the Expectation-Maximization (EM) algorithm, which is particularly useful for HMMs (Rabiner, 1989).
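The failure term of equation 3-28 is a weighted sum over the state at the last monitoring point, and equation 3-29 adds its log to the cycle's log-likelihood. The sketch below illustrates this for one cycle. It is not the code used in this work. The three-state transition matrix `TRANS`, the normal reading density, the convention that a failed component produces no reading (`example_emis` returns 0 for the failed state), and all numerical values are assumptions for illustration.

```python
import math


def normal_pdf(y, mean, sd):
    """Normal density, used here only as a stand-in reading model (assumption)."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def filter_cycle(ys, trans, prior, emis):
    """Forward filter over one cycle.

    Returns the sum of log p(y_i | past readings) and the state
    probabilities P(X_T = k | readings up to T) at the last check.
    """
    n = len(prior)
    post = list(prior)
    log_lik = 0.0
    for y in ys:
        joint = [
            emis(y, b) * sum(trans[a][b] * post[a] for a in range(n))
            for b in range(n)
        ]
        p_y = sum(joint)
        log_lik += math.log(p_y)
        post = [v / p_y for v in joint]
    return log_lik, post


def failure_probability(post, trans, failed):
    """Equation 3-28: P(X_{T+1} = failed | readings), summed over working states."""
    return sum(
        trans[k][failed] * post[k] for k in range(len(post)) if k != failed
    )


def cycle_log_likelihood_with_failure(ys, trans, prior, emis, failed):
    """Contribution of one cycle to the log of equation 3-29."""
    log_lik, post = filter_cycle(ys, trans, prior, emis)
    return log_lik + math.log(failure_probability(post, trans, failed))


# Hypothetical example: state 0 = normal, 1 = defective, 2 = failed.
FAILED = 2
TRANS = [[0.90, 0.08, 0.02],
         [0.00, 0.85, 0.15],
         [0.00, 0.00, 1.00]]
PRIOR = [1.0, 0.0, 0.0]
MEANS, SD = [1.0, 3.0], 0.5


def example_emis(y, state):
    # Assumption: readings are only taken while the component is working.
    return 0.0 if state == FAILED else normal_pdf(y, MEANS[state], SD)


readings = [0.9, 1.1, 2.8, 3.1]
print(cycle_log_likelihood_with_failure(readings, TRANS, PRIOR, example_emis, FAILED))
```

In this sketch the extra term depends on the transition probabilities into the failed state, so it brings in information about the final stage that the readings alone do not carry.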

3.8.2 Expectation-Maximization (EM) Algorithm