
HMMs have become very popular in modern statistics, having been used widely in the field of speech recognition (Rabiner, 1989). Their use has recently spread to other areas such as bioinformatics and engineering, including machine tool monitoring (Ertunc et al., 2001), image segmentation (Li and Gray, 2000) and fault detection and diagnosis (Bunks et al., 2000).

3.2 Background of Hidden Markov Models

An HMM is a Markov chain observed in noise (Cappé et al., 2005). This Markov chain is assumed to have a finite number of states, $x_i$, but these are not directly observable (Rabiner, 1989). Instead, we can observe another random measurement, $y_i$, which is related to each state by a probability distribution. The term 'hidden' comes from the notion of making observations and drawing conclusions without knowing the exact states of the system, which are hidden. The structure of an HMM may be represented graphically (Cappé et al., 2005), as in Figure 3-1 below.

Figure 3-1: Graphical representation of the structure of a hidden Markov model

Figure 3-1 implies that the hidden state $x_{i+1}$ at time $t_{i+1}$ is independent of the history of the process, $x_1, \ldots, x_{i-1}$, conditional on the previous state $x_i$ at time $t_i$. This is called the first-order Markov assumption, and the resulting model is a first-order HMM. More generally, $x_{i+1}$ may depend on past states back to $x_{i-j}$, $j \geq 1$; it is possible to obtain such a model, called a $(j+1)$th-order HMM, but a higher-order HMM will have greater complexity. Similarly, the observed value $y_{i+1}$ is independent of the past observations $y_1, \ldots, y_i$, conditional on the value of $x_{i+1}$. Hence, in order to define the HMM, the following elements are needed:

1. A set of hidden states, which are the actual states of the process but cannot be observed directly.
2. Transition probabilities $P(x_i \mid x_{i-1})$, which describe the probability of moving from state $x_{i-1}$ at time $t_{i-1}$ to state $x_i$ at time $t_i$.
3. A set of values that can be observed or measured, which is a random function of the current state.
4. An observation probability density function (pdf) $p(y_i \mid x_i)$, which describes the pdf of observing $y_i$ given that the actual state is $x_i$ at time $t_i$.

The relationship between the system and the observed data is best explained by way of an example. Consider some observed data represented by vibration monitoring readings, as shown in Figure 3-2 below, where $y_i$ is the observed data and $x_i$ denotes the hidden states of the system.

Figure 3-2: Processes of observation and hidden states

At each particular time $t_i$, condition-monitoring data is collected through an observation $y_i$. We assume that the condition-monitoring data is stochastically correlated with the true state $x_i$, which is hidden. The true state may be assumed to be describable by a set of terms such as 'good', 'minor defect', 'major defect' or 'failed'. We are interested in what the system state is at any particular moment and beyond. The exact states cannot be known, but their probabilities can be calculated. Hence, at each time $t_i$, we seek the probability of $x_i$, the true state of the system, given the monitored condition information history $\Omega_i = \{y_1, y_2, \ldots, y_i\}$, namely $P(x_i \mid \Omega_i)$.
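To make these elements concrete, the following Python sketch builds a small HMM along the lines of the vibration-monitoring example above and computes the filtered state probabilities $P(x_i \mid \Omega_i)$ using the standard forward (filtering) recursion. The four state labels follow the text, but the transition matrix, the Gaussian emission parameters and the observation sequence are invented purely for illustration and are not taken from this thesis.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 4-state condition-monitoring HMM. The state names come from
# the example in the text; all numerical values below are illustrative.
states = ["good", "minor defect", "major defect", "failed"]

# Transition probabilities P(x_i | x_{i-1}); each row sums to 1.
A = np.array([
    [0.95, 0.04, 0.01, 0.00],
    [0.00, 0.90, 0.08, 0.02],
    [0.00, 0.00, 0.85, 0.15],
    [0.00, 0.00, 0.00, 1.00],   # 'failed' is absorbing
])

# Gaussian observation pdfs p(y_i | x_i): vibration level is assumed to
# rise as the condition degrades (illustrative means / std deviations).
means = np.array([1.0, 2.0, 4.0, 8.0])
sds = np.array([0.5, 0.7, 1.0, 1.5])

pi = np.array([1.0, 0.0, 0.0, 0.0])  # system assumed 'good' at time t_1


def filtered_probabilities(y):
    """Return P(x_i | Omega_i) for each t_i, where Omega_i = {y_1,...,y_i}."""
    probs = []
    belief = pi
    for i, obs in enumerate(y):
        if i > 0:
            belief = belief @ A                      # predict via P(x_i | x_{i-1})
        belief = belief * norm.pdf(obs, means, sds)  # update with p(y_i | x_i)
        belief = belief / belief.sum()               # normalise -> P(x_i | Omega_i)
        probs.append(belief)
    return np.array(probs)


y = [1.1, 1.3, 2.2, 2.8, 4.5, 5.1]  # synthetic vibration readings
for t, p in enumerate(filtered_probabilities(y), start=1):
    print(f"t_{t}: " + ", ".join(f"P({s})={v:.3f}" for s, v in zip(states, p)))
```

Each pass through the loop first propagates the previous filtered distribution through the transition probabilities (the prediction step) and then reweights it by the observation likelihood and normalises (the update step); this is the forward recursion underlying standard HMM inference (Rabiner, 1989).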

3.3 Modelling Methodology