HMMs have become very popular in modern statistics, having been used widely in the field of speech recognition (Rabiner, 1989). Their usage has recently spread to other areas such as bioinformatics and engineering, including machine tool monitoring (Ertunc et al., 2001), image segmentation (Li and Gray, 2000) and fault detection and diagnosis (Bunks et al., 2000).
3.2 Background of Hidden Markov Models
An HMM is a Markov chain observed in noise (Cappé et al., 2005). This Markov chain is assumed to have a finite number of states x_i, but these are not directly observable (Rabiner, 1989). Instead, we can observe another random measurement y_i, which is related to each state by a probability distribution. The term 'hidden' comes from the notion of making observations and drawing conclusions without knowing the exact states of the system, which are hidden. The structure of an HMM may be represented graphically (Cappé et al., 2005), as in Figure 3-1 below.
Figure 3-1: Graphical representation of the structure of a hidden Markov model
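The structure in Figure 3-1 can be illustrated with a small simulation: a hidden first-order Markov chain that is never observed directly, together with a noisy measurement at each step. This is a minimal sketch; the two-state chain, its transition probabilities and the Gaussian emission distributions below are purely illustrative assumptions, not taken from the text.

```python
import random

# Hypothetical two-state HMM (all parameters are illustrative assumptions):
# a first-order Markov chain x_1, x_2, ... that is never observed directly,
# and noisy measurements y_i that depend only on the current state x_i.
STATES = [0, 1]
P_TRANS = [[0.9, 0.1],   # P(x_{i+1} | x_i = 0)
           [0.2, 0.8]]   # P(x_{i+1} | x_i = 1)
EMIT_MEAN = [0.0, 3.0]   # observation pdf p(y_i | x_i): y_i ~ Normal(EMIT_MEAN[x_i], 1)

def simulate(n, seed=0):
    """Generate n steps of (hidden state, observation) pairs."""
    rng = random.Random(seed)
    x = 0  # initial hidden state
    path, obs = [], []
    for _ in range(n):
        path.append(x)
        obs.append(rng.gauss(EMIT_MEAN[x], 1.0))        # y_i depends only on x_i
        x = rng.choices(STATES, weights=P_TRANS[x])[0]  # x_{i+1} depends only on x_i
    return path, obs

path, obs = simulate(10)
```

Note that the hidden path is used only to generate the observations; an analyst would see `obs` alone, which is exactly the situation depicted in Figure 3-1.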
Figure 3-1 implies that the hidden state x_{i+1} at time t_{i+1} is independent of the history of the process, x_1, ..., x_{i-1}, conditional on the previous state x_i at time t_i. This is called the first-order Markov assumption, and the resulting model becomes a first-order HMM. Generally, x_{i+1} may depend on past states back to x_{i-j}, j >= 1; it is possible to obtain such a model, called a (j+1)th-order HMM, but a higher-order HMM will have greater complexity. Similarly, the observed values of
y_{i+1} are independent of the past observations y_1, ..., y_i, but conditional on the value of x_{i+1}. Hence, in order to define the HMM, the following elements are needed:
1. A set of hidden states, which are the actual states of the process, but cannot be observed directly.
2. Transition probabilities P(x_i | x_{i-1}), which describe the probability of moving from state x_{i-1} at time t_{i-1} to state x_i at time t_i.
3. A set of values that can be observed or measured, which is a random function of the current state.
4. An observation probability density function (pdf) p(y_i | x_i), which describes the pdf of observing y_i given that the actual state is x_i at time t_i.

The relationship between the system and the data observed is best explained by way of
an example. Consider some observed data represented by vibration monitoring readings, as shown in Figure 3-2 below, where y_i is the observed data and x_i shows the hidden states of the system.
Figure 3-2: Processes of observation and hidden states
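The observation and hidden-state processes of Figure 3-2 can be sketched as a condition-monitoring simulation. Only the four state labels come from the text; the degradation-only transition probabilities and the mean vibration level per state are illustrative assumptions.

```python
import random

# Sketch of the monitoring setup in Figure 3-2. The four state labels are from
# the text; all numerical parameters are illustrative assumptions.
STATES = ["good", "minor defect", "major defect", "failed"]
# Degradation-only transition probabilities P(x_i | x_{i-1}); each row sums to 1.
P_TRANS = [
    [0.95, 0.05, 0.00, 0.00],
    [0.00, 0.90, 0.10, 0.00],
    [0.00, 0.00, 0.85, 0.15],
    [0.00, 0.00, 0.00, 1.00],  # 'failed' is absorbing
]
# Assumed mean vibration reading per state: worse condition -> higher vibration.
VIB_MEAN = [1.0, 2.0, 4.0, 8.0]

def monitor(n_steps, seed=1):
    """Return hidden condition states x_i and vibration observations y_i."""
    rng = random.Random(seed)
    state = 0  # start in 'good'
    xs, ys = [], []
    for _ in range(n_steps):
        xs.append(STATES[state])
        ys.append(rng.gauss(VIB_MEAN[state], 0.5))  # y_i stochastically correlated with x_i
        state = rng.choices(range(4), weights=P_TRANS[state])[0]
    return xs, ys
```

In practice only the readings `ys` would be available to the analyst, while the condition sequence `xs` remains hidden, which is precisely the inference problem discussed next.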
At each particular time t_i, condition-monitoring data is collected through an observation y_i. We assume that the condition-monitoring data is stochastically correlated with the true state x_i, which is hidden. The true state may be assumed to be describable by a set of terms such as 'good', 'minor defect', 'major defect' or 'failed'. We are
interested in what the system state is at any particular moment and beyond. Surely the exact states cannot be known, but their probabilities can be calculated. Hence, at each time t_i, we seek to find the probability of x_i, the true state of the system, given the monitored condition information history y_1, y_2, ..., y_i, namely P(x_i | y_1, y_2, ..., y_i).
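The filtering probability P(x_i | y_1, ..., y_i) can be computed by a standard forward (predict-update) recursion. The sketch below is a minimal illustration, assuming a hypothetical two-state chain with Gaussian observation pdfs; all names and parameters are illustrative, not taken from the text.

```python
import math

# Minimal sketch of recursive filtering for P(x_i | y_1, ..., y_i).
# The two-state chain and Gaussian observation pdfs are illustrative assumptions.
P_TRANS = [[0.9, 0.1], [0.2, 0.8]]  # transition probabilities P(x_i | x_{i-1})
MEANS = [0.0, 3.0]                   # observation pdf p(y_i | x_i) = Normal(MEANS[x_i], 1)

def gauss_pdf(y, mean, sd=1.0):
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def filter_states(ys, prior=(0.5, 0.5)):
    """Return P(x_i | y_1, ..., y_i) for each i, via the forward recursion."""
    belief = list(prior)
    history = []
    for y in ys:
        # Predict: propagate the current belief through the transition probabilities.
        pred = [sum(belief[j] * P_TRANS[j][k] for j in range(2)) for k in range(2)]
        # Update: weight each state by the observation pdf p(y_i | x_i) and normalise.
        upd = [pred[k] * gauss_pdf(y, MEANS[k]) for k in range(2)]
        z = sum(upd)
        belief = [u / z for u in upd]
        history.append(belief)
    return history
```

Each element of the returned history is a probability distribution over the hidden states, so although the exact state is never known, its probability at every time t_i is available, as stated above.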
3.3 Modelling Methodology