Nur Azman Abu et al. Journal of Computer Science 11 2: 351.360, 2015
355
Science Publications
JCS
1 1
1 1
1 1
2 2
2 2
2 2
n 1 n 1
n 1 n 1
n 1 n 1
1 2
1 1
2 1
1 2
1 1
2 1
t t
t t n
c x
t t
t t n
c x
t t
t t n
c x
t t
t t
n c
x
− −
− −
− −
−
−
= −
−
18
where, C is the coefficient of discrete Tchebichef transform, which represented by c
, c
1
, c
2
, ….,c
n-1.
T is a matrix computation of discrete orthonormal
Tchebichef polynomials t
k
n for k = 0,1,…., N-1. S is
the sample of speech signal window which is given by x
, x
1
, x
2
, …, x
n-1
. The coefficient of DTT is given in Equation 19 as follows:
-1
C = T S
19 Next, speech signal on frame 3 is computed with
1024 discrete orthono1rmal Tchebichef polynomials.
3.5. Spectrum Analysis
The spectrum analysis using FFT can be generated in Equation 20 as follows:
2
p k c n
= 20
The spectrum analysis using FFT of the vowel ‘O’ is
shown in Fig. 7. The spectrum analysis using DTT can be defined in Equation 21 and 22 as follows:
2
p k c n
= 21
k
x n c n
t n
= 22
where, cn is the coefficient of DTT, xn is the sample data at time index n and t
k
n is the computation matrix of orthonormal Tchebichef polynomials. The spectrum
analysis using DTT of the vowel ‘O’ is shown in Fig. 8.
3.6. Power Spectral Density
Power Spectral Density PSD shows the strength of the variations energy as a function of frequency. In
other words, it shows the frequencies at which variations are strong and at which frequency variations are weak.
The one-sided power spectral density using FFT can be computed in Equation 23 as follows:
2 2
1
2 X k
ps k t
t
= −
23 where, Xk is a vector of N values at frequency index
k, the factor 2 is called for here in order to include the contributions from positive and negative frequencies.
The result is precisely the average power of spectrum in the time range t
1
, t
2
. The power spectral density in 23 and 24 are plotted on a decibel dB scale of
20log
10
. The power spectral density using FFT for
vowel ‘O’ on frame 3 is shown in Fig. 9. The power spectral density using DTT can be generated in Equation
24 as follows:
2 2
1
2 c n
pw k t
t
= −
24 where, cn is the coefficient of discrete Tchebichef
transform. The power spectral density using DTT for vowel ‘O’ on frame 3 is shown in Fig. 10.
3.7. Autoregressive Model
Autoregressive AR models are used for linear prediction model Hsu and Liu, 2010 to obtain all pole
estimate of the signal’s power spectrum. Autoregressive model is used to determine the characteristics of the
vocal and to evaluate the formants. The autoregressive process of a series y
j
using FFT of order v is given in Equation 25 as follows:
v 1
j k
j -k j
k
y a
q e
=
= − ⋅
+
∑
25 where, a
k
are real value autoregression coefficients, q
j
represents the inverse FFT from power spectral density and v is set to 12. The peaks of frequency
formants using FFT in autoregressive for vowel ‘O’ on frame 3 are shown in Fig. 11. The autoregressive
process of a series y
j
using DTT of order v is given in Equation 26 as follows:
1 v
j k
j -k j
k
y a
c + e
=
= − ⋅
∑
26 where, a
k
are real value autoregression coefficients, v is 12 and c
j
is the coefficient of DTT at frequency index j. e
j
represents the errors that are term independent of past samples. The frequency formants
using DTT which are autoregressive for vowel ‘O’ on frame 3 are shown in Fig. 12.
Nur Azman Abu et al. Journal of Computer Science 11 2: 351.360, 2015
356
Science Publications
JCS
3.8. Frequency Formants
The uniqueness of each vowel is measured by formants. The resonance frequencies known as formant
can be detected as the peaks of the magnitude spectrum of speech signals. Formants are defined as the resonance
frequencies of the vocal tract which are formed by the shape of vocal tract Ozkan et al., 2009.
A formant is a characteristic resonant region peak in the power spectral density of a sound. Next, the
frequency formants shall be detected. The formants of the autoregressive curve are found at the peaks using a
numerical derivative. These vector positions of the formants are used to characterize a particular vowel. The
first two formants F
1
, F
2
of a vowel utterance cue the phonemic identity of the vowel Patil et al., 2010. The
third formant F
3
is also important for vowel categorisation Kiefte et al., 2010. However, it is
frequently excluded from vowel plots overshadowed by the first two formant.
The frequency peak formants of the experiment result F
1
, F
2
and F
3
are compared to referenced formants to decide on the output of the vowel. The
frequency formants of the five vowels and the five consonants using FFT and DTT on frame 3 are as
shown in
Table 1
and 2
respectively.
Table 1.
Frequency formants of five vowels FFT
DTT ----------------------------------------------------------
--------------------------------------------------------- Vowel
F
1
F
2
F
3
F
1
F
2
F
3
i 215
2444 3434
226 2411
3466 e
322 1453
2401 301
1485 2357
a 667
1055 2637
581 979
2670 o
462 689
3208 452
710 3219
u 247
689 3413
301 699
3380
Table 2.
Frequency formants of five consonants FFT
DTT ----------------------------------------------------------
--------------------------------------------------------- Consonant
F
1
F
2
F
3
F
1
F
2
F
3
k 796
1152 2347
721 1130
2336 n
764 1324
2519 839
1345 2508
p 785
1076 2573
753 1065
2562 r
635 1281
2121 624
1248 2131
t 829
1152 2519
796 1141
2530
Fig. 3.
Speech signals after pre-emphasis of the vowel ‘O’
Nur Azman Abu et al. Journal of Computer Science 11 2: 351.360, 2015
357
Science Publications
JCS
Frame 1 Frame 2
Frame 3 Frame 4
Fig. 4.
Speech signal windowed into four frames
Fig. 5.
Imaginary part of FFT for speech signal of vowel ‘O’ on frame 3
Fig. 6.
Coefficient of DTT for speech signal of the vowel ‘O’ on frame 3
Nur Azman Abu et al. Journal of Computer Science 11 2: 351.360, 2015
358
Science Publications
JCS Fig. 7.
Imaginary part of FFT for spectrum analysis of the vowel ‘O’ on frame 3
Fig. 8.
Coefficient of DTT for spectrum analysis of the vowel ‘O’ on frame 3
Fig. 9.
Power Spectral Density using FFT for vowel ‘O’ on frame 3
Fig. 10.
Power spectral density using DTT for vowel ‘O’ on frame 3
Fig. 11.
Autoregressive using FFT for vowel ‘O’ on frame 3
Fig. 12.
Autoregressive using DTT for vowel ‘O’ on frame 3
Nur Azman Abu et al. Journal of Computer Science 11 2: 351.360, 2015
359
Science Publications
JCS
4. DISCUSSION