English digits speech recognition system based on Hidden Markov Models

Author

Abushariah, Ahmad A M ; Gunawan, Teddy S. ; Khalifa, Othman O. ; Abushariah, Mohammad A M

Author_Institution

Electr. & Comput. Eng. Dept., Int. Islamic Univ., Kuala Lumpur, Malaysia

fYear

2010

fDate

11-12 May 2010

Firstpage

1

Lastpage

5

Abstract

This paper presents the design and implementation of an English digits speech recognition system with a Matlab graphical user interface (GUI). The work is based on the Hidden Markov Model (HMM), which provides a highly reliable way of recognizing speech. The system recognizes speech by translating the speech waveform into a set of feature vectors using the Mel Frequency Cepstral Coefficients (MFCC) technique. The paper covers all ten English digits (zero through nine), modelled as isolated words. Two modules were developed: isolated-word speech recognition and continuous speech recognition. Both modules were tested in clean and noisy environments and achieved successful recognition rates. In the clean environment, the isolated-word module achieved 99.5% in multi-speaker mode and 79.5% in speaker-independent mode, while the continuous module achieved 72.5% in multi-speaker mode and 56.25% in speaker-independent mode. In the noisy environment, the isolated-word module achieved 88% in multi-speaker mode and 67% in speaker-independent mode, while the continuous module achieved 82.5% in multi-speaker mode and 76.67% in speaker-independent mode. These recognition rates are relatively successful compared with those of similar systems.
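
The isolated-word decision summarized in the abstract (one HMM per digit, with the best-scoring model giving the recognized word) can be illustrated with a minimal sketch. This is not the authors' Matlab implementation. It assumes discrete HMMs whose observations are codebook indices standing in for quantized MFCC vectors, and the function names, the two toy models, and all probabilities are hypothetical values chosen only for illustration.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm).

    obs   : non-empty list of observation symbol indices
    start : start[i]    = probability of starting in state i
    trans : trans[i][j] = probability of moving from state i to state j
    emit  : emit[i][k]  = probability that state i emits symbol k
    """
    n_states = len(start)
    # Initialisation: alpha_1(i) = start_i * b_i(o_1)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(o_t)
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid underflow and accumulate the log of the scale factors
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def recognize(obs, models):
    """Pick the word whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda word: forward_log_likelihood(obs, *models[word]))


# Hypothetical two-state left-to-right models over a three-symbol codebook.
# Each entry is (start, trans, emit); the numbers are illustrative only.
toy_models = {
    "zero": ([1.0, 0.0],
             [[0.7, 0.3], [0.0, 1.0]],
             [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "one": ([1.0, 0.0],
            [[0.7, 0.3], [0.0, 1.0]],
            [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], toy_models))
    print(recognize([2, 2, 0, 0], toy_models))
```

In a real system each digit model would be trained on MFCC feature sequences rather than written by hand, and continuous-density emissions are a common alternative to the discrete codebook assumed here.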

Keywords

cepstral analysis; hidden Markov models; natural language processing; speech recognition; English digits speech recognition system; MFCC technique; Matlab GUI; continuous speech recognition; feature vector; hidden Markov model; isolated word speech recognition; mel frequency cepstral coefficient; multispeaker mode; noisy environment; speaker-independent mode; speech waveform; word structure; Hidden Markov models; Mel frequency cepstral coefficient; Noise measurement; Speech; Speech recognition; Testing; Training; English digits; Features extraction; Hidden Markov Models; Mel Frequency Cepstral Coefficients;

fLanguage

English

Publisher

ieee

Conference_Titel

Computer and Communication Engineering (ICCCE), 2010 International Conference on

Conference_Location

Kuala Lumpur

Print_ISBN

978-1-4244-6233-9

Type

conf

DOI

10.1109/ICCCE.2010.5556819

Filename

5556819