Static representation of speech dynamics for isolated word recognition

Author

Chan, Chorkin ; Wu, Jian-Xiong

Author_Institution

Dept. of Comput. Sci., Hong Kong Univ., Hong Kong

Volume

1

fYear

1992

fDate

23-26 Mar 1992

Firstpage

529

Abstract

A static model (SM) in the form of a single vector is proposed to represent the temporal properties of a sequence of speech feature vectors. In contrast to a hidden Markov model, which captures the conditional probabilities of state transitions between consecutive observation vectors x_t and x_{t+1} over time, an SM captures their average joint probabilities of belonging to a pair of phonetic classes ω_i and ω_j without any Markovian assumption. SM is tested with isolated words derived from the TIMIT database as well as with artificially created words. The TIMIT vocabulary consists of 21 words derived from the two "sa" sentences spoken by 420 speakers. The artificial vocabulary of 10 words is designed to study the limitations of SM. Experimental results indicate that, apart from a rather mild limitation in handling a certain type of vocabulary, SM achieves a higher recognition rate than baseline continuous hidden Markov models (CHMM) on isolated word recognition, and it takes only 60% of the time needed by CHMM in recognition.
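The abstract does not give the exact formula for the SM vector, so the following is only a minimal sketch of one possible reading. It assumes each frame already comes with a posterior distribution over K phonetic classes (a hypothetical input, not something the abstract specifies), approximates the joint probability for a frame pair as the product of the two frames' posteriors, and averages over all consecutive pairs. The function name `static_model` and the toy data are illustrative assumptions.

```python
from typing import List, Sequence


def static_model(posteriors: Sequence[Sequence[float]]) -> List[float]:
    """Flattened K*K vector of averaged pairwise class probabilities.

    posteriors[t][i] is assumed to be P(class i | frame t). Entry i*K + j of
    the result is the mean over t of posteriors[t][i] * posteriors[t+1][j].
    This product form is an assumption, not the paper's stated definition.
    """
    num_frames = len(posteriors)
    if num_frames < 2:
        raise ValueError("need at least two frames to form a consecutive pair")
    num_classes = len(posteriors[0])
    sm = [0.0] * (num_classes * num_classes)
    for t in range(num_frames - 1):
        cur, nxt = posteriors[t], posteriors[t + 1]
        for i in range(num_classes):
            for j in range(num_classes):
                sm[i * num_classes + j] += cur[i] * nxt[j]
    pairs = num_frames - 1
    return [value / pairs for value in sm]


# Toy input: 3 frames, 2 hypothetical phonetic classes.
frames = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
sm = static_model(frames)
```

Under these assumptions the result has a fixed length of K*K regardless of how many frames the word has, which is what lets a word of any duration be compared as a single vector. If every input row sums to 1, the entries of the output also sum to 1.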

Keywords

speech recognition; TIMIT database; average joint probabilities; isolated word recognition; phonetic classes; recognition rate; speech dynamics; speech feature vectors; static model; temporal properties; vocabulary; Biological system modeling; Computer science; Databases; Hidden Markov models; Humans; Robustness; Samarium; Speech recognition; Testing; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1992. ICASSP-92., 1992 IEEE International Conference on

Conference_Location

San Francisco, CA

ISSN

1520-6149

Print_ISBN

0-7803-0532-9

Type

conf

DOI

10.1109/ICASSP.1992.225854

Filename

225854