Title :
A VQ-based preprocessor using cepstral dynamic features for large vocabulary word recognition
Author_Institution :
NTT Electrical Communications Laboratories, Musashino-shi, Tokyo, Japan
Abstract :
This paper proposes a new VQ (vector quantization)-based preprocessor for use in a method that reduces the amount of computation required in speaker-independent, large-vocabulary isolated word recognition. A speech wave is analyzed in terms of time functions of instantaneous cepstrum coefficients and short-time regression coefficients for both the cepstrum coefficients and the logarithmic energy. A universal VQ codebook for these time functions is constructed from a multi-speaker, multi-word database. Next, a separate codebook is designed for each word in the vocabulary as a subset of the universal codebook. These word-specific codebooks are used in front-end preprocessing to eliminate word candidates whose distance scores are large. A dynamic time-warping processor based on a word dictionary, in which each word is represented as a time sequence of universal codebook elements (the SPLIT method), then resolves the choice among the remaining word candidates. The effectiveness of this method has been confirmed by recognition experiments using a database of words drawn from a vocabulary of 100 Japanese city names, uttered by 20 male speakers.
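The preselection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the squared Euclidean distance, the averaging of per-frame distortion, the keep-the-best-N pruning rule, and all names (`sq_dist`, `vq_score`, `preselect`, `word_indices`, `keep`) are assumptions made for illustration. The abstract only states that candidates with large distance scores are eliminated.

```python
def sq_dist(a, b):
    # Squared Euclidean distance between two feature vectors (assumed measure).
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_score(frames, codebook):
    # Average distortion when each frame is mapped to its nearest code vector.
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def preselect(frames, universal, word_indices, keep):
    # Each word-specific codebook is a subset of the universal codebook,
    # stored here as a list of indices into `universal` (hypothetical layout).
    scores = {}
    for word, indices in word_indices.items():
        codebook = [universal[i] for i in indices]
        scores[word] = vq_score(frames, codebook)
    # Keep the `keep` words with the smallest distortion (assumed pruning rule);
    # the survivors would then be passed to the DTW stage.
    ranked = sorted(scores, key=scores.get)
    return ranked[:keep], scores


# Toy data: a 4-entry universal codebook and three hypothetical words.
universal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
word_indices = {"A": [0, 1], "B": [2, 3], "C": [0, 3]}
frames = [(0.1, 0.0), (0.9, 0.1)]

survivors, scores = preselect(frames, universal, word_indices, keep=2)
print(survivors)
```

In this toy case word "B" has the largest distortion and is dropped, so only "A" and "C" would reach the more expensive dynamic time-warping comparison.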
Keywords :
Cepstral analysis; Cepstrum; Cities and towns; Databases; Dictionaries; Laboratories; Linear predictive coding; Probability density function; Speech analysis; Vocabulary;
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '87.
DOI :
10.1109/ICASSP.1987.1169790