Title :
A VQ-based preprocessor using cepstral dynamic features for speaker-independent large vocabulary word recognition
Author_Institution :
Human Interface Lab., NTT, Tokyo, Japan
fDate :
7/1/1988 12:00:00 AM
Abstract :
A VQ (vector quantization)-based preprocessor is proposed that reduces the amount of computation in speaker-independent large-vocabulary isolated-word recognition. The features introduced here are the use of a universal codebook in the VQ-based preprocessor and the use of multiple feature sets including cepstral dynamic features. Word-specific codebooks are used for front-end preprocessing to eliminate word candidates whose distance scores are large. A dynamic time-warping (DTW) processor based on a word dictionary, in which each word is represented as a time sequence of the universal codebook elements (SPLIT method), then resolves the choice among the remaining word candidates. Recognition experiments using a database consisting of words from a vocabulary of 100 Japanese city names uttered by 20 male speakers confirmed the effectiveness of this method. The total amount of calculation necessary under these conditions is almost 1/10 of that without preprocessing.
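The two-stage scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the squared Euclidean distance, the simple symmetric DTW recursion, the single feature set (the paper combines several, including cepstral dynamic features), the toy two-dimensional vectors, and all function and variable names are assumptions chosen for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def vq_distortion(frames, codebook):
    """Average distance from each input frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)

def preselect(frames, word_codebooks, keep):
    """Stage 1: keep the `keep` words whose word-specific codebooks
    yield the smallest VQ distortion; all other words are eliminated."""
    scores = {w: vq_distortion(frames, cb) for w, cb in word_codebooks.items()}
    return sorted(scores, key=scores.get)[:keep]

def dtw_distance(frames, template, universal_codebook):
    """Stage 2: DTW between the input frames and a word template stored
    as a sequence of universal-codebook indices."""
    n, m = len(frames), len(template)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = sq_dist(frames[i - 1], universal_codebook[template[j - 1]])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def recognize(frames, word_codebooks, dictionary, universal_codebook, keep):
    """Run DTW only on the candidates that survive the VQ preselection."""
    candidates = preselect(frames, word_codebooks, keep)
    return min(candidates,
               key=lambda w: dtw_distance(frames, dictionary[w], universal_codebook))

# Hypothetical toy data: 2-D "features" and a three-word vocabulary.
universal_codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
word_codebooks = {
    "osaka": [[0.0, 0.0], [1.0, 1.0]],
    "kyoto": [[1.0, 1.0], [4.0, 4.0]],
    "nara": [[9.0, 9.0], [10.0, 10.0]],
}
dictionary = {"osaka": [0, 0, 1], "kyoto": [1, 2, 2], "nara": [2, 2, 2]}
frames = [[0.1, 0.0], [0.0, 0.1], [0.9, 1.1]]

candidates = preselect(frames, word_codebooks, keep=2)
best = recognize(frames, word_codebooks, dictionary, universal_codebook, keep=2)
```

In this toy setting the DTW stage runs on two of the three words. The saving reported in the abstract comes from the same idea applied to a 100-word vocabulary, where the inexpensive VQ distortion test removes most candidates before the more costly DTW comparison.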
Keywords :
analogue-digital conversion; speech analysis and processing; speech recognition; Japanese city names; SPLIT method; VQ-based preprocessor; cepstral dynamic features; dynamic time-warping; front-end preprocessing; isolated-word recognition; male speakers; multiple feature sets; speaker-independent large vocabulary word recognition; time sequence; universal codebook; vector quantization; word-specific codebooks; Books; Cepstral analysis; Cepstrum; Cities and towns; Databases; Dictionaries; Hidden Markov models; Humans; Spatial databases; Speech analysis; Vector quantization; Vocabulary;
Journal_Title :
Acoustics, Speech and Signal Processing, IEEE Transactions on