Title :
Suprasegmentals in very large vocabulary isolated word recognition
Author_Institution :
Carnegie Mellon University, Pittsburgh, PA, USA
Abstract :
Prosodic information is believed to play a valuable role in human speech perception, but speech recognition systems to date have largely been based on segmental spectral analysis. In this paper I describe parts of a front end to a very-large-vocabulary isolated word recognition system that uses prosodic information. The present front end is template independent, since speaker training is undesirable for large-vocabulary systems (> 20,000 words), and makes use of robust cues in the incoming speech to obtain a presorted vocabulary of candidates. It is shown that prosodic information, e.g., the rhythmic structure of an input word, its syllabic structure, the voiced/unvoiced regions in the word, and the temporal distribution of back/front vowels, nasals, liquids, and glides, can be used effectively to select a substantially reduced subvocabulary of candidates before any fine phonetic analysis is attempted to recognize the word.
Keywords :
Aerospace electronics; Computer science; Filters; Information analysis; Liquids; Robustness; Spectral analysis; Speech analysis; Speech recognition; Vocabulary;
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '84.
DOI :
10.1109/ICASSP.1984.1172524