Title :
Dictionary learning: performance through consistency
Author_Institution :
Interactive Syst. Lab., Karlsruhe Univ., Germany
Abstract :
We present first results from our efforts in automatically increasing and adapting phonetic dictionaries for spontaneous speech recognition. Spontaneous speech adds a variety of phenomena to a speech recognition task: false starts, human and nonhuman noises, new words, and alternative pronunciations. All of these phenomena have to be tackled when adapting a speech recognition system for spontaneous speech. For phonetic dictionaries (especially for spontaneous speech), it is important to choose the pronunciations of a word according to the frequency with which they appear in the database rather than the “correct” pronunciation as it might be found in a lexicon. Additionally, modifications of the dictionary should not lead to a higher phoneme confusability. Therefore, we propose a data-driven approach to add new pronunciations to a given phonetic dictionary so that they model the given occurrences of words in the database. We show how even a simple approach can lead to significant improvements in recognition performance. First experiments have been performed on the German Spontaneous Scheduling Task (GSST), using the speech recognition engine of JANUS-2, the spontaneous speech-to-speech translation system of the Interactive Systems Laboratories at Carnegie Mellon and Karlsruhe University.
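The frequency-based selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. The function name `learn_pronunciations`, the thresholds `min_rel_freq` and `min_count`, the input format, and the phoneme symbols are all hypothetical, and the sketch does not address the phoneme-confusability constraint mentioned above.

```python
from collections import Counter, defaultdict


def learn_pronunciations(observations, base_dict, min_rel_freq=0.2, min_count=2):
    """Add frequently observed pronunciation variants to a phonetic dictionary.

    observations: iterable of (word, phoneme_sequence) pairs, e.g. obtained by
                  aligning database utterances (hypothetical input format).
    base_dict:    mapping word -> list of phoneme sequences already in the dictionary.
    min_rel_freq, min_count: illustrative thresholds, not values from the paper.
    """
    # Count how often each variant of each word occurs in the database.
    counts = defaultdict(Counter)
    for word, phones in observations:
        counts[word][tuple(phones)] += 1

    # Copy the base dictionary so the input is left unchanged.
    new_dict = {w: [tuple(p) for p in prons] for w, prons in base_dict.items()}

    for word, variants in counts.items():
        total = sum(variants.values())
        entries = new_dict.setdefault(word, [])
        for phones, n in variants.most_common():
            # Keep a variant only if it is frequent in both absolute and relative terms.
            if n >= min_count and n / total >= min_rel_freq and phones not in entries:
                entries.append(phones)
    return new_dict


if __name__ == "__main__":
    base = {"haben": [("h", "a:", "b", "@", "n")]}
    obs = (
        [("haben", ("h", "a:", "m"))] * 3
        + [("haben", ("h", "a:", "b", "@", "n"))]
        + [("haben", ("h", "a", "b", "n"))]
    )
    for word, prons in learn_pronunciations(obs, base).items():
        print(word, [" ".join(p) for p in prons])
```

In this toy example the reduced variant is observed three times out of five and is added, while the variant seen only once is discarded. A fuller treatment would also have to check that a new entry does not increase confusability with other words.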
Keywords :
acoustic signal processing; adaptive signal processing; language translation; learning systems; speech processing; speech recognition; German Spontaneous Scheduling Task; JANUS-2; Karlsruhe University; alternative pronunciations; data-driven approach; database; dictionary learning; false starts; human noise; lexicon; new pronunciations; new words; nonhuman noise; phonetic dictionaries; phonetic dictionary; recognition performance; speech recognition engine; spontaneous speech recognition; spontaneous speech-to-speech translation system; Acoustics; Databases; Dictionaries; Engines; Frequency; Humans; Interactive systems; Laboratories; Speech processing; Speech recognition;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location :
Detroit, MI
Print_ISBN :
0-7803-2431-5
DOI :
10.1109/ICASSP.1995.479626