Dictionary learning: performance through consistency

Author

Sloboda, Tilo

Author_Institution

Interactive Syst. Lab., Karlsruhe Univ., Germany

Volume

1

fYear

1995

fDate

9-12 May 1995

Firstpage

453

Abstract

We present first results from our efforts to automatically expand and adapt phonetic dictionaries for spontaneous speech recognition. Spontaneous speech adds a variety of phenomena to a speech recognition task: false starts, human and nonhuman noises, new words, and alternative pronunciations. All of these phenomena have to be tackled when adapting a speech recognition system to spontaneous speech. For phonetic dictionaries (especially for spontaneous speech), it is important to choose the pronunciations of a word according to the frequency with which they appear in the database rather than the “correct” pronunciation as it might be found in a lexicon. Additionally, modifications of the dictionary should not lead to higher phoneme confusability. We therefore propose a data-driven approach that adds new pronunciations to a given phonetic dictionary in such a way that they model the observed occurrences of words in the database. We show how even a simple approach can lead to significant improvements in recognition performance. First experiments have been performed on the German Spontaneous Scheduling Task (GSST), using the speech recognition engine of JANUS-2, the spontaneous speech-to-speech translation system of the Interactive Systems Laboratories at Carnegie Mellon and Karlsruhe University.

Keywords

acoustic signal processing; adaptive signal processing; language translation; learning systems; speech processing; speech recognition; German Spontaneous Scheduling Task; JANUS-2; Karlsruhe University; alternative pronunciations; data-driven approach; database; dictionary learning; false starts; human noise; lexicon; new pronunciations; new words; nonhuman noise; phonetic dictionaries; phonetic dictionary; recognition performance; speech recognition engine; spontaneous speech recognition; spontaneous speech-to-speech translation system; Acoustics; Databases; Dictionaries; Engines; Frequency; Humans; Interactive systems; Laboratories; Speech processing; Speech recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on

Conference_Location

Detroit, MI

ISSN

1520-6149

Print_ISBN

0-7803-2431-5

Type

conf

DOI

10.1109/ICASSP.1995.479626

Filename

479626