Acoustic unit discovery and pronunciation generation from a grapheme-based lexicon

Author

Hartmann, W.; Roy, Anirban; Lamel, Lori; Gauvain, Jean-Luc

Author_Institution

Spoken Language Processing Group, LIMSI, Orsay, France

fYear

2013

fDate

8-12 Dec. 2013

Firstpage

380

Lastpage

385

Abstract

We present a framework for discovering acoustic units and generating an associated pronunciation lexicon from an initial grapheme-based recognition system. Our approach makes two distinct contributions. First, context-dependent grapheme models are clustered using spectral clustering to create a set of phone-like acoustic units. Second, we transform the pronunciation lexicon using an approach based on statistical machine translation: pronunciation hypotheses generated by decoding the training set are used to create a phrase-based translation table. We propose a novel method for scoring the phrase-based rules that significantly improves the output of the transformation process. Results on an English language dataset demonstrate that the combined methods provide a 13% relative reduction in word error rate compared to a baseline grapheme-based system. Our approach could potentially be applied to low-resource languages without existing lexicons, such as those in the Babel project.

Keywords

acoustic signal processing; language translation; natural language processing; pattern clustering; spectral analysis; speech recognition; statistical analysis; Babel project; English language dataset; context-dependent grapheme models; grapheme-based lexicon; grapheme-based speech recognition system; low-resource languages; phone-like acoustic unit discovery; phrase-based rule scoring; phrase-based translation table; pronunciation hypothesis generation; pronunciation lexicon generation; relative reduction; spectral clustering approach; statistical machine translation-based approach; training set decoding; transformation process output improvement; word error rate; Acoustics; Computational modeling; Context modeling; Dictionaries; Hidden Markov models; Training; Training data; acoustic unit discovery; automatic speech recognition; grapheme-based speech recognition; pronunciation learning

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on

Conference_Location

Olomouc

Type

conf

DOI

10.1109/ASRU.2013.6707760

Filename

6707760