DocumentCode :
1340815
Title :
Statistical Transformation of Language and Pronunciation Models for Spontaneous Speech Recognition
Author :
Akita, Yuya ; Kawahara, Tatsuya
Author_Institution :
Acad. Center for Comput. & Media Studies, Kyoto Univ., Kyoto, Japan
Volume :
18
Issue :
6
Year :
2010
Firstpage :
1539
Lastpage :
1549
Abstract :
We propose a novel approach based on a statistical transformation framework for language and pronunciation modeling of spontaneous speech. Since it is not practical to train a spoken-style model using numerous spoken transcripts, the proposed approach generates a spoken-style model by transforming an orthographic model trained with document archives such as the minutes of meetings and the proceedings of lectures. The transformation is based on a statistical model estimated from a small parallel corpus, which consists of faithful transcripts aligned with their orthographic documents. Patterns of transformation, such as substitution, deletion, and insertion of words, are extracted with their word and part-of-speech (POS) contexts, and transformation probabilities are estimated based on occurrence statistics in the aligned parallel corpus. For pronunciation modeling, subword-based mappings between baseforms and surface forms are extracted with their occurrence counts, and a set of rewrite rules with their probabilities is then derived as a transformation model. Spoken-style language models and pronunciation models (surface forms) can be predicted by applying these transformation patterns to a document-style language model and to the baseforms in a lexicon, respectively. The transformed models significantly reduced perplexity and word error rates (WERs) in a task of transcribing congressional meetings, even though the domains and topics were different from the parallel corpus. This result demonstrates the generality and portability of the proposed framework.
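The abstract's idea of estimating transformation probabilities from occurrence statistics in an aligned parallel corpus can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain relative-frequency estimation without the word and POS contexts the abstract mentions, and the function name `estimate_transform_probs`, the empty-string marker `EPS`, and the toy English pairs are all hypothetical, invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical marker for an empty side of an alignment:
# ("", x) is an insertion, (x, "") is a deletion.
EPS = ""


def estimate_transform_probs(aligned_pairs):
    """Relative-frequency estimate of P(spoken | written).

    aligned_pairs: iterable of (written_token, spoken_token) tuples
    taken from an aligned parallel corpus. Context (neighbouring
    words, POS tags) is ignored in this simplified sketch.
    """
    aligned_pairs = list(aligned_pairs)
    pair_counts = Counter(aligned_pairs)
    source_counts = Counter(written for written, _ in aligned_pairs)
    probs = defaultdict(dict)
    for (written, spoken), n in pair_counts.items():
        probs[written][spoken] = n / source_counts[written]
    return dict(probs)


# Toy aligned data, invented for illustration (not from the paper).
pairs = [
    ("going to", "gonna"),      # substitution
    ("going to", "going to"),   # unchanged
    ("going to", "gonna"),
    ("do not", "don't"),
    (EPS, "uh"),                # insertion of a filler
    ("that", EPS),              # deletion
    ("that", "that"),
]
probs = estimate_transform_probs(pairs)
```

In this sketch, `probs[written][spoken]` holds the estimated probability that a written-style token is realized as a given spoken-style token; a fuller model along the lines of the abstract would condition these counts on surrounding words and POS tags.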
Keywords :
rewriting systems; speech recognition; statistical analysis; document-style language model; orthographic model; pronunciation model; rewrite rules; spoken-style language model; spontaneous speech recognition; statistical transformation; subword-based mapping; word error rate; Automatic speech recognition (ASR); language model (LM); spontaneous speech
Language :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2009.2037400
Filename :
5340564