Regional Information Center for Science and Technology - Speaker adaptive training: a maximum likelihood approach to speaker normalization

DocumentCode :

310575

Title :

Speaker adaptive training: a maximum likelihood approach to speaker normalization

Author :

Anastasakos, Tasos ; McDonough, John ; Makhoul, John

Author_Institution :

Northeastern Univ., Boston, MA, USA

Volume :

fYear :

1997

fDate :

21-24 Apr 1997

Firstpage :

1043

Abstract :

This paper describes the speaker adaptive training (SAT) approach for speaker-independent (SI) speech recognizers as a method for joint speaker normalization and estimation of the parameters of the SI acoustic models. In SAT, speaker characteristics are modeled explicitly as linear transformations of the SI acoustic parameters. The effect of inter-speaker variability in the training data is reduced, leading to parsimonious acoustic models that more accurately represent the phonetically relevant information of the speech signal. The proposed training method is applied to the Wall Street Journal (WSJ) corpus, which consists of multiple training speakers. Experimental results in the context of batch supervised adaptation demonstrate the effectiveness of the proposed method in large-vocabulary speech recognition tasks and show that significant reductions in word error rate can be achieved over the common pooled speaker-independent paradigm.
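
The idea of jointly estimating per-speaker linear transformations and shared SI parameters can be illustrated with a deliberately simplified sketch. It is not the paper's implementation: it assumes hard class labels instead of HMM state posteriors, one Gaussian per class with identity covariance (so maximum likelihood reduces to least squares), and an affine transform `A_s mu_k + b_s` of each shared class mean. All function and variable names are hypothetical.

```python
import numpy as np

def sat_objective(data, means, transforms):
    """Sum of squared errors (negative log-likelihood up to constants, identity covariance)."""
    total = 0.0
    for (X, labels), (A, b) in zip(data, transforms):
        pred = means[labels] @ A.T + b  # speaker-transformed class means
        total += float(np.sum((X - pred) ** 2))
    return total

def estimate_transforms(data, means):
    """For each speaker, fit (A_s, b_s) by least squares with the shared means fixed."""
    d = means.shape[1]
    transforms = []
    for X, labels in data:
        M = np.hstack([means[labels], np.ones((len(labels), 1))])  # N x (d+1)
        W, *_ = np.linalg.lstsq(M, X, rcond=None)                  # (d+1) x d
        transforms.append((W[:d].T, W[d]))
    return transforms

def estimate_means(data, transforms, num_classes, d):
    """Re-estimate the shared means with all speaker transforms fixed."""
    lhs = np.zeros((num_classes, d, d))
    rhs = np.zeros((num_classes, d))
    for (X, labels), (A, b) in zip(data, transforms):
        AtA = A.T @ A
        for k in range(num_classes):
            Xk = X[labels == k]
            lhs[k] += len(Xk) * AtA
            rhs[k] += A.T @ (Xk - b).sum(axis=0)
    # Solve the normal equations for each class mean.
    return np.stack([np.linalg.lstsq(lhs[k], rhs[k], rcond=None)[0]
                     for k in range(num_classes)])

def train_sat(data, num_classes, iters=5):
    """Alternate transform and mean updates, starting from pooled (SI-style) means."""
    d = data[0][0].shape[1]
    X_all = np.vstack([X for X, _ in data])
    l_all = np.concatenate([l for _, l in data])
    means = np.stack([X_all[l_all == k].mean(axis=0) for k in range(num_classes)])
    transforms = [(np.eye(d), np.zeros(d)) for _ in data]
    history = [sat_objective(data, means, transforms)]  # pooled baseline
    for _ in range(iters):
        transforms = estimate_transforms(data, means)
        means = estimate_means(data, transforms, num_classes, d)
        history.append(sat_objective(data, means, transforms))
    return means, transforms, history
```

Here `data` is a list with one `(X, labels)` pair per training speaker, where `X` is an array of feature vectors and `labels` holds integer class indices. Each update minimizes the objective with the other parameter set held fixed, so the values in `history` should not increase from one iteration to the next.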

Keywords :

acoustic signal processing; hidden Markov models; maximum likelihood estimation; speech processing; speech recognition; Wall Street Journal corpus; acoustic models; acoustic parameters; batch supervised adaptation; inter-speaker variability; large vocabulary speech recognition; linear transformations; maximum likelihood approach; multiple training speakers; parameter estimation; parsimonious acoustic models; phonetically relevant information; speaker adaptive training; speaker independent speech recognizers; speaker normalization; word error rate reduction; Acoustic testing; Error analysis; Hidden Markov models; Loudspeakers; Maximum likelihood estimation; Parameter estimation; Robustness; Speech recognition; Training data; Vocabulary;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

Conference_Location :

Munich

ISSN :

1520-6149

Print_ISBN :

0-8186-7919-0

Type :

conf

DOI :

10.1109/ICASSP.1997.596119

Filename :

596119

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=310575