Regional Information Center for Science and Technology (RICeST) - Discriminative acoustic model using eigenspace mapping for rapid speaker adaptation

DocumentCode :

394253

Title :

Discriminative acoustic model using eigenspace mapping for rapid speaker adaptation

Author :

Zhou, Bowen ; Hansen, John H. L.

Author_Institution :

Robust Speech Process. Group, Colorado Univ., Boulder, CO, USA

Volume :

fYear :

2003

fDate :

6-10 April 2003

Abstract :

It is widely believed that strong correlations exist across an utterance as a consequence of time-invariant characteristics of speaker and acoustic environments. It is verified in this paper that the first primary eigendirections of the utterance covariance matrix are speaker dependent. Based on this observation, a fast speaker adaptation algorithm entitled Eigenspace Mapping (EigMap) is proposed and described. EigMap rapidly adapts the speaker independent models by constructing discriminative acoustic models in the test speaker's eigenspace. Unsupervised adaptation experiments show that EigMap is effective in improving baseline models using very limited amounts of adaptation data, with superior performance to conventional adaptation techniques such as block diagonal MLLR. A relative improvement of 18.4% over the baseline recognizer is achieved using EigMap with only about 4.5 seconds of adaptation data. It is also demonstrated that EigMap is additive to MLLR by encompassing the speaker dependent discrimination information. A significant relative improvement of 24.6% over the baseline is observed by combining MLLR and EigMap techniques.
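
The abstract refers to the primary eigendirections of the utterance covariance matrix. As a rough illustration only, the sketch below shows one common way to obtain such directions with NumPy: estimate the covariance of the feature frames of a single utterance and keep the eigenvectors with the largest eigenvalues. This is not the EigMap algorithm itself, whose details are not given in this record. The function name `leading_eigendirections`, the 13-dimensional features, the 10 ms frame shift (so that about 4.5 seconds gives 450 frames), and the random data are all assumptions made for the example.

```python
import numpy as np


def leading_eigendirections(frames, k=3):
    """Return the k largest eigenvalues and matching eigenvectors of the
    covariance matrix of `frames`, an array of shape (num_frames, dim).

    Hypothetical helper for illustration; not taken from the paper.
    """
    frames = np.asarray(frames, dtype=float)
    # Rows are frames (observations), columns are feature dimensions.
    cov = np.cov(frames, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvals[order], eigvecs[:, order]


# Assumed stand-in for one short utterance: 450 frames of 13-dim features,
# scaled so that some dimensions carry more variance than others.
rng = np.random.default_rng(0)
frames = rng.normal(size=(450, 13)) * np.linspace(3.0, 0.5, 13)

vals, vecs = leading_eigendirections(frames, k=3)
print(vals.shape, vecs.shape)
```

Each column of `vecs` is a unit-length direction in feature space, ordered by decreasing variance. Comparing such directions across speakers would be one way to examine the speaker dependence the abstract reports, under the assumptions stated above.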

Keywords :

acoustic signal processing; covariance matrices; eigenvalues and eigenfunctions; speaker recognition; EigMap; MLLR; acoustic environment; baseline models; baseline recognizer; correlation; discriminative acoustic model; discriminative acoustic models; eigendirections; eigenspace mapping; fast speaker adaptation algorithm; rapid speaker adaptation; speaker dependent discrimination; speaker environment; speaker independent models; time-invariant characteristics; unsupervised adaptation experiments; utterance covariance matrix; Acoustic testing; Linear discriminant analysis; Loudspeakers; Maximum likelihood decoding; Maximum likelihood linear regression; Natural languages; Robustness; Speech processing; Speech recognition; Training data;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on

ISSN :

1520-6149

Print_ISBN :

0-7803-7663-3

Type :

conf

DOI :

10.1109/ICASSP.2003.1198779

Filename :

1198779

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=394253