Regional Information Center for Science and Technology - Robust speech recognition with on-line unsupervised acoustic feature compensation

DocumentCode :

2768728

Title :

Robust speech recognition with on-line unsupervised acoustic feature compensation

Author :

Buera, Luis ; Miguel, Antonio ; Lleida, Eduardo ; Saz, Óscar ; Ortega, Alfonso

Author_Institution :

Zaragoza Univ., Zaragoza

Year :

2007

Date :

9-13 Dec. 2007

Firstpage :

105

Lastpage :

110

Abstract :

An on-line unsupervised hybrid compensation technique is proposed to reduce the mismatch between training and testing conditions. It combines multi-environment model-based linear normalization with a cross-probability model based on GMMs (MEMLIN CPM) and a novel acoustic model adaptation method based on rotation transformations. A set of rotation transformations is estimated by linear regression from clean and MEMLIN CPM-normalized training data in an unsupervised process. In testing, each MEMLIN CPM-normalized frame is decoded using a modified Viterbi algorithm and expanded acoustic models, which are obtained from the reference models and the set of rotation transformations. The proposed solution was evaluated on the Spanish SpeechDat Car database: MEMLIN CPM over standard ETSI front-end parameters reaches an average improvement in WER of 83.89%, while the hybrid solution reaches 92.07%. The hybrid technique was also tested on the Aurora 2 database, obtaining an average improvement of 68.88% with clean training.
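The abstract does not give the estimation formulas, so the following is only a toy illustration of the general idea of fitting a rotation to paired data by least squares. It is not the authors' procedure: the 2-D setting, the closed-form angle estimate, the direction of the mapping (clean to normalized), and the names `fit_rotation_2d` and `rotate_2d` are all assumptions made for this sketch.

```python
import math


def rotate_2d(point, theta):
    """Rotate a 2-D point counter-clockwise by theta radians."""
    x0, x1 = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x0 - s * x1, s * x0 + c * x1)


def fit_rotation_2d(src, dst):
    """Least-squares angle theta minimizing sum ||dst[i] - R(theta) src[i]||^2.

    In 2-D the minimizer has a closed form based on the summed cross and
    dot products of the paired vectors.
    """
    cross = sum(a0 * b1 - a1 * b0 for (a0, a1), (b0, b1) in zip(src, dst))
    dot = sum(a0 * b0 + a1 * b1 for (a0, a1), (b0, b1) in zip(src, dst))
    return math.atan2(cross, dot)


# Hypothetical paired frames: "clean" vectors and the same vectors as they
# might appear after normalization (here simulated by a known rotation).
clean = [(1.0, 0.0), (0.0, 1.0), (2.0, 1.0), (-1.0, 3.0)]
true_theta = math.radians(20.0)
normalized = [rotate_2d(p, true_theta) for p in clean]

# Recover the rotation from the paired data.
theta_hat = fit_rotation_2d(clean, normalized)

# Applying the estimated rotation to a clean reference vector (for example
# a model mean) gives its counterpart in the normalized space.
rotated_mean = rotate_2d((1.0, 1.0), theta_hat)
```

In a real recognizer the feature vectors have many more dimensions and the transformations would be matrices estimated per class or per environment; the 2-D case is used here only because the least-squares solution is short enough to verify by hand.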

Keywords :

Gaussian processes; audio acoustics; compensation; decoding; estimation theory; feature extraction; matrix algebra; probability; regression analysis; speech coding; speech recognition; unsupervised learning; vectors; GMM; MEMLIN CPM-normalized training data; Viterbi algorithm; acoustic model adaptation method; cross-probability model; feature vector normalization; linear regression; multienvironment model linear normalization; normalized frame decoding; online unsupervised acoustic feature compensation; online unsupervised hybrid compensation technique; rotation matrix estimation process; rotation transformations; speech recognition; testing conditions; training conditions; Acoustic testing; Adaptation model; Databases; Decoding; Linear regression; Robustness; Speech recognition; Telecommunication standards; Training data; Viterbi algorithm; acoustic model adaptation; feature vector normalization; robust speech recognition;

Language :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on

Conference_Location :

Kyoto

Print_ISBN :

978-1-4244-1746-9

Electronic_ISBN :

978-1-4244-1746-9

Type :

conf

DOI :

10.1109/ASRU.2007.4430092

Filename :

4430092

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2768728