Regional Information Center for Science and Technology (RICEST) - On the use of feature-space MLLR adaptation for non-native speech recognition

DocumentCode :

2800723

Title :

On the use of feature-space MLLR adaptation for non-native speech recognition

Author :

Oh, Yoo Rhee ; Kim, Hong Kook

Author_Institution :

Dept. of Inf. & Commun., Gwangju Inst. of Sci. & Technol. (GIST), Gwangju, South Korea

fYear :

2010

fDate :

14-19 March 2010

Firstpage :

4314

Lastpage :

4317

Abstract :

This paper addresses issues that arise when feature-space maximum likelihood linear regression (fMLLR) adaptation is applied to non-native speech recognition. In particular, fMLLR smoothing is proposed to compensate for mismatches between adaptation and test data caused by the various disfluencies of non-native speakers. The proposed fMLLR smoothing is performed with a Viterbi decoding procedure and is implemented at two levels: the Gaussian mixture probability density function (mpdf) level and the observation probability density function (opdf) level. The mpdf-level smoothing compares, for each Gaussian mixture component, the pdf of the original speech feature vector with that of the fMLLR-transformed vector. The opdf-level smoothing instead compares the Gaussian mixture probabilities of the original and fMLLR-transformed feature vectors. Non-native automatic speech recognition (ASR) experiments on a Korean-spoken English continuous speech corpus show that, compared with a traditional fMLLR adaptation method, ASR systems employing the proposed mpdf-level and opdf-level fMLLR smoothing methods reduce the average word error rate by a relative 30.65% and 29.82%, respectively.
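
The two comparison levels described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the outcome of each comparison is used, so the rule below (keep whichever score is larger) is an assumption made only for illustration and is not taken from the paper. The function names, the diagonal-covariance Gaussians, the toy GMM, and the `log_det` argument (the usual log |det A| Jacobian term for an affine feature transform) are likewise hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def fmllr_transform(x, A, b):
    """Apply the affine feature transform x' = A x + b."""
    return [sum(a * xi for a, xi in zip(row, x)) + bi for row, bi in zip(A, b)]

def component_scores(x, gmm):
    """Weighted log density of x under each (weight, mean, var) component."""
    return [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]

def mpdf_level_score(x, x_t, gmm, log_det=0.0):
    """Compare original vs. transformed scores per mixture component.
    ASSUMPTION: the larger score is kept for each component."""
    orig = component_scores(x, gmm)
    trans = [s + log_det for s in component_scores(x_t, gmm)]
    return logsumexp([max(o, t) for o, t in zip(orig, trans)])

def opdf_level_score(x, x_t, gmm, log_det=0.0):
    """Compare original vs. transformed scores for the whole mixture.
    ASSUMPTION: the larger of the two mixture scores is kept."""
    orig = logsumexp(component_scores(x, gmm))
    trans = logsumexp(component_scores(x_t, gmm)) + log_det
    return max(orig, trans)

# Toy two-component GMM for one HMM state (hypothetical values).
gmm = [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [3.0, 3.0], [1.0, 1.0])]
A = [[1.0, 0.0], [0.0, 1.0]]  # identity rotation, so log|det A| = 0
b = [0.5, -0.5]
x = [2.5, 3.5]
x_t = fmllr_transform(x, A, b)

mpdf = mpdf_level_score(x, x_t, gmm)
opdf = opdf_level_score(x, x_t, gmm)
```

Under this assumed rule, the mpdf-level score can never fall below the opdf-level score, because the choice is made separately for every component rather than once for the whole mixture. In a decoder, either score would stand in for the state observation log-likelihood used by the Viterbi search.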

Keywords :

Gaussian processes; Viterbi decoding; maximum likelihood estimation; probability; regression analysis; smoothing methods; speech recognition; transforms; Gaussian mixture probability density function; Korean-spoken English continuous speech corpus; Viterbi decoding procedure; average word error rate; fMLLR smoothing; feature-space MLLR adaptation; feature-space maximum likelihood linear regression adaptation; nonnative speech recognition; observation probability density function; transformed feature vectors; Acoustic testing; Adaptation model; Automatic speech recognition; Maximum likelihood decoding; Maximum likelihood linear regression; Natural languages; Probability density function; Smoothing methods; Speech recognition; Viterbi algorithm; Non-native speech recognition; acoustic model adaptation; feature compensation; feature-space maximum likelihood linear regression (fMLLR);

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location :

Dallas, TX

ISSN :

1520-6149

Print_ISBN :

978-1-4244-4295-9

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2010.5495658

Filename :

5495658

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2800723