DocumentCode :
2800723
Title :
On the use of feature-space MLLR adaptation for non-native speech recognition
Author :
Oh, Yoo Rhee ; Kim, Hong Kook
Author_Institution :
Dept. of Inf. & Commun., Gwangju Inst. of Sci. & Technol. (GIST), Gwangju, South Korea
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4314
Lastpage :
4317
Abstract :
This paper addresses issues that arise when feature-space maximum likelihood linear regression (fMLLR) adaptation is applied to non-native speech recognition. In particular, fMLLR smoothing is proposed to compensate for mismatches between adaptation and test data caused by the various disfluencies of non-native speakers. The proposed fMLLR smoothing is performed with a Viterbi decoding procedure and is implemented at two levels: a Gaussian mixture probability density function (mpdf) level and an observation probability density function (opdf) level. The mpdf-level smoothing compares, for each Gaussian mixture component, the pdf of the original speech feature vector with that of the fMLLR-transformed vector. The opdf-level smoothing instead compares the Gaussian mixture probabilities of the original and the fMLLR-transformed feature vectors. Non-native automatic speech recognition (ASR) experiments on a Korean-spoken English continuous speech corpus show that an ASR system employing the proposed mpdf-level and opdf-level fMLLR smoothing methods achieves relative reductions in average word error rate of 30.65% and 29.82%, respectively, compared with a traditional fMLLR adaptation method.
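The two smoothing levels named in the abstract can be illustrated with a minimal sketch. The abstract only says that the original and fMLLR-transformed scores are "compared"; keeping the larger of the two is an assumption made here, not a rule confirmed by the record. The diagonal-covariance GMM, the function names, the toy parameters, and the optional `log_det` Jacobian term are likewise hypothetical choices for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def apply_fmllr(x, A, b):
    """Affine feature-space transform x' = A x + b."""
    return [sum(a * xi for a, xi in zip(row, x)) + bi for row, bi in zip(A, b)]

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def component_scores(x, gmm):
    """Weighted log score of every mixture component for one frame."""
    return [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]

def mpdf_smoothed(x, x_t, gmm, log_det=0.0):
    """Mixture-level smoothing (assumed rule: per-component maximum)."""
    orig = component_scores(x, gmm)
    trans = [s + log_det for s in component_scores(x_t, gmm)]
    return log_sum_exp([max(o, t) for o, t in zip(orig, trans)])

def opdf_smoothed(x, x_t, gmm, log_det=0.0):
    """Observation-level smoothing (assumed rule: maximum of the two GMM scores)."""
    orig = log_sum_exp(component_scores(x, gmm))
    trans = log_sum_exp(component_scores(x_t, gmm)) + log_det
    return max(orig, trans)

# Toy two-component GMM as (weight, mean, variance) triples; values are made up.
gmm = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])]
x = [0.2, -0.1]                      # original feature vector
A = [[1.0, 0.0], [0.0, 1.0]]         # hypothetical fMLLR matrix (det = 1)
b = [2.5, 2.5]                       # hypothetical fMLLR bias
x_t = apply_fmllr(x, A, b)           # transformed feature vector

print(mpdf_smoothed(x, x_t, gmm), opdf_smoothed(x, x_t, gmm))
```

Under the maximum rule, the mpdf-level score is never smaller than the opdf-level score, because the sum of per-component maxima is at least the larger of the two sums. In a decoder, either function would stand in for the usual state observation log-likelihood at each frame.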
Keywords :
Gaussian processes; Viterbi decoding; maximum likelihood estimation; probability; regression analysis; smoothing methods; speech recognition; transforms; Gaussian mixture probability density function; Korean-spoken English continuous speech corpus; Viterbi decoding procedure; average word error rate; fMLLR smoothing; feature-space MLLR adaptation; feature-space maximum likelihood linear regression adaptation; nonnative speech recognition; observation probability density function; transformed feature vectors; Acoustic testing; Adaptation model; Automatic speech recognition; Maximum likelihood decoding; Maximum likelihood linear regression; Natural languages; Probability density function; Smoothing methods; Speech recognition; Viterbi algorithm; Non-native speech recognition; acoustic model adaptation; feature compensation; feature-space maximum likelihood linear regression (fMLLR);
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495658
Filename :
5495658