DocumentCode
734996
Title
Improving HMM/DNN in ASR of under-resourced languages using probabilistic sampling
Author
Meixu Song ; Qingqing Zhang ; Jielin Pan ; Yonghong Yan
Author_Institution
Key Lab. of Speech Acoust. & Content Understanding, Beijing, China
fYear
2015
fDate
12-15 July 2015
Firstpage
20
Lastpage
24
Abstract
In HMM/DNN automatic speech recognition (ASR) systems, the DNNs model the posterior probabilities of triphone states. However, triphone states are unevenly distributed in the training data. As a result, the training algorithm tends to converge to a local optimum that favors states with abundant data over states with scarce data. This imbalance in the training data degrades ASR performance, especially for under-resourced languages. To deal with this issue, we explore a resampling technique called “probabilistic sampling”, which can be seen as a linear smoothing between the original sampling distribution and uniform sampling. Its effectiveness was studied in two under-resourced ASR experiments. With probabilistic sampling, the first experiment achieved a 6.3% relative phone error rate (PER) reduction compared to the conventional DNN baseline; the second experiment, which used a shared-hidden-layer multilingual DNN as the baseline, obtained a 4.9% relative PER reduction.
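The abstract describes probabilistic sampling only as a linear smoothing between the original and the uniform sampling. A minimal Python sketch of one way to read that description follows. The interpolation weight `lam`, the function names, and the two-step draw (pick a class first, then a frame within that class) are assumptions made for illustration; the abstract does not state the exact formula or procedure used in the paper.

```python
import random
from collections import Counter


def smoothed_class_probs(labels, lam):
    """Interpolate between empirical class priors and a uniform distribution.

    lam = 0.0 reproduces the original (empirical) class frequencies;
    lam = 1.0 gives every observed class the same probability.
    `lam` is a hypothetical name for the smoothing weight.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    counts = Counter(labels)
    n = len(labels)
    k = len(counts)
    return {c: lam / k + (1.0 - lam) * cnt / n for c, cnt in counts.items()}


def sample_indices(labels, lam, num_draws, seed=0):
    """Draw training-example indices under the smoothed class distribution.

    Assumed procedure: choose a class according to the smoothed
    probabilities, then choose one example of that class uniformly.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, c in enumerate(labels):
        by_class.setdefault(c, []).append(i)
    probs = smoothed_class_probs(labels, lam)
    classes = list(probs)
    weights = [probs[c] for c in classes]
    chosen = rng.choices(classes, weights=weights, k=num_draws)
    return [rng.choice(by_class[c]) for c in chosen]


if __name__ == "__main__":
    # Toy label set: state "a" is nine times more frequent than state "b".
    toy_labels = ["a"] * 9 + ["b"]
    print(smoothed_class_probs(toy_labels, 0.5))
    print(sample_indices(toy_labels, 0.5, num_draws=5))
```

In this sketch, raising `lam` makes rare states appear more often in the resampled training stream, at the cost of repeating their few examples more frequently.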
Keywords
hidden Markov models; learning (artificial intelligence); neural nets; probability; signal sampling; smoothing methods; speech recognition; ASR system; HMM-DNN automatic speech recognition system; PER reduction; deep neural network; linear smoothing; phone error rate; posterior probability; probabilistic sampling; resampling technique; shared-hidden-layer multilingual DNN; training algorithm; training data imbalance; triphone state; underresourced language; uniform sampling; Acoustics; Hidden Markov models; Probabilistic logic; Speech; Speech recognition; Training; Training data; Automatic speech recognition; HMM/DNN hybrid; probabilistic sampling; under-resourced languages;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on
Conference_Location
Chengdu
Type
conf
DOI
10.1109/ChinaSIP.2015.7230354
Filename
7230354