Title :
Accurate hidden Markov models for non-audible murmur (NAM) recognition based on iterative supervised adaptation
Author :
Heracleous, Panikos ; Nakajima, Yoshiki ; Lee, Akinobu ; Saruwatari, Hiroshi ; Shikano, Kiyohiro
Author_Institution :
Graduate Sch. of Inf. Sci., Nara Inst. of Sci. & Technol., Japan
fDate :
30 Nov.-3 Dec. 2003
Abstract :
In previous work, we introduced a special device, the Non-Audible Murmur (NAM) microphone, which detects very quietly uttered speech (murmur) that cannot be heard by listeners near the talker. Experimental results showed the effectiveness of the device in NAM recognition. Using normal-speech monophone hidden Markov models (HMMs) retrained with NAM data from a specific speaker, we could recognize NAM with high accuracy. Although the results were very promising, HMM retraining poses a serious problem because it requires a large amount of training data. In this paper, we introduce a new method for NAM recognition that requires only a small amount of NAM data for training. The proposed method is based on supervised adaptation. The main difference from other adaptation approaches is that, instead of single-iteration adaptation, we use iterative adaptation (iterative supervised MLLR). Experiments demonstrate the effectiveness of the proposed method. Using clean normal-speech initial models and only 350 NAM adaptation utterances, we achieved a recognition accuracy of 88.62%, which is a very promising result. Therefore, with a small amount of adaptation data, we were able to create accurate individual HMMs. We also report experimental results that show the effects of the number of iterations, the amount of adaptation data, and the regression tree classes.
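The idea of iterating a supervised MLLR-style mean adaptation can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes identity covariances, a single global transform (no regression tree classes), and frame-level phone labels in place of HMM forced alignment. All function and variable names below are hypothetical.

```python
import numpy as np

def supervised_posteriors(frames, labels, means, comp_phone):
    """Component occupancies, restricted to components of each frame's known phone.

    Assumes identity covariances and at least one component per phone label.
    """
    d2 = ((frames[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)  # (T, K)
    logp = -0.5 * d2
    mask = labels[:, None] == comp_phone[None, :]
    logp = np.where(mask, logp, -np.inf)          # supervision: forbid other phones
    logp = logp - logp.max(axis=1, keepdims=True)
    g = np.exp(logp)
    return g / g.sum(axis=1, keepdims=True)

def estimate_global_transform(frames, means, gammas):
    """Weighted least-squares W of shape (d, d+1) so that W @ [1, mu_k] fits the frames."""
    ext = np.hstack([np.ones((means.shape[0], 1)), means])  # extended means (K, d+1)
    occ = gammas.sum(axis=0)                                 # occupancy per component
    Z = frames.T @ gammas @ ext                              # (d, d+1)
    G = (ext * occ[:, None]).T @ ext                         # (d+1, d+1)
    return Z @ np.linalg.pinv(G)

def iterative_adaptation(frames, labels, means, comp_phone, n_iter=5):
    """Repeat: re-estimate occupancies with the current means, then re-adapt the means."""
    adapted = means.copy()
    for _ in range(n_iter):
        gammas = supervised_posteriors(frames, labels, adapted, comp_phone)
        W = estimate_global_transform(frames, adapted, gammas)
        ext = np.hstack([np.ones((adapted.shape[0], 1)), adapted])
        adapted = ext @ W.T
    return adapted
```

In this simplified setting, each pass uses the previously adapted means to recompute the occupancies, so later transforms are estimated from better statistics than a single-iteration adaptation would provide.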
Keywords :
hidden Markov models; iterative methods; regression analysis; speech recognition; HMM retraining; NAM recognition; iterative supervised MLLR; iterative supervised adaptation; nonaudible murmur recognition; normal-speech clean initial models; recognition accuracy; regression tree classes; Head; Maximum likelihood linear regression; Microphones; Privacy; Regression tree analysis; Training data; Working environment noise
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318406