DocumentCode
2259785
Title
Unsupervised and incremental speaker adaptation under adverse environmental conditions
Author
Takagi, Keizaburo ; Shinoda, Kazuma ; Hattori, Hiroaki ; Watanabe, Takao
Author_Institution
Inf. Technol. Res. Labs., NEC Corp., Kawasaki, Japan
Volume
4
fYear
1996
fDate
3-6 Oct 1996
Firstpage
2079
Abstract
A speaker adaptation method is described. In practical applications of speaker adaptation, the adaptation and testing environments change significantly and are unknown beforehand. In such cases, because speaker adaptation adapts a reference pattern to the adaptation utterances with respect to differences in both environment and speaker at the same time, its performance is degraded. To cope with this problem, the proposed method first eliminates the environmental differences between each input utterance and a reference pattern by using a rapid environment adaptation algorithm based on spectrum equalization (REALISE) (K. Takagi et al., 1995). It then applies unsupervised and incremental speaker adaptation with autonomous control using tree structure pdfs (ACTS) (K. Shinoda and T. Watanabe, 1995) to the environmentally adapted reference pattern. By combining these two methods, the resulting system is expected to perform well under adverse environmental conditions and to show a stable improvement regardless of the amount of adaptation data. Evaluation experiments were carried out on utterances under three vehicle speed conditions. Recognition rates for a 100-word Japanese recognition task after 100-word adaptation improved from 92% (ACTS alone) to 95% (proposed method).
Keywords
adaptive systems; natural languages; probability; speech processing; speech recognition; tree data structures; ACTS; Japanese word recognition task; REALISE; adaptation data; adaptation utterances; adverse environmental conditions; autonomous control; environmental differences; environmentally adapted reference pattern; incremental speaker adaptation; input utterance; rapid environment adaptation algorithm; reference pattern; speaker adaptation method; spectrum equalization; tree structure pdfs; unsupervised speaker adaptation; vehicle speed conditions; Additive noise; Degradation; Information technology; National electric code; Probability density function; Speech recognition; Testing; Tree data structures; Vehicles; Working environment noise;
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.607211
Filename
607211