Title :
Active and unsupervised learning for spoken word acquisition through a multimodal interface
Author_Institution :
ATR Spoken Language Translation Lab., Kyoto, Japan
Abstract :
This work presents a new interactive learning method for spoken word acquisition through human-machine multimodal interfaces. During learning, the machine decides, using both speech and visual cues, whether a spoken input word already belongs to the lexicon it has learned. Learning is carried out on-line and incrementally, combining active and unsupervised learning principles. If the machine is highly confident that its decision is correct, it learns the statistical models of the word and of a corresponding image class representing its meaning in an unsupervised way; otherwise, it actively asks the user a question. The function used to estimate the degree of confidence is itself learned adaptively on-line. Experimental results show that the method enables the machine and the user to adapt to each other, making the learning process more efficient.
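The decision rule summarized above, update the word and image-class models without supervision when confidence is high, otherwise query the user, can be sketched in code. The following Python fragment is an illustrative approximation under assumed details: the toy WordModel, the margin-based confidence estimate, and the threshold are placeholders, not the paper's actual statistical formulation.

```python
# Hypothetical sketch of the confidence-gated active/unsupervised learning loop.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class WordModel:
    """Toy stand-in for the statistical speech/image model of one word."""
    examples: List[Tuple[list, list]] = field(default_factory=list)

    def score(self, speech_vec, image_vec) -> float:
        # Placeholder similarity: negative squared distance to the closest example.
        if not self.examples:
            return float("-inf")
        dists = [sum((a - b) ** 2 for a, b in zip(speech_vec + image_vec, s + i))
                 for s, i in self.examples]
        return -min(dists)

    def update(self, speech_vec, image_vec) -> None:
        # Incremental, on-line update of the word model.
        self.examples.append((speech_vec, image_vec))


def process_utterance(speech_vec, image_vec, lexicon: Dict[str, WordModel],
                      threshold: float, ask_user) -> None:
    """One step of the combined active/unsupervised learning loop."""
    scores = {w: m.score(speech_vec, image_vec) for w, m in lexicon.items()}
    best_word = max(scores, key=scores.get)

    # Confidence estimate (placeholder): margin between best and second-best score.
    ranked = sorted(scores.values(), reverse=True)
    confidence = ranked[0] - ranked[1] if len(ranked) > 1 else float("inf")

    if confidence >= threshold:
        # High confidence: learn the word and its image class without supervision.
        lexicon[best_word].update(speech_vec, image_vec)
    else:
        # Low confidence: actively ask the user, then learn from the answer.
        label = ask_user(best_word)
        lexicon.setdefault(label, WordModel()).update(speech_vec, image_vec)
```

In the paper the confidence function is adapted on-line as well; in this sketch it is a fixed margin for brevity.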
Keywords :
decision making; speech processing; speech-based user interfaces; statistical analysis; unsupervised learning; adaptive online learning; human-machine multimodal interfaces; interactive learning; spoken word acquisition; statistical word models; Humans; Intelligent robots; Learning systems; Machine learning; Man machine systems; Natural languages; Robot sensing systems; Speech; Ubiquitous computing;
Conference_Title :
13th IEEE International Workshop on Robot and Human Interactive Communication (ROMAN 2004)
Print_ISBN :
0-7803-8570-5
DOI :
10.1109/ROMAN.2004.1374800