Title :
Towards unsupervised speech processing
Author_Institution :
MIT Comput. Sci. & Artificial Intell. Lab., Cambridge, MA, USA
Abstract :
The development of an automatic speech recognizer is typically a highly supervised process that involves specifying phonetic inventories, lexicons, and acoustic and language models, and that requires annotated training corpora of parallel speech and text data. Although some model parameters may be modified via adaptation, the overall structure of the speech recognizer usually remains relatively static. While this approach has been effective for problems where adequate human expertise and labelled corpora are available, it is challenged by less-supervised or unsupervised scenarios. It also contrasts sharply with human speech processing, where learning is an inherent ability. In this paper, three alternative scenarios for speech recognition “training” are described, each requiring less human expertise and fewer annotated resources than the last, and more unsupervised learning. A speech deciphering challenge is then suggested in which speech recognizers must learn sub-word inventories and word pronunciations from unannotated speech, supplemented only by non-parallel text resources. It is argued that such a capability will help alleviate the language barrier that currently limits the scope of speech recognition capabilities around the world, and will empower speech recognizers to continually learn and evolve through use.
Keywords :
speech recognition; unsupervised learning; acoustic model; automatic speech recognizer; language barrier; language model; lexicons; nonparallel text resource; parallel speech; phonetic inventory; speech deciphering; text data; unsupervised speech processing; Acoustics; Hidden Markov models; Humans; Speech; Speech processing; Speech recognition; Training
Conference_Title :
Information Science, Signal Processing and their Applications (ISSPA), 2012 11th International Conference on
Conference_Location :
Montreal, QC
Print_ISBN :
978-1-4673-0381-1
Electronic_ISBN :
978-1-4673-0380-4
DOI :
10.1109/ISSPA.2012.6310546