DocumentCode :
1133436
Title :
An introduction to voice search
Author :
Wang, Ye-Yi ; Yu, Dong ; Ju, Yun-Cheng ; Acero, Alex
Author_Institution :
Shanghai Jiao Tong Univ., Shanghai
Volume :
25
Issue :
3
fYear :
2008
fDate :
5/1/2008
Firstpage :
28
Lastpage :
38
Abstract :
Voice search is the technology underlying many spoken dialog systems (SDSs) that provide users with the information they request through a spoken query. The information normally exists in a large database, and the query has to be compared with a field in the database to obtain the relevant information. The contents of the field, such as business or product names, are often unstructured text. This article categorizes spoken dialog technology into form filling, call routing, and voice search, and reviews voice search technology. The categorization is made from a technological perspective; a single SDS may apply technology from multiple categories. Robustness is the central issue in voice search. Acoustic modeling aims at improved robustness to environment noise, different channel conditions, and speaker variance; pronunciation research addresses the problems of unseen word pronunciation and pronunciation variance; language model research focuses on linguistic variance; studies in search give rise to improved robustness to linguistic variance and ASR errors; dialog management research enables graceful recovery from confusions and understanding errors; and learning in the feedback loop speeds up system tuning for more robust performance. Although much has been achieved in voice search over the past decade, large challenges remain. Many voice search dialog systems have automation rates around or below 50% in field trials.
Keywords :
interactive systems; query processing; speaker recognition; acoustic modeling; automatic speech recognition error; channel condition; dialog management; environment noise; large database; linguistic variance; pronunciation research; query processing; speaker variance; spoken dialog system categorization; voice search; Acoustic noise; Automatic speech recognition; Databases; Environmental management; Filling; Loudspeakers; Natural languages; Noise robustness; Routing; Working environment noise;
fLanguage :
English
Journal_Title :
Signal Processing Magazine, IEEE
Publisher :
IEEE
ISSN :
1053-5888
Type :
jour
DOI :
10.1109/MSP.2008.918411
Filename :
4490199