DocumentCode :
2307534
Title :
Connected digit recognition experiments with the OGI Toolkit's neural network and HMM-based recognizers
Author :
Cosi, Piero ; Hosom, John-Paul ; Schalkwyk, J. ; Sutton, Stephen ; Cole, Ronald A.
Author_Institution :
Inst. of Phonetics, CNR, Padova, Italy
fYear :
1998
fDate :
29-30 Sep 1998
Firstpage :
135
Lastpage :
140
Abstract :
This paper describes a series of experiments that compare different approaches to training a speaker-independent continuous-speech digit recognizer using the CSLU Toolkit. Comparisons are made between the hidden Markov model (HMM) and neural network (NN) approaches. In addition, a description of the CSLU Toolkit research environment is given. The CSLU Toolkit is a research and development software environment that provides a powerful and flexible tool for creating and using spoken language systems for telephone and PC applications. In particular, the CSLU-HMM, the CSLU-NN, and the CSLU-FBNN development environments, with which our experiments were implemented, are described in detail and recognition results are compared. Our speech corpus is OGI 30K-Numbers, which is a collection of spontaneous ordinal and cardinal numbers, continuous digit strings and isolated digit strings. The utterances were recorded by having a large number of people recite their ZIP code, street address, or other numeric information over the telephone. This corpus represents a very noisy and difficult recognition task. Our best results (98% word recognition, 92% sentence recognition), obtained with the FBNN architecture, suggest the effectiveness of the CSLU Toolkit in building real-life speech recognition systems.
Keywords :
hidden Markov models; learning (artificial intelligence); microcomputer applications; natural languages; neural net architecture; software tools; speech recognition; telephony; CSLU Toolkit; CSLU-FBNN; CSLU-HMM; CSLU-NN; FBNN architecture; HMM-based recognizers; OGI 30K-Numbers; OGI Toolkit neural network; PC applications; ZIP code; connected digit recognition experiments; continuous digit strings; hidden Markov model; isolated digit strings; noisy recognition task; numeric information; research and development software environment; research environment; sentence recognition; speaker-independent continuous-speech digit recognizer; speech corpus; speech recognition systems; spoken language systems; spontaneous cardinal numbers; spontaneous ordinal numbers; street address; telephone applications; training; word recognition; Application software; Hidden Markov models; Natural languages; Neural networks; Performance evaluation; Software tools; Speech recognition; Speech synthesis; System testing; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Interactive Voice Technology for Telecommunications Applications, 1998. IVTTA '98. Proceedings. 1998 IEEE 4th Workshop
Conference_Location :
Torino
Print_ISBN :
0-7803-5028-6
Type :
conf
DOI :
10.1109/IVTTA.1998.727708
Filename :
727708