DocumentCode :
290051
Title :
Approaches to topic identification on the switchboard corpus
Author :
McDonough, J. ; Ng, K. ; Jeanrenaud, P. ; Gish, H. ; Rohlicek, J.R.
Author_Institution :
BBN Syst. & Technol., Cambridge, MA, USA
Volume :
i
fYear :
1994
fDate :
19-22 Apr 1994
Abstract :
Topic identification (TID) is the automatic classification of speech messages into one of a known set of possible topics. The TID task can be viewed as having three principal components: 1) event generation, 2) keyword event selection, and 3) topic modeling. Using data from the Switchboard corpus, the authors present experimental results for various approaches to the TID problem and compare the relative effectiveness of each. In addition, they examine the effect of keyword set size on identification accuracy and gauge the loss in performance when mismatched topic modeling and keyword selection schemes are used.
Keywords :
identification; speech processing; speech recognition; TID problem; automatic classification; event generation; identification accuracy; keyword event selection; keyword selection schemes; keyword set size; performance; speech messages; switchboard corpus; topic identification; topic modeling; Air traffic control; Data mining; Event detection; Feature extraction; Hidden Markov models; Natural languages; Performance loss; Speech recognition; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location :
Adelaide, SA
ISSN :
1520-6149
Print_ISBN :
0-7803-1775-0
Type :
conf
DOI :
10.1109/ICASSP.1994.389275
Filename :
389275