DocumentCode :
302103
Title :
Clustering words for statistical language models based on contextual word similarity
Author :
Farhat, Azarshid ; Isabelle, Jean-François ; O'Shaughnessy, Douglas
Author_Institution :
INRS Telecommun., Ile des Soeurs, Que., Canada
Volume :
1
fYear :
1996
fDate :
7-10 May 1996
Firstpage :
180
Abstract :
This paper describes a new word clustering approach for statistical language modeling. The classification criterion used by our approach is contextual word similarity, applied in a simplified clustering algorithm. This clustering technique was tested on the INRS speech recognizer using the spontaneous English corpus ATIS. Automatic word classification increases the word accuracy rate by 8.6%, with a perplexity reduction of about 6.9%.
Keywords :
natural languages; pattern classification; speech recognition; statistical analysis; ATIS; INRS speech recognizer; automatic word classification; classification criteria; clustering algorithm; contextual word similarity; perplexity reduction; spontaneous English corpora; statistical language models; word clustering approach; Automatic speech recognition; Business; Clustering algorithms; Context modeling; Natural languages; Smoothing methods; Speech recognition; Stochastic processes; Testing; US Department of Transportation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location :
Atlanta, GA
ISSN :
1520-6149
Print_ISBN :
0-7803-3192-3
Type :
conf
DOI :
10.1109/ICASSP.1996.540320
Filename :
540320