Title :
Interpolated distanced bigram language models for robust word clustering
Author :
Bassiou, N.K. ; Kotropoulos, C.L.
Author_Institution :
Aristotle Univ. of Thessaloniki, Greece
Abstract :
Summary form only given. Two methods for interpolating the distanced bigram language model are examined, both of which take into account pairs of words that appear at varying distances within a context. The language models under study yield a lower perplexity than the baseline bigram model. A word clustering algorithm based on mutual information, with robust estimates of the mean vector and the covariance matrix, is employed in the proposed interpolated language model. The word clusters obtained with this language model prove more meaningful than those derived using the baseline bigram.
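The abstract does not specify the two interpolation methods, so the following is only a minimal sketch of the general idea, assuming a plain linear interpolation of distance-d bigram estimates with add-alpha smoothing. The function names, the weights, and the smoothing scheme are all illustrative assumptions and are not taken from the paper.

```python
from collections import Counter
import math


def distanced_bigram_counts(tokens, max_distance):
    """Count word pairs (w1, w2) where w2 occurs exactly d positions after w1."""
    pair_counts = {d: Counter() for d in range(1, max_distance + 1)}
    left_counts = {d: Counter() for d in range(1, max_distance + 1)}
    for d in range(1, max_distance + 1):
        for i in range(len(tokens) - d):
            pair_counts[d][(tokens[i], tokens[i + d])] += 1
            left_counts[d][tokens[i]] += 1
    return pair_counts, left_counts


def distanced_prob(w_prev, w, d, pair_counts, left_counts, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P_d(w | w_prev) (smoothing is an assumption)."""
    num = pair_counts[d][(w_prev, w)] + alpha
    den = left_counts[d][w_prev] + alpha * vocab_size
    return num / den


def interpolated_prob(history, w, weights, pair_counts, left_counts, vocab_size):
    """Linear interpolation over distances: sum_d lambda_d * P_d(w | w_{i-d}).

    `weights` maps distance d to a hypothetical weight lambda_d. Weights for
    distances longer than the available history are dropped and the rest are
    renormalised so that the result is still a probability distribution.
    """
    usable = [(d, lam) for d, lam in weights.items() if d <= len(history)]
    total = sum(lam for _, lam in usable)
    return sum(
        (lam / total)
        * distanced_prob(history[-d], w, d, pair_counts, left_counts, vocab_size)
        for d, lam in usable
    )


def perplexity(tokens, weights, pair_counts, left_counts, vocab_size):
    """Perplexity of a token sequence, skipping the first token (no history)."""
    log_sum = 0.0
    n = 0
    for i in range(1, len(tokens)):
        p = interpolated_prob(
            tokens[:i], tokens[i], weights, pair_counts, left_counts, vocab_size
        )
        log_sum += math.log(p)
        n += 1
    return math.exp(-log_sum / n)
```

With `weights = {1: 1.0}` the sketch reduces to a smoothed baseline bigram, which gives a simple point of comparison against a multi-distance setting such as `{1: 0.7, 2: 0.3}`.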
Keywords :
covariance matrices; interpolation; natural languages; covariance matrix mean vector estimation; distanced bigram language models; interpolated language models; model perplexity; mutual information; robust word clustering algorithm; word context distance variation; Clustering algorithms; Context modeling; Covariance matrix; Mutual information; Robustness
Conference_Title :
Nonlinear Signal and Image Processing, 2005. NSIP 2005. Abstracts. IEEE-Eurasip
Conference_Location :
Sapporo
Print_ISBN :
0-7803-9064-4
DOI :
10.1109/NSIP.2005.1502228