Title :
Clustering word category based on binomial posteriori co-occurrence distribution
Author :
Tamoto, Masafumi ; Kawabata, Takeshi
Author_Institution :
NTT Basic Res. Labs., Kanagawa, Japan
Abstract :
This paper describes a word clustering technique for stochastic language modeling and reports experimental evidence for its validity. It introduces the binomial posteriori distribution (BPD) distance measure between words, which is based on word co-occurrence and reliability. We plan to apply this clustering technique in practice by using each cluster as a Markov state when constructing a word prediction model.
Keywords :
Markov processes; binomial distribution; natural languages; reliability; speech processing; stochastic processes; Markov state; binomial posteriori co-occurrence distribution; clustering technology; distance measurement; experiment; stochastic language modeling; word category clustering technique; word co-occurrence; word prediction model; word reliability; Distance measurement; Frequency estimation; Laboratories; Mutual information; Parameter estimation; Predictive models; Probability; Robustness; Stochastic processes
Conference_Title :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location :
Detroit, MI
Print_ISBN :
0-7803-2431-5
DOI :
10.1109/ICASSP.1995.479390