DocumentCode :
1901503
Title :
Improved Viterbi Algorithm-Based HMM2 for Chinese Words Segmentation
Author :
La, Lei ; Guo, Qiao ; Yang, Dequan ; Cao, Qimin
Author_Institution :
Sch. of Autom., Beijing Inst. of Technol., Beijing, China
Volume :
1
fYear :
2012
fDate :
23-25 March 2012
Firstpage :
266
Lastpage :
269
Abstract :
To solve problems caused by the individualism of Chinese architecture, more and more researchers focus on hybrid and improved Hidden Markov Models. However, although Chinese word segmentation is the foundation of Chinese natural language processing, studies of it based on the Second-order Hidden Markov Model (HMM2) are not abundant. This article proposes a word-frequency weighted smoothing method and a Threshold-Viterbi algorithm and combines them to build an Improved Viterbi Algorithm-based HMM2 (IV-HMM2) model, which overcomes the sparse-data problem and improves accuracy. Experimental results demonstrate that the improved model has better performance and lower overhead than the traditional HMM2.
Keywords :
hidden Markov models; natural language processing; text analysis; Chinese architecture; Chinese natural language processing; Chinese word segmentation; IV-HMM2 model; hybrid hidden Markov model; improved Viterbi algorithm-based HMM2; improved hidden Markov model; second-order hidden Markov model; sparse problem; threshold-Viterbi algorithm; words frequency weighted smoothing method; Computational modeling; Data models; Hidden Markov models; Natural language processing; Smoothing methods; Speech; Viterbi algorithm; Chinese words segmentation; IV-HMM2; Threshold-Viterbi Algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science and Electronics Engineering (ICCSEE), 2012 International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4673-0689-8
Type :
conf
DOI :
10.1109/ICCSEE.2012.249
Filename :
6188145