Title :
Segmentation of Chinese word based on method of rough segment and part of speech tagging
Author :
Jiang Fang ; Yue Xiang ; Li Guo-he ; Wu Weijiang
Author_Institution :
Coll. of Geophys. & Inf. Eng., China Univ. of Pet., Beijing, China
Abstract :
Segmenting Chinese words from text documents is an important task in Chinese information processing. In the proposed method, maximum-match and ambiguity-detection algorithms first produce every candidate rough segmentation of the text. Each word in every rough segmentation is then tagged by the Viterbi algorithm under an HMM part-of-speech tagging model. Finally, each rough segmentation is scored by an optimal estimation function defined on its part-of-speech tagging, and the highest-scoring one is selected as the optimal segmentation. Experimental comparisons show that the proposed segmentation performs better than the other methods tested.
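The pipeline described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the abstract does not say how ambiguity is detected, so the sketch treats a disagreement between forward and backward maximum match as the ambiguity signal; it does not give the optimal estimation function, so the Viterbi log-probability of the best tag sequence stands in for it; and the lexicon, tag set, and all probabilities are invented toy values.

```python
import math

def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right longest match; unknown characters become single words."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words

def backward_max_match(text, lexicon, max_len=4):
    """Greedy right-to-left longest match."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                words.insert(0, piece)
                j -= size
                break
    return words

def rough_segmentations(text, lexicon):
    """Assumed ambiguity detection: keep both results when the two matchers disagree."""
    fwd = forward_max_match(text, lexicon)
    bwd = backward_max_match(text, lexicon)
    return [fwd] if fwd == bwd else [fwd, bwd]

def viterbi(words, tags, start, trans, emit, floor=1e-6):
    """Return (log-probability, tag path) of the best tag sequence under the HMM."""
    def lp(table, key):
        # Unseen events get a small floor probability instead of zero.
        return math.log(table.get(key, floor))
    # best[t] = (log-prob of the best path ending in tag t, that path)
    best = {t: (lp(start, t) + lp(emit, (t, words[0])), [t]) for t in tags}
    for w in words[1:]:
        new = {}
        for t in tags:
            score, path = max(
                ((best[p][0] + lp(trans, (p, t)), best[p][1]) for p in tags),
                key=lambda x: x[0])
            new[t] = (score + lp(emit, (t, w)), path + [t])
        best = new
    return max(best.values(), key=lambda x: x[0])

def best_segmentation(text, lexicon, tags, start, trans, emit):
    """Tag every rough segmentation and keep the one with the highest score."""
    scored = []
    for words in rough_segmentations(text, lexicon):
        score, path = viterbi(words, tags, start, trans, emit)
        scored.append((score, words, path))
    return max(scored, key=lambda x: x[0])

# Toy model (all values hypothetical): n = noun, v = verb.
lexicon = {"研究", "研究生", "生命", "起源"}
tags = ["n", "v"]
start = {"n": 0.5, "v": 0.5}
trans = {("n", "n"): 0.4, ("n", "v"): 0.6, ("v", "n"): 0.8, ("v", "v"): 0.2}
emit = {("v", "研究"): 0.3, ("n", "研究"): 0.1, ("n", "研究生"): 0.1,
        ("n", "生命"): 0.2, ("n", "起源"): 0.2}

score, words, path = best_segmentation("研究生命起源", lexicon, tags, start, trans, emit)
print(words, path, score)
```

In this toy case the two matchers disagree, so both candidates are tagged and compared. Note that a raw log-probability tends to favor candidates with fewer words, so a real scoring function would probably need some form of length normalization; the sketch sidesteps this because both candidates have three words.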
Keywords :
classification; hidden Markov models; text analysis; word processing; Chinese word rough segmentation; HMM model; Viterbi algorithm; ambiguity detection algorithms; hidden Markov model; maximum match algorithms; optimal estimation function; part-of-speech tagging; Accuracy; Educational institutions; Information processing; Speech; Tagging; HMM; word segmentation
Conference_Titel :
Computer Science and Network Technology (ICCSNT), 2013 3rd International Conference on
Conference_Location :
Dalian
DOI :
10.1109/ICCSNT.2013.6967171