Title :
Chinese New Words Extraction Based on Machine Learning Approach
Author :
Zhang, Zi-Ru ; Wang, Qiang-jun ; Tian, Xue-dong
Author_Institution :
Coll. of Humanities, Hebei Univ., Baoding
Abstract :
Chinese new-word extraction is an important problem in Chinese information processing. This paper proposes a new-word extraction method based on machine learning, in which context information, word construction rules, and statistical information are combined to extract new words. An experiment on two-character nouns shows that the method can improve both the efficiency and the accuracy of new-word extraction.
Keywords :
dictionaries; learning (artificial intelligence); natural languages; text analysis; Chinese information processing; Chinese new word extraction; context information; dictionary; machine learning approach; statistic information; two-character-nouns; word construction rules; Cybernetics; Data mining; Dictionaries; Educational institutions; Information processing; Machine learning; Mathematics; Natural languages; Probability; Statistics; Text processing; New words extraction; machine learning; word construction rules; word segmentation;
Conference_Title :
Machine Learning and Cybernetics, 2006 International Conference on
Conference_Location :
Dalian, China
Print_ISBN :
1-4244-0061-9
DOI :
10.1109/ICMLC.2006.258498