DocumentCode :
3169151
Title :
Efficient unsupervised extraction of words categories using symmetric patterns and high frequency words
Author :
Rong, Liu ; Zhiping, Zhang ; Ning, Pang
Author_Institution :
Foreign Language Coll., Taiyuan Univ. of Technol., Taiyuan, China
fYear :
2010
fDate :
29-30 Oct. 2010
Firstpage :
542
Lastpage :
545
Abstract :
This paper presents a novel approach for discovering and extracting sets of words that share semantic meaning. We use meta-patterns of high-frequency words and content words to discover pattern candidates. Symmetric patterns are then identified using graph-based measures, and word categories are created from graph clique sets. Ours is a pattern-based method that requires no manually provided seed patterns or words. For Chinese, only part-of-speech (POS) tagging is carried out in advance. Computation time on large corpora is linear in corpus size. Manual judgment rates the results favorably.
Keywords :
graph theory; natural language processing; Chinese; POS; graph based measures; high frequency words; unsupervised words categories extraction; Semantics; sharing semantic meaning; symmetric patterns; unsupervised;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Artificial Intelligence and Education (ICAIE), 2010 International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4244-6935-2
Type :
conf
DOI :
10.1109/ICAIE.2010.5641103
Filename :
5641103