Title :
Induction of Semantic Classes Based on Coordinate Patterns
Author :
Qiu, Likun ; Wu, Yunfang ; Shi, Jing ; Shao, Yanqiu ; Long, Zhiyi
Author_Institution :
Key Lab. of Comput. Linguistics, Peking Univ., Beijing, China
Abstract :
Many NLP and IR applications require semantic classification knowledge of words. However, manually constructing semantic classes is a time-consuming and labor-intensive task. In this paper, we present an algorithm for inducing Chinese semantic classes from natural language text based on coordinate patterns. First, several coordinate patterns are proposed to harvest high-quality coordinate instances. Second, an iterative clustering process, which relies mainly on the coordinate relation between words, is used to cluster words into semantic classes. Experimental results show that the proposed approach performs relatively well, achieving a precision of 53.2%. Finally, a thesaurus containing about 15,000 Chinese words is generated automatically.
Keywords :
natural language processing; pattern clustering; text analysis; Chinese semantic class induction; Chinese words; coordinate pattern; information retrieval; iterative clustering process; natural language text; word semantic classification knowledge; bottom-up clustering; coordinate structure; language resource; semantic class
Conference_Title :
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
Conference_Location :
Lyon
Print_ISBN :
978-1-4577-1373-6
Electronic_ISBN :
978-0-7695-4513-4
DOI :
10.1109/WI-IAT.2011.66