DocumentCode :
3585510
Title :
Keywords Extraction from Chinese Document Based on Complex Network Theory
Author :
Jiangxia Nan ; Bo Xiao ; Zhiqing Lin ; Qianfang Xu
Author_Institution :
Inst. of Sensing Technol. & Bus., Beijing Univ. of Posts & Telecommun., Beijing, China
Volume :
2
fYear :
2014
Firstpage :
383
Lastpage :
386
Abstract :
Keyword extraction is the process of choosing several words from a document to express its main idea. Keywords help people understand an article quickly and clearly. In recent years, keyword extraction has attracted growing research attention because of its important role in text clustering, text classification, automatic abstracting, and text retrieval. This paper proposes an algorithm called EC-DC that extracts keywords based on centrality measures from complex network theory. A document is mapped to a network, with its words mapped to vertices and the relations between words mapped to edges. The importance of each word is then evaluated using eccentricity centrality and degree centrality, and the K most important words are extracted as keywords. Experimental results show that the EC-DC algorithm improves precision, recall, and F-score by about 9% compared with the classical TFIDF algorithm.
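The pipeline described in the abstract (words as vertices, word relations as edges, ranking by eccentricity and degree centrality, top K selection) can be illustrated with a minimal Python sketch. The abstract does not say how edges are defined or how the two centralities are combined, so several choices here are assumptions rather than the paper's method: edges link words that co-occur within a sliding window, eccentricity centrality is taken as the reciprocal of a vertex's eccentricity, degree centrality is normalized by n - 1, and the two are simply added. The function names and the `window` parameter are hypothetical, and the input is assumed to be already segmented into words, which Chinese text requires.

```python
from collections import deque


def build_network(tokens, window=2):
    """Build an undirected graph: words are vertices, and an edge links
    two distinct words that occur within `window` tokens of each other
    (assumed edge rule; window=2 means adjacent tokens only)."""
    graph = {word: set() for word in tokens}
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def eccentricity(graph, source):
    """Largest shortest-path distance from `source` to any vertex it can
    reach, found with breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return max(dist.values())


def top_k_keywords(tokens, k=3, window=2):
    """Rank words by eccentricity centrality plus degree centrality
    (assumed combination) and return the K highest-scoring words."""
    graph = build_network(tokens, window)
    n = len(graph)
    scores = {}
    for word in graph:
        ecc = eccentricity(graph, word)
        ec = 1.0 / ecc if ecc > 0 else 0.0  # isolated vertex scores 0
        dc = len(graph[word]) / (n - 1) if n > 1 else 0.0
        scores[word] = ec + dc
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

For example, `top_k_keywords(["a", "b", "a", "c", "a", "d"], k=2)` builds a star-shaped network centered on `"a"`, which then ranks first on both measures. In a disconnected network this sketch measures eccentricity only within each vertex's own component, which is another simplification.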
Keywords :
complex networks; feature extraction; text analysis; Chinese document; EC-DC algorithm; automatic abstracting; complex network centrality measures; complex network theory; degree centrality; eccentricity centrality; keywords extraction; text classification; text clustering; text retrieval; Approximation algorithms; Business; Data mining; Internet; Semantics; complex network; document network
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence and Design (ISCID), 2014 Seventh International Symposium on
Print_ISBN :
978-1-4799-7004-9
Type :
conf
DOI :
10.1109/ISCID.2014.183
Filename :
7082012