DocumentCode :
2282341
Title :
Concept Extraction and Clustering for Topic Digital Library Construction
Author :
Chengzhi Zhang ; Dan Wu
Author_Institution :
Dept. of Inf. Manage., Nanjing Univ. of Sci. & Technol., Nanjing
Volume :
3
fYear :
2008
fDate :
9-12 Dec. 2008
Firstpage :
299
Lastpage :
302
Abstract :
This paper introduces a new approach to building a topic digital library using concept extraction and document clustering. First, documents in a specific domain are obtained automatically by a document classification approach. Then, the keywords of each document are extracted using a machine learning approach. The keywords are used to cluster the document subset, and the clustering result serves as the taxonomy of the subset. Finally, the taxonomy is manually adjusted into a hierarchical structure for user navigation. The topic digital library is constructed by combining full-text retrieval with the hierarchical navigation function.
Keywords :
digital libraries; information retrieval; learning (artificial intelligence); pattern classification; pattern clustering; text analysis; concept extraction; document classification approach; document clustering; full-text retrieval; hierarchical navigation function; machine learning approach; taxonomy; topic digital library construction; Data mining; Frequency; Information management; Intelligent agent; Labeling; Navigation; Software libraries; Statistics; Taxonomy; Web and internet services; topic digital library
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
Type :
conf
DOI :
10.1109/WIIAT.2008.81
Filename :
4740784