DocumentCode
2548801
Title
Name Disambiguation Using Atomic Clusters
Author
Wang, Feng ; Li, Juanzi ; Tang, Jie ; Zhang, Jing ; Wang, Kehong
Author_Institution
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
fYear
2008
fDate
20-22 July 2008
Firstpage
357
Lastpage
364
Abstract
Name ambiguity is a critical problem in many applications, particularly in online bibliography systems such as DBLP and CiteSeer. Several clustering-based methods have been proposed, but the problem remains a major challenge for both the research and industry communities. In this paper, we present a complementary study that approaches the problem from another point of view. We propose an approach for finding atomic clusters to improve the performance of existing clustering-based methods. We conducted experiments on a dataset from a real-world system, Arnetminer.org. Experimental results show that the proposed atomic cluster finding approach yields significant improvements (about +8% and +27%, depending on the clustering method).
Keywords
bibliographic systems; pattern clustering; text analysis; Arnetminer.org; CiteSeer; DBLP; atomic cluster finding; clustering-based method; name disambiguation; online bibliography systems; Application software; Atomic measurements; Bibliographies; Clustering algorithms; Clustering methods; Computer science; Hidden Markov models; Information management; Proposals; atomic cluster
fLanguage
English
Publisher
IEEE
Conference_Titel
Web-Age Information Management, 2008. WAIM '08. The Ninth International Conference on
Conference_Location
Zhangjiajie Hunan
Print_ISBN
978-0-7695-3185-4
Electronic_ISBN
978-0-7695-3185-4
Type
conf
DOI
10.1109/WAIM.2008.96
Filename
4597035
Link To Document