DocumentCode :
3125523
Title :
ADANA: Active Name Disambiguation
Author :
Wang, Xuezhi ; Tang, Jie ; Cheng, Hong ; Yu, Philip S.
fYear :
2011
fDate :
11-14 Dec. 2011
Firstpage :
794
Lastpage :
803
Abstract :
Name ambiguity has long been viewed as a challenging problem in many applications, such as scientific literature management, people search, and social network analysis. When we search a person name in these systems, many documents (e.g., papers, web pages) containing that person's name may be returned. It is hard to determine which documents are about the person we care about. Although much research has been conducted, the problem remains largely unsolved, especially with the rapid growth of the people information available on the Web. In this paper, we try to study this problem from a new perspective and propose an ADANA method for disambiguating person names via active user interactions. In ADANA, we first introduce a pairwise factor graph (PFG) model for person name disambiguation. The model is flexible and can be easily extended by incorporating various features. Based on the PFG model, we propose an active name disambiguation algorithm, aiming to improve the disambiguation performance by maximizing the utility of the user's correction. Experimental results on three different genres of data sets show that with only a few user corrections, the error rate of name disambiguation can be reduced to 3.1%. A real system has been developed based on the proposed method and is available online.
Keywords :
Internet; graph theory; information retrieval; user interfaces; ADANA method; PFG model; active name disambiguation algorithm; active person name disambiguation; active user interactions; name ambiguity; pairwise factor graph model; user correction; utility maximizing; Accuracy; Approximation algorithms; Clustering algorithms; Correlation; Educational institutions; Inference algorithms; Web pages; Active Learning; Digital Library; Name Disambiguation; Social Network Analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1550-4786
Print_ISBN :
978-1-4577-2075-8
Type :
conf
DOI :
10.1109/ICDM.2011.19
Filename :
6137284