DocumentCode
2771191
Title
GRAPE: A Graph-Based Framework for Disambiguating People Appearances in Web Search
Author
Jiang, Lili ; Wang, Jianyong ; An, Ning ; Wang, Shengyuan ; Zhan, Jian ; Li, Lian
Author_Institution
Minist. of Educ. Sch. of Inf. Sci. & Eng., Lanzhou Univ., Lanzhou, China
fYear
2009
fDate
6-9 Dec. 2009
Firstpage
199
Lastpage
208
Abstract
Finding information about people using search engines is one of the most common activities on the Web. However, search engines usually return a long list of Web pages, which may be relevant to many namesakes, especially given the explosive growth of Web data. To address the challenge caused by name ambiguity in Web people search, this paper proposes a novel graph-based framework, GRAPE (short for a graph-based framework for disambiguating people appearances in Web search). In GRAPE, people tag information (e.g., person name, organization, and email address) surrounding the queried person name is extracted from the search results. A graph-based unsupervised algorithm is then developed to cluster the extracted tags, where a new method, cohesion, is introduced to measure the importance of a tag for clustering. Each final cluster of tags represents a unique person entity. Experimental results show that the proposed framework outperforms state-of-the-art Web people name disambiguation approaches.
Keywords
Internet; graph theory; pattern clustering; search engines; Web pages; Web people name disambiguation approaches; Web search; cohesion method; graph-based framework for disambiguating people appearances; graph-based unsupervised algorithm; people tag information extraction; search engines; tags clustering; Clustering algorithms; Computer science; Computer science education; Data mining; Information science; Open source software; Pipelines; Search engines; Web pages; Web search; Clustering; Named Entity; People Name Disambiguation; Tag Extraction;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location
Miami, FL
ISSN
1550-4786
Print_ISBN
978-1-4244-5242-2
Electronic_ISBN
1550-4786
Type
conf
DOI
10.1109/ICDM.2009.25
Filename
5360245