Title :
GRAPE: A Graph-Based Framework for Disambiguating People Appearances in Web Search
Author :
Jiang, Lili ; Wang, Jianyong ; An, Ning ; Wang, Shengyuan ; Zhan, Jian ; Li, Lian
Author_Institution :
Minist. of Educ. Sch. of Inf. Sci. & Eng., Lanzhou Univ., Lanzhou, China
Abstract :
Finding information about people using search engines is one of the most common activities on the Web. However, search engines usually return a long list of Web pages that may refer to many different namesakes, a problem made worse by the explosive growth of Web data. To address the name ambiguity that arises in Web people search, this paper proposes a novel graph-based framework, GRAPE (short for "a graph-based framework for disambiguating people appearances in Web search"). In GRAPE, people tag information (e.g., person names, organizations, and email addresses) surrounding the queried person name is first extracted from the search results. A graph-based unsupervised algorithm is then developed to cluster the extracted tags, with a new method, cohesion, introduced to measure the importance of a tag for clustering. Each final cluster of tags represents a unique person entity. Experimental results show that our proposed framework outperforms state-of-the-art Web people name disambiguation approaches.
Keywords :
Internet; graph theory; pattern clustering; search engines; Web pages; Web people name disambiguation approaches; Web search; cohesion method; graph-based framework for disambiguating people appearances; graph-based unsupervised algorithm; people tag information extraction; tags clustering; Clustering algorithms; Computer science; Computer science education; Data mining; Information science; Open source software; Pipelines; Clustering; Named Entity; People Name Disambiguation; Tag Extraction
Conference_Title :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-5242-2
ISSN :
1550-4786
DOI :
10.1109/ICDM.2009.25