Cloud-Based Name Disambiguation Algorithm

Author

Juan, Yang ; Hua, He ; Bin, Wu

Author_Institution

Beijing Key Lab. of Intell. Telecommun. Software & Multimedia, Beijing Univ. of Posts & Telecommun., Beijing, China

Volume

2

fYear

2010

fDate

7-8 Aug. 2010

Firstpage

155

Lastpage

158

Abstract

In scientific collaboration networks, it is very common for one author name to correspond to many author entities. Traditional name disambiguation algorithms perform inefficiently on massive data. This paper presents a parallel algorithm for the name disambiguation problem: it first merges authors with the same name and similar author information, then divides the scientific collaboration network into author communities; authors with the same name within one community are assumed, with high probability, to be one entity. The algorithm is built on a cloud computing platform and is able to handle massive data. In our experiment, the algorithm processed massive data efficiently and achieved an average F-score of 0.93.
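
The two stages described in the abstract can be illustrated with a minimal single-machine sketch. It is not the paper's parallel, cloud-based implementation, and every specific in it is an assumption made for illustration: the record layout, the Jaccard similarity over information tokens, the 0.5 threshold, and the use of connected components as a simple stand-in for community detection. The names `jaccard`, `DisjointSet`, and `disambiguate` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class DisjointSet:
    """Union-find structure; each final set is one group of items."""

    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)


def disambiguate(records, coauthor_edges, threshold=0.5):
    """records: {record_id: (name, set_of_info_tokens)} (assumed layout).
    coauthor_edges: iterable of (record_id, record_id) pairs."""
    entities = DisjointSet(records)
    by_name = defaultdict(list)
    for rid, (name, _) in records.items():
        by_name[name].append(rid)

    # Stage 1: merge same-name records whose information is similar.
    for rids in by_name.values():
        for a, b in combinations(rids, 2):
            if jaccard(records[a][1], records[b][1]) >= threshold:
                entities.union(a, b)

    # Stage 2: group the merged entities into communities. Connected
    # components are a stand-in here, not the paper's detection method.
    root = {rid: entities.find(rid) for rid in records}
    communities = DisjointSet(set(root.values()))
    for a, b in coauthor_edges:
        communities.union(root[a], root[b])

    # Same-name records inside one community are treated as one entity.
    for rids in by_name.values():
        for a, b in combinations(rids, 2):
            if communities.find(root[a]) == communities.find(root[b]):
                entities.union(a, b)

    groups = defaultdict(set)
    for rid in records:
        groups[entities.find(rid)].add(rid)
    return sorted(groups.values(), key=min)


# Hypothetical toy data: three "J. Wang" records, two "L. Li" records.
records = {
    "r1": ("J. Wang", {"bupt", "beijing"}),
    "r2": ("L. Li", {"bupt", "beijing"}),
    "r3": ("J. Wang", {"telecom", "lab"}),
    "r4": ("L. Li", {"bupt", "beijing"}),
    "r5": ("J. Wang", {"mit", "boston"}),
}
edges = [("r1", "r2"), ("r3", "r4")]
print(disambiguate(records, edges))
```

In this toy input, the two "L. Li" records are merged in the first stage because their information matches. That merge links "r1" and "r3" through a shared co-author, so the second stage treats them as one "J. Wang", while "r5" stays a separate entity.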

Keywords

data handling; groupware; parallel algorithms; cloud-based name disambiguation algorithm; cloud-computing platform; massive data; parallel algorithm; scientific collaboration networks; Algorithm design and analysis; Cloud computing; Clustering algorithms; Collaboration; Communities; Software; Software algorithms; Cloud Computing; Community Detection; Name Disambiguation; Similarity;

fLanguage

English

Publisher

IEEE

Conference_Titel

Information Science and Management Engineering (ISME), 2010 International Conference of

Conference_Location

Xi'an

Print_ISBN

978-1-4244-7669-5

Electronic_ISBN

978-1-4244-7670-1

Type

conf

DOI

10.1109/ISME.2010.33

Filename

5573917