Title :
The Microsoft Academic Search challenges at KDD Cup 2013
Author :
De Cock, Martine ; Roy, Senjuti Basu ; Savvana, Swapna ; Mandava, Vani ; Dalessandro, Brian ; Perlich, Claudia ; Cukierski, William ; Hamner, Ben
Author_Institution :
Dept. of Appl. Math., CS & Stat., Ghent Univ., Gent, Belgium
Abstract :
Microsoft Academic Search is a free search engine specific to scholarly material. It currently covers more than 50 million publications and over 19 million authors across a variety of domains. One of the main challenges in correctly indexing this material is author name ambiguity and the resulting noise in author profiles. KDD Cup 2013 invited participants to tackle this problem in two ways: (1) by automatically determining which papers in an author profile are truly written by a given author, and (2) by identifying which author profiles need to be merged because they belong to the same author. This paper presents a brief account of the contest and the lessons learned.
Keywords :
data mining; indexing; search engines; text analysis; KDD Cup 2013; Microsoft Academic Search; author name ambiguity; author profiles; material indexing; resulting noise; scholarly material; search engine; Educational institutions; Electronic mail; Information retrieval; Lead; Materials; Measurement; Training; author name disambiguation
Conference_Title :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
DOI :
10.1109/BigData.2013.6691761