DocumentCode :
2056190
Title :
Link Discovery: A Comprehensive Analysis
Author :
Erbs, Nicolai ; Zesch, Torsten ; Gurevych, Iryna
Author_Institution :
Ubiquitous Knowledge Process. Lab., Tech. Univ. Darmstadt, Darmstadt, Germany
fYear :
2011
fDate :
18-21 Sept. 2011
Firstpage :
83
Lastpage :
86
Abstract :
We present a comprehensive analysis of link discovery approaches. We classify them with regard to the type of knowledge being used, and identify three commonly used sources of knowledge: the text of a document, the document title, and already existing links. We analyze the influence of the knowledge source as well as of the amount of training data used. Results show that the link-based approach performs best if the amount of training data is huge. In a more realistic setting with less training data, the text-based approach yields better results.
Keywords :
data mining; document handling; document text; document title; knowledge sources; link discovery; Accuracy; Electronic publishing; Encyclopedias; Internet; Joining processes; Training data; Link Discovery; Natural Language Processing; Wikipedia; Wikis
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
Conference_Location :
Palo Alto, CA
Print_ISBN :
978-1-4577-1648-5
Electronic_ISBN :
978-0-7695-4492-2
Type :
conf
DOI :
10.1109/ICSC.2011.63
Filename :
6061290