DocumentCode :
134894
Title :
Cluster validation techniques for Bibliographic databases
Author :
Mishra, Sumit ; Saha, Sriparna ; Mondal, Samrat
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Patna, Patna, India
fYear :
2014
fDate :
Feb. 28 2014-March 2 2014
Firstpage :
93
Lastpage :
98
Abstract :
In entity name disambiguation, records belonging to the same entity are clustered together. A major challenge in such techniques is validating the result, because the actual or correct results are often unknown or difficult to obtain. Three commonly used evaluation measures in this context are precision, recall and F-measure. All of these are external validity indices, as they require gold standard data. However, in bibliographic databases such as DBLP, Arnetminer, Scopus and Web of Science, obtaining a gold standard for each entity is very difficult. There is therefore a need for other metrics to evaluate performance on bibliographic data. In this paper, a novel scheme based on internal validity indices is used to evaluate the performance of an entity name disambiguation algorithm. Several distance measures are used to compute the similarity between two records, and these functions are then incorporated into the definitions of the internal validity indices.
Keywords :
bibliographic systems; bibliographies; database indexing; pattern clustering; bibliographic databases; cluster validation techniques; entity name disambiguation algorithm; entity name disambiguation technique; external validity indices; f-measure; internal validity index; precision; recall; Clustering algorithms; Equations; Gold; Indexes; Information services; Mathematical model; Standards; Bibliographic Database; Entity name disambiguation; Golden Standard; Validity Index;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Students' Technology Symposium (TechSym), 2014 IEEE
Conference_Location :
Kharagpur
Print_ISBN :
978-1-4799-2607-7
Type :
conf
DOI :
10.1109/TechSym.2014.6807921
Filename :
6807921