Title :
Methods for precise named entity matching in digital collections
Author :
Davis, Peter T. ; Elson, David K. ; Klavans, Judith L.
Author_Institution :
Columbia Univ., New York, NY, USA
Abstract :
We describe an interactive system, built within the context of the CLiMB (Computational Linguistics for Metadata Building) project, which permits a user to locate the occurrences of named entities within a given text. The named entity tool was developed to identify references to a single art object (e.g., a particular building) with high precision in text related to images of that object in a digital collection. We start with an authoritative list of art objects and seek to match variants of these named entities in related text. Our approach is to "decay" entities into progressively more general variants while retaining high precision. As variants become more general, and thus more ambiguous, we propose methods to disambiguate intermediate results. Our results are used to select the records into which automatically generated metadata are loaded.
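The "decay" idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the decay rules or the disambiguation methods, so everything below is an assumption made for illustration: the function names decay_variants and find_mentions are hypothetical, the decay rule (dropping leading words one at a time) is a crude stand-in, and ambiguity is handled simply by skipping any variant that more than one entity in the authoritative list would produce.

```python
import re


def decay_variants(name):
    """Return variants of `name`, most specific first.

    Assumed rule (not from the paper): drop leading words one at a time.
    """
    tokens = name.split()
    return [" ".join(tokens[i:]) for i in range(len(tokens))]


def find_mentions(text, entities):
    """Return {entity: [(variant, offset), ...]} for matches found in `text`.

    Variants shared by several entities are treated as ambiguous and skipped,
    a simple placeholder for the disambiguation step.
    """
    # Record which entities produce each variant so shared ones can be skipped.
    owners = {}
    for entity in entities:
        for variant in decay_variants(entity):
            owners.setdefault(variant.lower(), set()).add(entity)

    mentions = {entity: [] for entity in entities}
    claimed = []  # (start, end) spans already assigned to a more specific variant
    for entity in entities:
        for variant in decay_variants(entity):
            if len(owners[variant.lower()]) > 1:
                continue  # ambiguous variant: would need real disambiguation
            pattern = r"\b" + re.escape(variant) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                overlaps = any(m.start() < e and m.end() > s for s, e in claimed)
                if overlaps:
                    continue  # already covered by a longer match
                claimed.append((m.start(), m.end()))
                mentions[entity].append((variant, m.start()))
    return mentions


if __name__ == "__main__":
    sample = "Chartres Cathedral is large. The cathedral faces west."
    print(find_mentions(sample, ["Chartres Cathedral"]))
```

With a single entity in the list, the general variant "Cathedral" is unambiguous and is matched on its own; once a second cathedral is added to the list, that variant is shared and the sketch declines to match it, which is the point at which a real system would need a disambiguation method.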
Keywords :
art; computational linguistics; data mining; digital libraries; image retrieval; interactive systems; meta data; text analysis; CLiMB project; Computational Linguistics for Metadata Building; art object; authoritative list; digital collection; high precision text; image associated text; interactive system; intermediate result disambiguation; named entity matching; object image retrieval; reference identification; text mining; Art; Computational linguistics; Digital images; Image analysis; Image retrieval; Interactive systems; Natural languages; Robustness; Software libraries; Testing
Conference_Title :
Digital Libraries, 2003. Proceedings. 2003 Joint Conference on
Print_ISBN :
0-7695-1939-3
DOI :
10.1109/JCDL.2003.1204852