Title :
Measuring XML document similarity: a case study for evaluating information extraction systems
Author :
Canfora, Gerardo ; Cerulo, Luigi ; Scognamiglio, Rita
Author_Institution :
Dept. of Eng., Univ. of Sannio, Benevento, Italy
Abstract :
Measuring similarity between trees, such as XML-structured information, plays an important role in many applications, in particular in evaluating the effectiveness of information extraction systems (IES). In this paper we report our experience in evaluating IES in terms of extraction effectiveness and adaptation effectiveness. The first part of the paper introduces a similarity measure between XML trees based on a common subtree detection algorithm; the second part presents a case study that applies the measure to evaluate the effectiveness of a group of IES.
Keywords :
XML; information retrieval systems; knowledge management; trees (mathematics); XML document; information extraction system; subtree detection algorithm; Application software; Computer aided software engineering; Costs; Data mining; Detection algorithms; Particle measurements; Software measurement; Tree graphs; US Department of Transportation
Conference_Title :
Software Metrics, 2004. Proceedings. 10th International Symposium on
Print_ISBN :
0-7695-2129-0
DOI :
10.1109/METRIC.2004.1357889