Title :
TreeFinder: a first step towards XML data mining
Author :
Termier, Alexandre ; Rousset, Marie-Christine ; Sebag, Michèle
Author_Institution :
LRI, Univ. de Paris-Sud, Orsay, France
Abstract :
In this paper we consider the problem of searching for frequent trees in a collection of tree-structured data modeling XML data. The TreeFinder algorithm aims at finding trees such that their exact or perturbed copies are frequent in a collection of labelled trees. To cope with complexity issues, TreeFinder is correct but not complete: it finds a subset of the actually frequent trees. The lack of completeness is experimentally investigated on artificial medium-size datasets; it is shown that TreeFinder reaches completeness or falls short of it for a range of experimental settings.
Keywords :
computational complexity; data mining; data models; hypermedia markup languages; tree data structures; TreeFinder algorithm; XML data mining; artificial medium size datasets; completeness; complexity; exact copies; frequent tree searching; labelled trees; perturbed copies; tree-structured data; Concrete; Corporate acquisitions; Data mining; Robustness; XML;
Conference_Title :
Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN :
0-7695-1754-4
DOI :
10.1109/ICDM.2002.1183987