DocumentCode :
2831728
Title :
Efficient pattern discovery for semistructured data
Author :
Feng, Zhou ; Hsu, Wynne ; Lee, Mong Li
Author_Institution :
Sch. of Comput., National Univ. of Singapore, Kent Ridge
fYear :
2005
fDate :
16-16 Nov. 2005
Lastpage :
301
Abstract :
Discovering frequent patterns in large semistructured data repositories is one of the hardest categories of tree mining problems, since it involves the discovery of unordered embedded tree patterns. Existing work has focused primarily on the discovery of ordered, induced trees. This work proposes a divide-and-conquer algorithm called WTIMiner to discover the complete set of frequent unordered embedded subtrees. The algorithm reduces the complexity of the pattern matching and counting problems that a regular tree mining algorithm faces. Experimental results demonstrate the efficiency and scalability of WTIMiner in terms of both time and space.
Keywords :
computational complexity; data mining; divide and conquer methods; pattern matching; trees (mathematics); WTIMiner; counting problem complexity; divide-and-conquer algorithm; frequent pattern discovery; frequent unordered embedded subtrees discovery; pattern matching complexity; semistructured data repositories; tree mining problems; Data mining; Databases; Drives; Embedded computing; Itemsets; Motion pictures; Pattern matching; Scalability; Tree graphs; XML;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence, 2005. ICTAI 05. 17th IEEE International Conference on
Conference_Location :
Hong Kong
ISSN :
1082-3409
Print_ISBN :
0-7695-2488-5
Type :
conf
DOI :
10.1109/ICTAI.2005.63
Filename :
1562952