Regional Information Center for Science and Technology - EfficientTreeMiner: Mining Frequent Induced Substructures from XML Documents without Candidate Generation

DocumentCode :

3264363

Title :

EfficientTreeMiner: Mining Frequent Induced Substructures from XML Documents without Candidate Generation

Author :

Thilagam, Santhi P. ; Ananthanarayana, V.S.

Author_Institution :

NITK-Surathkal, Karnataka

fYear :

2006

fDate :

20-23 Dec. 2006

Firstpage :

541

Lastpage :

546

Abstract :

Tree structures are used extensively in domains such as XML databases, computational biology, pattern recognition, computer networks, Web mining, and multi-relational data mining. In this paper, we present EfficientTreeMiner, a computationally efficient algorithm that discovers all frequently occurring induced subtrees in a database of labeled rooted unordered trees. The proposed algorithm mines frequent subtrees without generating any candidate subtrees. Efficiency is achieved in two ways: by compressing the large database into a condensed data structure, the prefix string representation, which reduces space complexity, and by adopting a frequent immediate descendents method that avoids the costly generation of candidate sets. Experimental results show that our algorithm has lower time complexity than existing approaches and scales to mining both long and short frequent subtrees.
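
The abstract does not spell out the prefix string representation, so the following is only a minimal sketch of one common way to encode a labeled rooted unordered tree as a string: a preorder walk that emits each node's label, then its children, then a backtrack marker, with child encodings sorted so that sibling order does not matter. The names `Node`, `BACKTRACK`, and `prefix_tokens` are hypothetical, and this is not claimed to be the encoding used in the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical backtrack marker; assumes no node label equals "$".
BACKTRACK = "$"


@dataclass
class Node:
    """A node of a labeled rooted tree (illustrative type, not from the paper)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def prefix_tokens(node: Node) -> Tuple[str, ...]:
    """Return a canonical preorder token sequence for an unordered tree.

    Each subtree is encoded as: its label, the encodings of its children
    in sorted order, then a backtrack marker. Sorting the child encodings
    makes the result independent of the order in which siblings are stored.
    """
    child_encodings = sorted(prefix_tokens(child) for child in node.children)
    tokens = [node.label]
    for encoding in child_encodings:
        tokens.extend(encoding)
    tokens.append(BACKTRACK)
    return tuple(tokens)


# Two trees that differ only in sibling order.
t1 = Node("a", [Node("b"), Node("c", [Node("d")])])
t2 = Node("a", [Node("c", [Node("d")]), Node("b")])

# A tree with a different shape: "d" hangs under "b" instead of "c".
t3 = Node("a", [Node("b", [Node("d")]), Node("c")])

encoding_1 = " ".join(prefix_tokens(t1))
encoding_2 = " ".join(prefix_tokens(t2))
encoding_3 = " ".join(prefix_tokens(t3))
```

Under this encoding, `t1` and `t2` map to the same string, so identical unordered trees can be compared or counted with plain string equality, while `t3` maps to a different string.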

Keywords :

XML; computational complexity; data mining; tree data structures; XML documents; condensed data structure; efficienttreeminer; frequent immediate descendents; labeled rooted unordered trees; mining frequent induced substructures; prefix string representation; space complexity; Biology computing; Computational biology; Computer networks; Data mining; Data structures; Databases; Pattern recognition; Tree data structures; Web mining; XML;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Advanced Computing and Communications, 2006. ADCOM 2006. International Conference on

Conference_Location :

Surathkal

Print_ISBN :

1-4244-0716-8

Electronic_ISBN :

1-4244-0716-8

Type :

conf

DOI :

10.1109/ADCOM.2006.4289951

Filename :

4289951

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3264363