Title :
Efficient extraction of maximally common subtrees from XML documents for web services
Author :
Paik, Juryon ; Son, Y.J. ; Fotouhi, Farshad ; Kim, Ungmo
Author_Institution :
Dept. of Comput. Eng., Sungkyunkwan Univ., Gyeonggi
Abstract :
Web services need to integrate and classify XML documents received from multiple, heterogeneous sources. To this end, they require a mechanism for extracting common structures, called frequent subtrees, from a large XML dataset. In this paper we propose an efficient and scalable algorithm, EMaxS, for mining frequent subtrees of Web XML documents stored in Web servers. Compared with previous work, the proposed algorithm uses only simple bitwise operations and does not require any join steps, which are typically expensive.
Keywords :
Internet; XML; data mining; tree data structures; Web service; XML document handling; data structure; frequent pattern discovery; frequent subtrees; maximally common subtree; Computer science; Data warehouses; Iterative methods; Simple object access protocol; Surges; Web server; Web services
Conference_Title :
Advanced Communication Technology, 2005, ICACT 2005. The 7th International Conference on
Conference_Location :
Phoenix Park
DOI :
10.1109/ICACT.2005.246224