DocumentCode
2483819
Title
Multisets and Clustering XML Documents
Author
Iyer, Swami ; Simovici, Dan A.
Author_Institution
Univ. of Massachusetts at Boston, Boston
Volume
1
fYear
2007
fDate
29-31 Oct. 2007
Firstpage
267
Lastpage
274
Abstract
We propose a novel and efficient solution to the problem of clustering XML documents based on their structure. We use operations on multisets of paths of document trees to define certain metrics on multisets. These metrics are used for clustering real and synthesized XML documents to produce high-quality clusterings.
Keywords
XML; document handling; tree data structures; tree searching; XML document clustering; document tree path; eXtensible Markup Language; high-quality clustering; multisets metrics; Artificial intelligence; Clustering algorithms; Clustering methods; Computer science; Costs; Data mining; Engines; Fourier transforms; Markup languages
fLanguage
English
Publisher
IEEE
Conference_Titel
Tools with Artificial Intelligence, 2007. ICTAI 2007. 19th IEEE International Conference on
Conference_Location
Patras
ISSN
1082-3409
Print_ISBN
978-0-7695-3015-4
Type
conf
DOI
10.1109/ICTAI.2007.18
Filename
4410294