• DocumentCode
    2483819
  • Title
    Multisets and Clustering XML Documents
  • Author
    Iyer, Swami; Simovici, Dan A.
  • Author_Institution
    Univ. of Massachusetts at Boston, Boston
  • Volume
    1
  • fYear
    2007
  • fDate
    29-31 Oct. 2007
  • Firstpage
    267
  • Lastpage
    274
  • Abstract
    We propose a novel and efficient solution to the problem of clustering XML documents based on their structure. We use operations on multisets of paths of document trees to define certain metrics on multisets. These metrics are used for clustering real and synthesized XML documents to produce high-quality clusterings.
  • Keywords
    XML; document handling; tree data structures; tree searching; XML document clustering; document tree path; eXtensible Markup Language; high-quality clustering; multisets metrics; Artificial intelligence; Clustering algorithms; Clustering methods; Computer science; Costs; Data mining; Engines; Fourier transforms; Markup languages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence, 2007. ICTAI 2007. 19th IEEE International Conference on
  • Conference_Location
    Patras
  • ISSN
    1082-3409
  • Print_ISBN
    978-0-7695-3015-4
  • Type
    conf
  • DOI
    10.1109/ICTAI.2007.18
  • Filename
    4410294