• DocumentCode
    1484875
  • Title
    Data Mining for XML Query-Answering Support
  • Author
    Mazuran, Mirjana ; Quintarelli, Elisa ; Tanca, Letizia
  • Author_Institution
    Dipt. di Elettron. e Inf., Politec. di Milano, Milan, Italy
  • Volume
    24
  • Issue
    8
  • fYear
    2012
  • Firstpage
    1393
  • Lastpage
    1407
  • Abstract
    Extracting information from semistructured documents is a very hard task, and is going to become more and more critical as the amount of digital information available on the Internet grows. Indeed, documents are often so large that the data set returned as answer to a query may be too big to convey interpretable knowledge. In this paper, we describe an approach based on Tree-Based Association Rules (TARs): mined rules, which provide approximate, intensional information on both the structure and the contents of Extensible Markup Language (XML) documents, and can be stored in XML format as well. This mined knowledge is later used to provide: 1) a concise idea (the gist) of both the structure and the content of the XML document and 2) quick, approximate answers to queries. In this paper, we focus on the second feature. A prototype system and experimental results demonstrate the effectiveness of the approach.
  • Keywords
    Internet; XML; data mining; document handling; query processing; trees (mathematics); TAR; XML query-answering support; digital information; extensible markup language documents; information extraction; interpretable knowledge; semistructured documents; tree-based association rules; Association rules; Context; Indexes; Metals; Proposals; Semantics; approximate query-answering; intensional information; succinct answers
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2011.80
  • Filename
    5740892