  • DocumentCode
    2507204
  • Title

    A framework for the selective dissemination of XML documents based on inferred user profiles

  • Author

    Stanoi, Ioana ; Mihaila, George ; Padmanabhan, Sriram

  • Author_Institution
    IBM Thomas J. Watson Res. Center, NY, USA
  • fYear
    2003
  • fDate
    5-8 March 2003
  • Firstpage
    531
  • Lastpage
    542
  • Abstract
    As the amount of data available online and the number of pervasive applications that take advantage of it increase, systems that support selective dissemination of information are becoming more popular. At the same time, XML is becoming the standard for document exchange over the Internet. A key capability of emerging information dissemination systems is therefore the effective filtering of a continuous stream of XML data items according to user preferences. Here we propose a model for information dissemination that integrates profile inference with data dissemination and takes advantage of the structured content in XML documents. Starting from the assumption that explicitly stating one's information interests is an inconvenient and error-prone process, we aim to automatically construct user profiles. We do this by clustering items previously deemed valuable by the user according to a novel similarity measure that takes advantage of the semantic content of XML. Furthermore, we index the profiles from all users into a multilevel index structure whose nodes naturally will be a close match to subject areas present in the document collection. Such an approach is both intuitive and efficient since the indexing structure is not primarily affected by an increasing number of users. To support our claims we experimentally validate our method and report on its effectiveness and efficiency.
  • Keywords
    Internet; XML; document handling; information dissemination; information filters; information retrieval; XML data item; XML document; document exchange standard; information filtering; semantic content; user profile; Bandwidth; Deductive databases; Indexing; Monitoring; Pressing; Traffic control
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2003. Proceedings. 19th International Conference on
  • Print_ISBN
    0-7803-7665-X
  • Type

    conf

  • DOI
    10.1109/ICDE.2003.1260819
  • Filename
    1260819