• DocumentCode
    3231640
  • Title

    Protection Techniques from Information Extraction

  • Author

    Greco, Gianluigi ; Ianni, Giovambattista ; Lio, Vincenzino ; Palopoli, Luigi

  • Author_Institution
    Calabria Univ.
  • fYear
    2006
  • fDate
    Dec. 2006
  • Firstpage
    1029
  • Lastpage
    1033
  • Abstract
    Information extraction technologies meet the market need for automatic tools that extract semi-structured information from Web pages. However, pages may change over time for different reasons, ranging from page restyling to deliberate modifications intended to confuse Web wrappers. In this paper we deal with the latter scenario by studying the issue of on-purpose wrapper spoiling and its relationship to wrapping. We present an architecture and a tool implementing a wrapper spoiling system, and discuss some practical spoiling techniques, which are also experimentally tested.
  • Keywords
    Internet; information retrieval; Web pages; on-purpose wrapper spoiling system; protection techniques; semistructured information extraction; Advertising; Application software; Data mining; Electronic mail; HTML; Humans; Protection; System testing; Wrapping
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    0-7695-2747-7
  • Type

    conf

  • DOI
    10.1109/WI.2006.138
  • Filename
    4061515