• DocumentCode
    2728781
  • Title
    Leveraging Webpage Classification for Data Object Recognition
  • Author
    Lin, Ling ; Zhou, Lizhu
  • Author_Institution
    Tsinghua Univ., Beijing
  • fYear
    2007
  • fDate
    2-5 Nov. 2007
  • Firstpage
    667
  • Lastpage
    670
  • Abstract
    Data-rich webpages are an increasingly important data source for web applications. Although data object recognition has been studied intensively, it is mostly treated as a process separate from the upstream task of identifying relevant webpages. In this paper, we propose a method that leverages the classification results of data-rich webpages for efficient and scalable data object recognition. We propose a novel form of context information, which can be inferred from webpage classification and exploited in bottom-up data object recognition. Experimental results show that this context information improves the running efficiency of bottom-up data object recognition by 19%.
  • Keywords
    Web sites; pattern recognition; Web applications; Web page classification; Web page identification; context information; data object recognition; data-rich Web pages; Data mining; Decision trees; Labeling; Object recognition; Search engines; Software libraries; Tellurium; Utility programs;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, IEEE/WIC/ACM International Conference on
  • Conference_Location
    Fremont, CA
  • Print_ISBN
    978-0-7695-3026-0
  • Type
    conf
  • DOI
    10.1109/WI.2007.48
  • Filename
    4427170