• DocumentCode
    2736808
  • Title

    A Method for Web Data Collection for Pervasive Computing

  • Author

    Wang, Lihong ; Li, Qingzhong

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan
  • Volume
    2
  • fYear
    2008
  • fDate
    6-8 Oct. 2008
  • Firstpage
    553
  • Lastpage
    558
  • Abstract
    This paper proposes a new method for web data collection for pervasive computing. With the rapid expansion of the World Wide Web, dynamic web pages have become more important. They are usually generated from a database through a common template. The structured data extracted from these pages, together with semantic annotation, are valuable for information systems. In this paper, we study how to label data values with attributes, and how to automatically detect the template behind these pages and extract the embedded data. To label data values with attributes, we rely on the fact that the label text is visually close to the data element, and we propose a bootstrapping method for learning labels. A novel algorithm is presented to detect the template and construct a wrapper. Experimental results obtained using a large number of pages show that the proposed technique is highly effective.
  • Keywords
    ubiquitous computing; Web data collection; bootstrapping method; learning label; pervasive computing; semantic annotation; World Wide Web; Computer science; Data mining; Databases; HTML; Information resources; Information systems; Labeling; Web pages; Web sites; data collection; template detection; wrapper generation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pervasive Computing and Applications, 2008. ICPCA 2008. Third International Conference on
  • Conference_Location
    Alexandria
  • Print_ISBN
    978-1-4244-2020-9
  • Electronic_ISBN
    978-1-4244-2021-6
  • Type

    conf

  • DOI
    10.1109/ICPCA.2008.4783674
  • Filename
    4783674