DocumentCode
2736808
Title
A Method for Web Data Collection for Pervasive Computing
Author
Wang, Lihong ; Li, Qingzhong
Author_Institution
Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan
Volume
2
fYear
2008
fDate
6-8 Oct. 2008
Firstpage
553
Lastpage
558
Abstract
This paper proposes a new method for web data collection for pervasive computing. With the rapid expansion of the World Wide Web, dynamic web pages have become increasingly important. They are usually generated from a database through a common template. The structured data extracted from these pages, together with semantic annotation, are valuable for information systems. In this paper, we study how to label data values with attributes, automatically detect the template behind these pages, and extract the embedded data. To label data values with attributes, we rely on the fact that the label text is visually close to the data element, and we propose a bootstrapping method for learning labels. A novel algorithm is presented to detect the template and construct a wrapper. Experimental results obtained on a large number of pages show that the proposed technique is highly effective.
Keywords
ubiquitous computing; Web data collection; bootstrapping method; learning label; pervasive computing; semantic annotation; World Wide Web; Computer science; Data mining; Databases; HTML; Information resources; Information systems; Labeling; Web pages; Web sites; data collection; template detection; wrapper generation
fLanguage
English
Publisher
IEEE
Conference_Titel
Pervasive Computing and Applications, 2008. ICPCA 2008. Third International Conference on
Conference_Location
Alexandria
Print_ISBN
978-1-4244-2020-9
Electronic_ISBN
978-1-4244-2021-6
Type
conf
DOI
10.1109/ICPCA.2008.4783674
Filename
4783674