DocumentCode :
2736808
Title :
A Method for Web Data Collection for Pervasive Computing
Author :
Wang, Lihong ; Li, Qingzhong
Author_Institution :
Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan
Volume :
2
fYear :
2008
fDate :
6-8 Oct. 2008
Firstpage :
553
Lastpage :
558
Abstract :
This paper proposes a new method for web data collection for pervasive computing. With the rapid expansion of the World Wide Web, dynamic web pages have become more important. They are usually generated from a database through a common template. The structured data extracted from these pages, together with semantic annotation, are valuable for information systems. In this paper, we study how to label data values with attributes, automatically detect the template behind these pages, and extract the embedded data. To label data values with attributes, we rely on the fact that the label text is visually close to the data element, and we propose a bootstrapping method for learning labels. A novel algorithm is presented to detect the template and construct a wrapper. Experimental results obtained on a large number of pages show that the proposed technique is highly effective.
Keywords :
ubiquitous computing; Web data collection; bootstrapping method; learning label; pervasive computing; semantic annotation; World Wide Web; Computer science; Data mining; Databases; HTML; Information resources; Information systems; Labeling; Pervasive computing; Web pages; Web sites; Pervasive Computing; data collection; template detection; wrapper generation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pervasive Computing and Applications, 2008. ICPCA 2008. Third International Conference on
Conference_Location :
Alexandria
Print_ISBN :
978-1-4244-2020-9
Electronic_ISBN :
978-1-4244-2021-6
Type :
conf
DOI :
10.1109/ICPCA.2008.4783674
Filename :
4783674