Title :
Collaborative Web Data Record Extraction
Author :
Miao, Gengxin ; Kart, Firat ; Moser, L.E. ; Melliar-Smith, P.M.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of California, Santa Barbara, CA, USA
Abstract :
This paper describes a Web service that automatically parses and extracts data records from Web pages containing structured data. The Web service allows multiple users to share and manage a Web data record extraction task, which increases its utility. A recommendation system, based on the probabilistic latent semantic indexing algorithm, enables a user to find potentially interesting content or other users who share the user's interests. A distributed computing platform improves the scalability of the Web service by spreading the load of multiple users across multiple server computers. A Web service interface allows users to access the Web service and allows programmers to develop their own applications and thus extend its functionality.
Keywords :
Web services; data mining; data structures; groupware; information filters; information retrieval; Web page; Web service interface; collaborative Web data record extraction; data structure; distributed computing platform; multiple server computer; probabilistic latent semantic indexing algorithm; recommendation system; Collaboration; Delay; Distributed computing; File servers; Indexing; Programming profession; Scalability; Web pages; Web Service; collaborative information extraction
Conference_Title :
Web Services, 2009. ICWS 2009. IEEE International Conference on
Conference_Location :
Los Angeles, CA
Print_ISBN :
978-0-7695-3709-2
DOI :
10.1109/ICWS.2009.109