DocumentCode
2195234
Title
OGSA-DWC: A Middleware for Deep Web Crawling Using the Grid
Author
Song, Jihwan ; Choi, Dong-Hoon ; Lee, Yoon-Joon
Author_Institution
Div. of Comput. Sci., KAIST, Daejeon, South Korea
fYear
2008
fDate
7-12 Dec. 2008
Firstpage
370
Lastpage
371
Abstract
Conventional search engines generally cannot find information in the Deep Web because they rely on hyperlink-based crawling techniques to visit Web pages. Recently, many research efforts have attempted to crawl the Deep Web. One obstacle to crawling the Deep Web is that it requires huge computing resources, a need that most search engine companies can hardly meet. We therefore propose the design of OGSA-DWC, a Grid-based middleware for crawling the Deep Web. With our middleware, developers can easily implement a Grid-based Deep Web crawling system even if they do not know much about how to use idle, distributed computing resources.
Keywords
Web sites; grid computing; middleware; open systems; search engines; software architecture; Deep Web crawling system; Web page; distributed computing resources; grid-based middleware; open grid services architecture; search engine; Computer science; Crawlers; Databases; Distributed computing; Grid computing; Information retrieval; Middleware; Production facilities; Search engines; Web pages; Deep Web; Grid; OGSA; crawling; middleware
fLanguage
English
Publisher
ieee
Conference_Titel
eScience, 2008. eScience '08. IEEE Fourth International Conference on
Conference_Location
Indianapolis, IN
Print_ISBN
978-1-4244-3380-3
Electronic_ISBN
978-0-7695-3535-7
Type
conf
DOI
10.1109/eScience.2008.118
Filename
4736801