• DocumentCode
    2195234
  • Title

    OGSA-DWC: A Middleware for Deep Web Crawling Using the Grid

  • Author

    Song, Jihwan ; Choi, Dong-Hoon ; Lee, Yoon-Joon

  • Author_Institution
    Div. of Comput. Sci., KAIST, Daejeon, South Korea
  • fYear
    2008
  • fDate
    7-12 Dec. 2008
  • Firstpage
    370
  • Lastpage
    371
  • Abstract
    Conventional search engines generally cannot find information in the Deep Web because they rely on hyperlink-based crawling techniques to visit Web pages. Recently, considerable research effort has been devoted to crawling the Deep Web. One obstacle to crawling the Deep Web is that it requires huge computing resources, which most search engine companies can hardly provide. We therefore propose the design of OGSA-DWC, a Grid-based middleware for crawling the Deep Web. With our middleware, developers can easily implement a Grid-based Deep Web crawling system even if they have little knowledge of how to use idle, distributed computing resources.
  • Keywords
    Web sites; grid computing; middleware; open systems; search engines; software architecture; Deep Web crawling system; Web page; distributed computing resources; grid-based middleware; open grid services architecture; search engine; Computer science; Crawlers; Databases; Distributed computing; Grid computing; Information retrieval; Middleware; Production facilities; Search engines; Web pages; Deep Web; Grid; OGSA; crawling; middleware;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    eScience, 2008. eScience '08. IEEE Fourth International Conference on
  • Conference_Location
    Indianapolis, IN
  • Print_ISBN
    978-1-4244-3380-3
  • Electronic_ISBN
    978-0-7695-3535-7
  • Type

    conf

  • DOI
    10.1109/eScience.2008.118
  • Filename
    4736801