• DocumentCode
    245330
  • Title

    On Utilizing Stochastic Non-linear Fractional Bin Packing to Resolve Distributed Web Crawling

  • Author

    Yazid, Anis ; Oommen, B. John ; Granmo, Ole-Christoffer ; Goodwin, Morten

  • Author_Institution
    Dept. of Comput. Sci., Univ. Coll. of Oslo & Akershus, Oslo, Norway
  • fYear
    2014
  • fDate
    19-21 Dec. 2014
  • Firstpage
    32
  • Lastpage
    37
  • Abstract
    This paper deals with the extremely pertinent problem of web crawling, which is far from trivial considering the magnitude and all-pervasive nature of the World-Wide Web. While numerous AI tools can be used to deal with this task, in this paper we map the problem onto the combinatorially-hard stochastic non-linear fractional knapsack problem, which, in turn, is solved using Learning Automata (LA). Such LA-based solutions have recently been shown to outperform previous state-of-the-art approaches to resource allocation in Web monitoring. However, the ever-growing deployment of distributed systems raises the need for solutions that cope with a distributed setting. In this paper, we present a novel scheme for solving the non-linear fractional bin packing problem. Furthermore, we demonstrate that our scheme has applications to Web crawling, i.e., distributed resource allocation, and in particular, to distributed Web monitoring. Comprehensive experimental results demonstrate the superiority of our scheme when compared to other classical approaches.
  • Keywords
    Internet; automata theory; bin packing; learning (artificial intelligence); stochastic processes; LA; Web crawling; World Wide Web; distributed resource allocation; learning automata; stochastic nonlinear fractional bin packing problem; stochastic nonlinear fractional knapsack problem; Automata; Crawlers; Educational institutions; Materials; Monitoring; Resource management; Web pages; Bin Packing; Distributed Web Monitoring; Learning Automata
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Science and Engineering (CSE), 2014 IEEE 17th International Conference on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-1-4799-7980-6
  • Type

    conf

  • DOI
    10.1109/CSE.2014.40
  • Filename
    7023551