• DocumentCode
    3063606
  • Title
    Data mining for intelligent Web caching
  • Author
    Bonchi, Francesco; Giannotti, Fosca; Manco, Giuseppe; Renso, Chiara; Nanni, Mirco; Pedreschi, Dino; Ruggieri, Salvatore
  • Author_Institution
    Dept. of Comput. Sci., Pisa Univ., Italy
  • fYear
    2001
  • fDate
    Apr 2001
  • Firstpage
    599
  • Lastpage
    603
  • Abstract
    Presents a vertical application of data warehousing and data mining technology: intelligent Web caching. We introduce several ways to construct intelligent Web caching algorithms that employ predictive models of Web requests; the general idea is to extend the LRU (least recently used) policy of Web and proxy servers by making it sensitive to Web access models extracted from Web log data using data mining techniques. Two approaches have been studied in particular: one based on association rules and another based on decision trees. The experimental results show that the new algorithms substantially improve on existing LRU-based caching techniques in terms of hit rate, i.e. the fraction of Web documents retrieved directly from the cache. We designed and developed a prototype system, which supports data warehousing of Web log data, extraction of data mining models and simulation of the Web caching algorithms, around an architecture that integrates the various phases of the knowledge discovery process. The system supports systematic evaluation and benchmarking of the proposed algorithms against existing caching strategies.
  • Keywords
    Internet; cache storage; data mining; data warehouses; decision trees; file servers; information resources; Web access models; Web document retrieval; Web log data; Web servers; World Wide Web requests; algorithm evaluation; algorithm simulation; association rules; benchmarking; data warehousing; hit rate; intelligent Web caching; knowledge discovery; least recently used policy; predictive models; proxy servers; vertical application; Application software; Computer science; Councils; Electronic mail; Network servers; Prediction algorithms; Service oriented architecture; Web server
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology: Coding and Computing, 2001. Proceedings. International Conference on
  • Conference_Location
    Las Vegas, NV
  • Print_ISBN
    0-7695-1062-0
  • Type
    conf
  • DOI
    10.1109/ITCC.2001.918862
  • Filename
    918862
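
The abstract describes extending LRU with a predictive model and measuring the result by hit rate. A minimal sketch of that idea follows. It is not the authors' algorithm: the function names, the `predict_reuse` callback (a stand-in for a mined association-rule or decision-tree model) and the eviction rule (skip the oldest entries that the model expects to be requested again) are all assumptions made for illustration.

```python
from collections import OrderedDict


def lru_hit_rate(requests, capacity):
    """Simulate plain LRU and return the fraction of requests served from cache."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for url in requests:
        if url in cache:
            hits += 1
            cache.move_to_end(url)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used entry
            cache[url] = True
    return hits / len(requests) if requests else 0.0


def predictive_lru_hit_rate(requests, capacity, predict_reuse):
    """LRU variant (hypothetical): prefer evicting entries the model does not expect back.

    predict_reuse(url) -> bool is an assumed interface standing in for a model
    mined from Web log data.
    """
    cache = OrderedDict()
    hits = 0
    for url in requests:
        if url in cache:
            hits += 1
            cache.move_to_end(url)
            continue
        if len(cache) >= capacity:
            # Oldest entry that the model does NOT flag for reuse, if any.
            victim = next((u for u in cache if not predict_reuse(u)), None)
            if victim is None:
                victim = next(iter(cache))  # fall back to plain LRU
            del cache[victim]
        cache[url] = True
    return hits / len(requests) if requests else 0.0
```

Running both functions over the same request trace with the same capacity gives a like-for-like hit-rate comparison, which is the kind of benchmarking the abstract refers to. In a realistic setup the trace would come from Web server or proxy logs, and `predict_reuse` would be trained on an earlier portion of those logs.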