Title :
Predicting Query Duplication with Box-Jenkins Models and Its Applications
Author :
Hu, Xinyao ; Meng, Shicong ; Shi, Cong ; Han, Dingyi ; Yu, Yong
Author_Institution :
Shanghai Jiao Tong Univ., Shanghai
Abstract :
Many previous works on peer-to-peer traffic characterization and modeling have focused on the distribution of query contents. However, little has been done towards a better understanding of the time series distribution of these queries, which is vital for system performance. To remedy this situation, this paper characterizes query traffic by using automatic time series analysis to evaluate different linear models (Box-Jenkins models and some simple windowed-mean models) for predicting the number of duplicated queries from 10 minutes to 2 hours into the future. Both the predictive power and the computational costs of these models are evaluated over 318,942,450 real-world Gnutella queries collected over 3 months. We find that the number of duplicated queries is consistently predictable, and that simple, practical models such as AR perform well on prediction. To show that these characteristics have a wide range of potential applications, we propose two enhancements to existing search result caching and load balancing algorithms. Our simulation study shows that our methodology works well in both scenarios in terms of efficiency and effectiveness. The main contributions of this paper are: (1) proposing new measurement techniques on Gnutella, (2) characterizing and modeling peer-to-peer query traffic with Box-Jenkins models, and (3) presenting a general enhancement to existing performance optimization algorithms in P2P systems.
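Illustrative sketch (not taken from the paper): the two model families named in the abstract can be contrasted on a toy series. The code below assumes duplicated-query counts binned at 10-minute intervals, so that 12 steps ahead corresponds to the 2-hour horizon; the bin width, the sample counts, and the function names are all hypothetical. It fits only a first-order AR model by ordinary least squares, which is one simple member of the Box-Jenkins family and not necessarily the model order or fitting method the authors used.

```python
from statistics import mean


def windowed_mean_forecast(series, window):
    """Predict the next value as the mean of the last `window` observations."""
    return mean(series[-window:])


def fit_ar1(series):
    """Fit x[t] = c + phi * x[t-1] by ordinary least squares; return (c, phi)."""
    prev, curr = series[:-1], series[1:]
    mean_prev, mean_curr = mean(prev), mean(curr)
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        # Constant history: fall back to predicting the mean.
        return mean_curr, 0.0
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = cov / var_prev
    return mean_curr - phi * mean_prev, phi


def ar1_forecast(series, steps):
    """Iterate the fitted AR(1) recurrence `steps` bins into the future."""
    c, phi = fit_ar1(series)
    x = series[-1]
    forecasts = []
    for _ in range(steps):
        x = c + phi * x
        forecasts.append(x)
    return forecasts


# Made-up counts of duplicated queries per 10-minute bin (assumption).
counts = [120, 130, 125, 140, 150, 145, 160, 170, 165, 180]
print(windowed_mean_forecast(counts, 3))  # one step (10 minutes) ahead
print(ar1_forecast(counts, 12))           # 12 steps (2 hours) ahead
```

The windowed mean needs no fitting and costs one pass over the window, while the AR fit costs one pass over the history; this mirrors, in miniature, the trade-off between predictive power and computational cost that the abstract says is evaluated.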
Keywords :
cache storage; peer-to-peer computing; query processing; resource allocation; Box-Jenkins model; Gnutella query stream; P2P systems; automatic time series analysis; load balancing; peer-to-peer traffic characterization; performance optimization algorithm; query duplication prediction; result caching; Computational efficiency; Load management; Measurement techniques; Optimization; Peer to peer computing; Power system modeling; Predictive models; System performance; Time series analysis; Traffic control;
Conference_Title :
Peer-to-Peer Computing, 2007. P2P 2007. Seventh IEEE International Conference on
Conference_Location :
Galway
Print_ISBN :
978-0-7695-2986-8
DOI :
10.1109/P2P.2007.21