Title :
Probability Density Estimation over evolving data streams using Tilted Parzen Window
Author :
Hong Shen ; Xiao-Long Yan
Author_Institution :
Dept. of Comput. Sci. & Technol., Univ. of Sci. & Technol. of China, Hefei
Abstract :
Probability density estimation is an important technique that is widely used in data mining and data analysis. In this paper, we generalize the traditional Parzen window method to data streams and propose a new method, the tilted Parzen window (TPW), for probability density estimation. To adapt to the evolution of a data stream, we replace the fixed window size with a tilted window size that is proportional to each data item's arrival time. Theoretical analysis shows that the tilted Parzen window method is a valid method for estimating the probability density function (pdf) of data streams. We also propose a new strategy for discarding historical data in data streams, and we prove that this strategy describes changes in the probability density more accurately than the conventional discarding strategy. Empirical results on a synthetic data set demonstrate the effectiveness and efficiency of the method.
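The abstract does not state the TPW estimator itself, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a Gaussian kernel and a per-sample window width h_i = c * t_i, where t_i is the arrival time and c is a hypothetical proportionality constant; the function names are illustrative. The classical fixed-width Parzen estimate is included for comparison.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the window function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_density(x, samples, h):
    """Classical Parzen window estimate at x with one fixed width h."""
    return sum(gaussian_kernel((x - s) / h) for s in samples) / (len(samples) * h)


def tilted_parzen_density(x, stream, c):
    """Sketch of a time-dependent window estimate at x.

    stream: list of (arrival_time, value) pairs with arrival_time > 0.
    c: hypothetical constant; each sample gets width h_i = c * t_i
       (an assumption, the abstract only says "proportional").
    """
    total = 0.0
    for t, s in stream:
        h = c * t  # window width grows with arrival time
        total += gaussian_kernel((x - s) / h) / h
    return total / len(stream)
```

Under these assumptions each term integrates to one, so the average is itself a density, and when all arrival times are equal the estimate reduces to the fixed-width Parzen window.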
Keywords :
data analysis; probability; data streams; probability density estimation; probability density function; synthetic data set; tilted Parzen window method; Australia; Data analysis; Data mining; Hard disks; Merging; Probability density function; Real time systems; Streaming media; Telephony; Web pages;
Conference_Title :
Computers and Communications, 2008. ISCC 2008. IEEE Symposium on
Conference_Location :
Marrakech
Print_ISBN :
978-1-4244-2702-4
Electronic_ISBN :
1530-1346
DOI :
10.1109/ISCC.2008.4625751