• DocumentCode
    2109683
  • Title

    A Grid and Fractal Dimension-Based Data Stream Clustering Algorithm

  • Author

    Lin, Guoping ; Chen, Leisong

  • Author_Institution
    Dept. of Math. & Inf. Sci., Zhangzhou Normal Univ., Zhangzhou
  • Volume
    1
  • fYear
    2008
  • fDate
    20-22 Dec. 2008
  • Firstpage
    66
  • Lastpage
    70
  • Abstract
    The data stream problem has been studied extensively in recent years, largely because of the ease with which stream data can be collected. The nature of stream data makes it essential to use algorithms that require only one pass over the data, and single-scan stream analysis methods have been proposed in this context. However, clustering remains a challenging task: many published algorithms fail to scale well with the size of the data stream and the number of dimensions that describe each point, to find clusters of arbitrary shape, or to deal effectively with noise. In this paper, we propose a new data stream clustering approach called GFDStream (a grid and fractal dimension-based data stream clustering algorithm). The method combines a grid method with the fractal clustering methodology. It divides the clustering process into an online component, which periodically stores detailed summary statistics, and an offline component, which uses only these summary statistics, and it applies the concept of a pyramidal time frame in conjunction with a micro-clustering approach. The method uses the fractal dimension of the grids as a parameter and partitions the data space into grid cells, which can improve the processing speed of the algorithm. Because points in the same cluster have a high degree of self-similarity (and much less self-similarity with respect to points in other clusters), the method can distinguish clusters more effectively. We show via experiments that GFDStream deals effectively with data streams and is capable of recognizing clusters of arbitrary shape.
  • Keywords
    fractals; grid computing; pattern clustering; GFDStream; data stream clustering algorithm; fractal dimension; grid dimension; pyramidal time frame; stream analysis method; Cluster; Data stream; Fractal-Dimension; Grid;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Science and Engineering, 2008. ISISE '08. International Symposium on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-2727-4
  • Type

    conf

  • DOI
    10.1109/ISISE.2008.141
  • Filename
    4732171
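
The abstract mentions using the fractal dimension of gridded data as a clustering parameter but does not say how that dimension is computed. As a rough illustration only, the sketch below estimates a box-counting dimension by counting occupied grid cells at several cell sizes and fitting the slope of log N(r) against log(1/r). The choice of box counting, the function names `occupied_cells` and `box_counting_dimension`, and the sample data are all assumptions made for this example; they are not taken from the GFDStream paper.

```python
import math


def occupied_cells(points, cell_size):
    """Return the set of grid cells (integer index tuples) that contain a point.

    Hypothetical helper: cells are axis-aligned hypercubes of side cell_size.
    """
    return {tuple(math.floor(x / cell_size) for x in p) for p in points}


def box_counting_dimension(points, cell_sizes):
    """Estimate the box-counting dimension of a point set.

    Fits a least-squares line to (log(1/r), log N(r)), where N(r) is the
    number of occupied cells at cell size r, and returns the slope.
    This is an assumed estimator, not the one used by GFDStream.
    """
    xs = [math.log(1.0 / r) for r in cell_sizes]
    ys = [math.log(len(occupied_cells(points, r))) for r in cell_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Illustrative data: points along a diagonal line and points filling a square.
sizes = [0.5, 0.25, 0.125, 0.0625]
line = [(i / 1000, i / 1000) for i in range(1000)]
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]

line_dim = box_counting_dimension(line, sizes)
square_dim = box_counting_dimension(square, sizes)
print(line_dim, square_dim)
```

With these sample sets the estimate comes out close to 1 for the line and close to 2 for the filled square, which is the sense in which a set's fractal dimension summarizes its self-similar structure. A streaming variant would keep the per-cell counts as running summaries rather than recomputing them from stored points; that detail is likewise an assumption here.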