• DocumentCode
    3031364
  • Title

    Creating Streaming Iterative Soft Clustering Algorithms

  • Author

    Hore, Prodip ; Hall, Lawrence O. ; Goldgof, Dmitry B.

  • Author_Institution
    Univ. of South Florida, Tampa
  • fYear
    2007
  • fDate
    24-27 June 2007
  • Firstpage
    484
  • Lastpage
    488
  • Abstract
    A growing number of large labeled and unlabeled data sets are available. Clustering algorithms are best suited for helping one make sense of unlabeled data. However, scaling iterative clustering algorithms to large amounts of data has been a challenge. The computation time can be very long, and for data sets that will not fit in even the largest memory, only carefully chosen subsets of the data can be practically clustered. We present a general approach that enables iterative fuzzy/possibilistic clustering algorithms to be turned into algorithms that can handle arbitrary amounts of streaming data. The computation time is also reduced for very large data sets, while the clustering results will be very similar to those from clustering with all the data, if that were possible. We introduce transformed equations for fuzzy C-means, possibilistic C-means, and the Gustafson-Kessel algorithm, and show excellent performance with a streaming fuzzy C-means implementation. The resulting clusters are both sensible and, for comparable data sets (those that fit in memory), almost identical to those obtained with the original clustering algorithm.
  • Keywords
    fuzzy logic; iterative methods; pattern clustering; possibility theory; Gustafson-Kessel algorithm; fuzzy-C-means; iterative fuzzy-possibilistic clustering algorithms; possibilistic C-means; streaming iterative soft clustering algorithm; unlabeled data set; Clustering algorithms; Computer science; Equations; Fuzzy sets; Iterative algorithms; Iterative methods; Labeling; Partitioning algorithms; Sampling methods; Wrapping; clustering; fuzzy; possibilistic; scalable; streaming;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Information Processing Society, 2007. NAFIPS '07. Annual Meeting of the North American
  • Conference_Location
    San Diego, CA
  • Print_ISBN
    1-4244-1213-7
  • Electronic_ISBN
    1-4244-1214-5
  • Type

    conf

  • DOI
    10.1109/NAFIPS.2007.383888
  • Filename
    4271111