• DocumentCode
    2310289
  • Title
    Scalable fuzzy neighborhood DBSCAN
  • Author
    Parker, Jonathon K. ; Hall, Lawrence O. ; Kandel, Abraham
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
  • fYear
    2010
  • fDate
    18-23 July 2010
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    The majority of data available in most disciplines is unlabeled and unclassified. The amount of data is often massive, so scalable processing methods are required. One way to give structure to unlabeled data is to group it by clustering. Density-based methods discover the number of clusters themselves, and the clusters they find can be irregular in shape. In this paper we examine a version of DBSCAN modified to use fuzzy membership functions (FN-DBSCAN). FN-DBSCAN was implemented using the WEKA data mining framework, and a scalable technique (SFN-DBSCAN) is simulated using the same framework. Experimental results show that SFN-DBSCAN can be over three times as fast as FN-DBSCAN for small- to medium-sized data. The resulting cluster assignments match those of FN-DBSCAN at an average rate of 90%. SFN-DBSCAN's speed increases proportionally with the number of subsets, but cluster assignment concurrence between FN-DBSCAN and SFN-DBSCAN degrades as the number of subsets increases.
  • Keywords
    data mining; fuzzy set theory; WEKA data mining framework; density based method; fuzzy membership functions; scalable fuzzy neighborhood DBSCAN; scalable processing; scalable technique; Accuracy; Classification algorithms; Clustering algorithms; Complexity theory; Fuzzy logic; Noise; Runtime
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems (FUZZ), 2010 IEEE International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1098-7584
  • Print_ISBN
    978-1-4244-6919-2
  • Type
    conf
  • DOI
    10.1109/FUZZY.2010.5584527
  • Filename
    5584527