• DocumentCode
    3186660
  • Title

    GCA: An algorithm based on the Gower similarity for clustering of categorical variables

  • Author

    dos Santos, T.R.L. ; Zarate, Luis E.

  • fYear
    2012
  • fDate
    1-5 Oct. 2012
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Data clustering is a technique for grouping objects in a database that share similar characteristics. Such databases may contain different variable types (numeric, categorical, scalar, binary, etc.), but categorical variables are a challenge for clustering because they lack a natural ordering. As a result, there is a shortage of tools and algorithms for clustering databases with categorical variables. This work proposes a new clustering algorithm for categorical data, called GCA (Gower Clustering Algorithm), which combines the TaxMap algorithm with Gower's similarity coefficient. GCA was compared with two other algorithms (CLOPE and FarthestFirst), and a brief statistical analysis showed that GCA achieved significant performance, helping to address this shortage.
  • Keywords
    pattern clustering; statistical analysis; Gower clustering algorithm; Gower similarity coefficient measurement; TaxMap algorithm; categorical variable data clustering technique; clustering databases; Algorithm design and analysis; Clustering algorithms; Data mining; Databases; Electronic mail; Machine learning; Statistical analysis; Algorithm; categorical; clustering; similarity
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Informatica (CLEI), 2012 XXXVIII Conferencia Latinoamericana En
  • Conference_Location
    Medellin
  • Print_ISBN
    978-1-4673-0794-9
  • Type

    conf

  • DOI
    10.1109/CLEI.2012.6427180
  • Filename
    6427180