• DocumentCode
    475556
  • Title
    Clustering breast cancer data by consensus of different validity indices
  • Author
    Soria, Daniele ; Garibaldi, Jonathan M. ; Ambrogi, F. ; Lisboa, Paulo J. G. ; Boracchi, P. ; Biganzoli, E.
  • Author_Institution
    Sch. of Comput. Sci., Univ. of Nottingham, Nottingham
  • fYear
    2008
  • fDate
    14-16 July 2008
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Clustering algorithms generally either partition a given data set into a pre-specified number of clusters or produce a hierarchy of clusters. In this paper we analyse several clustering techniques and apply them to a breast cancer data set. When the best number of groups is not known a priori, we use a range of validity indices to assess the quality of the clustering results and to determine the best number of clusters. For the K-means method the indices do not fully agree on the best number of clusters, whereas for the PAM algorithm all the indices indicate 4 as the best number of clusters.
  • Keywords
    biological organs; gynaecology; medical computing; pattern clustering; K-means method; PAM algorithm; breast cancer; clustering algorithms; fuzzy c-means; Breast cancer; Clustering algorithms; Validity indices
  • fLanguage
    English
  • Publisher
    IET
  • Conference_Titel
    Advances in Medical, Signal and Information Processing, 2008. MEDSIP 2008. 4th IET International Conference on
  • Conference_Location
    Santa Margherita Ligure
  • ISSN
    0537-9989
  • Print_ISBN
    978-0-86341-934-8
  • Type
    conf
  • Filename
    4609085
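
The abstract describes choosing the number of clusters by agreement among several validity indices. The following is a minimal sketch of that idea, not the authors' implementation: the record does not say which indices were used, so the silhouette and Calinski-Harabasz indices are assumptions chosen for illustration, the data are a synthetic toy set rather than the breast cancer data, only K-means is shown (PAM is not), and every function name is hypothetical.

```python
from collections import Counter
from math import dist


def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))


def group(points, labels):
    """Map each cluster label to the list of points carrying it."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-first initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j, members in group(points, labels).items():
            centers[j] = centroid(members)  # empty clusters keep their old centre
    return labels


def silhouette(points, labels):
    """Mean silhouette width; higher is better. Singletons score 0."""
    clusters = group(points, labels)
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(point, q) for q in other) / len(other)
                for name, other in clusters.items() if name != label)
        total += (b - a) / max(a, b)
    return total / len(points)


def calinski_harabasz(points, labels):
    """Between/within dispersion ratio; higher is better."""
    clusters = group(points, labels)
    n, k = len(points), len(clusters)
    grand = centroid(points)
    between = sum(len(c) * dist(centroid(c), grand) ** 2 for c in clusters.values())
    within = sum(dist(p, centroid(c)) ** 2 for c in clusters.values() for p in c)
    return (between / (k - 1)) / (within / (n - k))


def consensus_k(points, candidates, indices):
    """Each index votes for its best k; the most frequent vote wins."""
    partitions = {k: kmeans(points, k) for k in candidates}
    votes = {name: max(candidates, key=lambda k: index(points, partitions[k]))
             for name, index in indices.items()}
    best_k, _ = Counter(votes.values()).most_common(1)[0]
    return best_k, votes


# Synthetic toy data (an assumption): three tight, well-separated groups.
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]

best_k, votes = consensus_k(
    points,
    range(2, 6),
    {"silhouette": silhouette, "calinski_harabasz": calinski_harabasz},
)
print(best_k, votes)
```

On this toy set both indices should agree on three clusters. On real data the indices may disagree, as the abstract reports for K-means, and a simple majority vote is only one possible way to combine them.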