• DocumentCode
    2960768
  • Title
    Comparative study on normalization procedures for cluster analysis of gene expression datasets
  • Author
    De Souto, Marcilio C P ; De Araujo, Daniel S A ; Costa, Ivan G. ; Soares, Rodrigo G F ; Ludermir, Teresa B. ; Schliep, Alexander
  • Author_Institution
    Dept. of Inf. & Appl. Math., Fed. Univ. of Rio Grande do Norte, Natal
  • fYear
    2008
  • fDate
    1-8 June 2008
  • Firstpage
    2792
  • Lastpage
    2798
  • Abstract
    Normalization before clustering is often needed for proximity indices, such as Euclidean distance, that are sensitive to differences in the magnitude or scale of the attributes. The goal is to equalize the magnitude and the variability of these features, which can also be seen as a way of adjusting the relative weighting of the attributes. In this context, we present a first large-scale, data-driven comparative study of three normalization procedures applied to cancer gene expression data. The results are presented in terms of how well the true cluster structure is recovered by five different clustering algorithms.
  • Keywords
    pattern clustering; Euclidian distance; cluster analysis; gene expression datasets; normalization procedures; proximity indices; Cancer; Clustering algorithms; Clustering methods; Data analysis; Dynamic range; Euclidean distance; Gene expression; Large-scale systems; Robustness; Standardization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1820-6
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2008.4634191
  • Filename
    4634191
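
The abstract's point that Euclidean distance is sensitive to attribute scale can be illustrated with a minimal sketch. It assumes two common procedures, z-score standardization and min-max rescaling, which are not necessarily the three procedures compared in the paper (the abstract does not name them). The toy data and the helper names `zscore`, `minmax`, `euclidean`, and `nearest` are hypothetical and chosen only for illustration.

```python
import numpy as np

def zscore(X):
    """Standardize each attribute (column) to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # leave constant attributes unscaled
    return (X - mu) / sigma

def minmax(X):
    """Rescale each attribute (column) to the range [0, 1]."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # leave constant attributes unscaled
    return (X - lo) / span

def euclidean(a, b):
    """Euclidean distance between two samples."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

def nearest(X, i):
    """Index of the sample closest to sample i (excluding i itself)."""
    d = [np.inf if j == i else euclidean(X[i], X[j]) for j in range(len(X))]
    return int(np.argmin(d))

# Hypothetical toy data: 4 samples, 2 attributes.
# The second attribute sits near 1000, so its small fluctuations have a
# larger absolute spread than the structure carried by the first attribute.
X = np.array([[1.0, 1000.0],
              [2.0, 1010.0],
              [9.0, 1005.0],
              [10.0, 995.0]])

print("nearest to sample 0, raw:    ", nearest(X, 0))
print("nearest to sample 0, z-score:", nearest(zscore(X), 0))
print("nearest to sample 0, min-max:", nearest(minmax(X), 0))
```

On this toy data the nearest neighbour of the first sample changes from the third sample to the second once the attributes are rescaled, which is the kind of re-weighting of attributes the abstract describes.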