DocumentCode :
3348125
Title :
Cross-weighted Fisher discriminant analysis for visualization of DNA microarray data
Author :
Zhang, Xinying ; Myers, Chad L. ; Kung, S.Y.
Author_Institution :
Princeton Univ., NJ, USA
Volume :
5
fYear :
2004
fDate :
17-21 May 2004
Abstract :
Fisher discriminant analysis (DA) has recently shown promise in dimensionality reduction of high-dimensional DNA data. However, the 1D projection provided by this method is an optimal Bayesian classifier only when the intraclass data patterns are purely Gaussian distributed. Unfortunately, it is well recognized that most DNA expression data are much more realistically represented by a Gaussian mixture model (GMM), which allows for multiple cluster centroids per class. When a data set from such a GMM is projected onto a 1D subspace, its inherent multimodal nature may be partially or completely obscured. Consequently, traditional Fisher DA is inadequate when higher-dimensional visualization (e.g., 2D or 3D) is necessary. The proposed technique addresses this problem and combines supervised and unsupervised learning techniques for several DNA microarray signal processing functions, including intraclass cluster discovery, optimal projection, and identification/selection of responsible gene groups. In particular, a cross-weighted Fisher DA is proposed, and its ability to reduce dimensionality and to visualize data sets is evaluated.
Keywords :
DNA; Gaussian distribution; data mining; data visualisation; learning (artificial intelligence); medical signal processing; signal classification; 1D projection method; DNA expression data; DNA microarray signal processing; GMM; Gaussian mixture model; cross-weighted Fisher discriminant analysis; data clustering models; data visualization tools; dimensionality reduction; gene group identification; high dimensional DNA data; higher dimensional visualization; intraclass cluster discovery; machine learning computational tools; molecular sample classification; multiple cluster centroids; optimal projection; responsible gene group selection; supervised learning techniques; unsupervised learning techniques; Bayesian methods; Biomedical signal processing; Cancer; Data visualization; Gene expression; Medical treatment; Parameter estimation; Unsupervised learning
fLanguage :
English
Publisher :
IEEE
Conference_Title :
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '04), Proceedings, 2004
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1327179
Filename :
1327179