Title :
Treelets as feature transformation tool for block diagonal linear discrimination
Author :
Sheng, Lingyan ; Ortega, Antonio ; Pique-Regi, Roger ; Asgharzadeh, Shahab
Author_Institution :
Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
Abstract :
The main novelty of this paper is the application of treelets as a feature transformation tool prior to block diagonal linear discriminant analysis (BDLDA). Using pairwise feature transformations, treelets seek to approximate the decorrelation behavior of principal component analysis (PCA) without requiring the eigenvectors of the full covariance matrix to be identified. Instead, PCA is applied successively to pairs of features, which makes it possible to reduce the magnitude of the off-diagonal terms of the resulting covariance matrix, even in scenarios where the number of available training samples is small relative to the feature vector dimension. BDLDA seeks a block diagonal approximation to the covariance matrix that captures the most discriminant sets of features and their correlations. Treelet-based preprocessing facilitates the BDLDA search primarily because each treelet coefficient is a linear combination of several original features. The blocks obtained after applying BDLDA to treelet coefficients are therefore effectively larger, in that each block draws on a larger number of original features. Thus treelets combined with BDLDA make it possible to find block diagonal structures with larger effective blocks, so that the correlations among a larger number of features can be taken into account. Our experiments demonstrate that this leads to better classification performance on real DNA expression data than state-of-the-art LDA-based techniques.
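The pairwise PCA step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `treelet_sketch`, the greedy choice of the most correlated pair, the `n_levels` parameter, and the synthetic data are all assumptions made for illustration, and the BDLDA stage is not shown. Each level rotates one pair of coordinates so that their covariance becomes zero, keeps the higher-variance coordinate for later merging, and sets the other aside.

```python
import numpy as np


def treelet_sketch(X, n_levels):
    """Hypothetical helper: apply n_levels pairwise PCA (Jacobi) rotations.

    X has shape (n_samples, n_features). Returns (basis, cov), where the
    transformed coefficients are X @ basis.T and cov is their covariance.
    Assumes every feature has non-zero sample variance.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    cov = np.cov(X, rowvar=False)
    basis = np.eye(n_features)
    active = list(range(n_features))  # coordinates still eligible for merging

    for _ in range(n_levels):
        if len(active) < 2:
            break
        # Pick the most correlated pair among the active coordinates.
        sub_cov = cov[np.ix_(active, active)]
        std = np.sqrt(np.diag(sub_cov))
        sub_corr = np.abs(sub_cov / np.outer(std, std))
        np.fill_diagonal(sub_corr, -1.0)
        a, b = np.unravel_index(np.argmax(sub_corr), sub_corr.shape)
        i, j = active[a], active[b]

        # 2x2 PCA: the rotation angle that zeroes cov[i, j].
        theta = 0.5 * np.arctan2(2.0 * cov[i, j], cov[i, i] - cov[j, j])
        c, s = np.cos(theta), np.sin(theta)
        rot = np.eye(n_features)
        rot[i, i], rot[i, j] = c, s
        rot[j, i], rot[j, j] = -s, c

        cov = rot @ cov @ rot.T
        basis = rot @ basis
        # With this angle, coordinate i carries the larger variance, so it
        # stays active; coordinate j is set aside.
        active.remove(j)

    return basis, cov


# Synthetic example with fewer samples (10) than features (20).
rng = np.random.default_rng(0)
shared = rng.normal(size=(10, 1))
X = shared + 0.5 * rng.normal(size=(10, 20))
basis, cov_after = treelet_sketch(X, n_levels=8)
coeffs = X @ basis.T  # each coefficient mixes several original features
```

Because every step is an orthogonal rotation of only two coordinates, no eigendecomposition of the full covariance matrix is needed, and each retained coefficient is a linear combination of the original features merged into it.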
Keywords :
DNA; biology computing; molecular biophysics; principal component analysis; DNA expression; block diagonal approximation; block diagonal linear discrimination; decorrelation property; feature transformation tool; pairwise feature transformation; treelet coefficient; Covariance matrix; Decorrelation; Diseases; Eigenvalues and eigenfunctions; Gene expression; Linear discriminant analysis; Pediatrics; Vectors
Conference_Title :
Genomic Signal Processing and Statistics, 2009. GENSIPS 2009. IEEE International Workshop on
Conference_Location :
Minneapolis, MN
Print_ISBN :
978-1-4244-4761-9
Electronic_ISBN :
978-1-4244-4762-6
DOI :
10.1109/GENSIPS.2009.5174361