DocumentCode
1622792
Title
Dimensionality reduction of unsupervised data
Author
Dash, M. ; Liu, H. ; Yao, J.
Author_Institution
Dept. of Inf. Syst. & Comput. Sci., Nat. Univ. of Singapore, Singapore
fYear
1997
Firstpage
532
Lastpage
539
Abstract
Dimensionality reduction is an important problem for the efficient handling of large databases. Many feature selection methods exist for supervised data, which carry class information. Little work has been done on dimensionality reduction of unsupervised data, in which class information is not available. Principal component analysis (PCA) is often used. However, PCA creates new features, and it is difficult to obtain an intuitive understanding of the data from the new features alone. We are concerned with the problem of determining and choosing the important original features for unsupervised data. Our method is based on the observation that removing an irrelevant feature from the feature set may not change the underlying concept of the data, whereas removing a relevant feature does. We propose an entropy measure for ranking features and conduct extensive experiments to show that our method is able to find the important features. It also compares well with Relief, a similar feature ranking method that, unlike our method, requires class information.
Keywords
entropy; feature extraction; knowledge acquisition; unsupervised learning; very large databases; Relief; class information; dimensionality reduction; entropy measure; feature ranking method; feature selection methods; feature set; intuitive understanding; irrelevant feature; large databases; unsupervised data; Computer science; Data mining; Electronics packaging; Entropy; Feature extraction; Information systems; Machine learning algorithms; Personal communication networks; Principal component analysis; Spatial databases
fLanguage
English
Publisher
IEEE
Conference_Titel
Tools with Artificial Intelligence, 1997. Proceedings., Ninth IEEE International Conference on
Conference_Location
Newport Beach, CA
ISSN
1082-3409
Print_ISBN
0-8186-8203-5
Type
conf
DOI
10.1109/TAI.1997.632300
Filename
632300