Title :
A problem of selecting optimal subset of fuzzy-valued features
Author :
Wang, X.Z. ; Tsang, E.C.C. ; Yeung, D.S.
Author_Institution :
Dept. of Comput., Hong Kong Polytech. Univ., Kowloon, Hong Kong
fDate :
1999
Abstract :
Feature subset selection is a data mining enhancement technique that aims to reduce the number of features used. This reduction is expected to improve the performance of data mining algorithms in terms of speed, accuracy, and simplicity. Although there has been some work on feature subset selection, neither the theoretical computational complexity of the problem nor the optimal selection of fuzzy-valued feature subsets has been investigated. This paper focuses on a problem called optimal fuzzy-valued feature subset selection (OFFSS), which is regarded as important but difficult in machine learning and pattern recognition. The quality of a set of features is measured by the overall overlapping degree between two classes of examples and by the size of the feature subset. The main contributions of this paper are: (1) the concept of a fuzzy extension matrix is introduced; (2) OFFSS is proved to be NP-hard; (3) a simple but powerful heuristic algorithm for OFFSS is given; and (4) the feasibility and simplicity of the proposed algorithm are demonstrated via applications of OFFSS to input selection for neuro-fuzzy systems and to fuzzy decision tree induction.
Keywords :
computational complexity; data mining; decision trees; feature extraction; fuzzy set theory; heuristic programming; learning (artificial intelligence); neural nets; pattern recognition; data mining algorithms; data mining enhancement technique; fuzzy decision tree induction; fuzzy extension matrix; heuristic algorithm; input selection; machine learning; neuro-fuzzy systems; optimal fuzzy-valued feature subset selection; overlapping degree; Electronic mail; Fuzzy neural networks; Fuzzy systems; Heuristic algorithms; Size measurement; Uncertainty
Conference_Title :
Systems, Man, and Cybernetics, 1999. IEEE SMC '99 Conference Proceedings. 1999 IEEE International Conference on
Conference_Location :
Tokyo
Print_ISBN :
0-7803-5731-0
DOI :
10.1109/ICSMC.1999.823231