DocumentCode :
1622854
Title :
Learning maximal generalized decision rules via discretization, generalization and rough set feature selection
Author :
Hu, Xiaohua ; Cercone, Nick
Author_Institution :
Sybase Inc., Burlington, MA, USA
fYear :
1997
Firstpage :
548
Lastpage :
556
Abstract :
We present a method to mine maximal generalized decision rules from databases by integrating discretization, generalization and rough set feature selection. Our method reduces the data both horizontally and vertically. In the first phase, discretization and generalization are integrated: numeric attributes are discretized into a few intervals, primitive values of symbolic attributes are replaced by high-level concepts, and some obviously superfluous or irrelevant symbolic attributes are eliminated. Horizontal reduction is accomplished by merging identical tuples after an attribute value has been replaced by its higher-level value in a predefined concept hierarchy (for symbolic attributes) or after discretization (for continuous, or numeric, attributes). In the second phase, a novel context-sensitive feature merit measure is used to rank the features, and a subset of relevant attributes is chosen based on rough sets theory and the merit values of the features. A reduced table is obtained by removing the attributes that are not in the relevant subset, so the data set is further reduced vertically without destroying the interdependence relationships between the classes and the attributes. Rough-set-based value reduction is then performed on the reduced table and all redundant condition values are dropped. Finally, the tuples in the reduced table are transformed into a set of maximal generalized decision rules. The experimental results on UCI data sets and an actual market database show that our method can dramatically reduce the feature space and improve the learning accuracy.
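The two reduction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the concept hierarchy `CITY_TO_REGION`, the cut point, the toy rows and the function names are all hypothetical, and the paper's context-sensitive merit measure is not reproduced. The sketch substitutes the standard rough set dependency degree (the share of tuples whose condition values determine the class unambiguously) as a simple stand-in for judging which attributes can be dropped.

```python
from collections import Counter

# Hypothetical concept hierarchy: primitive value -> higher-level concept.
CITY_TO_REGION = {
    "Boston": "East", "New York": "East",
    "Seattle": "West", "Portland": "West",
}

def discretize(value, cut_points):
    """Return the index of the interval that value falls into."""
    return sum(value >= c for c in cut_points)

def generalize(rows, cut_points):
    """Map (city, income, label) to (region, income interval, label)."""
    return [(CITY_TO_REGION[city], discretize(income, cut_points), label)
            for city, income, label in rows]

def merge_identical(rows):
    """Horizontal reduction: collapse identical tuples, keeping a count."""
    return Counter(rows)

def dependency(table, attrs):
    """Rough set dependency degree of the class (last field) on attrs.

    This is the fraction of tuples whose values on attrs occur with
    exactly one class label. It stands in for the paper's merit measure.
    """
    classes = {}
    for row in table:
        key = tuple(row[i] for i in attrs)
        classes.setdefault(key, set()).add(row[-1])
    total = sum(table.values())
    consistent = sum(count for row, count in table.items()
                     if len(classes[tuple(row[i] for i in attrs)]) == 1)
    return consistent / total

# Toy data, invented for illustration only.
rows = [
    ("Boston", 32000, "no"),
    ("New York", 35000, "no"),
    ("Seattle", 71000, "yes"),
    ("Portland", 68000, "yes"),
    ("Boston", 90000, "yes"),
]

reduced = merge_identical(generalize(rows, cut_points=[50000]))
full = dependency(reduced, (0, 1))      # both condition attributes
income_only = dependency(reduced, (1,)) # income interval alone
region_only = dependency(reduced, (0,)) # region alone
```

In this toy table the five original rows merge into three generalized tuples, and the income interval alone preserves the full dependency while the region alone does not, so the region column could be removed (vertical reduction) without losing the ability to separate the classes.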
Keywords :
decision theory; deductive databases; feature extraction; fuzzy set theory; knowledge acquisition; learning (artificial intelligence); UCI data sets; attribute value; context sensitive feature merit measure; discretization; feature space; generalization; high level concepts; identical tuples; learning accuracy; market database; maximal generalized decision rule learning; maximal generalized decision rule mining; maximal generalized decision rules; numeric attributes; predefined concept hierarchy; primitive values; redundant condition values; rough set feature selection; rough sets theory; symbolic attributes; Computer science; Data mining; Merging; Phase measurement; Rough sets; Spatial databases; System testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence, 1997. Proceedings., Ninth IEEE International Conference on
Conference_Location :
Newport Beach, CA
ISSN :
1082-3409
Print_ISBN :
0-8186-8203-5
Type :
conf
DOI :
10.1109/TAI.1997.632302
Filename :
632302