Title :
A rough-GA hybrid algorithm for rule extraction from large data
Author :
Chakraborty, Goutam ; Chakraborty, Basabi
Author_Institution :
Dept. of Software & Inf. Sci., Iwate Prefectural Univ., Morioka, Japan
Abstract :
Knowledge discovery from large volumes of real-life data faces a variety of problems: noise and outliers in the data set, selection of a proper subset of attributes (features) from a large number of relevant and irrelevant ones, fuzzification or discretization of real-valued data, and finally rule induction. In the proposed approach, rule creation has two steps. The first step is attribute selection, which is based on rough set theory. The second step searches for an optimal set of simple yet accurate rules, and is carried out by a genetic algorithm. The contribution here is how to set the fitness of chromosomes so that a tradeoff between simplicity and accuracy is achieved. Finally, chromosomes are coalesced to further simplify the rules and reduce their number.
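The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the abstract gives no formulas. `dependency_degree` computes the standard rough-set dependency degree (the fraction of records whose decision is fully determined by the chosen attributes), which is one common basis for rough-set attribute selection. `rule_fitness`, its `weight` parameter, and the `TOY_ROWS` table are hypothetical, chosen only to show what a simplicity-accuracy tradeoff in a chromosome's fitness might look like.

```python
from collections import defaultdict


def dependency_degree(rows, cond_attrs, decision_attr):
    """Rough-set dependency degree: share of rows in the positive region."""
    # Group rows into equivalence classes by their values on cond_attrs.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)
        classes[key].append(row[decision_attr])
    # A class lies in the positive region when all its rows share one decision.
    consistent = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return consistent / len(rows)


def rule_fitness(n_correct, n_covered, n_conditions, n_attrs, weight=0.8):
    """Hypothetical fitness: weighted mix of rule accuracy and simplicity."""
    if n_covered == 0:
        return 0.0  # a rule that covers no record carries no information
    accuracy = n_correct / n_covered
    simplicity = 1.0 - n_conditions / n_attrs  # fewer conditions scores higher
    return weight * accuracy + (1.0 - weight) * simplicity


# Made-up toy decision table, for illustration only.
TOY_ROWS = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "no", "flu": "yes"},
    {"fever": "no", "cough": "yes", "flu": "no"},
    {"fever": "no", "cough": "no", "flu": "no"},
]
```

On this toy table, `dependency_degree(TOY_ROWS, ["fever"], "flu")` is 1.0 while the same call with `["cough"]` gives 0.0, so an attribute selection step would keep `fever` and drop `cough`. In `rule_fitness`, raising `weight` favours accurate rules and lowering it favours short ones.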
Keywords :
data mining; genetic algorithms; medical information systems; rough set theory; very large databases; attribute selection; genetic algorithm; knowledge discovery; real life data sets; rule extraction; rule induction; Data mining; Data visualization; Genetics; Hospitals; Information science; Machine learning; Neural networks; Pattern recognition; Rail transportation; Set theory
Conference_Title :
Computational Intelligence for Measurement Systems and Applications, 2004. CIMSA. 2004 IEEE International Conference on
Print_ISBN :
0-7803-8341-9
DOI :
10.1109/CIMSA.2004.1397237