Title :
Discretization Using Clustering and Rough Set Theory
Author :
Singh, G.K. ; Minz, Sonajharia
Author_Institution :
Sch. Comput. & Syst. Sci., Jawaharlal Nehru Univ., New Delhi
Abstract :
Most data mining algorithms are applied to data described by discrete or nominal attributes. To apply these algorithms effectively to any dataset, continuous attributes need to be discretized. This paper presents a discretization approach using clustering and rough set theory (RST). Experiments are performed on four datasets from the UCI ML repository. The performance of the proposed approach is compared with some common discretization methods on two parameters: the number of intervals and the class-attribute interdependence redundancy (CAIR) value. The results of the proposed method show a satisfactory trade-off between the number of intervals and the information loss due to discretization.
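For readers unfamiliar with the CAIR criterion named in the abstract, the following is a minimal sketch of how it is commonly computed: the mutual information between the class label and the discretized attribute, divided by their joint entropy. The abstract does not give the formula, so this definition is the standard one from the discretization literature and is an assumption about what the paper uses. The function name `cair` and the toy data are illustrative only and do not come from the paper.

```python
import math
from collections import Counter


def cair(intervals, classes):
    """Class-attribute interdependence redundancy (sketch).

    Assumed definition: CAIR = I(C; F) / H(C, F), where C is the class
    label and F is the interval index of the discretized attribute.
    `intervals` and `classes` are equal-length sequences, one entry per sample.
    """
    if len(intervals) != len(classes):
        raise ValueError("intervals and classes must have the same length")
    n = len(intervals)
    joint = Counter(zip(classes, intervals))   # counts per (class, interval) cell
    class_counts = Counter(classes)            # marginal counts per class
    interval_counts = Counter(intervals)       # marginal counts per interval

    mutual_info = 0.0
    joint_entropy = 0.0
    for (c, f), count in joint.items():
        p = count / n
        # p(c, f) / (p(c) * p(f)) simplifies to count * n / (n_c * n_f)
        ratio = count * n / (class_counts[c] * interval_counts[f])
        mutual_info += p * math.log2(ratio)
        joint_entropy -= p * math.log2(p)

    # A single occupied cell has zero joint entropy; report 0 in that case.
    return mutual_info / joint_entropy if joint_entropy > 0 else 0.0


# Toy illustration (made-up labels, not data from the paper).
labels = ["a", "a", "b", "b"]
aligned = cair([0, 0, 1, 1], labels)      # intervals match the classes
independent = cair([0, 1, 0, 1], labels)  # intervals carry no class information
```

Under this definition the value lies between 0 and 1. A higher value means the intervals preserve more class information, which is why it is read alongside the number of intervals when judging information loss.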
Keywords :
data mining; pattern clustering; rough set theory; class-attribute interdependence redundancy value; data clustering; data mining algorithm; Clustering algorithms; Data preprocessing; Data visualization; Decision trees; Entropy; Heuristic algorithms; Set theory; Statistics; Visual databases
Conference_Title :
Computing: Theory and Applications, 2007. ICCTA '07. International Conference on
Conference_Location :
Kolkata
Print_ISBN :
0-7695-2770-1
DOI :
10.1109/ICCTA.2007.51