Title :
Data types generalization for data mining algorithms
Author :
Jiang, Mon-Fong ; Tseng, Shian-Shyong ; Liao, Shan-Yi
Author_Institution :
Dept. of Comput. & Inf. Sci., Nat. Chiao Tung Univ., Hsinchu, Taiwan
fDate :
1999
Abstract :
With the increasing use of database applications, mining interesting information from huge databases has become a major concern, and a variety of mining algorithms have been proposed in recent years. The data processed in data mining may come from many sources that use different data types. However, no single algorithm can be applied to all applications, because of the difficulty of fitting data types to the algorithm. The selection of an appropriate data mining algorithm therefore depends not only on the goal of the application but also on data fittability. Transforming a non-fitting data type into a target one is thus an important step in data mining, but the work is often tedious or complex because many data types exist in the real world. Merging the similar data types of a selected mining algorithm into a generalized data type appears to be a good way to reduce the transformation complexity. In this work, the data type fittability problem is discussed for six kinds of widely used data mining techniques, and a data type generalization process consisting of a merging phase and a transforming phase is proposed. In the merging phase, the original data types of the data sources to be mined are merged into generalized ones. The transforming phase then converts the generalized data types into the target ones for the selected mining algorithm. Using the data type generalization process, the user can select an appropriate mining algorithm based solely on the goal of the application, without considering the data types.
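The two phases described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the taxonomy or procedure from the paper: the generalized types ("numeric", "categorical"), the `MERGE_MAP` table, the function names `merge_type`, `transform`, and `generalize`, and the choice of equal-width binning and integer coding as transformations.

```python
# Hypothetical mapping from source column types to generalized types.
# This table is an illustrative assumption, not the paper's taxonomy.
MERGE_MAP = {
    "int": "numeric",
    "float": "numeric",
    "char": "categorical",
    "varchar": "categorical",
    "bool": "categorical",
}


def merge_type(original_type):
    """Merging phase: map an original data type to a generalized one."""
    return MERGE_MAP[original_type]


def transform(values, generalized_type, target_type, bins=3):
    """Transforming phase: convert a column to the type the algorithm expects."""
    if generalized_type == target_type:
        return list(values)
    if generalized_type == "numeric" and target_type == "categorical":
        # Equal-width binning (an assumed technique); labels are "bin0", "bin1", ...
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal
        return ["bin%d" % min(int((v - lo) / width), bins - 1) for v in values]
    if generalized_type == "categorical" and target_type == "numeric":
        # Integer codes assigned in sorted order of the distinct values.
        codes = {v: i for i, v in enumerate(sorted(set(values), key=str))}
        return [codes[v] for v in values]
    raise ValueError("no transformation from %s to %s" % (generalized_type, target_type))


def generalize(column_values, original_type, target_type):
    """Run the merging phase and then the transforming phase for one column."""
    return transform(column_values, merge_type(original_type), target_type)
```

Under these assumptions, `generalize([1, 5, 10], "int", "categorical")` bins an integer column for an algorithm that expects categorical input, while `generalize(["red", "blue", "red"], "varchar", "numeric")` codes a string column for an algorithm that expects numeric input. Only the small `transform` step depends on the target algorithm; the merge table is shared across sources.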
Keywords :
data mining; generalisation (artificial intelligence); merging; very large databases; data type fittability problem; data type generalization; generalized data type; huge databases; transformation complexity; Application software; Cleaning; Databases; Information science
Conference_Title :
Systems, Man, and Cybernetics, 1999. IEEE SMC '99 Conference Proceedings. 1999 IEEE International Conference on
Conference_Location :
Tokyo
Print_ISBN :
0-7803-5731-0
DOI :
10.1109/ICSMC.1999.823352