DocumentCode :
424335
Title :
The equivalence theory based on fuzzy theory
Author :
Li, Hua-Yang ; Liu, Yu-Bao ; Li, You-Kui ; Gui, Hao
Author_Institution :
Sch. of Software, Jiangxi Univ. of Finance & Econ., China
Volume :
2
fYear :
2004
fDate :
26-29 Aug. 2004
Firstpage :
1272
Abstract :
Data cleaning is an important task in building data warehouses and in data mining. Equivalence theory concerns how to decide whether two records are equivalent, that is, duplicates of each other, and it is a central problem in data cleaning. This paper proposes a new equivalence theory and a concept of equivalence degree based on fuzzy theory, and puts forward a corresponding method for calculating equivalence degrees. On the basis of this theory, the keyword "report" is introduced and a method for clustering and handling duplicated records is presented. Compared with traditional equivalence theory, the new one is more convenient for generating rules and for clustering and handling duplicated records, and it reduces the time users spend dealing with single LOG files. In addition, the paper puts forward an interactive method based on clustering, which saves much of the users' labor.
Keywords :
data handling; data mining; data warehouses; fuzzy set theory; pattern clustering; data cleaning; data clustering; data warehouse; equivalence theory; fuzzy theory; Cleaning; Containers; Educational institutions; Electronic mail; Finance; Forward contracts; Graphical user interfaces; Tiles
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN :
0-7803-8403-2
Type :
conf
DOI :
10.1109/ICMLC.2004.1382388
Filename :
1382388