Title :
Approximate Equal Frequency Discretization Method
Author :
Jiang, Sheng-Yi ; Li, Xia ; Zheng, Qi ; Wang, Lian-Xi
Author_Institution :
Sch. of Inf., Guangdong Univ. of Foreign Studies, Guangzhou, China
Abstract :
Many data mining and machine learning algorithms can process only discrete attributes. To use these algorithms when some attributes are numeric, the numeric attributes must first be discretized. Because the normal distribution is so prevalent, an approximate equal frequency discretization method based on the normal distribution is presented. The method is simple to implement. Its computational complexity is nearly linear in the size of the dataset, so it can be applied to large datasets. We compare this method with several other discretization methods on UCI datasets. The experimental results show that this unsupervised discretization method is effective and practical.
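The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading, not the authors' published algorithm. It assumes the attribute is roughly normal, fits a normal distribution from the sample mean and standard deviation, and places the k-1 cut points at the i/k quantiles of that fitted distribution, so that each interval holds about the same share of the values. The function names normal_cut_points and discretize and the parameter k are hypothetical and chosen for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, stdev


def normal_cut_points(values, k):
    """Return k-1 cut points at the i/k quantiles of a normal fitted to values.

    Assumption (not taken from the paper): the attribute is close enough to
    normal that these quantiles approximate equal-frequency boundaries.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        return []  # constant attribute: a single interval, no cut points
    fitted = NormalDist(mu, sigma)
    return [fitted.inv_cdf(i / k) for i in range(1, k)]


def discretize(values, cuts):
    """Map each value to the index of the interval it falls in (0 .. len(cuts))."""
    return [bisect_right(cuts, v) for v in values]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    cuts = normal_cut_points(data, 3)
    print(cuts)
    print(discretize(data, cuts))
```

Under these assumptions the cut points come from two summary statistics rather than from sorting the data, and each value is then binned with a binary search over the k-1 cut points. That is consistent with the nearly linear cost the abstract mentions, although the paper itself may compute the boundaries differently.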
Keywords :
attribute grammars; computational complexity; data mining; discrete event systems; learning (artificial intelligence); normal distribution; approximate equal frequency discretization method; computing complexity; data mining algorithms; machine learning algorithms; normal distribution; Data mining; Frequency; Gaussian distribution; Informatics; Intelligent systems; Learning systems; Machine learning; Machine learning algorithms; Statistical distributions; Testing; Discretization; Equal Frequency Method; Normal Distribution;
Conference_Title :
Intelligent Systems, 2009. GCIS '09. WRI Global Congress on
Conference_Location :
Xiamen
Print_ISBN :
978-0-7695-3571-5
DOI :
10.1109/GCIS.2009.131