Title :
A Rule-Based Classification Algorithm for Uncertain Data
Author :
Qin, Biao ; Xia, Yuni ; Prabhakar, Sunil ; Tu, Yicheng
Author_Institution :
Dept. of Comput. Sci., Indiana Univ., Indianapolis, IN
Date :
March 29 - April 2, 2009
Abstract :
Data uncertainty is common in real-world applications and arises from causes such as imprecise measurement, network latency, outdated sources and sampling errors. Such uncertainty must be handled carefully, or the mining results may be unreliable or even wrong. In this paper, we propose uRule, a new rule-based classification and prediction algorithm for uncertain data. The algorithm introduces new measures for generating, pruning and optimizing rules, computed from the uncertain data intervals and their probability distribution functions. Based on these measures, the optimal splitting attribute and splitting value can be identified and used for classification and prediction. uRule can process uncertainty in both numerical and categorical data. Our experimental results show that uRule has excellent performance even when the data is highly uncertain.
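As an illustration of how a splitting value might be scored when attribute values are intervals rather than points, the following is a minimal sketch and not the uRule implementation. It assumes each numerical value is uniformly distributed over its interval, counts each record fractionally on both sides of a candidate split, and scores the split with entropy-based information gain. The names `prob_below`, `split_gain` and `best_split`, the record layout, and the choice of interval endpoints as candidate splits are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical record layout: (low, high, class_label).
# Assumption: the true value is uniform on [low, high].
Record = Tuple[float, float, str]


def prob_below(low: float, high: float, split: float) -> float:
    """P(value <= split) for a value uniform on [low, high]."""
    if high <= low:  # degenerate interval, i.e. a certain point value
        return 1.0 if low <= split else 0.0
    return min(1.0, max(0.0, (split - low) / (high - low)))


def entropy(weights: Dict[str, float]) -> float:
    """Shannon entropy (bits) of fractional class counts."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total)
                for w in weights.values() if w > 0)


def split_gain(records: List[Record], split: float) -> float:
    """Information gain of a split, using fractional record counts."""
    left: Dict[str, float] = {}
    right: Dict[str, float] = {}
    whole: Dict[str, float] = {}
    for low, high, label in records:
        p = prob_below(low, high, split)
        left[label] = left.get(label, 0.0) + p          # mass at or below split
        right[label] = right.get(label, 0.0) + 1.0 - p  # mass above split
        whole[label] = whole.get(label, 0.0) + 1.0
    n = float(len(records))
    n_left, n_right = sum(left.values()), sum(right.values())
    return (entropy(whole)
            - (n_left / n) * entropy(left)
            - (n_right / n) * entropy(right))


def best_split(records: List[Record]) -> float:
    """Pick the interval endpoint with the highest gain (assumed candidate set)."""
    candidates = sorted({e for low, high, _ in records for e in (low, high)})
    return max(candidates, key=lambda s: split_gain(records, s))


records = [(0.0, 2.0, "a"), (1.0, 3.0, "a"), (4.0, 6.0, "b"), (5.0, 7.0, "b")]
print(best_split(records), split_gain(records, 2.0))
```

A record whose interval straddles the split contributes partly to each side, so a split that cuts through many intervals scores lower than one that separates them cleanly.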
Keywords :
data mining; knowledge based systems; optimisation; pattern classification; probability; optimal splitting attribute; optimal splitting value; probability distribution function; rule generation; rule optimization; rule pruning; rule-based data classification algorithm; rule-based data prediction algorithm; uncertain data interval; Cancer; Classification algorithms; Classification tree analysis; Computer science; Data engineering; Data mining; Decision trees; Delay; Neoplasms; Uncertainty
Conference_Title :
Data Engineering, 2009. ICDE '09. IEEE 25th International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-3422-0
Electronic_ISBN :
1084-4627
DOI :
10.1109/ICDE.2009.164