DocumentCode :
14762
Title :
On Identifying Critical Nuggets of Information during Classification Tasks
Author :
Sathiaraj, David ; Triantaphyllou, Evangelos
Author_Institution :
Dept. of Comput. Sci., Louisiana State Univ., Baton Rouge, LA, USA
Volume :
25
Issue :
6
fYear :
2013
fDate :
Jun-13
Firstpage :
1354
Lastpage :
1367
Abstract :
In large databases, there may exist critical nuggets: small collections of records or instances that contain domain-specific important information. This information can be used for future decision making, such as labeling critical, unlabeled data records and improving classification results by reducing false positive and false negative errors. This work introduces the idea of critical nuggets, proposes an innovative domain-independent method to measure criticality, suggests a heuristic to reduce the search space for finding critical nuggets, and isolates and validates critical nuggets from some real-world data sets. Only a few subsets appear to qualify as critical nuggets, which underscores the importance of finding them, and the proposed methodology can detect them. This work also identifies certain properties of critical nuggets and validates those properties experimentally. Experimental results also helped validate that critical nuggets can assist in improving classification accuracy on real-world data sets.
Keywords :
data mining; decision making; pattern classification; classification accuracy improvement; critical information nugget identification; critical nugget isolation; critical nugget validation; critical-unlabeled data record labeling; domain-independent method; domain-specific important information; false negative error reduction; false positive error reduction; real-world data sets; search space reduction; Accuracy; Cancer; Complexity theory; Data models; Measurement; Switches; class boundary; classification; classification accuracy; critical nuggets; duality; outliers
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2012.112
Filename :
6205754