DocumentCode :
1040373
Title :
Toward Exploratory Test-Instance-Centered Diagnosis in High-Dimensional Classification
Author :
Aggarwal, Charu C.
Author_Institution :
Watson Res. Center, Hawthorne
Volume :
19
Issue :
8
fYear :
2007
Firstpage :
1001
Lastpage :
1015
Abstract :
High-dimensional data is a difficult case for most subspace-based classification methods because a large number of combinations of dimensions have discriminatory power. There is an exponential number of combinations of dimensions that could decide the correct class instance, and the relevant combination can vary with data locality and test instance. Therefore, most summarized models, such as decision trees and rule-based systems, aim only to provide a global summary of the data, which is used for classification. Because of this incompleteness, a particular classification model may be more or less suited to individual test instances. Furthermore, it may not provide sufficient insight into the most representative characteristics of a particular test instance. This is undesirable for many classification applications in which the diagnostic reasoning behind the classification of a test instance is as important as the classification process itself. In an interactive application, a user may find it more valuable to develop a diagnostic decision support method that can reveal significant classification behaviors of exemplar records. Such an approach has the additional advantage of being able to optimize the decision process for the individual record in order to design more effective classification methods. In this paper, we propose the subspace decision path (SD-Path) method, which lets the user interactively explore a small number of nodes of a hierarchical decision process so that the most significant classification characteristics for a given test instance are revealed. In addition, the SD-Path method can provide enormous interpretability by constructing views of the data in which the different classes are clearly separated. Even in difficult cases where the classification behavior of the test instance is ambiguous, the SD-Path method provides a diagnostic understanding of the characteristics that result in this ambiguity. Therefore, this method combines the abilities of the human and the computer in creating an effective diagnostic tool for instance-centered high-dimensional classification.
Keywords :
data mining; decision support systems; human computer interaction; pattern classification; decision tree; diagnostic decision support method; diagnostic reasoning; high-dimensional classification; rule-based system; subspace decision path method; subspace-based classification method; test-instance-centered diagnosis; Classification tree analysis; Data mining; Decision trees; Design optimization; Helium; Humans; Knowledge based systems; Power system modeling; Testing; Training data; Classification; interactive exploration; visual data mining
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2007.1034
Filename :
4262532