Title of article :
Pattern selection approaches for the logical analysis of data considering the outliers and the coverage of a pattern
Author/Authors :
Han, Jeong; Kim, Norman; Yum, Bong-Jin; Jeong, Myong K.
Issue Information :
Journal issue with serial number, year 2011
Pages :
6
From page :
13857
To page :
13862
Abstract :
The logical analysis of data (LAD) is one of the most promising data mining methods developed to date for extracting knowledge from data. The key feature of the LAD is the capability of detecting hidden patterns in the data. Because patterns are basically combinations of certain attributes, they can be used to build a decision boundary for classification in the LAD by providing important information to distinguish observations in one class from those in the other. The use of patterns may result in a more stable performance in terms of being able to classify both positive and negative classes due to their robustness to measurement errors. The LAD technique, however, tends to choose too many patterns by solving a set covering problem to build a classifier; this is especially the case when outliers exist in the data set. In the set covering problem of the LAD, each observation should be covered by at least one pattern, even though the observation is an outlier. Thus, existing approaches tend to select too many patterns to cover these outliers, resulting in the problem of overfitting. Here, we propose new pattern selection approaches for LAD that take both outliers and the coverage of a pattern into account. The proposed approaches can avoid the problem of overfitting by building a sparse classifier. The performances of the proposed pattern selection approaches are compared with existing LAD approaches using several public data sets. The computational results show that the sparse classifiers built on the patterns selected by the proposed new approaches yield an improved classification performance compared to the existing approaches, especially when outliers exist in the data set.
Keywords :
Classification , Logical analysis of data , Pattern selection , Set covering problem
Journal title :
Expert Systems with Applications
Serial Year :
2011
Record number :
2350487