Author_Institution :
Dept. of Comput. Eng., Isik Univ., Istanbul, Turkey
Abstract :
Rule learning algorithms such as Ripper induce univariate rules, that is, a propositional condition in a rule uses only one feature. In this paper, we propose an omnivariate induction of rules where, for each condition, both a univariate and a multivariate condition are trained, and the better of the two is chosen according to a novel statistical test. This paper has three main contributions. First, we propose a novel statistical test for comparing two classifiers, the combined 5 × 2 cv t test, which is a variant of the 5 × 2 cv t test, and we give its connections to other tests such as the 5 × 2 cv F test and the k-fold paired t test. Second, we propose a multivariate version of Ripper, where a support vector machine with a linear kernel is used to find multivariate linear conditions. Third, we propose an omnivariate version of Ripper, where model selection is done via the combined 5 × 2 cv t test. Our results indicate that 1) the combined 5 × 2 cv t test has higher power (lower type II error), lower type I error, and higher replicability than the 5 × 2 cv t test, and 2) omnivariate rules are better in that they choose whichever condition is more accurate, selecting the right model automatically and separately for each condition in a rule.
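For orientation, the two baseline tests named in the abstract can be sketched as follows. This is a minimal sketch of the standard 5 × 2 cv t test and the 5 × 2 cv F test as they are usually stated in the literature, not of the combined 5 × 2 cv t test proposed in the paper, whose statistic is not given here. The function names and the sample differences are hypothetical and chosen only for illustration. Each input pair holds the difference in error rates between the two classifiers on the two folds of one replication.

```python
import math


def _replication_variances(diffs):
    """Per-replication variance estimates s_i^2 from the two fold differences."""
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected 5 replications with 2 fold differences each")
    variances = []
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2.0
        variances.append((p1 - mean) ** 2 + (p2 - mean) ** 2)
    return variances


def five_by_two_cv_t(diffs):
    """Standard 5x2 cv t statistic (t-distributed with 5 degrees of freedom).

    Uses the difference from fold 1 of replication 1 as the numerator.
    """
    variances = _replication_variances(diffs)
    denom = math.sqrt(sum(variances) / 5.0)
    if denom == 0.0:
        raise ValueError("all replication variances are zero")
    return diffs[0][0] / denom


def five_by_two_cv_f(diffs):
    """5x2 cv F statistic (F-distributed with 10 and 5 degrees of freedom).

    Pools all ten squared differences in the numerator.
    """
    variances = _replication_variances(diffs)
    denom = 2.0 * sum(variances)
    if denom == 0.0:
        raise ValueError("all replication variances are zero")
    return sum(p * p for pair in diffs for p in pair) / denom


# Hypothetical error-rate differences (classifier A minus classifier B).
example_diffs = [(0.02, 0.04), (0.01, 0.03), (0.03, 0.05),
                 (0.02, 0.00), (0.04, 0.02)]
t_stat = five_by_two_cv_t(example_diffs)
f_stat = five_by_two_cv_f(example_diffs)
```

With these made-up differences the t statistic is about 1.41 and the F statistic about 4.4. The t statistic depends on which of the ten differences is placed in the numerator, which is the kind of arbitrariness that variants of the test aim to remove.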
Keywords :
learning (artificial intelligence); pattern classification; statistical testing; support vector machines; Ripper; classifiers; cv F test; k-fold paired t test; linear kernel; lower type I error; lower type II error; multivariate condition; multivariate linear conditions; omnivariate rule induction; pairwise statistical test; rule learning algorithms; support vector machine; univariate condition; univariate rules; Complexity theory; Decision trees; Equations; Feature extraction; Mathematical model; Neural networks; Rule induction; model selection; statistical tests