Title :
On the generalization of soft margin algorithms
Author :
Shawe-Taylor, John ; Cristianini, Nello
Author_Institution :
Dept. of Comput. Sci., Univ. of London, Egham, UK
fDate :
10/1/2002
Abstract :
Generalization bounds depending on the margin of a classifier are a relatively new development. They provide an explanation of the performance of state-of-the-art learning systems such as support vector machines (SVMs) and Adaboost. The difficulty with these bounds has been either their lack of robustness or their looseness. The question of whether the generalization of a classifier can be more tightly bounded in terms of a robust measure of the distribution of margin values has remained open for some time. The paper answers this open question in the affirmative and, furthermore, the analysis leads to bounds that motivate the previously heuristic soft margin SVM algorithms as well as justifying the use of the quadratic loss in neural network training algorithms. The results are extended to give bounds for the probability of failing to achieve a target accuracy in regression prediction, with a statistical analysis of ridge regression and Gaussian processes as a special case. The analysis presented in the paper has also led to new boosting algorithms described elsewhere.
Keywords :
Gaussian processes; learning (artificial intelligence); learning automata; neural nets; probability; statistical analysis; Adaboost; Gaussian process; boosting algorithms; classifier generalization; classifier margin; generalization bounds; heuristic soft margin SVM algorithms; learning systems; linear function classes; margin values distribution; neural network training algorithms; probability; quadratic loss; regression prediction; ridge regression; robust measure; soft margin algorithms; statistical analysis; support vector machines; Accuracy; Algorithm design and analysis; Learning systems; Neural networks; Probability; Robustness; Statistical analysis; Support vector machine classification; Support vector machines; Time measurement;
Journal_Title :
Information Theory, IEEE Transactions on
DOI :
10.1109/TIT.2002.802647