DocumentCode :
2414198
Title :
An Extension of Iterative Scaling for Joint Decision-Level and Feature-Level Fusion in Ensemble Classification
Author :
Miller, David J. ; Pal, Siddharth
Author_Institution :
Dept. of EE, Penn State Univ., University Park, PA
fYear :
2005
fDate :
28 Sept. 2005
Firstpage :
61
Lastpage :
66
Abstract :
Improved iterative scaling (IIS) is a simple, powerful algorithm for learning maximum entropy (ME) conditional probability models that has found great utility in natural language processing and related applications. Nearly all prior work on IIS considers discrete-valued feature functions, which depend on the data observations and the class label, and encodes statistical constraints on these discrete-valued random variables. Moreover, and most significantly for our purposes, the (ground-truth) constraints are measured from frequency counts, based on hard (0-1) training set instances of feature values. Here, we extend IIS to the case where the training (and test) set consists of instances of probability mass functions on the features, rather than instances of hard feature values. We show that the IIS methodology extends in a natural way to this case. This extension has applications 1) to ME aggregation of soft classifier outputs in ensemble classification and 2) to ME classification on mixed discrete-continuous feature spaces. Moreover, we combine these methods, yielding an ME method that jointly performs (soft) decision-level fusion and feature-level fusion in making ensemble decisions. We demonstrate favorable comparisons against both standard boosting and bagging on UC Irvine benchmark data sets. We also discuss some of our continuing research directions.
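A minimal sketch of the idea the abstract describes: a conditional ME model p(y|x) proportional to exp(sum_d lambda_d * f_d(x, y)), fit by iterative scaling where the empirical constraints are expected feature values under each sample's pmf rather than hard 0-1 counts. For simplicity this sketch uses the classical generalized iterative scaling (GIS) closed-form update, a simpler relative of the IIS update the paper actually extends; the function name soft_gis, the slack-feature construction, and all parameter choices are illustrative assumptions, not the authors' algorithm.

    import numpy as np

    def soft_gis(F_soft, labels, n_classes, n_iters=200):
        """F_soft: (N, D) array of expected feature values per sample,
        i.e. E_q[f_d] under each sample's feature pmf q_n (entries must
        be nonnegative, as GIS requires). labels: (N,) integer class
        labels. Returns per-class weight vectors, shape (K, D+1)."""
        N, D = F_soft.shape
        # GIS needs every feature vector to sum to a fixed constant C;
        # append a slack feature to pad each row up to C.
        C = F_soft.sum(axis=1).max()
        slack = C - F_soft.sum(axis=1, keepdims=True)
        F = np.hstack([F_soft, slack])             # (N, D+1)
        lam = np.zeros((n_classes, D + 1))
        # "Ground-truth" constraints measured from the soft instances:
        # per-class expected feature values instead of hard 0-1 counts.
        Y = np.eye(n_classes)[labels]              # (N, K) one-hot
        emp = Y.T @ F / N                          # (K, D+1)
        for _ in range(n_iters):
            scores = F @ lam.T                     # (N, K) class scores
            scores -= scores.max(axis=1, keepdims=True)
            p = np.exp(scores)
            p /= p.sum(axis=1, keepdims=True)      # model p(y | x_n)
            mod = p.T @ F / N                      # model expectations
            # Closed-form GIS step: move log-weights toward matching
            # the soft empirical constraints.
            lam += np.log((emp + 1e-12) / (mod + 1e-12)) / C
        return lam

With hard 0-1 feature instances, F_soft reduces to the usual indicator counts and the loop recovers standard GIS; feeding it rows of soft classifier posteriors corresponds to the decision-level fusion use case the abstract mentions.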
Keywords :
feature extraction; iterative methods; learning (artificial intelligence); maximum entropy methods; pattern classification; probability; random processes; statistical analysis; conditional probability model; decision-level fusion; discrete-valued feature function; discrete-valued random variables; ensemble classification; feature-level fusion; ground-truth constraints; iterative scaling; maximum entropy learning; mixed discrete-continuous feature spaces; natural language processing; probability mass function; statistical constraints; Boosting; Electronic mail; Entropy; Frequency measurement; Iterative algorithms; Natural language processing; Probability; Random variables; Testing; Training data;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Machine Learning for Signal Processing, 2005 IEEE Workshop on
Conference_Location :
Mystic, CT
Print_ISBN :
0-7803-9517-4
Type :
conf
DOI :
10.1109/MLSP.2005.1532875
Filename :
1532875