The optimal nonlinear features are studied for a criterion function of a general form that depends on the conditional first- and second-order moments. The optimal solution is found to be a parametric function of the conditional densities. By imposing a further restriction on the functional dependence of the criterion, the optimal mapping becomes an intuitively pleasing function of the posterior probabilities. Starting from a finite number of features, the problem of finding the best linear mappings to a prescribed number of features is next investigated. The resulting optimum mapping is a linear combination of the projections of the posterior probabilities onto the subspace spanned by these features. The problems of finding the best single feature and of sequential feature selection are discussed in this framework. Finally, several examples are discussed.
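The projection of a posterior probability onto the subspace spanned by a finite feature set can be illustrated numerically. The following is a minimal sketch, not the method of the paper: it assumes two one-dimensional Gaussian classes with equal priors, a hypothetical feature set (1, x, x^3), and an inner product approximated by the sample average over draws from the mixture. The name `project_posterior` and all parameter values are illustrative assumptions.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def project_posterior(x, posterior, features):
    """Least-squares projection of `posterior` onto span{features}.

    The inner product is the empirical average over the samples in x
    (an assumption made for this sketch).
    """
    # One column per feature, evaluated on every sample.
    design = np.column_stack([f(x) for f in features])
    coeffs, *_ = np.linalg.lstsq(design, posterior, rcond=None)
    return coeffs, design @ coeffs


# Assumed toy problem: two unit-variance Gaussian classes, equal priors.
rng = np.random.default_rng(0)
n = 2000
labels = rng.integers(0, 2, size=n)
x = np.where(labels == 0, rng.normal(-1.0, 1.0, n), rng.normal(1.0, 1.0, n))

# Posterior probability of class 1 given x (priors cancel because they are equal).
p0 = gaussian_pdf(x, -1.0, 1.0)
p1 = gaussian_pdf(x, 1.0, 1.0)
posterior1 = p1 / (p0 + p1)

# Hypothetical finite feature set: constant, linear, and cubic terms.
features = [np.ones_like, lambda t: t, lambda t: t ** 3]
coeffs, projected = project_posterior(x, posterior1, features)
residual = posterior1 - projected
```

In this sketch the residual is orthogonal to every feature under the empirical inner product, which is the defining property of a projection. Any linear mapping of the features that approximates the posterior can then be written in terms of `coeffs`.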