The problem considered is that of classifying a given time series into one of a finite number of classes. The stochastic process generating the series is assumed to obey an autoregressive structure involving a parameter vector, whose probability density depends on the class to which the series or its parameter vector belongs. Assuming appropriate expressions for these class-conditional parameter densities, it is shown that the probability density of the series characterizing each class possesses a vector of sufficient statistics, i.e., all the information about the series needed for discrimination between the various classes is contained in this vector, whose component functions have the same structure in every case. Thus the best possible feature set for the problem is the vector of sufficient statistics itself. From this, the optimal decision rule that minimizes the average probability of error is deduced. The optimal feature set and the corresponding optimal decision rule are compared with other feature sets and decision rules mentioned in the literature on speech recognition.
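As a rough illustration of the idea, and not the construction used in the paper, the following Python sketch treats a deliberately simplified case. It assumes Gaussian innovations, a likelihood conditioned on the first few samples, and a single known parameter vector and noise variance per class rather than a class-dependent density over parameters. Under those assumptions the likelihood depends on the series only through a small matrix of lag-product sums, and choosing the class with the largest posterior probability is the rule that minimizes the average probability of error. The names `lag_products`, `log_likelihood`, and `classify` are hypothetical.

```python
import math


def lag_products(y, p):
    """Return S with S[j][k] = sum over t = p..N-1 of y[t-j] * y[t-k].

    Under the assumptions stated above, these sums play the role of the
    sufficient statistics: their number depends on the order p, not on N.
    """
    n = len(y)
    return [[sum(y[t - j] * y[t - k] for t in range(p, n))
             for k in range(p + 1)]
            for j in range(p + 1)]


def log_likelihood(S, n_terms, theta, sigma2):
    """Conditional Gaussian AR log-likelihood, computed from S alone."""
    # Residual e_t = y_t - sum_j theta_j * y_{t-j} = sum_j a_j * y_{t-j},
    # with a = (1, -theta_1, ..., -theta_p), so sum_t e_t^2 = a' S a.
    a = [1.0] + [-c for c in theta]
    rss = sum(a[j] * a[k] * S[j][k]
              for j in range(len(a)) for k in range(len(a)))
    return -0.5 * n_terms * math.log(2.0 * math.pi * sigma2) - rss / (2.0 * sigma2)


def classify(y, classes):
    """Return the index of the class with the largest posterior probability.

    classes is a list of (prior, theta, sigma2) tuples; every theta must
    have the same length p (an assumption of this sketch).
    """
    p = len(classes[0][1])
    S = lag_products(y, p)
    n_terms = len(y) - p
    scores = [math.log(prior) + log_likelihood(S, n_terms, theta, sigma2)
              for prior, theta, sigma2 in classes]
    return max(range(len(scores)), key=scores.__getitem__)


# Two hypothetical first-order classes with opposite-sign coefficients.
classes = [(0.5, [0.8], 1.0), (0.5, [-0.8], 1.0)]
smooth = [1.0, 0.9, 1.1, 1.0, 0.8, 0.9]
choppy = [1.0, -0.9, 1.1, -1.0, 0.8, -0.9]
print(classify(smooth, classes), classify(choppy, classes))
```

Note that `classify` touches the raw series only once, when forming `S`; every class score is then computed from `S` and the number of terms, which is the practical content of the sufficiency claim in this simplified setting.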