Title :
Synthesis and recognition of sequences
Author :
Chan, S.C. ; Wong, A.K.C.
Author_Institution :
Dept. of Syst. Design Eng., Waterloo Univ., Ont., Canada
Date :
12/1/1991
Abstract :
A string or sequence is a linear array of symbols drawn from an alphabet. Because of unknown substitutions, insertions, and deletions of symbols, a sequence cannot be treated like a vector or a tuple of a fixed number of variables. The synthesis of an ensemble of sequences is a sequence of random elements that specify the probabilities of occurrence of the different symbols at the corresponding sites of the sequences. The synthesis is determined by a hierarchical sequence synthesis procedure (HSSP), which returns not only the taxonomic hierarchy of the whole ensemble of sequences but also the alignment and the synthesis of a group (a subset of the ensemble) of the sequences at each level of the hierarchy. The HSSP does not require that the ensemble of sequences be presented as a tabulated array of data, that hierarchical information about the data be supplied, or that a stochastic process be assumed. The authors present the concept of sequence synthesis and the applicability of the HSSP as both a supervised and an unsupervised classification procedure.
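As a rough illustration of what a "synthesis" could look like, the sketch below computes, for a group of sequences that are assumed to be already aligned, the relative frequency of each symbol at each site. It is not the HSSP itself: the abstract does not describe how the procedure aligns or groups sequences, so the function name `synthesize`, the gap marker `-`, and the example sequences are all hypothetical.

```python
from collections import Counter

GAP = "-"  # assumed marker for an insertion/deletion; not taken from the source


def synthesize(aligned):
    """Return one {symbol: probability} dict per site of an aligned group.

    `aligned` is a list of equal-length strings; the gap marker is counted
    like any other symbol.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    profile = []
    for site in zip(*aligned):  # iterate over columns (sites)
        counts = Counter(site)
        total = len(site)
        profile.append({sym: n / total for sym, n in counts.items()})
    return profile


# Hypothetical group of four aligned sequences over the alphabet {A, C, G, T}.
group = ["AC-GT", "ACCGT", "AT-GA", "ACCGT"]
profile = synthesize(group)
for i, dist in enumerate(profile):
    print(i, dist)
```

Each entry of `profile` plays the role of one random element in the synthesis: a distribution over the symbols observed at that site, with gaps recording where some sequences have a deletion relative to the others.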
Keywords :
pattern recognition; probability; alignment; alphabet; hierarchical sequence synthesis procedure; sequences recognition; sequences synthesis; supervised classification; taxonomic hierarchy; unsupervised classification procedure; Biological cells; Birds; Frequency estimation; Genetics; Humans; Pattern recognition; Probability; Speech; Stochastic processes; Vectors
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on