• DocumentCode
    2369567
  • Title
    Sequence modeling with mixtures of conditional maximum entropy distributions
  • Author
    Pavlov, Dmitry
  • fYear
    2003
  • fDate
    19-22 Nov. 2003
  • Firstpage
    251
  • Lastpage
    258
  • Abstract
    We present a novel approach to modeling sequences using mixtures of conditional maximum entropy (maxent) distributions. Our method generalizes the mixture of first-order Markov models by including "long-term" dependencies in the model components. These "long-term" dependencies are represented by probabilistic triggers or rules, which are frequently used in the natural language processing (NLP) domain (such as "A occurred k positions back" → "the current symbol is B" with probability P). The maxent framework is then used to create a coherent global probabilistic model from all selected triggers. We enhance this formalism by using probabilistic mixtures with maxent models as components, thus representing hidden or unobserved effects in the data. We demonstrate how our mixture of conditional maxent models can be learned from data using the generalized EM algorithm that scales linearly in the dimensions of the data and the number of mixture components. We present empirical results on simulated and real-world data sets and demonstrate that the proposed approach yields better-quality models than mixtures of first-order Markov models and resists the overfitting and curse of dimensionality that would inevitably arise with higher-order Markov models.
  • Keywords
    hidden Markov models; learning (artificial intelligence); maximum entropy methods; natural languages; optimisation; EM algorithm; Markov model; NLP; conditional maximum entropy distribution; global probabilistic model; maxent model; natural language processing; sequence modelling; Analytical models; DNA; Data mining; Entropy; Hidden Markov models; History; Natural language processing; Proteins; Resists; Sequences;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
  • Print_ISBN
    0-7695-1978-4
  • Type
    conf
  • DOI
    10.1109/ICDM.2003.1250927
  • Filename
    1250927