• DocumentCode
    813980
  • Title
    Parsing with probabilistic strictly locally testable tree languages
  • Author
    Verdú-Mas, Jose Luis ; Carrasco, Rafael C. ; Calera-Rubio, Jorge
  • Author_Institution
    Departament de Llenguatges i Sistemes Informatics, Universidad de Alicante, Spain
  • Volume
    27
  • Issue
    7
  • fYear
    2005
  • fDate
    7/1/2005 12:00:00 AM
  • Firstpage
    1040
  • Lastpage
    1050
  • Abstract
    Probabilistic k-testable models (usually known as k-gram models in the case of strings) can be easily identified from samples and allow for smoothing techniques to deal with unseen events during pattern classification. In this paper, we introduce the family of stochastic k-testable tree languages and describe how these models can approximate any stochastic rational tree language. The model is applied to the task of learning a probabilistic k-testable model from a sample of parsed sentences. In particular, a parser for a natural language grammar that incorporates smoothing is shown.
  • Keywords
    grammars; pattern classification; smoothing methods; trees (mathematics); k-gram models; natural language grammar; parsed sentences; probabilistic k-testable models; probabilistic strictly locally testable tree languages; smoothing techniques; stochastic k-testable tree languages; stochastic rational tree language; Automata; Humans; Natural language processing; Natural languages; Pattern classification; Predictive models; Smoothing methods; Stochastic processes; Testing; Text processing; Parsing with probabilistic grammars; stochastic learning; tree grammars; Algorithms; Artificial Intelligence; Cluster Analysis; Computer Simulation; Information Storage and Retrieval; Models, Statistical; Natural Language Processing; Numerical Analysis, Computer-Assisted; Pattern Recognition, Automated; Sequence Alignment; Sequence Analysis; Signal Processing, Computer-Assisted
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2005.144
  • Filename
    1432738