• DocumentCode
    2058023
  • Title

    Developing Probabilistic Models for Identifying Semantic Patterns in Texts

  • Author

    Huang, Minhua ; Haralick, Robert M.

  • Author_Institution
    Comput. Sci. Dept., City Univ. of New York, New York, NY, USA
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    197
  • Lastpage
    200
  • Abstract
    We present a probabilistic graphical model that finds a sequence of optimal categories for a sequence of input symbols. Based on this model, three algorithms are developed for identifying semantic patterns in texts: an algorithm for extracting the semantic arguments of a verb, an algorithm for classifying the sense of an ambiguous word, and an algorithm for identifying noun phrases in a sentence. Experiments conducted on standard data sets show good results. For example, our method achieves an average precision of 92.96% and an average recall of 94.94% for extracting semantic argument boundaries of verbs on WSJ data from the Penn Treebank and PropBank, an average accuracy of 81.12% for recognizing the six-sense word 'line', and an average precision of 97.7% and an average recall of 98.8% for recognizing noun phrases on WSJ data from the Penn Treebank.
  • Keywords
    computational linguistics; probability; text analysis; Penn Tree bank; Prop Bank; ambiguous word; noun phrase; optimal categories; probabilistic graphical model; probabilistic models; semantic argument boundary; semantic pattern; text; verb; Accuracy; Classification algorithms; Educational institutions; Graphical models; Hidden Markov models; Probabilistic logic; Semantics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
  • Conference_Location
    Palo Alto, CA
  • Print_ISBN
    978-1-4577-1648-5
  • Electronic_ISBN
    978-0-7695-4492-2
  • Type

    conf

  • DOI
    10.1109/ICSC.2011.35
  • Filename
    6061354