DocumentCode :
2052322
Title :
Identifying Patterns in Texts
Author :
Huang, Minhua ; Haralick, Robert M.
Author_Institution :
Dept. of Comput. Sci., City Univ. of New York, New York, NY, USA
fYear :
2009
fDate :
14-16 Sept. 2009
Firstpage :
59
Lastpage :
64
Abstract :
We discuss a probabilistic graphical model for recognizing patterns in texts. It is derived from the probability function for a sequence of categories given a sequence of symbols under two reasonable conditional independence assumptions and represented by a product of combinations of conditional and marginal probability functions. The novelty of our model is that it has a mathematical representation which is completely different from existing graphical models such as CRFs, HMMs, and MEMMs. Moreover, it can be used for identifying various patterns in texts. Up to now, we have used this model for recognizing NP chunks and senses of a polysemous word in sentences. This model has achieved very promising results on standard data sets. In the future, we will use this model for extracting semantic roles in a sentence.
Keywords :
pattern recognition; text analysis; NP chunks; mathematical representation; pattern identification; polysemous word; probabilistic graphical model; probability function; text patterns; Computer science; Data mining; Graphical models; Hidden Markov models; Labeling; Mathematical model; Pattern recognition; Testing; Text recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Semantic Computing, 2009. ICSC '09. IEEE International Conference on
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-4962-0
Electronic_ISBN :
978-0-7695-3800-6
Type :
conf
DOI :
10.1109/ICSC.2009.22
Filename :
5298562