DocumentCode :
1607141
Title :
Unsupervised text pattern learning using minimum description length
Author :
Wu, Ke ; Yu, Jiangsheng ; Wang, Hanpin ; Cheng, Fei
Author_Institution :
Dept. of Comput. Sci. & Technol., Peking Univ., Beijing, China
fYear :
2010
Firstpage :
161
Lastpage :
166
Abstract :
Knowledge of text patterns in a domain-specific corpus is valuable in many natural language processing (NLP) applications, such as information extraction and question answering. In this paper, we propose a simple but effective probabilistic language model for modeling the indecomposability of text patterns. Under the minimum description length (MDL) principle, an efficient unsupervised learning algorithm is implemented, and an experiment on an English critical writing corpus shows promising coverage of patterns compared with a human summary.
Keywords :
computational linguistics; learning (artificial intelligence); natural language processing; probability; text analysis; English; critical writing corpus; minimum description length; natural language processing; probabilistic language model; text pattern; unsupervised learning; Computational linguistics; Dictionaries; Humans; Merging; Probabilistic logic; Unsupervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Universal Communication Symposium (IUCS), 2010 4th International
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7821-7
Type :
conf
DOI :
10.1109/IUCS.2010.5666227
Filename :
5666227