DocumentCode :
3316279
Title :
Part-of-speech tagger based on maximum entropy model
Author :
Huang Heyan ; Zhang Xiaofei
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing, China
fYear :
2009
fDate :
8-11 Aug. 2009
Firstpage :
26
Lastpage :
29
Abstract :
Maximum entropy (ME) conditional models are not bound by the independence assumptions of generative hidden Markov models (HMMs), so an ME-based part-of-speech (POS) tagger can draw on arbitrary, non-independent features that benefit POS tagging without having to account for the distribution of those dependencies. Because ME models can flexibly use a wide variety of features, the training-data sparsity problem is effectively addressed. Experiments show that, relative to the HMM-based baseline, the POS tagging error rate is reduced by 54.25% in the closed test and 40.56% in the open test, with accuracies of 98.01% in the closed test and 95.56% in the open test.
Keywords :
entropy; hidden Markov models; natural language processing; hidden Markov generative model; maximum entropy model; natural language processing; part-of-speech tagger; Automatic speech recognition; Computer science; Entropy; Hidden Markov models; Labeling; Natural language processing; Probability distribution; Smoothing methods; Tagging; Testing; Hidden Markov model (HMM); ME model; Natural Language Processing (NLP); POS tagging;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science and Information Technology, 2009. ICCSIT 2009. 2nd IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-4519-6
Electronic_ISBN :
978-1-4244-4520-2
Type :
conf
DOI :
10.1109/ICCSIT.2009.5234787
Filename :
5234787