Title :
Automatic and efficient recognition of proper nouns based on maximum entropy model
Author :
Li, Peng ; Guan, Yi ; Wang, Xiao-long ; Sun, Jun
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., China
Abstract :
This paper presents a high-performance method for identifying English proper nouns (PNs) based on the maximum entropy (MaxEnt) model. Most traditional PN recognition systems rely on lexical resources such as name lists; because new names are constantly coming into existence, such resources are necessarily incomplete. Machine learning methods are therefore used to identify PNs automatically. Within the MaxEnt framework, semantic and lexical information about the current word and its surrounding words serves as atomic features, which are combined into feature templates that generate the model's features without requiring additional expert knowledge. Tests on the WSJ portion of Penn Treebank II show that the method achieves high precision and recall while dramatically reducing the number of features, lowering the system's space consumption, and shortening training and testing time, which improves efficiency considerably. Because the underlying principle is general, the method can easily be adapted to identify other specific classes of nouns.
Keywords :
computational linguistics; learning (artificial intelligence); linguistics; maximum entropy methods; MaxEnt model; PN recognition systems; automatic English proper noun recognition; generalized iterative scaling; machine learning methods; maximum entropy model; parameter estimation; Computer science; Electronic mail; Entropy; Information retrieval; Learning systems; Machine learning; Natural language processing; Sun; System testing; Training data; PNs Recognition
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
DOI :
10.1109/ICMLC.2005.1527597