DocumentCode :
1909838
Title :
Automatic Recognition of Chinese Organization Name Based on Conditional Random Fields
Author :
Zhang, Suxiang ; Zhang, Suxian ; Wang, Xiaojie
Author_Institution :
North China Electr. Power Univ., Baoding
fYear :
2007
fDate :
Aug. 30 2007-Sept. 1 2007
Firstpage :
229
Lastpage :
233
Abstract :
Person, location, and organization names have long been regarded as a bottleneck for named entity recognition (NER) systems, and automatic recognition of Chinese organization names is the most difficult of these NER tasks. This paper presents a new approach to Chinese organization name recognition based on cascaded conditional random fields. In the proposed approach, person names and location names are recognized first, before organization names. The model is structured as a cascade: the result of the lower-level model is passed to the higher-level model, where it supports that model's decisions when recognizing complicated organization names. We also propose a new feature for this task. We evaluate our approach on a large-scale corpus, the People's Daily (January 1998), using an open test. Recall for Chinese organization names (ORG) reaches 88.78% and precision is 82.35%. The evaluation results show that our approach based on cascaded conditional random fields significantly outperforms previous approaches.
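The cascade described in the abstract can be illustrated with a minimal sketch of how first-pass person/location tags might be exposed as features to a higher-level organization model. This is not the authors' feature set: the function name `second_pass_features`, the feature keys, the tag scheme (`B-LOC`, `O`), and the example tokens are all assumptions made for illustration, and no CRF library is used.

```python
from typing import Dict, List


def second_pass_features(
    tokens: List[str], first_pass_tags: List[str], i: int
) -> Dict[str, str]:
    """Build a feature dict for token i of a (hypothetical) higher-level model.

    The lower-level model's tags are included as features, so the
    organization model can condition on already-recognized names.
    """
    feats = {"tok": tokens[i], "low_tag": first_pass_tags[i]}
    if i > 0:
        feats["prev_tok"] = tokens[i - 1]
        feats["prev_low_tag"] = first_pass_tags[i - 1]
    else:
        feats["bos"] = "1"  # beginning of sequence
    if i < len(tokens) - 1:
        feats["next_tok"] = tokens[i + 1]
        feats["next_low_tag"] = first_pass_tags[i + 1]
    else:
        feats["eos"] = "1"  # end of sequence
    return feats


# Assumed example: a location name tagged by the lower-level model
# precedes a word that may belong to an organization name.
tokens = ["北京", "大学", "校长"]
first_pass_tags = ["B-LOC", "O", "O"]
features = [
    second_pass_features(tokens, first_pass_tags, i) for i in range(len(tokens))
]
```

Feature dicts of this shape are a common input format for sequence-labeling toolkits; the features actually used in the paper are not given in this record.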
Keywords :
information retrieval; natural languages; random processes; text analysis; Chinese organization recognition; cascaded conditional random field; information extraction; large-scale corpus; named entity recognition system; question answering system; text document; Character recognition; Data mining; Educational institutions; Hidden Markov models; Large-scale systems; Machine learning; Power engineering and energy; Sun; Testing; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1611-0
Electronic_ISBN :
978-1-4244-1611-0
Type :
conf
DOI :
10.1109/NLPKE.2007.4368038
Filename :
4368038