DocumentCode :
3124441
Title :
Adaptive named entity recognition based on conditional random fields with automatic updated dynamic gazetteers
Author :
Xixin Wu ; Zhiyong Wu ; Jia Jia ; Lianhong Cai
Author_Institution :
Tsinghua-CUHK Joint Res. Center for Media Sci., Technol. & Syst., Tsinghua Univ., Shenzhen, China
fYear :
2012
fDate :
5-8 Dec. 2012
Firstpage :
363
Lastpage :
367
Abstract :
This paper presents a hybrid model that combines conditional random fields (CRFs) with dynamic gazetteers (DGs) for Chinese named entity recognition (NER). Gazetteers were widely used in previous NER work, but they were all static and could not adapt to new domains or to new out-of-vocabulary named entities (OOVNEs). In this work, we build and maintain DGs to address these problems and propose a method that automatically updates the DGs as named entities (NEs) are recognized. With this method, the DGs grow to include new NEs and NE features that do not appear in the training data. These newly added items give the DGs knowledge of new domains and hence make them more adaptive when recognizing OOVNEs in those domains. Experiments on the People's Daily corpus demonstrate that the method is effective and improves the average F-score by 1% to 2%.
Keywords :
natural language processing; speech recognition; speech synthesis; Chinese named entity recognition; OOVNE; adaptive named entity recognition; automatic updated dynamic gazetteers; average F-score; conditional random fields; hybrid model; out of vocabulary named entity; Computational linguistics; Educational institutions; Feature extraction; Hidden Markov models; Organizations; Training; Training data; Conditional random fields (CRFs); Dynamic gazetteers (DGs); Named entity recognition (NER)
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
Type :
conf
DOI :
10.1109/ISCSLP.2012.6423495
Filename :
6423495