DocumentCode
2813413
Title
Semi-supervised learning for named entity recognition using weakly labeled training data
Author
Zafarian, Atefeh ; Rokni, Ali ; Khadivi, Shahram ; Ghiasifard, Sonia
Author_Institution
Dept. of Comput. Eng. & IT, Amirkabir Univ. of Technol., Tehran, Iran
fYear
2015
fDate
3-5 March 2015
Firstpage
129
Lastpage
135
Abstract
The shortage of annotated training data remains an important challenge for many Natural Language Processing (NLP) tasks such as Named Entity Recognition (NER). NER requires a large amount of training data produced with a high degree of human supervision, yet sufficient labeled data is not available for every language. In this paper, we use unlabeled bilingual corpora to extract useful features by transferring information from a resource-rich language to a resource-poor language, and we combine these features with a small amount of training data to build a supervised NER model. We then apply a graph-based semi-supervised learning method that trains a CRF-based supervised classifier on the labeled data and uses its high-confidence predictions on the unlabeled data to expand the training set, improving the efficiency of the NER model with the new training set.
Keywords
feature extraction; graph theory; learning (artificial intelligence); natural language processing; pattern classification; CRF-based supervised classifier; NER supervised model; NLP; annotated training data; feature extraction; graph-based semisupervised learning method; named entity recognition; natural language processing; unlabeled bilingual corpora; weakly labeled training data; Computational modeling; Data models; Feature extraction; Organizations; Semisupervised learning; Training; Training data; Bilingual parallel corpora; Named entity Recognition; graph-based semi-supervised learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Artificial Intelligence and Signal Processing (AISP), 2015 International Symposium on
Conference_Location
Mashhad
Print_ISBN
978-1-4799-8817-4
Type
conf
DOI
10.1109/AISP.2015.7123504
Filename
7123504