DocumentCode :
2043122
Title :
Multi-document Chinese name disambiguation based on Latent Semantic Analysis
Author :
Wu, Chengrong ; Gong, Linghui ; Zeng, Jianping
Author_Institution :
Sch. of Comput. Sci., Fudan Univ., Shanghai, China
Volume :
5
fYear :
2010
fDate :
10-12 Aug. 2010
Firstpage :
2367
Lastpage :
2371
Abstract :
Name disambiguation has received considerable attention as an important subtask of natural language processing (NLP). Given many potential references to person entities, the goal is to determine, for each reference in context, the person entity it most likely refers to. However, much research in this field either focuses on name disambiguation within a single text or applies machine learning models to multiple documents without considering semantics. In this paper we propose a new algorithm based on Latent Semantic Analysis (LSA) for multi-document disambiguation of Chinese names. The method employs Singular Value Decomposition (SVD) to reduce the original high-dimensional text space to a comparatively lower-dimensional semantic space, and then clusters possible reference words in the semantic space to obtain the result. Experiments on a real-world dataset collected from a BBS site show that the proposed method generates reasonable results.
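
The two steps named in the abstract (SVD-based dimensionality reduction, then clustering in the reduced space) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm, the term weighting, or the number of retained dimensions, so the greedy cosine-threshold clustering, the raw-count toy matrix, and the names `lsa_embed`, `cluster_by_cosine`, `k`, and `threshold` are all assumptions made for illustration. Each column of the toy matrix stands for one document mentioning the same ambiguous name.

```python
import numpy as np


def lsa_embed(term_doc, k):
    """Project documents (columns of term_doc) into a k-dimensional latent space."""
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # One row per document, scaled by the top-k singular values.
    return vt[:k].T * s[:k]


def cluster_by_cosine(vectors, threshold):
    """Greedy single-pass clustering (an assumed stand-in, not the paper's method).

    A vector joins the existing cluster whose centroid is most similar to it
    if that cosine similarity is at least `threshold`; otherwise it starts
    a new cluster.
    """
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.maximum(norms, 1e-12)  # guard against zero vectors
    labels, centroids = [], []
    for v in unit:
        sims = [float(v @ c) / max(float(np.linalg.norm(c)), 1e-12) for c in centroids]
        if sims and max(sims) >= threshold:
            j = int(np.argmax(sims))
            centroids[j] = centroids[j] + v
        else:
            j = len(centroids)
            centroids.append(v.copy())
        labels.append(j)
    return labels


# Toy term-document counts (hypothetical data): rows are context terms,
# columns are four documents that mention the same name.
term_doc = np.array([
    [2, 1, 0, 0],  # term used in sports contexts
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 1],  # term used in research contexts
    [0, 0, 1, 3],
    [0, 0, 1, 1],
], dtype=float)

embeddings = lsa_embed(term_doc, k=2)
labels = cluster_by_cosine(embeddings, threshold=0.8)
print(labels)
```

In this toy setup the first two documents share one vocabulary and the last two share another, so they fall into two clusters, each standing for a distinct person bearing the name. On real text, the value of `k` and the similarity threshold would need tuning.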
Keywords :
learning (artificial intelligence); natural language processing; text analysis; BBS site; high dimensional text space; latent semantic analysis; machine learning model; multidocument Chinese name disambiguation; natural language processing; singular value decomposition; Algorithm design and analysis; Clustering algorithms; Computational linguistics; Context; Machine learning algorithms; Semantics; Tagging; LSA; SVD; name disambiguation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
Type :
conf
DOI :
10.1109/FSKD.2010.5569867
Filename :
5569867