DocumentCode
2043122
Title
Multi-document Chinese name disambiguation based on Latent Semantic Analysis
Author
Wu, Chengrong ; Gong, Linghui ; Zeng, Jianping
Author_Institution
Sch. of Comput. Sci., Fudan Univ., Shanghai, China
Volume
5
fYear
2010
fDate
10-12 Aug. 2010
Firstpage
2367
Lastpage
2371
Abstract
Name disambiguation has received considerable attention as an important subtask of NLP (Natural Language Processing). Given many potential references to person entities, the goal is to determine, for each reference in its context, the person entity it most likely refers to. However, much of the research in this field either focuses on name disambiguation within a single text or applies machine learning models to multiple documents without any consideration of semantics. In this paper we propose a new algorithm based on LSA (Latent Semantic Analysis) for the multi-document disambiguation task for Chinese names. The method employs SVD (Singular Value Decomposition) to reduce the original high-dimensional text space to a comparatively lower-dimensional semantic space and then clusters possible reference words in the semantic space to get the result. Experiments on a real-world dataset collected from a BBS site show that the proposed method can generate reasonable results.
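The two steps the abstract names (SVD-based dimensionality reduction, then clustering in the reduced space) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy term-by-document matrix, the helper names `lsa_embed` and `cluster_by_cosine`, the choice of k = 2, the greedy cosine-threshold clustering, and the 0.8 threshold are all assumptions made for illustration, since the abstract does not specify the weighting scheme or the clustering algorithm.

```python
import numpy as np


def lsa_embed(term_doc, k):
    """Project documents (columns of term_doc) into a k-dimensional latent space."""
    # Thin SVD: term_doc = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Each row of the result is one document's coordinates, V_k scaled by S_k.
    return Vt[:k].T * s[:k]


def cluster_by_cosine(vectors, threshold=0.8):
    """Greedy clustering (an assumed stand-in for the paper's clustering step).

    Each vector joins the first cluster whose seed vector has cosine
    similarity >= threshold; otherwise it starts a new cluster.
    """
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)  # avoid division by zero
    seeds, labels = [], []
    for v in unit:
        for cid, seed in enumerate(seeds):
            if float(v @ seed) >= threshold:
                labels.append(cid)
                break
        else:
            seeds.append(v)
            labels.append(len(seeds) - 1)
    return labels


# Hypothetical toy counts: rows are context terms, columns are four documents
# that all mention the same ambiguous name.
term_doc = np.array([
    [2, 1, 0, 0],  # "professor"
    [1, 2, 0, 0],  # "lecture"
    [0, 0, 2, 1],  # "goal"
    [0, 0, 1, 2],  # "match"
], dtype=float)

embedded = lsa_embed(term_doc, k=2)
labels = cluster_by_cosine(embedded)
print(labels)
```

In this toy input, documents that share context vocabulary end up with the same label, which is read here as "the name refers to the same person". A real corpus would need Chinese word segmentation and a term-weighting step before the SVD, neither of which is shown.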
Keywords
learning (artificial intelligence); natural language processing; text analysis; BBS site; high dimensional text space; latent semantic analysis; machine learning model; multidocument Chinese name disambiguation; singular value decomposition; Algorithm design and analysis; Clustering algorithms; Computational linguistics; Context; Machine learning algorithms; Semantics; Tagging; LSA; SVD; name disambiguation
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location
Yantai, Shandong
Print_ISBN
978-1-4244-5931-5
Type
conf
DOI
10.1109/FSKD.2010.5569867
Filename
5569867