Title :
Discrimination of person names based on contexts co-occurrence
Author :
Chen Chen ; Huilin Liu ; Liwei Zhang
Author_Institution :
Coll. of Inf. Sci. & Eng., Northeastern Univ., Shenyang, China
Abstract :
Because many people share the same name, picking out the pages relevant to a specific person from a large set of result documents that cover multiple namesakes is tedious. This paper proposes a method based on contexts co-occurrence to deal with the name discrimination problem. First, to consider both the global and local properties of a document, we extract important terms from the result collection by combining the weight model with the window model. Second, based on those terms, we compute the co-occurrence score between every pair of terms and obtain a contexts co-occurrence matrix. Next, according to this matrix, we combine related context terms to form several decision vectors. Finally, we compute the similarity between each document vector and each decision vector using the vector space model (VSM). For a person name search task, the proposed method divides the result documents into several groups automatically. The experimental results show that our method can discriminate accurately and effectively between different persons who share the same name.
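The steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring formulas, so the sketch assumes raw windowed co-occurrence counts, groups terms by connected components above a threshold, and uses plain term-frequency vectors with cosine similarity. The function names, the `window` and `threshold` parameters, the term list, and the toy documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import math

def cooccurrence_matrix(docs, terms, window=5):
    """Count how often two distinct terms occur within `window` tokens of each other.
    Assumption: raw counts stand in for the paper's co-occurrence score."""
    index = {t: i for i, t in enumerate(terms)}
    n = len(terms)
    matrix = [[0] * n for _ in range(n)]
    for tokens in docs:
        positions = [(p, index[tok]) for p, tok in enumerate(tokens) if tok in index]
        for (p1, i), (p2, j) in combinations(positions, 2):
            if i != j and abs(p1 - p2) <= window:
                matrix[i][j] += 1
                matrix[j][i] += 1
    return matrix

def group_terms(matrix, terms, threshold=1):
    """Assumption: related terms are the connected components of the graph whose
    edges are term pairs with a co-occurrence count >= threshold."""
    seen, groups = set(), []
    for start in range(len(terms)):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(terms[i])
            for j in range(len(terms)):
                if j not in seen and matrix[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(component))
    return groups

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dict-like counters."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def assign(docs, groups):
    """Label each document with the index of its most similar decision vector."""
    decision_vectors = [Counter(g) for g in groups]
    labels = []
    for tokens in docs:
        doc_vec = Counter(tokens)
        scores = [cosine(dv, doc_vec) for dv in decision_vectors]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels

# Toy search results for one ambiguous name (hypothetical data).
docs = [
    "john smith professor university research".split(),
    "john smith guitar band album".split(),
    "smith research university lecture".split(),
    "band album tour john smith".split(),
]
terms = ["professor", "university", "research", "guitar", "band", "album"]

matrix = cooccurrence_matrix(docs, terms)
groups = group_terms(matrix, terms)
labels = assign(docs, groups)
print(groups)
print(labels)
```

In this toy run the terms are given by hand; in the method described above they would come from the term extraction step that combines the weight model and the window model.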
Keywords :
decision making; document handling; matrix algebra; pattern matching; personal information systems; VSM model; context cooccurrence based method; contexts cooccurrence matrix; decision vector; document vector; name discrimination problem; person name; vector space model; Accuracy; Computational modeling; Context; Context modeling; Feature extraction; Social network services; Sparse matrices; contexts co-occurrence matrix; decision vector; name discrimination;
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
DOI :
10.1109/FSKD.2011.6020059