Title :
Overcoming Asymmetry in Entity Graphs
Author :
Taesung Lee ; Young-rok Cha ; Seung-won Hwang
Author_Institution :
Dept. of Comput. Sci. & Eng., POSTECH, Pohang, South Korea
Abstract :
This paper studies the problem of mining named entity translations by aligning comparable corpora. Current state-of-the-art approaches mine a translation pair by aligning an entity graph in one language to an entity graph in another, based on node similarity or the propagated similarity of related entities. However, because they build on an assumption of “symmetry”, these approaches quickly deteriorate on “weakly” comparable corpora that exhibit some asymmetry. In this paper, we pursue two directions for overcoming relation asymmetry and entity asymmetry, respectively. The first approach starts from weakly comparable corpora (for high recall) and then ensures precision by selective propagation only to entities of symmetric relations. The second approach starts from parallel corpora (for high precision) and then enhances recall by extending the translation matrix based on node similarity and contextual similarity. Our experimental results on English-Chinese corpora show that both approaches are effective and complementary. Our combined approach outperforms the best-performing baseline by up to 0.28 in F1-score.
Keywords :
data mining; entity-relationship modelling; graph theory; knowledge engineering; English-Chinese corpora; F1-score; contextual similarity; entity graphs; knowledge engineering methodologies; node similarity; parallel corpora; translation matrix; Context modeling; Electronic publishing; Internet; Semantics; Knowledge modeling; entity translation
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2014.2316799