DocumentCode :
3089785
Title :
Calculating Wikipedia Article Similarity Using Machine Translation Evaluation Metrics
Author :
Erdmann, Maike ; Finch, Andrew ; Nakayama, Kotaro ; Sumita, Eiichiro ; Hara, Takahiro ; Nishio, Shojiro
Author_Institution :
Dept. of Inf. Sci. & Technol., Osaka Univ., Osaka, Japan
fYear :
2011
fDate :
22-25 March 2011
Firstpage :
620
Lastpage :
625
Abstract :
Calculating the similarity of Wikipedia articles in different languages is helpful for bilingual dictionary construction and various other research areas. However, standard methods for document similarity calculation are usually very simple. Therefore, we describe an approach that translates one Wikipedia article into the language of the other article and then calculates article similarity with standard machine translation evaluation metrics. An experiment revealed that our approach is effective for identifying Wikipedia articles in different languages that cover the same concept.
Keywords :
Web sites; language translation; natural language processing; Wikipedia article similarity; bilingual dictionary construction; machine translation evaluation metrics; Dictionaries; Electronic publishing; Encyclopedias; Internet; Measurement; Thesauri; Bilingual Dictionary Construction; Cross-language Document Similarity; Wikipedia Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advanced Information Networking and Applications (WAINA), 2011 IEEE Workshops of International Conference on
Conference_Location :
Biopolis
Print_ISBN :
978-1-61284-829-7
Electronic_ISBN :
978-0-7695-4338-3
Type :
conf
DOI :
10.1109/WAINA.2011.132
Filename :
5763570