DocumentCode
3089785
Title
Calculating Wikipedia Article Similarity Using Machine Translation Evaluation Metrics
Author
Erdmann, Maike ; Finch, Andrew ; Nakayama, Kotaro ; Sumita, Eiichiro ; Hara, Takahiro ; Nishio, Shojiro
Author_Institution
Dept. of Inf. Sci. & Technol., Osaka Univ., Osaka, Japan
fYear
2011
fDate
22-25 March 2011
Firstpage
620
Lastpage
625
Abstract
Calculating the similarity of Wikipedia articles in different languages is helpful for bilingual dictionary construction and various other research areas. However, standard methods for document similarity calculation are usually very simple. Therefore, we describe an approach that translates one Wikipedia article into the language of the other article and then calculates article similarity with standard machine translation evaluation metrics. An experiment revealed that our approach is effective for identifying Wikipedia articles in different languages that cover the same concept.
Keywords
Web sites; language translation; natural language processing; Wikipedia article similarity; bilingual dictionary construction; machine translation evaluation metrics; Dictionaries; Electronic publishing; Encyclopedias; Internet; Measurement; Thesauri; Bilingual Dictionary Construction; Cross-language Document Similarity; Wikipedia Mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Information Networking and Applications (WAINA), 2011 IEEE Workshops of International Conference on
Conference_Location
Biopolis
Print_ISBN
978-1-61284-829-7
Electronic_ISBN
978-0-7695-4338-3
Type
conf
DOI
10.1109/WAINA.2011.132
Filename
5763570