DocumentCode
1629869
Title
A Sense Based Similarity Measure for Cross-Lingual Documents
Author
Huang, Hsun-Hui ; Yang, Horng-Chang ; Kuo, Yau-Hwang
Author_Institution
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan
Volume
1
fYear
2008
Firstpage
9
Lastpage
13
Abstract
As cross-lingual information retrieval attracts increasing attention, tools that measure cross-lingual document similarity are becoming desirable. Because people convey thoughts at the abstract concept level in much the same way regardless of the language they use, it is possible to measure the semantic similarity of documents written in different languages based on the concepts the documents convey. In this paper, we represent documents by senses to lower the language barrier, adopt fuzzy set functions to cope with the inherent fuzziness among senses, and propose two document similarity measures: one based on Tversky's notion of similarity and the other on a widely used information retrieval criterion. Their performance is compared experimentally. We focus only on documents in English and Chinese, but the proposed approach can be easily extended to documents in other languages.
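As an illustration of the kind of measure the abstract describes, the following is a minimal sketch of a Tversky-style ratio similarity over fuzzy sets of senses. It is not the authors' formulation: the abstract does not give the formulas, so the dictionary representation (sense identifier mapped to a membership degree), the min-based intersection, the bounded difference, the sigma-count cardinality, the weights `alpha` and `beta`, and the example sense identifiers are all assumptions made here for illustration.

```python
from typing import Dict

# Assumed representation: sense identifier -> membership degree in [0, 1].
FuzzySet = Dict[str, float]


def cardinality(s: FuzzySet) -> float:
    """Sigma-count of a fuzzy set: the sum of its membership degrees."""
    return sum(s.values())


def intersection(a: FuzzySet, b: FuzzySet) -> FuzzySet:
    """Fuzzy intersection using the minimum of the two memberships."""
    return {k: min(a[k], b[k]) for k in a.keys() & b.keys()}


def difference(a: FuzzySet, b: FuzzySet) -> FuzzySet:
    """Bounded difference a - b: how much of each sense is in a but not in b."""
    return {k: max(v - b.get(k, 0.0), 0.0) for k, v in a.items()}


def tversky_similarity(a: FuzzySet, b: FuzzySet,
                       alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky ratio model: common / (common + alpha*only_a + beta*only_b)."""
    common = cardinality(intersection(a, b))
    only_a = cardinality(difference(a, b))
    only_b = cardinality(difference(b, a))
    denom = common + alpha * only_a + beta * only_b
    # Two empty documents share nothing measurable; return 0.0 by convention.
    return common / denom if denom > 0 else 0.0


# Hypothetical sense sets for an English and a Chinese document.
doc_en = {"bank.finance": 0.9, "money": 0.6}
doc_zh = {"bank.finance": 0.7, "loan": 0.4}
score = tversky_similarity(doc_en, doc_zh)
```

Because both documents are mapped into the same sense space before comparison, the function itself is language independent; only the step that assigns senses to words (not shown) would depend on English or Chinese resources.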
Keywords
fuzzy set theory; information retrieval; natural languages; text analysis; abstract concept level; cross-lingual document similarity; cross-lingual information retrieval; document representation; fuzzy set function; semantic similarity; sense based similarity measure; Application software; Computer science; Design engineering; Fuzzy sets; Information retrieval; Intelligent systems; Internet; Natural language processing; Natural languages; Web pages; cross-lingual; semantic similarity; sense;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems Design and Applications, 2008. ISDA '08. Eighth International Conference on
Conference_Location
Kaohsiung
Print_ISBN
978-0-7695-3382-7
Type
conf
DOI
10.1109/ISDA.2008.284
Filename
4696168