DocumentCode :
243575
Title :
A Localization Toolkit for Sentic Net
Author :
Yunqing Xia ; Xiaoyu Li ; Cambria, Erik ; Hussain, Amir
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
fYear :
2014
fDate :
14-14 Dec. 2014
Firstpage :
403
Lastpage :
408
Abstract :
SenticNet is a popular resource for concept-level sentiment analysis. Because SenticNet was created specifically for opinion mining in the English language, however, its localization can be very laborious. In this work, a toolkit for creating non-English versions of SenticNet in a time- and cost-effective way is proposed. This is achieved by exploiting online facilities such as Web dictionaries and translation engines. There are three challenging issues. Firstly, when a Web lexicon is used, one sentiment concept in English can usually be mapped to multiple concepts in the local language. In this work, we develop a concept disambiguation algorithm that discovers context within texts in the target language. Secondly, the polarity of some concepts in the local language may differ from that of their English counterparts; these are referred to as language-dependent sentiment concepts. An algorithm is developed to detect sentiment conflict using sentiment annotation corpora in the two languages. Lastly, some sentiment concepts in the local language are still missing after dictionary lookup and online translation. In this work, we develop a tool to extract these concepts from a sentiment dictionary in the local language. Our practice and evaluation in constructing the Chinese version of SenticNet indicate that the proposed algorithms represent an effective toolkit for localizing SenticNet.
Keywords :
data mining; dictionaries; language translation; linguistics; natural languages; Chinese version; English language; SenticNet; Web dictionaries; Web lexicon; concept disambiguation algorithm; concept-level sentiment analysis; language-dependent sentiment concepts; local language; localization toolkit; non-English versions; online facilities; online translation; opinion mining; sentiment annotation corpora; sentiment conflict; sentiment dictionary; translation engines; Accuracy; Context; Dictionaries; Educational institutions; Semantics; Sensitivity; Sentiment analysis; Sentic Net; common sense; localization
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining Workshop (ICDMW), 2014 IEEE International Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-1-4799-4275-6
Type :
conf
DOI :
10.1109/ICDMW.2014.179
Filename :
7022624