DocumentCode :
2690544
Title :
Generic high-throughput methods for multilingual sentiment detection
Author :
Gindl, Stefan ; Scharl, Arno ; Weichselbraun, A.
Author_Institution :
Dept. of New Media Technol., MODUL Univ. Vienna, Vienna, Austria
fYear :
2010
fDate :
13-16 April 2010
Firstpage :
239
Lastpage :
244
Abstract :
Digital ecosystems typically involve a large number of participants from different sectors who generate rapidly growing archives of unstructured text. Measuring the frequency of certain terms to determine the popularity of a topic is comparably straightforward. Detecting sentiment expressed in user-generated electronic content is more challenging, especially in the case of digital ecosystems comprising heterogeneous sets of multilingual documents. This paper describes the use of language-specific grammar patterns and multilingual tagged dictionaries to detect sentiment in German and English document repositories. Digital ecosystems may contain millions of frequently updated documents, requiring sentiment detection methods that maximize throughput. The ideal combination of high-throughput techniques and more accurate (but slower) approaches depends on the specific requirements of an application. To accommodate a wide range of possible applications, this paper presents (i) an adaptive method, balancing accuracy and scalability of multilingual textual sources, (ii) a generic approach for generating language-specific grammar patterns and multilingual tagged dictionaries, and (iii) an extensive evaluation verifying the method's performance based on Amazon product reviews and user evaluations from Sentiment Quiz, a “game with a purpose” that invites users of the Facebook social networking platform to assess the sentiment of individual sentences.
Keywords :
dictionaries; grammars; natural language processing; pattern recognition; social networking (online); text analysis; English document repositories; Facebook social networking platform; German document repositories; digital ecosystems; generic high-throughput methods; language-specific grammar patterns; multilingual sentiment detection; multilingual tagged dictionaries; unstructured text; Accuracy; Conferences; Dictionaries; Ecosystems; Feature extraction; Grammar; Media; evaluation; grammar patterns; multilingual sentiment detection; semantic orientation; tagged dictionaries;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Ecosystems and Technologies (DEST), 2010 4th IEEE International Conference on
Conference_Location :
Dubai
ISSN :
2150-4938
Print_ISBN :
978-1-4244-5551-5
Type :
conf
DOI :
10.1109/DEST.2010.5610641
Filename :
5610641