Title :
The Godfather vs. Chaos: Comparing Linguistic Analysis Based on On-line Knowledge Sources and Bags-of-N-Grams for Movie Review Valence Estimation
Author :
Schuller, Björn ; Schenk, Joachim ; Rigoll, Gerhard ; Knaup, Tobias
Author_Institution :
Inst. for Human-Machine Commun., Tech. Univ. München, Munich, Germany
Abstract :
In sentiment and emotion recognition, bag-of-words modeling has recently become popular for estimating valence in text. A typical application is the evaluation of reviews of, e.g., movies, music, or games. In this respect, we suggest using back-off N-grams as the basis for vector space construction, combining the advantages of word-order modeling with easy integration into potential acoustic feature vectors intended for spoken document retrieval. For a fine-grained estimate, we consider data-driven regression alongside classification based on support vector machines. Alternatively, the on-line knowledge sources ConceptNet, General Inquirer, and WordNet serve not only to reduce out-of-vocabulary events but also as the basis for a purely linguistic analysis. As a special benefit, this approach does not demand labeled training data. A large set of 100 k movie reviews spanning 20 years, taken from Metacritic, is used throughout an extensive parameter discussion and comparative evaluation that effectively demonstrates the efficiency of the proposed methods.
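The abstract does not spell out how the back-off N-gram vector space is built, so the following is only a minimal sketch of one plausible reading: at each token position the longest N-gram found in the vocabulary is counted, backing off to shorter N-grams otherwise. The function names (`ngrams`, `build_vocabulary`, `backoff_vector`) and the parameters `max_n` and `min_count` are illustrative assumptions, not taken from the paper. The resulting count vectors could then be passed to a support vector classifier or regressor.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(documents, max_n=3, min_count=2):
    """Map every n-gram (n = 1..max_n) seen at least min_count times
    in the tokenized documents to a vector index.

    min_count is a hypothetical pruning threshold, not from the paper.
    """
    counts = Counter()
    for tokens in documents:
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    kept = sorted(gram for gram, c in counts.items() if c >= min_count)
    return {gram: index for index, gram in enumerate(kept)}


def backoff_vector(tokens, vocabulary, max_n=3):
    """Count, at each position, the longest in-vocabulary n-gram,
    backing off to shorter n-grams when the longer one is unknown.

    Positions where even the unigram is out of vocabulary add nothing.
    """
    vector = [0] * len(vocabulary)
    for i in range(len(tokens)):
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            gram = tuple(tokens[i:i + n])
            if gram in vocabulary:
                vector[vocabulary[gram]] += 1
                break  # stop at the longest match; do not count shorter ones
    return vector


if __name__ == "__main__":
    train = [["not", "good", "at", "all"], ["not", "good"], ["good", "movie"]]
    vocab = build_vocabulary(train, max_n=2, min_count=2)
    print(backoff_vector(["not", "good", "movie"], vocab, max_n=2))
```

In this sketch the bigram ("not", "good") is kept as a feature of its own, which is how word order can carry valence information that a plain bag of words would lose.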
Keywords :
document handling; information retrieval; regression analysis; support vector machines; ConceptNet; WordNet; acoustic feature vectors; data-driven regression; emotion recognition; general inquirer; linguistic analysis; movie review valence estimation; online knowledge sources; sentiment recognition; spoken document retrieval; vector space construction; word-order modeling; Chaotic communication; DVD; Databases; Man machine systems; Motion pictures; Music information retrieval; Support vector machine classification; Text analysis; Training data; Document Retrieval; Valence Estimation; Vector Space Modelling;
Conference_Title :
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-4500-4
Electronic_ISBN :
1520-5363
DOI :
10.1109/ICDAR.2009.194