DocumentCode :
124210
Title :
Evaluating Feature Sets and Classifiers for Sentiment Analysis of Financial News
Author :
Njolstad, Pal Christian S. ; Hoysaeter, Lars S. ; Wei Wei ; Gulla, Jon Atle
Author_Institution :
Dept. of Comput. & Inf. Sci., Norwegian Univ. of Sci. & Technol., Trondheim, Norway
Volume :
2
fYear :
2014
fDate :
11-14 Aug. 2014
Firstpage :
71
Lastpage :
78
Abstract :
Work on sentiment analysis has thus far been limited in the news article domain. This has mainly been caused by 1) news articles lacking a clearly defined target, 2) the difficulty in separating good and bad news from positive and negative sentiment, and 3) the seeming necessity of, and complexity in, relying on domain-specific interpretations and background knowledge. In this paper we propose, define, experiment with, and evaluate four different feature categories, composed of 26 article features, for sentiment analysis. Using five different machine learning methods, we train sentiment classifiers of Norwegian financial internet news articles, and achieve classification precisions up to ~71%. This is comparable to the state-of-the-art in other domains and close to the human baseline. Our experimentation with different feature subsets shows that the category relying on domain-specific sentiment lexica ('contextual' category), able to grasp the jargon and lingo used in Norwegian financial news, is of cardinal importance in classification: these features yield a precision increase of ~21% when added to the other feature categories. When comparing different machine learning classifiers, we find J48 classification trees to yield the highest performance, closely followed by Random Forests (RF), in line with recent studies, and in opposition to the antedated conception that Support Vector Machines (SVM) are superior in this domain.
Keywords :
Internet; data mining; electronic publishing; emotion recognition; feature extraction; financial data processing; learning (artificial intelligence); pattern classification; support vector machines; text analysis; tree searching; trees (mathematics); J48 classification trees; Norwegian financial Internet news articles; SVM; domain-specific interpretations; domain-specific sentiment lexica; feature categories; feature classifiers; feature set evaluation; financial news; jargon; lingo; machine learning classifiers; machine learning methods; negative sentiment; positive sentiment; random forests; sentiment analysis; sentiment classifiers; support vector machines; Aggregates; Correlation; Feature extraction; Radio frequency; Reliability; Sentiment analysis; Support vector machines; Artificial neural networks; Decision trees; Feature extraction; Machine learning; Supervised learning; Support vector machines; Text analysis; Web mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Warsaw
Type :
conf
DOI :
10.1109/WI-IAT.2014.82
Filename :
6927609