Title :
Classifying news stories to estimate the direction of a stock market index
Author :
Drury, Brett ; Torgo, Luis ; Almeida, J.J.
Author_Institution :
LIAAD, INESC, Porto, Portugal
Abstract :
News can contain information which may provide an indication of the future direction of a share or stock market index. The possibility of predicting future stock market prices has attracted an increasing number of industry practitioners and academic researchers to this area of investigation. Popular approaches have relied upon either models constructed from manually selected training data or manually constructed dictionaries. A potential flaw of manually selecting data is that the effectiveness of the trained model is dependent upon the ability of the human annotator. An alternative approach is to align news stories with trends in a specific market. A negative story is inferred if it co-occurs with a market losing value, whereas a positive story is associated with a rise. This approach has a potential flaw: news stories may co-occur with market movements by chance, which may inhibit the construction of a robust classifier from data gathered with this method. This paper presents a strategy which combines a rule classifier, an alignment strategy and self-training to induce a robust model for classifying news stories. The proposed method is compared with several competing methodologies and is evaluated with estimated F-Measure and estimated trading returns. In addition, the paper provides an evaluation of classifying a news story by its headline, description or story text. The results demonstrate a clear advantage for the proposed methodology when evaluated by estimated F-Measure. The proposed strategy also produces the highest trading returns. In addition, the paper clearly demonstrates that a news story's headline provides the greatest assistance for classification. The models induced from headlines gained the highest estimated F-Measure and trading returns for each strategy, with the exception of the alignment method, which performed uniformly poorly.
Keywords :
estimation theory; pattern classification; publishing; stock markets; F-Measure estimate; alignment strategy; direction estimate; news stories classification; robust classifier; rule classifier; stock market prices; trading returns; Argon; Manuals; constrained learning; finance; news;
Conference_Titel :
Information Systems and Technologies (CISTI), 2011 6th Iberian Conference on
Conference_Location :
Chaves
Print_ISBN :
978-1-4577-1487-0