Title :
Multi-representation approach to text regression of financial risks
Author :
Roman Trusov;Alexey Natekin;Pavel Kalaidin;Sergey Ovcharenko;Alois Knoll;Aida Fazylova
Author_Institution :
ITMO University, Saint-Petersburg, Russia
Abstract :
Many approaches to textual feature extraction have been proposed, starting with simple word-count features and continuing with deeper representations that capture distributional semantics. In recent publications, word embedding methods have been used successfully as the representation basis for a large number of NLP tasks, such as text classification and part-of-speech tagging. In this article we explore using multiple text representations simultaneously within one regression task, in order to combine the conventional bag-of-words approach with the more semantically rich embeddings. We investigate the performance of this multi-representation approach on the financial risk prediction problem. Publicly available 10-K reports filed by US publicly traded companies are used as the basis for predicting the next-year change in stock price volatility. Our study shows that models based on a single representation achieve performance comparable to previously published results on risk prediction, while models with multiple representations benefit from complementary information and outperform both the baseline and the single-representation models.
Keywords :
"Radio frequency","Predictive models","Databases","Visualization","Optimization"
Conference_Title :
Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT), 2015
DOI :
10.1109/AINL-ISMW-FRUCT.2015.7382979