• DocumentCode
    3731344
  • Title
    Multi-representation approach to text regression of financial risks
  • Author
    Roman Trusov;Alexey Natekin;Pavel Kalaidin;Sergey Ovcharenko;Alois Knoll;Aida Fazylova
  • Author_Institution
    ITMO University, Saint-Petersburg, Russia
  • fYear
    2015
  • Firstpage
    110
  • Lastpage
    117
  • Abstract
    Various approaches to textual feature extraction have been proposed, starting with simple word-count features and continuing with deeper representations that capture distributional semantics. In recent publications, word embedding methods have been used successfully as a representation basis for a large number of NLP tasks, such as text classification and part-of-speech tagging. In this article we explore the use of multiple text representations simultaneously within a single regression task, in order to combine the conventional bag-of-words approach with semantically richer embeddings. We investigate the performance of this multi-representation approach on the financial risk prediction problem. Publicly available 10-K reports filed by publicly traded US companies are used as the basis for predicting the next-year change in stock price volatility. Our study shows that models based on single representations achieve performance comparable to previously published results on risk prediction, and that models with multiple representations benefit from complementary information and outperform both baseline and single-representation models.
  • Keywords
    "Radio frequency","Predictive models","Databases","Visualization","Optimization"
  • Publisher
    ieee
  • Conference_Titel
    Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT), 2015
  • Type
    conf
  • DOI
    10.1109/AINL-ISMW-FRUCT.2015.7382979
  • Filename
    7382979