Regional Information Center for Science and Technology - The Role of Feature Engineering in Prediction of Tehran Stock Exchange Index Based on LSTM

Other language title (translated from Persian) :

The Effect of Different Data Preprocessing Methods on Predicting the Tehran Stock Exchange Index Using a Long Short-Term Memory Neural Network

Title of article :

The Role of Feature Engineering in Prediction of Tehran Stock Exchange Index Based on LSTM

Author/Authors :

Aminimehr, Amin Department of Management - Ershad Damavand University - Tehran Branch - Tehran, Iran , Raoofi, Ali Faculty of Economics - Allameh Tabatabaei University - Tehran, Iran , Aminimehr, Akbar Faculty of Accounting - Management and Economic - Payame Noor University - Tehran, Iran , Aminimehr, Amirhossein School of Computer Engineering - Iran University of Science and Technology - Tehran, Iran

Pages :

From page :

527

To page :

548

Abstract :

This research examined the impact of different preprocessing methods on the performance of Long Short-Term Memory (LSTM) networks in predicting financial time series. First, the model was applied to the Tehran Stock Exchange index using Principal Component Analysis (PCA) to extract features from 78 technical indicators. Then, the same model was implemented with a random forest used to select features instead of PCA to extract them. Next, dummy variables derived from technical trading strategies were added to the model to examine the change in its performance. Finally, two deep learning models that used only lags of the target variable were deployed to compare their accuracy with that of the other models. The first deep model was plain, while the second applied a wavelet denoising process. The MSE, MAE, MAPE, and R² scores on unseen test sequences showed that LSTM with its own deep feature extraction procedure, combined with wavelet denoising, yields the best accuracy in predicting the Tehran Stock Exchange index. The Diebold-Mariano test revealed a significant difference between the accuracy of the best model and that of the rest. This result implies that although deep learning produces accurate results, its accuracy can be improved further by feeding the model with creatively extracted and denoised features.
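As a rough illustration of the evaluation step described in the abstract, the sketch below computes MSE, MAE, MAPE, and R² and a basic Diebold-Mariano statistic for two competing forecasts. It is not the authors' code: the function names, the squared-error loss, the one-step-ahead horizon, the normal approximation for the p-value, and the sample numbers are all assumptions made for the example.

```python
import math

def mse(actual, pred):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)

def mae(actual, pred):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def mape(actual, pred):
    """Mean absolute percentage error (assumes no actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)

def r2(actual, pred):
    """Coefficient of determination."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, pred))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def diebold_mariano(actual, pred1, pred2):
    """Basic DM statistic with squared-error loss and horizon 1 (assumed).

    A negative statistic means pred1 has the lower average loss.
    Returns (statistic, two-sided p-value under a normal approximation).
    """
    d = [(a - p1) ** 2 - (a - p2) ** 2 for a, p1, p2 in zip(actual, pred1, pred2)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    if var_d == 0:
        raise ValueError("loss differential has zero variance")
    stat = mean_d / math.sqrt(var_d / n)
    normal_cdf = 0.5 * (1.0 + math.erf(abs(stat) / math.sqrt(2.0)))
    return stat, 2.0 * (1.0 - normal_cdf)

# Made-up numbers, for illustration only (not data from the paper).
actual = [100, 102, 101, 105, 107, 110]
model_a = [101, 101, 102, 104, 108, 109]
model_b = [102, 99, 103, 103, 109, 106]
stat, p_value = diebold_mariano(actual, model_a, model_b)
```

With longer forecast horizons, the variance term is usually replaced by an autocorrelation-robust estimate; the simple variance here is only appropriate for the one-step case assumed above.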

Farsi abstract (translated from Persian) :

This study compares the effects of different data preprocessing methods on predicting financial time series data with a long short-term memory neural network. In the first method, data on 78 technical indicators were passed to the principal component analysis algorithm, and the prediction model was implemented using its outputs. In the second method, instead of extracting the effective components, the random forest algorithm was used to select the most effective variables. In another experiment, variables generated by technical strategies were used to extend the prediction algorithm. Finally, a deep learning time series model fed with lags of the dependent variable was used for prediction, once with wavelet denoising and once without wavelets. The results of several error measures, including MSE, MAE, MAPE, and goodness of fit, on the test data showed that the deep learning model combined with wavelets provided the best prediction of the Tehran Stock Exchange index. The Diebold-Mariano test showed that the difference in accuracy among the methods compared in this study is statistically significant. In summary, this study showed that although deep learning models are well able to extract knowledge from time series data on the Tehran Stock Exchange index, this performance is improved by the wavelet denoising technique.
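The last preprocessing variant in the abstract, wavelet denoising followed by a model fed only with lags of the target, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the paper: the record does not name the wavelet family, decomposition level, or thresholding rule, so a single-level Haar transform with soft thresholding is assumed here, and `haar_denoise` and `make_lags` are hypothetical helper names.

```python
import math

def haar_denoise(series, threshold):
    """Single-level Haar transform, soft-threshold the details, invert.

    Assumes an even-length series. A threshold of 0 returns the input
    (up to rounding); a very large threshold replaces each pair of
    values with the pair's mean.
    """
    if len(series) % 2 != 0:
        raise ValueError("series length must be even")
    root2 = math.sqrt(2.0)
    out = []
    for i in range(0, len(series), 2):
        approx = (series[i] + series[i + 1]) / root2   # smooth part
        detail = (series[i] - series[i + 1]) / root2   # noisy part
        # Soft thresholding: shrink the detail coefficient toward zero.
        detail = math.copysign(max(abs(detail) - threshold, 0.0), detail)
        out.append((approx + detail) / root2)
        out.append((approx - detail) / root2)
    return out

def make_lags(series, n_lags):
    """Build (X, y) where each row of X holds the n_lags previous values."""
    X = [series[t - n_lags:t] for t in range(n_lags, len(series))]
    y = [series[t] for t in range(n_lags, len(series))]
    return X, y

# Made-up values, for illustration only (not data from the paper).
index_values = [1.0, 3.0, 2.0, 2.0, 5.0, 4.0]
smoothed = haar_denoise(index_values, threshold=0.5)
X, y = make_lags(smoothed, n_lags=2)
```

Each row of `X` would then be one input sequence for a sequence model such as an LSTM, with the matching entry of `y` as its target. In practice the denoising would need to be fitted without access to the test period to avoid look-ahead leakage.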

Keywords :

Tehran Stock Exchange , Price Prediction , Deep Neural Network , Feature Engineering , Knowledge Extraction

Journal title :

Iranian Journal of Economic Studies

Serial Year :

2020

Record number :

2629507

Link To Document :

https://search.isc.ac/dl/search/defaultta.aspx?DTC=10&DC=2629507