DocumentCode :
3536167
Title :
Preprocessing and Symbolic Representation of Stock Data
Author :
Kumar, Mukesh ; Kalia, Arvind
Author_Institution :
Dept. of Comput. Sci., Himachal Pradesh Univ., Shimla, India
fYear :
2012
fDate :
7-8 Jan. 2012
Firstpage :
83
Lastpage :
88
Abstract :
There has been considerable interest in mining time series data. Stock data mining plays an important role in visualizing the behavior of financial markets. In financial data mining, data is normally represented in numeric format; however, symbolic representation is also used to evaluate the overall impact. Time series data are difficult to manipulate, but when they are treated as symbols instead of data points, interesting patterns can be discovered and mining them becomes easier. This paper presents a symbolic representation of NSE stock data covering a thirteen-year period, from Jan. 1996 to Dec. 2008. Data preprocessing is an essential part of data mining: data cleaning fills in missing values, smooths noisy data, handles or removes outliers, and resolves inconsistencies. The dataset was first normalized using min-max normalization. The data transformation steps performed include offset translation, removal of the linear trend, and removal of noise using the moving average smoothing method. A best-fitting line is used to remove the linear trend from the dataset. The Euclidean distance measure was used to establish relationships among various stocks. Three symbols [up, down, neutral] were used for the symbolic representation of the data, and distance is evaluated according to the matching pattern of these symbols. Symbolic representation was found to provide an easier interpretation and to help determine an overall pattern. The symbolic pattern resembles the price change pattern in the numeric representation.
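The preprocessing and symbolization steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the order of the steps, the window size, the `tol` threshold for the neutral symbol, and the mismatch count used as a symbolic distance are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def min_max_normalize(xs):
    """Rescale values to [0, 1] using min-max normalization."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        # Constant series: map everything to 0.0 to avoid division by zero.
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]


def remove_offset(xs):
    """Offset translation: subtract the series mean."""
    mean = sum(xs) / len(xs)
    return [x - mean for x in xs]


def remove_linear_trend(xs):
    """Subtract the least-squares best-fitting line from the series."""
    n = len(xs)
    t_mean = (n - 1) / 2
    x_mean = sum(xs) / n
    denom = sum((t - t_mean) ** 2 for t in range(n))
    if denom == 0:
        slope = 0.0
    else:
        slope = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(xs)) / denom
    intercept = x_mean - slope * t_mean
    return [x - (slope * t + intercept) for t, x in enumerate(xs)]


def moving_average(xs, window=3):
    """Simple moving average; the window size is an assumed parameter."""
    return [sum(xs[i:i + window]) / window for i in range(len(xs) - window + 1)]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def symbolize(xs, tol=0.0):
    """Map consecutive changes to 'up', 'down', or 'neutral'.

    The tolerance `tol` is hypothetical; the abstract does not say how
    the neutral symbol is decided.
    """
    symbols = []
    for prev, cur in zip(xs, xs[1:]):
        change = cur - prev
        if change > tol:
            symbols.append("up")
        elif change < -tol:
            symbols.append("down")
        else:
            symbols.append("neutral")
    return symbols


def symbol_mismatch(a, b):
    """Assumed symbolic distance: number of positions where symbols differ."""
    return sum(1 for x, y in zip(a, b) if x != y)
```

For example, two hypothetical price series could each be passed through `min_max_normalize`, `remove_linear_trend`, and `moving_average`, then compared numerically with `euclidean_distance` and symbolically with `symbol_mismatch(symbolize(a), symbolize(b))`.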
Keywords :
data mining; data visualisation; minimax techniques; smoothing methods; stock markets; time series; NSE stock data; average smoothing method; behavior visualization; data cleaning; data transformation steps; financial market; linear trend removal; min-max normalization; noise removal; numeric representation; offset translation; price change pattern; stock data mining; symbolic representation; time series data mining; Data mining; Data preprocessing; Euclidean distance; Noise; Noise measurement; Smoothing methods; Time series analysis; Financial data mining; numeric data set; preprocessing; symbolic data set;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Computing & Communication Technologies (ACCT), 2012 Second International Conference on
Conference_Location :
Rohtak, Haryana
Print_ISBN :
978-1-4673-0471-9
Type :
conf
DOI :
10.1109/ACCT.2012.89
Filename :
6168338