Stock market prediction using Hadoop Map-Reduce ecosystem

Author

Dubey, Arun Kumar ; Jain, Vanita ; Mittal, A.P.

Author_Institution

Bharati Vidyapeeth's Coll. of Eng., New Delhi, India

fYear

2015

fDate

11-13 March 2015

Firstpage

616

Lastpage

621

Abstract

Academia and industry now routinely work with very large volumes of data, on the order of petabytes, and use the MapReduce technique to analyze them. The input to such a framework is so large that the files cannot all be kept on a single node. Once the data has been distributed across machines, it has to be processed in parallel. Hadoop is a framework that enables applications to work on large amounts of data on clusters with thousands of nodes. A distributed file system (HDFS) stores the data on these nodes, providing high aggregate bandwidth across the cluster. Hadoop also implements a parallel computational algorithm, MapReduce, which divides the main task into small chunks that are processed in parallel (the map stage) and then combines all the results into a final output (the reduce stage). This paper presents Hadoop-based stock forecasting using neural networks. The stock market combines high profit with high risk, so its prediction must be as accurate as possible. The main difficulty is that stock data follow a very complex nonlinear function, which can only be learned by a data mining method such as a neural network in order to recognize future market trends. This project focuses mainly on training a feed-forward artificial neural network (ANN) on the Hadoop framework. We use the distributed, parallel capability of the Hadoop ecosystem, with MapReduce managing the training of the neural network on large datasets. Our experimental results show the speedup achieved by increasing the number of processors in the Hadoop cluster for an artificial neural network.
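The map and reduce stages described in the abstract can be illustrated with a small data-parallel training sketch. This is not the authors' implementation: it assumes a single linear neuron instead of a full feed-forward network, runs in one Python process instead of on a Hadoop cluster, and uses hypothetical names (`map_gradient`, `reduce_gradients`, `train`) and toy data chosen only for illustration.

```python
from functools import reduce


def map_gradient(shard, weights, bias):
    """Map step: partial squared-error gradient of one linear neuron on one shard."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for features, target in shard:
        prediction = sum(w * x for w, x in zip(weights, features)) + bias
        error = prediction - target
        for i, x in enumerate(features):
            grad_w[i] += error * x
        grad_b += error
    # Return the sample count too, so the reducer can later average.
    return grad_w, grad_b, len(shard)


def reduce_gradients(left, right):
    """Reduce step: element-wise sum of two partial results."""
    return (
        [a + b for a, b in zip(left[0], right[0])],
        left[1] + right[1],
        left[2] + right[2],
    )


def train(shards, n_features, epochs=200, learning_rate=0.05):
    """Batch gradient descent; each epoch is one map phase plus one reduce phase."""
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        # On a cluster, each shard would be handled by a separate mapper.
        partials = [map_gradient(shard, weights, bias) for shard in shards]
        grad_w, grad_b, count = reduce(reduce_gradients, partials)
        weights = [w - learning_rate * g / count for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / count
    return weights, bias


# Toy data (assumed for illustration): target = 2*x0 - x1 + 0.5
samples = [
    ((x0, x1), 2 * x0 - x1 + 0.5)
    for x0 in (0.0, 0.5, 1.0)
    for x1 in (0.0, 0.5, 1.0)
]
shards = [samples[0:3], samples[3:6], samples[6:9]]
weights, bias = train(shards, n_features=2)
```

Because the gradient of a summed error is the sum of the per-sample gradients, the reduced result matches what a single machine would compute over the whole dataset, which is what allows the map phase to be spread over more nodes.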

Keywords

data analysis; data mining; feedforward neural nets; parallel processing; stock markets; ANN; HDFS stores; Hadoop MapReduce ecosystem; Hadoop based stock forecasting; complex nonlinear function; data mining method; distributed file system; feedforward artificial neural network; parallel computational algorithm; stock market prediction; Biological neural networks; Distributed databases; File systems; Indexes; Training

fLanguage

English

Publisher

IEEE

Conference_Titel

Computing for Sustainable Global Development (INDIACom), 2015 2nd International Conference on

Conference_Location

New Delhi

Print_ISBN

978-9-3805-4415-1

Type

conf

Filename

7100323