Title :
Power Grid Time Series Data Analysis with Pig on a Hadoop Cluster Compared to Multi Core Systems
Author :
Bach, F. ; Cakmak, H.K. ; Maass, H. ; Kuehnapfel, U.
Author_Institution :
Inst. for Appl. Comput. Sci., Karlsruhe Inst. of Technol., Karlsruhe, Germany
fDate :
Feb. 27 2013-March 1 2013
Abstract :
To understand the dependencies in the power system, we try to derive state information by combining high-rate voltage time series captured at different locations with data analysis at different scales. This may enable large-scale simulation and modeling of the grid. Data captured by our recently introduced Electrical Data Recorders (EDR), together with power grid simulation data, are stored in the Large Scale Data Facility (LSDF) at Karlsruhe Institute of Technology (KIT) and are growing rapidly in size. In this article, we compare classic sequential multithreaded time series data processing with distributed processing using Pig on a Hadoop cluster. Furthermore, we present our ideas for a better organization of our raw data and metadata that is indexable, searchable, and suitable for big data.
Keywords :
data analysis; meta data; multi-threading; multiprocessing systems; power engineering computing; power grids; time series; EDR; Hadoop cluster; Karlsruhe Institute of Technology; LSDF; distributed processing; electrical data recorders; high-rate voltage time series; large scale data facility; large-scale simulation; metadata; multi core systems; multithreaded time series data processing; power grid simulation data; power grid time series data analysis; Data models; Data visualization; Java; Phasor measurement units; Power grids; Time series analysis; Voltage measurement; Hadoop; LSDF; Pig; Power system; big data; data analysis; multicore; time series;
Conference_Titel :
Parallel, Distributed and Network-Based Processing (PDP), 2013 21st Euromicro International Conference on
Conference_Location :
Belfast
Print_ISBN :
978-1-4673-5321-2
Electronic_ISBN :
1066-6192
DOI :
10.1109/PDP.2013.37