DocumentCode :
2446834
Title :
Performance Considerations of Data Acquisition in Hadoop System
Author :
Jia, Baodong ; Wlodarczyk, T.W. ; Rong, Chunming
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of Stavanger, Stavanger, Norway
fYear :
2010
fDate :
Nov. 30 2010-Dec. 3 2010
Firstpage :
545
Lastpage :
549
Abstract :
Data have become increasingly important in recent years, especially for large companies, and mining useful information from these data is of great benefit. The oil and gas industry has to deal with vast amounts of data, in both real-time and historical contexts. Because the amount of data is so large, processing it is usually infeasible or very time-consuming. In our project we investigate the use of Hadoop to solve this problem. To run Hadoop jobs, the data must first exist in the Hadoop file system, which creates the problem of data acquisition. In this paper, two solutions are investigated and their performance is compared; the solution based on Chukwa is shown to be more efficient than a naïve implementation, particularly for larger file sizes.
Keywords :
data acquisition; data mining; distributed databases; gas industry; petroleum industry; production engineering computing; Chukwa; Hadoop file system; data acquisition performance considerations; oil industry; Clustering algorithms; Drilling; File systems; Java; Monitoring; Real time systems; Hadoop; Historical data; Performance; Real-time data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2010 IEEE Second International Conference on
Conference_Location :
Indianapolis, IN
Print_ISBN :
978-1-4244-9405-7
Electronic_ISBN :
978-0-7695-4302-4
Type :
conf
DOI :
10.1109/CloudCom.2010.93
Filename :
5708498