DocumentCode :
3739723
Title :
Impact of HDFS block size in MapReduce based segmentation and feature extraction algorithm
Author :
Hameeza Ahmed;Muhammad Ali Ismail
Author_Institution :
High Performance Computing Centre (HPCC), Department of Computer & Information Systems Engineering, NED University of Engineering & Technology, University Road, Karachi-75270, Pakistan
Year :
2015
Firstpage :
58
Lastpage :
63
Abstract :
Apache Hadoop is one of the leading open source frameworks for processing big data. Despite its wide success, Hadoop still fails to address several significant real-world problems, and the inability to handle data dependencies efficiently is among the major ones. This paper highlights the data dependency issue in the Hadoop framework using a newly developed MapReduce-based segmentation and feature extraction algorithm applied to a very large dataset with strong data dependencies. Data dependency is managed by varying the HDFS block size. With a smaller block size, which yields greater parallelism but larger cross-block data dependency, the framework produces uncertain results. As the block size is increased, thereby minimizing the data dependency, the framework begins to show stable results at the cost of reduced parallelism.
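The block-size variation described in the abstract can be reproduced with Hadoop's standard client API, since HDFS block size is set per file at write time. The sketch below is illustrative only, not the authors' code; the input path and the 256 MB figure are assumptions, while the FileSystem.create overload and property names are standard Hadoop.

```java
// Minimal sketch: staging an input file into HDFS with an explicit block
// size, so a MapReduce job over it gets one input split (and map task)
// per block. Smaller blocks -> more parallelism but more cross-block
// data dependency; larger blocks reverse the trade-off, as the paper studies.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeStaging {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical dataset path for illustration.
        Path input = new Path("/data/segmentation-input.bin");

        long blockSize = 256L * 1024 * 1024; // 256 MB; an assumed value, must be a multiple of the checksum chunk size
        short replication = fs.getDefaultReplication(input);
        int bufferSize = conf.getInt("io.file.buffer.size", 4096);

        // Every block of this file will use blockSize, overriding the
        // cluster default (dfs.blocksize) for this file only.
        try (FSDataOutputStream out =
                 fs.create(input, true, bufferSize, replication, blockSize)) {
            // ... write the raw dataset bytes here ...
        }
    }
}
```

Equivalently, the cluster-wide default could be changed via the dfs.blocksize property in hdfs-site.xml and the dataset re-uploaded, which is likely closer to how a block-size sweep experiment would be run.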
Keywords :
"Feature extraction","Programming","Mathematical model","Parallel processing","Distributed databases","Tuning","Algorithm design and analysis"
Publisher :
ieee
Conference_Title :
2015 International Conference on Open Source Systems & Technologies (ICOSST)
Type :
conf
DOI :
10.1109/ICOSST.2015.7396403
Filename :
7396403