• DocumentCode
    680798
  • Title
    Dynamic Data Partitioning and Virtual Machine Mapping: Efficient Data Intensive Computation
  • Author
    Slagter, Kenn; Ching-Hsien Hsu; Yeh-Ching Chung
  • Author_Institution
    Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
  • Volume
    2
  • fYear
    2013
  • fDate
    2-5 Dec. 2013
  • Firstpage
    220
  • Lastpage
    223
  • Abstract
    Big data refers to data sets so large that they exceed the processing capabilities of traditional systems. Big data can be awkward to work with, and its storage, processing, and analysis can be problematic. MapReduce is a recent programming model that can handle big data. It does so by distributing the storage and processing of data among a large number of computers (nodes). However, this means the time required to process a MapReduce job depends on whichever node is last to complete its task. This problem is exacerbated in heterogeneous environments. In this paper, we propose a method to improve MapReduce execution in heterogeneous environments. The method dynamically partitions data during the Map phase and uses virtual machine mapping in the Reduce phase to maximize resource utilization.
  • Keywords
    Big Data; storage management; virtual machines; MapReduce; big data; data intensive computation; data storage; dynamic data partitioning; programming model; resource utilization; virtual machine mapping; Cloud computing; Data handling; Data storage systems; Graphics processing units; Information management; Partitioning algorithms; Virtual machining; BigData; Cloud Computing; Hadoop; Heterogeneous environment; MapReduce; Parallel Computing; Virtual Machine;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on
  • Conference_Location
    Bristol
  • Type
    conf
  • DOI
    10.1109/CloudCom.2013.134
  • Filename
    6735423
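
The abstract's point that a job finishes only when its slowest node finishes can be illustrated with a small sketch. This is not the authors' algorithm: it is a static toy model, and the node speeds, the work total, and the function names `makespan`, `equal_split`, and `proportional_split` are all hypothetical. It only shows why sizing partitions to node speed can shorten a job in a heterogeneous cluster.

```python
from typing import List


def makespan(work: List[float], speeds: List[float]) -> float:
    """Job time equals the time of the last node to finish its share."""
    return max(w / s for w, s in zip(work, speeds))


def equal_split(total: float, n: int) -> List[float]:
    """Give every node the same amount of work, ignoring node speed."""
    return [total / n] * n


def proportional_split(total: float, speeds: List[float]) -> List[float]:
    """Give each node work in proportion to its (assumed known) speed."""
    capacity = sum(speeds)
    return [total * s / capacity for s in speeds]


# Hypothetical heterogeneous cluster: relative speeds in work units per second.
speeds = [1.0, 2.0, 4.0]
total = 700.0  # hypothetical total work units

equal_time = makespan(equal_split(total, len(speeds)), speeds)
proportional_time = makespan(proportional_split(total, speeds), speeds)

print(f"equal split:        {equal_time:.1f} s")
print(f"proportional split: {proportional_time:.1f} s")
```

With an equal split, the slowest node dictates the finish time while the faster nodes sit idle. With a speed-proportional split, all three nodes finish together in this toy model. A dynamic scheme such as the one the abstract describes would have to adjust partitions at run time, since real node speeds are not known in advance.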