• DocumentCode
    3525654
  • Title

    Topology-aware Heuristic Data Allocation Algorithm for Big Data Infrastructure

  • Author

    Wuhui Chen ; Kumara, Banage T. G. S. ; Incheon Paik ; Zhenni Li

  • Author_Institution
    Sch. of Comput. Sci. & Eng., Univ. of Aizu, Aizu-Wakamatsu, Japan
  • fYear
    2015
  • fDate
    March 30 2015-April 2 2015
  • Firstpage
    353
  • Lastpage
    360
  • Abstract
    We propose a novel optimal data placement technique that considers not only data locality but also the global data access cost, in order to improve the performance of MapReduce in cloud data centers. We first conducted an analytical and experimental study to identify the performance issues of MapReduce in data centers, and show that MapReduce tasks involved in unexpected remote data access incur much higher communication cost and execution time, and can significantly degrade overall performance. To solve the optimal data placement problem, we propose a topology-aware heuristic algorithm that first constructs a replica-equalized structure for the abstract tree structure, and then builds a replica-similarity structure for the detail tree construction. The experimental results demonstrate that our optimal data placement approach effectively minimizes the global data access cost, with low communication cost and shorter execution time, by reducing unexpected remote data access.
  • Keywords
    Big Data; cloud computing; tree data structures; MapReduce; abstract tree structure; big data infrastructure; building replica-similarity structure; cloud data center; communication cost; data locality; detail tree construction; execution time; global data access cost; optimal data placement approach; optimal data placement problem; optimal data placement technique; replica-equalized structure; topology-aware heuristic algorithm; topology-aware heuristic data allocation algorithm; unexpected remote data access; Heuristic algorithms; Matrix decomposition; Network topology; Servers; Shape; Telecommunication traffic; Topology; MapReduce; cloud data center; heuristic algorithm; optimal data allocation; topology-aware;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data Computing Service and Applications (BigDataService), 2015 IEEE First International Conference on
  • Conference_Location
    Redwood City, CA
  • Type

    conf

  • DOI
    10.1109/BigDataService.2015.10
  • Filename
    7184902