• DocumentCode
    26395
  • Title

    Cost Minimization for Big Data Processing in Geo-Distributed Data Centers

  • Author

    Lin Gu ; Deze Zeng ; Peng Li ; Song Guo

  • Author_Institution
    Univ. of Aizu, Aizu-Wakamatsu, Japan
  • Volume
    2
  • Issue
    3
  • fYear
    2014
  • fDate
    Sept. 2014
  • Firstpage
    314
  • Lastpage
    323
  • Abstract
    The explosive growth of demands on big data processing imposes a heavy burden on computation, storage, and communication in data centers, which hence incurs considerable operational expenditure for data center providers. Therefore, cost minimization has become an emergent issue for the upcoming big data era. Different from conventional cloud services, one of the main features of big data services is the tight coupling between data and computation, as computation tasks can be conducted only when the corresponding data are available. As a result, three factors, i.e., task assignment, data placement, and data movement, deeply influence the operational expenditure of data centers. In this paper, we are motivated to study the cost minimization problem via a joint optimization of these three factors for big data services in geo-distributed data centers. To describe the task completion time with the consideration of both data transmission and computation, we propose a 2-D Markov chain and derive the average task completion time in closed form. Furthermore, we model the problem as a mixed-integer nonlinear programming problem and propose an efficient solution to linearize it. The high efficiency of our proposal is validated by extensive simulation-based studies.
  • Keywords
    Big Data; Markov processes; computer centres; integer programming; minimisation; nonlinear programming; 2D Markov chain; big data processing; big data services; cost minimization; data center operational expenditure; data movement; data placement; geodistributed data centers; mixed-integer nonlinear programming; task assignment; Data handling; Data storage systems; Distributed databases; Information management; Minimization; Routing protocols; data flow; distributed data centers
  • fLanguage
    English
  • Journal_Title
    Emerging Topics in Computing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2168-6750
  • Type

    jour

  • DOI
    10.1109/TETC.2014.2310456
  • Filename
    6762920