• DocumentCode
    1679254
  • Title

    Atlas: Baidu's key-value storage system for cloud data

  • Author

    Chunbo Lai ; Song Jiang ; Liqiong Yang ; Shiding Lin ; Guangyu Sun ; Zhenyu Hou ; Can Cui ; Jason Cong

  • Author_Institution
    Baidu Inc., China
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    14
  • Abstract
    Users store a rapidly increasing amount of data in the cloud. Cloud storage services are often characterized by large data sets and few deletes. Hosting such a service on a conventional system, built from servers with powerful CPUs and managed by either a key-value (KV) system or a file system, is not efficient. First, as demand for storage capacity grows much faster than demand for CPU power, existing server configurations can lead to CPU under-utilization and inadequate storage. Second, as data durability is of paramount importance and storage capacity can be limited, a data protection scheme relying on data replication is not space efficient. Third, because of the unique distribution of data object sizes (mostly a few KBytes), hard disks may suffer from an unnecessarily high request rate (when data is stored as KV pairs and needs constant re-organization) or too many random writes (when data is stored as relatively small files). At Baidu this inefficiency has become an urgent issue, as data is uploaded into storage at an increasingly high rate and both the user population and the system are rapidly expanding. To address this issue, we adopt a customized compact server design based on ARM processors and replace three-copy replication for data protection with erasure coding to enable low-power and high-density storage. Furthermore, a huge number of objects are stored in the system, such as photos, MP3 music, and documents, but their sizes do not allow efficient operations in conventional KV systems. To this end we propose an innovative architecture that separates metadata management from data management to enable efficient data coding and storage. The resulting production system, called Atlas, is a highly scalable, reliable, and cost-effective KV store supporting Baidu's cloud storage service.
  • Keywords
    cloud computing; meta data; storage management; ARM processors; Atlas; Baidu; cloud data; cloud storage service; customized compact server design; data coding; data durability; data managements; data protection scheme; data replication; data storage; erasure coding; hard disks; key-value storage system; metadata; production system; server configurations; storage capacity; three-copy replication; Bandwidth; Cloud computing; Compaction; Encoding; Hard disks; Servers
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mass Storage Systems and Technologies (MSST), 2015 31st Symposium on
  • Conference_Location
    Santa Clara, CA
  • Type

    conf

  • DOI
    10.1109/MSST.2015.7208288
  • Filename
    7208288