• DocumentCode
    624014
  • Title

    Towards a scalable HDFS architecture

  • Author

    Azzedin, Farag

  • Author_Institution
    Inf. & Comput. Sci. Dept., King Fahd Univ. of Pet. & Miner., Dhahran, Saudi Arabia
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    155
  • Lastpage
    161
  • Abstract
    Cloud computing infrastructures allow corporations to reduce costs by outsourcing computations on-demand. One of the areas cloud computing is increasingly being utilized for is large scale data processing. Apache Hadoop is one of these large scale data processing projects that supports data-intensive distributed applications. Hadoop applications utilize a distributed file system for data storage called Hadoop Distributed File System (HDFS). HDFS architecture, by design, has only a single master node called NameNode, which manages and maintains the metadata of storage nodes, called Datanodes, in its RAM. Hence, HDFS Datanodes' metadata is restricted by the capacity of the RAM of the HDFS's single-point-of-failure NameNode. This paper proposes a fault tolerant, highly available and widely scalable HDFS architecture. The proposed architecture provides a distributed NameNode space eliminating the drawbacks of the current HDFS architecture. This is achieved by integrating the Chord protocol into the HDFS architecture.
  • Keywords
    cloud computing; data handling; distributed databases; fault tolerant computing; memory architecture; meta data; object-oriented databases; outsourcing; public domain software; random-access storage; Apache Hadoop; Chord protocol; HDFS Datanodes metadata; HDFS single-point-of-failure NameNode; Hadoop distributed file system; RAM; cloud computing infrastructures; computation on-demand outsourcing; cost reduction; data-intensive distributed applications; distributed NameNode space; distributed data storage; fault tolerant; large scale data processing projects; metadata maintenance; metadata management; scalable HDFS architecture; single master node; storage nodes; Availability; Computer architecture; File systems; Protocols; Random access memory; Servers; Chord; Cloud Computing Platform; Distributed NameNode; HDFS; Hadoop;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Collaboration Technologies and Systems (CTS), 2013 International Conference on
  • Conference_Location
    San Diego, CA
  • Print_ISBN
    978-1-4673-6403-4
  • Type

    conf

  • DOI
    10.1109/CTS.2013.6567222
  • Filename
    6567222