• DocumentCode
    2842329
  • Title
    Dynamic Random Access for Hadoop Distributed File System
  • Author
    Zhou, Wei ; Han, Jizhong ; Zhang, Zhang ; Dai, Jiao
  • Author_Institution
    Inst. of Comput. Technol., Beijing, China
  • fYear
    2012
  • fDate
    18-21 June 2012
  • Firstpage
    17
  • Lastpage
    22
  • Abstract
    The Hadoop Distributed File System (HDFS) is widely used to manage large-scale data because of its high scalability. HDFS natively supports sequential queries, which are the most common queries in applications. However, many applications also need to run random queries over large-scale data, and such queries are becoming increasingly important. HDFS is not optimized for random reads, so random access to HDFS has many disadvantages. In this paper, we present three methods that address these issues by optimizing random access to HDFS while preserving sequential access performance: 1) dynamically setting the size of the data packets used in transmission, 2) reusing TCP connections for localized random accesses, and 3) transferring random accesses to the same server to make full use of the TCP connections. Experimental evaluations on real-world data show that our solutions efficiently support both sequential access and random access compared with the original methods.
  • Keywords
    data communication; distributed databases; query processing; transport protocols; HDFS; Hadoop distributed file system; TCP connections; data packet transmission; dynamic methods; dynamic random access; large-scale data; random queries; random reads; real world data; sequential access performance; sequential queries; Data models; Distributed databases; File systems; Optimization; Protocols; Scalability; Servers; dynamic setting; random access; transfer model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Distributed Computing Systems Workshops (ICDCSW), 2012 32nd International Conference on
  • Conference_Location
    Macau
  • ISSN
    1545-0678
  • Print_ISBN
    978-1-4673-1423-7
  • Type
    conf
  • DOI
    10.1109/ICDCSW.2012.74
  • Filename
    6258128