  • DocumentCode
    2980911
  • Title
    NOHAA: A NOvel Framework for HPC Analytics over Windows Azure
  • Author
    Qiangju Xiao ; Jun Wang ; Yan Ma ; Lizhe Wang
  • Author_Institution
    Dept. of Electr. Eng. & Comput. Sci., Univ. of Central Florida, Orlando, FL, USA
  • fYear
    2012
  • fDate
    17-19 Dec. 2012
  • Firstpage
    448
  • Lastpage
    455
  • Abstract
    HPC analytics has become increasingly vital for analyzing the large volumes of data produced by sophisticated computing instruments. Meanwhile, with the successful development of cloud computing, more and more scientists are deploying HPC analytics in the ever-popular clouds, which poses new challenges caused mainly by differences in storage architectures, resource management mechanisms, and programming APIs. First, there is a "data semantics" gap between the way data are stored by the cloud platform and the way they are accessed by HPC analytics. Second, in-house data-intensive clusters mostly distribute data across data nodes to achieve co-located computation and storage; this is challenging for public clouds to mimic because their data are stored centrally. In this paper, we develop a new HPC analytics framework called NOHAA, which provides 1) a semantics-aware intelligent data upload interface and 2) a locality-aware hierarchical storage system in support of co-located computation and storage on Windows Azure. Our extensive real-world experiments show that NOHAA reduces the average data access time by up to 85% and speeds up HPC analytics execution by a factor of 2 to 7.
  • Keywords
    application program interfaces; cloud computing; memory architecture; mobile computing; parallel processing; HPC analytic execution time; NOHAA; Windows Azure; cloud computing; cloud platform; colocated computation; colocated storage; data nodes; data semantic gap; in-house data-intensive clusters; locality-aware hierarchical storage system; programming API; resource management mechanisms; semantic-aware intelligent data upload interface; storage architectures; Cloud computing; Computational modeling; Data models; Data preprocessing; Data transfer; Distributed databases; Semantics; Azure; Co-located Computation and Storage; Data-intensive; HDFS; HPC analytics; Hadoop; MapReduce;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on
  • Conference_Location
    Singapore
  • ISSN
    1521-9097
  • Print_ISBN
    978-1-4673-4565-1
  • Electronic_ISBN
    1521-9097
  • Type
    conf
  • DOI
    10.1109/ICPADS.2012.68
  • Filename
    6413664