• DocumentCode
    659496
  • Title

    Terabyte-sized image computations on Hadoop cluster platforms

  • Author

    Bajcsy, Peter ; Vandecreme, Antoine ; Amelot, Julien ; Nguyen, P. ; Chalfoun, Joe ; Brady, Mary

  • Author_Institution
    Software & Syst. Div., Nat. Inst. of Stand. & Technol., Gaithersburg, MD, USA
  • fYear
    2013
  • fDate
    6-9 Oct. 2013
  • Firstpage
    729
  • Lastpage
    737
  • Abstract
    We present a characterization of four basic Terabyte-sized image computations on a Hadoop cluster in terms of their relative efficiency according to the modified Amdahl's law. The work is motivated by the lack of standard benchmarks and stress tests for big image processing operations on a Hadoop computer cluster platform. Our benchmark design and evaluations were performed on one of the three microscopy image sets, each consisting of over one half Terabyte. All image processing benchmarks executed on the NIST Raritan cluster with Hadoop were compared against baseline measurements, such as the Terasort/Teragen benchmarks previously designed for Hadoop testing, and image processing executions on a multiprocessor desktop and on the NIST Raritan cluster using Java Remote Method Invocation (RMI) with multiple configurations. By applying our methodology to assess the efficiencies of computations on computer cluster configurations, we could rank computation configurations and aid scientists in measuring the benefits of running image processing on a Hadoop cluster.
  • Keywords
    Java; image processing; microscopy; multiprocessing systems; Amdahl law; Hadoop computer cluster platform; Java remote method invocation; NIST Raritan cluster; RMI; benchmark design; microscopy image sets; multiprocessor desktop; relative efficiency; terabyte-sized image computations; Benchmark testing; Computers; Feature extraction; Image segmentation; Random access memory; Big Data Applications and Infrastructure; Big Data Industry Standards; Big Data Open Platform
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data, 2013 IEEE International Conference on
  • Conference_Location
    Silicon Valley, CA
  • Type

    conf

  • DOI
    10.1109/BigData.2013.6691645
  • Filename
    6691645