• DocumentCode
    1791633
  • Title
    Spatial computations over terabyte-sized images on Hadoop platforms
  • Author
    Bajcsy, Peter ; Nguyen, P. ; Vandecreme, Antoine ; Brady, Mary
  • Author_Institution
    Software & Syst. Div., Nat. Inst. of Stand. & Technol., Gaithersburg, MD, USA
  • fYear
    2014
  • fDate
    27-30 Oct. 2014
  • Firstpage
    816
  • Lastpage
    824
  • Abstract
    Our objective is to lower the barrier to executing spatial image computations in a computer cluster/cloud environment rather than in a desktop/laptop computing environment. We study two related problems encountered when executing spatial computations over terabyte-sized images using Apache Hadoop running on distributed computing resources: (a) detecting spatial computations in a library of image processing functions and estimating their parameters, and (b) partitioning image data for spatial image computations on Hadoop cluster/cloud computing platforms so as to minimize network data transfer. The first problem is solved by designing an iterative estimation methodology. The second problem is formulated as an optimization over three partitioning schemas (physical, logical without overlap, and logical with overlap) and evaluated over several system configuration parameters. Our experimental results for the two problems demonstrate 100% accuracy in detecting spatial computations in the Java Advanced Imaging and ImageJ libraries, a speed-up of 5.36 of the developed logical image partitioning with overlap over the default Hadoop physical partitioning, and 3.14 times faster execution of logical partitioning with overlap than of logical partitioning without overlap. The novelty of our work lies in designing an extension to Apache Hadoop that runs a class of spatial image processing operations efficiently on a distributed computing resource.
  • Keywords
    Java; data handling; image processing; optimisation; parallel processing; parameter estimation; Java Advanced Imaging; Apache Hadoop; distributed computing resources; image processing functions; ImageJ libraries; logical image partitioning; Cloud computing; Computers; Equations; Image processing; Kernel; Libraries; Distributed computing; Hadoop; Image partition; Spatial image operations
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Big Data (Big Data), 2014 IEEE International Conference on
  • Conference_Location
    Washington, DC
  • Type
    conf
  • DOI
    10.1109/BigData.2014.7004311
  • Filename
    7004311
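
The abstract's "logical partitioning with overlap" can be illustrated with a small sketch. This is not the authors' Hadoop extension; it is a minimal illustration assuming that each logical tile is read together with a border (halo) as wide as the radius of the spatial kernel, so that a neighborhood operation can be computed per tile without fetching pixels from neighboring tiles over the network. The names `Tile`, `partition_with_overlap`, `tile` and `radius` are hypothetical and do not appear in the record.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), half-open pixel ranges


class Tile(NamedTuple):
    core: Box  # pixels this tile is responsible for producing
    read: Box  # pixels this tile must load: core grown by the halo, clipped


def partition_with_overlap(width: int, height: int,
                           tile: int, radius: int) -> List[Tile]:
    """Split a width x height image into square logical tiles.

    Assumption (not from the paper): the overlap equals the kernel radius,
    which is what a (2*radius + 1)-wide neighborhood operation needs.
    With radius == 0 this degenerates to partitioning without overlap.
    """
    if tile <= 0 or radius < 0:
        raise ValueError("tile must be positive and radius non-negative")
    tiles = []
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            x1 = min(x0 + tile, width)
            y1 = min(y0 + tile, height)
            # Grow the core by the halo and clip to the image bounds.
            read = (max(x0 - radius, 0), max(y0 - radius, 0),
                    min(x1 + radius, width), min(y1 + radius, height))
            tiles.append(Tile(core=(x0, y0, x1, y1), read=read))
    return tiles


if __name__ == "__main__":
    for t in partition_with_overlap(10, 10, tile=5, radius=1):
        print(t)
```

Each tile's `read` box is what a worker would load, while only the `core` box is written back, so the core regions cover the image exactly once.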