• DocumentCode
    1710162
  • Title

    Scalable Euclidean Embedding for Big Data

  • Author

    Alavi, Zohreh ; Sharma, Sagar ; Lu Zhou ; Keke Chen

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Wright State Univ., Dayton, OH, USA
  • fYear
    2015
  • Firstpage
    773
  • Lastpage
    780
  • Abstract
    Euclidean embedding algorithms transform data defined in an arbitrary metric space into Euclidean space, a step critical to many visualization techniques. At big-data scale, these algorithms must scale to massive data-parallel infrastructures. Designing such scalable algorithms and understanding the factors that affect them are important research problems for visually analyzing big data. We propose a framework that extends existing Euclidean embedding algorithms into scalable ones. Specifically, it decomposes an existing algorithm into naturally parallel components and non-parallelizable components. Data-parallel implementations such as MapReduce are then applied to the parallel components, and data reduction techniques to the non-parallelizable ones. We show that this decomposition is possible for a collection of embedding algorithms. Extensive experiments are conducted to understand the important factors in these scalable algorithms: scalability, time cost, and the effect of data reduction on result quality. Results on two sample algorithms, FastMap-MR and LMDS-MR, show that the derived algorithms preserve result quality well while achieving desirable scalability.
  • Keywords
    Big Data; data reduction; data visualisation; parallel algorithms; Big data scale; Euclidean space; FastMap-MR algorithm; LMDS-MR algorithm; arbitrary metric space; massive data parallel infrastructure; scalable Euclidean embedding algorithm; visualization technique; Algorithm design and analysis; Approximation algorithms; Big data; Complexity theory; Measurement; Parallel processing; Scalability; Euclidean embedding algorithms; big data; data visualization; parallel processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cloud Computing (CLOUD), 2015 IEEE 8th International Conference on
  • Conference_Location
    New York City, NY
  • Print_ISBN
    978-1-4673-7286-2
  • Type

    conf

  • DOI
    10.1109/CLOUD.2015.107
  • Filename
    7214117