• DocumentCode
    1733603
  • Title
    Distributed Kernel Matrix Approximation and Implementation Using Message Passing Interface
  • Author
    Dameh, Taher A. ; Abd-Almageed, Wael ; Hefeeda, Mohamed
  • Author_Institution
    Sch. of Comput. Sci., Simon Fraser Univ., Burnaby, BC, Canada
  • Volume
    1
  • fYear
    2013
  • Firstpage
    52
  • Lastpage
    57
  • Abstract
    We propose a distributed method to compute similarity matrices (also known as kernel or Gram matrices), which are used in various kernel-based machine learning algorithms. Current methods for computing similarity matrices have quadratic time and space complexities, which prevent them from scaling to large data sets. To reduce these quadratic complexities, the proposed method first partitions the data into smaller subsets using various families of locality sensitive hashing, including random projection and spectral hashing. The method then computes the similarity values among the points within each subset, yielding approximate similarity matrices. We show analytically that the time and space complexities of the proposed method are subquadratic. We implemented the proposed method using the Message Passing Interface (MPI) framework and ran it on a cluster. Our results on large real-world data sets show that the proposed method does not significantly reduce the accuracy of the computed similarity matrices, and that it achieves substantial savings in running time and memory requirements.
  • Keywords
    computational complexity; learning (artificial intelligence); matrix algebra; message passing; MPI framework; approximated similarity matrices; distributed kernel matrix approximation; Gram matrix; kernel-based machine learning algorithm; large-scale data sets; locality sensitive hashing; message passing interface; quadratic time and space complexity; random projection; similarity values; spectral hashing; Accuracy; Algorithm design and analysis; Approximation algorithms; Approximation methods; Clustering algorithms; Complexity theory; Kernel; Large-scale data processing; big data; distributed clustering; kernel matrix approximation; kernel-based algorithms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2013 12th International Conference on
  • Conference_Location
    Miami, FL
  • Type
    conf
  • DOI
    10.1109/ICMLA.2013.17
  • Filename
    6784587
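
The two steps summarized in the abstract (partition the points with locality sensitive hashing, then compute similarities only within each subset) can be illustrated with a minimal single-process sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `lsh_buckets`, `rbf`, and `approx_kernel` are hypothetical, the hash family is sign-based random projection, the kernel is a Gaussian (RBF) kernel, and the MPI distribution step is omitted.

```python
import math
import random
from collections import defaultdict


def lsh_buckets(points, num_bits, seed=0):
    """Group point indices by the sign pattern of num_bits random projections."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Assumption: one random Gaussian direction per hash bit.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]
    buckets = defaultdict(list)
    for idx, p in enumerate(points):
        # Each bit records which side of a random hyperplane the point falls on.
        key = tuple(sum(w * x for w, x in zip(plane, p)) >= 0.0 for plane in planes)
        buckets[key].append(idx)
    return list(buckets.values())


def rbf(a, b, gamma):
    """Gaussian kernel value exp(-gamma * ||a - b||^2) (assumed kernel choice)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def approx_kernel(points, num_bits=2, gamma=1.0, seed=0):
    """Return {(i, j): k(x_i, x_j)} only for pairs that share a bucket.

    Pairs in different buckets are left out, i.e. treated as zero similarity,
    so the stored matrix is block-structured rather than a full n x n array.
    """
    entries = {}
    for bucket in lsh_buckets(points, num_bits, seed):
        for i in bucket:
            for j in bucket:
                entries[(i, j)] = rbf(points[i], points[j], gamma)
    return entries


if __name__ == "__main__":
    pts = [[1.0, 1.0], [1.1, 0.9], [-1.0, -1.0], [-0.9, -1.1]]
    K = approx_kernel(pts)
    print(f"stored {len(K)} of {len(pts) ** 2} entries")
```

Because each bucket is processed independently, a distributed version could plausibly assign buckets to different processes and compute the blocks in parallel. The sketch leaves that part out and does not claim to reproduce the authors' implementation.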