• DocumentCode
    2151697
  • Title

    Breaking through memory limitation in GPU parallel processing using Strassen Algorithm

  • Author

    Yugopuspito, Pujianto ; Sutrisno ; Hudi, Robertus

  • Author_Institution
    Inf. Dept., Univ. Pelita Harapan, Tangerang, Indonesia
  • fYear
    2013
  • fDate
    19-21 Nov. 2013
  • Firstpage
    201
  • Lastpage
    205
  • Abstract
    Matrix multiplication is one of the basic operations in linear algebra that mostly used in computer science. For ages, applying naive algorithm to complete it has done it, and it has a standard complexity O(n3). Many researches are concluded to find more efficient and effective algorithm to process this operation, and one day Strassen has one that overcome the naive algorithm complexity with only O(n2.8074). The basic concept of this algorithm is divide and conquer (DnC) but with adjustment, to break the limitation of memory in a GPU computation. An idea to overcome this limit is to combine CPU and GPU processing, which implement Strassen algorithm. This paper shows the result of breaking through GPU memory limitation, checking the accuracy of the algorithm, compare the running time result and the most important thing is to make sure this algorithm can process bigger matrix than the naive one. Through the created charts and tables based on each performances, it showed that the maximum size of data samples that are processed by Strassen Algorithm reach 32.768 for (2n×2n) matrix size, which is bigger than the naive algorithm 8.192 could process. Also for all attempt to matrices with larger than 2048, the running times are way faster, about 0.04 seconds to 0.2 seconds for worst case scenario, and overcome the one with naive algorithm.
  • Keywords
    computational complexity; graphics processing units; matrix multiplication; parallel algorithms; DnC algorithm; GPU parallel processing; O(n3) complexity; Strassen algorithm; algorithm complexity; divide-and-conquer algorithm; graphics processing unit; linear algebra; matrix multiplication; memory limitation; Algorithm design and analysis; Complexity theory; Computer architecture; Graphics processing units; Kernel; Linear algebra; Programming; CUDA Programming; GPU; Matrix Multiplication; Strassen Algorithm;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Computer, Control, Informatics and Its Applications (IC3INA), 2013 International Conference on
  • Conference_Location
    Jakarta
  • Type

    conf

  • DOI
    10.1109/IC3INA.2013.6819174
  • Filename
    6819174