• DocumentCode
    1411264
  • Title

    Fast and processor efficient parallel matrix multiplication algorithms on a linear array with a reconfigurable pipelined bus system

  • Author

    Li, Keqin ; Pan, Yi ; Zheng, Si Qing

  • Author_Institution
    State Univ. of New York, New Paltz, NY, USA
  • Volume
    9
  • Issue
    8
  • fYear
    1998
  • fDate
    8/1/1998 12:00:00 AM
  • Firstpage
    705
  • Lastpage
    720
  • Abstract
    We present efficient parallel matrix multiplication algorithms for linear arrays with reconfigurable pipelined bus systems (LARPBS). Such systems are able to support a large volume of parallel communication of various patterns in constant time. An LARPBS can also be reconfigured into many independent subsystems and, thus, is able to support parallel implementations of divide-and-conquer computations like Strassen's algorithm. The main contributions of the paper are as follows. We develop five matrix multiplication algorithms with varying degrees of parallelism on the LARPBS computing model; namely, MM1, MM2, MM3, and compound algorithms C1(ε) and C2(δ). Algorithm C1(ε) has adjustable time complexity in sublinear level. Algorithm C2(δ) implies that it is feasible to achieve sublogarithmic time using o(N^3) processors for matrix multiplication on a realistic system. Algorithms MM3, C1(ε), and C2(δ) all have o(N^3) cost and, hence, are very processor efficient. Algorithms MM1, MM3, and C1(ε) are general-purpose matrix multiplication algorithms, where the array elements are in any ring. Algorithms MM2 and C2(δ) are applicable to array elements that are integers of bounded magnitude, or floating-point values of bounded precision and magnitude, or Boolean values. Extensions of algorithms MM2 and C2(δ) to unbounded integers and reals are also discussed.
  • Keywords
    computational complexity; matrix multiplication; parallel algorithms; reconfigurable architectures; LARPBS; bounded precision; floating-point values; linear arrays; parallel implementations; parallelism; reconfigurable pipelined bus systems; sublogarithmic time; time complexity; Concurrent computing; Costs; Eigenvalues and eigenfunctions; Graph theory; Optical arrays; Parallel processing; Polynomials; Power engineering and energy; Tree graphs
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/71.706044
  • Filename
    706044