• DocumentCode
    1668319
  • Title
    Matrix bidiagonalization on the Trident processor
  • Author
    Soliman, Mostafa I. ; Sedukhin, Stanislav G.
  • Author_Institution
    Graduate Sch. of Comput. Sci. & Eng., Univ. of Aizu, Fukushima, Japan
  • fYear
    2003
  • Abstract
    This paper discusses the implementation and evaluation of the reduction of a dense matrix to bidiagonal form on the Trident processor. Two algorithms are simulated on the Trident processor: the standard Golub and Kahan Householder bidiagonalization algorithm, which is rich in matrix-vector operations, and the LAPACK subroutine _GEBRD, which mixes vector, matrix-vector, and matrix operations. We show how to use the Trident parallel execution units, ring, and communication registers to perform effectively the vector, matrix-vector, and matrix operations needed to bidiagonalize a matrix. The number of clock cycles per FLOP is used as the metric for evaluating the performance of the Trident processor. Our results show that increasing the number of Trident lanes proportionally decreases the number of cycles needed per FLOP. For a 32K×32K matrix on 128 Trident lanes, using matrix-vector operations in the standard Golub and Kahan algorithm gives a speedup of around 1.5 times over using vector operations. Using matrix operations in the _GEBRD subroutine, however, gives a speedup of around 3 times over vector operations and around 2 times over matrix-vector operations in the standard Golub and Kahan algorithm.
  • Keywords
    matrix decomposition; parallel algorithms; parallel architectures; performance evaluation; simulation; subroutines; vectors; BLAS; GEBRD; Golub and Kahan Householder algorithm; LAPACK subroutine; Trident processor; communication registers; dense matrix; matrix bidiagonalization; matrix-vector operations; parallel execution units; performance; ring; scalable architecture; simulation; speedup; Algorithms; Architecture; Cities and towns; Clocks; Computer science; Hardware; Matrix decomposition; Parallel processing; Parallel programming; Registers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium, 2003. Proceedings. International
  • ISSN
    1530-2075
  • Print_ISBN
    0-7695-1926-1
  • Type
    conf
  • DOI
    10.1109/IPDPS.2003.1213467
  • Filename
    1213467
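
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of Golub and Kahan Householder bidiagonalization in plain Python. It is not the authors' Trident implementation and makes no use of lanes, ring, or communication registers; the function names (`householder`, `bidiagonalize`, `off_band_max`, `frobenius`) and the sample matrix are assumptions chosen for illustration. It assumes a dense real matrix with at least as many rows as columns and applies the reflections with vector and matrix-vector style loops, without the blocking that _GEBRD uses.

```python
import math


def householder(x):
    """Return (v, beta) such that (I - beta*v*v^T) x has zeros below its first entry."""
    norm_x = math.sqrt(sum(t * t for t in x))
    if norm_x == 0.0:
        return list(x), 0.0  # nothing to annihilate; beta = 0 gives the identity
    alpha = -math.copysign(norm_x, x[0])  # sign chosen to avoid cancellation
    v = list(x)
    v[0] -= alpha
    return v, 2.0 / sum(t * t for t in v)


def bidiagonalize(a):
    """Reduce an m x n matrix (m >= n) to upper bidiagonal form; returns a new matrix."""
    m, n = len(a), len(a[0])
    b = [row[:] for row in a]
    for k in range(n):
        # Left reflection: zero column k below the diagonal.
        v, beta = householder([b[i][k] for i in range(k, m)])
        for j in range(k, n):
            s = beta * sum(v[i - k] * b[i][j] for i in range(k, m))
            for i in range(k, m):
                b[i][j] -= s * v[i - k]
        if k < n - 2:
            # Right reflection: zero row k to the right of the superdiagonal.
            v, beta = householder(b[k][k + 1:])
            for i in range(k, m):
                s = beta * sum(v[j - k - 1] * b[i][j] for j in range(k + 1, n))
                for j in range(k + 1, n):
                    b[i][j] -= s * v[j - k - 1]
    return b


def off_band_max(b):
    """Largest magnitude among entries outside the diagonal and superdiagonal."""
    return max(
        (abs(b[i][j]) for i in range(len(b)) for j in range(len(b[0]))
         if j != i and j != i + 1),
        default=0.0,
    )


def frobenius(mat):
    """Frobenius norm, which orthogonal transformations leave unchanged."""
    return math.sqrt(sum(t * t for row in mat for t in row))


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0], [2.0, 1.0, 0.0]]
    b = bidiagonalize(a)
    print("largest off-band entry:", off_band_max(b))
    print("norm before and after:", frobenius(a), frobenius(b))
```

Each step applies one reflection from the left and, where needed, one from the right, so the result has the same singular values as the input. The inner sums are the matrix-vector products the abstract refers to.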