DocumentCode
1398588
Title
Scalable hardware-algorithms for binary prefix sums
Author
Lin, R. ; Nakano, K. ; Olariu, S. ; Pinotti, M.C. ; Schwing, J.L. ; Zomaya, A.Y.
Author_Institution
Dept. of Comput. Sci., State Univ. of New York, Geneseo, NY, USA
Volume
11
Issue
8
fYear
2000
fDate
8/1/2000 12:00:00 AM
Firstpage
838
Lastpage
850
Abstract
We address the problem of designing efficient and scalable hardware-algorithms for computing the sum and prefix sums of a w^k-bit sequence (k ⩾ 2), using as basic building blocks linear arrays of at most w^2 shift switches, where w is a small power of 2. An immediate consequence of this feature is that in our designs broadcasts are limited to buses of length at most w^2. We adopt a VLSI delay model where the “length” of a bus is proportional to the number of devices on the bus. We begin by discussing a hardware-algorithm that computes the sum of a w^k-bit binary sequence in the time of 2k-2 broadcasts, while the corresponding prefix sums can be computed in the time of 3k-4 broadcasts. Quite remarkably, even though our hardware-algorithm uses only linear arrays of size at most w^2, the total number of broadcasts involved is less than three times the number required by an “ideal” design. We then go on to propose a second hardware-algorithm, operating in pipelined fashion, that computes the sum of a kw^2-bit binary sequence in the time of 3k+⌈log_w k⌉-3 broadcasts. Using this design, the corresponding prefix sums can be computed in the time of 4k+⌈log_w k⌉-5 broadcasts.
Keywords
parallel algorithms; VLSI delay model; binary prefix sums; binary sequence; scalable hardware-algorithms; Arithmetic; Binary sequences; Broadcasting; Computer Society; Computer architecture; Concurrent computing; Delay; Hardware; Switches; Very large scale integration;
fLanguage
English
Journal_Title
Parallel and Distributed Systems, IEEE Transactions on
Publisher
IEEE
ISSN
1045-9219
Type
jour
DOI
10.1109/71.877941
Filename
877941