A very-high-speed integrated circuit is required to perform a 3-pixel-by-3-pixel sliding-window convolution over an image. Performing this on real-time video requires on the order of 71 × 10^6 multiplies/sec. A device has been designed to perform 90 × 10^6 multiply-accumulate operations/sec, using 8-bit input words and providing full-precision output. Because the device can perform general vector multiplication, it is also useful for general digital filtering. Sets of devices may be used to increase accuracy, or chained together to form high-order FIR filters. This paper describes the algorithms and architecture used within the device.
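As an illustration only, the 3-by-3 sliding-window operation and the multiply-rate estimate can be sketched in software. This is a minimal sketch, not the device's algorithm: the function names `convolve3x3` and `multiplies_per_second` are hypothetical, and the 512 × 512 frame at 30 frames/sec is an assumption chosen because it gives a figure close to 71 × 10^6; the source does not state the video format.

```python
def convolve3x3(image, kernel):
    """3x3 sliding-window convolution over a 2-D list, valid region only.

    Each output pixel is nine multiply-accumulate steps. The kernel is
    index-flipped, as in the textbook definition of convolution.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 2):
        out_row = []
        for c in range(cols - 2):
            acc = 0  # Python ints do not overflow, so no precision is lost
            for i in range(3):
                for j in range(3):
                    acc += image[r + i][c + j] * kernel[2 - i][2 - j]
            out_row.append(acc)
        out.append(out_row)
    return out


def multiplies_per_second(width, height, frames_per_second, taps=9):
    """Multiplies/sec when every pixel of every frame needs `taps` products."""
    return width * height * frames_per_second * taps


# Assumed video format (not given in the source): 512 x 512 at 30 frames/sec.
rate = multiplies_per_second(512, 512, 30)

# Worst-case unsigned 8-bit accumulation over nine taps fits in 20 bits.
worst_case = 9 * 255 * 255
```

Under these assumptions `rate` is 70,778,880, or about 71 × 10^6 multiplies/sec. The worst-case sum of nine unsigned 8-bit products is 585,225, so a 20-bit accumulator would hold the result without truncation in this sketch.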