Title :
Universal Numerical Encoder and Profiler Reduces Computing's Memory Wall with Software, FPGA, and SoC Implementations
Author_Institution :
Samplify Syst., Campbell, CA, USA
Abstract :
Summary form only given. Numerical computations have accelerated significantly since 2005 thanks to two complementary, silicon-enabled trends: multi-core processing and single instruction, multiple data (SIMD) accelerators. Unfortunately, due to fundamental limitations of physics, these two trends could not be accompanied by a corresponding increase in memory, storage, and I/O bandwidth. High-performance computing (HPC) is the proverbial “canary in the coal mine” of multi-core processing. When HPC hits a fundamental limit, consumer multi-core will likely encounter a similar limit in a few years. We describe the computationally efficient (Fig. 1b) and adaptive APplication AXceleration (APAX) numerical encoding method to reduce the memory wall for integer and floating-point operands. APAX achieves encoding rates between 3:1 and 10:1 without changing the dataset's statistical or spectral characteristics. APAX encoding takes advantage of three characteristics of all numerical sequences: peak-to-average ratio, oversampling, and effective number of bits (ENOB). Uncertainty quantification and spectral methods quantify the degree of uncertainty (accuracy) in numerical datasets. The APAX profiler creates a rate-correlation graph with a recommended operating point, provides 18 quantitative metrics comparing the original and decoded signals, and displays input and residual spectra with a residual histogram. On 24 integer and floating-point HPC datasets taken from climate, multi-physics, and seismic simulations, APAX averaged a 7.95:1 encoding ratio at a Pearson's correlation coefficient of 0.999948 and a spectral margin (input spectrum minimum minus residual spectrum mean) of 24 dB. HPC scientists confirmed that APAX did not change HPC simulation results while reducing DRAM and disk transfers by 8x, accelerating HPC “time to results” by 20% to 50%.
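The two headline quality metrics in the abstract, Pearson's correlation coefficient and spectral margin, can be illustrated with a minimal sketch. This is not the APAX encoder or profiler: a plain uniform quantizer (`quantize`, hypothetical) stands in for a lossy encoder, the test signal (`make_signal`) is synthetic, and the spectral estimate is an unwindowed naive DFT with the residual mean taken over per-bin dB values. All of those choices are assumptions, since the abstract does not specify them.

```python
import cmath
import math
import random


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def power_spectrum_db(x):
    """Power in dB for DFT bins 0..N/2, using a naive O(N^2) DFT."""
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                for i, v in enumerate(x))
        power = abs(s) ** 2 / n
        spectrum.append(10 * math.log10(max(power, 1e-30)))  # floor avoids log(0)
    return spectrum


def spectral_margin_db(original, decoded):
    """Input spectrum minimum minus residual spectrum mean, in dB.

    Assumption: the mean is taken over per-bin dB values.
    """
    residual = [a - b for a, b in zip(original, decoded)]
    residual_db = power_spectrum_db(residual)
    return min(power_spectrum_db(original)) - sum(residual_db) / len(residual_db)


def quantize(x, step):
    """Hypothetical stand-in for a lossy encode/decode round trip."""
    return [round(v / step) * step for v in x]


def make_signal(n=64, seed=0):
    """Synthetic test signal: one sinusoid plus low-level broadband noise."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * 3 * i / n) + 0.1 * rng.uniform(-1, 1)
            for i in range(n)]


if __name__ == "__main__":
    signal = make_signal()
    for step in (0.1, 0.001):
        decoded = quantize(signal, step)
        print(step, pearson(signal, decoded), spectral_margin_db(signal, decoded))
```

A finer quantization step leaves a smaller residual, so both the correlation and the spectral margin rise. A positive margin means the residual sits, on average, below the weakest component of the input spectrum.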
Keywords :
DRAM chips; electronic engineering computing; encoding; field programmable gate arrays; graph theory; microprocessor chips; numerical analysis; system-on-chip; APAX profiler; DRAM; ENOB; FPGA implementations; HPC; HPC scientists; I/O bandwidth; Pearson correlation coefficient; SoC implementations; adaptive APAX numerical encoding method; adaptive application axceleration numerical encoding method; climate simulation; coal mine; complementary trends; computing memory wall reduction; consumer point; decoded displays; effective number of bits; floating-point HPC datasets; floating-point operands; fundamental limit; high-performance computing; integer HPC datasets; integer operands; multicore processing; multiphysics simulation; multiple data accelerators; numerical computations; numerical datasets; peak-to-average ratio; proverbial canary; quantitative metrics; rate-correlation graph; recommended operating signals; residual histogram; residual spectra; seismic simulations; silicon-enabled trends; single instruction; spectral margin; uncertainty quantification; universal numerical encoder; universal numerical profiler; Acceleration; Encoding; Field programmable gate arrays; Market research; Multicore processing; Software; HPC; compression; high-performance computing; memory wall; real-time;