Title of article :
Scalable Parallel Coset Enumeration: Bulk Definition and the Memory Wall
Author/Authors :
Gene Cooperman; Victor Grinberg
Issue Information :
Journal issue, serial year 2002
Pages :
23
From page :
563
To page :
585
Abstract :
Coset enumeration, like Gröbner basis computation, is a notoriously difficult algorithm to parallelize. We demonstrate a successful shared-memory parallelization achieving a seven-fold speedup on an Origin 2000 CC-NUMA computer using 16 CPUs. We take as a testbed an enumeration of Lyons’s group (8,835,156 cosets). This provides comparability with previous efforts in the literature (Cooperman and Havas, 1997; Havas and Ramsay, 2000), for which the best previous speedup was a factor of 4. The new parallelization depends on two new heuristics: clouds and shallow scan. Clouds is an example of bulk definition of cosets, which is the key to our more efficient parallelization. The parallelization is implemented using TOP-C. By taking advantage of TOP-C’s option to compile for either shared or distributed memory, we also demonstrate the first efficient parallelization of a coset enumeration program using distributed memory. Our faster results expose, for the first time in the context of coset enumeration, the “memory wall”, i.e. the latency barrier of RAM. We verify this memory wall by showing, on an Origin 2000, that a memory-bound parallel program with 64 CPUs doing nothing but randomly accessing RAM achieves at best a speedup of only 3.75 over the single-CPU version. Further, we demonstrate that even sequential versions of coset enumeration programs are memory latency-bound in today’s technology. The lessons of bulk definition and of the memory wall carry over to related algorithms such as Gröbner bases, Knuth-Bendix completion, and other symbolic algebra algorithms with intermediate swell.
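The memory-wall measurement described in the abstract can be illustrated with a small pointer-chasing benchmark. The C/pthreads sketch below is not the authors’ program; it is a minimal illustration, assuming an arbitrary thread count and table size, of a memory-bound parallel workload that does nothing but randomly access RAM, so that its scaling is limited by memory latency rather than by the number of CPUs. One would compile it (e.g. gcc -O2 -pthread), time the run for different values of NTHREADS, and compare against the single-thread time to observe the limited speedup.

/* Hypothetical pointer-chasing benchmark (not taken from the paper): each
 * thread repeatedly follows "next" links through a table far larger than
 * cache, so every step is a dependent load dominated by RAM latency. */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NTHREADS   16              /* illustrative thread count (assumption) */
#define TABLE_SIZE (64*1024*1024)  /* 64M entries (~512 MB): far exceeds cache */
#define STEPS      10000000L       /* dependent loads per thread */

static size_t *table;              /* table[i] holds the index of the next hop */

static void *chase(void *arg) {
    size_t i = (size_t)(uintptr_t)arg % TABLE_SIZE;  /* per-thread start index */
    size_t sum = 0;
    for (long s = 0; s < STEPS; s++) {
        i = table[i];              /* dependent load: latency cannot be hidden */
        sum += i;
    }
    return (void *)sum;            /* keeps the compiler from removing the loop */
}

int main(void) {
    table = malloc(TABLE_SIZE * sizeof *table);
    if (!table) { perror("malloc"); return 1; }

    /* Sattolo's algorithm: build a random single-cycle permutation, so the
     * chase walks the whole table in an unpredictable order. */
    for (size_t i = 0; i < TABLE_SIZE; i++) table[i] = i;
    srand(42);
    for (size_t i = TABLE_SIZE - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = table[i]; table[i] = table[j]; table[j] = tmp;
    }

    pthread_t tid[NTHREADS];
    for (size_t t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, chase, (void *)(uintptr_t)t);
    for (size_t t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);

    free(table);
    return 0;
}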
Journal title :
Journal of Symbolic Computation
Serial Year :
2002
Record number :
805627