Title :
Compiling Dynamic Data Structures in Python to Enable the Use of Multi-core and Many-core Libraries
Author :
Ren, Bin ; Agrawal, Gagan
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Abstract :
Programmer productivity considerations are increasing the popularity of interpreted languages like Python. At the same time, these languages clearly fall short for applications where performance is important, even on uniprocessors. In addition, the use of dynamic data structures in a language like Python makes it very hard to use emerging libraries that enable execution on multi-core and many-core architectures. This paper presents a framework for compiling Python to use multi-core and many-core libraries. The key component of our framework is a suite of algorithms for replacing dynamic and/or nested data structures with arrays while minimizing unnecessary data copying costs. This suite comprises a novel use of an existing partial redundancy elimination algorithm, a new demand-driven interprocedural partial redundancy algorithm, a data flow formulation for determining that the contents of a data structure are all of the same type, and a linearization algorithm. We have evaluated our framework using data mining and two linear algebra applications written in pure Python. The key observations were: 1) the code generated by our framework is only 10% to 20% slower than hand-written C code that invokes the same libraries, 2) our optimizations significantly improve performance in most cases, and 3) we outperform interpreted Python and the C++ code generated by an existing tool by one to two orders of magnitude.
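As a rough illustration of the idea of replacing a nested structure with an array, the sketch below flattens a two-level Python list into one contiguous buffer plus row offsets, after checking at run time that every element has the same type. It is not the paper's algorithm: the paper describes compile-time data flow analyses, whereas this check runs dynamically, and the names `is_homogeneous` and `linearize` are hypothetical, chosen only for this example.

```python
from array import array


def is_homogeneous(nested, expected_type=float):
    """Return True if every leaf of the nested list has exactly expected_type.

    Hypothetical helper: a run-time stand-in for the type-uniformity
    analysis that the abstract says is done via data flow at compile time.
    """
    stack = [nested]
    while stack:
        item = stack.pop()
        if isinstance(item, list):
            stack.extend(item)
        elif type(item) is not expected_type:
            return False
    return True


def linearize(rows):
    """Flatten a list of lists of floats into a contiguous buffer.

    Returns (data, offsets), where row i occupies
    data[offsets[i]:offsets[i + 1]]. Only two levels of nesting
    are handled in this sketch.
    """
    if not is_homogeneous(rows):
        raise TypeError("mixed element types; cannot linearize")
    data = array("d")  # contiguous C doubles
    offsets = [0]
    for row in rows:
        data.extend(row)
        offsets.append(len(data))
    return data, offsets


points = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
data, offsets = linearize(points)
second_row = list(data[offsets[1]:offsets[2]])
```

A contiguous buffer of this kind can be handed to a native library without per-element conversion, which is the kind of copying cost the abstract says the framework tries to minimize.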
Keywords :
C++ language; data flow analysis; data mining; data structures; linear algebra; linearisation techniques; multiprocessing systems; optimisation; program compilers; redundancy; C++ code; Python; data copying costs; data flow formulation; data mining; demand driven interprocedural partial redundancy algorithm; dynamic data structures; handwritten C code; linear algebra; linearization algorithm; manycore libraries; multicore libraries; nested data structures; partial redundancy elimination algorithm; Algorithm design and analysis; Arrays; Heuristic algorithms; Libraries; Redundancy; Compilation for multi-core and many-core; Python; Redundancy Elimination;
Conference_Title :
Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on
Conference_Location :
Galveston, TX
Print_ISBN :
978-1-4577-1794-9
DOI :
10.1109/PACT.2011.13