Title :
Scalable global and local hashing strategies for duplicate pruning in parallel A* graph search
Author :
Mahapatra, Nihar R. ; Dutt, Shantanu
Author_Institution :
Dept. of Electr. & Comput. Eng., State Univ. of New York, Buffalo, NY, USA
Date :
7/1/1997
Abstract :
For many applications of the A* algorithm, the state space is a graph rather than a tree. The implication of this for parallel A* algorithms is that different processors may perform significant duplicated work if interprocessor duplicates are not pruned. In this paper, we consider the problem of duplicate pruning in parallel A* graph-search algorithms implemented on distributed-memory machines. A commonly used method for duplicate pruning uses a hash function to associate with each distinct node of the search space a particular processor, to which duplicate nodes arising in different processors are transmitted and thereby pruned. This approach has two major drawbacks. First, load balance is determined solely by the hash function. Second, node transmissions for duplicate pruning are global, which can lead to hot spots and slower message delivery. To overcome these problems, we propose two different duplicate-pruning strategies: 1) to achieve good load balance, we decouple the task of duplicate pruning from load balancing by using a hash function for the former and a load-balancing scheme for the latter; and 2) we introduce a novel search-space partitioning scheme that allocates disjoint parts of the search space to disjoint subcubes in a hypercube (or disjoint processor groups in the target architecture), so that duplicate pruning is achieved with only intrasubcube or adjacent intersubcube communication. Thus, message latency and hot-spot probability are greatly reduced. These duplicate-pruning schemes were implemented on an nCUBE2 hypercube multicomputer to solve the Traveling Salesman Problem (TSP). For uniformly distributed intercity costs, our strategies yield a speedup improvement of 13 to 35 percent on 1,024 processors over previous methods that do not prune any duplicates, and of 13 to 25 percent over the previous hashing-only scheme. For normally distributed data, the corresponding figures are 135 percent and 10 to 155 percent. Finally, we analyze the scalability of our parallel A* algorithms on k-ary n-cube networks in terms of the isoefficiency metric, and show that they have isoefficiency lower and upper bounds of Θ(P log P) and Θ(Pkn²), respectively.
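As a rough illustration of the hashing idea described in the abstract, the following Python sketch (not from the paper; the node encoding, hash choice, processor count, and subcube layout are assumptions made for demonstration) shows how a global hash maps every duplicate of a search node to the same home processor, and how a local variant confines the mapping to one subcube so that duplicate-pruning traffic stays within a processor group.

import hashlib

P = 1024          # total processors (assumed power of two, as on a hypercube)
SUBCUBE_DIM = 4   # processors per subcube = 2**SUBCUBE_DIM (assumption)

def node_key(partial_tour):
    """Canonical byte string for a search node (here, a TSP partial tour)."""
    return ",".join(map(str, partial_tour)).encode()

def global_home(partial_tour):
    """Global hashing: every duplicate of this node, on any processor,
    is sent to the same home processor, where it is detected and pruned."""
    digest = hashlib.md5(node_key(partial_tour)).digest()
    return int.from_bytes(digest[:4], "big") % P

def local_home(partial_tour, subcube_base):
    """Local (subcube) hashing: the search space is partitioned so that all
    duplicates of a node arise within one subcube; the hash then only picks
    a home among that subcube's 2**SUBCUBE_DIM processors, keeping
    duplicate-pruning messages local."""
    digest = hashlib.md5(node_key(partial_tour)).digest()
    offset = int.from_bytes(digest[:4], "big") % (1 << SUBCUBE_DIM)
    return subcube_base + offset

# Example: two processors expanding the same partial tour agree on its home,
# so the later arrival can be pruned there.
tour = (0, 3, 1, 4)
assert global_home(tour) == global_home(tour)
print(global_home(tour), local_home(tour, subcube_base=512))

In this sketch the global mapping determines load balance entirely (the first drawback noted above), whereas the local variant only resolves a home within an already-assigned subcube, leaving load balancing to a separate scheme.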
Keywords :
delays; file organisation; multiprocessor interconnection networks; parallel algorithms; search problems; disjoint processor groups; distributed-memory machines; duplicate pruning; hot spots; hot-spot probability; hypercube; interprocessor duplicates; intersubcube communication; load balance; load balancing; local hashing strategies; lower bounds; message delivery; message latency; nCUBE2 hypercube multicomputer; parallel A* algorithms; parallel A* graph search; scalable global strategies; search-space partitioning scheme; state space; traveling salesman problem; upper bounds; Algorithm design and analysis; Costs; Delay; Hypercubes; Load management; Scalability; State-space methods; Traveling salesman problems; Tree graphs; Upper bound;
Journal_Title :
IEEE Transactions on Parallel and Distributed Systems