• DocumentCode
    3414964
  • Title
    Designing efficient distributed algorithms using sampling techniques
  • Author
    Rajasekaran, S. ; Wei, D.S.L.

  • Author_Institution
    Florida Univ., USA
  • fYear
    1997
  • fDate
    1-5 Apr 1997
  • Firstpage
    397
  • Lastpage
    401
  • Abstract
    Shows the power of sampling techniques in designing efficient distributed algorithms. In particular, we show that, by using sampling techniques, selection can be done on some networks in such a way that the message complexity is independent of the cardinality of the set (file), provided the file size is polynomial in the network size. For example, given a file F of size n and an integer k (1⩽k⩽n), on a p-processor de Bruijn network our deterministic selection algorithm can find the kth smallest key from F using O(p log^3 p) messages and with a communication delay of O(log^3 p), and our randomized selection algorithm can finish the same task using only O(p) messages and a communication delay of O(log p) with high probability, provided the file size is polynomial in the network size. Our randomized selection outperforms the existing approaches in terms of both message complexity and communication delay. The property that the number of messages needed and the communication delay are independent of the size of the file makes our distributed selection schemes extremely attractive in such domains as very large database systems. Making use of our selection algorithms to select pivot element(s), we also develop a near-optimal quicksort-based sorting scheme and a nearly-optimal enumeration sorting scheme for sorting large distributed files on the hypercube and de Bruijn networks. Our algorithms are fully distributed without any a priori central control.
  • Keywords
    communication complexity; database theory; delays; deterministic algorithms; distributed algorithms; distributed databases; multiprocessor interconnection networks; randomised algorithms; sorting; very large databases; communication delay; de Bruijn network; deterministic selection algorithm; distributed selection schemes; efficient distributed algorithms; file size; hypercube networks; large distributed files; message complexity; near-optimal quicksort-based sorting scheme; nearly-optimal enumeration sorting scheme; network size; pivot element selection; randomized selection algorithm; sampling techniques; set cardinality; very large database systems; Algorithm design and analysis; Computer networks; Database systems; Delay; Distributed algorithms; Distributed computing; Hypercubes; Polynomials; Sampling methods; Sorting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing Symposium, 1997. Proceedings., 11th International
  • Conference_Location
    Geneva
  • ISSN
    1063-7133
  • Print_ISBN
    0-8186-7793-7
  • Type
    conf
  • DOI
    10.1109/IPPS.1997.580932
  • Filename
    580932
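
Note on the abstract: the record does not reproduce the paper's algorithms, so the following is only a minimal single-machine sketch of the general idea of sampling-based selection, not the authors' method. It assumes the file is split across nodes modelled as plain Python lists; the function name `sampled_select` and the parameters `sample_per_node`, `small`, and `seed` are hypothetical. Each round, every node contributes a small random sample, two bracketing pivots are chosen from the pooled sample, and each node reports only two counts, so the per-round traffic in this sketch depends on the number of nodes rather than on the file size.

```python
import random


def sampled_select(parts, k, sample_per_node=16, small=64, seed=0):
    """Return the k-th smallest key (1-indexed) across all lists in `parts`.

    Illustrative sketch only: `parts` stands in for the per-node pieces of a
    distributed file; a real network, routing, and message counts are not modelled.
    """
    rng = random.Random(seed)
    parts = [list(p) for p in parts]
    if not 1 <= k <= sum(len(p) for p in parts):
        raise ValueError("k must satisfy 1 <= k <= n")

    while True:
        n = sum(len(p) for p in parts)
        if n <= small:
            # Few enough survivors: gather them and finish directly.
            return sorted(x for p in parts for x in p)[k - 1]

        # Every node contributes a small random sample of its surviving keys.
        sample = sorted(
            x
            for p in parts
            for x in rng.sample(p, min(sample_per_node, len(p)))
        )
        s = len(sample)

        # Choose two pivots that bracket the estimated position of rank k.
        centre = (k * s) // n
        margin = int(s ** 0.5) + 1
        lo = sample[max(0, centre - margin)]
        hi = sample[min(s - 1, centre + margin)]

        # Each node reports two counts (summed here as if by a reduction).
        below = sum(sum(1 for x in p if x < lo) for p in parts)
        upto = sum(sum(1 for x in p if x <= hi) for p in parts)

        # Keep only the range that must contain the answer; re-base k.
        if k <= below:
            kept = [[x for x in p if x < lo] for p in parts]
        elif k <= upto:
            kept = [[x for x in p if lo <= x <= hi] for p in parts]
            k -= below
        else:
            kept = [[x for x in p if x > hi] for p in parts]
            k -= upto

        if sum(len(p) for p in kept) == n:
            # No key was discarded (e.g. many duplicates): finish directly.
            return sorted(x for p in kept for x in p)[k - 1]
        parts = kept
```

For example, with 1000 distinct keys dealt across 8 lists, `sampled_select(parts, 500)` returns the same key as sorting the whole file and indexing position 499. The fallback to a direct sort when a round discards nothing is a design choice made here to guarantee termination; it says nothing about how the paper handles duplicates.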