• DocumentCode
    1194288
  • Title

    Distributed Garbage Collection Algorithms for Timestamped Data

  • Author

    Ramachandran, Umakishore ; Harel, Nissim ; Mandviwala, Hasnain A. ; Knobe, Kathleen

  • Author_Institution
    Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA
  • Volume
    17
  • Issue
    10
  • fYear
    2006
  • Firstpage
    1057
  • Lastpage
    1071
  • Abstract
    There is an important class of interactive multimedia applications that deals with stream data from distributed sources. Indexing the data temporally facilitates ordering individual streams as well as correlating items from different streams. The Stampede programming system organizes stream data into channels, which are distributed, synchronized data structures containing timestamped items. A Stampede program is a data flow graph of threads and channels. Stampede semantics for channels allow concurrent access from multiple threads for input and output. While a channel holds timestamped items, the semantics do not place any restriction on either the production or consumption order of these items. Furthermore, timestamps of items in a channel need not be contiguous. These flexibilities are required due to the dynamic and parallel structure of stream-oriented applications targeted by the Stampede system. Under such circumstances, a key issue is the "garbage collection" (GC) of channel items. In this paper, we present and compare three different GC algorithms: 1) REF is a simple algorithm that keeps a reference count on individual items; 2) TGC is a distributed algorithm for computing a global low watermark for timestamp values of interest in the entire application; 3) DGC is another distributed algorithm that uses information about the dependencies between the producers and consumers of data streams to compute a low watermark local to each node of the data flow graph. DGC can simultaneously eliminate garbage from channels and unneeded computations from threads. In tests performed using an interactive application, DGC achieves a nearly 30 percent reduction in the application memory footprint compared to TGC and REF. DGC and REF are also shown to be more scalable than TGC.
  • Keywords
    data flow graphs; data structures; distributed algorithms; distributed programming; multimedia computing; storage management; DGC algorithm; REF algorithm; TGC algorithm; cluster computing; data flow graph; distributed garbage collection algorithm; distributed programming; multimedia system; Stampede programming; timestamped data; ubiquitous computing; Data flow computing; Data structures; Distributed algorithms; Distributed computing; Flow graphs; Indexing; Production; Streaming media; Watermarking; Yarn; Garbage collection; cluster computing; distributed programming; logical timestamps; multimedia systems; performance evaluation; soft real-time systems; ubiquitous computing; virtual time
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2006.138
  • Filename
    1687877