Title :
Replication in an information filtering system
Author :
Terry, Douglas B.
Author_Institution :
Xerox Palo Alto Res. Center, CA, USA
Abstract :
In the Tapestry system developed at Xerox PARC, users provide queries to filter incoming streams of documents. These queries run continuously over a growing database of electronic mail messages, news articles, and other textual documents. In the current implementation, all filter queries for all users run on a single database server. Those documents that match a user's filter query (or queries) are queued up for the user and can be retrieved directly via an RPC interface or, as is typically the case, can be sent to the user via electronic mail. This paper shows that the design of a distributed information filtering service involves challenges not faced in other distributed applications. Replication is needed, not for fault-tolerance or performance, but simply for scalability. Of course, once replication is provided, it can be used to increase the fault-tolerance of the system (with some additional work). A new technique called filter-based replication is proposed for deciding what to replicate and where.
Keywords :
distributed databases; fault tolerant computing; Tapestry system; Xerox PARC; distributed information filtering service; electronic mail messages; fault-tolerance; filter-based replication; information filtering system; news articles; Databases; Distributed computing; Electronic mail; Fault tolerance; Information filtering; Information filters; Matched filters; Network servers; Scalability;
Conference_Title :
Management of Replicated Data, 1992., Second Workshop on the
Conference_Location :
Monterey, CA
Print_ISBN :
0-8186-3170-8
DOI :
10.1109/MRD.1992.242615