Author_Institution :
Dept. of Electr. Eng., UCLA, Los Angeles, CA, USA
Abstract :
We consider a set of distributed learners that are interconnected via an exogenously-determined network. The learners observe different data streams that are related to common events of interest, which need to be detected in a timely manner. Each learner is equipped with a set of local classifiers, which generate local predictions about the common event based on the locally observed data streams. In this work, we address the following key questions: (1) Can the learners improve their detection accuracy by exchanging and aggregating information? (2) Can the learners improve the timeliness of their detections by forming clusters, i.e., by collecting information only from surrounding learners? (3) Given a specific tradeoff between detection accuracy and detection delay, is it desirable to aggregate a large amount of information, or is it better to focus on the most recent and relevant information? To address these questions, we propose a cooperative online learning scheme in which each learner maintains a set of weight vectors (one for each possible cluster), selects a cluster and the corresponding weight vector, generates a local prediction, disseminates it through the network, and combines all the local predictions received from the learners belonging to the selected cluster by using a weighted majority rule. The optimal cluster and weight vector that a learner should adopt depend on the specific network topology, on the location of the learner in the network, and on the characteristics of the data streams. To learn these optimal values, we propose a general online learning rule that exploits only the feedback that the learners receive. We determine an upper bound for the worst-case mis-detection probability and for the worst-case prediction delay of our scheme in the realizable case.
Numerical simulations show that the proposed scheme is able to successfully adapt to the unknown characteristics of the data streams and can achieve substantial performance gains with respect to a scheme in which the learners act individually or a scheme in which the learners always aggregate all available local predictions. We numerically evaluate the impact that different network topologies have on the final performance. Finally, we discuss several surprising trade-offs.
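The aggregation step described in the abstract (one weight vector per candidate cluster, combined through a weighted majority rule) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the cluster names, the learner indices, the ±1 encoding of predictions, and in particular the multiplicative weight update, which is the classic Weighted Majority update used here as a stand-in because the abstract does not specify the paper's own online learning rule.

```python
from typing import Dict, List


def weighted_majority(predictions: Dict[int, int],
                      weights: Dict[int, float],
                      cluster: List[int]) -> int:
    """Combine +1/-1 local predictions from the learners in `cluster`."""
    score = sum(weights[j] * predictions[j] for j in cluster)
    return 1 if score >= 0 else -1


def multiplicative_update(weights: Dict[int, float],
                          predictions: Dict[int, int],
                          label: int,
                          beta: float = 0.5) -> Dict[int, float]:
    """Stand-in update (NOT the paper's rule): shrink the weight of every
    learner whose local prediction disagreed with the revealed label."""
    return {j: (w * beta if predictions[j] != label else w)
            for j, w in weights.items()}


# Hypothetical setup: learner 0 may use only itself or also neighbours 1 and 2.
clusters = {"local": [0], "neighbors": [0, 1, 2]}
# One weight vector per candidate cluster, all weights initialised to 1.
weights = {name: {j: 1.0 for j in members}
           for name, members in clusters.items()}

predictions = {0: -1, 1: 1, 2: 1}  # local predictions received in this round
label = 1                          # feedback revealed after the prediction

local_only = weighted_majority(predictions, weights["local"], clusters["local"])
final = weighted_majority(predictions, weights["neighbors"], clusters["neighbors"])
weights["neighbors"] = multiplicative_update(weights["neighbors"], predictions, label)
```

In this toy round the learner acting alone predicts -1, while aggregating over the larger cluster yields +1, and the weight of the learner that erred is halved. The sketch leaves out cluster selection and the delay incurred while predictions propagate through the network, both of which the abstract treats as part of the scheme.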
Keywords :
data mining; distributed processing; learning (artificial intelligence); pattern classification; probability; vectors; cluster selection; cooperative online learning scheme; data streams; distributed learners; exogenously-determined network; local classifiers; local prediction generation; network topology; networked learners; online learning rule; real-time distributed stream-mining solutions; timely event detection; weight vectors; weighted majority rule; worst-case mis-detection probability; worst-case prediction delay; Aggregates; Delays; Distributed databases; Estimation; Network topology; Real-time systems; Vectors; Event detection; classification; clustering; distributed learning; ensemble of classifiers; networked learners; online learning; weighted majority;