DocumentCode :
1527808
Title :
Sequential Anomaly Detection in the Presence of Noise and Limited Feedback
Author :
Raginsky, Maxim ; Willett, Rebecca M. ; Horn, Corinne ; Silva, Jorge ; Marcia, Roummel F.
Author_Institution :
Dept. of Electr. & Comput. Eng., Duke Univ., Durham, NC, USA
Volume :
58
Issue :
8
fYear :
2012
Firstpage :
5544
Lastpage :
5562
Abstract :
This paper describes a methodology for detecting anomalies from sequentially observed and potentially noisy data. The proposed approach consists of two main elements: 1) filtering, or assigning a belief or likelihood to each successive measurement based upon our ability to predict it from previous noisy observations and 2) hedging, or flagging potential anomalies by comparing the current belief against a time-varying and data-adaptive threshold. The threshold is adjusted based on the available feedback from an end user. Our algorithms, which combine universal prediction with recent work on online convex programming, do not require computing posterior distributions given all current observations and involve simple primal-dual parameter updates. At the heart of the proposed approach lie exponential-family models which can be used in a wide variety of contexts and applications, and which yield methods that achieve sublinear per-round regret against both static and slowly varying product distributions with marginals drawn from the same exponential family. Moreover, the regret against static distributions coincides with the minimax value of the corresponding online strongly convex game. We also prove bounds on the number of mistakes made during the hedging step relative to the best offline choice of the threshold with access to all estimated beliefs and feedback signals. We validate the theory on synthetic data drawn from a time-varying distribution over binary vectors of high dimensionality, as well as on the Enron email dataset.
Keywords :
convex programming; filtering theory; probability; security of data; Enron email dataset; belief signal estimation; binary vectors; data-adaptive threshold; dynamic thresholding; exponential-family models; feedback signal estimation; filtering element; hedging element; limited feedback; noise feedback; noisy observations; online convex programming; potentially noisy data; sequential anomaly detection; sequential probability assignment; sequentially observed data; slowly varying product distributions; static product distributions; time-varying threshold; universal prediction; Educational institutions; Electronic mail; Mirrors; Noise measurement; Programming; Radio frequency; USA Councils; Anomaly detection; exponential families; filtering; individual sequences; label-efficient prediction; minimax regret; online convex programming (OCP); prediction with limited feedback; sequential probability assignment; universal prediction;
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.2012.2201375
Filename :
6208875