Title :
Polishing the Right Apple: Anytime Classification Also Benefits Data Streams with Constant Arrival Times
Author :
Shieh, Jin ; Keogh, Eamonn
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of California, Riverside, CA, USA
Abstract :
Classification of items taken from data streams requires algorithms that operate in time sensitive and computationally constrained environments. Often, the available time for classification is not known a priori and may change as a consequence of external circumstances. Many traditional algorithms are unable to provide satisfactory performance while supporting the highly variable response times that exemplify such applications. In such contexts, anytime algorithms, which are amenable to trading time for accuracy, have been found to be exceptionally useful and constitute an area of increasing research activity. Previous techniques for improving anytime classification have generally been concerned with optimizing the probability of correctly classifying individual objects. However, as we shall see, serially optimizing the probability of correctly classifying individual objects K times, generally gives inferior results to batch optimizing the probability of correctly classifying K objects. In this work, we show that this simple observation can be exploited to improve overall classification performance by using an anytime framework to allocate resources among a set of objects buffered from a fast arriving stream. Our ideas are independent of object arrival behavior, and, perhaps unintuitively, even in data streams with constant arrival rates our technique exhibits a marked improvement in performance. The utility of our approach is demonstrated with extensive experimental evaluations conducted on a wide range of diverse datasets.
Keywords :
optimisation; pattern classification; probability; resource allocation; anytime algorithm; batch optimization; computationally constrained environment; constant arrival time; data stream classification; individual object; overall classification performance; probability optimization; satisfactory performance; time sensitive; variable response time; anytime algorithms; classification; nearest neighbor; streaming data;
Conference_Titel :
Data Mining (ICDM), 2010 IEEE 10th International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-9131-5
Electronic_ISBN :
1550-4786
DOI :
10.1109/ICDM.2010.120