• DocumentCode
    2208305
  • Title
    Polishing the Right Apple: Anytime Classification Also Benefits Data Streams with Constant Arrival Times
  • Author
    Shieh, Jin ; Keogh, Eamonn
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of California, Riverside, CA, USA
  • fYear
    2010
  • fDate
    13-17 Dec. 2010
  • Firstpage
    461
  • Lastpage
    470
  • Abstract
    Classification of items taken from data streams requires algorithms that operate in time-sensitive and computationally constrained environments. Often, the available time for classification is not known a priori and may change as a consequence of external circumstances. Many traditional algorithms are unable to provide satisfactory performance while supporting the highly variable response times that exemplify such applications. In such contexts, anytime algorithms, which are amenable to trading time for accuracy, have been found to be exceptionally useful and constitute an area of increasing research activity. Previous techniques for improving anytime classification have generally been concerned with optimizing the probability of correctly classifying individual objects. However, as we shall see, serially optimizing the probability of correctly classifying individual objects K times generally gives inferior results to batch optimizing the probability of correctly classifying K objects. In this work, we show that this simple observation can be exploited to improve overall classification performance by using an anytime framework to allocate resources among a set of objects buffered from a fast-arriving stream. Our ideas are independent of object arrival behavior, and, perhaps unintuitively, even in data streams with constant arrival rates our technique exhibits a marked improvement in performance. The utility of our approach is demonstrated with extensive experimental evaluations conducted on a wide range of diverse datasets.
  • Keywords
    optimisation; pattern classification; probability; resource allocation; anytime algorithm; batch optimization; computationally constrained environment; constant arrival time; data stream classification; individual object; overall classification performance; probability optimization; satisfactory performance; time sensitive; variable response time; anytime algorithms; classification; nearest neighbor; streaming data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2010 IEEE 10th International Conference on
  • Conference_Location
    Sydney, NSW
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-9131-5
  • Electronic_ISBN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2010.120
  • Filename
    5694000