• DocumentCode
    2721032
  • Title
    Clustering to forecast sparse time-series data
  • Author
    Jha, Abhay ; Ray, Shubhankar ; Seaman, Brian ; Dhillon, Inderjit S.
  • Author_Institution
    Smart Forecasting, WalmartLabs, USA
  • fYear
    2015
  • fDate
    13-17 April 2015
  • Firstpage
    1388
  • Lastpage
    1399
  • Abstract
    Forecasting accurately is essential to successful inventory planning in retail. Unfortunately, there is not always enough historical data to forecast items individually. This is particularly true in e-commerce, where there is a long tail of low-selling items and where, unlike in physical stores, items are introduced and phased out quite frequently. In such scenarios, it is preferable to forecast items in well-designed groups of similar items, so that data for different items can be pooled together to fit a single model. In this paper, we first discuss the desiderata for such a grouping and how it differs from the traditional clustering problem. We then describe our approach, a scalable local search heuristic that naturally handles the constraints required in this setting while producing solutions competitive with well-known clustering algorithms. We also address the complementary problem of estimating similarity, particularly for new items that have no past sales. Our solution is to regress the sales profile of items against their semantic features, so that given just the semantic features of a new item we can predict its relation to other items in terms of as-yet-unobserved sales. Our experiments demonstrate both the scalability of our approach and its implications for forecast accuracy.
  • Keywords
    constraint handling; electronic commerce; forecasting theory; inventory management; pattern clustering; retail data processing; sales management; time series; clustering algorithms; complementary problem; e-commerce; inventory planning; retail; sales profile; scalable local search heuristics; semantic features; similarity estimation; sparse time-series data forecasting; Correlation; Cost function; Data models; Forecasting; Robustness; Semantics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Engineering (ICDE), 2015 IEEE 31st International Conference on
  • Conference_Location
    Seoul
  • Type
    conf
  • DOI
    10.1109/ICDE.2015.7113385
  • Filename
    7113385