• DocumentCode
    3559477
  • Title

    estMax: Tracing Maximal Frequent Item Sets Instantly over Online Transactional Data Streams

  • Author

    Woo, Ho Jin ; Lee, Won Suk

  • Author_Institution
    Dept. of Comput. Sci., Yonsei Univ., Seoul, South Korea
  • Volume
    21
  • Issue
    10
  • fYear
    2009
  • Firstpage
    1418
  • Lastpage
    1431
  • Abstract
    Frequent item set mining is one of the most challenging issues in descriptive data mining. In general, it tends to produce a large number of frequent item sets. To represent them more compactly, closed or maximal frequent item sets (MFIs) are often used, but finding such item sets over online transactional data streams is difficult because of the requirements of a data stream. This paper proposes a method, estMax, for tracing the set of MFIs instantly over an online data stream. estMax maintains the set of frequent item sets in a prefix tree and extracts all MFIs without any additional superset/subset checking mechanism. When a new transaction is processed, the frequent item sets that the transaction matches maximally are marked in their corresponding prefix-tree nodes as candidate MFIs. At the same time, if any subset of a newly marked item set was already marked as a candidate MFI by a previous transaction, its mark is cleared. With this additional step, the set of MFIs can be extracted at any moment. The performance of estMax is analyzed comparatively through a series of experiments to identify its various characteristics.
  • Keywords
    data mining; transaction processing; descriptive data mining; estMax method; frequent item set mining; online transactional data stream; superset-subset checking mechanism; mining methods and algorithms; maximal frequent item sets; transactional data streams
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • Conference_Location
    12/12/2008 12:00:00 AM
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2008.233
  • Filename
    4711051
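
The mark-and-clear step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' estMax implementation. It assumes the frequent item sets are already known, fixed, and downward-closed, and it finds the maximally matched item sets by explicit subset comparison within one transaction, which the paper avoids. The names `Node`, `build_tree`, `find`, `matched_item_sets`, `process`, and `extract` are hypothetical.

```python
from itertools import combinations


class Node:
    """One prefix-tree node; the path from the root spells an item set."""

    def __init__(self):
        self.children = {}   # item -> Node
        self.marked = False  # True if this item set is a candidate MFI


def build_tree(frequent_item_sets):
    """Insert each frequent item set as a path of sorted items.

    Assumption: the input is downward-closed, so every node is frequent.
    """
    root = Node()
    for item_set in frequent_item_sets:
        node = root
        for item in sorted(item_set):
            node = node.children.setdefault(item, Node())
    return root


def find(root, item_set):
    """Return the node for item_set, or None if it is not in the tree."""
    node = root
    for item in sorted(item_set):
        node = node.children.get(item)
        if node is None:
            return None
    return node


def matched_item_sets(node, transaction, prefix=()):
    """Yield every item set in the tree that the transaction contains."""
    for item, child in node.children.items():
        if item in transaction:
            path = prefix + (item,)
            yield frozenset(path)
            yield from matched_item_sets(child, transaction, path)


def process(root, transaction):
    """Mark maximally matched item sets and clear marks on their subsets."""
    matched = set(matched_item_sets(root, set(transaction)))
    # Naive maximality test, used here only for clarity.
    maximal = [s for s in matched if not any(s < t for t in matched)]
    for item_set in maximal:
        find(root, item_set).marked = True
        for k in range(1, len(item_set)):
            for subset in combinations(sorted(item_set), k):
                node = find(root, subset)
                if node is not None:
                    node.marked = False


def extract(node, prefix=()):
    """Collect all currently marked item sets (the candidate MFIs)."""
    result = []
    for item, child in node.children.items():
        path = prefix + (item,)
        if child.marked:
            result.append(frozenset(path))
        result.extend(extract(child, path))
    return result


# Hypothetical frequent item sets; the maximal ones are {a,b,c} and {c,d}.
frequent = ["a", "b", "c", "d", "ab", "ac", "bc", "abc", "cd"]
root = build_tree(frequent)

process(root, "ab")
after_first = extract(root)      # {a,b} is a candidate at this point

process(root, "abc")             # marks {a,b,c}, clears {a,b}
process(root, "cd")              # marks {c,d}
result = extract(root)
print(sorted(sorted(s) for s in result))  # [['a', 'b', 'c'], ['c', 'd']]
```

In this sketch, a subset that is marked after its superset would stay marked. The abstract does not say how estMax handles that ordering, so the example stream avoids the case.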