• DocumentCode
    2224128
  • Title
    Novel vertical mining on Diffsets structure
  • Author
    Consue, Wootipong ; Kurutach, Werasak
  • Author_Institution
    Dept. of Comput. Eng., Mahanakorn Univ. of Technol., Bangkok, Thailand
  • fYear
    2003
  • fDate
    13-16 Oct. 2003
  • Firstpage
    343
  • Lastpage
    349
  • Abstract
    Mining frequent patterns on vertical data structures usually outperforms mining on the classical horizontal structure, because the vertical layout supports fast frequency counting via intersection operations on transaction identifiers (TIDs). Diffsets, a vertical data representation introduced by M.J. Zaki and K. Gouda (2001), reduces the memory required to store intermediate TIDs during mining. In this paper, we present a new vertical mining algorithm on the Diffset structure called Fast Diffsets Vertical Mining (FDVM). FDVM applies the concept of pattern growth to the Diffset structure, and we show that it outperforms previous methods in mining the complete set of frequent patterns. Our experimental results indicate that FDVM achieves significant performance improvements, especially for large databases, over previously proposed vertical and horizontal mining algorithms.
  • Keywords
    data mining; data structures; database theory; Diffsets structure; Fast Diffsets Vertical Mining algorithm; classical horizontal structure; fast frequency counting; horizontal mining algorithm; intersection operation; memory; pattern growth; performance improvement; transaction identifier; vertical data representation; vertical data structure; Association rules; Data engineering; Data mining; Data structures; Electronic mail; Frequency; Information technology; Itemsets; Multidimensional systems; Transaction databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Agent Technology, 2003. IAT 2003. IEEE/WIC International Conference on
  • Print_ISBN
    0-7695-1931-8
  • Type
    conf
  • DOI
    10.1109/IAT.2003.1241095
  • Filename
    1241095
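
The abstract's two background ideas, frequency counting by TID intersection and the Diffsets representation of Zaki and Gouda, can be illustrated with a minimal sketch. This is not the FDVM algorithm from the paper, which the record does not describe in detail. The toy database and the function names `to_vertical`, `support_by_tidset`, and `support_by_diffset` are assumptions made for illustration only.

```python
from typing import Dict, Set

# Hypothetical toy database (not from the paper): transaction ID -> items.
TRANSACTIONS: Dict[int, Set[str]] = {
    1: {"a", "b", "c"},
    2: {"a", "c"},
    3: {"a", "b"},
    4: {"b", "c"},
    5: {"a", "b", "c"},
}


def to_vertical(db: Dict[int, Set[str]]) -> Dict[str, Set[int]]:
    """Convert a horizontal database into a vertical one: item -> set of TIDs."""
    vertical: Dict[str, Set[int]] = {}
    for tid, items in db.items():
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def support_by_tidset(vertical: Dict[str, Set[int]], x: str, y: str) -> int:
    """Support of {x, y}: size of the intersection of the two TID sets."""
    return len(vertical[x] & vertical[y])


def support_by_diffset(vertical: Dict[str, Set[int]], x: str, y: str) -> int:
    """Support of {x, y} via a diffset.

    The diffset of {x, y} relative to x holds the TIDs that contain x
    but not y, so support({x, y}) = support({x}) - |diffset|.
    """
    diffset_xy = vertical[x] - vertical[y]
    return len(vertical[x]) - len(diffset_xy)


if __name__ == "__main__":
    vertical = to_vertical(TRANSACTIONS)
    for x, y in [("a", "b"), ("a", "c"), ("b", "c")]:
        print(x, y, support_by_tidset(vertical, x, y),
              support_by_diffset(vertical, x, y))
```

Both functions return the same support. The difference lies in what must be kept in memory: the diffset stores only the TIDs lost when an itemset is extended, which is often much smaller than the full intersection on dense data.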