• DocumentCode
    3125373
  • Title

    Diverse Dimension Decomposition of an Itemset Space

  • Author

    Tsytsarau, Mikalai ; Bonchi, Francesco ; Gionis, Aristides ; Palpanas, Themis

  • fYear
    2011
  • fDate
    11-14 Dec. 2011
  • Firstpage
    725
  • Lastpage
    734
  • Abstract
    We introduce the problem of diverse dimension decomposition in transactional databases. A dimension is a set of mutually exclusive itemsets, and our problem is to find a decomposition of the itemset space into dimensions that are orthogonal to each other and that provide high coverage of the input database. The mining framework we propose effectively represents a dimensionality-reducing transformation from the space of all items to the space of orthogonal dimensions. Our approach relies on information-theoretic concepts, and we are able to formulate the dimension-finding problem with a single objective function that simultaneously captures constraints on coverage, exclusivity and orthogonality. We describe an efficient greedy method for finding diverse dimensions from transactional databases. The experimental evaluation of the proposed approach using two real datasets, flickr and delicious, demonstrates the effectiveness of our solution. Although we are motivated by applications in the collaborative tagging domain, we believe that the mining task we introduce in this paper is general enough to be useful in other application domains.
  • Keywords
    database management systems; transaction processing; collaborative tagging; diverse dimension decomposition; information theoretic concepts; itemset space; orthogonal dimensions; transactional databases; Art; Data mining; Entropy; Itemsets; Joints; Mutual information; dimensionality reduction; itemset mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2011 IEEE 11th International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4577-2075-8
  • Type

    conf

  • DOI
    10.1109/ICDM.2011.58
  • Filename
    6137277