  • DocumentCode
    3104687
  • Title
    Cluster Ranking with an Application to Mining Mailbox Networks
  • Author
    Bar-Yossef, Ziv ; Guy, Ido ; Lempel, Ronny ; Maarek, Yoëlle S. ; Soroka, Vladimir
  • Author_Institution
    Dept. of Electr. Eng., Technion & Google Inc., Haifa
  • fYear
    2006
  • fDate
    18-22 Dec. 2006
  • Firstpage
    63
  • Lastpage
    74
  • Abstract
    We initiate the study of a new clustering framework, called cluster ranking. Rather than simply partitioning a network into clusters, a cluster ranking algorithm also orders the clusters by their strength. To this end, we introduce a novel strength measure for clusters, the integrated cohesion, which is applicable to arbitrary weighted networks. We then present C-Rank, a new cluster ranking algorithm. Given a network with arbitrary pairwise similarity weights, C-Rank creates a list of overlapping clusters and ranks them by their integrated cohesion. We provide extensive theoretical and empirical analysis of C-Rank and show that it is likely to have high precision and recall. Our experiments focus on mining mailbox networks. A mailbox network is an egocentric social network, consisting of contacts with whom an individual exchanges email. Ties among contacts are represented by the frequency of their co-occurrence in message headers. C-Rank is well suited to mining such networks, since they are rich in overlapping communities of highly variable strengths. We demonstrate the effectiveness of C-Rank on the Enron data set, consisting of 130 mailbox networks.
  • Keywords
    data mining; electronic mail; pattern clustering; Enron data set; arbitrary weighted network; cluster ranking algorithm; egocentric social network; mailbox network mining; Application software; Clustering algorithms; Clustering methods; Computer science; Frequency; Information retrieval; Needles; Particle separators; Partitioning algorithms; Social network services
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2006. ICDM '06. Sixth International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1550-4786
  • Print_ISBN
    0-7695-2701-7
  • Type
    conf
  • DOI
    10.1109/ICDM.2006.35
  • Filename
    4053035