• DocumentCode
    1643872
  • Title
    Discovery of email communication networks from the Enron corpus with a genetic algorithm using social network analysis
  • Author
    Wilson, Garnett; Banzhaf, Wolfgang
  • Author_Institution
    Dept. of Comput. Sci., Memorial Univ. of Newfoundland, St. John's, NL
  • fYear
    2009
  • Firstpage
    3256
  • Lastpage
    3263
  • Abstract
    During the legal investigation of Enron Corporation, the U.S. Federal Energy Regulatory Commission (FERC) made public a substantial data set of the company's internal corporate emails. This work presents a genetic algorithm (GA) approach to social network analysis (SNA) using the Enron corpus. Three SNA metrics (degree, density, and proximity prestige) were applied to the detection of networks with high email activity and the presence of important actors with respect to email transactions. Quantitative analysis revealed that density and proximity prestige captured networks of high activity more so than degree. Subsequent qualitative analysis indicated that there were trade-offs in the selection of SNA metrics. Examination of the discovered social networks showed that density and proximity prestige isolated networks involving key actors to a greater extent than degree. In particular, density picked out interesting patterns in terms of email volume, while proximity prestige better isolated key actors at Enron. The roles of the particular actors picked out by the networks, as reasons for their prominence, are also discussed.
  • Keywords
    electronic mail; genetic algorithms; social networking (online); Enron corpus; email communication networks; genetic algorithm; qualitative analysis; quantitative analysis; social network analysis; Algorithm design and analysis; Communication networks; Electronic mail; Evolutionary computation; Genetic algorithms; Law; Legal factors; Natural language processing; Social network services; Sparse matrices
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation, 2009. CEC '09. IEEE Congress on
  • Conference_Location
    Trondheim
  • Print_ISBN
    978-1-4244-2958-5
  • Electronic_ISBN
    978-1-4244-2959-2
  • Type
    conf
  • DOI
    10.1109/CEC.2009.4983357
  • Filename
    4983357