  • DocumentCode
    3425899
  • Title
    Gender identification from E-mails
  • Author
    Cheng, Na ; Chen, Xiaoling ; Chandramouli, R. ; Subbalakshmi, K.P.
  • Author_Institution
    Dept. of ECE, Stevens Inst. of Technol., Hoboken, NJ
  • fYear
    2009
  • fDate
    March 30 2009-April 2 2009
  • Firstpage
    154
  • Lastpage
    158
  • Abstract
    In this paper, we investigate gender identification for short-length, multi-genre, content-free e-mails. To our knowledge, we are the first to introduce psycholinguistic and gender-linked cues for this problem, alongside traditional stylometric features. Decision tree and support vector machine learning algorithms are used to identify the gender of the author of a given e-mail. The experimental results show that our approach is promising, with an average accuracy of 82.2%.
  • Keywords
    decision trees; electronic mail; learning (artificial intelligence); support vector machines; content-free e-mail; decision tree; gender identification; gender-linked cues; multi-genre e-mail; short length e-mail; stylometric features; support vector machines learning algorithms; Computer mediated communication; Decision trees; Electronic mail; Helium; Internet; Machine learning; Natural languages; Psychology; Support vector machine classification; Support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Computational Intelligence and Data Mining, 2009. CIDM '09. IEEE Symposium on
  • Conference_Location
    Nashville, TN
  • Print_ISBN
    978-1-4244-2765-9
  • Type
    conf
  • DOI
    10.1109/CIDM.2009.4938643
  • Filename
    4938643