• DocumentCode
    2052167
  • Title

    Age Detection in Chat

  • Author

    Tam, Jenny ; Martell, Craig H.

  • Author_Institution
    Dept. of Comput. Sci., Naval Postgrad. Sch., Monterey, CA, USA
  • fYear
    2009
  • fDate
    14-16 Sept. 2009
  • Firstpage
    33
  • Lastpage
    39
  • Abstract
    This paper presents the results of using statistical analysis and automatic text categorization to identify an author's age group based on the author's online chat posts. A naive Bayesian classifier and support vector machine (SVM) model were used. The SVM model experiments generated an f-score measurement of 0.996 on test data distinguishing teens from adults. We also introduce an alternative method for generating "stop words" that chooses n-grams based on their relative distribution across the classes.
  • Keywords
    Bayes methods; electronic messaging; pattern classification; statistical analysis; support vector machines; text analysis; age detection; author online chat posts; automatic text categorization; f-score measurement; naive Bayesian classifier; stop words generation; support vector machine model; Bayesian methods; Computer science; Internet; Law enforcement; Niobium compounds; Support vector machine classification; Testing; Text categorization; Naïve Bayesian Classifier; Support Vector Machine; age classification; online chat; stop words
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing, 2009. ICSC '09. IEEE International Conference on
  • Conference_Location
    Berkeley, CA
  • Print_ISBN
    978-1-4244-4962-0
  • Electronic_ISBN
    978-0-7695-3800-6
  • Type

    conf

  • DOI
    10.1109/ICSC.2009.37
  • Filename
    5298540