Title :
Age Detection in Chat
Author :
Tam, Jenny ; Martell, Craig H.
Author_Institution :
Dept. of Comput. Sci., Naval Postgrad. Sch., Monterey, CA, USA
Abstract :
This paper presents the results of using statistical analysis and automatic text categorization to identify an author's age group based on the author's online chat posts. A naive Bayesian classifier and support vector machine (SVM) model were used. The SVM model experiments generated an f-score measurement of 0.996 on test data distinguishing teens from adults. We also introduce an alternative method for generating "stop words" that chooses n-grams based on their relative distribution across the classes.
Keywords :
Bayes methods; electronic messaging; pattern classification; statistical analysis; support vector machines; text analysis; age detection; author online chat posts; automatic text categorization; f-score measurement; naive Bayesian classifier; statistical analysis; stop words generation; support vector machine model; Bayesian methods; Computer science; Internet; Law enforcement; Niobium compounds; Statistical analysis; Support vector machine classification; Support vector machines; Testing; Text categorization; Naïve Bayesian Classifier; Support Vector Machine; age classification; online chat; stop words;
Conference_Title :
Semantic Computing, 2009. ICSC '09. IEEE International Conference on
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-4962-0
Electronic_ISBN :
978-0-7695-3800-6
DOI :
10.1109/ICSC.2009.37