DocumentCode
2372903
Title
Author detection by using different term weighting schemes
Author
Tufekci, P. ; Uzun, Ersin
Author_Institution
Bilgisayar Muhendisligi Bolumu, Namik Kemal Univ., Tekirdag, Turkey
fYear
2013
fDate
24-26 April 2013
Firstpage
1
Lastpage
4
Abstract
This study investigates the impact of term weighting on author detection, treated as a type of text classification. Each text is represented by a feature vector whose features are word stems and whose values are weights obtained by applying 14 different term weighting schemes. The performance of these feature vectors in author detection is tested on 3 different datasets with classification methods such as Naïve Bayes Multinomial (NBM), Support Vector Machine (SVM), Decision Tree (C4.5), and Random Forest (RF), and the results are compared with each other. The most successful classifier for predicting the author of an article is found to be SVM, with 98.75% mean accuracy; the most successful term weighting scheme is found to be ACTF.IDF.(ICF+1), with 91.54% general mean accuracy.
Keywords
Bayes methods; authoring systems; decision trees; support vector machines; text analysis; C4.5 method; NBM method; Naive Bayes multinomial method; RF method; SVM classifier; author detection; decision tree; feature vector; random forest method; stem words; support vector machine; term weighting schemes; text classification; text representation; weight values; Accuracy; Educational institutions; Feature extraction; Radio frequency; Support vector machine classification; Text categorization
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing and Communications Applications Conference (SIU), 2013 21st
Conference_Location
Haspolat
Print_ISBN
978-1-4673-5562-9
Electronic_ISBN
978-1-4673-5561-2
Type
conf
DOI
10.1109/SIU.2013.6531190
Filename
6531190