DocumentCode
3138431
Title
A New Approach to Email Classification Using Concept Vector Space Model
Author
Zeng, Chao ; Lu, Zhao ; Gu, Junzhong
Author_Institution
Inst. of Comput. Applic., East China Normal Univ., Shanghai
Volume
3
fYear
2008
fDate
13-15 Dec. 2008
Firstpage
162
Lastpage
166
Abstract
Content-based Email classification methods generally use the vector space model (VSM). The model is constructed from the frequency of each independent word appearing in the Email content. A frequency-based VSM does not take the context of a word into account, so the feature vectors cannot accurately represent Email content, which results in inaccurate classification. This paper presents a new approach to Email classification based on a concept vector space model using WordNet. In our approach, we use WordNet to extract high-level information on categories during the training process by replacing terms in the feature vector with synonymy sets and considering the hypernymy-hyponymy relations between synonymy sets. We design an Email classification system based on the concept VSM and carry out a series of experiments. The results show that our approach can improve the accuracy of Email classification, especially when the training set is small.
Keywords
electronic mail; WordNet; concept vector space model; email classification; hypernymy-hyponymy relation; synonymy sets; Classification algorithms; Classification tree analysis; Computer applications; Conferences; Data mining; Decision trees; Filtering; Frequency; Neural networks; Statistics; Concept Vector; Email Classification; VSM; WordNet;
fLanguage
English
Publisher
IEEE
Conference_Titel
Future Generation Communication and Networking Symposia, 2008. FGCNS '08. Second International Conference on
Conference_Location
Sanya
Print_ISBN
978-1-4244-3430-5
Electronic_ISBN
978-0-7695-3546-3
Type
conf
DOI
10.1109/FGCNS.2008.7
Filename
4813571