DocumentCode
3228821
Title
Binary Cybergenre Classification Using Theoretic Feature Measures
Author
Dong, Lie ; Walters, Christine ; Duffy, Jack ; Shepherd, Michael
Author_Institution
Dalhousie Univ., Halifax, NS
fYear
2006
fDate
18-22 Dec. 2006
Firstpage
313
Lastpage
316
Abstract
In this study, we investigated automatic genre classification for three common types of Web pages. In a set of controlled experiments, we examined the effect of three theoretic feature selection measures, a range of feature set sizes, and three machine classifiers on the accuracy of Web page classification. Our results are encouraging. We conclude that for binary classification tasks, at least for these Web page genres, it is possible to reach satisfying results with small content-based feature sets generated with a sound feature selection measure. Furthermore, there is no evidence of interaction between these feature selection measures and the machine classifiers used.
Keywords
Internet; classification; feature extraction; information retrieval; search engines; support vector machines; automatic Web page genre classification; binary cybergenre classification; content-based feature sets; machine classifier; theoretic feature selection measure; Automatic control; HTML; Information retrieval; Lifting equipment; Robustness; Search engines; Size control; Size measurement; Uniform resource locators; Web pages
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on
Conference_Location
Hong Kong
Print_ISBN
0-7695-2747-7
Type
conf
DOI
10.1109/WI.2006.50
Filename
4061384