DocumentCode
1921600
Title
The importance of stop word removal on recall values in text categorization
Author
Silva, Catarina ; Ribeiro, Bernardete
Author_Institution
Dept. de Engenharia Inf., Coimbra Univ., Portugal
Volume
3
fYear
2003
fDate
20-24 July 2003
Firstpage
1661
Abstract
Given a data set and a learning task such as classification, there are two main motives for performing some kind of data set reduction. On the one hand, reduction may improve the algorithm's performance. On the other hand, a smaller data set can save storage space and computation time. Our purpose is to determine the importance of several basic reduction techniques for Support Vector Machines by comparing their relative performance improvement when applied to the standard REUTERS-21578 benchmark.
Keywords
classification; data reduction; indexing; information retrieval; support vector machines; text editing; REUTERS-21578 benchmark; algorithm performance improvement; data set reduction; learning task; stop word removal; storage space; support vector machines; text categorization; text classification; Humans; Information retrieval; Internet; Large-scale systems; Support vector machine classification; Support vector machines; Taxonomy; Text categorization; Text mining; Text processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks, 2003. Proceedings of the International Joint Conference on
ISSN
1098-7576
Print_ISBN
0-7803-7898-9
Type
conf
DOI
10.1109/IJCNN.2003.1223656
Filename
1223656