Title :
Text classification in the Turkish marketing domain for context sensitive ad distribution
Author :
Engin, Melih ; Can, Tolga
Author_Institution :
Dept. of Comput. Eng., Middle East Tech. Univ., Ankara, Turkey
Abstract :
In this paper, we construct and compare several feature extraction approaches to find a better solution for classifying Turkish Web documents in the marketing domain. We derive our feature extraction techniques from characteristics of the Turkish language, the structure of Web documents, and online content in the marketing domain. We form datasets in different feature spaces and apply several support vector machine (SVM) configurations to them. We conduct our study with the performance needs of practical context-sensitive systems in mind. Our results show that linear kernel classifiers achieve the best performance, in terms of both accuracy and speed, on text documents represented as keyword root features.
Keywords :
Internet; data mining; document handling; information retrieval; learning (artificial intelligence); marketing data processing; natural language processing; support vector machines; text analysis; Turkish Web documents; Turkish language; Turkish marketing domain; context sensitive ad distribution; feature extraction techniques; linear kernel classifiers; machine learning; support vector machine; text classification; Advertising; Feature extraction; Kernel; Merchandise; Support vector machine classification; Text categorization; Artificial Intelligence; World Wide Web
Conference_Title :
Computer and Information Sciences, 2009. ISCIS 2009. 24th International Symposium on
Conference_Location :
Guzelyurt
Print_ISBN :
978-1-4244-5021-3
Electronic_ISBN :
978-1-4244-5023-7
DOI :
10.1109/ISCIS.2009.5291861