Title :
Dynamic text classifier based on search engine features
Author :
Machhour, Hamid ; Kassou, Ismail
Author_Institution :
ENSIAS, Dept. of Comput. Sci. & Decision Support, Mohammed V Souissi Univ., Rabat, Morocco
Abstract :
Search engines and text categorization are two almost inseparable research areas: where one is studied, the other is referred to sooner or later. Automatic text categorization has become more important with the enormous increase in online information, and text classifiers are often used to help search engines classify indexed documents. The main idea presented in this paper is to use a search engine as a text classifier. A search engine can take advantage of its scoring performance to categorize a new document without building and using a separate categorization model. The K Nearest Neighbors (KNN) principle is applied, with the search engine score as the similarity measure. This approach is highly dependent on the scoring quality of the search engine used. It is a simple approach, but it can be competitive with other, more complex categorization models. The method is also useful as a kind of on-the-fly categorization when indexing a new document. Through its evolving index, the search engine becomes a dynamic classifier, because any document that has recently joined the index participates in the categorization of other new documents.
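The idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: `TinySearchEngine` is a toy in-memory index with a TF-IDF style score standing in for a real search engine, `classify_and_index` is a hypothetical helper, and the score-weighted vote and `k=3` are arbitrary choices, since the abstract does not say how the neighbors' votes are combined.

```python
import math
from collections import Counter, defaultdict


class TinySearchEngine:
    """Toy in-memory index with a TF-IDF style score (a stand-in, not a real engine)."""

    def __init__(self):
        self.docs = []  # list of (term-frequency Counter, category label)

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def add(self, text, category):
        self.docs.append((Counter(self.tokenize(text)), category))

    def search(self, text, k):
        """Return up to k (score, category) hits, best first."""
        n = len(self.docs)
        df = Counter()  # document frequency of each term
        for tf, _ in self.docs:
            df.update(tf.keys())
        query = set(self.tokenize(text))
        hits = []
        for tf, category in self.docs:
            score = sum(tf[t] * math.log(1 + n / df[t]) for t in query if t in tf)
            if score > 0:
                hits.append((score, category))
        hits.sort(key=lambda hit: hit[0], reverse=True)
        return hits[:k]


def classify_and_index(engine, text, k=3):
    """Vote among the k best hits, weighted by score, then index the document."""
    votes = defaultdict(float)
    for score, category in engine.search(text, k):
        votes[category] += score
    label = max(votes, key=votes.get) if votes else None
    if label is not None:
        engine.add(text, label)  # the new document joins the index
    return label


engine = TinySearchEngine()
engine.add("stock market shares trading profit", "finance")
engine.add("bank interest rates loan credit", "finance")
engine.add("football match goal league striker", "sport")
engine.add("tennis player serve tournament match", "sport")

# No indexed document shares a term with this query yet.
before = engine.search("scored late", k=3)

# Classifying this document also adds it to the index.
first = classify_and_index(engine, "the striker scored a late goal in the match")

# The same query now finds a neighbor: the document indexed one step earlier.
after = classify_and_index(engine, "scored late")
```

In this sketch the query "scored late" has no neighbors until the first document is classified and indexed, which is the "dynamic classifier" behaviour the abstract describes. With a production engine, the `search` method would be replaced by that engine's own query call and score.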
Keywords :
indexing; pattern classification; search engines; text analysis; KNN principle; automatic text categorization; document indexing; dynamic text classifier; indexed document classification; k-nearest neighbors principle; online information; scoring performances; search engine features; search engine score; Indexing; Search engines; Support vector machines; Text categorization; Training; Vectors; K Nearest Neighbors; Search Engine; Text Categorization; Text Indexing
Conference_Title :
ISKO-Maghreb, 2013 3rd International Symposium
Conference_Location :
Marrakech
Print_ISBN :
978-1-4799-3391-4
DOI :
10.1109/ISKO-Maghreb.2013.6728125