DocumentCode
185586
Title
Developing a text classifier with constrained development and execution time
Author
Budiselic, I. ; Delac, G. ; Vladimir, Klimenko
Author_Institution
Fac. of Electr. Eng. & Comput., Univ. of Zagreb, Zagreb, Croatia
fYear
2014
fDate
26-30 May 2014
Firstpage
1170
Lastpage
1175
Abstract
The aim of this paper is to show that an accurate and efficient text classifier for relatively simple problem domains can be created in only a few hours of development time. The motivating example is a recent HackerRank competition problem that tasked competitors with creating a classifier for questions from the popular question-and-answer platform StackExchange. The paper describes the key components of one solution to this problem and briefly reviews the naive Bayes classifier on which the solution is based. The discussion focuses on feature selection and example representation, which were the key challenges during the development of this classifier. We also analyze how the number of features affects accuracy, training time, classification time, and the size of the resulting classifier and of the representation of the training examples, all of which were important characteristics for the competition. The described classifier achieved slightly over 89% accuracy on the hidden question set, while the winning submission achieved around 92%.
Keywords
Bayes methods; pattern classification; question answering (information retrieval); text analysis; HackerRank competition problem; constrained development; example representation; execution time; feature selection; naive Bayes classifier; question and answer platform StackExchange; text classifier; Accuracy; Computer architecture; Computers; Generators; Organizations; Text categorization; Training;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2014 37th International Convention on
Conference_Location
Opatija
Print_ISBN
978-953-233-081-6
Type
conf
DOI
10.1109/MIPRO.2014.6859745
Filename
6859745