• DocumentCode
    3011934
  • Title
    Performance maximization for question classification by subset tree kernel using support vector machines
  • Author
    Rahman, Muhammad Arifur; Scurtu, Vitalie
  • Author_Institution
    Dept. of Phys., Jahangirnagar Univ., Dhaka
  • fYear
    2008
  • fDate
    24-27 Dec. 2008
  • Firstpage
    230
  • Lastpage
    235
  • Abstract
    Question answering systems use information retrieval (IR) and information extraction (IE) methods to retrieve documents containing a valid answer. Question classification plays an important role in the question answering framework by reducing the gap between question and answer. This paper presents our research on automatic question classification through machine learning approaches. We experimented with Support Vector Machines (SVM) using kernel methods. An effective way to integrate syntactic structures for question classification in machine learning algorithms is the use of tree kernel (TK) functions. Here we use the SubSet Tree (SST) kernel with bag of words. The trade-off between training error and margin, the cost-factor, and the decay factor have a significant impact when SVM is used with this kernel type. The experiments determined the individual impact of each of these three parameters and then the combined effect of the trade-off between training error and margin and the cost-factor. Based on these results we also identify some hyperplanes that can maximize performance. On a standard data set, the outcomes of our question classification experiments are promising.
  • Keywords
    classification; error statistics; information retrieval; learning (artificial intelligence); optimisation; support vector machines; trees (mathematics); automatic question classification; cost-factor; decay factor; document retrieval; information extraction; information retrieval; machine learning approach; maximisation; question answering system; subset tree kernel function; support vector machine; training error statistics; Classification tree analysis; Computer interfaces; Information retrieval; Kernel; Machine learning; Machine learning algorithms; Optical computing; Support vector machine classification; Support vector machines; Text categorization; Precision; Question Answering; Question Classification; Recall; SST; SVM; kernel;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Technology, 2008. ICCIT 2008. 11th International Conference on
  • Conference_Location
    Khulna
  • Print_ISBN
    978-1-4244-2135-0
  • Electronic_ISBN
    978-1-4244-2136-7
  • Type
    conf
  • DOI
    10.1109/ICCITECHN.2008.4802979
  • Filename
    4802979