• DocumentCode
    1661402
  • Title
    Applying machine learning to subject classification and subject description for information retrieval
  • Author
    Cunningham, Sally Jo ; Summers, Brent
  • Author_Institution
    Dept. of Comput. Sci., Waikato Univ., Hamilton, New Zealand
  • fYear
    1995
  • Firstpage
    243
  • Lastpage
    246
  • Abstract
    This paper describes an experiment in applying a standard supervised machine learning algorithm (C4.5) to the problem of developing subject classification rules for documents. This algorithm is found to produce surprisingly concise models of document classifications. While the models are highly accurate on the training sets, evaluation over test sets or through cross-validation shows a significant decrease in classification accuracy. Given the difficult nature of the experimental task, however, the results of this investigation are promising and merit further study. An additional algorithm, 1R, is shown to be highly effective in generating lists of candidate terms for subject descriptions.
  • Keywords
    classification; indexing; information retrieval; information retrieval systems; learning (artificial intelligence); vocabulary; 1R; C4.5; classification accuracy; classification rules; cross-validation; document classifications; experiment; machine learning; subject classification; subject description; supervised learning; training sets; Buildings; Computer science; Degradation; Information retrieval; Keyword search; Machine learning; Machine learning algorithms; Neural networks; Standards development; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Artificial Neural Networks and Expert Systems, 1995. Proceedings., Second New Zealand International Two-Stream Conference on
  • Conference_Location
    Dunedin
  • Print_ISBN
    0-8186-7174-2
  • Type
    conf
  • DOI
    10.1109/ANNES.1995.499481
  • Filename
    499481