• DocumentCode
    629746
  • Title
    Thresholding strategies for large scale multi-label text classifier
  • Author
    Draszawka, Karol ; Szymanski, Janusz
  • Author_Institution
    Gdansk Univ. of Technol., Gdansk, Poland
  • fYear
    2013
  • fDate
    6-8 June 2013
  • Firstpage
    350
  • Lastpage
    355
  • Abstract
    This article presents an overview of thresholding methods for labeling objects given a list of candidate classes' scores. These methods are essential to multi-label classification tasks, especially when there are many classes organized in a hierarchy. The presented techniques are evaluated using a state-of-the-art dedicated classifier on medium-scale text corpora extracted from Wikipedia. The results show that classification performance can be improved with new class-specific thresholding methods, which set decision values for each candidate class separately.
  • Keywords
    encyclopaedias; text analysis; Wikipedia; candidate classes scores; class-specific thresholding methods; decision values; large scale multilabel text classifier; medium scale text corpora; objects labeling; Decision support systems; Electronic publishing; Encyclopedias; Internet; Training; Vectors; LSHTC multi-label classification; score-based classifier; thresholding strategies
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Human System Interaction (HSI), 2013 The 6th International Conference on
  • Conference_Location
    Sopot
  • ISSN
    2158-2246
  • Print_ISBN
    978-1-4673-5635-0
  • Type
    conf
  • DOI
    10.1109/HSI.2013.6577846
  • Filename
    6577846