  • DocumentCode
    1787406
  • Title
    QuIET: A Text Classification Technique Using Automatically Generated Span Queries
  • Author
    Polychronopoulos, Vassilis; Pendar, Nick; Jeffery, Shawn R.
  • Author_Institution
    Univ. of California, Santa Cruz, Santa Cruz, CA, USA
  • fYear
    2014
  • fDate
    16-18 June 2014
  • Firstpage
    52
  • Lastpage
    59
  • Abstract
    We propose a novel algorithm, QuIET, for binary classification of texts. The method automatically generates a set of span queries from a set of annotated documents and uses the query set to categorize unlabeled texts. The models that QuIET generates are human-understandable. We describe the method and evaluate it empirically against Support Vector Machines, demonstrating comparable performance on a known curated dataset and superior performance on some categories of noisy local-business data. We also describe an active learning approach that is applicable to QuIET and can boost its performance.
  • Keywords
    learning (artificial intelligence); pattern classification; query processing; support vector machines; text analysis; QuIET technique; active learning approach; annotated documents; automatically generated span queries; noisy local businesses data; support vector machines; text binary classification; text categorization; text classification technique; Arrays; Business; Feature extraction; Measurement; Support vector machines; Text categorization; Training; automatically generated; human understandable; span queries; text categorization; text classification; text classifier; text tagging;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing (ICSC), 2014 IEEE International Conference on
  • Conference_Location
    Newport Beach, CA
  • Print_ISBN
    978-1-4799-4002-8
  • Type
    conf
  • DOI
    10.1109/ICSC.2014.18
  • Filename
    6882001