DocumentCode
1787406
Title
QuIET: A Text Classification Technique Using Automatically Generated Span Queries
Author
Polychronopoulos, Vassilis ; Pendar, Nick ; Jeffery, Shawn R.
Author_Institution
Univ. of California, Santa Cruz, Santa Cruz, CA, USA
fYear
2014
fDate
16-18 June 2014
Firstpage
52
Lastpage
59
Abstract
We propose a novel algorithm, QuIET, for binary classification of texts. The method automatically generates a set of span queries from a set of annotated documents and uses the query set to categorize unlabeled texts. QuIET generates human-understandable models. We describe the method and evaluate it empirically against Support Vector Machines, demonstrating comparable performance on a known curated dataset and superior performance for some categories of noisy local businesses data. We also describe an active learning approach that is applicable to QuIET and can boost its performance.
Keywords
learning (artificial intelligence); pattern classification; query processing; support vector machines; text analysis; QuIET technique; active learning approach; annotated documents; automatically generated span queries; noisy local businesses data; support vector machines; text binary classification; text categorization; text classification technique; Arrays; Business; Feature extraction; Measurement; Support vector machines; Text categorization; Training; automatically generated; human understandable; span queries; text categorization; text classification; text classifier; text tagging;
fLanguage
English
Publisher
IEEE
Conference_Titel
Semantic Computing (ICSC), 2014 IEEE International Conference on
Conference_Location
Newport Beach, CA
Print_ISBN
978-1-4799-4002-8
Type
conf
DOI
10.1109/ICSC.2014.18
Filename
6882001