DocumentCode
134276
Title
Document classification based on c
Author
Rong Liu ; Dong Wang ; Chao Xing
Author_Institution
Center for Speech & Language Technol. (CSLT), Tsinghua Univ., Beijing, China
fYear
2014
fDate
12-14 Sept. 2014
Firstpage
413
Lastpage
413
Abstract
Summary form only given. This paper proposes a document classification approach based on word vectors. By learning context relationships, word vectors may represent fine-grained semantic elements. We assume that these low-level semantic implementations of words can be composed to represent high-level semantic concepts, and thus that the semantic content of a document can be derived from that of the words it contains. Our experiments confirm that, even with the simplest pooling method, the document representation based on word vectors can deliver good performance on text classification tasks. Compared with the conventional LDA-based approach, the word vector approach is more stable, efficient, and generalizable.
Keywords
document handling; learning (artificial intelligence); pattern classification; word processing; context relationship learning; document classification approach; document representation; fine-grained semantic element representation; high-level semantic concept representation; low-level semantic word implementation; pooling method; text classification tasks; word composition; word vector approach; Abstracts; Educational institutions; Petroleum; Semantics; Software; Speech; Vectors; LDA; document classification; topic model; word vector;
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location
Singapore
Type
conf
DOI
10.1109/ISCSLP.2014.6936669
Filename
6936669