DocumentCode :
134276
Title :
Document classification based on c
Author :
Rong Liu ; Dong Wang ; Chao Xing
Author_Institution :
Center for Speech & Language Technol. (CSLT), Tsinghua Univ., Beijing, China
fYear :
2014
fDate :
12-14 Sept. 2014
Firstpage :
413
Lastpage :
413
Abstract :
Summary form only given. This paper proposes a document classification approach based on word vectors. By learning context relationships, word vectors may represent fine-grained semantic elements. We assume that these low-level semantic representations of words can be composed to represent high-level semantic concepts, and thus that the semantic content of a document can be derived from that of the words it contains. Our experiments confirm that, even with the simplest pooling method, a document representation based on word vectors can deliver good performance on text classification tasks. Compared with the conventional LDA-based approach, the word vector approach is more stable, efficient, and generalizable.
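The pooling idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy two-dimensional vectors in `WORD_VECTORS`, the choice of average pooling as "the simplest pooling method", and the nearest-centroid classifier (the abstract does not say which classifier the authors used). The function names `document_vector`, `fit_centroids`, and `predict` are hypothetical.

```python
import numpy as np

# Hypothetical toy word vectors (illustrative values only, not from the paper).
WORD_VECTORS = {
    "goal":   np.array([0.9, 0.1]),
    "match":  np.array([0.8, 0.2]),
    "team":   np.array([0.7, 0.3]),
    "stock":  np.array([0.1, 0.9]),
    "market": np.array([0.2, 0.8]),
    "bank":   np.array([0.3, 0.7]),
}


def document_vector(tokens, word_vectors=WORD_VECTORS):
    """Average-pool the vectors of known tokens; return zeros if none are known."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        dim = len(next(iter(word_vectors.values())))
        return np.zeros(dim)
    return np.mean(known, axis=0)


def fit_centroids(docs, labels):
    """Compute one centroid per class from the pooled document vectors."""
    centroids = {}
    for label in set(labels):
        vecs = [document_vector(d) for d, l in zip(docs, labels) if l == label]
        centroids[label] = np.mean(vecs, axis=0)
    return centroids


def predict(tokens, centroids):
    """Assign the class whose centroid is closest in Euclidean distance."""
    v = document_vector(tokens)
    return min(centroids, key=lambda c: np.linalg.norm(v - centroids[c]))


# Usage: two one-document classes, then classify unseen token lists.
train_docs = [["goal", "match"], ["stock", "market"]]
train_labels = ["sports", "finance"]
centroids = fit_centroids(train_docs, train_labels)
print(predict(["team", "goal"], centroids))
print(predict(["bank"], centroids))
```

In practice the word vectors would come from a model trained on a large corpus, and the pooled document vectors could be fed to any standard classifier; the centroid rule is used here only to keep the sketch dependency-free.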
Keywords :
document handling; learning (artificial intelligence); pattern classification; word processing; context relationship learning; document classification approach; document representation; fine-grained semantic element representation; high-level semantic concept representation; low-level semantic word implementation; pooling method; text classification tasks; word composition; word vector approach; Abstracts; Educational institutions; Petroleum; Semantics; Software; Speech; Vectors; LDA; document classification; topic model; word vector;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/ISCSLP.2014.6936669
Filename :
6936669