• DocumentCode
    134276
  • Title
    Document classification based on c
  • Author
    Rong Liu ; Dong Wang ; Chao Xing
  • Author_Institution
    Center for Speech & Language Technol. (CSLT), Tsinghua Univ., Beijing, China
  • fYear
    2014
  • fDate
    12-14 Sept. 2014
  • Firstpage
    413
  • Lastpage
    413
  • Abstract
    Summary form only given. This paper proposes a document classification approach based on word vectors. By learning context relationships, word vectors may represent fine-grained semantic elements. We assume that this low-level semantic implementation of words can be composed to represent high-level semantic concepts, and thus the semantic content of a document can be derived from that of the words it contains. Our experiments confirm that, even with the simplest pooling method, the document representation based on word vectors can deliver good performance on text classification tasks. When compared to the conventional LDA-based approach, the word vector approach is more stable, efficient and generalizable.
  • Keywords
    document handling; learning (artificial intelligence); pattern classification; word processing; context relationship learning; document classification approach; document representation; fine-grained semantic element representation; high-level semantic concept representation; low-level semantic word implementation; pooling method; text classification tasks; word composition; word vector approach; Abstracts; Educational institutions; Petroleum; Semantics; Software; Speech; Vectors; LDA; document classification; topic model; word vector;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2014.6936669
  • Filename
    6936669
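
The abstract describes composing word vectors into a document representation by pooling, then classifying on that representation. It does not say which pooling method or classifier the authors used, so the following is only a minimal sketch under stated assumptions: average pooling stands in for "the simplest pooling method", a nearest-centroid rule with cosine similarity stands in for the classifier, and the two-dimensional word vectors, labels, and all function names are hypothetical toy values, not taken from the paper.

```python
import math
from collections import defaultdict

# Toy 2-d word vectors (hypothetical values, for illustration only).
# In practice these would come from a model trained on context relationships.
WORD_VECTORS = {
    "goal": [0.9, 0.1],
    "match": [0.8, 0.2],
    "team": [0.7, 0.3],
    "stock": [0.1, 0.9],
    "market": [0.2, 0.8],
    "bank": [0.3, 0.7],
}


def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]


def average_pool(tokens, vectors):
    """Document vector = mean of the vectors of known tokens (None if none known)."""
    known = [vectors[t] for t in tokens if t in vectors]
    return mean_vector(known) if known else None


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labeled_docs, vectors):
    """Assumed classifier: one centroid per class over pooled document vectors."""
    by_label = defaultdict(list)
    for tokens, label in labeled_docs:
        doc_vec = average_pool(tokens, vectors)
        if doc_vec is not None:
            by_label[label].append(doc_vec)
    return {label: mean_vector(vecs) for label, vecs in by_label.items()}


def classify(tokens, centroids, vectors):
    """Return the label whose centroid is most similar to the pooled document."""
    doc_vec = average_pool(tokens, vectors)
    if doc_vec is None:
        return None
    return max(centroids, key=lambda label: cosine(doc_vec, centroids[label]))


# Hypothetical training documents, already tokenized.
TRAIN = [
    (["goal", "match"], "sports"),
    (["team", "match"], "sports"),
    (["stock", "market"], "finance"),
    (["bank", "market"], "finance"),
]
CENTROIDS = train_centroids(TRAIN, WORD_VECTORS)

print(classify(["team", "goal"], CENTROIDS, WORD_VECTORS))
print(classify(["bank", "stock"], CENTROIDS, WORD_VECTORS))
```

Average pooling discards word order and treats every known word equally. That is the cost of its simplicity, and a weighted or max pooling variant could be substituted in `average_pool` without changing the rest of the sketch.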