DocumentCode
3746245
Title
Distributed keyword vector representation for document categorization
Author
Yu-Lun Hsieh;Shih-Hung Liu;Yung-Chun Chang;Wen-Lian Hsu
Author_Institution
Social Networks and Human-Centered Computing Program, TIGP, IIS, Academia Sinica, Taiwan
fYear
2015
Firstpage
245
Lastpage
251
Abstract
In the age of information explosion, efficiently categorizing documents by topic helps us organize and comprehend vast amounts of text. In this paper, we propose a novel approach, named DKV, for document categorization using distributed real-valued vector representations of keywords learned by neural networks. Such a representation projects rich context information into a vector space (an embedding) and can subsequently be used to infer similarity among words, sentences, and even documents. Using a Chinese news corpus containing over 100,000 articles across five topics, we provide a comprehensive performance evaluation demonstrating that, by exploiting keyword embeddings, DKV paired with support vector machines can effectively categorize a document into the predefined topics. Results show that our method achieves the best performance among the approaches compared.
Keywords
"Computational modeling","Manuals"
Publisher
IEEE
Conference_Titel
Technologies and Applications of Artificial Intelligence (TAAI), 2015 Conference on
Electronic_ISBN
2376-6824
Type
conf
DOI
10.1109/TAAI.2015.7407126
Filename
7407126