DocumentCode :
3155291
Title :
Document similarity estimation for sentiment analysis using neural network
Author :
Yanagimoto, Hidekazu ; Shimada, Masanobu ; Yoshimura, Akira
Author_Institution :
Sch. of Eng., Osaka Prefecture Univ., Sakai, Japan
fYear :
2013
fDate :
16-20 June 2013
Firstpage :
105
Lastpage :
110
Abstract :
Classifying documents according to their contents is important for finding necessary documents efficiently. To achieve good classification, document similarity estimation is one of the key techniques, since classification is executed based on document similarity. In natural language processing, the bag-of-words model is used to extract features from documents, and a value based on term occurrence frequency is used as the weight of each feature. However, term weighting methodologies usually use predefined models and have some limitations. New approaches that construct feature vectors based on the data distribution are desired to achieve high performance in natural language processing. These days many researchers pay attention to deep learning, a new approach that transforms raw data into feature vectors using large amounts of unlabeled data. This characteristic is desirable for satisfying the need described above. In natural language processing, a main aim is to construct a language model on a deep architecture neural network. In this paper, we use a deep architecture neural network to estimate document similarity. To obtain good article similarity estimates, we have to generate good article vectors that represent all article characteristics. Hence, we use many stock market news articles to train the deep architecture neural network and generate article vectors with the trained network. We then calculate the cosine similarity between labeled articles and discuss the performance of the deep architecture neural network. In the evaluation, we focus not on the articles' contents but on their sentiment polarity. Hence, we discuss whether the proposed method classifies articles according to their sentiment polarity. We confirmed that, although the proposed method is an unsupervised learning approach, it achieves good performance in stock market news similarity estimation. The results show that a deep architecture neural network can be applied to more natural language processing tasks.
Keywords :
document handling; feature extraction; natural language processing; neural nets; pattern classification; unsupervised learning; bag-of-words model; data distribution; deep architecture neural network; deep learning; document classification; document similarity estimation; feature extraction; feature vectors; language model; natural language processing; sentiment analysis; sentiment polarity; stock market news similarity estimation; term occurrence frequency based value; term weight methodologies; unsupervised learning approach; Biological neural networks; Estimation; Feature extraction; Natural language processing; Neurons; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer and Information Science (ICIS), 2013 IEEE/ACIS 12th International Conference on
Conference_Location :
Niigata
Type :
conf
DOI :
10.1109/ICIS.2013.6607825
Filename :
6607825