Title :
Experimental Study on Sentiment Classification of Chinese Review using Machine Learning Techniques
Author :
Li, Jun ; Sun, Maosong
Author_Institution :
Tsinghua Univ., Beijing
Date :
Aug. 30 2007-Sept. 1 2007
Abstract :
Machine learning methods in text classification have expanded from topic identification to more challenging tasks such as sentiment classification, so it is valuable to explore and compare the methods applied to sentiment classification and to investigate the factors that influence them. The chief aim of the present work is to compare four machine learning methods for sentiment classification of Chinese reviews. The corpus consists of 16,000 reviews collected from the web. We investigate the factors that affect performance, namely feature representation via Word-Based Unigram (WBU), Word-Based Bigram (WBB), Chinese Character-Based Bigram (CBB), and Chinese Character-Based Trigram (CBT); feature weighting schemes; and feature dimensionality. Experimental evaluations show that performance depends on these settings. We conclude that the Naive Bayes (NB) classifier obtains the best average performance on this task when using WBB and CBT features with Boolean weighting across different feature dimensionalities.
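The character-based n-gram features and Boolean weighting named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the helper names (char_ngrams, bool_features, BoolNaiveBayes), the add-one smoothing, and the four toy reviews are hypothetical, and the abstract does not say which Naive Bayes variant the authors used. Character n-grams are shown because they need no word segmenter; the word-based features (WBU, WBB) would first require segmenting the text into words.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return the overlapping character n-grams of text, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def bool_features(text, n):
    """Boolean weighting: a feature is either present or absent, so use a set."""
    return set(char_ngrams(text, n))


class BoolNaiveBayes:
    """Toy Naive Bayes over Boolean n-gram features (add-one smoothing assumed)."""

    def __init__(self, n=2):
        self.n = n  # 2 -> character bigrams (CBB), 3 -> character trigrams (CBT)

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)
        self.feat_counts = {label: Counter() for label in self.doc_counts}
        for text, label in zip(texts, labels):
            # Each distinct n-gram is counted once per document.
            self.feat_counts[label].update(bool_features(text, self.n))
        self.vocab = set().union(*self.feat_counts.values())
        return self

    def predict(self, text):
        # Only n-grams seen during training contribute to the score.
        feats = bool_features(text, self.n) & self.vocab
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for feat in feats:
                score += math.log((self.feat_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy reviews, invented for this sketch (not from the 16,000-review corpus).
train_texts = ["这部电影很好看", "演员表演很精彩", "这部电影很难看", "剧情很无聊"]
train_labels = ["pos", "pos", "neg", "neg"]
model = BoolNaiveBayes(n=2).fit(train_texts, train_labels)
```

Calling model.predict on a new review string returns the label with the highest smoothed log-probability. Passing n=3 to BoolNaiveBayes switches the same sketch from character bigrams to character trigrams.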
Keywords :
learning (artificial intelligence); natural language processing; pattern classification; text analysis; Chinese character-based bigram; Chinese character-based trigram; Chinese review; Naive-Bayes classifier; machine learning techniques; sentiment classification; text classification; word-based bigram; word-based unigram; Computer science; Data mining; Learning systems; Machine learning; Motion pictures; Niobium; Support vector machine classification; Support vector machines; Text categorization; Thumb;
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1611-0
Electronic_ISBN :
978-1-4244-1611-0
DOI :
10.1109/NLPKE.2007.4368061