Title :
Study on feature selection and machine learning algorithms for Malay sentiment classification
Author :
Alsaffar, Ahmed ; Omar, Nazlia
Author_Institution :
Center for AI Technol., Univ. Kebangsaan Malaysia, Bangi, Malaysia
Abstract :
Online social media is widely used by individuals to express sentiments about various subjects. Sentiment analysis, or opinion mining, has recently become one of the most dynamic research fields in natural language processing, Web mining, and machine learning. However, very little research has focused on sentiment analysis in the Malay language. This study investigates how feature selection methods contribute to the improvement of Malay sentiment classification performance. A series of experiments with three supervised machine-learning classifiers and seven feature selection methods is conducted to identify the most appropriate methods for the automatic sentiment classification of online reviews written in Malay. The findings show that Malay sentiment classification improves when feature selection approaches are used, and that all feature reduction methods generally improve classifier performance. The Support Vector Machine (SVM) approach provides the highest accuracy with feature selection for classifying Malay sentiment, compared with other approaches such as PCA and chi-square. With feature selection, SVM records an experimental accuracy of 87%.
Keywords :
feature selection; learning (artificial intelligence); natural language processing; pattern classification; social networking (online); support vector machines; Malay language; Malay sentiment classification performance improvement; SVM approach; automatic sentiment classification; feature reduction methods; feature selection; online Malay-written reviews; online social media; opinion mining; sentiment analysis; supervised machine-learning classifiers; support vector machine approach; Classification algorithms; Niobium; Principal component analysis; Sentiment analysis; Support vector machine classification; Training; Classifications; Feature Selection; Machine Learning; NLP; Sentiment analysis;
Conference_Title :
Information Technology and Multimedia (ICIMU), 2014 International Conference on
DOI :
10.1109/ICIMU.2014.7066643