DocumentCode :
3723105
Title :
Using Feature Selection in Combination with Ensemble Learning Techniques to Improve Tweet Sentiment Classification Performance
Author :
Joseph D. Prusa;Taghi M. Khoshgoftaar;Amri Napolitano
Author_Institution :
Florida Atlantic Univ., Boca Raton, FL, USA
fYear :
2015
Firstpage :
186
Lastpage :
193
Abstract :
Performing sentiment analysis of tweets by training a classifier is a challenging and complex task that requires the classifier to correctly and reliably identify the emotional polarity of a tweet. Poor data quality, due to class imbalance or mislabeled instances, may negatively impact classification performance. Ensemble learning techniques combine multiple models in an attempt to improve classification performance, especially on poor-quality or imbalanced data; however, these techniques do not address the high dimensionality of tweet sentiment data and may require a prohibitive amount of resources to train on high-dimensional data. This work addresses these issues by studying bagging and boosting combined with feature selection. These two techniques, denoted Select-Bagging and Select-Boost, seek to address both poor data quality and high dimensionality. We compare the performance of Select-Bagging and Select-Boost against feature selection alone. These techniques are tested with four base learners, two datasets, and ten feature subset sizes. Our results show that Select-Boost offers the highest performance, is significantly better than using no ensemble technique, and is significantly better than Select-Bagging for most learners on both datasets. To the best of our knowledge, this is the first study to focus on the effects of using ensemble learning in combination with feature selection for the purpose of tweet sentiment classification.
Keywords :
"Boosting","Bagging","Training","Feature extraction","Robustness","Decision trees","Support vector machines"
Publisher :
IEEE
Conference_Title :
Tools with Artificial Intelligence (ICTAI), 2015 IEEE 27th International Conference on
ISSN :
1082-3409
Type :
conf
DOI :
10.1109/ICTAI.2015.39
Filename :
7372135