Title of article :
Identifying Persian bots on Twitter; which feature is more important: Account Information or Tweet Contents?
Author/Authors :
Mazoochi, Mojtaba (Information Technology Research Faculty, ICT Research Institute); Asadi, Nasrin (Development and Innovation Center for AI, ICT Research Institute); Rahmani, Farzaneh (Development and Innovation Center for AI, ICT Research Institute); Rabiei, Leila (Information Technology Research Faculty, ICT Research Institute)
Abstract :
The spread of the internet and smartphones in recent years has made social networks popular and easily accessible. Despite their benefits, such as ease of interpersonal communication and a space for free expression of opinions, these networks also provide opportunities for destructive activities such as spreading false information or using fake accounts for fraud. Fake accounts are mainly managed by bots, so identifying and suspending bots could substantially help to increase the popularity and favorability of social networks. In this paper, we try to identify Persian bots on Twitter. This appears to be a challenging task because of the problems pertinent to processing colloquial Persian. To this end, a set of features based on user account information and user activity is added to content features of tweets, and users are classified with several machine learning algorithms, including Random Forest, Logistic Regression, and SVM. Experiments on a dataset of Persian-language users show that the proposed methods perform well. Random Forest is the most accurate of these classifiers, achieving a balanced accuracy of 93.86%.
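The abstract reports balanced accuracy rather than plain accuracy. As a minimal sketch, assuming the standard definition (the mean of per-class recall), the metric and the idea of joining account-level and tweet-content features can be illustrated as follows. The helper names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, computed over the classes present in y_true."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def combine_features(account, content):
    """Concatenate hypothetical account-level and tweet-content features into one vector."""
    return list(account) + list(content)


# Toy labels: 1 = bot, 0 = human (illustrative data, not from the paper).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]
score = balanced_accuracy(y_true, y_pred)
```

On this toy data, plain accuracy would be 7/8 while the balanced score is lower, because the single missed bot halves the recall of the minority class. This is one reason the metric suits bot detection, where the two classes are often unequal in size.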
Keywords :
social networks , Twitter , bot detection , classification , Persian language
Journal title :
International Journal of Information and Communication Technology Research