Conference record number :
4650
Paper title :
Spam Filtering in SMS using Recurrent Neural Networks
Authors :
Taheri, Rahim (Shiraz University of Technology) ; Javidan, Reza (Shiraz University of Technology)
Keywords :
Prediction , RNNs , SMS , Spam , Ham
Conference title :
19th International Conference on Artificial Intelligence and Signal Processing
Abstract :
Short Message Service (SMS) is a mobile communication service that allows easy and inexpensive communication. Unwanted messages, produced for advertising or harassment and sent over SMS, have become the biggest challenge for this service. Various methods have been presented to detect unsolicited short messages, many of them based on machine learning. Neural networks have been applied to separate unwanted text messages (known as spam) from normal short messages (known as ham). To the best of our knowledge, recurrent neural networks (RNNs) have not yet been applied to this problem. In this paper, we propose a method that uses an RNN to separate ham from spam. An RNN can process variable-length sequences; although we use a fixed sequence length, the RNN remains the preferred choice. The method achieved an accuracy of 98.11%, indicating a considerable improvement over support vector machine (SVM), token-based SVM, and Bayesian algorithms, which achieved accuracies of 97.81%, 97.64%, and 80.54%, respectively.
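The abstract does not give the network's architecture or training details, so the following is only a minimal sketch of the general idea it describes: messages are padded or truncated to a fixed length and passed through a plain (Elman) recurrent network whose final hidden state yields a spam probability. Every name here (`build_vocab`, `encode`, `init_params`, `spam_probability`, `max_len`, the hidden size, and the two sample messages) is a hypothetical choice for illustration, the weights are random and untrained, and none of it should be read as the authors' implementation.

```python
import math
import random

PAD, UNK = 0, 1  # assumed reserved ids: padding and out-of-vocabulary words


def build_vocab(messages):
    """Assign an integer id (starting at 2) to every distinct lowercase word."""
    vocab = {}
    for msg in messages:
        for word in msg.lower().split():
            vocab.setdefault(word, len(vocab) + 2)
    return vocab


def encode(text, vocab, max_len):
    """Map words to ids, then truncate or pad to exactly max_len entries."""
    ids = [vocab.get(w, UNK) for w in text.lower().split()][:max_len]
    return ids + [PAD] * (max_len - len(ids))


def init_params(vocab_size, hidden, seed=0):
    """Small random weights; a real model would learn these from labeled data."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "E": mat(vocab_size, hidden),  # one embedding row per token id
        "W": mat(hidden, hidden),      # recurrent hidden-to-hidden weights
        "v": [rng.uniform(-0.1, 0.1) for _ in range(hidden)],  # output weights
        "b": 0.0,                      # output bias
    }


def spam_probability(ids, p):
    """Run the recurrence h_t = tanh(E[x_t] + W h_{t-1}); return sigmoid(v . h_T + b)."""
    hidden = len(p["v"])
    h = [0.0] * hidden
    for token in ids:
        if token == PAD:
            continue  # skip padding so it does not alter the hidden state
        x = p["E"][token]
        h = [
            math.tanh(x[i] + sum(p["W"][i][j] * h[j] for j in range(hidden)))
            for i in range(hidden)
        ]
    z = sum(vi * hi for vi, hi in zip(p["v"], h)) + p["b"]
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical sample messages, used only to exercise the functions.
messages = ["Free entry win a prize now", "See you at lunch"]
vocab = build_vocab(messages)
params = init_params(vocab_size=len(vocab) + 2, hidden=8)
ids = encode(messages[0], vocab, max_len=10)
prob = spam_probability(ids, params)
```

With untrained weights the score carries no meaning; a real classifier would fit the parameters on labeled ham and spam messages, for example by backpropagation through time.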