A Survey of Emerging Techniques in Detecting SMS Spam

  • Sahar Alqahtani King Abdulaziz University
  • Daniyal Alghazzawi Department of Information Systems, Faculty of Computing and Information Technology at King Abdulaziz University, Jeddah, Saudi Arabia

Abstract

In recent years, spammers have turned their attention to sending spam to mobile users through the short message service (SMS). They have had some success because appropriate tools for dealing with the problem are lacking. This paper reviews and compares the relative strengths of emerging techniques for detecting spam messages sent to mobile devices. Machine learning methods and topic modelling techniques have been remarkably effective in classifying spam SMS. SMS spam detection is hampered by the scarcity of available SMS datasets and by the small number of features that a short message offers. We also discuss the features extracted and the datasets used by researchers, along with some related issues. The most important measures that researchers used to evaluate these techniques were recall, precision, accuracy, and the CAP curve. In this review we compare the performance achieved by machine learning algorithms and find that Naive Bayes and SVM perform effectively.

Published
2019-11-08