Unsupervised Machine Learning Techniques for Detecting Malware Applications in Wireless Devices
We are undoubtedly in the era of ‘big data’, and new machines and tools are developed every day to enable users to access, manipulate, and process data effectively and to provide the timely information needed for decision making. This has led to the increasing use of wireless devices, including smartphones, tablets, and pacemakers, across different platforms. As professionals including doctors, engineers, scientists, and artists use these devices to access, process, and disseminate information services, malware attackers are strategizing as well. Hence the last decade has witnessed a steady stream of literature on the design and development of both supervised and unsupervised machine learning algorithms to counter malware applications in wireless devices. In this paper, we study the properties of unsupervised learning algorithms; in particular, we quantify the performance of these algorithms under two scenarios: using data sets from unknown attackers and data sets from known attackers. Our findings show that the recently -algorithm appears superior to the other unsupervised algorithms investigated.