Classification of Encrypted Texts using Deep Learning

  • Zeinab Nazemi Absardi, School of Computer and Information Technology, Shiraz University of Technology, Shiraz, Iran
  • Reza Javidan, School of Computer and Information Technology, Shiraz University of Technology, Shiraz, Iran
Keywords: Text classification, Encrypted texts, Deep learning, Encryption algorithms

Abstract

Cryptanalysis typically involves two tasks: identifying the cryptographic algorithm and identifying the encryption key. Statistical methods and a variety of machine learning techniques have been used to identify cryptographic algorithms, each with its own advantages and disadvantages. This paper proposes a method for identifying the algorithm used to encrypt the texts in text files. Identifying the encryption algorithm from encrypted texts can be treated as a text classification problem. Three encryption algorithms, AES, RC5, and Blowfish, are used to evaluate system performance. This requires a three-class classifier, for which the k-nearest neighbors algorithm is used. Because the volume of this kind of data is very large and keeps growing, the final accuracy is obtained by voting among the classifiers. Based on a deep learning approach, the paper provides a new method for identifying patterns in encrypted texts and learning them through feature representation methods. The proposed method consists of four parts: preprocessing, feature learning, data classification, and voting. The proposed system achieves an algorithm classification accuracy of 99.1%.
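
As a rough illustration of the classification and voting stages named in the abstract, the Python sketch below splits a ciphertext into fixed-size chunks, labels each chunk with a k-nearest neighbors classifier, and takes a majority vote over the chunk labels. It is a sketch under stated assumptions, not the authors' implementation: the abstract does not specify the features, the chunk size, the value of k, or how the voting is organized. In particular, the byte histogram used here is a hypothetical stand-in for the learned feature representation and is not expected to separate real AES, RC5, and Blowfish ciphertexts on its own. All function names are illustrative.

```python
from collections import Counter
import math


def byte_histogram(chunk: bytes) -> list[float]:
    """Normalized 256-bin byte histogram.

    Assumption: a hypothetical stand-in for the learned features;
    the abstract does not describe the actual representation.
    """
    counts = [0] * 256
    for b in chunk:
        counts[b] += 1
    total = len(chunk) or 1  # avoid division by zero on an empty chunk
    return [c / total for c in counts]


def knn_predict(train, features, k=3):
    """Return the majority label among the k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], features))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def majority_vote(labels):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def classify_text(data: bytes, train, chunk_size=64, k=3):
    """Classify each chunk of `data` with k-NN, then vote over the chunks.

    Assumption: chunk-level voting is one possible reading of the
    voting stage; `chunk_size` and `k` are arbitrary illustrative values.
    """
    if not data:
        raise ValueError("data must not be empty")
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    predictions = [knn_predict(train, byte_histogram(c), k) for c in chunks]
    return majority_vote(predictions)
```

Here `train` would hold feature vectors extracted from texts encrypted with each of the three algorithms, labeled accordingly. Voting over many chunk-level predictions is one way to make the final decision less sensitive to individual misclassified chunks.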

Published
2018-07-08