Phishing Websites Detection Using Data Mining Classification Model
Abstract
Phishing is a significant security threat on the Internet. It is a form of online identity theft in which attackers use spoofing techniques, such as fake websites that mimic legitimate ones, to trick users into revealing their private information. Many phishing attacks have succeeded, and a considerable number of anti-phishing methods have consequently been proposed; however, these methods vary in their accuracy and error rate. This paper proposes an algorithm for detecting phishing websites using a data mining classification model. The algorithm was implemented and evaluated on a dataset of 1,000 instances, each described by 20 webpage features. The experimental results showed that the proposed algorithm outperforms the original one in terms of the number of classification rules, accuracy (87%), and error rate (0.1%).