Phishing Attack Detection through Advanced Natural Language Processing Methods

Balaji Venkateswaran

doi:10.52783/anvi.v28.6103

Published: Mar 26, 2025

Keywords:

Suspicious Human Activities, Video Streaming, Real Time Detection, Deep Learning

Balaji Venkateswaran, Uttam Kumar Singh, Kameshwar Singh, Surendra Singh Chauhan, Ashish Jolly, Shakti Kumar

Abstract

The rapid escalation of phishing attacks poses a significant threat to cybersecurity, necessitating the development of automated and intelligent detection mechanisms. This paper introduces an advanced Natural Language Processing (NLP)-based framework for identifying phishing attempts within emails, websites, and online communications. By leveraging deep learning-driven text analysis, semantic representation, and contextual understanding, the proposed system effectively differentiates between legitimate and malicious content. Key linguistic and structural features are extracted and modeled to capture subtle phishing indicators such as deceptive intent, abnormal lexical patterns, and misleading hyperlinks. Publicly available benchmark datasets, including phishing email and URL repositories, are utilized to evaluate the framework across diverse real-world scenarios. Experimental results reveal that the proposed approach surpasses traditional machine learning and rule-based methods in terms of accuracy, precision, recall, and F1-score. Moreover, the system demonstrates near real-time detection efficiency, making it suitable for large-scale deployment in cybersecurity infrastructures. These findings highlight the robustness and scalability of the framework as a reliable defense against evolving phishing threats.
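Two ideas named in the abstract can be illustrated with a minimal sketch: flagging a "misleading hyperlink" (the visible link text names one domain while the underlying href points to another), and computing the precision, recall, and F1-score used for evaluation. The sketch is not the authors' implementation. The function names, regular expressions, and sample inputs are hypothetical and chosen only for illustration; the paper's actual features and models are not specified on this page.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns (assumptions, not from the paper):
# LINK_RE captures (href, visible text) pairs from simple HTML anchors.
LINK_RE = re.compile(
    r'<a\s[^>]*href=["\']([^"\']+)["\'][^>]*>(.*?)</a>', re.I | re.S
)
# DOMAIN_RE finds something that looks like a domain name in the visible text.
DOMAIN_RE = re.compile(r'((?:[a-z0-9-]+\.)+[a-z]{2,})', re.I)


def normalize(host):
    """Lowercase a host name and drop a leading 'www.'."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def misleading_links(html):
    """Return (shown_domain, actual_host) pairs where the link text
    names a domain that differs from the host in the href."""
    hits = []
    for href, text in LINK_RE.findall(html):
        match = DOMAIN_RE.search(text)
        if not match:
            continue  # link text does not display a domain
        shown = normalize(match.group(1))
        actual = normalize(urlparse(href).hostname or "")
        # Treat subdomains of the shown domain as consistent.
        if actual and actual != shown and not actual.endswith("." + shown):
            hits.append((shown, actual))
    return hits


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = phishing)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    sample = '<a href="http://evil.example.net/login">www.mybank.com</a>'
    print(misleading_links(sample))
    print(precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

A rule of this kind would serve at most as one input feature alongside the learned text representations the abstract describes. On its own it misses phishing links whose visible text shows no domain at all.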

Issue

Vol. 28 No. 7s (2025)

Section

Articles


