MACHINE LEARNING BASED PHISHING WEBSITE DETECTION SYSTEM VIA CLOUD

Authors

  • Misbah Jahagirdar
  • Rizwan Malik

Keywords:

Phishing website detection, cloud deployment, AWS EC2, Random Forest, XGBoost, machine learning, Web-Space-Kit, secure web application

Abstract

In this digital era, the risk of cyberattacks such as phishing has risen significantly. Phishing
attacks trick users into revealing sensitive information by disguising malicious websites as
legitimate ones. This project detects phishing websites using a machine learning approach hosted
entirely in the cloud. The system is deployed on an AWS EC2 instance and integrated with a custom
domain through the Web-Space-Kit platform, providing a secure web interface for real-time URL
analysis. The dataset used in this study comprises 11,000 samples with 33 features extracted from
URLs, encompassing both structural and content-based attributes. Several supervised machine
learning algorithms were implemented and evaluated, including Logistic Regression, Decision Tree,
Support Vector Machine (SVM), Random Forest, and XGBoost, with accuracy as the primary performance
metric. Experimental results showed that Logistic Regression achieved 93.42% accuracy, Decision
Tree 92.15%, SVM 91.64%, Random Forest 97.82%, and XGBoost 96.99%. Among them, Random Forest
emerged as the most reliable model due to its ability to handle complex feature interactions and
deliver the highest prediction accuracy. The cloud-based deployment lets users enter any URL
through a secure HTTPS web portal, instantly obtain a phishing or legitimate classification, view
an explanation of the decision, and monitor response times. This approach demonstrates how machine
learning models, combined with scalable cloud infrastructure, can help mitigate phishing risks and
support a safer online environment. Future enhancements could include integrating deep learning
models, continuous learning to detect new phishing patterns, and a browser extension for real-time
protection.
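
The abstract mentions structural features extracted from URLs but does not list the 33 features used in the study. As a minimal sketch of what such extraction might look like, the Python example below computes a handful of commonly used structural signals. The function name `extract_url_features` and every feature in it are illustrative assumptions, not the authors' actual feature set.

```python
import ipaddress
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few illustrative structural features for a URL.

    Hypothetical feature set: the paper's 33 features are not listed
    in the abstract, so these are common examples only.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address used in place of a domain name is often treated
    # as a phishing signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "uses_https": parsed.scheme == "https",
        "host_is_ip": host_is_ip,
        "has_at_symbol": "@" in url,
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        # Number of non-empty path segments, e.g. /login/verify has 2.
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


if __name__ == "__main__":
    print(extract_url_features("http://192.168.0.1/login/verify"))
```

A feature dictionary like this could be converted to a numeric vector and passed to a classifier such as Random Forest. Content-based attributes, which the study also uses, would require fetching and parsing the page and are not covered here.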

Published

2025-09-01

How to Cite

MACHINE LEARNING BASED PHISHING WEBSITE DETECTION SYSTEM VIA CLOUD. (2025). Phoenix: International Multidisciplinary Research Journal ( Peer Reviewed High Impact Journal ), 3(3.1), 33-43. https://pimrj.org/index.php/pimrj/article/view/117