
Ai Phishing Report

Uploaded by

chandanvishwa15

Project Report: AI-Powered Phishing Website Detection System

1. Project Title
AI-Powered Phishing Website Detection System

2. Team Members
- Aniket Jaiswal: Backend Developer & Integration Specialist
- Anupam Tiwari: Project Lead & Machine Learning Developer
- Ayush Vishwakarma: Frontend Developer & UI/UX Designer

3. Problem Statement
Phishing websites mimic legitimate ones to steal sensitive information such as login credentials and financial data. Traditional detection methods such as blacklists can only flag URLs that have already been reported, so they miss new and evolving threats and leave users exposed to data theft and financial loss.

4. Objectives
- Detect phishing websites using Machine Learning (ML).
- Integrate heuristic features and APIs (VirusTotal, WHOIS, Google Safe Browsing).
- Develop a real-time Chrome extension for user protection.
- Ensure transparency through explainable AI (LIME, SHAP).
- Provide an admin dashboard with visual analytics and reporting tools.

5. System Architecture
Components:
- Frontend: Chrome Extension (JavaScript, HTML, CSS), Streamlit Dashboard (Python)
- Backend: Flask REST API
- Database: Firebase for logging and real-time sync
- ML Model: Random Forest, XGBoost, SVM using scikit-learn (XGBoost is provided by the separate xgboost package)
- APIs: WHOIS, VirusTotal, Google Safe Browsing

Architecture Diagram: (Include flowchart visualizing URL processing through frontend, backend, ML
prediction, and browser alert)

6. Dataset and Feature Engineering
Sources:
- PhishTank
- Kaggle Phishing Dataset
- Alexa Top 1M (for legitimate URLs)

Preprocessing Steps:
- Clean and normalize data
- Encode categorical variables
- Handle class imbalance with SMOTE or undersampling
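
The undersampling option can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the function name `undersample`, the label convention (1 = phishing, 0 = legitimate), and the fixed seed are all assumptions. SMOTE, the alternative mentioned above, is usually taken from the imbalanced-learn package and is not shown here.

```python
import random
from collections import defaultdict


def undersample(rows, labels, seed=42):
    """Randomly drop rows from larger classes until every class matches the smallest one.

    `rows` and `labels` are parallel lists; the names and the seed are illustrative.
    """
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed keeps the result reproducible

    balanced = []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            balanced.append((row, label))
    rng.shuffle(balanced)  # avoid leaving each class in one contiguous block

    out_rows = [row for row, _ in balanced]
    out_labels = [label for _, label in balanced]
    return out_rows, out_labels
```

Resampling of either kind is normally applied to the training split only, so that the evaluation data keeps its original class distribution.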

Feature Categories:
- URL-Based: URL length, presence of '@', redirections
- Domain-Based: Age, expiration time, WHOIS data
- Security-Based: HTTPS presence, SSL certificate validity
- Content-Based: HTML pattern analysis (optional)

Feature Table Example:

| Feature                 | Type      | Example Value |
|-------------------------|-----------|---------------|
| URL Length (characters) | Numerical | 75            |
| HTTPS Present           | Boolean   | 1             |
| Domain Age (days)       | Numerical | 365           |
| Contains '@' Character  | Boolean   | 0             |
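
The four example features in the table could be computed roughly as follows. This is a sketch under stated assumptions: the function name `extract_features`, the dictionary keys, and the `-1` sentinel for an unknown domain age are hypothetical, and the domain creation date is passed in directly instead of being fetched through a WHOIS lookup.

```python
from datetime import date
from urllib.parse import urlparse


def extract_features(url, domain_created=None, today=None):
    """Return the four example features from the table for a single URL."""
    parsed = urlparse(url)
    today = today or date.today()
    # Domain age would come from a WHOIS lookup; here the creation date is
    # supplied by the caller, and -1 (an assumed sentinel) marks "unknown".
    age_days = (today - domain_created).days if domain_created else -1
    return {
        "url_length": len(url),
        "https_present": int(parsed.scheme == "https"),
        "domain_age_days": age_days,
        "contains_at": int("@" in url),
    }
```

For instance, `extract_features("http://user@evil.test/")` reports `https_present` as 0 and `contains_at` as 1, the kind of pattern the URL-based features are meant to capture.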

7. Machine Learning Models
- Algorithms Used: Random Forest, XGBoost, SVM
- Evaluation Metrics: Accuracy, Precision, Recall, F1-Score, ROC-AUC

Model Performance Table:

| Model         | Accuracy | Precision | Recall | F1-Score |
|---------------|----------|-----------|--------|----------|
| Random Forest | 95.6%    | 94.8%     | 96.1%  | 95.4%    |
| XGBoost       | 96.2%    | 95.1%     | 97.0%  | 96.0%    |
| SVM           | 92.5%    | 91.3%     | 93.4%  | 92.3%    |
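
For reference, the metrics in the table relate to confusion-matrix counts as sketched below. The function names are illustrative, and the counts in the usage note are made up for demonstration; they are not the project's results.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    Here "positive" means phishing: tp = phishing caught, fp = safe sites
    flagged, fn = phishing missed, tn = safe sites passed.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }
```

With hypothetical counts, `classification_metrics(tp=90, fp=10, fn=5, tn=95)` gives an accuracy of 0.925 and a precision of 0.9. The reported F1 scores are also consistent with the reported precision and recall: `f1_from(94.8, 96.1)` is about 95.4, `f1_from(95.1, 97.0)` about 96.0, and `f1_from(91.3, 93.4)` about 92.3.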

Graph: (Include bar chart comparing model metrics)

8. Browser Extension Workflow
- Monitors every opened URL
- Sends URL to backend for classification
- Backend returns prediction
- Displays red warning screen for phishing
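
The backend side of this workflow can be sketched as a framework-agnostic handler that a Flask route could wrap. Everything named here is an assumption for illustration: the function `handle_check`, the `url`, `label`, and `score` fields, the 0.5 threshold, and the `predict_proba` callable standing in for the trained model.

```python
from urllib.parse import urlparse


def handle_check(payload, predict_proba, threshold=0.5):
    """Validate a request body and return (response_dict, http_status).

    `predict_proba` is any callable mapping a URL string to the estimated
    probability that it is phishing (a stand-in for the trained model).
    """
    url = (payload or {}).get("url", "")
    if not isinstance(url, str):
        return {"error": "'url' must be a string"}, 400

    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return {"error": "expected an http(s) URL in 'url'"}, 400

    score = float(predict_proba(url))
    label = "phishing" if score >= threshold else "safe"
    return {"url": url, "label": label, "score": score}, 200
```

The extension would then show the red warning screen whenever the returned `label` is `"phishing"`. Rejecting non-HTTP schemes up front keeps internal browser pages and malformed input away from the model.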

Screenshot/Diagram: (Include UI of warning screen)

9. Admin Dashboard
- Built with Streamlit
- Visualizes logs, user-submitted suspicious links
- Shows model predictions, charts, and live analytics

Chart: (Include pie chart of phishing vs safe URLs, trend line of daily detection counts)
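
The two charts above need only simple aggregates over the prediction logs. A minimal sketch follows, assuming a hypothetical log schema in which each entry has a `label` field ("phishing" or "safe") and an ISO 8601 `timestamp` string; the real Firebase records may be shaped differently.

```python
from collections import Counter


def summarize_logs(logs):
    """Return overall label counts and per-day phishing detection counts."""
    # Pie chart input: how many URLs were classified under each label.
    label_counts = Counter(entry["label"] for entry in logs)
    # Trend line input: phishing detections grouped by calendar day.
    daily_phishing = Counter(
        entry["timestamp"][:10]  # ISO 8601 date prefix, e.g. "2025-03-01"
        for entry in logs
        if entry["label"] == "phishing"
    )
    return dict(label_counts), dict(sorted(daily_phishing.items()))
```

The first dictionary could feed the pie chart and the second, sorted by date, the daily trend line.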

10. Future Scope
- Deep Learning integration for advanced pattern detection
- Voice-based phishing alerts for accessibility
- Integration into antivirus suites or secure browsers
- Continuous model retraining with live data
- Telegram/WhatsApp bot for URL checks

11. Timeline

| Phase                         | Duration (Weeks) |
|-------------------------------|------------------|
| Research & Dataset Collection | 2                |
| Feature Engineering & Models  | 2                |
| Model Tuning & Evaluation     | 1                |
| Extension Development         | 2                |
| Dashboard & Explainability    | 1                |
| Testing & Final Report        | 2                |

Gantt Chart: (Include timeline visual)

12. Conclusion
Phishing attacks are growing in scale and sophistication. Our hybrid system combines ML, heuristics, and third-party APIs to detect phishing in real time and warn users before they come to harm. With a Chrome extension and an explainable model ready for deployment, this project bridges academic learning and real-world impact.

13. References
- PhishTank: https://www.phishtank.com/
- Kaggle Phishing Dataset
- Alexa Top 1M List
- VirusTotal API
- WHOIS API
- Google Safe Browsing API

14. Appendix
(Include graphs, confusion matrix, screenshots of extension, dashboard UI, LIME/SHAP visualization samples)

Prepared by: Ayush Vishwakarma & Team
Institution: Deen Dayal Upadhyay Gorakhpur University
Department: Computer Science and Engineering
Semester: 6th
Year: 2025
