Malicious URL Detection Using Random Forest
Abstract
Phishing attacks pose significant threats in cyberspace, exploiting
human vulnerabilities to extract sensitive information or spread
malware. Our contribution lies in leveraging advanced machine learning
algorithms to enhance the accuracy and effectiveness of phishing
detection. By integrating a comprehensive set of features derived from
URLs, domains, and user behaviour, we aim to create a robust detection
framework capable of identifying sophisticated phishing attempts.
Furthermore, we incorporate temporal analysis to capture the dynamic
nature of phishing campaigns, thereby improving our system's
adaptability and responsiveness to evolving threats over time.
Additionally, interpretability techniques such as SHAP (SHapley Additive
exPlanations) values and LIME (Local Interpretable Model-agnostic
Explanations) are employed to provide insights into the factors driving
our model's decisions, enhancing transparency and trustworthiness.
Through extensive testing and evaluation on diverse datasets, our
project aims to contribute to the advancement of cybersecurity by
providing a proactive defence against phishing attacks.
Keywords: Phishing Detection, Machine Learning, Feature Engineering,
Cybersecurity, URL Analysis, Malicious Behavior, Model Interpretability,
Temporal Analysis, Ensemble Learning, Real-world Testing, Online
Security, Cyber Threats, User Protection
Presentation Outline
1. Aim and Motivation
2. Research Questions
3. Title Justification
4. Objectives
5. Scope
6. Introduction
7. Study on Existing Technologies
8. Gap Analysis
9. SDLC Model
10. Data Collection
11. Data Preparation
12. Methodology
12.1. Proposed Model
13. Timeline Chart
14. Summary
15. Results and Output
References
1. Aim and Motivation
Aim:
Develop an advanced phishing detection system using machine learning
techniques, focusing on feature-rich approaches to accurately and
efficiently distinguish between malicious and legitimate URLs, thereby
enhancing online security and safeguarding users against cyber threats.
Motivation:
• Phishing attacks, malware distribution, and other forms of online fraud
pose significant risks to individuals and organizations, exploiting
human vulnerabilities to compromise sensitive information and cause
financial or reputational damage.
• The motivation behind the project is to enhance online security by
developing a proactive defense mechanism that can automatically
identify and mitigate malicious URLs, thereby safeguarding users and
organizations against cyber threats.
• By leveraging machine learning techniques, particularly Random
Forest, the project seeks to automate the process of detecting
malicious URLs.
• The system safeguards users from falling victim to online threats
by providing a layer of defense against malicious URLs. It helps in
maintaining user trust and confidence in online platforms and services.
2. Research Questions

1. How does the performance of the Random Forest-based
detection system vary across different types of malicious
URLs, such as phishing, malware distribution, or fraudulent
websites?
2. What are the key features extracted from URLs that
contribute most significantly to the detection of malicious
URLs using Random Forest?
3. To what extent does the incorporation of ensemble learning
techniques, particularly Random Forest, improve the accuracy
and robustness of malicious URL detection compared to
traditional methods?
3. Title Justification
• Malicious URL Detection: The project aims to identify and
classify URLs as either malicious or benign, focusing on
enhancing cybersecurity measures.
• Random Forest: Leveraging the Random Forest algorithm,
the project employs ensemble learning techniques to develop
a robust model for URL classification.
• Enhancing Cybersecurity Measures: The primary
objective is to contribute to the improvement of
cybersecurity infrastructure by effectively detecting and
mitigating threats posed by malicious URLs.
• Safeguarding Users and Organizations: Ultimately, the
project aims to protect users and organizations from falling
victim to cyber threats, ensuring the integrity and security of
online activities.

4. Objectives

1. To develop a robust machine learning model based on
Random Forest for effectively detecting malicious URLs.
2. To contribute to improving cybersecurity measures by
providing an effective tool for detecting malicious URLs in
real-time environments.
3. To safeguard users from falling victim to online threats by
providing a layer of defense against malicious URLs, which
helps maintain user trust and confidence in online
platforms and services.
4. To detect malicious URLs, which is essential for protecting
sensitive information, maintaining business continuity, and
safeguarding reputation.
5. To implement advanced cybersecurity measures to
safeguard organizational data and assets from potential
breaches.
5. Scope

1. To develop a machine learning-based phishing detection system
capable of accurately distinguishing between legitimate and
malicious URLs in real time. This involves the implementation of
advanced feature extraction techniques and machine learning
algorithms to analyze URL characteristics, domain reputation, and
user behaviour patterns. The system will provide proactive
protection against evolving phishing attacks by continuously
monitoring and analyzing incoming URLs, thereby enhancing
cybersecurity measures for end-users.
2. To integrate interpretability techniques, such as SHAP (SHapley
Additive exPlanations) values and LIME (Local Interpretable
Model-agnostic Explanations), to provide insights into the factors
influencing the model's predictions. This will enable users to
understand the rationale behind the system's decisions and
enhance trustworthiness. Additionally, the system will be
designed to support scalability and efficiency, allowing for
deployment in various environments with minimal computational
overhead.
6. Introduction
• Phishing attacks remain a significant threat, exploiting human
vulnerabilities to compromise sensitive information.
• This project focuses on developing an advanced phishing detection
system using machine learning techniques.
• The system aims to address the need for accurate and efficient
detection by adopting a feature-rich approach.
• Diverse datasets containing phishing and legitimate URLs are
collected for training machine learning models.
• The project meets this need with Random Forest, a machine learning
algorithm known for its versatility and effectiveness in
classification tasks.
• Its ability to handle large datasets and nonlinear relationships, and
to support feature importance analysis, makes it well-suited for
detecting malicious URLs.
• By leveraging the algorithm's capabilities in classification tasks,
the system seeks to enhance cybersecurity measures and mitigate
the risks associated with malicious URLs.
7. Study on Existing Technologies
Title: Effective Phishing Detection using Machine Learning Techniques
Journal Details: IEEE Transactions on Information Forensics and Security, Vol. 15,
Issue 3, IEEE, 2020
Dataset: Various publicly available phishing datasets
Description:
Researchers investigated the application of machine learning techniques for phishing
detection. They employed a combination of feature extraction methods, including URL
structure analysis and content-based features, to distinguish between legitimate and
phishing URLs. By training Random Forest and Gradient Boosting classifiers on diverse
datasets, including the Phishing Websites Dataset (UCI) and the PhishTank dataset,
they achieved high accuracy rates exceeding 95%. The study highlighted the
importance of feature selection and model optimization for effective phishing
detection in real-world scenarios.
Advantages:
1. Robust feature extraction techniques improve detection accuracy across different
phishing datasets.
2. Ensemble learning approaches such as Random Forest and Gradient Boosting
enhance model performance and generalization.
Disadvantages:
3. Dependence on labelled datasets for training may limit scalability and adaptability
to emerging phishing threats.
4. Model interpretability may be compromised with complex ensemble learning
methods, hindering insights into decision-making processes.
Title: Deep Learning-Based Phishing Detection Using URL Features
Journal Details: ACM Transactions on Internet Technology, Vol. 21, Issue 2,
ACM, 2021
Dataset: Phishing URLs from PhishTank and legitimate URLs from Alexa Top
Sites
Description:
The study proposed a deep learning approach for phishing detection based on
URL features. Leveraging convolutional neural networks (CNNs) and recurrent
neural networks (RNNs), the model extracted meaningful representations
from URL strings and learned hierarchical features for classification. Trained
on a large-scale dataset comprising phishing URLs from PhishTank and
legitimate URLs from Alexa Top Sites, the model achieved competitive
performance with an accuracy of 96%. The research emphasized the
importance of feature engineering and model architecture design for effective
phishing detection.
Advantages:
1. Deep learning models capture intricate patterns in URL structures,
enhancing detection accuracy.
2. Ability to handle complex and evolving phishing tactics through
continuous model training and adaptation.
Disadvantages:
3. Computational complexity of deep learning models may limit real-time
deployment in resource-constrained environments.
Title: Machine Learning Approaches for Phishing Website Detection: A Review
Journal Details: Computers & Security, Vol. 98, Elsevier, 2020
Dataset: Various publicly available phishing datasets
Description:
This review article surveyed machine learning approaches for phishing website
detection. It analysed the effectiveness of different feature sets, including
lexical, host-based, and content-based features, in distinguishing phishing
websites from legitimate ones. By evaluating classifiers such as Support Vector
Machines (SVM), Decision Trees, and Neural Networks on diverse datasets, the
study provided insights into the strengths and limitations of each approach. The
review emphasized the need for ensemble methods and hybrid models to
enhance detection accuracy and robustness.
Advantages:
1. Comprehensive analysis of machine learning techniques provides valuable
insights for researchers and practitioners in the field.
2. Identification of feature sets and classifiers with high discriminatory power
aids in the development of effective phishing detection systems.
Disadvantages:
3. Limited exploration of novel feature extraction methods and ensemble
learning techniques may overlook potential improvements in detection
performance.
4. Lack of standardized evaluation metrics and benchmark datasets hinders
Title: A Novel Approach to Phishing Detection using Ensemble Learning Techniques
Journal Details: Journal of Cybersecurity, Vol. 5, Issue 4, Oxford University Press,
2021
Dataset: Phishing URLs from PhishTank and legitimate URLs from Common Crawl
Description:
Researchers proposed a novel ensemble learning approach for phishing detection
leveraging multiple base classifiers, including Random Forest, AdaBoost, and Gradient
Boosting Machines (GBM). By combining the predictions of base classifiers using
techniques such as majority voting and stacking, the ensemble model achieved
superior performance compared to individual classifiers. Trained and evaluated on a
large-scale dataset comprising phishing URLs from PhishTank and legitimate URLs
from Common Crawl, the model demonstrated robustness against diverse phishing
tactics and variations in URL characteristics.
Advantages:
1. Ensemble learning techniques mitigate the risk of overfitting and improve
generalization performance.
2. Combination of diverse base classifiers enhances the model's ability to capture
complex patterns in phishing URLs.
Disadvantages:
3. Increased computational overhead associated with ensemble learning may impact
real-time detection capabilities in resource-constrained environments.
4. Selection and tuning of ensemble parameters require careful optimization to
maximize performance gains while minimizing complexity.
Title: Phishing Detection Using Deep Learning: A Comparative Study
Journal Details: Journal of Computer Science and Technology, Vol. 36, Issue 5, Springer,
2021
Dataset: Phishing URLs from PhishTank and legitimate URLs from Alexa Top Sites
Description:
This study conducted a comparative analysis of deep learning models for phishing
detection, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks
(RNNs), and Long Short-Term Memory (LSTM) networks. By training and evaluating the
models on a comprehensive dataset containing phishing URLs from PhishTank and
legitimate URLs from Alexa Top Sites, the research assessed their performance in terms
of accuracy, precision, recall, and F1-score. The findings highlighted the effectiveness of
LSTM networks in capturing sequential patterns in URL data and achieving high detection
accuracy.
Advantages:
1. Deep learning models offer superior performance in capturing intricate patterns in
URL structures, leading to enhanced detection accuracy.
2. Comparative analysis enables researchers to identify the most effective deep
learning architectures for phishing detection tasks.
Disadvantages:
3. Resource-intensive nature of deep learning training and inference may pose
scalability challenges, particularly for large-scale deployment in production
environments.
4. Limited interpretability of deep learning models hinders understanding of the
underlying decision-making process and feature importance.
Title: Ensemble Learning Approaches for Phishing Website Detection: A Systematic
Review
Journal Details: IEEE Transactions on Information Forensics and Security, Vol. 15, Issue
3, IEEE, 2020
Dataset: Various publicly available phishing datasets
Description:
This systematic review investigated ensemble learning approaches for phishing website
detection, encompassing techniques such as bagging, boosting, and stacking. By
synthesizing findings from existing studies and benchmarking experiments, the review
evaluated the effectiveness of ensemble methods in improving detection performance
and robustness. The analysis revealed that ensemble learning techniques, particularly
stacking-based approaches, outperformed individual classifiers in terms of accuracy and
generalization.
Advantages:
1. Systematic review provides a comprehensive overview of ensemble learning
techniques for phishing detection, aiding researchers in selecting suitable approaches
for their specific requirements.
2. Identification of key factors influencing the performance of ensemble models
facilitates informed decision-making in model design and optimization.
Disadvantages:
3. Limited availability of standardized evaluation benchmarks and metrics complicates
direct comparison between different ensemble learning techniques.
4. Complexity of ensemble models may pose challenges in model interpretation and
explainability, limiting insights into model behaviour.
Title: Feature Selection Techniques for Improving Phishing Website Detection
Accuracy
Journal Details: Information Sciences, Vol. 528, Elsevier, 2020
Dataset: Various publicly available phishing datasets
Description:
This research investigated feature selection techniques to enhance the accuracy of
phishing website detection systems. By analysing the effectiveness of different
feature subsets, including lexical, host-based, and content-based features, the study
identified key features with high discriminatory power. Through experimentation on
diverse datasets, including the Phishing Websites Dataset (UCI) and the PhishTank
dataset, the research demonstrated the importance of feature selection in mitigating
the curse of dimensionality and improving classification performance.
Advantages:
1. Feature selection techniques optimize model performance by focusing on
informative features while reducing computational complexity.
2. Identification of discriminative features enhances the interpretability of phishing
detection models and facilitates insights into phishing tactics.
Disadvantages:
3. Selection of feature subsets may introduce bias or overlook critical information,
leading to suboptimal detection performance in certain scenarios.
4. Dependence on predefined feature sets may limit the adaptability of the
detection system to emerging phishing threats and variations in attack patterns.
Title: Hybrid Phishing Detection Framework Using Machine Learning and Text Mining
Techniques
Journal Details: Information Processing & Management, Vol. 58, Issue 6, Elsevier, 2021
Dataset: Phishing URLs from various sources
Description:
This study proposed a hybrid phishing detection framework combining machine learning
and text mining techniques. By integrating feature extraction methods such as TF-IDF
(Term Frequency-Inverse Document Frequency) and word embeddings with ensemble
classifiers, the framework achieved robust detection performance across diverse
phishing datasets. Trained and evaluated on a comprehensive dataset comprising
phishing URLs from various sources, including PhishTank and real-world phishing
incidents, the framework demonstrated high accuracy and resilience against evolving
phishing tactics.
Advantages:
1. Hybrid framework leverages the strengths of both machine learning and text mining
techniques, enhancing detection accuracy and robustness.
2. Integration of diverse feature extraction methods enables comprehensive analysis of
phishing URLs, capturing both structural and semantic aspects.
Disadvantages:
3. Complexity of hybrid models may increase computational overhead and deployment
challenges, particularly in resource-constrained environments.
4. Fine-tuning and optimization of feature extraction techniques require domain
expertise and may involve manual parameter tuning, potentially limiting scalability
and automation.
Table no. 1: Summary of existing implementations

S. No. | Article Title | Journal Details | Algorithm/Models | Dataset | Advantages | Disadvantages
1 | Effective Phishing Detection using Machine Learning Techniques | IEEE Transactions on Information Forensics and Security, Vol. 15, Issue 3, IEEE, 2020 | | Various publicly available phishing datasets | Robust feature extraction techniques improve detection accuracy across different phishing datasets. | Dependence on labelled datasets for training may limit scalability and adaptability to emerging phishing threats.
2 | Deep Learning-Based Phishing Detection Using URL Features | ACM Transactions on Internet Technology, Vol. 21, Issue 2, ACM, 2021 | | Phishing URLs from PhishTank and legitimate URLs from Alexa Top Sites | Deep learning models capture intricate patterns in URL structures, enhancing detection accuracy. | Computational complexity of deep learning models may limit real-time deployment in resource-constrained environments.
3 | Machine Learning Approaches for Phishing Website Detection: A Review | Computers & Security, Vol. 98, Elsevier, 2020 | | Various publicly available phishing datasets | Comprehensive analysis of machine learning techniques provides valuable insights for researchers and practitioners in the field. | Lack of standardized evaluation metrics and benchmark datasets.
4 | A Novel Approach to Phishing Detection using Ensemble Learning Techniques | Journal of Cybersecurity, Vol. 5, Issue 4, Oxford University Press, 2021 | | Phishing URLs from PhishTank and legitimate URLs from Common Crawl | Ensemble learning techniques mitigate the risk of overfitting and improve generalization performance. | Selection and tuning of ensemble parameters require careful optimization to maximize performance gains while minimizing complexity.
5 | Phishing Detection Using Deep Learning: A Comparative Study | Journal of Computer Science and Technology, Vol. 36, Issue 5, Springer, 2021 | | Phishing URLs from PhishTank and legitimate URLs from Alexa Top Sites | Comparative analysis enables researchers to identify the most effective deep learning architectures for phishing detection tasks. | Limited interpretability of deep learning models hinders understanding of the underlying decision-making process and feature importance.
6 | Ensemble Learning Approaches for Phishing Website Detection: A Systematic Review | IEEE Transactions on Information Forensics and Security, Vol. 15, Issue 3, IEEE, 2020 | | Various publicly available phishing datasets | Identification of key factors influencing the performance of ensemble models facilitates informed decision-making in model design and optimization. | Complexity of ensemble models may pose challenges in model interpretation and explainability, limiting insights into model behaviour.
7 | Feature Selection Techniques for Improving Phishing Website Detection Accuracy | Information Sciences, Vol. 528, Elsevier, 2020 | | Various publicly available phishing datasets | Feature selection techniques optimize model performance by focusing on informative features while reducing computational complexity. | Selection of feature subsets may introduce bias or overlook critical information, leading to suboptimal detection performance in certain scenarios.
8 | Hybrid Phishing Detection Framework Using Machine Learning and Text Mining Techniques | Information Processing & Management, Vol. 58, Issue 6, Elsevier, 2021 | | Phishing URLs from various sources | Hybrid framework leverages the strengths of both machine learning and text mining techniques, enhancing detection accuracy and robustness. | Complexity of hybrid models may increase computational overhead and deployment challenges, particularly in resource-constrained environments.
8. Gap Analysis
1. Limited Exploration of Temporal Dynamics: Existing research in
phishing detection often lacks comprehensive analysis of temporal
patterns and trends in phishing campaigns. Understanding how phishing
tactics evolve over time could provide valuable insights into the dynamic
nature of cyber threats and improve detection accuracy.
2. Lack of User-Centric Features: Many phishing detection systems
focus solely on URL and domain-based features, overlooking user-centric
indicators such as click behaviour, browsing history, and contextual
information. Incorporating these factors could enhance the effectiveness
of phishing detection by capturing user-specific patterns and
preferences.
3. Insufficient Interpretability: While machine learning models are
increasingly being used for phishing detection, the interpretability of
these models remains limited. There is a gap in research regarding
techniques for explaining model predictions and providing transparent
insights into the factors influencing detection outcomes.
4. Scalability Challenges: Many phishing detection systems face
scalability challenges when deployed in real-world environments with
large volumes of data and high traffic. Addressing scalability issues, such
as optimizing computational resources and streamlining processing
pipelines, is crucial for the practical implementation of detection
solutions.
9. SDLC Model

Fig 1. Agile Model

10. Data Collection
Name of the Dataset: Phishing Websites Dataset 2020
Description:
• This dataset consists of URLs labelled as either legitimate or malicious
for phishing detection tasks.
• Curated from various sources including PhishTank and real-world
incidents, it covers a wide range of phishing tactics and URL
characteristics.
• The dataset includes features extracted from URL structures, content,
and metadata, enabling comprehensive analysis for detection
purposes.
Number of Instances: 10,000

Train Set Size: 7,000 (70%)

Test Set Size: 2,000 (20%)

Validation Set Size: 1,000 (10%)

Feature Dimensionality: 50 (after feature engineering)

Data Format: CSV files with labelled URLs and corresponding features
11. Data Preparation
1. Data Cleaning:
• Input: Raw dataset with URLs and corresponding labels.
• Output: Cleaned dataset without missing values or duplicates.
• Procedure: Remove duplicate URLs and any rows with missing values
to ensure data integrity.
2. Feature Extraction:
• Input: Cleaned dataset with URLs.
• Output: Extracted features matrix.
• Procedure: Utilize feature extraction techniques to convert URLs into
numerical feature vectors.
Features may include:
• Domain length
• Presence of special characters
• Count of digits in the URL
• Presence of keywords associated with phishing
• URL length
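The features listed above can be computed directly from the URL string. The following is a minimal sketch using only the Python standard library; the function name `extract_features`, the keyword list, and the set of "special" characters are assumptions made for illustration and are not taken from the project.

```python
from urllib.parse import urlparse

# Hypothetical keyword list for illustration; the slides do not specify one.
PHISHING_KEYWORDS = ("login", "verify", "secure", "account", "update")
# Characters treated as "special" here; this set is an assumption.
SPECIAL_CHARS = set("@-_%=&?~")


def extract_features(url: str) -> dict:
    """Convert one URL into the numeric features listed above."""
    # urlparse only fills in the hostname when a scheme is present.
    parsed = urlparse(url if "://" in url else "http://" + url)
    domain = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "domain_length": len(domain),
        "digit_count": sum(ch.isdigit() for ch in url),
        "has_special_char": int(any(ch in SPECIAL_CHARS for ch in url)),
        "has_phishing_keyword": int(any(k in lowered for k in PHISHING_KEYWORDS)),
    }


if __name__ == "__main__":
    print(extract_features("http://secure-login.example.com/verify?id=123"))
```

Applying the function to every URL in the cleaned dataset yields one row of the feature matrix per URL.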
11. Data Preparation

3. Data Splitting:
• Input: Feature matrix and corresponding labels.
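• Output: Training set and held-out set.
• Procedure: Randomly split the dataset into training (70%)
and held-out (30%) sets to facilitate model training and
evaluation. The held-out 30% corresponds to the test (20%) and
validation (10%) sets listed under Data Collection.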
4. Data Balancing (optional):
• Input: Training set with imbalanced classes.
• Output: Balanced training set.
• Procedure: Apply techniques such as oversampling (e.g.,
SMOTE) or undersampling to balance the class distribution
in the training set (if applicable).
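The splitting and balancing steps can be sketched with the standard library alone. This example assumes the 70/20/10 train/test/validation proportions stated under Data Collection, and it uses plain random oversampling as a simpler stand-in for SMOTE (SMOTE synthesizes new minority samples, whereas this sketch only duplicates existing ones). The function names and the synthetic rows are hypothetical.

```python
import random


def split_dataset(rows, train=0.7, test=0.2, seed=42):
    """Shuffle rows and split them into train, test, and validation parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train)
    n_test = round(len(rows) * test)
    # Whatever remains after train and test becomes the validation set.
    return rows[:n_train], rows[n_train:n_train + n_test], rows[n_train + n_test:]


def oversample_minority(rows, label_of, seed=42):
    """Duplicate random rows of smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    # Synthetic (feature_vector, label) rows: 8,000 legitimate (0), 2,000 malicious (1).
    data = [([i], 0) for i in range(8000)] + [([i], 1) for i in range(2000)]
    train_set, test_set, val_set = split_dataset(data)
    print(len(train_set), len(test_set), len(val_set))
    balanced = oversample_minority(train_set, label_of=lambda row: row[1])
    print(len(balanced))
```

Balancing is applied to the training set only, so the test and validation sets keep the original class distribution.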

12.1. Proposed Model

Fig (a): System Architecture
12.2. Modules of Proposed Model
Module 1 : Data Collection and Preprocessing
1. Gather data on URLs from various sources, including web crawls,
phishing databases, and security feeds.
2. Pre-process the data by handling missing values, standardizing
formats, and removing duplicates to ensure data integrity.
Module 2: Feature Extraction and Engineering
3. Extract features from URLs using techniques such as Bag-of-Words,
TF-IDF, and URL parsing to capture relevant information.
4. Engineer additional features such as domain reputation, URL length,
and presence of suspicious keywords to enhance model
performance.
Module 3 : Model Training
5. Train a Random Forest classifier to learn patterns in the extracted
features and distinguish between legitimate and phishing URLs.
6. Train a Convolutional Neural Network (CNN) to extract features from
URL images and complement the feature-based model.
7. Perform ensemble aggregation, such as majority voting or stacking,
to combine predictions from multiple models for improved accuracy.
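The Random Forest training step in Module 3 might look like the sketch below. It assumes scikit-learn is the library in use (the slides do not name one) and trains on synthetic data from `make_classification` as a stand-in for the URL feature matrix; the CNN and the ensemble aggregation step are not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the URL feature matrix (not the project's dataset).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# n_estimators=100 is scikit-learn's default, used here only for illustration.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# predict_proba() averages the per-tree class probabilities;
# predict() returns the class with the highest averaged probability.
test_accuracy = model.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```

The second column of `model.predict_proba(...)` can serve as a risk score for the malicious class when the labels are 0 (legitimate) and 1 (malicious).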
12.2. Modules of Proposed Model
Module 4 : Model Evaluation and Optimization
1. Evaluate the trained models using performance metrics such
as accuracy, precision, recall, and F1-score on a validation set.
2. Optimize hyperparameters and model architecture through
techniques like grid search or Bayesian optimization to
enhance performance.
Module 5: Deployment
3. Develop a web-based application or API endpoint to receive
URLs for real-time phishing detection.
4. Integrate the trained model into the application backend to
provide instant predictions on submitted URLs.
Module 6 : User Interface
5. Design an intuitive and user-friendly interface for users to
interact with the phishing detection system.
6. Provide informative visualizations and feedback to users, such
as risk scores or confidence levels, to aid in decision-making.
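The evaluation metrics named in Module 4 follow directly from the counts of true positives, false positives, and false negatives. A small sketch, assuming the label 1 means malicious and 0 means legitimate (the function name and the example labels are illustrative):

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(classification_metrics(y_true, y_pred))
```

For malicious URL detection, recall measures how many malicious URLs are caught, while precision measures how many flagged URLs are actually malicious.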
13. Timeline Chart
TIME PLAN (2024: JAN, FEB, MAR, APR)
S.NO. PLAN OF ACTION
1 Project Initiation
2 Data Collection and Preprocessing
3 Feature Engineering
4 Model Development
5 Interpretability and Explainability
6 Testing and Evaluation
7 Deployment and Integration
8 Documentation and Reporting
14. Summary
1. Malicious URLs are often disguised as legitimate links, leading
users to inadvertently expose sensitive information or
compromise system security.
2. Existing detection methods may struggle to keep pace with the
evolving tactics of cybercriminals, leaving users vulnerable to
exploitation.
3. Utilizing Random Forest, a machine learning algorithm known
for its effectiveness in classification tasks, to develop a robust
model for detecting malicious URLs.
4. Training the model on a comprehensive dataset comprising
both malicious and benign URLs to enable accurate
identification of potentially harmful links.
5. Leveraging features such as URL length, domain reputation,
and presence of suspicious keywords to enhance the model's
predictive capabilities.
6. Evaluating the model's performance through rigorous testing
and validation processes to ensure reliability and effectiveness
in real-world scenarios.
7. Ultimately, aiming to provide a proactive defense mechanism
15. Results and Output
15.1 Accuracy
We present a graphical analysis of training and validation
accuracy, with accuracy on the y-axis and n_estimators (the number
of trees in the forest) on the x-axis. This visualization shows the
learning dynamics of our model, in particular how its accuracy
converges as trees are added. Fig 15.1 shows the graph of training
and validation accuracy.

Fig 15.1 Training and Validation Accuracy
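The data behind a curve like Fig 15.1 can be produced by retraining the forest for several values of n_estimators and recording both accuracies. This is a sketch under assumptions: it uses scikit-learn and synthetic data, and the list of tree counts is arbitrary, so the numbers it prints are not the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the URL feature matrix (not the project's dataset).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

tree_counts = [1, 5, 10, 50, 100]  # x-axis values, chosen for illustration
train_acc, val_acc = [], []
for n in tree_counts:
    forest = RandomForestClassifier(n_estimators=n, random_state=0)
    forest.fit(X_train, y_train)
    train_acc.append(forest.score(X_train, y_train))
    val_acc.append(forest.score(X_val, y_val))

for n, tr, va in zip(tree_counts, train_acc, val_acc):
    print(f"n_estimators={n:3d}  train={tr:.3f}  validation={va:.3f}")
```

Plotting `train_acc` and `val_acc` against `tree_counts` gives the two curves; a persistent gap between them would indicate overfitting.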

15.2 Feature Importance of Model
Feature importance is a crucial aspect of understanding how a
machine learning model makes predictions. In the context of this
project on URL classification, feature importance refers to
identifying which features or characteristics of a URL are most
influential in determining whether it is legitimate or malicious.

Fig 15.2 Feature Importance of the Model
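A ranking like the one in Fig 15.2 can be read from a trained forest. The sketch below assumes scikit-learn, whose `RandomForestClassifier` exposes impurity-based importances through the `feature_importances_` attribute. The feature names are hypothetical labels attached to synthetic columns, so the printed ranking says nothing about real URL features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names; the data below is synthetic, not URL-derived.
feature_names = [
    "url_length",
    "domain_length",
    "digit_count",
    "has_special_char",
    "has_phishing_keyword",
]
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Importances are non-negative and sum to 1 across all features.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:22s} {score:.3f}")
```

Impurity-based importances can favour features with many distinct values, which is one reason to cross-check them with SHAP or LIME explanations.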
15.3 Outputs

Fig 15.3.1 Legitimate URL Output

Fig 15.3.2 Malicious URL Output
References
1. Sheng, Steve, et al. "PHISHNET: Predictive blacklisting to detect
phishing attacks." IEEE Transactions on Dependable and Secure
Computing 7.3 (2010): 274-287.
2. Zhang, Y., Hong, J.I., Cranor, L.F. and Zheng, X., 2007, April.
Phinding phish: An evaluation of anti-phishing toolbars. In
Proceedings of the SIGCHI conference on Human factors in
computing systems (pp. 373-382).
3. Wang, H., Zhang, J., Shao, J., Liu, L. and Hu, J., 2017. Phishing
websites detection based on deep belief network. In 2017 4th
International Conference on Systems and Informatics (ICSAI) (pp.
250-254). IEEE.
4. Kardas, G., Rasmussen, K.B., Senkul, P., Gelenbe, E. and
Camtepe, S.A., 2010. Phishing websites detection using
generative models. In 2010 IEEE International Conference on
Communications (pp. 1-6). IEEE.
5. Shrivastava, A., & Chauhan, D. (2019). Detecting phishing
attacks using machine learning techniques: A review. IEEE
Access, 7, 167858-167882.
6. Boukhtouta, A., & El Hajji, S. (2016). Phishing detection based on
Thank you

No ratings yet
Fake Website Detection
13 pages
Department of Computer Engineering: Phishing Website Detector Using ML
No ratings yet
Department of Computer Engineering: Phishing Website Detector Using ML
13 pages
B5_PPT_Final-1
No ratings yet
B5_PPT_Final-1
15 pages
Malicious Site Detection (MSD)
No ratings yet
Malicious Site Detection (MSD)
58 pages
Major Proj Sumanthppt
No ratings yet
Major Proj Sumanthppt
13 pages
Phishing-Detection Using Ml[1]
No ratings yet
Phishing-Detection Using Ml[1]
14 pages
Fin Irjmets1682919970
No ratings yet
Fin Irjmets1682919970
5 pages
MINI PROJECT PHISHING WEBSITE DETECTION USING ML
No ratings yet
MINI PROJECT PHISHING WEBSITE DETECTION USING ML
45 pages
Midterm Project Report
No ratings yet
Midterm Project Report
21 pages
Phishing_Review_2023
No ratings yet
Phishing_Review_2023
17 pages
Fr -Detecting Malicious Urls Using Data Analytics
No ratings yet
Fr -Detecting Malicious Urls Using Data Analytics
17 pages
Phishing URL Detection Presentation[1]
No ratings yet
Phishing URL Detection Presentation[1]
12 pages
phishing final
No ratings yet
phishing final
13 pages
B5_Project Synopsis
No ratings yet
B5_Project Synopsis
5 pages
(IJETA-V11I3P35) : Ms. Apoorva Joshi, Ms. Apoorva Joshi, Manvi Bhardwaj
No ratings yet
(IJETA-V11I3P35) : Ms. Apoorva Joshi, Ms. Apoorva Joshi, Manvi Bhardwaj
4 pages
20mis0106 VL2023240102875 Pe003
No ratings yet
20mis0106 VL2023240102875 Pe003
42 pages
CyberSec Review3 Team10
No ratings yet
CyberSec Review3 Team10
28 pages
Machine Learning-Driven Phishing Detection: A Robust Browser Extension Solution
No ratings yet
Machine Learning-Driven Phishing Detection: A Robust Browser Extension Solution
4 pages
PHISHING WEBSITE DETECTION USING MACHINE LEARNING - COMPLETED (1) Full
No ratings yet
PHISHING WEBSITE DETECTION USING MACHINE LEARNING - COMPLETED (1) Full
73 pages
Machine_Learning_for_Detecting_the_Phishing_Threats
No ratings yet
Machine_Learning_for_Detecting_the_Phishing_Threats
6 pages
Phishing Phase1 Report
No ratings yet
Phishing Phase1 Report
20 pages
Phishing URL Detection Using ML: Project Report
No ratings yet
Phishing URL Detection Using ML: Project Report
25 pages
phisingppt
No ratings yet
phisingppt
15 pages
20mis0106 VL2023240103172 Pe003
No ratings yet
20mis0106 VL2023240103172 Pe003
5 pages
Major Project Final Report
No ratings yet
Major Project Final Report
53 pages
Phishing Website Detection
No ratings yet
Phishing Website Detection
19 pages
Leveraging Advanced Machine Learning Techniques For Phishing Website Detection
No ratings yet
Leveraging Advanced Machine Learning Techniques For Phishing Website Detection
6 pages
Introduction(8)
No ratings yet
Introduction(8)
4 pages
Paper 7AdvancesinEngineeringSoftware
No ratings yet
Paper 7AdvancesinEngineeringSoftware
6 pages
Detection of Phishing Website
No ratings yet
Detection of Phishing Website
12 pages
final ppt
No ratings yet
final ppt
26 pages
CSE3502-Final J Comp Report
No ratings yet
CSE3502-Final J Comp Report
20 pages
Enhancing Phishing URL Detection Through Comprehen
No ratings yet
Enhancing Phishing URL Detection Through Comprehen
7 pages
Phishing Paper 2
No ratings yet
Phishing Paper 2
6 pages
Final Research Paper
No ratings yet
Final Research Paper
6 pages
Research Report
No ratings yet
Research Report
19 pages
Phishing Detection Using Machine Learnin
No ratings yet
Phishing Detection Using Machine Learnin
5 pages
Content Pages CPE
No ratings yet
Content Pages CPE
79 pages
PUMMP: Phishing URL Detection Using Machine Learning With Monomorphic and Polymorphic Treatment of Features
No ratings yet
PUMMP: Phishing URL Detection Using Machine Learning With Monomorphic and Polymorphic Treatment of Features
20 pages
P1
No ratings yet
P1
13 pages
CH 2. Literature Survey
No ratings yet
CH 2. Literature Survey
5 pages
155-Article Text-230-3-10-20230813
No ratings yet
155-Article Text-230-3-10-20230813
7 pages
Final Synopsisi 2
No ratings yet
Final Synopsisi 2
11 pages
Fake Url
No ratings yet
Fake Url
64 pages
1822 B.E Cse Batchno 287
No ratings yet
1822 B.E Cse Batchno 287
65 pages
A multi-algorithm approach for phishing uniform resource locator’s detection
No ratings yet
A multi-algorithm approach for phishing uniform resource locator’s detection
10 pages
128 Submission
No ratings yet
128 Submission
7 pages
Mini Project Report Sample Format 2024 - Final
No ratings yet
Mini Project Report Sample Format 2024 - Final
80 pages
Ethical Hacking Basics for New Coders: A Practical Guide with Examples
From Everand
Ethical Hacking Basics for New Coders: A Practical Guide with Examples
William E. Clark
No ratings yet
Fascination: Honeypots and Cybercrime
From Everand
Fascination: Honeypots and Cybercrime
Armin Snyder
No ratings yet
BTP Sixth Sem Report
No ratings yet
BTP Sixth Sem Report
31 pages
DSBDA ORAL Question Bank
100% (1)
DSBDA ORAL Question Bank
6 pages
MSC Thesis Proposal of Student Dropout Performance Analysis Using Machine Learning Techniques in Case of Wolaita Sodo University
100% (1)
MSC Thesis Proposal of Student Dropout Performance Analysis Using Machine Learning Techniques in Case of Wolaita Sodo University
28 pages
Feature Selection Techniques in Machine Learning
No ratings yet
Feature Selection Techniques in Machine Learning
9 pages
Steps of Implementation of A GLM
No ratings yet
Steps of Implementation of A GLM
8 pages
AI Powered IDS
No ratings yet
AI Powered IDS
6 pages
Dimensionality Reduction: Pca, SVD, MDS, Ica, and Friends
No ratings yet
Dimensionality Reduction: Pca, SVD, MDS, Ica, and Friends
50 pages
Diagnosis of Heart Disease Using Data Mining Algorithm
No ratings yet
Diagnosis of Heart Disease Using Data Mining Algorithm
3 pages
Semi-Supervised K-Means Ddos Detection Method Using Hybrid Feature Selection Algorithm
No ratings yet
Semi-Supervised K-Means Ddos Detection Method Using Hybrid Feature Selection Algorithm
15 pages
Machine Learning Notes Unit 1 To 4
No ratings yet
Machine Learning Notes Unit 1 To 4
101 pages
Local-Learning-Based Feature Selection For High-Dimensional Data Analysis
No ratings yet
Local-Learning-Based Feature Selection For High-Dimensional Data Analysis
18 pages
Download Complete Computational Advancement in Communication Circuits and Systems Proceedings of 3rd ICCACCS 2020 1st Edition M. Mitra PDF for All Chapters
No ratings yet
Download Complete Computational Advancement in Communication Circuits and Systems Proceedings of 3rd ICCACCS 2020 1st Edition M. Mitra PDF for All Chapters
22 pages
ML Module 1
No ratings yet
ML Module 1
26 pages
Article 4
No ratings yet
Article 4
7 pages
Module 6 - Intrusion Detection System
No ratings yet
Module 6 - Intrusion Detection System
31 pages
Algorithmic_Trading_Bot
No ratings yet
Algorithmic_Trading_Bot
11 pages
TSE-IDS: A Two-Stage Classifier Ensemble For Intelligent Anomaly-Based Intrusion Detection System
No ratings yet
TSE-IDS: A Two-Stage Classifier Ensemble For Intelligent Anomaly-Based Intrusion Detection System
11 pages
DOC-20250125-WA0000.
No ratings yet
DOC-20250125-WA0000.
15 pages
Boosting Algorithms: Regularization, Prediction and Model Fitting
No ratings yet
Boosting Algorithms: Regularization, Prediction and Model Fitting
29 pages
Dimensionality Reduction
No ratings yet
Dimensionality Reduction
4 pages
190319windercleaningdatascience1576692643371 PDF
No ratings yet
190319windercleaningdatascience1576692643371 PDF
110 pages
Software Defect Prediction Using Ensemble Learning
No ratings yet
Software Defect Prediction Using Ensemble Learning
6 pages
Official: Á1039Ñ Chemometrics
No ratings yet
Official: Á1039Ñ Chemometrics
18 pages
Pfe Book 2022: Internship 2022
No ratings yet
Pfe Book 2022: Internship 2022
30 pages
A Logit Boost Based Algorithm For Detect
No ratings yet
A Logit Boost Based Algorithm For Detect
12 pages
1 s2.0 S0038092X11000193 Main
No ratings yet
1 s2.0 S0038092X11000193 Main
11 pages
Unit No.02 - Feature Extraction and Selection
No ratings yet
Unit No.02 - Feature Extraction and Selection
17 pages
Predictive Maintenance Based On Event-Log Analysis: A Case Study
100% (1)
Predictive Maintenance Based On Event-Log Analysis: A Case Study
12 pages
Harris Hawks Optimization (HHO) Algorithm based on Artificial Neural Network for Heart Disease Diagnosis
No ratings yet
Harris Hawks Optimization (HHO) Algorithm based on Artificial Neural Network for Heart Disease Diagnosis
5 pages