Malicious URL Detection Using Random Forest

Abstract
Phishing attacks pose significant threats in cyberspace, exploiting
human vulnerabilities to extract sensitive information or spread
malware. Our contribution lies in leveraging advanced machine learning
algorithms to enhance the accuracy and effectiveness of phishing
detection. By integrating a comprehensive set of features derived from
URLs, domains, and user behaviour, we aim to create a robust detection
framework capable of identifying sophisticated phishing attempts.
Furthermore, we incorporate temporal analysis to capture the dynamic
nature of phishing campaigns, thereby improving our system's
adaptability and responsiveness to evolving threats over time.
Additionally, interpretability techniques such as SHAP (SHapley Additive
exPlanations) values and LIME (Local Interpretable Model-agnostic
Explanations) are employed to provide insights into the factors driving
our model's decisions, enhancing transparency and trustworthiness.
Through extensive testing and evaluation on diverse datasets, our
project aims to contribute to the advancement of cybersecurity by
providing a proactive defence against phishing attacks.
Keywords: Phishing Detection, Machine Learning, Feature Engineering,
Cybersecurity, URL Analysis, Malicious Behavior, Model Interpretability,
Temporal Analysis, Ensemble Learning, Real-world Testing, Online
Security, Cyber Threats, User Protection
Presentation Outline
1. Aim and Motivation
2. Research Questions
3. Title Justification
4. Objectives
5. Scope
6. Introduction
7. Study on Existing Technologies
8. Gap Analysis
9. SDLC Model
10. Data Collection
11. Data Preparation
12. Methodology
12.1. Proposed Model
13. Timeline Chart
14. Summary
15. Results and Output
References
1. Aim and Motivation
Aim:
Develop an advanced phishing detection system using machine learning
techniques, focusing on feature-rich approaches that accurately and
efficiently distinguish between malicious and legitimate URLs, to enhance
online security and safeguard users against cyber threats.
Motivation:
• Phishing attacks, malware distribution, and other forms of online fraud
pose significant risks to individuals and organizations, exploiting
human vulnerabilities to compromise sensitive information and cause
financial or reputational damage.
• The motivation behind the project is to enhance online security by
developing a proactive defense mechanism that can automatically
identify and mitigate malicious URLs, thereby safeguarding users and
organizations against cyber threats.
• By leveraging machine learning techniques, particularly Random
Forest, the project seeks to automate the process of detecting
malicious URLs.
• A further motivation is to safeguard users from falling victim to online
threats by providing a layer of defense against malicious URLs. It helps in
2. Research Questions

1. How does the performance of the Random Forest-based
detection system vary across different types of malicious
URLs, such as phishing, malware distribution, or fraudulent
websites?

2. What are the key features extracted from URLs that
contribute most significantly to the detection of malicious
URLs using Random Forest?

3. To what extent does the incorporation of ensemble learning
techniques, particularly Random Forest, improve the accuracy
and robustness of malicious URL detection compared to
traditional methods?

3. Title Justification
• Malicious URL Detection: The project aims to identify and
classify URLs as either malicious or benign, focusing on
enhancing cybersecurity measures.
• Random Forest: Leveraging the Random Forest algorithm,
the project employs ensemble learning techniques to develop
a robust model for URL classification.
• Enhancing Cybersecurity Measures: The primary
objective is to contribute to the improvement of
cybersecurity infrastructure by effectively detecting and
mitigating threats posed by malicious URLs.
• Safeguarding Users and Organizations: Ultimately, the
project aims to protect users and organizations from falling
victim to cyber threats, ensuring the integrity and security of
online activities.

4. Objectives

1. To develop a robust machine learning model based on
Random Forest for effectively detecting malicious URLs.
2. To contribute to improving cybersecurity measures by
providing an effective tool for detecting malicious URLs in
real-time environments.
3. To safeguard users from falling victim to online threats by
providing a layer of defense against malicious URLs, which
helps in maintaining user trust and confidence in online
platforms and services.
4. To detect malicious URLs, which is essential for protecting
sensitive information, maintaining business continuity, and
safeguarding reputation.
5. To implement advanced cybersecurity measures to
safeguard organizational data and assets from potential
breaches.
5. Scope

1. To develop a machine learning-based phishing detection system
capable of accurately distinguishing between legitimate and
malicious URLs in real-time. This involves the implementation of
advanced feature extraction techniques and machine learning
algorithms to analyze URL characteristics, domain reputation, and
user behaviour patterns. The system will provide proactive
protection against evolving phishing attacks by continuously
monitoring and analyzing incoming URLs, thereby enhancing
cybersecurity measures for end-users.
2. To integrate interpretability techniques, such as SHAP (SHapley
Additive exPlanations) values and LIME (Local Interpretable
Model-agnostic Explanations), to provide insights into the factors
influencing the model's predictions. This will enable users to
understand the rationale behind the system's decisions and
enhance trustworthiness. Additionally, the system will be
designed to support scalability and efficiency, allowing for
deployment in various environments with minimal computational
overhead.
6. Introduction
• Phishing attacks remain a significant threat, exploiting human
vulnerabilities to compromise sensitive information.
• This project focuses on developing an advanced phishing detection
system using machine learning techniques.
• The system aims to address the need for accurate and efficient
detection by adopting a feature-rich approach.
• Diverse datasets containing phishing and legitimate URLs are
collected for training machine learning models.
• This project addresses this pressing need through the utilization of
Random Forest, a powerful machine learning algorithm renowned
for its versatility and effectiveness in classification tasks.
• Its ability to handle large datasets, nonlinear relationships, and
feature importance analysis makes it well-suited for detecting
malicious URLs.
• By leveraging the algorithm's capabilities in classification tasks,
the system seeks to enhance cybersecurity measures and mitigate
the risks associated with malicious URLs.
7. Study on Existing Technologies
Title: Effective Phishing Detection using Machine Learning Techniques
Journal Details: IEEE Transactions on Information Forensics and Security, Vol. 15,
Issue 3, IEEE, 2020
Dataset: Various publicly available phishing datasets
Description:
Researchers investigated the application of machine learning techniques for phishing
detection. They employed a combination of feature extraction methods, including URL
structure analysis and content-based features, to distinguish between legitimate and
phishing URLs. By training Random Forest and Gradient Boosting classifiers on diverse
datasets, including the Phishing Websites Dataset (UCI) and the PhishTank dataset,
they achieved high accuracy rates exceeding 95%. The study highlighted the
importance of feature selection and model optimization for effective phishing
detection in real-world scenarios.
Advantages:
1. Robust feature extraction techniques improve detection accuracy across different
phishing datasets.
2. Ensemble learning approaches such as Random Forest and Gradient Boosting
enhance model performance and generalization.
Disadvantages:
3. Dependence on labelled datasets for training may limit scalability and adaptability
to emerging phishing threats.
4. Model interpretability may be compromised with complex ensemble learning
methods, hindering insights into decision-making processes.
Title: Deep Learning-Based Phishing Detection Using URL Features
Journal Details: ACM Transactions on Internet Technology, Vol. 21, Issue 2,
ACM, 2021
Dataset: Phishing URLs from PhishTank and legitimate URLs from Alexa Top
Sites
Description:
The study proposed a deep learning approach for phishing detection based on
URL features. Leveraging convolutional neural networks (CNNs) and recurrent
neural networks (RNNs), the model extracted meaningful representations
from URL strings and learned hierarchical features for classification. Trained
on a large-scale dataset comprising phishing URLs from PhishTank and
legitimate URLs from Alexa Top Sites, the model achieved competitive
performance with an accuracy of 96%. The research emphasized the
importance of feature engineering and model architecture design for effective
phishing detection.
Advantages:
1. Deep learning models capture intricate patterns in URL structures,
enhancing detection accuracy.
2. Ability to handle complex and evolving phishing tactics through
continuous model training and adaptation.
Disadvantages:
3. Computational complexity of deep learning models may limit real-time
deployment in resource-constrained environments.
Title: Machine Learning Approaches for Phishing Website Detection: A Review
Journal Details: Computers & Security, Vol. 98, Elsevier, 2020
Dataset: Various publicly available phishing datasets
Description:
This review article surveyed machine learning approaches for phishing website
detection. It analysed the effectiveness of different feature sets, including
lexical, host-based, and content-based features, in distinguishing phishing
websites from legitimate ones. By evaluating classifiers such as Support Vector
Machines (SVM), Decision Trees, and Neural Networks on diverse datasets, the
study provided insights into the strengths and limitations of each approach. The
review emphasized the need for ensemble methods and hybrid models to
enhance detection accuracy and robustness.
Advantages:
1. Comprehensive analysis of machine learning techniques provides valuable
insights for researchers and practitioners in the field.
2. Identification of feature sets and classifiers with high discriminatory power
aids in the development of effective phishing detection systems.
Disadvantages:
3. Limited exploration of novel feature extraction methods and ensemble
learning techniques may overlook potential improvements in detection
performance.
4. Lack of standardized evaluation metrics and benchmark datasets hinders
Title: A Novel Approach to Phishing Detection using Ensemble Learning Techniques
Journal Details: Journal of Cybersecurity, Vol. 5, Issue 4, Oxford University Press,
2021
Dataset: Phishing URLs from PhishTank and legitimate URLs from Common Crawl
Description:
Researchers proposed a novel ensemble learning approach for phishing detection
leveraging multiple base classifiers, including Random Forest, AdaBoost, and Gradient
Boosting Machines (GBM). By combining the predictions of base classifiers using
techniques such as majority voting and stacking, the ensemble model achieved
superior performance compared to individual classifiers. Trained and evaluated on a
large-scale dataset comprising phishing URLs from PhishTank and legitimate URLs
from Common Crawl, the model demonstrated robustness against diverse phishing
tactics and variations in URL characteristics.
Advantages:
1. Ensemble learning techniques mitigate the risk of overfitting and improve
generalization performance.
2. Combination of diverse base classifiers enhances the model's ability to capture
complex patterns in phishing URLs.
Disadvantages:
3. Increased computational overhead associated with ensemble learning may impact
real-time detection capabilities in resource-constrained environments.
4. Selection and tuning of ensemble parameters require careful optimization to
maximize performance gains while minimizing complexity.
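The majority-voting step described for this study can be illustrated with a minimal sketch. The three prediction lists below are hypothetical outputs of base classifiers, not results from the paper, and the function name `majority_vote` is introduced here only for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote.

    `predictions` is a list of equal-length label lists, one per base classifier.
    On a tie, the label that was seen first among the votes is returned.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three base classifiers (1 = phishing, 0 = legitimate).
rf_pred = [1, 0, 1, 1]
ada_pred = [1, 0, 0, 1]
gbm_pred = [0, 0, 1, 1]
ensemble_pred = majority_vote([rf_pred, ada_pred, gbm_pred])
print(ensemble_pred)
```

Stacking, the other combination technique mentioned, would instead train a further model on the base classifiers' outputs; it is not shown here.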
Title: Phishing Detection Using Deep Learning: A Comparative Study
Journal Details: Journal of Computer Science and Technology, Vol. 36, Issue 5, Springer,
2021
Dataset: Phishing URLs from PhishTank and legitimate URLs from Alexa Top Sites
Description:
This study conducted a comparative analysis of deep learning models for phishing
detection, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks
(RNNs), and Long Short-Term Memory (LSTM) networks. By training and evaluating the
models on a comprehensive dataset containing phishing URLs from PhishTank and
legitimate URLs from Alexa Top Sites, the research assessed their performance in terms
of accuracy, precision, recall, and F1-score. The findings highlighted the effectiveness of
LSTM networks in capturing sequential patterns in URL data and achieving high detection
accuracy.
Advantages:
1. Deep learning models offer superior performance in capturing intricate patterns in
URL structures, leading to enhanced detection accuracy.
2. Comparative analysis enables researchers to identify the most effective deep
learning architectures for phishing detection tasks.
Disadvantages:
3. Resource-intensive nature of deep learning training and inference may pose
scalability challenges, particularly for large-scale deployment in production
environments.
4. Limited interpretability of deep learning models hinders understanding of the
underlying decision-making process and feature importance.
Title: Ensemble Learning Approaches for Phishing Website Detection: A Systematic
Review
Journal Details: IEEE Transactions on Information Forensics and Security, Vol. 15, Issue
3, IEEE, 2020
Dataset: Various publicly available phishing datasets
Description:
This systematic review investigated ensemble learning approaches for phishing website
detection, encompassing techniques such as bagging, boosting, and stacking. By
synthesizing findings from existing studies and benchmarking experiments, the review
evaluated the effectiveness of ensemble methods in improving detection performance
and robustness. The analysis revealed that ensemble learning techniques, particularly
stacking-based approaches, outperformed individual classifiers in terms of accuracy and
generalization.
Advantages:
1. Systematic review provides a comprehensive overview of ensemble learning
techniques for phishing detection, aiding researchers in selecting suitable approaches
for their specific requirements.
2. Identification of key factors influencing the performance of ensemble models
facilitates informed decision-making in model design and optimization.
Disadvantages:
3. Limited availability of standardized evaluation benchmarks and metrics complicates
direct comparison between different ensemble learning techniques.
4. Complexity of ensemble models may pose challenges in model interpretation and
explainability, limiting insights into model behaviour.
Title: Feature Selection Techniques for Improving Phishing Website Detection
Accuracy
Journal Details: Information Sciences, Vol. 528, Elsevier, 2020
Dataset: Various publicly available phishing datasets
Description:
This research investigated feature selection techniques to enhance the accuracy of
phishing website detection systems. By analysing the effectiveness of different
feature subsets, including lexical, host-based, and content-based features, the study
identified key features with high discriminatory power. Through experimentation on
diverse datasets, including the Phishing Websites Dataset (UCI) and the PhishTank
dataset, the research demonstrated the importance of feature selection in mitigating
the curse of dimensionality and improving classification performance.
Advantages:
1. Feature selection techniques optimize model performance by focusing on
informative features while reducing computational complexity.
2. Identification of discriminative features enhances the interpretability of phishing
detection models and facilitates insights into phishing tactics.
Disadvantages:
3. Selection of feature subsets may introduce bias or overlook critical information,
leading to suboptimal detection performance in certain scenarios.
4. Dependence on predefined feature sets may limit the adaptability of the
detection system to emerging phishing threats and variations in attack patterns.
Title: Hybrid Phishing Detection Framework Using Machine Learning and Text Mining
Techniques
Journal Details: Information Processing & Management, Vol. 58, Issue 6, Elsevier, 2021
Dataset: Phishing URLs from various sources
Description:
This study proposed a hybrid phishing detection framework combining machine learning
and text mining techniques. By integrating feature extraction methods such as TF-IDF
(Term Frequency-Inverse Document Frequency) and word embeddings with ensemble
classifiers, the framework achieved robust detection performance across diverse
phishing datasets. Trained and evaluated on a comprehensive dataset comprising
phishing URLs from various sources, including PhishTank and real-world phishing
incidents, the framework demonstrated high accuracy and resilience against evolving
phishing tactics.
Advantages:
1. Hybrid framework leverages the strengths of both machine learning and text mining
techniques, enhancing detection accuracy and robustness.
2. Integration of diverse feature extraction methods enables comprehensive analysis of
phishing URLs, capturing both structural and semantic aspects.
Disadvantages:
3. Complexity of hybrid models may increase computational overhead and deployment
challenges, particularly in resource-constrained environments.
4. Fine-tuning and optimization of feature extraction techniques require domain
expertise and may involve manual parameter tuning, potentially limiting scalability
and automation.
Table no. 1: Summary of existing implementations

1. Article Title: Effective Phishing Detection using Machine Learning Techniques
Journal Details: IEEE Transactions on Information Forensics and Security, Vol. 15, Issue 3, IEEE, 2020
Dataset: Various publicly available phishing datasets
Advantages: Robust feature extraction techniques improve detection accuracy across different phishing datasets.
Disadvantages: Dependence on labelled datasets for training may limit scalability and adaptability to emerging phishing threats.

2. Article Title: Deep Learning-Based Phishing Detection Using URL Features
Journal Details: ACM Transactions on Internet Technology, Vol. 21, Issue 2, ACM, 2021
Dataset: Phishing URLs from PhishTank and legitimate URLs from Alexa Top Sites
Advantages: Deep learning models capture intricate patterns in URL structures, enhancing detection accuracy.
Disadvantages: Computational complexity of deep learning models may limit real-time deployment in resource-constrained environments.

3. Article Title: Machine Learning Approaches for Phishing Website Detection: A Review
Journal Details: Computers & Security, Vol. 98, Elsevier, 2020
Dataset: Various publicly available phishing datasets
Advantages: Comprehensive analysis of machine learning techniques provides valuable insights for researchers and practitioners in the field.
Disadvantages: Lack of standardized evaluation metrics and benchmark datasets.

4. Article Title: A Novel Approach to Phishing Detection using Ensemble Learning Techniques
Journal Details: Journal of Cybersecurity, Vol. 5, Issue 4, Oxford University Press, 2021
Dataset: Phishing URLs from PhishTank and legitimate URLs from Common Crawl
Advantages: Ensemble learning techniques mitigate the risk of overfitting and improve generalization performance.
Disadvantages: Selection and tuning of ensemble parameters require careful optimization to maximize performance gains while minimizing complexity.

5. Article Title: Phishing Detection Using Deep Learning: A Comparative Study
Journal Details: Journal of Computer Science and Technology, Vol. 36, Issue 5, Springer, 2021
Dataset: Phishing URLs from PhishTank and legitimate URLs from Alexa Top Sites
Advantages: Comparative analysis enables researchers to identify the most effective deep learning architectures for phishing detection tasks.
Disadvantages: Limited interpretability of deep learning models hinders understanding of the underlying decision-making process and feature importance.

6. Article Title: Ensemble Learning Approaches for Phishing Website Detection: A Systematic Review
Journal Details: IEEE Transactions on Information Forensics and Security, Vol. 15, Issue 3, IEEE, 2020
Dataset: Various publicly available phishing datasets
Advantages: Identification of key factors influencing the performance of ensemble models facilitates informed decision-making in model design and optimization.
Disadvantages: Complexity of ensemble models may pose challenges in model interpretation and explainability, limiting insights into model behaviour.

7. Article Title: Feature Selection Techniques for Improving Phishing Website Detection Accuracy
Journal Details: Information Sciences, Vol. 528, Elsevier, 2020
Dataset: Various publicly available phishing datasets
Advantages: Feature selection techniques optimize model performance by focusing on informative features while reducing computational complexity.
Disadvantages: Selection of feature subsets may introduce bias or overlook critical information, leading to suboptimal detection performance in certain scenarios.

8. Article Title: Hybrid Phishing Detection Framework Using Machine Learning and Text Mining Techniques
Journal Details: Information Processing & Management, Vol. 58, Issue 6, Elsevier, 2021
Dataset: Phishing URLs from various sources
Advantages: Hybrid framework leverages the strengths of both machine learning and text mining techniques, enhancing detection accuracy and robustness.
Disadvantages: Complexity of hybrid models may increase computational overhead and deployment challenges, particularly in resource-constrained environments.
8. Gap Analysis
1. Limited Exploration of Temporal Dynamics: Existing research in
phishing detection often lacks comprehensive analysis of temporal
patterns and trends in phishing campaigns. Understanding how phishing
tactics evolve over time could provide valuable insights into the dynamic
nature of cyber threats and improve detection accuracy.
2. Lack of User-Centric Features: Many phishing detection systems
focus solely on URL and domain-based features, overlooking user-centric
indicators such as click behaviour, browsing history, and contextual
information. Incorporating these factors could enhance the effectiveness
of phishing detection by capturing user-specific patterns and
preferences.
3. Insufficient Interpretability: While machine learning models are
increasingly being used for phishing detection, the interpretability of
these models remains limited. There is a gap in research regarding
techniques for explaining model predictions and providing transparent
insights into the factors influencing detection outcomes.
4. Scalability Challenges: Many phishing detection systems face
scalability challenges when deployed in real-world environments with
large volumes of data and high traffic. Addressing scalability issues, such
as optimizing computational resources and streamlining processing
pipelines, is crucial for the practical implementation of detection
solutions.
9. SDLC Model

Fig 1. Agile Model

10. Data Collection
Name of the Dataset: Phishing Websites Dataset 2020
Description:
• This dataset consists of URLs labelled as either legitimate or malicious
for phishing detection tasks.
• Curated from various sources including PhishTank and real-world
incidents, it covers a wide range of phishing tactics and URL
characteristics.
• The dataset includes features extracted from URL structures, content,
and metadata, enabling comprehensive analysis for detection
purposes.
Number of Instances: 10,000

Train Set Size: 7,000 (70%)

Test Set Size: 2,000 (20%)

Validation Set Size: 1,000 (10%)

Feature Dimensionality: 50 (after feature engineering)

Data Format: CSV files with labelled URLs and corresponding features
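
The 70/20/10 partition stated above can be sketched with the standard library alone. This is only an illustration under assumptions: the function name `split_dataset`, the seed, and the integer stand-ins for rows are hypothetical, and the slides do not say how the actual split was produced.

```python
import random


def split_dataset(rows, train_frac=0.7, test_frac=0.2, seed=42):
    """Shuffle rows and split them into train, test and validation parts.

    The validation set receives whatever remains after the train and test
    fractions are taken (10% with the default fractions).
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


# Stand-in for the 10,000 labelled instances; real rows would hold URL features.
rows = list(range(10_000))
train, test, validation = split_dataset(rows)
print(len(train), len(test), len(validation))
```

If the classes are imbalanced, a stratified split that preserves the class ratio in each part would be a reasonable alternative.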
11. Data Preparation
1. Data Cleaning:
• Input: Raw dataset with URLs and corresponding labels.
• Output: Cleaned dataset without missing values or duplicates.
• Procedure: Remove duplicate URLs and any rows with missing values
to ensure data integrity.
2. Feature Extraction:
• Input: Cleaned dataset with URLs.
• Output: Extracted features matrix.
• Procedure: Utilize feature extraction techniques to convert URLs into
numerical feature vectors.
Features may include:
• Domain length
• Presence of special characters
• Count of digits in the URL
• Presence of keywords associated with phishing
• URL length
3. Data Splitting:
• Input: Feature matrix and corresponding labels.
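• Output: Training and held-out (test and validation) sets.
• Procedure: Randomly split the dataset into training (70%)
and held-out (30%) sets to facilitate model training and
evaluation; per the Data Collection slide, the held-out portion
comprises the test (20%) and validation (10%) sets.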
4. Data Balancing (optional):
• Input: Training set with imbalanced classes.
• Output: Balanced training set.
• Procedure: Apply techniques such as oversampling (e.g.,
SMOTE) or undersampling to balance the class distribution
in the training set (if applicable).
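
The cleaning and feature-extraction steps above could look roughly like the following sketch. The keyword list, the function names, and the sample URLs are assumptions made for illustration; the slides do not specify them. The sketch also counts special characters rather than only flagging their presence.

```python
from urllib.parse import urlparse

# Hypothetical keyword list; the slides do not say which keywords are used.
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "account", "update")


def clean(rows):
    """Drop rows with a missing URL or label and remove duplicate URLs."""
    seen, cleaned = set(), []
    for url, label in rows:
        if not url or label is None or url in seen:
            continue
        seen.add(url)
        cleaned.append((url, label))
    return cleaned


def extract_features(url):
    """Map a URL string to the numerical features listed on the slide."""
    domain = urlparse(url).netloc
    lowered = url.lower()
    return {
        "url_length": len(url),
        "domain_length": len(domain),
        "digit_count": sum(ch.isdigit() for ch in url),
        "special_char_count": sum(not ch.isalnum() for ch in url),
        "has_suspicious_keyword": int(any(k in lowered for k in SUSPICIOUS_KEYWORDS)),
    }


# Illustrative rows only: (url, label) with 1 = malicious, 0 = legitimate.
raw = [
    ("http://example.com/login?id=123", 1),
    ("http://example.com/login?id=123", 1),  # duplicate, dropped
    (None, 0),                               # missing URL, dropped
    ("https://example.org/", 0),
]
dataset = clean(raw)
features = [extract_features(url) for url, _ in dataset]
print(features[0])
```

Each dictionary becomes one row of the feature matrix that is passed on to the splitting step.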

12.1. Proposed Model

Fig (a): System Architecture

12.2. Modules of Proposed Model
Module 1 : Data Collection and Preprocessing
1. Gather data on URLs from various sources, including web crawls,
phishing databases, and security feeds.
2. Pre-process the data by handling missing values, standardizing
formats, and removing duplicates to ensure data integrity.
Module 2: Feature Extraction and Engineering
3. Extract features from URLs using techniques such as Bag-of-Words,
TF-IDF, and URL parsing to capture relevant information.
4. Engineer additional features such as domain reputation, URL length,
and presence of suspicious keywords to enhance model
performance.
Module 3 : Model Training
5. Train a Random Forest classifier to learn patterns in the extracted
features and distinguish between legitimate and phishing URLs.
6. Train a Convolutional Neural Network (CNN) to extract features from
URL images and complement the feature-based model.
7. Perform ensemble aggregation, such as majority voting or stacking,
to combine predictions from multiple models for improved accuracy.
Module 4 : Model Evaluation and Optimization
1. Evaluate the trained models using performance metrics such
as accuracy, precision, recall, and F1-score on a validation set.
2. Optimize hyperparameters and model architecture through
techniques like grid search or Bayesian optimization to
enhance performance.
Module 5: Deployment
3. Develop a web-based application or API endpoint to receive
URLs for real-time phishing detection.
4. Integrate the trained model into the application backend to
provide instant predictions on submitted URLs.
Module 6 : User Interface
5. Design an intuitive and user-friendly interface for users to
interact with the phishing detection system.
6. Provide informative visualizations and feedback to users, such
as risk scores or confidence levels, to aid in decision-making.
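
The evaluation metrics named in Module 4 can be computed directly from the confusion counts, as the sketch below shows. The label vectors are made up for illustration and are not results from this project.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical validation labels (1 = malicious, 0 = legitimate).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For malicious URL detection, recall on the malicious class is usually the metric to watch, since a missed malicious URL tends to cost more than a false alarm.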
13. Timeline Chart
TIME PLAN (2024: JAN, FEB, MAR, APR)
S.NO. PLAN OF ACTION
1 Project Initiation
2 Data Collection and Preprocessing
3 Feature Engineering
4 Model Development
5 Interpretability and Explainability
6 Testing and Evaluation
7 Deployment and Integration
8 Documentation and Reporting

14. Summary
1. Malicious URLs are often disguised as legitimate links, leading
users to inadvertently expose sensitive information or
compromise system security.
2. Existing detection methods may struggle to keep pace with the
evolving tactics of cybercriminals, leaving users vulnerable to
exploitation.
3. Utilizing Random Forest, a machine learning algorithm known
for its effectiveness in classification tasks, to develop a robust
model for detecting malicious URLs.
4. Training the model on a comprehensive dataset comprising
both malicious and benign URLs to enable accurate
identification of potentially harmful links.
5. Leveraging features such as URL length, domain reputation,
and presence of suspicious keywords to enhance the model's
predictive capabilities.
6. Evaluating the model's performance through rigorous testing
and validation processes to ensure reliability and effectiveness
in real-world scenarios.
7. Ultimately, aiming to provide a proactive defense mechanism
15. Results and Output
15.1 Accuracy
We present a graphical analysis of training and validation
accuracy, with accuracy on the y-axis and n_estimators (the number
of trees in the forest) on the x-axis. This visualization shows how
the model's accuracy converges as trees are added and highlights
key performance trends. Fig 15.1 shows the graph of training and
validation accuracy.

Fig 15.1 Training and Validation Accuracy
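
A curve of this kind could be produced along the following lines with scikit-learn. This is a sketch under assumptions: the data is a synthetic stand-in generated with `make_classification` (50 features, matching the stated feature dimensionality), and the grid of `n_estimators` values and the 70/30 split are illustrative choices, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the URL feature matrix and labels (assumption).
X, y = make_classification(
    n_samples=1000, n_features=50, n_informative=10, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

n_estimators_grid = [1, 5, 10, 25, 50, 100]
train_acc, val_acc = [], []
for n in n_estimators_grid:
    model = RandomForestClassifier(n_estimators=n, random_state=0)
    model.fit(X_train, y_train)
    # score() returns mean accuracy on the given data.
    train_acc.append(model.score(X_train, y_train))
    val_acc.append(model.score(X_val, y_val))

for n, tr, va in zip(n_estimators_grid, train_acc, val_acc):
    print(f"n_estimators={n:>3}  train={tr:.3f}  val={va:.3f}")
```

Plotting `train_acc` and `val_acc` against `n_estimators_grid` gives a figure of the same shape as Fig 15.1; the values printed here come from synthetic data and say nothing about the project's results.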


15.2 Feature Importance of Model
Feature importance is a crucial aspect of understanding how a
machine learning model makes predictions. In the context of this
project on URL classification, feature importance refers to
identifying which features or characteristics of a URL are most
influential in determining whether it is legitimate or malicious.

Fig 15.2 Feature Importance of the Model

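One common way to obtain such a ranking is scikit-learn's impurity-based `feature_importances_` attribute of a fitted Random Forest. The sketch below assumes synthetic data and hypothetical feature names borrowed from the Data Preparation slide; it does not reproduce the figure above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names; the synthetic columns do not really measure these.
feature_names = [
    "url_length",
    "domain_length",
    "digit_count",
    "special_char_count",
    "has_suspicious_keyword",
]
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances are normalised so that they sum to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name:<24} {importance:.3f}")
```

Impurity-based importances can be biased toward features with many distinct values, so permutation importance or SHAP values are often used as a cross-check.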
15.3 Outputs

Fig 15.3.1 Legitimate URL Output

Fig 15.3.2 Malicious URL Output

Thank you