0% found this document useful (0 votes)

8 views

39 (1)

This research presents a dual approach for detecting phishing domains using AI/ML techniques, focusing on URL detection and image analysis to enhance cybersecurity. The proposed system aims to accurately identify fraudulent websites that closely mimic legitimate ones, addressing the limitations of traditional methods like blacklist-based systems. Through rigorous experimentation, the study demonstrates the effectiveness of integrating these techniques to improve detection accuracy and reduce false positives.

Uploaded by

Dipanjn

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

8 views

39 (1)

Uploaded by

Dipanjn

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 11

AI/ML Dual Approach for Phishing Domain

Detection: URL and ImageAnalysis

Souvik Karmakar Dipanjan Santra Arijit Tewary

Dept. of Information Dept. of Information Dept. of Information
TechnologyTechno International TechnologyTechno International TechnologyTechno International
New Town Kolkata, India New Town Kolkata, India New Town Kolkata, India
[email protected] [email protected] [email protected]

Abstract— Phishing attacks remain a major threat to online security by taking advantage of the
similarity between fake domains and authentic ones.. To address this challenge, this research
introduces an intelligent system leveraging AI/ML techniques for detecting phishing domains that
closely mimic the look and feel of genuine websites. The proposed system employs a two-pronged
approach: URL detection and image analysis. The URL detection mechanism scrutinizes the
structural attributes of web addresses, while the image analysis module examines visual similarities
between phishing and authentic domains. To enhance detection accuracy, our system incorporates
advanced machine learning models trained using comprehensive datasets encompassing diverse
phishing tactics as well as legitimate websitedesigns. Furthermore, to overcome the limitations of
conventional methods, our approach emphasizes the importanceof detecting subtle visual cues and
nuances that distinguish phishing websites from their genuine counterparts. By harnessing the power
of AI/ML, our system achieves robust detection capabilities, effectively thwarting sophisticated
phishing attempts. Through rigorous experimentation and evaluation, we demonstrate the
efficacy of our systemin effectively detecting phishing domains and reducing false positives. Our
findings underscore the significance of integrating URL detection and image analysis within an AI-
driven framework to combat evolving phishing threats in cyberspace.
Keywords—Phishing, Detection, AI/ML, URL Detection, Image Analysis, Cybersecurity
I. INTRODUCTION
In today’s digital age, where online services like e-banking and e-commerce have become integral to
daily life, the threatof phishing attacks looms larger than ever[1]. Cybercriminals leverage sophisticated
tactics to deceive unsuspecting individ-uals into divulging sensitive information, posing significant risks to
personal security and financial well-being. Against this backdrop, the necessity for strong phishing detection
systems is more critical than ever.
Traditionally, phishing detection has relied on blacklist- based systems to identify and block known
malicious websites. However, these systems have been shown to be inadequate in staying ahead of the
constantly changing strategies of cybercriminals, exposing users to emerging threats. Additionally, the high
costs associated with maintaining and updating blacklists present a significant challenge for organizations
tasked with safeguarding against phishing attacks.[1]
To address these challenges, researchers have investigated alternative approaches to phishing detection,
ranging from feature- based analysis to visual similarity-based methods. While feature-based approaches

ISBN : 978-81-978522-2-0 https://doi.org/10.37285/bsp.oaisem2025.39 320

Optimization and Artificial Intelligent Strategies for Engineering and Management

examine attributes such as URL length and website content, visual similarity-based methods leverage image
analysis to identify phishing websites that closely mimic the appearance of legitimate ones.[2]
This paper introduces a new approach based on visual similarity phishing detection scheme that addresses
the limitations of existing methods. Our approach harnesses the power of advanced machine learning
techniques to analyze images and accuratelydetect phishing domains that replicate the appearance and design
of genuine websites. By integrating comprehensive datasets and leveraging AI/ML algorithms, our scheme
aims to achieve high detection accuracy while minimizing false positives[3].
The objectives of our research are twofold[2]: first, to develop a robust phishing detection system capable
of accurately identifying fraudulent domains; and second, to demonstrate the effectiveness of our approach
through rigorous experimenta-tion and evaluation.
The structure of this paper is designed to provide a comprehensive understanding of our proposed
phishing detection scheme. Section II offers an overview of background techniques and discusses
conventional phishing detection methods. In Section III, we highlight the shortcomings of existing
approaches, laying the groundwork for our proposed solution. Section IV delves into the intricacies of our
proposed methodology, outlining the integration of advanced machine learning techniques. Section V
presents the results of our experiments, followed by a discussion and avenues for future research as
detailed inSection VI. Finally, In conclusion, Section VII wraps up the paper.
II. BACKGROUND
Phishing attacks represent a major category of cyber threats, designed to to trick users into disclosing
onfidential information such as login details, financial data, or personal identifiers. These attacks often
involve fraudulent websites or emails that closelymimic the appearance of legitimate ones, making them
difficult to detect[4].
Traditionally, phishing detection has relied on methods such as blacklist-based systems, which keep a
record of known malicious sites and URLs. While effective to some extent,these systems often fail to keep
pace with the continually evolving strategies of cybercriminals. Moreover, they often fail to detect zero-
day attacks or new phishing websites that have not yetbeen added to the blacklist[5].
Feature-based analysis is another common approach[2], which examines attributes like URL length,
domain age, and content to identify phishing attempts. However, these methods may miss subtle variations
or sophisticated techniques used by attackers to evade detection[6].
Visual similarity-based methods have gained traction in recent years[2], leveraging image analysis to
compare the visual appearance of websites and detect phishing attempts. By analyzing visual elements such
as logos, layouts, and colors, these methods can identify phishing websites that closely resemble legitimate
ones.[3]
Despite advancements in phishing detection techniques, cybercriminals continue to develop more
sophisticated attacks, challenging the effectiveness of existing methods. Moreover, the increasing use
of AI and machine learning by attackersfurther complicates the detection process, as they can generate
convincing phishing websites at scale.
To address these issues, there is an increasing demand for more sophisticated phishing detection methods
that can accurately identify fraudulent websites while reducing false positives. This research aims to address
this need by proposing a dual approachthat combines URL detection and image classification within an
AI/ML framework to enhance cybersecurity and protect users from phishing attacks[7].

ISBN : 978-81-978522-2-0 321

Optimization and Artificial Intelligent Strategies for Engineering and Management

III. PROBLEM STATEMENT

Despite the availability of various phishing detection techniques, existing methods suffer from limitations
in accurately identifying phishing websites, especially those employing sophisticated tactics to mimic
genuine ones. Traditional approaches, such as blacklist-based systems and feature-based analysis, and may
struggle to keep up with the rapid evolution of phishing attacks, potentially failing to detect newly emerging
threats. Additionally, these methods often result in high false positiverates, leading to user distrust and
increased operational costs for organizations[1].
Visual similarity-based methods offer a promising solution by analyzing the visual elements of websites
to detect phishing attempts. However, current approaches often lack robustness and scalability, making them
less effective in real-world scenarios.Furthermore, the integration of multiple detection techniques, such as
URL analysis and image classification, within a unified framework remains a challenge[2].
This research aims to create a more effective and thorough phishing detection system that accurately
identifies fraudulent websites and reduces false positives. By combining advanced machine learning
techniques for URL detection and image classification, we hope to improve accuracy and adaptability to new
phishing tactics. Ultimately, the goal is to enhance cybersecurity and lessen the risks of phishing attacks in
the digital world.[2].
IV. PROPOSED METHODOLOGY
Phishing attacks remain a serious risk to online security, as cybercriminals use advanced methods to trick
users. To address this challenge, we propose a dual approach for detecting phishing domains that closely
mimic genuine websites. The proposed methodology combines URL detection and image classification
within an AI/ML framework.
A. URL Detection Model
URL detection aims to scrutinize the structural attributes of web addresses to identify phishing domains.
This componentinvolves the following steps[8][6]:
1) Data CollecƟon: The ini al step involves collec ng a comprehensive dataset of URLs. These URLs
are labeled as eitherphishing or legi mate, forming the basis for training and assessing the machine
learning model.
2) Data Cleaning and Preprocessing: The data cleaning and preprocessing steps include
tokeniza on, stemming, andfeature extrac on. A detailed descrip on of each step follows:
• Tokeniza on: The URLs are tokenized using a regular expression tokenizer (RegexpTokenizer)
to extract words fromthe text, which helps in breaking down the URLs into manageable pieces for
further analysis.
• Stemming: The tokens are stemmed using SnowballStemmer to reduce them to their root form.
This step standardizesthe tokens, making it easier to compare and analyze them.
• Joining Tokens: The stemmed tokens are combined into a single string for each URL. guarantees
that the preprocessedtext is appropriately forma ed for feature extrac on.
• Feature Extrac on: The preprocessed URL text is converted into a numerical representa on using
CountVectorizer. This step creates a matrix of token counts, providing a numerical
representa on of the URL features suitable for machine learning algorithms.
3) VisualizaƟon: For visualiza on we include a bar plot illustra ng the distribu on of phishing versus
legi mate URLs within the dataset. Addi onally, we include a word cloud showing the most common words

ISBN : 978-81-978522-2-0 322

Optimization and Artificial Intelligent Strategies for Engineering and Management

in phishing URLs compared to legi mate ones.

• Count Plot: The count plot shows the phishing versus legi mate URLs within the dataset.

Fig. 1: Distribution of Phishing

versus legitimate URLs
• Word Cloud: The word cloud displays the most common words in phishing URLs
compared to legitimate ones.
4) Logistic Regression: Logistic regression is one of the machine learning models used for
detecting phishing URLs[9]. Below is an outline of the process:
• Feature Transformation: The preprocessed URL text is transformed into a matrix of
token counts via CountVectorizer. This transformation converts the tetext data into a
numerical format appropriate for logistic regression.
• Data Splitting: The dataset is divided into training and testing sets using train_test_split.
This ensures that the model is evaluated on unseen data, providing a realistic assessment
of its performance.
• Model Training: The logistic regression model is trained using the training set, allowing it
to learn the connection between the URL features and the corresponding labels (phishing
or legitimate).
• Model Evaluation: The trained model is assessed using the testing set. Accuracy,
confusion matrix, and classification report are employed to measure its effectiveness.
• Logistic Regression Formula: The logistic regression model predicts the probability P
that a given input x belongs to the class labeled as 1 (phishing):

Where: - σ(z) = 1+e 1−z is the sigmoid function. - w is the weight vector. - b is the bias
term.

• Model Training: During training, the logistic regression model updates the weights w
and bias b to minimize the cost function, typically the binary cross-entropy loss:

ISBN : 978-81-978522-2-0 323

Optimization and Artificial Intelligent Strategies for Engineering and Management

Where: - m represents the number of training examples. - y(i) denotes the true label for the i-th
training example. - x(i)
is the feature vector for the i-th training example[10].
• ROC Curve: As shown in Fig. 2, ROC curve displays the performance of a classification
model across all thresholdlevels.
•

Fig. 2: Receiver Operating Characteristic Curve

B. Image Detection Model

Phishing attacks often employ deceptive visual elements to mimic legitimate websites. To
counter this threat, we propose an image detection model that analyzes webpage screenshots to
identify phishing attempts[3]. The model consists of the following steps:

1) Data Collection:: We collected approx 1200 images of phishing and non-phishing

websites through manual capture. These images were resized to a standard dimension of
800x600 pixels.
2) Data Preprocessing:: We ensured that all images were in acceptable formats (jpeg, jpg,
bmp, png). Any image not in these formats was removed, and The images were normalized by
adjusting the pixel values to fall between 0 and 1.

ISBN : 978-81-978522-2-0 324

Optimization and Artificial Intelligent Strategies for Engineering and Management

Fig. 3: CNN Architecture

3) CNN Architecture:: As shown in Fig. 3, the CNN model architecture used for detecting phishing
websites consists ofseveral layers, each designed to process and learn from the input images[11]. The
architecture can be detailed as follows [12]:
• Input Layer: The model accepts images resized to 256x256 pixels, with 3 color channels (RGB).
• First Convolutional Layer: It employs 16 filters, each measuring 3x3, to analyze the input image, and
applies the ReLU activation function is applied to incorporate non-linearity. This layer produces feature
maps, each highlighting different aspects of the input image.
• First MaxPooling Layer: The first MaxPooling layer uses a pool size of 2x2 To down-sample the
feature maps, it shrinks the spatial dimensions, making the data more manageable for further processing
by half, thus reducing the data volume, which decreases the computational load and helps in
controlling overfitting.
• Dropout Layer: dropout layer set at a rate of 50% is applied to prevent overfitting.
• Second Convolutional Layer: The second convolutional layer uses 32 filters, each with a size of 3x3,
to process the down-sampled feature maps. It uses the ReLU activation function and L2 regularization
(0.02) to manage model complexity.
• Second MaxPooling Layer: A pool size of 2x2 is used to further reduce the spatial dimensions of the
feature maps, continuing to reduce the computational load and extract more abstract features from
the image.
• Dropout Layer: Another dropout layer with a rate of 50% is applied to reduce overfitting.
• Flatten Layer: The flatten layer converts the 2D feature maps into a 1D vector, setting up the data for
the fully connected (Dense) layers by flattening the spatial dimensions.

ISBN : 978-81-978522-2-0 325

Optimization and Artificial Intelligent Strategies for Engineering and Management

• Dense (Fully Connected) Layer: The dense layer comprises 64 neurons and applies the ReLU
activation function to capture complex patterns and relationships to interpret complex patterns and
relationships derived from the flattened featuremaps.
• Dropout Layer: A final dropout rate of 50% is applied for regularization.
• Output Layer: The final layer has just one neuron with a sigmoid activation function, which gives a
probability score. This score tells us whether the input image is a phishing website or not, allowing the
model to make a clear yes-or-no decision.
Training and Validation: The dataset contains around 600 phishing images and 600 non-phishing images,
divided into training, validation, and test sets with a 70:15:15 split. The model was trained for 10 epochs with
a batch size of 32, while TensorBoard tracked its progress. To avoid overfitting, we monitored key metrics
like precision, recall, and binary accuracy.In the end, the model was tested on a separate set to check how
well it could generalize to new data.
V. RESULTS AND PERFORMANCE
A. URL Detection Model Results and Performance
The logistic regression model produced the following outcomes:
• Training Accuracy: 98.09%
• Testing Accuracy: 96.59%
• Confusion Matrix:

• Classifica on Report:
Fig. 4: Confusion Matrix for Logistic Regression
Precision Recall F1-Score Support
Bad 0.91 0.97 0.94 36845
Good 0.99 0.97 0.98 100492
Accuracy 0.97 137337
Macro Avg 0.95 0.97 0.96 137337
Weighted Avg 0.97 0.97 0.97 137337

ISBN : 978-81-978522-2-0 326

Optimization and Artificial Intelligent Strategies for Engineering and Management

• Result Interpretation
– High Training Accuracy: The logistic regression model attained a training accuracy reaching
98.09%, indicating itsstrong performance on the training data.
– Effective Generalization: A testing accuracy of 96.59% indicates that the model performs well
on new, unseen data,showcasing its reliability and robustness.
– Insights from Confusion Matrix: The confusion matrix (Figure 4) offers a detailed view of the
model’s performance,Displaying the numbers of true positives, true negatives, false positives,
and false negatives for each class.
– Comprehensive Classification Report:The classification report provides precision, recall, and
F1-score metrics forboth phishing and legitimate classes, giving a thorough overview of the
model’s performance.
B. Image Detection Model Results and Performance
The performance of our deep learning model was assessed over 10 epochs. The training and validation
results are summarizedbelow:

• Visualization
TABLE I: Model Performance Metrics per Epoch

Epoch Training Training Validation Validation

Loss Accuracy Loss Accuracy
1 3.7956 0.7188 1.0455 0.7500
2 0.6702 0.8725 0.8393 0.9464
3 0.5242 0.9262 0.6285 0.9866
4 0.4804 0.9262 0.5378 0.9911
5 0.4231 0.9663 0.4489 0.9955
6 0.4036 0.9675 0.4536 0.9821
7 0.3687 0.9775 0.3386 0.9955
8 0.3177 0.9700 0.3324 0.9911
9 0.3080 0.9812 0.3062 0.9955
10 0.2911 0.9837 0.3578 0.9777

Fig. 5: Training Loss & Validation Loss Graph

ISBN : 978-81-978522-2-0 327

Optimization and Artificial Intelligent Strategies for Engineering and Management

Fig. 6: Training Accuracy & Validation Accuracy Graph

• Results Interpretation
Our deep learning model exhibited strong performance throughout the training process:
– Training Loss: Decreased consistently from 3.7956 in the first epoch to 0.2911 in the final
epoch, indicating effectivelearning.
– Validation Loss: Showed a decreasing trend with some fluctuation, ending at 0.3578.
– Training Accuracy: Improved from 71.88% to 98.37%, showing the model’s ability to efficiently
learn from the trainingdata.
– Validation Accuracy: Rose from 75.00% to 97.77%, indicating high generalization performance.
– Precision: Achieved a high value of 0.9841.
– Recall: Achieved a high value of 0.9394.
– Overall Accuracy: Reached 96.09%, highlighting the model’s ability to understand the
underlying patterns in the dataand perform well during both the training and validation stages.
The consistent decrease in losses and increase in accuracy demonstrate the model’s effectiveness in
minimizing errors andmaking accurate predictions.
VI. DISCUSSION AND FUTURE WORK
• Model Performance Discussion:
– The logistic regression model for URL detection demonstrated robust performance with high
training and testingaccuracies, showing that it’s effective at telling apart phishing and
legitimate URLs.
– The image detection model exhibited exceptional performance across 20 epochs, achieving
near-perfect accuracy onboth training and validation datasets, suggesting its strong
generalization capability.
• Comparison with Existing Methods:
– The deep learning-based image detection model outperformed conventional techniques in terms
of accuracy and effi-ciency, showcasing the benefits of using deep neural networks for image
classification tasks.
– Comparative analysis with other URL detection models revealed competitive performance,
showcasing the effectivenessof the proposed logistic regression approach.
• Limitations and Challenges:

ISBN : 978-81-978522-2-0 328

Optimization and Artificial Intelligent Strategies for Engineering and Management

– Despite achieving high accuracy, the image detection model may face challenges in scenarios with
complex backgroundsor low-quality images, indicating the need for further robustness
enhancements.
– The logistic regression model’s reliance on feature engineering and limited capacity to capture
complex patterns mighthinder its performance in detecting sophisticated phishing attacks.
• Future Work:
– Look into cutting-edge advanced deep learning techniques, like CNNs and RNNs, to boost
precision and reliability indetecting URLs and images.
– Explore ensemble learning methods to merge various models, aiming to enhance performance
and increase resistanceto adversarial attacks in cybersecurity applications.
– Expand the dataset in terms of diversity and scale to more accurately reflect real-world
scenarios and improve themodels’ ability to generalize across various domains and
environments.
– Conduct rigorous evaluation and validation on unseen datasets and real-world applications to
assess the models’ practicalusability and address any deployment challenges.
VII. CONCLUSION
In this research, we introduced a pair of machine learning models for cybersecurity applications: a
logistic regressionmodel for URL detection and a deep learning-based image detection model. The logistic
regression model showed excellent effectiveness in distinguishing phishing URLs from legitimate ones,
attaining high accuracy on both training and testing datasets.Additionally, the deep learning image detection
model showcased exceptional accuracy and generalization capability, surpassing traditional methods in
image classification tasks.
Through extensive experimentation and evaluation, we have shown the effectiveness of these models in
detecting malicious activities and enhancing cybersecurity measures.Nevertheless, there are ongoing
challenges and opportunities for improvement, particularly in enhancing the robustness and scalability of
the models to adapt to evolving cyber threats.
Overall, the findings of this research underscore the promise of leveraging machine learning and deep
learning techniques in enhancing cybersecurity measures. By utilizing these advanced technologies, we can
improve protection against cyberattacks while safeguarding sensitive information in a rapidly digitizing
environment.
REFERENCES
[1] J. Kumar, A. Santhanavijayan, B. Janet, B. Rajendran, and B. Bindhumadhava, “Phishing
website classification and detection using machine learning,” in 2020 International
Conference on Computer Communication and Informatics (ICCCI), 2020, pp. 1–6.
[2] M. Dunlop, S. Groat, and D. Shelly, “Goldphish: Using images for content-based phishing
analysis,” in 2010 Fifth International Conference on Internet Monitoring and Protection,
2010, pp. 123–128.
[3] S. Y. Yerima and M. K. Alzaylaee, “High accuracy phishing detection based on
convolutional neural networks,” in 2020 3rd International Conference on Computer
Applications Information Security (ICCAIS), 2020, pp. 1–6.
[4] M. Khonji, Y. Iraqi, and A. Jones, “Phishing detection: A literature survey,” IEEE
Communications Surveys Tutorials, vol. 15, no. 4, pp. 2091–2121, 2013.

ISBN : 978-81-978522-2-0 329

Optimization and Artificial Intelligent Strategies for Engineering and Management

[5] G. Ramesh, I. Krishnamurthi, and K. S. S. Kumar, “An efficacious method for detecting
phishing webpages through target domain identification,”
Decision Support Systems, vol. 61, pp. 12–22, 2014.
[6] J. James, S. L., and C. Thomas, “Detection of phishing urls using machine learning
techniques,” in 2013 International Conference on Control Communication and Computing
(ICCC), 2013, pp. 304–309.
[7] E. S. Aung, C. T. Zan, and H. Yamana, “A survey of url-based phishing detection,” in
DEIM forum, 2019, pp. G2–3.
[8] S. H. Ahammad, S. D. Kale, G. D. Upadhye, S. D. Pande, E. V. Babu, A. V.
Dhumane, and M. D. K. J. Bahadur, “Phishing url detection using machine learning
methods,” Advances in Engineering Software, vol. 173, p. 103288, 2022. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0965997822001892
[9] R. Chiramdasu, G. Srivastava, S. Bhattacharya, P. K. Reddy, and T. Reddy Gadekallu,
“Malicious url detection using logistic regression,” in 2021 IEEE International Conference
on Omni-Layer Intelligent Systems (COINS), 2021, pp. 1–6.
[10] R. Naresh, A. Gupta, and S. Giri, “Malicious url detection system using combined sym and
logistic regression model,” International Journal of Advanced Research in Engineering and
Technology (IJARET), vol. 11, no. 4, 2020.
[11] S. Y. Yerima and M. K. Alzaylaee, “High accuracy phishing detection based on
convolutional neural networks,” in 2020 3rd International Conference on Computer
Applications Information Security (ICCAIS), 2020, pp. 1–6.
[12] K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” ArXiv e-prints,
11 2015.

ISBN : 978-81-978522-2-0 330

Report For Foophones: by Utkarsh Munra
No ratings yet
Report For Foophones: by Utkarsh Munra
37 pages
Networking The Soldier: Army Network Capability Set Modernization
100% (2)
Networking The Soldier: Army Network Capability Set Modernization
18 pages
Leveraging Advanced Machine Learning Techniques For Phishing Website Detection
No ratings yet
Leveraging Advanced Machine Learning Techniques For Phishing Website Detection
6 pages
Batch-5 Journal-6 ECE-D new (1)
No ratings yet
Batch-5 Journal-6 ECE-D new (1)
6 pages
Project
No ratings yet
Project
12 pages
ISAA Report PDF
No ratings yet
ISAA Report PDF
24 pages
Machine Learning-Driven Phishing Detection: A Robust Browser Extension Solution
No ratings yet
Machine Learning-Driven Phishing Detection: A Robust Browser Extension Solution
4 pages
128 Submission
No ratings yet
128 Submission
7 pages
Detecting Phishing Websites Using Machine Learning
No ratings yet
Detecting Phishing Websites Using Machine Learning
6 pages
Major Project Final Report
No ratings yet
Major Project Final Report
53 pages
PHISHING WEBSITE DETECTION USING MACHINE LEARNING - COMPLETED (1) Full
No ratings yet
PHISHING WEBSITE DETECTION USING MACHINE LEARNING - COMPLETED (1) Full
73 pages
CyberSec Review3 Team10
No ratings yet
CyberSec Review3 Team10
28 pages
3406-6866-1-PB
No ratings yet
3406-6866-1-PB
10 pages
paper2
No ratings yet
paper2
10 pages
Detection of URL Based Phishing Websites Using Machine Learning
No ratings yet
Detection of URL Based Phishing Websites Using Machine Learning
6 pages
Batch-5 ECE-D
No ratings yet
Batch-5 ECE-D
4 pages
Automated Phishing Detection Through URL Analysis and Machine Learning
No ratings yet
Automated Phishing Detection Through URL Analysis and Machine Learning
9 pages
paper-major1
No ratings yet
paper-major1
6 pages
Batch 30 - PHISHSIM Journal Paper
No ratings yet
Batch 30 - PHISHSIM Journal Paper
17 pages
Midterm Project Report
No ratings yet
Midterm Project Report
21 pages
Enhanced Phishing Website Detection: Leveraging Random Forest and XGBoost Algorithms With Hybrid Features
No ratings yet
Enhanced Phishing Website Detection: Leveraging Random Forest and XGBoost Algorithms With Hybrid Features
4 pages
A Machine Learning Based Approach For Phishing Detection Using
No ratings yet
A Machine Learning Based Approach For Phishing Detection Using
14 pages
Review 0 - Phishing Website in SEO (1)
No ratings yet
Review 0 - Phishing Website in SEO (1)
6 pages
A Hybrid Model To Detect Phishing-Sites Using Supervised Learning Algorithms
No ratings yet
A Hybrid Model To Detect Phishing-Sites Using Supervised Learning Algorithms
8 pages
Towards Detection of Phishing Websites On Client-Side Using Machine
No ratings yet
Towards Detection of Phishing Websites On Client-Side Using Machine
14 pages
Logistic Regression Based Machine Learning Technique For Phishing Website Detection
No ratings yet
Logistic Regression Based Machine Learning Technique For Phishing Website Detection
4 pages
Innovative Nitesh
No ratings yet
Innovative Nitesh
14 pages
A multi-algorithm approach for phishing uniform resource locator’s detection
No ratings yet
A multi-algorithm approach for phishing uniform resource locator’s detection
10 pages
ITB1 Documentation Detection of Phishing Website Using ML
No ratings yet
ITB1 Documentation Detection of Phishing Website Using ML
49 pages
Improved Detection of Phishing Websites Using Machine Learning 11-6-2024
No ratings yet
Improved Detection of Phishing Websites Using Machine Learning 11-6-2024
15 pages
Generative Adversarial Network-Based Phishing URL Detection With Variational Autoencoder and Transformer
No ratings yet
Generative Adversarial Network-Based Phishing URL Detection With Variational Autoencoder and Transformer
8 pages
Real Time Phishing Website Detectionusing ML
No ratings yet
Real Time Phishing Website Detectionusing ML
4 pages
updated_phishing_url_detection
No ratings yet
updated_phishing_url_detection
13 pages
BSC Final Project PPT
No ratings yet
BSC Final Project PPT
8 pages
Sat - 26.Pdf - Phishing Website Detection Using Novel Machine Learning Fusion Approach
No ratings yet
Sat - 26.Pdf - Phishing Website Detection Using Novel Machine Learning Fusion Approach
11 pages
20mis0106 VL2023240102875 Pe003
No ratings yet
20mis0106 VL2023240102875 Pe003
42 pages
Across_the_Spectrum_In-Depth_Review_AI-Based_Models_for_Phishing_Detection
No ratings yet
Across_the_Spectrum_In-Depth_Review_AI-Based_Models_for_Phishing_Detection
28 pages
A Deep Learning Approach To Phishing Website Detection
No ratings yet
A Deep Learning Approach To Phishing Website Detection
48 pages
Enhancing Phishing URL Detection Through Comprehen
No ratings yet
Enhancing Phishing URL Detection Through Comprehen
7 pages
Detecting Phishing Websites Using Machine Learning
No ratings yet
Detecting Phishing Websites Using Machine Learning
16 pages
Presentation Slides
No ratings yet
Presentation Slides
42 pages
Cui Qian 2019 Thesis
No ratings yet
Cui Qian 2019 Thesis
136 pages
Phishing Website Detection Using ML 2-1
No ratings yet
Phishing Website Detection Using ML 2-1
20 pages
Fr -Detecting Malicious Urls Using Data Analytics
No ratings yet
Fr -Detecting Malicious Urls Using Data Analytics
17 pages
AI-Based Phishing Detection Techniques
No ratings yet
AI-Based Phishing Detection Techniques
15 pages
JETIR2504A41
No ratings yet
JETIR2504A41
7 pages
Innovative Nitesh
No ratings yet
Innovative Nitesh
14 pages
Introduction(8)
No ratings yet
Introduction(8)
4 pages
Phishing Phase1 Report
No ratings yet
Phishing Phase1 Report
20 pages
Jain 2018
No ratings yet
Jain 2018
14 pages
Detection of Phishing Website
No ratings yet
Detection of Phishing Website
12 pages
Detecting Phishing Website With Code Implementation
No ratings yet
Detecting Phishing Website With Code Implementation
13 pages
Detection of Phishing On Apps and Websites - Project Report
No ratings yet
Detection of Phishing On Apps and Websites - Project Report
21 pages
Paper 1
No ratings yet
Paper 1
5 pages
Final Yr Project PhishingAttack Ppt
No ratings yet
Final Yr Project PhishingAttack Ppt
12 pages
Phish Guard Phishing Website using Machine Learning Algorithms
No ratings yet
Phish Guard Phishing Website using Machine Learning Algorithms
10 pages
Phishing Detection (Yamu Research Project)
No ratings yet
Phishing Detection (Yamu Research Project)
19 pages
Machine_Learning_for_Detecting_the_Phishing_Threats
No ratings yet
Machine_Learning_for_Detecting_the_Phishing_Threats
6 pages
Applsci 13 04649
No ratings yet
Applsci 13 04649
16 pages
Batch-22
No ratings yet
Batch-22
14 pages
A Predictive Model For Phishing Attack D
No ratings yet
A Predictive Model For Phishing Attack D
6 pages
Fascination: Honeypots and Cybercrime
From Everand
Fascination: Honeypots and Cybercrime
Armin Snyder
No ratings yet
By Palo Alto Networks - PA-3200 Series - Datasheet
No ratings yet
By Palo Alto Networks - PA-3200 Series - Datasheet
3 pages
IEH - Flash Cards
No ratings yet
IEH - Flash Cards
44 pages
CHAPTER 3
No ratings yet
CHAPTER 3
37 pages
2024 Top 10 Government Erp Systems Report Panorama Consulting
No ratings yet
2024 Top 10 Government Erp Systems Report Panorama Consulting
19 pages
Room Rent Proposal Report
No ratings yet
Room Rent Proposal Report
20 pages
Wordpress Security by Shehu Awwal
100% (1)
Wordpress Security by Shehu Awwal
19 pages
Crypto Assignment
No ratings yet
Crypto Assignment
9 pages
Cyber Crime Project Report
No ratings yet
Cyber Crime Project Report
50 pages
Advanced Encryption Standard
No ratings yet
Advanced Encryption Standard
3 pages
CP 211 Lecture 1 Introduction To System Administration
No ratings yet
CP 211 Lecture 1 Introduction To System Administration
47 pages
Cyber Crime: Types of Cyber Crimes
No ratings yet
Cyber Crime: Types of Cyber Crimes
5 pages
Usermanagement
No ratings yet
Usermanagement
8 pages
Artificial Intelligence Agents & Large Action Models in Digital Government
No ratings yet
Artificial Intelligence Agents & Large Action Models in Digital Government
35 pages
Layer7 Application Firewalls
No ratings yet
Layer7 Application Firewalls
5 pages
Web Forensics For Tackling Terrorism Investigation Use Case
No ratings yet
Web Forensics For Tackling Terrorism Investigation Use Case
14 pages
Unit 3 Law and Emerging Technology Unit 3
No ratings yet
Unit 3 Law and Emerging Technology Unit 3
7 pages
RRL
No ratings yet
RRL
18 pages
University of Engineering & Technology, Taxila: Lab Report 2
No ratings yet
University of Engineering & Technology, Taxila: Lab Report 2
8 pages
School Management System
No ratings yet
School Management System
7 pages
Asa Remote Access VPN Technologies: SSLVPN Webvpn Ipsecvpn: Security Consulting Se Ccie, Cissp
No ratings yet
Asa Remote Access VPN Technologies: SSLVPN Webvpn Ipsecvpn: Security Consulting Se Ccie, Cissp
43 pages
HIPAA Awareness For Healthcare Providers Certificate For Joao Furtado
No ratings yet
HIPAA Awareness For Healthcare Providers Certificate For Joao Furtado
3 pages
CA Module 20
No ratings yet
CA Module 20
22 pages
Nini Muly Resume-15
No ratings yet
Nini Muly Resume-15
1 page
Unit 8 Information System Security: 8.1 Introduction: Ethics in Information Society
No ratings yet
Unit 8 Information System Security: 8.1 Introduction: Ethics in Information Society
7 pages
Mobile Application Security and Penetration Testing: Syllabus
No ratings yet
Mobile Application Security and Penetration Testing: Syllabus
14 pages
Threats and Countermeasures
No ratings yet
Threats and Countermeasures
22 pages
Lynx
No ratings yet
Lynx
6 pages
SQL Injection Ethical Hacking Guide
No ratings yet
SQL Injection Ethical Hacking Guide
2 pages

39 (1)

Uploaded by

39 (1)

Uploaded by

AI/ML Dual Approach for Phishing Domain

Detection: URL and ImageAnalysis

Souvik Karmakar Dipanjan Santra Arijit Tewary

ISBN : 978-81-978522-2-0 https://doi.org/10.37285/bsp.oaisem2025.39 320

ISBN : 978-81-978522-2-0 321

III. PROBLEM STATEMENT

ISBN : 978-81-978522-2-0 322

in phishing URLs compared to legi mate ones.

Fig. 1: Distribution of Phishing

ISBN : 978-81-978522-2-0 323

Fig. 2: Receiver Operating Characteristic Curve

B. Image Detection Model

1) Data Collection:: We collected approx 1200 images of phishing and non-phishing

ISBN : 978-81-978522-2-0 324

Fig. 3: CNN Architecture

ISBN : 978-81-978522-2-0 325

ISBN : 978-81-978522-2-0 326

Epoch Training Training Validation Validation

Fig. 5: Training Loss & Validation Loss Graph

ISBN : 978-81-978522-2-0 327

Fig. 6: Training Accuracy & Validation Accuracy Graph

ISBN : 978-81-978522-2-0 328

ISBN : 978-81-978522-2-0 329

ISBN : 978-81-978522-2-0 330

You might also like