Phishing Email Detection Abstract

The document proposes a new deep learning model to detect phishing emails. It uses an improved recurrent convolutional neural network (RCNN) model to analyze emails at the character, word, header, and body levels simultaneously. An experiment on an unbalanced real-world dataset shows that the model identifies phishing emails with high accuracy while filtering out few legitimate emails, outperforming existing detection methods. The model introduces minimal noise and applies an attention mechanism to the header and body.

Uploaded by

Tunnu Sunny

Phishing Email Detection Using Improved RCNN Model with Multilevel Vectors and Attention Mechanism

Abstract:

Phishing email is one of the most significant threats in the world today and has
caused tremendous financial losses. Although countermeasures are continually being
updated, their results are not yet satisfactory, and the volume of phishing emails
has grown at an alarming rate in recent years. More effective phishing detection
technology is therefore needed to curb the threat. In this paper, we first analyze
the email structure. Then, based on an improved Recurrent Convolutional Neural
Network (RCNN) model with multilevel vectors and an attention mechanism, we propose
a new phishing email detection model that models emails at the email header, the
email body, the character level, and the word level simultaneously. To evaluate the
effectiveness of the model, we use an unbalanced dataset that has realistic ratios
of phishing and legitimate emails. The experimental results indicate that the
filter can identify phishing emails with high probability while filtering out as
few legitimate emails as possible. This promising result is superior to existing
detection methods and verifies the effectiveness of the model in detecting phishing
emails.
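The trade-off the abstract describes (catching phishing emails while discarding few legitimate ones) is usually measured with recall and false positive rate rather than accuracy alone, because accuracy can look good on an unbalanced dataset even when no phishing is caught. The following is a minimal sketch of those metrics; the function names and the toy labels are assumptions for illustration and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN where label 1 = phishing and 0 = legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Return accuracy, recall (phishing caught) and false positive rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Toy unbalanced data (hypothetical): 8 legitimate emails, 2 phishing emails.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)

# A classifier that labels everything "legitimate" has the same accuracy
# on this data but catches no phishing at all.
always_legit = evaluate(y_true, [0] * len(y_true))
```

On this toy data both classifiers reach the same accuracy, yet only the first one has non-zero recall, which is why an unbalanced evaluation needs more than a single accuracy figure.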
Architecture:
EXISTING SYSTEM:
Various techniques for detecting phishing emails are described in the literature.
Over the course of the technology's development, three main types of method have
emerged: blacklist mechanisms, classification algorithms based on machine
learning, and methods based on deep learning. Detection methods based on the
blacklist mechanism rely mainly on people identifying and reporting phishing
links, which requires a large amount of manpower and time. Detection methods based
on machine learning classification algorithms require feature engineering to find
representative features manually, which makes them hard to migrate to new
application scenarios. Moreover, current detection methods based on deep learning
are limited to word embedding in the content representation of the email. These
methods transfer natural language processing (NLP) and deep learning technology
directly, ignoring the specificity of phishing email detection, so their results
are not ideal. Given the methods mentioned above and their corresponding problems,
we set out to study phishing email detection systematically based on deep
learning. Specifically, this paper makes the following contributions:

Disadvantages
1. With respect to the particularity of email text, we analyze the email
structure and mine text features from four more detailed parts: the email
header, the email body, the word level, and the char level.
2. The RCNN model is improved, and the email is then modelled from multiple
levels using the improved RCNN model. Noise is introduced as little as
possible, and the context information of the email can be better captured.
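The four-part view of an email in item 1 can be illustrated with Python's standard `email` package. This is only a sketch of the splitting step, assuming a simple non-multipart message; the sample message, the `split_levels` helper, and the whitespace tokenisation are hypothetical choices, not the preprocessing used in the paper.

```python
from email import message_from_string

# Hypothetical sample message, used only for illustration.
RAW_EMAIL = (
    "From: support@example.com\n"
    "To: user@example.org\n"
    "Subject: Verify your account\n"
    "\n"
    "Please verify your account at http://example.com/login\n"
)


def levels(text):
    """Return word-level and char-level token sequences for one text."""
    lowered = text.lower()
    return {"word": lowered.split(), "char": list(lowered)}


def split_levels(raw):
    """Split a raw email into header/body, each at word and char level."""
    msg = message_from_string(raw)
    header = " ".join(f"{name}: {value}" for name, value in msg.items())
    payload = msg.get_payload()
    # Assumption: multipart messages are out of scope for this sketch.
    body = payload if isinstance(payload, str) else ""
    return {"header": levels(header), "body": levels(body)}


parts = split_levels(RAW_EMAIL)
```

Each of the four token sequences (`parts["header"]["word"]`, `parts["header"]["char"]`, `parts["body"]["word"]`, `parts["body"]["char"]`) would then be embedded and fed to its own branch of the model.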
PROPOSED SYSTEM:
With the emergence of email, the convenience of communication has led to the
problem of massive spam, especially phishing attacks through email. Various
anti-phishing technologies have been proposed to solve the problem of phishing
attacks. Earlier work studied the effectiveness of phishing blacklists. Blacklists
mainly include sender blacklists and link blacklists. This detection method
extracts the sender's address and the link addresses in the message and checks
whether they are in the blacklist to decide whether the email is a phishing email.
Blacklist updates usually come from user reports, and whether a reported website
is a phishing website is identified manually. At present, two well-known phishing
blacklist sites are PhishTank and OpenPhish. To some extent, the completeness of
the blacklist determines the effectiveness of blacklist-based phishing email
detection.

The current situation is that new threats may not only cause severe damage to
customers' computers but also aim to steal their money and identity. Among these
threats, phishing is a noteworthy one: it is a criminal activity that uses social
engineering and technology to steal a victim's identity data and account
information. According to a report from the Anti-Phishing Working Group (APWG),
phishing has shown an apparent upward trend in recent years, and the harm it
causes has grown accordingly.
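The blacklist check described above (extract the sender address and link addresses, then look them up) can be sketched with the standard library. The blacklist entries, sample messages, and the `is_blacklisted` helper below are hypothetical; a real deployment would load its lists from a feed such as PhishTank or OpenPhish, which this sketch does not attempt.

```python
import re
from email import message_from_string
from email.utils import parseaddr
from urllib.parse import urlparse

# Hypothetical blacklist entries for illustration only.
SENDER_BLACKLIST = {"phisher@bad.example"}
LINK_BLACKLIST = {"bad.example"}

URL_RE = re.compile(r"https?://[^\s\"'<>]+")

# Hypothetical sample messages.
BAD_LINK = "From: Bank <alerts@ok.example>\nSubject: Hi\n\nClick http://bad.example/login now\n"
BAD_SENDER = "From: phisher@bad.example\nSubject: Hi\n\nHello\n"
CLEAN = "From: friend@ok.example\nSubject: Lunch\n\nSee https://ok.example/menu\n"


def is_blacklisted(raw):
    """Return True if the sender or any linked host is on a blacklist."""
    msg = message_from_string(raw)
    _, sender = parseaddr(msg.get("From", ""))
    if sender.lower() in SENDER_BLACKLIST:
        return True
    payload = msg.get_payload()
    # Assumption: only plain, non-multipart bodies are inspected.
    body = payload if isinstance(payload, str) else ""
    for url in URL_RE.findall(body):
        host = urlparse(url).hostname or ""
        if host.lower() in LINK_BLACKLIST:
            return True
    return False
```

The sketch also shows the limitation noted in the text: an address or host that has not yet been reported is never flagged, so detection quality depends entirely on how complete the lists are.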
Advantages
1. Phishing email refers to an attacker using a fake email to trick the recipient
into returning information such as an account password to a designated
recipient.
2. Additionally, it may be used to trick recipients into visiting special web
pages, which are usually disguised as real web pages, such as a bank's web
page, to convince users to enter sensitive information such as a credit card or
bank card number and password. Although a phishing email attack seems simple,
its harm is immense.

ALGORITHM

R-CNN Algorithms
Let’s quickly summarize the different algorithms in the R-CNN family (R-CNN,
Fast R-CNN, and Faster R-CNN). Note that R-CNN here denotes the region-based CNN
family used for object detection in images, which is distinct from the recurrent
convolutional neural network (RCNN) on which the email model is based. This
summary lays the ground for the implementation part, where bounding boxes are
predicted in previously unseen images (new data). R-CNN extracts a set of regions
from the given image using selective search and then checks whether any of these
boxes contains an object. The regions are extracted first, and for each region a
CNN is used to extract specific features. Finally, these features are used to
detect objects. Unfortunately, R-CNN is rather slow because of the multiple steps
involved in the process. Fast R-CNN, on the other hand, passes the entire image to
a ConvNet, which generates regions of interest (instead of passing the extracted
regions from the image). Also, instead of using three different models (as in
R-CNN), it uses a single model that extracts features from the regions, classifies
them into different classes, and returns the bounding boxes. All these steps are
done simultaneously, which makes it faster than R-CNN. Fast R-CNN is, however, not
fast enough when applied to a large dataset, as it also uses selective search to
extract the regions.

REQUIREMENT ANALYSIS

The project involved analyzing the design of a few applications so as to make
the application more user-friendly. To do so, it was important to keep the
navigation from one screen to the next well ordered while reducing the amount of
typing the user needs to do. To make the application more accessible, the browser
version had to be chosen so that it is compatible with most browsers.

REQUIREMENT SPECIFICATION

Functional Requirements

• Graphical user interface for the user.

Software Requirements

For developing the application, the following software is required:

1. Python

2. Django

3. MySQL

4. mysqlclient

5. WampServer 2.4
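
As a rough illustration of how the listed components fit together, a Django project reaches MySQL through the `mysqlclient` driver by selecting the `django.db.backends.mysql` engine in its settings. The fragment below is a sketch only: the database name, user, host, and the `DB_PASSWORD` environment variable are placeholder assumptions, not values from this project.

```python
import os

# Fragment of a Django settings.py (sketch; all values are placeholders).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # uses the mysqlclient driver
        "NAME": "phishing_db",                 # hypothetical database name
        "USER": "phishing_user",               # hypothetical database user
        "PASSWORD": os.environ.get("DB_PASSWORD", ""),
        "HOST": "127.0.0.1",                   # local MySQL, e.g. from WampServer
        "PORT": "3306",                        # default MySQL port
    }
}
```

Reading the password from the environment keeps credentials out of the source tree.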

Operating Systems supported

1. Windows 7

2. Windows XP

3. Windows 8

Technologies and Languages used to Develop

1. Python

Debugger and Emulator

• Any browser (particularly Chrome)

Hardware Requirements

For developing the application, the following hardware is required:

• Processor: Pentium IV or higher
• RAM: 256 MB
• Space on hard disk: minimum 512 MB

Conclusion:

We use a new deep learning model to detect phishing emails. The model employs an
improved RCNN to model the email header and the email body at both the character
level and the word level, so noise is introduced into the model minimally. In the
model, we use the attention mechanism on the header and the body, making the model
pay more attention to the more valuable information between them. We use an
unbalanced dataset that is closer to the real-world situation to conduct
experiments and evaluate the model, and the model obtains a promising result.
Several experiments are performed to demonstrate the benefits of the proposed
model. For future work, we will focus on how to improve our model for detecting
phishing emails with no email header and only an email body.
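
The idea of letting attention weigh the header against the body can be illustrated with a generic dot-product attention over two vectors. This is a minimal sketch of the general technique, not the formulation used in the paper; the `attend` function, the `query` vector (standing in for learned parameters), and the toy vectors are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(header_vec, body_vec, query):
    """Weight header and body representations by dot-product attention.

    `query` stands in for a learned parameter vector (an assumption here).
    Returns the two attention weights and the fused representation.
    """
    scores = [
        sum(q * x for q, x in zip(query, vec))
        for vec in (header_vec, body_vec)
    ]
    weights = softmax(scores)
    fused = [
        weights[0] * h + weights[1] * b
        for h, b in zip(header_vec, body_vec)
    ]
    return weights, fused


# Toy vectors (hypothetical): the query favours the header's first dimension.
weights, fused = attend([1.0, 0.0], [0.0, 1.0], query=[2.0, 0.0])
equal_weights, equal_fused = attend([1.0, 0.0], [0.0, 1.0], query=[1.0, 1.0])
```

When both parts score equally the weights are split evenly; otherwise the fused vector leans toward whichever part the query scores higher.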
