Slide Format

Uploaded by

Rakshith Ah

TITLE

CLASSIFICATION OF EMAILS
USING DATA MINING AND
MACHINE LEARNING ALGORITHMS
TEAMS
SI NO   NAME                USN
01      ADITI               01JST21IS004
02      DHANYA SHEKAR DC    01JST21IS015
03      RAKSHITH A H        01JST21IS040
AGENDA
INTRODUCTION
E-mail is one of the most popular and frequently used means of communication because of
its worldwide accessibility, relatively fast message transfer, and low sending cost.
Email messages are sent from software programs and web browsers, collectively
referred to as email "clients". Individual messages are routed through multiple servers
before they reach the recipient's email server. In general, e-mail messages are
classified as "Ham" or "Spam". Ham messages are the intended, legitimate messages in a
mailbox, whereas Spam messages are the junk, unsolicited bulk or commercial messages
in the mailbox.
An e-mail may be considered Spam when it shows signs such as bad grammar, distorted
images, distorted symbols or logos, bad links, tempting offers, and time-based
subscriptions that pressure users to subscribe immediately. Phishing is a dangerous
cyber-crime that targets individuals and tricks them into clicking links or subscribing
so that their data can be stolen: login credentials of social accounts or, in the
worst-case scenario, internet banking details. Phishing e-mails are also considered
spam messages. Spam e-mails also include:
• Spamvertised sites: emails that advertise products and contain URLs that direct to
other webpages.
• 419 scams: spam emails in which users are offered a huge sum of money in return for
a small initial payment.
• Image spam: emails whose content is displayed in the form of images.
When a large number of spam messages are received, it takes a long time to separate
spam from non-spam email, and the volume of messages may cause the mail server to
crash. To solve the spam problem, there have been several attempts to detect and
filter spam email on the client side. Data mining and ML approaches are applied to
the problem, including Bayesian classifiers such as Naive Bayes and the KNN algorithm.
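As a rough illustration of how a Naive Bayes classifier separates spam from ham, here is a minimal sketch. It assumes a tiny made-up training set and simple whitespace tokenization; the names TRAIN, train_naive_bayes and classify are hypothetical and are not taken from the project.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up training set (illustrative only, not a real corpus).
TRAIN = [
    ("win a free prize now", "spam"),
    ("claim your free money offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]

def train_naive_bayes(samples):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the highest log-posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Log prior: fraction of training emails with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train_naive_bayes(TRAIN)
print(classify("free prize offer", model))
print(classify("project meeting on monday", model))
```

With only four training messages this shows the mechanics and nothing more; a usable filter needs a large labeled corpus.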
LITERATURE SURVEY
SI NO: 01
TITLE: EMAIL SPAM DETECTION USING SUPERVISED ALGORITHMS
AUTHORS: Sunidhi Pandey, Shantanu Singh Chandel, Prof. Kunal Kumar
REVIEW:
• Paper [1] compares Naïve Bayes classification and Support Vector Machines (SVM) for a
certain task, and concludes that Naïve Bayes is the better algorithm.
• Paper [2] utilizes deep learning techniques, specifically the BERT-based cased
transformer model.
• Paper [3] focuses on detecting spam emails using machine learning, particularly
Naïve Bayes algorithms.
• Paper [4] takes a comprehensive approach to classification using various
preprocessing techniques like stop word removal, tokenization, and bag of words.
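The preprocessing steps named in the review of paper [4] (tokenization, stop word removal, bag of words) can be sketched as follows. This is a minimal illustration, not the method used in that paper: the stop-word list is a short assumed sample, and the function names are hypothetical.

```python
import re
from collections import Counter

# Short illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "to", "is", "and", "of", "in", "you", "your"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stop_words(tokens):
    """Drop very common words that carry little class information."""
    return [t for t in tokens if t not in STOP_WORDS]

def bag_of_words(text):
    """Map each remaining token to the number of times it occurs."""
    return Counter(remove_stop_words(tokenize(text)))

def vectorize(texts):
    """Turn several texts into count vectors over a shared, sorted vocabulary."""
    bags = [bag_of_words(t) for t in texts]
    vocab = sorted(set().union(*bags))
    vectors = [[bag[word] for word in vocab] for bag in bags]
    return vocab, vectors

vocab, vectors = vectorize(["free prize", "prize meeting"])
print(vocab)
print(vectors)
```

Each email becomes a fixed-length vector of word counts, which is the input form expected by classifiers such as Naïve Bayes or KNN.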
SI NO: 02
TITLE: SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES
AUTHORS: T. Hamsapriya, D. Karthika Renuka and M. Raja Chakkaravarthi
REVIEW:
SI NO: 03
TITLE: COMPARATIVE ANALYSIS OF DETECTION OF THE EMAIL SPAM USING THE AID OF MACHINE LEARNING
AUTHORS: Mangena Venu Madhavan, Sagar Pande, Pooja Umekar, Tushar Mahore, Dhiraj Kalyankar
REVIEW:
SI NO: 04
TITLE: EMAIL CLASSIFICATION ANALYSIS USING MACHINE LEARNING TECHNIQUES
AUTHORS:
REVIEW:
SI NO: 05
TITLE:
AUTHORS:
REVIEW:
PROBLEM STATEMENT
Aim:
To develop an effective classification model using data mining and machine learning
techniques for accurately distinguishing between spam and non-spam (ham) emails.
Objectives:
• Data Gathering: Collect a variety of labeled emails containing both spam and ham
examples.
• Data Cleaning: Remove duplicate emails, handle missing data, and address class
imbalance.
• Feature Extraction: Identify and extract relevant features from the email content.
• Model Building: Develop and evaluate multiple machine learning models for classifying
spam and ham emails.
• Performance Evaluation: Assess the models' performance using metrics like accuracy,
precision, recall, and F1 score.
• Model Comparison: Compare the performance of different models to identify the most
effective one.
• Optimization: Fine-tune the parameters of the best-performing model to improve its
accuracy and generalization.
• Deployment: Implement the chosen model into a real-world email system for automatic
spam classification.
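The performance evaluation objective can be made concrete with a short sketch of how accuracy, precision, recall, and F1 score are computed from true and predicted labels. It assumes "spam" is treated as the positive class; the function names and the sample labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Count true/false positives and negatives for the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # spam correctly flagged
        elif pred == positive:
            fp += 1  # ham wrongly flagged as spam
        elif truth == positive:
            fn += 1  # spam that slipped through
        else:
            tn += 1  # ham correctly passed
    return tp, fp, fn, tn

def evaluate(y_true, y_pred, positive="spam"):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up labels for illustration only.
y_true = ["spam", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam"]
print(evaluate(y_true, y_pred))
```

For spam filtering, precision matters because a false positive hides a legitimate email, while recall measures how much spam is actually caught. Comparing models on both, and on F1, avoids relying on accuracy alone when the classes are imbalanced.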
