An Efficient Spam Detection Technique for IoT Devices Using Machine Learning
Abstract:
The Internet of Things (IoT) is a network of millions of devices with sensors and actuators, linked
over wired or wireless channels for data transmission. IoT has grown rapidly over the past decade,
with more than 25 billion devices expected to be connected by 2020. The volume of data
released from these devices will increase many-fold in the years to come. In addition to the
increased volume, IoT devices produce data in a number of different modalities, with data
quality that varies with its speed and with its dependence on time and position.
In such an environment, machine learning (ML) algorithms can play an important
role in ensuring security, in biotechnology-based authorization, and in anomaly detection to
improve the usability and security of IoT systems. On the other hand, attackers often use
learning algorithms to exploit the vulnerabilities in smart IoT-based systems. Motivated by
this, in this article we propose to secure IoT devices by detecting spam using ML. To
achieve this objective, a Spam Detection in IoT using Machine Learning framework is proposed.
In this framework, five ML models are evaluated using various metrics with a large collection of
input feature sets. Each model computes a spam score by considering the refined input
features. This score depicts the trustworthiness of an IoT device under various parameters. The REFIT
Smart Home dataset is used to validate the proposed technique. The results obtained
demonstrate the effectiveness of the proposed scheme in comparison to other existing schemes.
Existing System:
The safety measures of IoT devices depend upon the size and type of organization in which they are
deployed. The behavior of users forces the security gateways to cooperate. In other words,
the location, nature, and application of IoT devices decide the security measures. For
instance, smart IoT security cameras in a smart organization can capture different
parameters for analysis and intelligent decision making. The greatest care must be taken with
web-based devices, as most IoT devices are web dependent. It is common in the
workplace for the IoT devices installed in an organization to be used to implement security
and privacy features efficiently. For example, wearable devices that collect and send a user’s health
data to a connected smartphone should prevent leakage of information to ensure privacy. It has
been found that 25-30% of working employees connect their personal IoT devices
to the organizational network. The expanding nature of IoT attracts both audiences, i.e., the
users and the attackers.
However, with the emergence of ML in various attack scenarios, IoT devices have to choose a
defensive strategy and decide the key parameters in the security protocols for the trade-off between
security, privacy and computation. This job is challenging, as it is usually difficult for an IoT
system with limited resources to estimate the current network and attack status in a timely manner.
Proposed System:
1) The proposed scheme of spam detection is validated using five different machine learning
models.
2) An algorithm is proposed to compute the spamicity score of each model which is then used for
detection and intelligent decision making.
3) Based upon the spamicity score computed in the previous step, the reliability of IoT devices is
analyzed using different evaluation metrics.
1) Feature Engineering: Machine learning algorithms work accurately only with
appropriate instances and their attributes. The instances are real-world data
values, gathered from real-world smart objects deployed across the globe.
Feature extraction and feature selection are the core of the feature engineering process.
Feature reduction: This method is used to reduce the dimension of the data. In other words,
feature reduction is the procedure of reducing the complexity of the features. This technique
reduces issues such as over-fitting, large memory requirements, and high computation cost. There
are various feature extraction techniques. Among these, principal component analysis (PCA)
is the most popular. The method used in this proposal is PCA along with the following IoT
parameters.
– Analysis time: The dataset used in the experiments contains data recorded over a span
of eighteen months. For better results and accuracy, we have considered the data of one
month. Because climate is an important parameter for the working of IoT
devices, the month with the maximum variations has been taken into consideration.
– Web-based appliances: Only those appliances that stay connected to the web
for their working are included. The data collection includes the appliances: Television, Set top box, DVD
player/recorder, HiFi, Electric heater, Fridge, Dishwasher, Toaster, Coffee maker, Kettle,
Freezer, Washing machine, Tumble dryer, Electric heater, DAB radio, Desktop PC, PC
monitor, Printer, Router, Electric heater, Electric heater, Shredder, Freezer, Lamp, Alarm
radio, Lava lamp, CD player, Television, Video player, Set top box, Hub (network).
Feature selection: It is the process of selecting the most important subset of features. It works by
computing the importance of each feature [16].
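The feature reduction step described above relies on PCA. As a rough illustration only, the following Python sketch projects a small table of readings onto its directions of largest variance. The function names, the power-iteration approach, and the made-up readings are all assumptions chosen to keep the example dependency-free; the source does not say how PCA was implemented or configured.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    d = len(matrix)
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:  # no variance left to extract
            return 0.0, vec
        vec = [v / norm for v in nxt]
    value = sum(vec[i] * matrix[i][j] * vec[j]
                for i in range(d) for j in range(d))
    return value, vec


def pca_reduce(rows, n_components):
    """Project rows onto the n_components directions of largest variance."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, ratios = [], []
    for _ in range(n_components):
        value, vec = top_eigenpair(cov)
        components.append(vec)
        ratios.append(value / total_variance)
        # Deflation: remove the direction just found before searching again.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(r[j] * comp[j] for j in range(d)) for comp in components]
                 for r in centered]
    return projected, ratios


# Made-up readings (hypothetical, not REFIT data): the second column is almost
# a multiple of the first, so two components should keep nearly all variance.
rng = random.Random(0)
readings = []
for _ in range(200):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    readings.append([a, 2 * a + rng.gauss(0, 0.01), b])

projected, ratios = pca_reduce(readings, 2)
print(len(projected[0]), round(sum(ratios), 3))
```

Each entry of `ratios` is the share of total variance kept by one component, which is one common way to decide how many components to retain. In practice a numerical library would normally be used instead of hand-written power iteration.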

An entropy-based filter is used as the feature selection technique in this proposal.
– Entropy-based filter: This algorithm uses the correlation between the discrete attributes and
the continuous attributes to find the weights of the discrete attributes.
Three functions use this entropy-based filter, namely information.gain, gain.ratio, and
symmetrical.uncertainty. The syntax for these functions is:
information.gain(formula, data, unit)
gain.ratio(formula, data, unit)
symmetrical.uncertainty(formula, data, unit)
The arguments used in the function definitions are described here.
a) formula: It is a symbolic description of the model, naming the target attribute and the
attributes to be weighted.
b) data: It is the set of training data with the defined attributes for which the selection is to be
made.
c) unit: It is the unit used for computing entropy. By default it takes the value “log”.
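The three measures named above can be written in terms of Shannon entropy H: information gain is H(class) + H(attribute) - H(class, attribute), gain ratio divides that by H(attribute), and symmetrical uncertainty is twice the gain divided by H(attribute) + H(class). The Python sketch below illustrates these definitions on a toy example. The snake_case function names, the equal-width binning of continuous readings, the use of the natural logarithm, and the sample values are assumptions made for illustration; this is not the dotted-name interface listed above.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy of a sequence of discrete values, in natural-log units."""
    total = len(values)
    return -sum((c / total) * math.log(c / total)
                for c in Counter(values).values())


def equal_width_bins(values, n_bins=4):
    """Discretise continuous readings into n_bins equal-width bins (an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def information_gain(attribute, target):
    """H(target) + H(attribute) - H(target, attribute)."""
    joint = list(zip(attribute, target))
    return entropy(attribute) + entropy(target) - entropy(joint)


def gain_ratio(attribute, target):
    """Information gain normalised by the attribute's own entropy."""
    h_attr = entropy(attribute)
    return information_gain(attribute, target) / h_attr if h_attr > 0 else 0.0


def symmetrical_uncertainty(attribute, target):
    """2 * gain / (H(attribute) + H(target)); ranges from 0 to 1."""
    denom = entropy(attribute) + entropy(target)
    return 2 * information_gain(attribute, target) / denom if denom > 0 else 0.0


# Hypothetical toy data: eight device records with a spam/ham label.
labels = ["spam", "spam", "ham", "ham", "spam", "ham", "spam", "ham"]
power = [95, 90, 10, 12, 88, 15, 92, 11]   # separates the two labels cleanly
uptime = [5, 5, 5, 5, 9, 9, 9, 9]          # unrelated to the label

power_bins = equal_width_bins(power)
uptime_bins = equal_width_bins(uptime)

for name, bins in (("power", power_bins), ("uptime", uptime_bins)):
    print(name,
          round(information_gain(bins, labels), 3),
          round(gain_ratio(bins, labels), 3),
          round(symmetrical_uncertainty(bins, labels), 3))
```

An attribute that separates the labels well receives a weight close to 1 under gain ratio and symmetrical uncertainty, while an unrelated attribute receives a weight close to 0; features can then be ranked by weight and the top subset kept.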
