0% found this document useful (0 votes)

51 views78 pages

16.smart Sensing Production

Uploaded by

v rohit

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

51 views78 pages

16.smart Sensing Production

Uploaded by

v rohit

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 78

DNN MODEL FOR SECURING SMART SENSING

PRODUCTION SYSTEM: APPROACHES FOR SECURITY IN

SMART MANUFACTURING

ABSTRACT
In recent years, the integration of smart sensing technologies in production systems has
become increasingly prevalent. These systems leverage sensors to collect and process data in
real-time, enabling more efficient and automated manufacturing processes. However, with the
growing complexity and connectivity of these smart sensing production systems, there is a
heightened need for robust security measures to protect against potential cyber threats and
vulnerabilities. These challenges include the potential for unauthorized access to sensitive
data, manipulation of sensor readings, and disruption of communication between devices.
The problem definition, therefore, revolves around developing a security framework that can
effectively mitigate these emerging threats and ensure the integrity, confidentiality, and
availability of the smart sensing production system. Traditional security systems typically
rely on firewalls, intrusion detection systems, and encryption techniques to safeguard
networks and data. However, these measures may not be sufficient to address the specific
challenges posed by smart sensing production systems. Traditional systems also struggle to
detect subtle and sophisticated attacks targeting the interconnected sensors and
communication channels. As a result, there is a need for a more adaptive and intelligent
security solution that can understand the unique characteristics of smart sensing environments
and respond proactively to emerging threats. In addition, any compromise in the security of
modern manufacturing systems can have severe consequences, including disruptions in
production, data breaches, and potential safety hazards. As the manufacturing industry
continues to adopt Industry 4.0 principles, the reliance on interconnected devices and data-
driven decision-making underscores the urgency of implementing advanced security
measures. Therefore, this research offers a promising solution for securing smart sensing
production systems with the adoption of deep neural networks (DNNs), which excel at
processing complex and high-dimensional data, making them well-suited for analyzing the
diverse streams of information generated by sensors in a production environment. By
leveraging machine learning and artificial intelligence, DNN models can learn patterns of
normal behavior and detect anomalies indicative of security threats. These models provide a
more intelligent and adaptive approach to security, offering a higher level of protection
against emerging cyber threats in the context of interconnected and data-driven production
environments.
CHAPTER 1

INTRODUCTION
Sensors are most commonly used in numerous applications ranging from body-parameters’
measurement to automated driving. Moreover, sensors play a key role in performing
detection- and vision-related tasks in all the modern applications of science, engineering and
technology where the computer vision is dominating. An interesting emerging domain that
employs the smart sensors is the Internet of Things (IoT) dealing with wireless networks and
sensors distributed to sense data in real time and producing specific outcomes of interest
through suitable processing. In IoT-based devices, sensors and artificial intelligence (AI) are
the most important elements which make these devices sensible and intelligent. In fact, due to
the role of AI, the sensors act as smart sensors and find an efficient usage for a variety of
applications, such as general environmental monitoring [1]; monitoring a certain number of
environmental factors; weather forecasting; satellite imaging and its use; remote sensing
based applications; hazard events’ monitoring such as landslide detection; self-driving cars;
healthcare and so on. In reference to this latter sector, recently the usage of smart devices has
been hugely increased in hospitals and diagnostic centers for evaluating and monitoring
various health conditions of affected patients, remotely as well as physically [2].

Practically, there is no field of science or research which performs smartly without using the
modern sensors. The wide usage and need of sensors; and IoT employed in remote sensing,
environment and human health monitoring make the applications as intelligent. In the last
decade, the agriculture applications have also included [3] the utilization of many types of
sensors for monitoring and controlling various types of environmental parameters such as
temperature, humidity, soil quality, pollution, air quality, water contamination, radiation, etc.
This paper also aims to highlight the use of the sensors and IoT for remote sensing and
agriculture applications in terms of extensive discussion and review.

In recent years, SHM of civil structures has been a critical topic for research. SHM helps to
detect the damage of a structure, and it also provides early caution of a structure that is not in
a safe condition for usage. Civil infrastructure like [4] bridges get damaged with time, and the
reason for the damage is heavy vehicles, loading environmental changes, and dynamic forces
such as seismic. These types of changes mainly occur at existing structures constructed long
ago, and various methods will detect that damage. The strategy of SHM involves observing
the structure for a certain period to notice the condition of the structure and the periodic
measurements of data will be collected, and the features of data will be extracted from these
computation results, and the process of analysis can be done with the help of a featured data
to find out the present-day health of the structure. The information collected from the process
can be updated periodically to monitor the structure and based on the data collected through
monitoring a structure, and the structure can be strengthened and repaired, and rehabilitation
and maintenance can be completed [5].
CHAPTER 2

LITERATURE SURVEY
Ullo et. al [6] focused on an extensive study of the advances in smart sensors and IoT,
employed in remote sensing and agriculture applications such as the assessment of weather
conditions and soil quality; the crop monitoring; the use of robots for harvesting and
weeding; the employment of drones. The emphasis has been given to specific types of sensors
and sensor technologies by presenting an extensive study, review, comparison and
recommendation for advancements in IoT that would help researchers, agriculturists, remote
sensing scientists and policy makers in their research and implementations.

Sivasuriyan et. al [7] provides a detailed understanding of bridge monitoring, and it focuses
on sensors utilized and all kinds of damage detection (strain, displacement, acceleration, and
temperature) according to bridge nature (scour, suspender failure, disconnection of bolt and
cables, etc.) and environmental degradation under static and dynamic loading. This paper
presents information about various methods, approaches, case studies, advanced
technologies, real-time experiments, stimulated models, data acquisition, and predictive
analysis. Future scope and research also discussed the implementation of SHM in bridges.
The main aim of this research is to assist researchers in better understanding the monitoring
mechanism in bridges.

Dazhe Zhao et. al [8] proposed an easy-fabricated and compact untethered triboelectric patch
with Polytetrafluoroethylene (PTFE) as triboelectric layer and human body as conductor. We
find that the conductive characteristic of human body has negligible influence on the outputs,
and the untethered triboelectric patch has good output ability and robustness. The proposed
untethered triboelectric patches can work as sensor patches and energy harvester patches.
Three typical applications are demonstrated, which are machine learning assisted objects
distinguishing with accuracy up to 93.09–94.91 %, wireless communication for sending
typical words to a cellphone, and human motions energy harvesting for directly powering
electronics or charging an energy storage device.

Bacco et. al [9] described, both analytically and empirically, a real testbed implementing
IEEE 802.15.4-based communications between an UAV and fixed ground sensors. In our
scenario, we found that aerial mobility limits the actual IEEE 802.15.4 transmission range
among the UAV and the ground nodes to approximately 1/3 of the nominal one. We also
provide considerations to design the deployment of sensors in precision agriculture scenarios.

Verma et. al [10] discussed the existing state-of-the-art practices of improved intelligent
features, controlling parameters and Internet of things (IoT) infrastructure required for smart
building. The main focus is on sensing, controlling the IoT infrastructure which enables the
cloud clients to use a virtual sensing infrastructure using communication protocols. The
following are some of the intelligent features that usually make building smart such as
privacy and security, network architecture, health services, sensors for sensing, safety, and
overall management in smart buildings. As we know, the Internet of Things (IoT) describes
the ability to connect and control the appliances through the network in smart buildings. The
development of sensing technology, control techniques, and IoT infrastructure give rise to a
smart building more efficient. Therefore, the new and problematic innovation of smart
buildings in the context of IoT is to a great extent and scattered. The conducted review
organized in a scientific manner for future research direction which presents the existing
challenges, and drawbacks.

Hu et. al [11] presented a real-time, fine-grained, and power-efficient air quality monitor
system based on aerial and ground sensing. The architecture of this system consists of the
sensing layer to collect data, the transmission layer to enable bidirectional communications,
the processing layer to analyze and process the data, and the presentation layer to provide a
graphic interface for users. Three major techniques are investigated in our implementation for
data processing, deployment strategy, and power control. For data processing, spatial fitting
and short-term prediction are performed to eliminate the influences of incomplete
measurement and the latency of data uploading. The deployment strategies of ground sensing
and aerial sensing are investigated to improve the quality of the collected data. Power control
is further considered to balance between power consumption and data accuracy. Our
implementation has been deployed in Peking University and Xidian University since
February 2018, and has collected almost 100,000 effective values thus far.

Famila et. al [12] proposed an Improved Artificial Bee colony optimization based
ClusTering(IABCOCT) algorithm by utilizing the merits of Grenade Explosion Method
(GEM) and Cauchy Operator. This incorporation of GEM and Cauchy operator prevents the
Artificial Bee Colony (ABC) algorithm from stuck into local optima and improves the
convergence rate. The benefits of GEM and Cauchy operator are embedded into the Onlooker
Bee and scout bee phase for phenomenal improvement in the degree of exploitation and
exploration during the process of CH selection. The simulation results reported that the
IABCOCT algorithm outperforms the state of art methods like Hierarchical Clustering-based
CH Election (HCCHE), Enhanced Particle Swarm Optimization Technique (EPSOCT)

CHAPTER 3

EXISTING SYSTEM

3.1 K-Nearest Neighbor (KNN)

K-Nearest Neighbour is one of the simplest Machine Learning algorithms based on

Supervised Learning technique. K-NN algorithm assumes the similarity between the new
case/data and available cases and put the new case into the category that is most similar to the
available categories.

It stores all the available data and classifies a new data point based on the similarity. This
means when new data appears then it can be easily classified into a well suite category by
using K- NN algorithm. It can be used for Regression as well as for Classification but mostly
it is used for the Classification problems. It is a non-parametric algorithm, which means it
does not make any assumption on underlying data. It is also called a lazy learner algorithm
because it does not learn from the training set immediately instead it stores the dataset and at
the time of classification, it performs an action on the dataset.

KNN algorithm at the training phase just stores the dataset and when it gets new data, then it
classifies that data into a category that is much like the new data.

Why do we need a K-NN Algorithm?

Suppose there are two categories, i.e., Category A and Category B, and we have a new data
point x1, so this data point will lie in which of these categories. To solve this type of problem,
we need a K-NN algorithm. With the help of K-NN, we can easily identify the category or
class of a particular dataset. Consider the below diagram:
Fig. 3.4: KNN on dataset.

How does K-NN work?

The K-NN working can be explained based on the below algorithm:

Step-1: Select the number K of the neighbors.

Step-2: Calculate the Euclidean distance of K number of neighbors.

Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.

Step-4: Among these k neighbors, count the number of the data points in each category.

Step-5: Assign the new data points to that category for which the number of the neighbor is
maximum.

Step-6: Model is ready.

Suppose we have a new data point, and we need to put it in the required category. Consider
the below image:
Fig. 3.5: Considering new data point.

Firstly, we will choose the number of neighbors, so we will choose the k=5.

Next, we will calculate the Euclidean distance between the data points. The Euclidean
distance is the distance between two points, which we have already studied in geometry. It
can be calculated as:

Fig. 3.6: Measuring of Euclidean distance.

By calculating the Euclidean distance we got the nearest neighbors, as three nearest neighbors
in category A and two nearest neighbors in category B. Consider the below image:
Fig. 3.7: Assigning data point to category A.

As we can see the 3 nearest neighbors are from category A, hence this new data point must
belong to category A.

How to select the value of K in the K-NN Algorithm?

Below are some points to remember while selecting the value of K in the K-NN algorithm:

 There is no particular way to determine the best value for "K", so we need to try some
values to find the best out of them. The most preferred value for K is 5.
 A very low value for K such as K=1 or K=2, can be noisy and lead to the effects of
outliers in the model.
 Large values for K are good, but it may find some difficulties.

Advantages of KNN Algorithm

 It is simple to implement.
 It is robust to the noisy training data.
 It can be more effective if the training data is large.

Disadvantages of KNN Algorithm

 Always needs to determine the value of K which may be complex some time.
 The computation cost is high because of calculating the distance between the data
points for all the training samples.
CHAPTER 4

PROPOSED SYSTEM

4.1 Overview

The Python script that uses the Tkinter library to create a graphical user interface (GUI) for a
Smart Sensing System in an industrial environment. The GUI provides functionality for
uploading and preprocessing datasets, running various machine learning algorithms (Naive
Bayes, Random Forest, SVM, Logistic Regression, DNN, KNN), and displaying performance
metrics.

Fig. 4.1: Block diagram of proposed system.

Imported Libraries:

The script imports various libraries such as Tkinter for GUI, NumPy, Matplotlib, Pandas,
Scikit-learn, Seaborn, Webbrowser, TensorFlow, and others.

Global Variables:

Several global variables are declared to store information such as the dataset, file names,
trained models, and performance metrics.
Main GUI Window:

The main GUI window is created using Tkinter with a specified title and dimensions.

Functions:

The script defines several functions to perform specific tasks, including:

uploadDataset(): Allows the user to upload a dataset and displays basic information about it.

preprocessDataset(): Preprocesses the dataset by handling missing values, encoding

categorical variables, oversampling using SMOTE, and splitting into training and testing sets.

Functions for running machine learning algorithms (runNaiveBayes(), runRandomForest(),

runSVM(), runLR(), DNN(), runKNN()).

calculateMetrics(): Computes and displays performance metrics (accuracy, precision, recall,

F1-score) and confusion matrices for the chosen algorithm.

predict(): Allows the user to predict failure types for new test data.

graph(): Generates a comparison graph for the performance metrics of different algorithms.

GUI Components:

The GUI includes buttons for uploading and preprocessing the dataset, running various
algorithms, displaying metrics, and predicting from test data.The results and information are
displayed in a Text widget within the GUI.

Machine Learning Models and Serialization:

Trained models (Naive Bayes, Random Forest, SVM, Logistic Regression, DNN, KNN) are
saved using the pickle library to avoid retraining each time. The DNN model is saved in a
separate HDF5 file.

HTML Report:

The script generates an HTML report with a table containing the accuracy, precision, recall,
and F1-score for each algorithm. The report is opened in a web browser.

Graphical Visualization:

The code creates graphical visualizations such as count plots and confusion matrices using
Matplotlib and Seaborn.
Predictions:

The predict () function allows the user to select a test dataset and obtain predictions using the
trained model.

Overall, the script provides an interactive interface for users to upload datasets, preprocess
data, train machine learning models, and analyse the performance of different algorithms in
the context of an industrial sensing system.

4.2 Dataset Splitting

In machine learning data pre-processing, we divide our dataset into a training set and test set.
This is one of the crucial steps of data pre-processing as by doing this, we can enhance the
performance of our machine learning model. Suppose if we have given training to our
machine learning model by a dataset and we test it by a completely different dataset. Then, it
will create difficulties for our model to understand the correlations between the models.

If we train our model very well and its training accuracy is also very high, but we provide a
new dataset to it, then it will decrease the performance. So, we always try to make a machine
learning model which performs well with the training set and also with the test dataset. Here,
we can define these datasets as:

Training Set: A subset of dataset to train the machine learning model, and we already know
the output.

Test set: A subset of dataset to test the machine learning model, and by using the test set,
model predicts the output.

4.3 DNN

4.3.1 Perceptron

Although today the Perceptron is widely recognized as an algorithm, it was initially intended
as an image recognition machine. It gets its name from performing the human-like function
of perception, seeing, and recognizing images.
In particular, interest has been centered on the idea of a machine which would be capable of
conceptualizing inputs impinging directly from the physical environment of light, sound,
temperature, etc. — the “phenomenal world” with which we are all familiar — rather than
requiring the intervention of a human agent to digest and code the necessary information.
Rosenblatt’s perceptron machine relied on a basic unit of computation, the neuron. Just like in
previous models, each neuron has a cell that receives a series of pairs of inputs and weights.
The major difference in Rosenblatt’s model is that inputs are combined in a weighted
sum and, if the weighted sum exceeds a predefined threshold, the neuron fires and produces
an output.

Perceptron neuron model (left) and threshold logic (right).

Threshold T represents the activation function. If the weighted sum of the inputs is greater
than zero the neuron outputs the value 1, otherwise the output value is zero.

Perceptron for Binary Classification

With this discrete output, controlled by the activation function, the perceptron can be used as
a binary classification model, defining a linear decision boundary.

It finds the separating hyperplane that minimizes the distance between misclassified points
and the decision boundary. The perceptron loss function is defined as below:

To minimize this distance, perceptron uses stochastic gradient descent (SGD) as the
optimization function. If the data is linearly separable, it is guaranteed that SGD will
converge in a finite number of steps. The last piece that Perceptron needs is the activation
function, the function that determines if the neuron will fire or not. Initial Perceptron models
used sigmoid function, and just by looking at its shape, it makes a lot of sense! The sigmoid
function maps any real input to a value that is either 0 or 1 and encodes a non-linear function.
The neuron can receive negative numbers as input, and it will still be able to produce an
output that is either 0 or 1.

But, if you look at Deep Learning papers and algorithms from the last decade, you’ll see the
most of them use the Rectified Linear Unit (ReLU) as the neuron’s activation function. The
reason why ReLU became more adopted is that it allows better optimization using SGD,
more efficient computation and is scale-invariant, meaning, its characteristics are not affected
by the scale of the input.

The neuron receives inputs and picks an initial set of weights random. These are combined in
weighted sum and then ReLU, the activation function, determines the value of the output.

Perceptron neuron model (left) and activation function (right).

Perceptron uses SGD to find, or you might say learn, the set of weight that minimizes the
distance between the misclassified points and the decision boundary. Once SGD converges,
the dataset is separated into two regions by a linear hyperplane. Although it was said the
Perceptron could represent any circuit and logic, the biggest criticism was that it couldn’t
represent the XOR gate, exclusive OR, where the gate only returns 1 if the inputs are
different. This was proved almost a decade later and highlights the fact that Perceptron, with
only one neuron, can’t be applied to non-linear data.

4.3.2 DNN

The DNN was developed to tackle this limitation. It is a neural network where the mapping
between inputs and output is non-linear. A DNN has input and output layers, and one or
more hidden layers with many neurons stacked together. And while in the Perceptron the
neuron must have an activation function that imposes a threshold, like ReLU or sigmoid,
neurons in a DNN can use any arbitrary activation function.
Architecture of DNN.

DNN falls under the category of feedforward algorithms, because inputs are combined with
the initial weights in a weighted sum and subjected to the activation function, just like in the
Perceptron. But the difference is that each linear combination is propagated to the next layer.
Each layer is feeding the next one with the result of their computation, their internal
representation of the data. This goes all the way through the hidden layers to the output layer.

If the algorithm only computed the weighted sums in each neuron, propagated results to the
output layer, and stopped there, it wouldn’t be able to learn the weights that minimize the cost
function. If the algorithm only computed one iteration, there would be no actual learning.
This is where Backpropagation comes into play.

Backpropagation

Backpropagation is the learning mechanism that allows the DNN to iteratively adjust the
weights in the network, with the goal of minimizing the cost function. There is one hard
requirement for backpropagation to work properly.

The function that combines inputs and weights in a neuron, for instance the weighted sum,
and the threshold function, for instance ReLU, must be differentiable. These functions must
have a bounded derivative because Gradient Descent is typically the optimization function
used in DNN.
DNN, highlighting the Feedforward and Backpropagation steps.

In each iteration, after the weighted sums are forwarded through all layers, the gradient of
the Mean Squared Error is computed across all input and output pairs. Then, to propagate it
back, the weights of the first hidden layer are updated with the value of the gradient. That’s
how the weights are propagated back to the starting point of the neural network. One iteration
of Gradient Descent is defined as follows:

This process keeps going until gradient for each input-output pair has converged, meaning the
newly computed gradient hasn’t changed more than a specified convergence threshold,
compared to the previous iteration.
CHAPTER 5

UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized general-purpose
modeling language in the field of object-oriented software engineering. The standard is
managed, and was created by, the Object Management Group.

The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form UML is comprised of two major components: a Meta-
model and a notation. In the future, some form of method or process may also be added to; or
associated with, UML.

The Unified Modeling Language is a standard language for specifying, Visualization,

Constructing and documenting the artifacts of software system, as well as for business
modeling and other non-software systems. The UML represents a collection of best
engineering practices that have proven successful in the modeling of large and complex
systems. The UML is a very important part of developing objects-oriented software and the
software development process. The UML uses mostly graphical notations to express the
design of software projects.

GOALS

The Primary goals in the design of the UML are as follows:

 Provide users a ready-to-use, expressive visual modeling Language so that they can
develop and exchange meaningful models.
 Provide extendibility and specialization mechanisms to extend the core concepts.
 Be independent of particular programming languages and development process.
 Provide a formal basis for understanding the modeling language.
 Encourage the growth of OO tools market.
 Support higher level development concepts such as collaborations, frameworks,
patterns and components.
 Integrate best practices.
USE CASE DIAGRAM
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral
diagram defined by and created from a Use-case analysis. Its purpose is to present a graphical
overview of the functionality provided by a system in terms of actors, their goals (represented
as use cases), and any dependencies between those use cases. The main purpose of a use case
diagram is to show what system functions are performed for which actor. Roles of the actors
in the system can be depicted.
CLASS DIAGRAM

In software engineering, a class diagram in the Unified Modeling Language (UML) is a type
of static structure diagram that describes the structure of a system by showing the system's
classes, their attributes, operations (or methods), and the relationships among the classes. It
explains which class contains information.

Activity diagram

The process flows in the system are captured in the activity diagram. Similar to a state
diagram, an activity diagram also consists of activities, actions, transitions, initial and final
states, and guard conditions.
Sequence diagram

A sequence diagram represents the interaction between different objects in the system. The
important aspect of a sequence diagram is that it is time-ordered. This means that the exact
sequence of the interactions between the objects is represented step by step. Different objects
in the sequence diagram interact with each other by passing "messages".
Deployment diagram: Deployment diagrams are used to visualize the topology of the
physical components of a system, where the software components are deployed.
Component diagram: Component diagrams are used to describe the physical artifacts of a
system.
CHAPTER 6

MACHINE LEARNING
What is Machine Learning

Before we take a look at the details of various machine learning methods, let's start by
looking at what machine learning is, and what it isn't. Machine learning is often categorized
as a subfield of artificial intelligence, but I find that categorization can often be misleading at
first brush. The study of machine learning certainly arose from research in this context, but in
the data science application of machine learning methods, it's more helpful to think of
machine learning as a means of building models of data.

Fundamentally, machine learning involves building mathematical models to help understand

data. "Learning" enters the fray when we give these models tunable parameters that can be
adapted to observed data; in this way the program can be considered to be "learning" from the
data. Once these models have been fit to previously seen data, they can be used to predict and
understand aspects of newly observed data. I'll leave to the reader the more philosophical
digression regarding the extent to which this type of mathematical, model-based "learning" is
similar to the "learning" exhibited by the human brain. Understanding the problem setting in
machine learning is essential to using these tools effectively, and so we will start with some
broad categorizations of the types of approaches we'll discuss here.

Categories of Machine Leaning

At the most fundamental level, machine learning can be categorized into two main types:
supervised learning and unsupervised learning.

Supervised learning involves somehow modeling the relationship between measured features
of data and some label associated with the data; once this model is determined, it can be used
to apply labels to new, unknown data. This is further subdivided into classification tasks
and regression tasks: in classification, the labels are discrete categories, while in regression,
the labels are continuous quantities. We will see examples of both types of supervised
learning in the following section.

Unsupervised learning involves modeling the features of a dataset without reference to any
label and is often described as "letting the dataset speak for itself." These models include
tasks such as clustering and dimensionality reduction. Clustering algorithms identify distinct
groups of data, while dimensionality reduction algorithms search for more succinct
representations of the data. We will see examples of both types of unsupervised learning in
the following section.

Need for Machine Learning

Human beings, at this moment, are the most intelligent and advanced species on earth
because they can think, evaluate, and solve complex problems. On the other side, AI is still in
its initial stage and have not surpassed human intelligence in many aspects. Then the question
is that what is the need to make machine learn? The most suitable reason for doing this is, “to
make decisions, based on data, with efficiency and scale”.

Lately, organizations are investing heavily in newer technologies like Artificial Intelligence,
Machine Learning and Deep Learning to get the key information from data to perform several
real-world tasks and solve problems. We can call it data-driven decisions taken by machines,
particularly to automate the process. These data-driven decisions can be used, instead of
using programing logic, in the problems that cannot be programmed inherently. The fact is
that we can’t do without human intelligence, but other aspect is that we all need to solve real-
world problems with efficiency at a huge scale. That is why the need for machine learning
arises.

Challenges in Machines Learning

While Machine Learning is rapidly evolving, making significant strides with cybersecurity
and autonomous cars, this segment of AI as whole still has a long way to go. The reason
behind is that ML has not been able to overcome number of challenges. The challenges that
ML is facing currently are −

1. Quality of data − Having good-quality data for ML algorithms is one of the biggest
challenges. Use of low-quality data leads to the problems related to data
preprocessing and feature extraction.

2. Time-Consuming task − Another challenge faced by ML models is the consumption

of time especially for data acquisition, feature extraction and retrieval.

3. Lack of specialist persons − As ML technology is still in its infancy stage, availability

of expert resources is a tough job.

4. No clear objective for formulating business problems − Having no clear objective and
well-defined goal for business problems is another key challenge for ML because this
technology is not that mature yet.

5. Issue of overfitting & underfitting − If the model is overfitting or underfitting, it

cannot be represented well for the problem.

6. Curse of dimensionality − Another challenge ML model faces is too many features of

data points. This can be a real hindrance.

7. Difficulty in deployment − Complexity of the ML model makes it quite difficult to be

deployed in real life.

Applications of Machines Learning

Machine Learning is the most rapidly growing technology and according to researchers we
are in the golden year of AI and ML. It is used to solve many real-world complex problems
which cannot be solved with traditional approach. Following are some real-world
applications of ML.

 Emotion analysis

 Sentiment analysis

 Error detection and prevention

 Weather forecasting and prediction

 Stock market analysis and forecasting

 Speech synthesis

 Speech recognition

 Customer segmentation

 Object recognition

 Fraud detection

 Fraud prevention

 Recommendation of products to customer in online shopping

How to Start Learning Machine Learning?

Arthur Samuel coined the term “Machine Learning” in 1959 and defined it as a “Field of
study that gives computers the capability to learn without being explicitly programmed”.

And that was the beginning of Machine Learning! In modern times, Machine Learning is one
of the most popular (if not the most!) career choices. According to Indeed, Machine Learning
Engineer Is the Best Job of 2019 with a 344% growth and an average base salary
of $146,085 per year.

But there is still a lot of doubt about what exactly is Machine Learning and how to start
learning it? So, this article deals with the Basics of Machine Learning and also the path you
can follow to eventually become a full-fledged Machine Learning Engineer. Now let’s get
started!!!

How to start learning ML?

This is a rough roadmap you can follow on your way to becoming an insanely talented
Machine Learning Engineer. Of course, you can always modify the steps according to your
needs to reach your desired end-goal!

Step 1 – Understand the Prerequisites

In case you are a genius, you could start ML directly but normally, there are some
prerequisites that you need to know which include Linear Algebra, Multivariate Calculus,
Statistics, and Python. And if you don’t know these, never fear! You don’t need a Ph.D.
degree in these topics to get started but you do need a basic understanding.

(a) Learn Linear Algebra and Multivariate Calculus

Both Linear Algebra and Multivariate Calculus are important in Machine Learning. However,
the extent to which you need them depends on your role as a data scientist. If you are more
focused on application heavy machine learning, then you will not be that heavily focused on
maths as there are many common libraries available. But if you want to focus on R&D in
Machine Learning, then mastery of Linear Algebra and Multivariate Calculus is very
important as you will have to implement many ML algorithms from scratch.

(b) Learn Statistics

Data plays a huge role in Machine Learning. In fact, around 80% of your time as an ML
expert will be spent collecting and cleaning data. And statistics is a field that handles the
collection, analysis, and presentation of data. So it is no surprise that you need to learn it!!!
Some of the key concepts in statistics that are important are Statistical Significance,
Probability Distributions, Hypothesis Testing, Regression, etc. Also, Bayesian Thinking is
also a very important part of ML which deals with various concepts like Conditional
Probability, Priors, and Posteriors, Maximum Likelihood, etc.

(c) Learn Python

Some people prefer to skip Linear Algebra, Multivariate Calculus and Statistics and learn
them as they go along with trial and error. But the one thing that you absolutely cannot skip
is Python! While there are other languages you can use for Machine Learning like R, Scala,
etc. Python is currently the most popular language for ML. In fact, there are many Python
libraries that are specifically useful for Artificial Intelligence and Machine Learning such
as Keras, TensorFlow, Scikit-learn, etc.

So, if you want to learn ML, it’s best if you learn Python! You can do that using various
online resources and courses such as Fork Python available Free on GeeksforGeeks.

Step 2 – Learn Various ML Concepts

Now that you are done with the prerequisites, you can move on to actually learning ML
(Which is the fun part!!!) It’s best to start with the basics and then move on to the more
complicated stuff. Some of the basic concepts in ML are:

(a) Terminologies of Machine Learning

 Model – A model is a specific representation learned from data by applying some

machine learning algorithm. A model is also called a hypothesis.

 Feature – A feature is an individual measurable property of the data. A set of numeric

features can be conveniently described by a feature vector. Feature vectors are fed as
input to the model. For example, in order to predict a fruit, there may be features like
color, smell, taste, etc.

 Target (Label) – A target variable or label is the value to be predicted by our model.
For the fruit example discussed in the feature section, the label with each set of input
would be the name of the fruit like apple, orange, banana, etc.
 Training – The idea is to give a set of inputs(features) and it’s expected
outputs(labels), so after training, we will have a model (hypothesis) that will then map
new data to one of the categories trained on.

 Prediction – Once our model is ready, it can be fed a set of inputs to which it will
provide a predicted output(label).

(b) Types of Machine Learning

 Supervised Learning – This involves learning from a training dataset with labeled data
using classification and regression models. This learning process continues until the
required level of performance is achieved.

 Unsupervised Learning – This involves using unlabelled data and then finding the
underlying structure in the data in order to learn more and more about the data itself
using factor and cluster analysis models.

 Semi-supervised Learning – This involves using unlabelled data like Unsupervised

Learning with a small amount of labeled data. Using labeled data vastly increases the
learning accuracy and is also more cost-effective than Supervised Learning.

 Reinforcement Learning – This involves learning optimal actions through trial and
error. So, the next action is decided by learning behaviors that are based on the current
state and that will maximize the reward in the future.

Advantages of Machine learning

1. Easily identifies trends and patterns: Machine Learning can review large volumes of data
and discover specific trends and patterns that would not be apparent to humans. For instance,
for an e-commerce website like Amazon, it serves to understand the browsing behaviors and
purchase histories of its users to help cater to the right products, deals, and reminders relevant
to them. It uses the results to reveal relevant advertisements to them.

2. No human intervention needed (automation): With ML, you don’t need to babysit your
project every step of the way. Since it means giving machines the ability to learn, it lets them
make predictions and also improve the algorithms on their own. A common example of this is
anti-virus softwares; they learn to filter new threats as they are recognized. ML is also good at
recognizing spam.
3. Continuous Improvement: As ML algorithms gain experience, they keep improving in
accuracy and efficiency. This lets them make better decisions. Say you need to make a
weather forecast model. As the amount of data, you have keeps growing, your algorithms
learn to make more accurate predictions faster.

4. Handling multi-dimensional and multi-variety data: Machine Learning algorithms are

good at handling data that are multi-dimensional and multi-variety, and they can do this in
dynamic or uncertain environments.

5. Wide Applications: You could be an e-tailer or a healthcare provider and make ML work
for you. Where it does apply, it holds the capability to help deliver a much more personal
experience to customers while also targeting the right customers.

Disadvantages of Machine Learning

1. Data Acquisition: Machine Learning requires massive data sets to train on, and these
should be inclusive/unbiased, and of good quality. There can also be times where they must
wait for new data to be generated.

2. Time and Resources: ML needs enough time to let the algorithms learn and develop
enough to fulfill their purpose with a considerable amount of accuracy and relevancy. It also
needs massive resources to function. This can mean additional requirements of computer
power for you.

3. Interpretation of Results: Another major challenge is the ability to accurately interpret

results generated by the algorithms. You must also carefully choose the algorithms for your
purpose.

4. High error-susceptibility: Machine Learning is autonomous but highly susceptible to

errors. Suppose you train an algorithm with data sets small enough to not be inclusive. You
end up with biased predictions coming from a biased training set. This leads to irrelevant
advertisements being displayed to customers. In the case of ML, such blunders can set off a
chain of errors that can go undetected for long periods of time. And when they do get noticed,
it takes quite some time to recognize the source of the issue, and even longer to correct it.
CHAPTER 7

SOFTWARE ENVIRONMENT
What is Python?

Below are some facts about Python.

 Python is currently the most widely used multi-purpose, high-level programming

language.
 Python allows programming in Object-Oriented and Procedural paradigms. Python
programs generally are smaller than other programming languages like Java.
 Programmers have to type relatively less and indentation requirement of the language,
makes them readable all the time.
 Python language is being used by almost all tech-giant companies like – Google,
Amazon, Facebook, Instagram, Dropbox, Uber… etc.

The biggest strength of Python is huge collection of standard libraries which can be used for
the following –

 Machine Learning

 GUI Applications (like Kivy, Tkinter, PyQt etc.)

 Web frameworks like Django (used by YouTube, Instagram, Dropbox)

 Image processing (like Opencv, Pillow)

 Web scraping (like Scrapy, BeautifulSoup, Selenium)

 Test frameworks

 Multimedia

Advantages of Python
Let’s see how Python dominates over other languages.

1. Extensive Libraries

Python downloads with an extensive library and it contain code for various purposes like
regular expressions, documentation-generation, unit-testing, web browsers, threading,
databases, CGI, email, image manipulation, and more. So, we don’t have to write the
complete code for that manually.

2. Extensible

As we have seen earlier, Python can be extended to other languages. You can write some of
your code in languages like C++ or C. This comes in handy, especially in projects.

3. Embeddable

Complimentary to extensibility, Python is embeddable as well. You can put your Python code
in your source code of a different language, like C++. This lets us add scripting capabilities to
our code in the other language.

4. Improved Productivity

The language’s simplicity and extensive libraries render programmers more productive than
languages like Java and C++ do. Also, the fact that you need to write less and get more things
done.

5. IOT Opportunities

Since Python forms the basis of new platforms like Raspberry Pi, it finds the future bright for
the Internet of Things. This is a way to connect the language with the real world.

6. Simple and Easy

When working with Java, you may have to create a class to print ‘Hello World’. But in
Python, just a print statement will do. It is also quite easy to learn, understand, and code. This
is why when people pick up Python, they have a hard time adjusting to other more verbose
languages like Java.

7. Readable

Because it is not such a verbose language, reading Python is much like reading English. This
is the reason why it is so easy to learn, understand, and code. It also does not need curly
braces to define blocks, and indentation is mandatory. These further aids the readability of the
code.

8. Object-Oriented

This language supports both the procedural and object-oriented programming paradigms.
While functions help us with code reusability, classes and objects let us model the real world.
A class allows the encapsulation of data and functions into one.

9. Free and Open-Source

Like we said earlier, Python is freely available. But not only can you download Python for
free, but you can also download its source code, make changes to it, and even distribute it. It
downloads with an extensive collection of libraries to help you with your tasks.

10. Portable

When you code your project in a language like C++, you may need to make some changes to
it if you want to run it on another platform. But it isn’t the same with Python. Here, you need
to code only once, and you can run it anywhere. This is called Write Once Run Anywhere
(WORA). However, you need to be careful enough not to include any system-dependent
features.

11. Interpreted

Lastly, we will say that it is an interpreted language. Since statements are executed one by
one, debugging is easier than in compiled languages.

Any doubts till now in the advantages of Python? Mention in the comment section.

Advantages of Python Over Other Languages

1. Less Coding

Almost all of the tasks done in Python requires less coding when the same task is done in
other languages. Python also has an awesome standard library support, so you don’t have to
search for any third-party libraries to get your job done. This is the reason that many people
suggest learning Python to beginners.

2. Affordable
Python is free therefore individuals, small companies or big organizations can leverage the
free available resources to build applications. Python is popular and widely used so it gives
you better community support.

The 2019 Github annual survey showed us that Python has overtaken Java in the most
popular programming language category.

3. Python is for Everyone

Python code can run on any machine whether it is Linux, Mac or Windows. Programmers
need to learn different languages for different jobs but with Python, you can professionally
build web apps, perform data analysis and machine learning, automate things, do web
scraping and also build games and powerful visualizations. It is an all-rounder programming
language.

Disadvantages of Python

So far, we’ve seen why Python is a great choice for your project. But if you choose it, you
should be aware of its consequences as well. Let’s now see the downsides of choosing Python
over another language.

1. Speed Limitations

We have seen that Python code is executed line by line. But since Python is interpreted, it
often results in slow execution. This, however, isn’t a problem unless speed is a focal point
for the project. In other words, unless high speed is a requirement, the benefits offered by
Python are enough to distract us from its speed limitations.

2. Weak in Mobile Computing and Browsers

While it serves as an excellent server-side language, Python is much rarely seen on the client-
side. Besides that, it is rarely ever used to implement smartphone-based applications. One
such application is called Carbonnelle.

The reason it is not so famous despite the existence of Brython is that it isn’t that secure.

3. Design Restrictions

As you know, Python is dynamically-typed. This means that you don’t need to declare the
type of variable while writing the code. It uses duck-typing. But wait, what’s that? Well, it
just means that if it looks like a duck, it must be a duck. While this is easy on the
programmers during coding, it can raise run-time errors.

4. Underdeveloped Database Access Layers

Compared to more widely used technologies like JDBC (Java DataBase

Connectivity) and ODBC (Open DataBase Connectivity), Python’s database access layers are
a bit underdeveloped. Consequently, it is less often applied in huge enterprises.

5. Simple

No, we’re not kidding. Python’s simplicity can indeed be a problem. Take my example. I
don’t do Java, I’m more of a Python person. To me, its syntax is so simple that the verbosity
of Java code seems unnecessary.

This was all about the Advantages and Disadvantages of Python Programming Language.

History of Python

What do the alphabet and the programming language Python have in common? Right, both
start with ABC. If we are talking about ABC in the Python context, it's clear that the
programming language ABC is meant. ABC is a general-purpose programming language and
programming environment, which had been developed in the Netherlands, Amsterdam, at the
CWI (Centrum Wiskunde &Informatica). The greatest achievement of ABC was to influence
the design of Python. Python was conceptualized in the late 1980s. Guido van Rossum
worked that time in a project at the CWI, called Amoeba, a distributed operating system. In
an interview with Bill Venners1, Guido van Rossum said: "In the early 1980s, I worked as an
implementer on a team building a language called ABC at Centrum voor Wiskunde en
Informatica (CWI). I don't know how well people know ABC's influence on Python. I try to
mention ABC's influence because I'm indebted to everything I learned during that project and
to the people who worked on it. "Later on in the same Interview, Guido van Rossum
continued: "I remembered all my experience and some of my frustration with ABC. I decided
to try to design a simple scripting language that possessed some of ABC's better properties,
but without its problems. So I started typing. I created a simple virtual machine, a simple
parser, and a simple runtime. I made my own version of the various ABC parts that I liked. I
created a basic syntax, used indentation for statement grouping instead of curly braces or
begin-end blocks, and developed a small number of powerful data types: a hash table (or
dictionary, as we call it), a list, strings, and numbers."
Python Development Steps

Guido Van Rossum published the first version of Python code (version 0.9.0) at alt.sources in
February 1991. This release included already exception handling, functions, and the core data
types of lists, dict, str and others. It was also object oriented and had a module system.
Python version 1.0 was released in January 1994. The major new features included in this
release were the functional programming tools lambda, map, filter and reduce, which Guido
Van Rossum never liked. Six and a half years later in October 2000, Python 2.0 was
introduced. This release included list comprehensions, a full garbage collector and it was
supporting unicode. Python flourished for another 8 years in the versions 2.x before the next
major release as Python 3.0 (also known as "Python 3000" and "Py3K") was released. Python
3 is not backwards compatible with Python 2.x. The emphasis in Python 3 had been on the
removal of duplicate programming constructs and modules, thus fulfilling or coming close to
fulfilling the 13th law of the Zen of Python: "There should be one -- and preferably only one
-- obvious way to do it."Some changes in Python 7.3:

 Print is now a function.

 Views and iterators instead of lists

 The rules for ordering comparisons have been simplified. E.g., a heterogeneous list
cannot be sorted, because all the elements of a list must be comparable to each
other.

 There is only one integer type left, i.e., int. long is int as well.

 The division of two integers returns a float instead of an integer. "//" can be used to
have the "old" behaviour.

 Text Vs. Data Instead of Unicode Vs. 8-bit

Purpose

We demonstrated that our approach enables successful segmentation of intra-retinal layers—

even with low-quality images containing speckle noise, low contrast, and different intensity
ranges throughout—with the assistance of the ANIS feature.

Python
Python is an interpreted high-level programming language for general-purpose programming.
Created by Guido van Rossum and first released in 1991, Python has a design philosophy that
emphasizes code readability, notably using significant whitespace.

Python features a dynamic type system and automatic memory management. It supports
multiple programming paradigms, including object-oriented, imperative, functional and
procedural, and has a large and comprehensive standard library.

 Python is Interpreted − Python is processed at runtime by the interpreter. You do not

need to compile your program before executing it. This is similar to PERL and PHP.

 Python is Interactive − you can actually sit at a Python prompt and interact with the
interpreter directly to write your programs.

Python also acknowledges that speed of development is important. Readable and terse code is
part of this, and so is access to powerful constructs that avoid tedious repetition of code.
Maintainability also ties into this may be an all but useless metric, but it does say something
about how much code you have to scan, read and/or understand to troubleshoot problems or
tweak behaviors. This speed of development, the ease with which a programmer of other
languages can pick up basic Python skills and the huge standard library is key to another area
where Python excels. All its tools have been quick to implement, saved a lot of time, and
several of them have later been patched and updated by people with no Python background -
without breaking.

Modules Used in Project

TensorFlow

TensorFlow is a free and open-source software library for dataflow and differentiable
programming across a range of tasks. It is a symbolic math library and is also used
for machine learning applications such as neural networks. It is used for both research and
production at Google.‍

TensorFlow was developed by the Google Brain team for internal Google use. It was released
under the Apache 2.0 open-source license on November 9, 2015.

NumPy
NumPy is a general-purpose array-processing package. It provides a high-performance
multidimensional array object, and tools for working with these arrays.

It is the fundamental package for scientific computing with Python. It contains various
features including these important ones:

 A powerful N-dimensional array object

 Sophisticated (broadcasting) functions

 Tools for integrating C/C++ and Fortran code

 Useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional
container of generic data. Arbitrary datatypes can be defined using NumPy which allows
NumPy to seamlessly and speedily integrate with a wide variety of databases.

Pandas

Pandas is an open-source Python Library providing high-performance data manipulation and

analysis tool using its powerful data structures. Python was majorly used for data munging
and preparation. It had very little contribution towards data analysis. Pandas solved this
problem. Using Pandas, we can accomplish five typical steps in the processing and analysis
of data, regardless of the origin of data load, prepare, manipulate, model, and analyze. Python
with Pandas is used in a wide range of fields including academic and commercial domains
including finance, economics, Statistics, analytics, etc.

Matplotlib

Matplotlib is a Python 2D plotting library which produces publication quality figures in a

variety of hardcopy formats and interactive environments across platforms. Matplotlib can be
used in Python scripts, the Python and IPython shells, the Jupyter Notebook, web application
servers, and four graphical user interface toolkits. Matplotlib tries to make easy things easy
and hard things possible. You can generate plots, histograms, power spectra, bar charts, error
charts, scatter plots, etc., with just a few lines of code. For examples, see the sample
plots and thumbnail gallery.

For simple plotting the pyplot module provides a MATLAB-like interface, particularly when
combined with IPython. For the power user, you have full control of line styles, font
properties, axes properties, etc, via an object-oriented interface or via a set of functions
familiar to MATLAB users.

Scikit – learn

Scikit-learn provides a range of supervised and unsupervised learning algorithms via a

consistent interface in Python. It is licensed under a permissive simplified BSD license and is
distributed under many Linux distributions, encouraging academic and commercial use.
Python

Python is an interpreted high-level programming language for general-purpose programming.

Created by Guido van Rossum and first released in 1991, Python has a design philosophy that
emphasizes code readability, notably using significant whitespace.

 Python is Interpreted − Python is processed at runtime by the interpreter. You do not

need to compile your program before executing it. This is similar to PERL and PHP.

 Python is Interactive − you can actually sit at a Python prompt and interact with the
interpreter directly to write your programs.

Install Python Step-by-Step in Windows and Mac

Python a versatile programming language doesn’t come pre-installed on your computer

devices. Python was first released in the year 1991 and until today it is a very popular high-
level programming language. Its style philosophy emphasizes code readability with its
notable use of great whitespace.

The object-oriented approach and language construct provided by Python enables

programmers to write both clear and logical code for projects. This software does not come
pre-packaged with Windows.

How to Install Python on Windows and Mac

There have been several updates in the Python version over the years. The question is how to
install Python? It might be confusing for the beginner who is willing to start learning Python
but this tutorial will solve your query. The latest or the newest version of Python is version
3.7.4 or in other words, it is Python 3.

Note: The python version 3.7.4 cannot be used on Windows XP or earlier devices.

Before you start with the installation process of Python. First, you need to know about
your System Requirements. Based on your system type i.e., operating system and based
processor, you must download the python version. My system type is a Windows 64-bit
operating system. So the steps below are to install python version 3.7.4 on Windows 7 device
or to install Python 3. Download the Python Cheatsheet here. The steps on how to install
Python on Windows 10, 8 and 7 are divided into 4 parts to help understand better.

Download the Correct version into the system

Step 1: Go to the official site to download and install python using Google Chrome or any
other web browser. OR Click on the following link: https://www.python.org
Now, check for the latest and the correct version for your operating system.

Step 2: Click on the Download Tab.

Step 3: You can either select the Download Python for windows 3.7.4 button in Yellow Color
or you can scroll further down and click on download with respective to their version. Here,
we are downloading the most recent python version for windows 3.7.4

Step 4: Scroll down the page until you find the Files option.

Step 5: Here you see a different version of python along with the operating system.
 To download Windows 32-bit python, you can select any one from the three options:
Windows x86 embeddable zip file, Windows x86 executable installer or Windows x86
web-based installer.

 To download Windows 64-bit python, you can select any one from the three options:
Windows x86-64 embeddable zip file, Windows x86-64 executable installer or
Windows x86-64 web-based installer.

Here we will install Windows x86-64 web-based installer. Here your first part regarding
which version of python is to be downloaded is completed. Now we move ahead with the
second part in installing python i.e., Installation

Note: To know the changes or updates that are made in the version you can click on the
Release Note Option.

Installation of Python

Step 1: Go to Download and Open the downloaded python version to carry out the
installation process.
Step 2: Before you click on Install Now, Make sure to put a tick on Add Python 3.7 to PATH.

Step 3: Click on Install NOW After the installation is successful. Click on Close.
With these above three steps on python installation, you have successfully and correctly
installed Python. Now is the time to verify the installation.

Note: The installation process might take a couple of minutes.

Verify the Python Installation

Step 1: Click on Start

Step 2: In the Windows Run Command, type “cmd”.

Step 3: Open the Command prompt option.

Step 4: Let us test whether the python is correctly installed. Type python –V and press Enter.

Step 5: You will get the answer as 3.7.4

Note: If you have any of the earlier versions of Python already installed. You must first
uninstall the earlier version and then install the new one.

Check how the Python IDLE works

Step 1: Click on Start

Step 2: In the Windows Run command, type “python idle”.

Step 3: Click on IDLE (Python 3.7 64-bit) and launch the program

Step 4: To go ahead with working in IDLE you must first save the file. Click on File > Click
on Save

Step 5: Name the file and save as type should be Python files. Click on SAVE. Here I have
named the files as Hey World.

Step 6: Now for e.g. enter print (“Hey World”) and Press Enter.
You will see that the command given is launched. With this, we end our tutorial on how to
install Python. You have learned how to download python for windows into your respective
operating system.

Note: Unlike Java, Python does not need semicolons at the end of the statements otherwise it
won’t work.
CHAPTER 8

SYSTEM REQUIREMENTS SPECIFICATIONS

Software Requirements

The functional requirements or the overall description documents include the product
perspective and features, operating system and operating environment, graphics requirements,
design constraints and user documentation.

The appropriation of requirements and implementation constraints gives the general overview
of the project in regard to what the areas of strength and deficit are and how to tackle them.

 Python IDLE 3.7 version (or)

 Anaconda 3.7 (or)
 Jupiter (or)
 Google colab

Hardware Requirements

Minimum hardware requirements are very dependent on the particular software being
developed by a given Enthought Python / Canopy / VS Code user. Applications that need to
store large arrays/objects in memory will require more RAM, whereas applications that need
to perform numerous calculations or tasks more quickly will require a faster processor.

Operating system : Windows, Linux

Processor : minimum intel i3

Ram : minimum 4 GB

Hard disk : minimum 250GB

CHAPTER 9

SOURCE CODE
from tkinter import messagebox

from tkinter import *

from tkinter import simpledialog

import tkinter

from tkinter import filedialog

from tkinter.filedialog import askopenfilename

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

from sklearn.metrics import accuracy_score

from sklearn.model_selection import train_test_split

import os

from sklearn.metrics import precision_score

from sklearn.metrics import recall_score

from sklearn.metrics import f1_score

from sklearn.metrics import accuracy_score,confusion_matrix,classification_report

from sklearn.naive_bayes import GaussianNB

from sklearn import svm

from sklearn.metrics import precision_score

from sklearn.metrics import recall_score

from sklearn.metrics import f1_score

import seaborn as sns

import webbrowser

from sklearn.preprocessing import LabelEncoder

from sklearn.ensemble import RandomForestClassifier

import pickle

from sklearn.neighbors import KNeighborsClassifier

from sklearn.linear_model import LogisticRegression

from sklearn.preprocessing import normalize

from sklearn.decomposition import PCA

from imblearn.over_sampling import SMOTE

global filename

global X,Y

global dataset

global main

global text

import os

from tensorflow.keras.models import load_model

global accuracy, precision, recall, fscore

global X_train, X_test, y_train, y_test

global classifier

global label_encoder, labels, columns, types, pca

import tensorflow as tf

from tensorflow.keras import layers, models

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

accuracy = []

precision = []

recall = []

fscore = []

main = tkinter.Tk()

main.title("Smart sensing system in industrial Environment") #designing main screen

main.geometry("1300x1200")

def getLabel(name):

label = -1

for i in range(len(labels)):

if name == labels[i]:

label = i

break

return label

#fucntion to upload dataset

def uploadDataset():

global filename, dataset, labels

text.delete('1.0', END)

filename = filedialog.askopenfilename(initialdir = "Dataset")

text.insert(END,filename+' Loaded\n')

dataset = pd.read_csv(filename)

text.insert(END,str(dataset.head())+"\n\n")

text.insert(END,filename+" loaded\n\n")
labels = np.unique(dataset['Failure Type']).tolist()

print(labels)

text.update_idletasks()

attack = dataset.groupby('Failure Type').size()

attack.plot(kind="bar")

plt.xlabel('DDOS Attacks')

plt.ylabel('Number of Records')

plt.title('Different Attacks found in dataset')

plt.show()

def preprocessDataset():

global X, Y,X_train, X_test, y_train, y_test

global le

global dataset

global x_train,y_train,x_test,y_test

le = LabelEncoder()

text.delete('1.0', END)

dataset.fillna(0, inplace = True)

print(dataset.info())

text.insert(END,str(dataset.head())+"\n\n")

# Create a count plot

sns.set(style="darkgrid") # Set the style of the plot

plt.figure(figsize=(8, 6)) # Set the figure size

# Replace 'dataset' with your actual DataFrame and 'Drug' with the column name
ax = sns.countplot(x='Failure Type', data=dataset, palette="Set3")

plt.title("Count Plot") # Add a title to the plot

plt.xlabel("Categories") # Add label to x-axis

plt.ylabel("Count") # Add label to y-axis

# Annotate each bar with its count value

for p in ax.patches:

ax.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),

ha='center', va='center', fontsize=10, color='black', xytext=(0, 5),

textcoords='offset points')

plt.show() # Display the plot

le = LabelEncoder()

dataset['Failure Type'] = le.fit_transform(dataset['Failure Type'])

dataset['Product ID'] = le.fit_transform(dataset['Product ID'])

dataset['Type'] = le.fit_transform(dataset['Type'])

X = dataset.drop(["Target", "Failure Type"], axis=1)

y=dataset.iloc[:,-1]

smote = SMOTE(sampling_strategy='auto', random_state=42)

X,y= smote.fit_resample(X, y)

text.insert(END,"Total records found in dataset: "+str(X.shape[0])+"\n\n")

X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.2, random_state=0)

text.insert(END,"Total records found in dataset to train: "+str(X_train.shape[0])+"\n\n")

text.insert(END,"Total records found in dataset to test: "+str(X_test.shape[0])+"\n\n")

print(X_train)

sns.set(style="darkgrid") # Set the style of the plot

plt.figure(figsize=(8, 6)) # Set the figure size

# Replace 'dataset' with your actual DataFrame and 'Drug' with the column name

ax = sns.countplot(x=y, data=dataset, palette="Set3")

plt.title("Count Plot") # Add a title to the plot

plt.xlabel("Categories") # Add label to x-axis

plt.ylabel("Count") # Add label to y-axis

# Annotate each bar with its count value

for p in ax.patches:

ax.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),

ha='center', va='center', fontsize=10, color='black', xytext=(0, 5),

textcoords='offset points')

plt.show()

def calculateMetrics(algorithm, predict, y_test):

a = accuracy_score(y_test,predict)*100

p = precision_score(y_test, predict,average='macro') * 100

r = recall_score(y_test, predict,average='macro') * 100

f = f1_score(y_test, predict,average='macro') * 100

accuracy.append(a)

precision.append(p)

recall.append(r)

fscore.append(f)

text.insert(END,algorithm+" Accuracy : "+str(a)+"\n")

text.insert(END,algorithm+" Precision : "+str(p)+"\n")

text.insert(END,algorithm+" Recall : "+str(r)+"\n")

text.insert(END,algorithm+" FScore : "+str(f)+"\n\n")

text.update_idletasks()

print(np.unique(predict))

print(np.unique(y_test))

conf_matrix = confusion_matrix(y_test, predict)

#plt.figure(figsize =(6, 6))

ax = sns.heatmap(conf_matrix, xticklabels = labels, yticklabels = labels, annot = True,

cmap="viridis" ,fmt ="g");

ax.set_ylim([0,len(labels)])

plt.title(algorithm+" Confusion matrix")

plt.ylabel('True class')

plt.xlabel('Predicted class')

plt.show()

def runNaiveBayes():

global X, Y, X_train, X_test, y_train, y_test

global accuracy, precision,recall, fscore

accuracy = []

precision = []

recall = []

fscore = []

text.delete('1.0', END)

if os.path.exists('model/nb.txt'):

with open('model/nb.txt', 'rb') as file:

nb = pickle.load(file)

file.close()
else:

nb = GaussianNB()

nb.fit(X_train, y_train)

with open('model/nb.txt', 'wb') as file:

pickle.dump(nb, file)

file.close()

predict = nb.predict(X_test)

calculateMetrics("Naive Bayes", predict, y_test)

def runRandomForest():

global classifier

if os.path.exists('model/rf.txt'):

with open('model/rf.txt', 'rb') as file:

rf = pickle.load(file)

file.close()

else:

rf = RandomForestClassifier(n_estimators=1,criterion="entropy",max_depth=10)

rf.fit(X_train, y_train)

with open('model/rf.txt', 'wb') as file:

pickle.dump(rf, file)

file.close()

predict = rf.predict(X_test)

classifier = rf

calculateMetrics("Random Forest", predict, y_test)

def runSVM():

if os.path.exists('model/svm.txt'):

with open('model/svm.txt', 'rb') as file:

svm_cls = pickle.load(file)

file.close()

else:

svm_cls = svm.SVC()

svm_cls.fit(X_train, y_train)

with open('model/svm.txt', 'wb') as file:

pickle.dump(svm_cls, file)

file.close()

predict = svm_cls.predict(X_test)

calculateMetrics("SVM", predict, y_test)

def runLR():

global LR

if os.path.exists('model/LR.txt'):

with open('model/LR.txt', 'rb') as file:

LR_cls = pickle.load(file)

file.close()

else:

LR_cls = LogisticRegression()

LR_cls.fit(X_train, y_train)

with open('model/LR_cls.txt', 'wb') as file:

pickle.dump(LR_cls, file)
file.close()

predict = LR_cls.predict(X_test)

calculateMetrics("LogisticRegression", predict, y_test)

def DNN():

global y_test,model,scaler

# Standardize the features

scaler = StandardScaler()

X_train_scaled = scaler.fit_transform(X_train)

X_test_scaled = scaler.transform(X_test)

y_train1 = y_train.values

y_test1 = y_test.values

model_path = 'model.h5'

# Check if the model file exists

if os.path.exists(model_path):

# Load the pre-trained model

model = load_model(model_path)

else:

# Build a simple Deep Neural Network

model = models.Sequential([

layers.Dense(128, activation='relu', input_shape=(X_train.shape[1],)),

layers.Dropout(0.5),

layers.Dense(64, activation='relu'),

layers.Dropout(0.5),

layers.Dense(6, activation='softmax') # Assuming 6 classes for Failure types

])

# Compile the model

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
metrics=['accuracy'])

# Train the model

model.fit(X_train_scaled, y_train1, epochs=30, batch_size=32, validation_split=0.2)

# Save the trained model

model.save(model_path)

# Evaluate the model on the test set

test_loss, test_accuracy = model.evaluate(X_test_scaled, y_test1)

print(f'Test Accuracy: {test_accuracy}')

print(f'Test Loss: {test_loss}')

# Predictions and Metrics

predict = model.predict(X_test_scaled)

predict = np.argmax(predict, axis=1)

calculateMetrics("DNN", predict, y_test1)

def runKNN():

if os.path.exists('model/knn.txt'):

with open('model/knn.txt', 'rb') as file:

knn_cls = pickle.load(file)

file.close()

else:

knn_cls = KNeighborsClassifier(n_neighbors = 2)

knn_cls.fit(X_train, y_train)

with open('model/knn.txt', 'wb') as file:

pickle.dump(knn_cls, file)

file.close()

predict = knn_cls.predict(X_test)

calculateMetrics("KNN", predict, y_test)

from tkinter import filedialog

from sklearn.preprocessing import LabelEncoder, StandardScaler

def predict():

global label_encoder, labels, columns, types, pca, scaler, model

label_encoder=LabelEncoder()

# Assuming 'model' is your pre-trained model and 'text' is your tkinter text widget

text.delete('1.0', END)

filename = filedialog.askopenfilename(initialdir="testData")

test = pd.read_csv(filename)

# Apply the same preprocessing as during training

test['Product ID'] = label_encoder.fit_transform(test['Product ID'])

test['Type'] = label_encoder.fit_transform(test['Type'])

# Assuming 'scaler' is the scaler used during training

test_scaled = scaler.transform(test)

predictions = []

for i in range(len(test)):

# Convert the Pandas DataFrame to a NumPy array

row = test_scaled[i].reshape(1, -1)

# Assuming 'model' is your pre-trained model

predicted_data = model.predict(row)

# Extract the predicted class index

predicted_class = np.argmax(predicted_data)

# Map the class index to the corresponding label

if predicted_class == 0:

predicted_label = "No Failure"

elif predicted_class == 1:

predicted_label = "Heat Dissipation Failure"

elif predicted_class == 2:

predicted_label = "Overstrain Failure"

elif predicted_class == 3:

predicted_label = "Power Failure"

elif predicted_class == 4:

predicted_label = "Random Failures"

elif predicted_class == 5:

predicted_label = "Tool Wear Failure"

predictions.append(predicted_label)

text.insert(END, f'Test for row {i}: {row}\n')

text.insert(END, f'Predicted output for row {i}: {predicted_label}\n')

return predictions

def graph():

output = "<html><body><table align=center border=1><tr><th>Algorithm

Name</th><th>Accuracy</th><th>Precision</th><th>Recall</th>"

output+="<th>FSCORE</th></tr>"

output+="<tr><td>Naive Bayes
Algorithm</td><td>"+str(accuracy[0])+"</td><td>"+str(precision[0])+"</
td><td>"+str(recall[0])+"</td><td>"+str(fscore[0])+"</td></tr>"

output+="<tr><td>Random Forest
Algorithm</td><td>"+str(accuracy[1])+"</td><td>"+str(precision[1])+"</
td><td>"+str(recall[1])+"</td><td>"+str(fscore[1])+"</td></tr>"

output+="<tr><td>SVM
Algorithm</td><td>"+str(accuracy[2])+"</td><td>"+str(precision[2])+"</
td><td>"+str(recall[2])+"</td><td>"+str(fscore[2])+"</td></tr>"

output+="<tr><td>Logistc Regression
Algorithm</td><td>"+str(accuracy[3])+"</td><td>"+str(precision[3])+"</
td><td>"+str(recall[3])+"</td><td>"+str(fscore[3])+"</td></tr>"

output+="<tr><td>DNN
Algorithm</td><td>"+str(accuracy[4])+"</td><td>"+str(precision[4])+"</
td><td>"+str(recall[4])+"</td><td>"+str(fscore[4])+"</td></tr>"

output+="<tr><td>KNN
Algorithm</td><td>"+str(accuracy[5])+"</td><td>"+str(precision[5])+"</
td><td>"+str(recall[5])+"</td><td>"+str(fscore[5])+"</td></tr>"

output+="</table></body></html>"

f = open("table.html", "w")
f.write(output)

f.close()

webbrowser.open("table.html",new=2)

df = pd.DataFrame([['Naive Bayes','Precision',precision[0]],['Naive
Bayes','Recall',recall[0]],['Naive Bayes','F1 Score',fscore[0]],['Naive
Bayes','Accuracy',accuracy[0]],

['Random Forest','Precision',precision[1]],['Random Forest','Recall',recall[1]],

['Random Forest','F1 Score',fscore[1]],['Random Forest','Accuracy',accuracy[1]],

['SVM','Precision',precision[2]],['SVM','Recall',recall[2]],['SVM','F1
Score',fscore[2]],['SVM','Accuracy',accuracy[2]],

['Logistc Regression','Precision',precision[3]],['Logistc
Regression','Recall',recall[3]],['Logistc Regression','F1 Score',fscore[3]],['Logistc
Regression','Accuracy',accuracy[3]],

['DNN','Precision',precision[4]],['DNN','Recall',recall[4]],['DNN','F1
Score',fscore[4]],['DNN','Accuracy',accuracy[4]],

['KNN','Precision',precision[5]],['KNN','Recall',recall[5]],['KNN','F1
Score',fscore[5]],['KNN','Accuracy',accuracy[5]],

],columns=['Algorithms','Performance Output','Value'])

df.pivot("Algorithms", "Performance Output", "Value").plot(kind='bar')

plt.show()

font = ('times', 16, 'bold')

title = Label(main, text='Smart sensing system in industrial Environment')

title.config(bg='greenyellow', fg='dodger blue')

title.config(font=font)

title.config(height=3, width=120)
title.place(x=0,y=5)

font1 = ('times', 12, 'bold')

text=Text(main,height=20,width=150)

scroll=Scrollbar(text)

text.configure(yscrollcommand=scroll.set)

text.place(x=50,y=120)

text.config(font=font1)

font1 = ('times', 13, 'bold')

uploadButton = Button(main, text="Upload Dataset", command=uploadDataset)

uploadButton.place(x=50,y=550)

uploadButton.config(font=font1)

preprocessButton = Button(main, text="Preprocess Dataset", command=preprocessDataset)

preprocessButton.place(x=330,y=550)

preprocessButton.config(font=font1)

nbButton = Button(main, text="Run Naive Bayes Algorithm", command=runNaiveBayes)

nbButton.place(x=630,y=550)

nbButton.config(font=font1)

rfButton = Button(main, text="Run Random Forest Algorithm",

command=runRandomForest)

rfButton.place(x=920,y=550)
rfButton.config(font=font1)

svmButton = Button(main, text="Run SVM Algorithm", command=runSVM)

svmButton.place(x=50,y=600)

svmButton.config(font=font1)

xgButton = Button(main, text="Run Logistic Regression Algorithm", command=runLR)

xgButton.place(x=330,y=600)

xgButton.config(font=font1)

adaboostButton = Button(main, text="Run DNN Algorithm", command=DNN)

adaboostButton.place(x=630,y=600)

adaboostButton.config(font=font1)

knnButton = Button(main, text="Run KNN Algorithm", command=runKNN)

knnButton.place(x=920,y=600)

knnButton.config(font=font1)

graphButton = Button(main, text="Comparison Graph", command=graph)

graphButton.place(x=50,y=650)

graphButton.config(font=font1)

predictButton = Button(main, text="Predict from Test Data", command=predict)

predictButton.place(x=330,y=650)

predictButton.config(font=font1)

main.config()

main.mainloop()

CHAPTER 10
RESULTS AND DISCUSSION

10.1 Implementation Description

The Project initializes the main GUI window using Tkinter, setting the title as "Smart Sensing
System in Industrial Environment" and specifying the dimensions as 1300x1200 pixels.

 Dataset Upload and Display: The uploadDataset() function allows users to upload a
dataset using a file dialog. The selected dataset is displayed in the Tkinter Text
widget, along with the first few rows of the dataset. Additionally, a bar chart is created
to visualize the distribution of different failure types in the dataset.
 Dataset Preprocessing: The preprocessDataset() function is responsible for
preprocessing the uploaded dataset. It fills missing values with 0, encodes categorical
variables using Label Encoding, and oversamples the data using SMOTE to address
class imbalance. The processed dataset is then split into training and testing sets.
Count plots are created to visualize the distribution of failure types before and after
preprocessing.
 Algorithm Execution: Buttons for Naive Bayes, Random Forest, SVM, Logistic
Regression, DNN, and KNN algorithms trigger the corresponding functions
(runNaiveBayes(), runRandomForest(), runSVM(), runLR(), DNN(), runKNN()).
These functions either load pre-trained models (if available) or train new models and
display performance metrics (accuracy, precision, recall, F1-score) and confusion
matrices.
 DNN Model Training: The DNN() function builds a simple Deep Neural Network
(DNN) model using TensorFlow's Keras API. It checks if a pre-trained model exists
and loads it; otherwise, it trains a new model, evaluates it, and saves the trained model
for future use.
 Prediction from Test Data: The predict() function allows users to select a test dataset
using a file dialog. The chosen dataset is preprocessed and normalized based on the
training data, and predictions are made using the pre-trained DNN model. The
predictions are displayed in the Tkinter Text widget.
 HTML Report and Comparison Graph: The graph() function generates an HTML
report containing a table with performance metrics for each algorithm. Additionally, a
bar chart is created to visually compare the performance metrics of different
algorithms.
 Graphical User Interface Components: The GUI includes buttons for dataset upload,
preprocessing, algorithm execution, performance metric display, prediction from test
data, and comparison graph. The results and information are presented in a Tkinter
Text widget.
 Serialization of Models: Trained models are serialized using the pickle library and
saved to files (nb.txt, rf.txt, svm.txt, LR.txt, model.h5, knn.txt). This allows the
models to be loaded quickly without retraining.
 Graphical Visualizations: Matplotlib and Seaborn are used to create graphical
visualizations, including count plots, confusion matrices, and a bar chart for algorithm
comparison.

10.2 Dataset Description

Dataset Description: The dataset captures various parameters related to an industrial process,
aiming to monitor and predict the occurrence of failures.

Let's delve deeper into the key aspects of the dataset:

Unique Data Identifier (UDI): A unique identifier for each data entry, ensuring distinctiveness
and traceability.

Product ID: Represents the identifier for different products involved in the industrial process.

Type: Categorical variable denoting the type of product. Understanding the product types can
provide insights into how different categories may contribute to failures.

Air Temperature and Process Temperature: Measure the temperature of both the air and the
industrial process. These factors are crucial in maintaining optimal conditions for the
manufacturing process.

Rotational Speed: Indicates the speed of rotation during the industrial process. Changes in
rotational speed can affect the efficiency and performance of machinery.

Torque: Measures the rotational force applied during the process. Monitoring torque is
essential for ensuring that machinery operates within specified limits.

Tool Wear: Represents the wear and tear on the tools used in the process. Managing tool wear
is vital for preventing unexpected failures and maintaining product quality.
Target: A binary target variable indicating the occurrence of failure (1) or no failure (0). This
is the variable we aim to predict using machine learning models.

Failure Type: Categorical variable specifying the type of failure when it occurs.
Understanding failure types can assist in targeted maintenance and improvement strategies.

Exploratory Data Analysis (EDA): Conducting EDA on the dataset can reveal patterns,
correlations, and outliers. Visualizations such as histograms, scatter plots, and correlation
matrices can provide a comprehensive overview.

Preprocessing: Handling missing values, encoding categorical variables, and scaling

numerical features are essential preprocessing steps before training machine learning models.

Failure Prediction: Implementing machine learning models, such as logistic regression,

random forest, or support vector machines, to predict failures based on the provided features.

Failure Type Prediction: If predicting the type of failure is of interest, building a multi-class
classification model using algorithms like decision trees or neural networks can be beneficial.

Feature Importance: Determining which features contribute most to failure prediction through
feature importance analysis. This insight can guide prioritized maintenance efforts.

Model Evaluation: Assessing the performance of the trained models using metrics such as
accuracy, precision, recall, and F1-score. Confusion matrices can provide a detailed view of
prediction outcomes.

Deployment and Monitoring: Deploying the best-performing model in a real-world industrial

environment and continuously monitoring its performance to adapt to changing conditions.

By exploring and modeling this dataset, organizations can enhance their predictive
maintenance strategies, minimize downtime, and optimize industrial processes.

10.3 Results and Description

Figure 1: Sample UI used for smart sensing production system This figure shows a visual
representation or screenshot of the user interface (UI) used in the smart sensing production
system. Figure 2: Dataset used for smart sensing production system This figure displays
information or characteristics of the dataset employed in the smart sensing production
system. It may include details about features, labels, and data distribution. Figure 3: UI shows
the dataset after preprocessing This figure represents the user interface displaying the dataset
after undergoing some preprocessing steps. Preprocessing may involve cleaning,
transforming, or handling missing data.

Figure 1: Sample UI used for smart sensing production system

Figure 2: Dataset used for smart sensing production system

Figure 3:UI shows the dataset after preprocessing

Figure 4: Count plot of target column used for smart sensing production system
Figure 4: Count plot of target column used for smart sensing production system after
preprocessing
Figure 5: Confusion matrix of all machine learning and deep learning algorithms

Figure 4: Count plot of target column used for smart sensing production system This figure
presents a count plot visualizing the distribution of the target column in the dataset before any
preprocessing.

Figure 6: Performance comparison graph of all the ml algorithms

Figure7: UI shows the prediction results on test data

Figure 4: Count plot of target column used for smart sensing production system after
preprocessing Similar to the previous figure, this one illustrates the count plot of the target
column, but after the dataset has undergone preprocessing steps.

Figure 5: Confusion matrix of all machine learning and deep learning algorithms This figure
likely displays a confusion matrix that evaluates the performance of various machine learning
and deep learning algorithms on the task at hand. It provides insights into the model's ability
to correctly classify instances.

Figure 6: Performance comparison graph of all the ML algorithms This figure shows a
performance comparison graph, possibly depicting metrics such as accuracy, precision, recall,
or F1 score for different machine learning algorithms used in the smart sensing production
system.

Figure 7: UI shows the prediction results on test data This figure demonstrates the user
interface presenting the prediction results of the smart sensing production system on test data.
It might include visualizations or summaries of the model's predictions.

Table 1: Performance comparison of quality metrics obtained using Machine Learning This
table likely summarizes the performance metrics (e.g., accuracy, precision, recall) obtained
from various machine learning algorithms. It provides a structured comparison of the models'
effectiveness.
Table 1: Performance comparison of quality metrics obtained using Machine Learning

Algorith Precision Recal F1-Score Accuracy

m l

KNN 56.39 55.57 55.77 56.24

DNN 98.99 99.21 99.08 99.20

Precision: Precision measures the accuracy of positive predictions made by the model. For
the KNN algorithm, the precision is 56.39%, indicating that out of all the instances predicted
as positive, 56.39% were actually positive. In contrast, the DNN algorithm achieved a much
higher precision of 98.99%, indicating a higher accuracy in positive predictions.

Recall: Recall measures the ability of the model to identify all positive instances. The KNN
algorithm achieved a recall of 55.57%, meaning that it correctly identified 55.57% of all
actual positive instances. On the other hand, the DNN algorithm had a recall of 99.21%,
indicating its superior ability to capture positive instances.

F1-Score: The F1-Score is the harmonic mean of precision and recall, providing a balance
between the two metrics. For the KNN algorithm, the F1-Score is 55.77%, reflecting the
balance between precision and recall in its predictions. Conversely, the DNN algorithm
achieved a much higher F1-Score of 99.08%, indicating a strong balance between precision
and recall.

Accuracy: Accuracy measures the overall correctness of the model's predictions. The KNN
algorithm achieved an accuracy of 56.24%, meaning that it correctly classified 56.24% of all
instances. In comparison, the DNN algorithm achieved a significantly higher accuracy of
99.20%, indicating its overall superior performance in classification tasks.

Overall, the DNN algorithm outperformed the KNN algorithm across all metrics,
demonstrating its effectiveness in accurately classifying instances and making predictions.
REFERENCES
[1] Kayad, A.; Paraforos, D.; Marinello, F.; Fountas, S. Latest advances in sensor
applications in agriculture. Agriculture 2020, 10, 362.
[2] Elahi, H.; Munir, K.; Eugeni, M.; Atek, S.; Gaudenzi, P. Energy harvesting towards
self-powered IoT devices. Energies 2020, 13, 5528.
[3] Ullo, S.L.; Sinha, G.R. Advances in smart environment monitoring systems using IoT
and sensors. Sensors 2020, 20, 3113.
[4] Carminati, M.; Sinha, G.R.; Mohdiwale, S.; Ullo, S.L. Miniaturized pervasive sensors
for indoor health monitoring in smart cities. Smart Cities 2021, 4, 146–155.
[5] Ullo, S.L.; Addabbo, P.; Di Martire, D.; Sica, S.; Fiscante, N.; Cicala, L.; Angelino,
C.V. Application of DInSAR technique to high coherence Sentinel-1 images for dam
monitoring and result validation through in situ measurements. IEEE J. Sel. Top.
Appl. Earth Obs. Remote. Sens. 2019, 12, 875–890.
[6] Ullo, S.L. and Sinha, G.R., 2021. Advances in IoT and smart sensors for remote
sensing and agriculture applications. Remote Sensing, 13(13), p.2585.
[7] Sivasuriyan, A., Vijayan, D.S., LeemaRose, A., Revathy, J., Gayathri Monicka, S.,
Adithya, U.R. and Jebasingh Daniel, J., 2021. Development of smart sensing
technology approaches in structural health monitoring of bridge structures. Advances
in Materials Science and Engineering, 2021.
[8] Dazhe Zhao, Kaijun Zhang, Yan Meng, Zhaoyang Li, Yucong Pi, Yujun Shi, Jiacheng
You, Renkun Wang, Ziyi Dai, Bingpu Zhou, Junwen Zhong, Untethered triboelectric
patch for wearable smart sensing and energy harvesting, Nano Energy, Volume 100,
2022, 107500, ISSN 2211-2855, https://doi.org/10.1016/j.nanoen.2022.107500.
[9] M. Bacco, A. Berton, A. Gotta and L. Caviglione, "IEEE 802.15.4 Air-Ground UAV
Communications in Smart Farming Scenarios," in IEEE Communications Letters, vol.
22, no. 9, pp. 1910-1913, Sept. 2018, doi: 10.1109/LCOMM.2018.2855211.
[10] A. Verma, S. Prakash, V. Srivastava, A. Kumar and S. C. Mukhopadhyay,
"Sensing, Controlling, and IoT Infrastructure in Smart Building: A Review," in IEEE
Sensors Journal, vol. 19, no. 20, pp. 9036-9046, 15 Oct.15, 2019, doi:
10.1109/JSEN.2019.2922409.
[11] Z. Hu, Z. Bai, Y. Yang, Z. Zheng, K. Bian and L. Song, "UAV Aided Aerial-
Ground IoT for Air Quality Sensing in Smart City: Architecture, Technologies, and
Implementation," in IEEE Network, vol. 33, no. 2, pp. 14-22, March/April 2019, doi:
10.1109/MNET.2019.1800214.
[12] Famila, S., Jawahar, A., Sariga, A. et al. Improved artificial bee colony
optimization- based clustering algorithm for SMART sensor environments. Peer-to-
Peer Netw. Appl. 13, 1071–1079 (2020). https://doi.org/10.1007/s12083-019-00805-4

Hands-On Lab 6 - Filtering and Sorting Data
No ratings yet
Hands-On Lab 6 - Filtering and Sorting Data
5 pages
Mini Project Report
No ratings yet
Mini Project Report
58 pages
Underground Coal Monitoring: Characteristics
No ratings yet
Underground Coal Monitoring: Characteristics
5 pages
Smart Sensors For Structural Health Monitoring Filippo Ubertini download
No ratings yet
Smart Sensors For Structural Health Monitoring Filippo Ubertini download
91 pages
Abstract:: Integrating Sensors and Sensor Systems
No ratings yet
Abstract:: Integrating Sensors and Sensor Systems
6 pages
Ai and ML For SHM
No ratings yet
Ai and ML For SHM
5 pages
Advance in Iot
No ratings yet
Advance in Iot
14 pages
Smart Sensing Technology: Opportunities and Challenges: B.F. Spencer, JR., Manuel E. Ruiz-Sandoval, and Narito Kurata
No ratings yet
Smart Sensing Technology: Opportunities and Challenges: B.F. Spencer, JR., Manuel E. Ruiz-Sandoval, and Narito Kurata
31 pages
A Convolutional Neural Network For Impact Detection and Characterization of Complex Composite Structure
No ratings yet
A Convolutional Neural Network For Impact Detection and Characterization of Complex Composite Structure
25 pages
ijeetv12n1_02 (1)
No ratings yet
ijeetv12n1_02 (1)
8 pages
Research Paper On Sensors
No ratings yet
Research Paper On Sensors
14 pages
TECHNICAL SEMINAR-2
No ratings yet
TECHNICAL SEMINAR-2
14 pages
Document (5)
No ratings yet
Document (5)
2 pages
Seminar PPT On SHM
67% (3)
Seminar PPT On SHM
30 pages
18CSS84 - Technical Seminar - Report Body Format
No ratings yet
18CSS84 - Technical Seminar - Report Body Format
16 pages
18CSS84 - Technical Seminar - Report Body Format
No ratings yet
18CSS84 - Technical Seminar - Report Body Format
16 pages
Sensors: A Convolutional Neural Network For Impact Detection and Characterization of Complex Composite Structures
No ratings yet
Sensors: A Convolutional Neural Network For Impact Detection and Characterization of Complex Composite Structures
25 pages
IEEESJ 2023 Hidalgo Munoz Wireless Postprint
No ratings yet
IEEESJ 2023 Hidalgo Munoz Wireless Postprint
13 pages
A Framework For Intelligent Sensor Network With Video Camera For Structural Health Monitoring of Bridges
100% (1)
A Framework For Intelligent Sensor Network With Video Camera For Structural Health Monitoring of Bridges
5 pages
Internet of Things
No ratings yet
Internet of Things
17 pages
Structural Health Monitoring Using Smart Sensors: NSEL Report Series
No ratings yet
Structural Health Monitoring Using Smart Sensors: NSEL Report Series
186 pages
Sensor Network Applications in Structures - A Survey: Sanath Alahakoon
No ratings yet
Sensor Network Applications in Structures - A Survey: Sanath Alahakoon
10 pages
Structural Health Monitoring: Internet of Things Application
No ratings yet
Structural Health Monitoring: Internet of Things Application
4 pages
Structure
No ratings yet
Structure
10 pages
Major Project (2) - 1
No ratings yet
Major Project (2) - 1
16 pages
Smart Sensors: Analysis of Different Types of Iot Sensors
No ratings yet
Smart Sensors: Analysis of Different Types of Iot Sensors
6 pages
Internet of Things
No ratings yet
Internet of Things
2 pages
s44285-024-00031-2
No ratings yet
s44285-024-00031-2
19 pages
1 s2.0 S2665917422002951 Main
No ratings yet
1 s2.0 S2665917422002951 Main
6 pages
Topic:: Sensor Technology
No ratings yet
Topic:: Sensor Technology
15 pages
The Rise of Smart Cities : Advanced Structural Sensing and Monitoring Systems Amir Alavi - The ebook version is available in PDF and DOCX for easy access
100% (1)
The Rise of Smart Cities : Advanced Structural Sensing and Monitoring Systems Amir Alavi - The ebook version is available in PDF and DOCX for easy access
36 pages
Structural Health Monitoring Systems An Overview
No ratings yet
Structural Health Monitoring Systems An Overview
5 pages
2022 - Martakis Et Al. - A Semi-Supervised Interpretable Machine Learning Framework For Sensor Fault Detection
No ratings yet
2022 - Martakis Et Al. - A Semi-Supervised Interpretable Machine Learning Framework For Sensor Fault Detection
16 pages
Energy Harvesting For Structural Health Monitoring
No ratings yet
Energy Harvesting For Structural Health Monitoring
31 pages
Editorial: Distributed Sensor Networks For Health Monitoring of Civil Infrastructures
No ratings yet
Editorial: Distributed Sensor Networks For Health Monitoring of Civil Infrastructures
4 pages
Internet of Things
No ratings yet
Internet of Things
17 pages
Mobile Security Fundamentals: A Guide for CompTIA Security+ 601 Exam
From Everand
Mobile Security Fundamentals: A Guide for CompTIA Security+ 601 Exam
Adil Ahmed
No ratings yet
Sustainability 15 01509 v2
No ratings yet
Sustainability 15 01509 v2
31 pages
IND INST DA 1
No ratings yet
IND INST DA 1
3 pages
A scientometric analysis of drone-based
No ratings yet
A scientometric analysis of drone-based
23 pages
Fayyad-2025-A-scientometric-analysis-of-drone-b
No ratings yet
Fayyad-2025-A-scientometric-analysis-of-drone-b
23 pages
A Review On Deep Learning-Based Structural Health Monitoring of Civil Infrastructures
No ratings yet
A Review On Deep Learning-Based Structural Health Monitoring of Civil Infrastructures
20 pages
Internet of things
No ratings yet
Internet of things
13 pages
Iot Based Health Monitoring System For Building and Bridges
No ratings yet
Iot Based Health Monitoring System For Building and Bridges
4 pages
CH0013_Plan_Ozer_v4_ii
No ratings yet
CH0013_Plan_Ozer_v4_ii
43 pages
Sensing Technology Applications in The M
No ratings yet
Sensing Technology Applications in The M
16 pages
An Optimal Real-Time Structural Health Monitoring System: An Interdisciplinary Approach
No ratings yet
An Optimal Real-Time Structural Health Monitoring System: An Interdisciplinary Approach
4 pages
Honeypot Systems and Techniques: Definitive Reference for Developers and Engineers
From Everand
Honeypot Systems and Techniques: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
MachineLearningAssignment1 2012
No ratings yet
MachineLearningAssignment1 2012
5 pages
1
No ratings yet
1
5 pages
Abstract - Wireless Sensors and Wireless Sensor Networks Have Come To The
No ratings yet
Abstract - Wireless Sensors and Wireless Sensor Networks Have Come To The
7 pages
Iot & MM
No ratings yet
Iot & MM
9 pages
Challenges Applications and Future of WS in IoT A Review (2022)
No ratings yet
Challenges Applications and Future of WS in IoT A Review (2022)
13 pages
Fbuil 06 604665
No ratings yet
Fbuil 06 604665
2 pages
uav and digital twins
No ratings yet
uav and digital twins
31 pages
IJARCCE4A A Manali Study
No ratings yet
IJARCCE4A A Manali Study
4 pages
Botnet Attack Detection in the Internet of Things Using Selected Learning Algorithms: A Research Study on Securing IoT Against Cyber Threats Using Machine Learning
From Everand
Botnet Attack Detection in the Internet of Things Using Selected Learning Algorithms: A Research Study on Securing IoT Against Cyber Threats Using Machine Learning
Bolakale Aremu
5/5 (1)
Internet of Things IoT Platform For Structure
No ratings yet
Internet of Things IoT Platform For Structure
10 pages
ssrn-5030556
No ratings yet
ssrn-5030556
22 pages
CS3237 Notes
No ratings yet
CS3237 Notes
18 pages
Internet of Things & Wireless Sensor Network
From Everand
Internet of Things & Wireless Sensor Network
Ajit Singh
No ratings yet
sensors-25-01763-with-cover
No ratings yet
sensors-25-01763-with-cover
26 pages
CV Rohit
No ratings yet
CV Rohit
3 pages
A Machine Learning Approach To Predict M
No ratings yet
A Machine Learning Approach To Predict M
66 pages
Beat Sheet Output
No ratings yet
Beat Sheet Output
15 pages
3.11.304 Accounting of Disclosure of PHI
No ratings yet
3.11.304 Accounting of Disclosure of PHI
13 pages
Blockchainfor Business Seminar Paper
No ratings yet
Blockchainfor Business Seminar Paper
33 pages
Teldat V - Nateo - Wifi - Access List - Lan - Wan-Dchp
No ratings yet
Teldat V - Nateo - Wifi - Access List - Lan - Wan-Dchp
3 pages
AJP Assignment No-3
No ratings yet
AJP Assignment No-3
7 pages
CV Template Solvo
No ratings yet
CV Template Solvo
1 page
A Pattern Recognition System For Malicious PDF Files Detection
No ratings yet
A Pattern Recognition System For Malicious PDF Files Detection
2 pages
Unit I-Cloud Computing
No ratings yet
Unit I-Cloud Computing
29 pages
8a Android Menus and Dialogs
No ratings yet
8a Android Menus and Dialogs
14 pages
7.2.1.7 Packet Tracer - Configuring Named Standard IPv4 ACLs Instructions - ILM
No ratings yet
7.2.1.7 Packet Tracer - Configuring Named Standard IPv4 ACLs Instructions - ILM
2 pages
Omniverse Enterprise 5 Steps Digital Twins Ebook
No ratings yet
Omniverse Enterprise 5 Steps Digital Twins Ebook
8 pages
Spring MVC: IBM Global Technology Services
No ratings yet
Spring MVC: IBM Global Technology Services
19 pages
R12.2.9 TOI - Implement and Use Receivables - Receivables Command Center
100% (1)
R12.2.9 TOI - Implement and Use Receivables - Receivables Command Center
50 pages
Qspider Testing Sllybus
No ratings yet
Qspider Testing Sllybus
11 pages
Evolution and Present Status of Cloud Computing
No ratings yet
Evolution and Present Status of Cloud Computing
20 pages
Switching CNC
No ratings yet
Switching CNC
34 pages
IT576 Computer Systems (3-0-2-4) : Lecture Schedule (CEP 108)
No ratings yet
IT576 Computer Systems (3-0-2-4) : Lecture Schedule (CEP 108)
11 pages
18CSC303J DBMS Sample MCQ
No ratings yet
18CSC303J DBMS Sample MCQ
12 pages
NMCP
No ratings yet
NMCP
2 pages
UKPSC Computer Science and Computer Application
No ratings yet
UKPSC Computer Science and Computer Application
4 pages
Mpv Commands for Keybinding Input.conf
No ratings yet
Mpv Commands for Keybinding Input.conf
4 pages
Lesson 1-Introduction To Enterprise Systems For Management
No ratings yet
Lesson 1-Introduction To Enterprise Systems For Management
17 pages
Software Evolution Class Note
No ratings yet
Software Evolution Class Note
24 pages
Cyber Warfare and Security Threats
No ratings yet
Cyber Warfare and Security Threats
7 pages
Aw6000 DX100
No ratings yet
Aw6000 DX100
2 pages
Kendriya Vidyalaya O.N.G.C Mehsana
No ratings yet
Kendriya Vidyalaya O.N.G.C Mehsana
21 pages
Veeam_v9.5_textbook
No ratings yet
Veeam_v9.5_textbook
281 pages
Conversion Rule in ALE - IDOC Scenario
No ratings yet
Conversion Rule in ALE - IDOC Scenario
27 pages
Brose Boosts Windows Cluster: Leading Automotive Supplier Speeds Up Simulation Using Altair Solutions
No ratings yet
Brose Boosts Windows Cluster: Leading Automotive Supplier Speeds Up Simulation Using Altair Solutions
2 pages
ADC - Lecture 5 Digital Source Coding - 1
No ratings yet
ADC - Lecture 5 Digital Source Coding - 1
10 pages
Chapter 8 Optimization 1
No ratings yet
Chapter 8 Optimization 1
5 pages