AI Lab 1
Laboratory work № 1
Subject «Artificial intelligence technologies»
Topic: «Development of software for the implementation of a two-layer
perceptron»
Group CS 372
Kyiv, 2022
The goal of the work: to acquire the skills of using a two-layer perceptron for
solving practical problems.
Theoretical information
The theoretical basis of this laboratory work is the material presented in
section 3 of the textbook: Rudenko O. G., Bodyansky E. V. Artificial Neural
Networks.
History of Multi-layer ANN
Deep learning deals with training multi-layer artificial neural networks, also
called deep neural networks. After the Rosenblatt perceptron was developed in the
1950s, interest in neural networks faded until 1986, when Dr. Hinton and his
colleagues developed the backpropagation algorithm to train multilayer neural
networks. Today it is a hot topic, with leading firms such as Google, Facebook,
and Microsoft investing heavily in applications that use deep neural networks.
Multi-layer ANN
A fully connected multi-layer neural network is called a Multilayer Perceptron
(MLP).
A multi-layer neural network contains more than one layer of artificial neurons
or nodes, and such networks differ widely in design. It is important to note that
while single-layer neural networks were useful early in the evolution of AI, the
vast majority of networks used today have a multi-layer model.
The simplest MLP has 3 layers, including one hidden layer; if it has more than
one hidden layer, it is called a deep ANN. An MLP is a typical example of a
feedforward artificial neural network, where the ith activation unit in the lth
layer is denoted as a_i^(l).
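For illustration, below is a minimal sketch in Python of a forward pass through
such a network. The layer sizes, random weights, and sigmoid activation are
assumptions made for the example, not requirements of the lab.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical dimensions: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # weights of the hidden layer
b1 = np.zeros(4)
W2 = rng.normal(size=(2, 4))   # weights of the output layer
b2 = np.zeros(2)

x = np.array([0.5, -1.2, 0.3])     # one input vector
a1 = sigmoid(W1 @ x + b1)          # activations a_i^(1) of the hidden layer
a2 = sigmoid(W2 @ a1 + b2)         # activations a_i^(2) of the output layer
print(a2)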
The number of layers and the number of neurons per layer are referred to as
hyperparameters of a neural network, and these need tuning. Cross-validation
techniques are used to find good values for them.
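As a sketch of such tuning, the number of hidden neurons can be selected by
cross-validation, assuming scikit-learn is available; the synthetic dataset and
the candidate sizes below are purely illustrative.

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

search = GridSearchCV(
    MLPClassifier(max_iter=2000, random_state=0),
    param_grid={"hidden_layer_sizes": [(4,), (8,), (16,)]},
    cv=5,                      # 5-fold cross-validation
)
search.fit(X, y)
print(search.best_params_)     # the hidden-layer size that validated best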
The weight adjustment during training is done via backpropagation. Deeper neural
networks are better at processing complex data; however, greater depth can lead
to vanishing gradient problems, and special algorithms are required to deal with
this issue.
Backpropagation starts at the output layer, where the error signal delta_k^(N)
at the output layer N is simply the difference between the target and actual
outputs times the derivative of the output activation function:

delta_k^(N) = (t_k − y_k) · f'(net_k^(N)),

and these error signals propagate back to give the deltas at earlier layers n:

delta_j^(n) = f'(net_j^(n)) · Σ_k w_kj^(n+1) delta_k^(n+1).
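As a sketch, the two delta formulas above can be implemented as follows,
assuming a sigmoid activation f whose derivative can be written as
f'(net) = f(net)(1 − f(net)); the function and array names are illustrative.

import numpy as np

def sigmoid(x):
    # the activation f assumed in this sketch
    return 1.0 / (1.0 + np.exp(-x))

def output_deltas(target, output):
    # delta_k^(N) = (t_k - y_k) * f'(net_k^(N)); for a sigmoid unit
    # the derivative is expressed through the output itself
    return (target - output) * output * (1.0 - output)

def hidden_deltas(activations, W_next, deltas_next):
    # delta_j^(n) = f'(net_j^(n)) * sum_k w_kj^(n+1) * delta_k^(n+1);
    # W_next has shape (units in layer n+1, units in layer n)
    return activations * (1.0 - activations) * (W_next.T @ deltas_next)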
In a feedforward network, the relationship between the net's error and a single
weight will look something like this:

∂Error/∂weight = (∂Error/∂activation) · (∂activation/∂weight)
That is, given two variables, Error and weight, that are mediated by a third
variable, activation, through which the weight is passed, you can calculate how a
change in weight affects a change in Error by first calculating how a change in
activation affects a change in Error, and how a change in weight affects a change
in activation.
The essence of learning in deep learning is nothing more than that: adjusting a
model’s weights in response to the error it produces, until you can’t reduce the
error any more.
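To make the whole procedure concrete, below is a minimal end-to-end sketch of a
two-layer perceptron trained by repeating the forward pass, delta computation,
and weight update until the error stops improving. The XOR task, hidden-layer
size, learning rate, and epoch count are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
lr = 0.5

for epoch in range(10000):
    for x, t in zip(X, T):
        a1 = sigmoid(W1 @ x + b1)          # forward pass
        a2 = sigmoid(W2 @ a1 + b2)
        d2 = (t - a2) * a2 * (1 - a2)      # output-layer delta
        d1 = a1 * (1 - a1) * (W2.T @ d2)   # hidden-layer delta
        W2 += lr * np.outer(d2, a1)        # adjust weights in response
        b2 += lr * d2                      # to the error
        W1 += lr * np.outer(d1, x)
        b1 += lr * d1

# predictions after training; should approach [0, 1, 1, 0]
print(sigmoid(W2 @ sigmoid(W1 @ X.T + b1[:, None]) + b2[:, None]).T.round(2))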
Logistic Regression
On a deep neural network of many layers, the final layer has a particular role.
When dealing with labeled input, the output layer classifies each example,
applying the most likely label. Each node on the output layer represents one label,
and that node turns on or off according to the strength of the signal it receives from
the previous layer’s input and parameters.
Each output node has two possible outcomes, the binary output values 0 or 1,
because an input either deserves a label or it does not. After all, there is no
such thing as a little pregnant.
While neural networks working with labeled data produce binary output, the
input they receive is often continuous. That is, the signals that the network receives
as input will span a range of values and include any number of metrics, depending
on the problem it seeks to solve.
For example, a recommendation engine has to make a binary decision about
whether to serve an ad or not. But the input it bases its decision on could include
how much a customer has spent on Amazon in the last week, or how often that
customer visits the site.
So the output layer has to condense signals such as $67.59 spent on diapers,
and 15 visits to a website, into a range between 0 and 1; i.e. a probability that a
given input should be labeled or not.
The mechanism we use to convert continuous signals into binary output is
called logistic regression. The name is unfortunate, since logistic regression is
used for classification rather than regression in the linear sense that most
people are familiar with. It calculates the probability that a set of inputs
matches the label.
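As a sketch of this mechanism, the continuous signals from the example above
(dollars spent, site visits) can be squashed into a probability and thresholded
into a binary decision; the weights, bias, and threshold below are hypothetical,
not learned from real data.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([67.59, 15.0])    # $ spent on diapers, visits to the site
w = np.array([0.02, 0.10])     # hypothetical learned weights
b = -2.0                       # hypothetical learned bias

p = sigmoid(w @ x + b)         # probability that the label applies
serve_ad = p >= 0.5            # binary decision: serve the ad or not
print(p, serve_ad)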