Machine Learning with Artificial Neural Networks

Uploaded by

bog2k3
What is Machine Learning (ML)
ML is a field in mathematics and computer science concerned with designing and building software that
can “learn” to do useful work, without having to be taught explicitly how to achieve that.

“Machine learning (ML) is the study of computer algorithms that improve automatically through
experience and by the use of data.”

[Wikipedia - https://en.wikipedia.org/wiki/Machine_learning ]

The software (or the entire hardware + software ecosystem, a.k.a “the Machine”) can achieve learning
by various means, such as:

 Supervised learning
o This is the classical “training”: a set of data for which the result is known is fed
into the algorithm, and the algorithm is adjusted by comparing its output with the
expected values.
o More here: https://en.wikipedia.org/wiki/Supervised_learning
 Unsupervised learning
o No labeled training data exists; the algorithm studies data in the wild and extracts
similarities and differences from it, learning connections and categorizations.
o More here: https://en.wikipedia.org/wiki/Unsupervised_learning
 Reinforcement learning
o This type of learning uses what is known as a “utility function” (often called a reward
function), which the algorithm tries to maximize. The utility function tells the algorithm
how well it performed some task, and by observing its own previous actions and their
outcome, the algorithm can adjust to perform better.
o More here: https://en.wikipedia.org/wiki/Reinforcement_learning

Machine learning can be implemented in various ways, but it’s often done with Artificial Neural
Networks (ANNs), due to their versatility and simplicity.

Artificial Neural Networks (ANN)


[ https://en.wikipedia.org/wiki/Artificial_neural_network ]

An ANN is a piece of software (or hardware in some cases) inspired by biological brains that is used as
the foundation for ML.
An ANN consists of a set of “neurons” – entities that can perform simple computations – interconnected
in such a way that enables them to perform much more complex tasks as a whole and that allows
adjustments to be made in order to tune the network for a desired outcome.

The neurons in an ANN are usually organized in layers, starting from the left where we find the “input
layer”, to the right, where the “output layer” lies. All layers in between are called “hidden”. A network
without loops, where the data flows clearly in one direction – from the input layer towards the output
layer – is called a “feed forward” network.

A neuron is an object that has these properties:

 One or more inputs
o The inputs can come from outside the network, as is the case for the neurons in the
input layer, or from neurons in the previous layer
 One output (called the “activation”)
o This is the value the neuron produces and passes on to the next layer
 One transfer function
o This is a function (usually the step or sigmoid function) that is applied to the weighted
sum of the neuron’s inputs in order to decide whether, and how strongly, the neuron
should “fire”; its result is the neuron’s output.

Apart from neurons, an ANN also contains “synapses” – these are the connections between neurons.

A synapse has these properties:

 Source neuron – where the output is taken from
 Target neuron – where the data is delivered to
 Weight – a number which is used to modulate the output of the source neuron before it arrives
at the target neuron. This represents the “strength” of the synapse and dictates how much the
source neuron’s output contributes to the overall value of the target neuron.
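
One way to picture neurons and synapses as data structures is sketched below. The class names `Neuron` and `Synapse`, their field names and the `contribution()` helper are illustrative assumptions, not taken from any particular library.

```python
from dataclasses import dataclass


@dataclass
class Neuron:
    name: str
    activation: float = 0.0  # the neuron's output value


@dataclass
class Synapse:
    source: Neuron  # where the output is taken from
    target: Neuron  # where the data is delivered to
    weight: float   # the "strength" of the connection

    def contribution(self) -> float:
        """Source output modulated by the synapse weight."""
        return self.source.activation * self.weight


# Two source neurons feeding one target neuron.
a = Neuron("a", activation=1.0)
b = Neuron("b", activation=0.5)
c = Neuron("c")
synapses = [Synapse(a, c, 0.8), Synapse(b, c, -0.4)]

# Total weighted input arriving at neuron c.
incoming = sum(s.contribution() for s in synapses if s.target is c)
print(incoming)
```

A negative weight, as on the second synapse, lets a source neuron reduce the value arriving at the target instead of increasing it.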

Mathematical definition
The value produced by a neuron can be defined as:

$$\text{value} = \sigma\left(b + \sum_{i=1}^{n} a_i w_i\right)$$

Where:

 σ is the transfer function
 b is the bias
 n is the number of inputs for the neuron
 a_i is the value of the i-th input
 w_i is the weight of the i-th input

The bias is just a number added to the weighted sum of the inputs. In some models it’s represented as
the weight of a 0th input whose value is by definition always 1.
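
The definition above can be sketched in a few lines of Python. The function names and the sample numbers are made up for illustration, and the sigmoid is assumed to be the standard logistic function. The second function shows the alternative representation where the bias is the weight of a 0th input fixed at 1.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, assumed here as the transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_value(inputs, weights, bias, transfer=sigmoid):
    """value = transfer(bias + sum of a_i * w_i)."""
    weighted_sum = sum(a * w for a, w in zip(inputs, weights))
    return transfer(bias + weighted_sum)


def neuron_value_bias_as_weight(inputs, weights_with_bias, transfer=sigmoid):
    """Same computation, with the bias stored as the weight of a 0th input equal to 1."""
    extended_inputs = [1.0] + list(inputs)
    return transfer(sum(a * w for a, w in zip(extended_inputs, weights_with_bias)))


inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

v1 = neuron_value(inputs, weights, bias)
v2 = neuron_value_bias_as_weight(inputs, [bias] + weights)
print(v1, v2)  # both forms give the same value, up to rounding
```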

Considering this definition, we can derive from here the need for a transfer function – without one, the
total output of the network would always be a linear combination of its inputs, thus the network would
only ever be able to model linear functions.

Because in practice ANNs are used to model highly complex, non-linear functions, a non-linear element
must be introduced in their working – enter the transfer function.
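
To see why a network without a transfer function stays linear, here is a small sketch with made-up weights. Two stacked layers that only compute weighted sums can be replaced by one layer whose weights and bias are computed from the originals. The names `linear_layer`, `W1`, `B1`, `W2` and `B2` are illustrative.

```python
def linear_layer(inputs, weights, biases):
    """One layer with no transfer function: out_j = b_j + sum_i(in_i * w[j][i])."""
    return [b + sum(a * w for a, w in zip(inputs, row))
            for row, b in zip(weights, biases)]


# A 2 -> 2 -> 1 network with arbitrary example weights.
W1 = [[1.0, 2.0], [-1.0, 0.5]]
B1 = [0.5, -0.5]
W2 = [[2.0, -3.0]]
B2 = [1.0]


def two_layer(x):
    return linear_layer(linear_layer(x, W1, B1), W2, B2)[0]


# The equivalent single layer: W = W2 * W1 and b = W2 * B1 + B2.
W = [sum(W2[0][k] * W1[k][i] for k in range(2)) for i in range(2)]
b = sum(W2[0][k] * B1[k] for k in range(2)) + B2[0]


def single_layer(x):
    return linear_layer(x, [W], [b])[0]


for x in ([1.0, 2.0], [-3.0, 0.5], [0.0, 0.0]):
    print(x, two_layer(x), single_layer(x))
```

However many such layers are stacked, the result is still a single weighted sum of the inputs, which is why the non-linear transfer function is needed.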

The transfer function can in theory be any non-linear function, but for simplicity it’s usually chosen as
one of these two:

 The “Heaviside step function”
o This function produces a value of 0.0 when the input is <= 0 and 1.0 when input > 0
o Because it’s discontinuous at 0, it’s not differentiable there, and its derivative is 0
everywhere else (this will become important later)
 The sigmoid function
o This function produces a smooth transition between “low” and “high”, thus being
differentiable everywhere
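
Both transfer functions are short to write down. The sketch below assumes the sigmoid meant here is the standard logistic function 1 / (1 + e^(-x)), whose derivative can be expressed through the function's own value; the function names are illustrative.

```python
import math


def heaviside(x):
    """Step function: 0.0 for x <= 0, 1.0 for x > 0."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    """Logistic sigmoid: a smooth transition from 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the logistic sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (-2.0, 0.0, 2.0):
    print(x, heaviside(x), sigmoid(x), sigmoid_derivative(x))
```

The step function jumps straight from 0 to 1, while the sigmoid passes through every value in between and has a non-zero slope at every point.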

Uses of ML with Artificial Neural Networks


There are usually three types of problems that are solved using this technique:

1. Approximation/prediction problems
2. Classification problems
3. Optimization problems

Approximation/prediction problems
Starting from a set of data points which define some characteristic of a process in relation to a variable,
we want to approximate the value of the characteristic for values of the variable for which we don’t
have any data points.

 The value of the variable can be in-between the known values – in this case we’re asking the
ANN to do an approximation similar to interpolation, but smarter
 The value of the variable can be outside the known values – in this case we’re asking the ANN to
do a prediction – something like extrapolation, but smarter.

ML is used for this problem when the function in question is highly complex and impossible or
impractical to derive mathematically. For example, given a set of datapoints that can be fitted
with a polynomial function, there’s no point in building an ML system for that. But when given a set of
datapoints that don’t follow any standard or simple enough mathematical function, ML can do the job
with little headache.

Classification problems
A set of entities must be tagged. For example, take the set of numbers between 1 and 100. We can
assign to each number one of the tags “odd” or “even”. This problem is simple enough to solve with a
classical algorithm, but we can use it as an example. The tags need not be mutually exclusive like in
this case (multiple tags can be attached to each entity). The ML algorithm is expected to assign these
tags to the entities, by either training on a set of data for which we provide the expected answer, or
by observing a large data set in the wild and making associations on its own.

For example, one such use-case would be assigning categories to products in a web store, based on the
photos of the products.

Optimization problems
This is the most difficult type, where more complicated techniques are used. Usually the reinforcement
learning model is used here, and the ML algorithm is left to its own devices to find ways to maximize the
utility function. One example is Google DeepMind’s AlphaGo, which learnt to play the board game Go from
records of human games and then by playing against itself. Basically, what it does is try to maximize
the outcome – win the game – by resorting to whatever means it devised internally, in accordance with
the rules of the game.

These types of algorithms can be used also for “creative” purposes – finding new and different solutions
to problems.

Supervised learning
The most common technique for supervised learning is the back-propagation of errors (there’s a
comprehensive Wikipedia article on this, but it’s quite technical; a more practical description can be
found here: https://machinelearningmastery.com/implement-backpropagation-algorithm-scratch-python/ ).

Basically, error backpropagation is done like this:

 Feed the inputs into the ANN
 Run a forward phase in which the data flows forward until the output layer
 Compare the ANN’s output to the expected output and compute the error
 Run a backward phase in which the errors propagate back from the output layer to the input
layer – at each step the error is computed using the weights of the synapses and the derivative
of the transfer function in order to assess how much the output of each neuron in the previous
layer contributed to the error in the current node.
 Adjust the weights of the synapses using the computed error and a learning rate – this is a small
number that dictates how fast the weights are adjusted; the smaller it is, the slower (but more
steadily) the network converges; the bigger it is, the faster the weights change, but the network
may overshoot, get stuck in a non-optimal local minimum of the error or even fail to converge at all.
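
The steps above can be sketched for a tiny network with two inputs, two hidden neurons and one output neuron. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes the logistic sigmoid as transfer function and a squared-error measure, and the function names, starting weights and learning rate are made up.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, assumed here as the transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Forward phase: inputs -> hidden layer -> single output neuron."""
    hidden = [sigmoid(b + sum(xi * wi for xi, wi in zip(x, ws)))
              for ws, b in zip(hidden_w, hidden_b)]
    output = sigmoid(out_b + sum(h * w for h, w in zip(hidden, out_w)))
    return hidden, output


def train_step(x, expected, hidden_w, hidden_b, out_w, out_b, learning_rate):
    """One forward phase, one backward phase and one weight adjustment."""
    hidden, output = forward(x, hidden_w, hidden_b, out_w, out_b)

    # Error at the output, scaled by the sigmoid derivative out * (1 - out).
    error = expected - output
    out_delta = error * output * (1.0 - output)

    # Backward phase: each hidden neuron's share of the error depends on the
    # weight of its synapse to the output and on its own sigmoid derivative.
    hidden_deltas = [out_delta * w * h * (1.0 - h) for w, h in zip(out_w, hidden)]

    # Adjust weights and biases in proportion to the learning rate.
    out_w = [w + learning_rate * out_delta * h for w, h in zip(out_w, hidden)]
    out_b = out_b + learning_rate * out_delta
    hidden_w = [[w + learning_rate * d * xi for w, xi in zip(ws, x)]
                for ws, d in zip(hidden_w, hidden_deltas)]
    hidden_b = [b + learning_rate * d for b, d in zip(hidden_b, hidden_deltas)]
    return hidden_w, hidden_b, out_w, out_b, error


# Arbitrary starting weights and a single training sample.
hidden_w = [[0.1, -0.2], [0.4, 0.3]]
hidden_b = [0.0, 0.0]
out_w = [0.2, -0.1]
out_b = 0.0
x, expected = [1.0, 0.0], 1.0

errors = []
for _ in range(100):
    hidden_w, hidden_b, out_w, out_b, error = train_step(
        x, expected, hidden_w, hidden_b, out_w, out_b, learning_rate=0.5)
    errors.append(abs(error))

print(errors[0], errors[-1])  # the error shrinks as the step is repeated
```

Note that the hidden-layer errors are computed from the output weights as they were before the adjustment, so the backward phase uses the same network the forward phase ran through.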

Since the computation of errors uses the derivative of the transfer function, backpropagation requires a
differentiable transfer function such as the sigmoid, instead of the Heaviside step function.

Once the synapses are adjusted, the training proceeds with the next training sample (an input together
with its expected output). After all samples have been worked through, an “epoch” has been completed.
Usually, the training is repeated with the same data for several epochs until the accuracy is high enough.
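
As a sketch of how epochs fit together, the example below trains a single sigmoid neuron on the logical OR function. The task, the function names, the learning rate and the epoch count are all illustrative assumptions; a real network would have more neurons and would use the full backpropagation procedure.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, assumed here as the transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(x, weights, bias):
    return sigmoid(bias + sum(xi * wi for xi, wi in zip(x, weights)))


def train(samples, epochs, learning_rate):
    """Repeat the same training samples for a number of epochs."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, expected in samples:  # one pass over all samples = one epoch
            output = predict(x, weights, bias)
            delta = (expected - output) * output * (1.0 - output)
            weights = [w + learning_rate * delta * xi for w, xi in zip(weights, x)]
            bias += learning_rate * delta
    return weights, bias


def accuracy(samples, weights, bias):
    """Fraction of samples whose output falls on the correct side of 0.5."""
    correct = sum(1 for x, expected in samples
                  if (predict(x, weights, bias) > 0.5) == (expected == 1))
    return correct / len(samples)


# Training data for logical OR: (inputs, expected output).
or_samples = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]

weights, bias = train(or_samples, epochs=5000, learning_rate=0.5)
print(weights, bias, accuracy(or_samples, weights, bias))
```

In practice the accuracy would be measured on data that was not used for training, and the loop would stop once that accuracy is high enough.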
ML in FOSSY
For our purposes in FOSSY we’ll be using classification ML based on ANNs, trained in a hybrid fashion,
using both supervised learning and unsupervised learning.

Classification is used in our case for making decisions, such as:

 whether some license/copyright falls into the “good” or “bad” category
 whether some action should or should not be taken

Supervised learning can be initially implemented for copyright / license discrimination, using a known
set of data. Later on, the training data will come from the feedback received from the user.

Unsupervised learning can enable the ML algorithm to make associations and learn new things on its own,
just by watching the users.
