
Artificial Intelligence

‘With artificial intelligence we are summoning the demon.’


– Elon Musk

Rishabh Karwal
01
The Fundamentals
History
Artificial intelligence was first conceptualised by none other than Alan Turing, who posed the question: “humans use available information as well as reason in order to solve problems and make decisions, so why can’t machines do the same thing?” He explored this in his 1950 paper, “Computing Machinery and Intelligence”.

This was the start of a still-ongoing race to make machines intelligent.
Famously, in 1997, chess grandmaster Garry Kasparov was defeated in a
rematch by Deep Blue, a chess-playing computer developed by IBM.

The development of artificial intelligence and machine learning has allowed us
to mimic the function of the brain and simulate learning.

Although many Hollywood movies suggest otherwise, it is infeasible for AI to
have ‘free will’ and want to destroy humanity, simply because it is developed
with a narrow, specific purpose.
Definitions
● Neuron
○ Holds a value
○ This value is computed from all the outputs of the previous layer

● Activation
○ The value in the neuron

● Weight
○ How important a connection is

● Bias
○ Decides how high the activation needs to be for the neuron to be active

● Epoch
○ One full pass of the entire training dataset through the network
Structure
● Each circle is called a neuron and the value
it holds is called its activation
● The first set of neurons is called the input
layer
● The last neuron(s) are called the output
layer
● The middle columns are called the hidden
layers and are where the ‘thinking’ (pattern
recognition and data processing) is done
● Each connection has an adjustable weight
and each neuron an adjustable bias
● The weight is how important the connection is,
and the bias decides how high the activation
of the neuron has to be for the neuron to be
active
More Structure
(Figure: an example network – Layer 0 with 16 neurons, Layer 1 with 12
neurons, Layer 2 with 10 neurons, Layer 3 with 1 neuron)
● The activation of each neuron in Layer 1
is computed by applying an activation
function to the weighted sum of all the
neurons in the previous layer
● Layer 1 has (16 × 12) connections, Layer
2 has (12 × 10) connections and Layer 3
has (10 × 1) connections
● Together with one bias per non-input
neuron, this is a total of…
345
Completely adjustable parameters
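The connection and parameter counts above can be sanity-checked in a few lines of Python (an illustrative sketch, not from the slides; note that counting one bias per non-input neuron alongside the 322 weights gives 345 parameters in total):

```python
# Sizes of the four layers described on the slide: 16 -> 12 -> 10 -> 1
layer_sizes = [16, 12, 10, 1]

# One weight per connection between consecutive layers
weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
# One bias per neuron outside the input layer
biases = sum(layer_sizes[1:])

print(weights)           # 16*12 + 12*10 + 10*1 = 322
print(biases)            # 12 + 10 + 1 = 23
print(weights + biases)  # 345
```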
02
The Maths Behind
Linear Algebra
• Organise all activations from the input layer as an N-dimensional vector
• Organise all weights as a matrix where each row of the weight matrix holds
the weights feeding into one neuron of the next layer
• Multiply, add the bias and apply the activation function to the result to get the
activation of one neuron in the next layer

Credit: 3Blue1Brown
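The matrix form above can be sketched in a few lines of NumPy (a hedged illustration, not code from the slides; the layer sizes and random values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Squishes any real input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

a0 = rng.random(16)                # activations of the input layer (16 neurons)
W = rng.standard_normal((12, 16))  # one row of weights per neuron in the next layer
b = rng.standard_normal(12)        # one bias per neuron in the next layer

a1 = sigmoid(W @ a0 + b)           # multiply, add the bias, apply the activation
print(a1.shape)                    # (12,) – one activation per next-layer neuron
```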
The Activation Function
• Below are two examples of commonly used activation functions
• The sigmoid function converts any input to a value between 0 and 1 by
‘squishing’ it
• This conversion scales the activations down so each one lies somewhere
between ‘off’ and ‘on’
• Rectified linear unit (ReLU) takes the maximum of 0 and its input
• This mimics nature more closely by being strictly active or inactive –
inspired by how neurons work in real life
• ReLU is the more widely used activation function in deep learning as it has
proven better at ‘learning’
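As a small sketch (not from the slides), the two activation functions can be written directly in NumPy:

```python
import numpy as np

def sigmoid(z):
    # Converts any input to a value between 0 and 1
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Takes the max of 0 and the input: negatives are clipped to 0
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z))   # values squashed into (0, 1); sigmoid(0) is exactly 0.5
print(relu(z))      # [0. 0. 3.] – negatives clipped, positives unchanged
```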
Evaluation
• To allow the network to ‘learn’ from its mistakes, it needs to know how far
from the expected answer/response it was. This is called the error and is
used to assess how good the epoch was
• Therefore, you need a way to measure the error and then minimise it
• Error can be calculated by adding up the squares of the differences between
the network’s outputs and the expected outputs. This gives the overall error
of the network for one training example
• Averaging this over a training dataset gives a cost function
• A higher error means a higher cost, so we minimise cost to minimise error,
because a lower error means the network is more often correct
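The per-example error described above can be written as a one-line function (an illustrative sketch; the output and target values are made up):

```python
import numpy as np

def squared_error(output, expected):
    # Sum of squared differences between network output and expected output
    return np.sum((output - expected) ** 2)

out = np.array([0.2, 0.9])         # hypothetical network output
target = np.array([0.0, 1.0])      # hypothetical expected output
print(squared_error(out, target))  # 0.2**2 + 0.1**2, approximately 0.05
```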
Gradient Descent
• This is a numerical method for finding local minima
• Conventionally, using calculus, you differentiate the function, set the
derivative to 0 and solve; for complex, multivariable functions this can be
very difficult
• The way it works: evaluate the gradient of the tangent at a random point,
then step across and re-evaluate the gradient. If this gradient is closer to 0
than the previous value, the step size is decreased and the process is
repeated until the gradient reaches (approximately) 0, meaning a local
minimum has been found; otherwise it steps the other way and repeats
• This process is used to find the minimum of the error (the cost function) with
respect to the weights, so the weights can be adjusted towards the lowest
located value of the error
• The problem with this method of correcting the network is that you won’t
always find the global minimum of the function – only a local minimum
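As a minimal sketch of the idea (not the adaptive-step variant described above: this version uses a fixed step size, which is the more common formulation; the function and starting point are made up):

```python
# Gradient descent on f(x) = (x - 3)**2, whose only minimum is at x = 3
def grad(x):
    return 2 * (x - 3)   # derivative of f

x = 0.0                  # arbitrary starting point
step = 0.1               # fixed step size (learning rate)
for _ in range(200):
    x -= step * grad(x)  # move downhill along the gradient

print(round(x, 4))       # converges towards 3.0
```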
Gradient Descent
Back Propagation
(Meme: ‘Wait, it’s all linear algebra?’ – ‘Always has been’)
• This is what makes neural networks ‘learn’
• Back propagation is the process by which the error is minimised, making it
smaller after every iteration until it reaches near 0
• As aforementioned, we use a numerical method to find the weights that
minimise the cost
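To make the chain-rule bookkeeping concrete, here is a hedged one-neuron example (all values are made up; a real network repeats this update layer by layer):

```python
import math

# One-neuron 'network': a = sigmoid(w*x + b), cost C = (a - y)**2.
# Back propagation is the chain rule: dC/dw = 2(a - y) * a*(1 - a) * x
def sigmoid(z):
    return 1 / (1 + math.exp(-z))

x, y = 1.5, 0.0          # one training example: input x, expected output y
w, b = 0.8, 0.2          # initial weight and bias
lr = 0.5                 # learning rate

for _ in range(100):
    a = sigmoid(w * x + b)
    dC_da = 2 * (a - y)          # derivative of the cost w.r.t. the output
    da_dz = a * (1 - a)          # derivative of the sigmoid
    w -= lr * dC_da * da_dz * x  # adjust weight downhill
    b -= lr * dC_da * da_dz      # adjust bias downhill

print(sigmoid(w * x + b))        # output driven close to the target 0
```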
Another view
• Neural networks are also called
‘universal function
approximators’
• When the network encounters
the same situation again, the
activations will be very similar
• Therefore, if mapped (say, as
points on an x–y plane), they will
be grouped together
• This means that, over many
epochs, it will begin to
recognise an input because it
has had the same/similar input
before (this is why it’s essential
to randomise your training set)
Thanks!
