
Machine Learning

Reinforcement Learning
Reinforcement learning fills the gap between supervised learning, where the
algorithm is trained on the correct answers given in the target data, and
unsupervised learning, where the algorithm can only exploit similarities in the data
to cluster it.
Search is a fundamental part of any reinforcement learner: the algorithm searches
over the state space of possible inputs and outputs in order to try to maximise a
reward.
Reinforcement learning is usually described in terms of the interaction between
some agent and its environment. The agent is the thing that is learning, and the
environment is where it is learning, and what it is learning about.
Reinforcement learning maps states or situations to actions in order to maximise
some numerical reward. That is, the algorithm knows about the current input (the
state), and the possible things it can do (the actions), and its aim is to maximise the
reward.
Reinforcement Learning : Getting Lost
Reinforcement Learning : How it Works?
1. States and Actions: The agent is in a state (e.g., A, B, etc.) and can take various actions
(e.g., moving to other states).
2. Rewards and Penalties: The environment provides feedback (rewards or penalties)
based on the actions. Moving towards the goal might receive a positive reward, while
getting close to a hazard or stepping on one might receive a negative reward.
3. Policy Update: The agent uses its experiences (states, actions, and rewards) to update its
"policy", a rule that determines which action to take in a given state. Over time, the policy
improves, and the agent learns to take actions that are more likely to lead to positive
rewards.
4. Exploration vs. Exploitation: Reinforcement learning algorithms often balance
exploration (trying new actions) with exploitation (using what has already been learned). A
certain amount of exploration is necessary to discover new and potentially better paths,
but the agent should also exploit its current knowledge to maximize its reward, as in the sketch after this list.
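As an illustration only, here is a minimal tabular Q-learning sketch in Python; the toy environment (four states in a row, a hazard and a goal) and the hyperparameters are invented for the example, not taken from these notes.

```python
import random

# Hypothetical toy environment (not from the slides): 4 states in a row,
# a hazard at state 0 and the goal at state 3.
N_STATES, ACTIONS = 4, [-1, +1]          # actions: move left or move right
HAZARD, GOAL = 0, 3

def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    if nxt == GOAL:
        return nxt, +1.0, True           # positive reward for reaching the goal
    if nxt == HAZARD:
        return nxt, -1.0, True           # negative reward for stepping on the hazard
    return nxt, 0.0, False

Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]   # Q-table: value of each (state, action)
alpha, gamma, epsilon = 0.1, 0.9, 0.2                    # learning rate, discount, exploration rate

for episode in range(500):
    state, done = 1, False                               # start between the hazard and the goal
    while not done:
        # Exploration vs. exploitation: act randomly with probability epsilon
        if random.random() < epsilon:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q[state][i])
        nxt, reward, done = step(state, ACTIONS[a])
        target = reward if done else reward + gamma * max(Q[nxt])
        Q[state][a] += alpha * (target - Q[state][a])    # policy update from experience
        state = nxt

print(Q)   # the learned values favour moving right, towards the goal
```

The epsilon parameter controls the exploration-exploitation balance described in point 4: with probability epsilon the agent explores a random action, otherwise it exploits the best action it currently knows.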
Markov Chain Monte Carlo (MCMC) Methods
Sampling: Sampling in machine learning refers to the process of selecting a
subset of data from a larger dataset to use for tasks such as training, testing, or validation.
This helps manage computational resources, balance class distributions, or reduce
overfitting.
Random Numbers: The basis of all of these sampling methods is the generation of
random numbers, something that computers cannot do truly at random.
However, there are plenty of algorithms that produce pseudo-random numbers, the
simplest of which is the linear congruential generator.
This is a very simple function defined by a recurrence relation of the form
x_{n+1} = (a * x_n + c) mod m, where a, c, and m are fixed integer constants and x_0 is the seed.
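As a rough sketch, a linear congruential generator can be written in a few lines of Python; the constants a, c, and m below are the commonly quoted "Numerical Recipes" values, not ones given in these notes.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Linear congruential generator: x_{n+1} = (a*x_n + c) mod m.
    Yields pseudo-random numbers scaled into [0, 1)."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

gen = lcg(seed=42)
print([next(gen) for _ in range(3)])   # three pseudo-random samples in [0, 1)
```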
Markov Chain Monte Carlo (MCMC) Methods
Gaussian Random Numbers: The Mersenne twister produces uniform
random numbers. However, we often want to produce samples from other
distributions, e.g., the Gaussian. The usual method of doing this is the Box–Muller scheme,
which uses a pair of uniformly distributed random numbers to produce two
independent Gaussian-distributed numbers with zero mean and unit variance.
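A minimal sketch of the Box–Muller scheme in Python, using the standard library's uniform generator (which is itself a Mersenne twister):

```python
import math
import random

def box_muller():
    """Turn two uniform random numbers into two independent samples from a
    Gaussian with zero mean and unit variance."""
    u1 = 1.0 - random.random()            # in (0, 1], so log(u1) is defined
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))    # radius
    theta = 2.0 * math.pi * u2            # angle
    return r * math.cos(theta), r * math.sin(theta)

z1, z2 = box_muller()
print(z1, z2)
```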
Markov Chain Monte Carlo (MCMC) Methods
Proposal Distribution: Proposal distribution sampling is a technique used
primarily in Bayesian statistics and Monte Carlo methods, especially importance sampling
and Markov Chain Monte Carlo (MCMC). It involves drawing samples from a proposal
distribution instead of directly sampling from the target distribution, which may be
complex or intractable.
Proposal distribution sampling refers to generating candidate samples from a simpler or
more convenient distribution (called the proposal distribution, usually denoted q(x)) in
order to approximate or sample from a more complex target distribution (denoted p(x)).
Markov Chain Monte Carlo (MCMC) Methods
Two sampling algorithms are commonly used with a proposal distribution:
1. Rejection Sampling Algorithm
2. Sampling-Importance-Resampling (SIR) Algorithm
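As an illustrative sketch of rejection sampling (the target density, proposal, and bound M below are invented for the example): candidates are drawn from the proposal q(x), and each one is accepted with probability p(x) / (M q(x)), where M is chosen so that M q(x) >= p(x) everywhere.

```python
import math
import random

def target_p(x):
    """Unnormalised example target density: a mixture of two Gaussian bumps."""
    return math.exp(-0.5 * (x - 2) ** 2) + 0.5 * math.exp(-0.5 * (x + 2) ** 2)

def rejection_sample(n, M=13.0):
    """Rejection sampling with a uniform proposal q(x) on [-6, 6].
    M is chosen so that M * q(x) >= target_p(x) everywhere."""
    q = 1.0 / 12.0                          # density of the uniform proposal
    samples = []
    while len(samples) < n:
        x = random.uniform(-6.0, 6.0)       # candidate drawn from q(x)
        u = random.random()                  # uniform in [0, 1)
        if u < target_p(x) / (M * q):        # accept with probability p(x) / (M q(x))
            samples.append(x)
    return samples

print(rejection_sample(5))
```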
Markov Chain Monte Carlo (MCMC) Methods
Markov Chain Monte Carlo (MCMC) is a set of algorithms that use Markov
chains to draw samples from a probability distribution. It's particularly useful when directly
sampling from the target distribution is difficult or impossible. MCMC methods work by
constructing a Markov chain whose equilibrium distribution approximates the target
distribution.
It has two parts:
1. Monte Carlo: The "Monte Carlo" part refers to a general approach that uses randomness
to solve problems, drawing inspiration from gambling. In MCMC, this means using
random sampling to approximate properties of a probability distribution.
2. Markov Chain: A Markov chain is a sequence of random events where the probability of
the next event depends only on the current event, not on the past. In MCMC, a Markov
chain is constructed to explore the target distribution, spending more time in regions
with higher probability.
Markov Chain Monte Carlo (MCMC) Methods
How it works:
Construct a Markov Chain: The algorithm constructs a Markov chain that is designed to
"drift" through the parameter space, spending more time in regions where the target
distribution has higher probability.
Sampling: After the chain has "warmed up" and reached a stable state (equilibrium),
samples are drawn from the chain. These samples approximate the target distribution.
Markov Chain Monte Carlo (MCMC) Methods
Common MCMC algorithms:
1. Metropolis-Hastings: A general algorithm for constructing Markov chains (see the sketch after this list).
2. Gibbs sampling: An algorithm that samples each parameter conditioned on the current
values of the other parameters.
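A minimal random-walk Metropolis-Hastings sketch in Python; the target density, step size, and burn-in length below are invented for illustration.

```python
import math
import random

def target_p(x):
    """Unnormalised target density: a standard Gaussian, as an example."""
    return math.exp(-0.5 * x * x)

def metropolis_hastings(n_samples, step=1.0, burn_in=1000):
    """Random-walk Metropolis-Hastings: propose x' = x + noise and accept with
    probability min(1, p(x') / p(x)) (the symmetric-proposal case)."""
    x = 0.0
    chain = []
    for i in range(n_samples + burn_in):
        x_new = x + random.gauss(0.0, step)          # symmetric Gaussian proposal
        accept_prob = min(1.0, target_p(x_new) / target_p(x))
        if random.random() < accept_prob:
            x = x_new                                 # accept the move
        if i >= burn_in:                              # discard the warm-up samples
            chain.append(x)
    return chain

samples = metropolis_hastings(5000)
print(sum(samples) / len(samples))                    # should be close to 0
```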
Bayesian Networks
Bayesian networks, or Bayesian belief networks, are models that represent variables
and their relationships using a graph with directed connections. Each point (node) in
the graph stands for a variable, while the links (edges) show how the variables depend on
each other. Every node also has a conditional probability table that gives the probability
of its values given the values of its parent nodes.
Bayesian Networks : An Example
Figure 16.2 shows a graph with a full set of distribution tables specified. It is a
handy guide to whether or not you will be scared before an exam based on whether
or not the course was boring (‘B’), which was the key factor you used to decide
whether or not to attend lectures (‘A’) and revise (‘R’). We can use it to perform
inference in order to decide the likelihood of you being scared before the exam (‘S’).
In order to compute the probability of being scared, we need to compute P(b, r, a, s),
where the lower-case letters indicate particular values that the upper-case variables
can take.
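Given the structure described above (B is a parent of both A and R, while S depends on R and A), the joint probability factorises over the network as

P(b, r, a, s) = P(b) P(r | b) P(a | b) P(s | r, a),

which is the factorisation used in the inference that follows.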
Bayesian Networks : An Example
If we know particular values for the three observable nodes, then we can plug them in
and work out the probability.
The power of the graphical model is when you don’t have full information. It is possible
to marginalize over any of those variables by summing up the values. So suppose that
you know that the course was boring, and want to work out how likely it is that you will
be scared before the exam. In that case you can ignore the P(b) terms, and just need to
sum up the probabilities for r and a using Equation (16.1), as in the sketch below.
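A small worked sketch of this marginalisation in Python; the conditional probability values below are invented placeholders, since the actual tables from Figure 16.2 are not reproduced here.

```python
# Hypothetical conditional probability tables (NOT the values from Figure 16.2).
P_r_given_b = {True: 0.3, False: 0.8}          # P(R = true | B)
P_a_given_b = {True: 0.1, False: 0.7}          # P(A = true | B)
P_s_given_ra = {                                # P(S = true | R, A)
    (True, True): 0.1, (True, False): 0.3,
    (False, True): 0.5, (False, False): 0.9,
}

def prob_scared_given_boring(b):
    """P(S = true | B = b), marginalising over R and A."""
    total = 0.0
    for r in (True, False):
        for a in (True, False):
            pr = P_r_given_b[b] if r else 1 - P_r_given_b[b]
            pa = P_a_given_b[b] if a else 1 - P_a_given_b[b]
            total += pr * pa * P_s_given_ra[(r, a)]
    return total

print(prob_scared_given_boring(True))   # chance of being scared if the course was boring
```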
Bayesian Networks : Variable Elimination
The variable elimination algorithm is a variation on the bucket elimination
algorithm. The idea is to convert the conditional probability tables into tables
(often called factors) that simply list all of the possible values for the variables
involved, and which initially contain the conditional probabilities. For example, the
table for the 'S' variable in Figure 16.2 lists every combination of values of R, A, and S
together with the corresponding conditional probability P(S | R, A).
Bayesian Networks : Approximate Inference
Bayesian Networks : Markov Blanket
For any node we only need to consider its parents, its children, and the other parents
of the children, as shown in Figure 16.4. This set is known as the Markov blanket of a
node.
Markov Random Fields
Two nodes in a Markov Random Field (MRF) are conditionally independent of each
other, given a third node, if there is no path between the two nodes that doesn’t pass
through the third node. This is actually a variation on the Markov property, which is how
the networks got their name: the state of a particular node is a function only of the
states of its immediate neighbors, since all other nodes are conditionally independent
given its neighbors.
A typical application of MRFs is image denoising; an illustrative sketch follows.
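As an illustration only, here is a tiny image-denoising sketch using an Ising-style MRF prior and iterated conditional modes (ICM); the toy image, the weights BETA and ETA, and the number of sweeps are all invented for the example.

```python
import random

# Toy binary "image" with pixel values -1 / +1 (invented example data).
W, H = 8, 8
clean = [[1 if x < W // 2 else -1 for x in range(W)] for y in range(H)]
noisy = [[-v if random.random() < 0.1 else v for v in row] for row in clean]

BETA = 2.0   # strength of agreement with neighbouring pixels (the MRF term)
ETA = 1.0    # strength of agreement with the observed noisy pixel

def local_energy(img, x, y, value):
    """Energy of setting pixel (x, y) to value, given its neighbours and the observation."""
    e = -ETA * value * noisy[y][x]
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < W and 0 <= ny < H:
            e += -BETA * value * img[ny][nx]
    return e

denoised = [row[:] for row in noisy]
for sweep in range(5):                       # a few ICM sweeps over the image
    for y in range(H):
        for x in range(W):
            # Pick the pixel value with the lower local energy; its Markov blanket
            # is just the neighbouring pixels plus the observed pixel.
            denoised[y][x] = min((+1, -1), key=lambda v: local_energy(denoised, x, y, v))

print(sum(denoised[y][x] == clean[y][x] for y in range(H) for x in range(W)), "of", W * H, "pixels correct")
```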
Hidden Markov Models (HMMs)
The Hidden Markov Model is one of the most popular graphical models. It is used in
speech processing and in a lot of statistical work. The HMM generally works on a set
of temporal data.
The HMM is the simplest dynamic Bayesian network, a Bayesian network that deals
with sequential (often time-series) data. Figure 16.6 shows the HMM as a graphical
model.
Hidden Markov Models (HMMs): Example
HMM Forward Algorithm
The Forward Algorithm is used to compute the probability of an observed sequence
given a Hidden Markov Model (HMM).
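A minimal sketch of the forward algorithm in Python; the two-state model and all of its probabilities are invented for illustration, not taken from these notes.

```python
# Hypothetical HMM: 2 hidden states, 2 possible observations (example values only).
states = [0, 1]
start_p = [0.6, 0.4]                       # initial state probabilities
trans_p = [[0.7, 0.3],                     # trans_p[i][j] = P(next state j | current state i)
           [0.4, 0.6]]
emit_p = [[0.9, 0.1],                      # emit_p[i][o] = P(observation o | state i)
          [0.2, 0.8]]

def forward(observations):
    """Return P(observation sequence | model) using the forward algorithm."""
    # alpha[i] = probability of the observations so far, ending in state i
    alpha = [start_p[i] * emit_p[i][observations[0]] for i in states]
    for obs in observations[1:]:
        alpha = [emit_p[j][obs] * sum(alpha[i] * trans_p[i][j] for i in states)
                 for j in states]
    return sum(alpha)

print(forward([0, 1, 0]))   # probability of the sequence under the model
```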
Tracking Methods
Being able to keep track of where predators were, and whether they were coming
towards you, could keep you alive.
It is also useful for a machine to be able to do this, both for similar reasons to a
human or animal (watching something moving and predicting what path it will
follow, for example in radar or other imaging methods) and to keep track of a
changing probability distribution.

There are two popular tracking methods:
1. Kalman Filter
2. Particle Filter
Tracking Methods: Kalman Filter
The Kalman filter is a mathematical algorithm used for estimating the state of a
system by incorporating sequential measurements over time.
It makes an estimate of the next step, then computes an error term based on the
value that was actually produced in the next step, and tries to correct it.
It then uses both of those to make the next prediction, and iterates this procedure. It
can be seen as a simple cycle of predict-correct behavior, where the error at each
step is used to improve the estimate at the next iteration.

(Figure: the Kalman filter represented visually as a predict-correct cycle.)
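A minimal one-dimensional Kalman filter sketch in Python, illustrating the predict-correct cycle; the process and measurement noise values and the simulated data are invented for the example.

```python
import random

# Hypothetical 1-D tracking problem: estimate a constant true value from noisy measurements.
true_value = 5.0
measurements = [true_value + random.gauss(0.0, 1.0) for _ in range(20)]

Q = 1e-4      # process noise variance (how much we expect the state to drift)
R = 1.0       # measurement noise variance

x_est, p_est = 0.0, 1.0       # initial state estimate and its variance
for z in measurements:
    # Predict: with a static model the state prediction is unchanged, uncertainty grows.
    x_pred = x_est
    p_pred = p_est + Q
    # Correct: weight the measurement by the Kalman gain and update the estimate.
    K = p_pred / (p_pred + R)              # Kalman gain
    x_est = x_pred + K * (z - x_pred)      # correct using the prediction error (innovation)
    p_est = (1.0 - K) * p_pred

print(x_est)   # should be close to 5.0 after a few measurements
```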
Tracking Methods: Particle Filter
A particle filter is a nonlinear filtering algorithm that uses Monte Carlo methods to
estimate the state of a system, especially when dealing with nonlinear or non-
Gaussian systems.
It represents the state using a set of particles, each representing a possible state, and
assigns weights to these particles based on their likelihood of being the true state.
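A minimal bootstrap particle filter sketch in Python for a simple one-dimensional tracking problem; the motion model, noise levels, and particle count are invented for the example.

```python
import math
import random

N_PARTICLES = 500
MEAS_NOISE, PROC_NOISE = 1.0, 0.5

# Hypothetical system: the true state drifts slowly; we observe it with noise.
true_x = 0.0
particles = [random.uniform(-10, 10) for _ in range(N_PARTICLES)]   # initial guesses

for t in range(30):
    true_x += 0.5 + random.gauss(0.0, PROC_NOISE)            # true state moves
    z = true_x + random.gauss(0.0, MEAS_NOISE)               # noisy measurement

    # Predict: move every particle through the (assumed) motion model plus noise.
    particles = [x + 0.5 + random.gauss(0.0, PROC_NOISE) for x in particles]

    # Weight: particles that explain the measurement well get higher weight.
    weights = [math.exp(-0.5 * ((z - x) / MEAS_NOISE) ** 2) for x in particles]
    total = sum(weights)
    weights = [w / total for w in weights]

    # Resample: draw a new particle set in proportion to the weights.
    particles = random.choices(particles, weights=weights, k=N_PARTICLES)

estimate = sum(particles) / len(particles)
print(estimate, "vs true", true_x)
```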
