0% found this document useful (0 votes)

88 views70 pages

Unit 1 ML

This document provides an overview of machine learning concepts including supervised learning, unsupervised learning, and semi-supervised learning. It discusses key elements like classification, regression, clustering, association rule learning, and common applications. Popular algorithms mentioned include k-means, Apriori, linear regression, random forest, and support vector machines. The document also touches on topics like overfitting, problem types, learning approaches, and how machine learning systems work at a high level.

Uploaded by

Nihal Gujar

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

88 views70 pages

Unit 1 ML

Uploaded by

Nihal Gujar

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

You are on page 1/ 70

Machine Learning

(BE Computer 2015 PAT)

A.Y. 2019-20 SEM-II

Prepared by
Mrs.Nilam Patil

1
Unit-1 Introduction to ML Syllabus
• classic and adaptive machines 1hr
• Only learning matters 1hr
• Beyond machine learning - deep learning and
bio-inspired adaptive systems 1hr
• Machine learning and big data 1hr
• Important Elements in Machine Learning
• Data formats ½ hr
• Learnability ½ hr
• Statistical learning approaches 1hr
• Elements of information theory 1hr
2
Classic and Adaptive machines

The image above roughly explains how

Machine Learning works.
3
• Let us say we have a dataset that contains
pictures of different kinds of fruits and we want
Machine Learning to segregate the photos based
on the kind of fruits.

• First we provide the dataset to the system i.e we

provide the input data.
• The system goes through the entire dataset or analyses
it to find patterns based on size, shapes, colors, etc.
• Now that it has figured out the patterns, the systems
takes decisions and starts separating the photos based
on the patterns.
• Once the work is done, the system learns from the
feedback it gets. If it gets any of the fruit type wrong, it
will make sure it does not happen in the future.
•
4
Here is how Machine Learning will work for us in this
case:

5
APPLICATIONS OF MACHINE LEARNING
• Google Search
• Stock Predictions
• Robotics-‘Sophia’ introduced which could
actually behave like humans.
• Social Media Services- Face Recognition , Add
as friend in facebook or people you may know
• Email Spam and Malware Filtering- C 4.5
Decision Tree Induction
• Over 325, 000 malwares are detected everyday and
each piece of code is 90–98% similar to its previous
versions. 6
Generic Representation of a Classical System that
receives some input values, processes them, and
produces output results:
•

7
• Machine learning algorithms are described as
learning a target function (f) that best maps
input variables (X) to an output variable (Y).
Y = f(X)
• This is a general learning task where we would like to make
predictions in the future (Y) given new examples of input
variables (X).
• It is harder than you think. There is also error (e) that is
independent of the input data (X).
Y = f(X) + e
• This error might be error such as not having enough attributes
to sufficiently characterize the best mapping from X to Y. This
error is called irreducible error because no matter how good
we get at estimating the target function (f), we cannot reduce
this error. 8
Schematic Representation of an adaptive system:

9
Adaptive Learning- Spam filtering, Natural
Language Processing, visual tracking with a
webcam or a smartphone, and predictive
analysis are only a few applications that
revolutionized human-machine interaction and
increased our expectations.
• Such a system isn't based on static or permanent
structures (model parameters and architectures)
but rather on a continuous ability to adapt its
behavior to external signals (datasets or real-time
inputs) and, like a human being, to predict the
future using uncertain and fragmentary pieces of
information.
10
Machine Learning Matters
• Machine learning is to study, engineer, and improve
mathematical models which can be trained (once or
continuously) with context-related data (provided by a
generic environment), to infer the future and to make
decisions without complete knowledge of all influencing
elements (external factors).
• In other words, an agent (which is a software entity that
receives information from an environment, picks the
best action to reach a specific goal, and observes the
results of it) adopts a statistical learning approach,
trying to determine the right probability distributions
and use them to compute the action (value or decision)
that is most likely to be successful (with the least error).
11
Machine learning is a sort of
modern magic.
• Prediction- Even in the most complex
scenarios, such as image classification with
convolutional neural networks, every piece of
information (geometry, color, peculiar
features, contrast, and so on) is already
present in the data and the model has to be
flexible enough to extract and learn it
permanently.

12
Supervised Learning
• Supervised learning is where you have input
variables (x) and an output variable (Y) and
you use an algorithm to learn the mapping
function from the input to the output.
Y = f(X)
• The goal is to approximate the mapping
function so well that when you have new
input data (x) that you can predict the output
variables (Y) for that data.

13
• It is called supervised learning because the
process of an algorithm learning from the
training dataset can be thought of as a
teacher supervising the learning process.
• We know the correct answers, the algorithm
iteratively makes predictions on the training
data and is corrected by the teacher.
• Learning stops when the algorithm achieves
an acceptable level of performance.

14
• Supervised learning problems can be further grouped into
regression and classification problems.
• Classification: A classification problem is when the output
variable is a category, such as “red” or “blue” or “disease” and
“no disease”.
• Regression: A regression problem is when the output variable is a
real value, such as “dollars” or “weight”.
• Some common types of problems built on top
of classification and regression include recommendation and time
series prediction respectively.
• Some popular examples of supervised machine learning
algorithms are:
• Linear regression for regression problems.
• Random forest for classification and regression problems.
• Support vector machines for classification problems.
15
Classification example
• Sometimes, instead of predicting the actual category, it's
better to determine its probability distribution.
• For example, an algorithm can be trained to recognize a
handwritten alphabetical letter, so its output is
categorical (in English, there'll be 26 allowed symbols).
• On the other hand, even for human beings, such a
process can lead to more than one probable outcome
when the visual representation of a letter isn't clear
enough to belong to a single category.
• That means that the actual output is better described by
a discrete probability distribution (for example, with 26
continuous values normalized so that they always sum
up to 1).
16
Problem with Supervised learning
• overfitting, which causes an overlearning due
to an excessive capacity.
• ability to
predict correctly only the
samples used for training, while the
error for the remaining ones is
always very high

17
Common Supervised Learning
Applications include:
• Predictive analysis based on regression or
categorical classification Spam detection
• Pattern detection
• Natural Language Processing
• Sentiment analysis
• Automatic image classification
• Automatic sequence processing (for
example, music or speech)
18
19
Unsupervised Machine Learning
• Unsupervised learning is where you only have
input data (X) and no corresponding output
variables.
• The goal for unsupervised learning is to model the
underlying structure or distribution in the data in
order to learn more about the data.
• These are called unsupervised learning because
unlike supervised learning above there is no
correct answers and there is no teacher.
Algorithms are left to their own devises to
discover and present the interesting structure in
the data. 20
• Unsupervised learning problems can be further
grouped into clustering and association problems.
• Clustering: A clustering problem is where you want to
discover the inherent groupings in the data, such as
grouping customers by purchasing behavior.
• Association: An association rule learning problem is
where you want to discover rules that describe large
portions of your data, such as people that buy X also
tend to buy Y.
• Some popular examples of unsupervised learning
algorithms are:
• k-means for clustering problems.
• Apriori algorithm for association rule learning
problems. 21
Commons Unsupervised Applications
include:
• Object segmentation (for example, users,
products, movies, songs, and so on) Similarity
detection
• Automatic labeling

22
Semi-Supervised Machine Learning
• Problems where you have a large amount of input
data (X) and only some of the data is labeled (Y) are
called semi-supervised learning problems.
• These problems sit in between both supervised and
unsupervised learning.
• A good example is a photo archive where only some of
the images are labeled, (e.g. dog, cat, person) and the
majority are unlabeled.
• Many real world machine learning problems fall into
this area. This is because it can be expensive or time-
consuming to label data as it may require access to
domain experts. Whereas unlabeled data is cheap and
easy to collect and store. 23
Summary
• Supervised: All data is labeled and the
algorithms learn to predict the output
from the input data.
• Unsupervised: All data is unlabeled and
the algorithms learn to inherent structure
from the input data.
• Semi-supervised: Some data is labeled but
most of it is unlabeled and a mixture of
supervised and unsupervised techniques
can be used. 24
Reinforcement learning
• Reinforcement learning is also based on
feedback provided by the environment.
However, in this case, the information is more
qualitative and doesn't help the agent in
determining a precise measure of its error.
• this feedback is usually called reward
(sometimes, a negative one is defined as a
penalty) and it's useful to understand whether
a certain action performed in a state is
positive or not.
25
an action can also be imperfect, but in terms of a global
policy it has to offer the highest total reward.
• Reinforcement Learning is a framework for learning where
an agent interacts with an environment and receives a reward
for each interaction. The goal is to learn to accumulate as much
reward as possible over time.
• The real advantage these systems have over conventional supervised learning
is illustrated by this example I like a lot:
• Supervised Learning: Let us say that you know how to play chess. We record
you playing games against a lot of people. Now we train a system in the
supervised fashion to learn from your examples and call it KidPlayer. Let us
say that we train another system on Vishwanathan Anand’s games and call
this ProPlayer. Obviously the “policy” learned by KidPlayer will be an inferior
player to the policy learned by ProPlayer because of the different
capabilities of the teacher.
• Reinforcement Learning: In this setting, you make an agent play Chess against
someone (usually against another copy of itself) and give it a reward for every
time it wins a game. 26
• to learn the best policy for playing Atari video games and to
teach an agent how to associate the right action with an input
representing the state (usually a screenshot or a memory
dump).
• In the following figure, there's a schematic representation of a
deep neural network trained to play a famous Atari game.
• As input, there are one or more subsequent screenshots (this
can often be enough to capture the temporal dynamics as well).
• They are processed using different layers (discussed briefly
later) to produce an output that represents the policy for a
specific state transition.
• After applying this policy, the game produces a feedback (as a
reward-penalty), and this result is used to refine the output
until it becomes stable (so the states are correctly recognized
and the suggested action is always the best one) and the total
reward overcomes a predefined threshold.
27
Atari Video Game

28
schematic representation of a deep neural
network trained to play a famous Atari game.

29
AI vs ML vs Deep Learning

30
Machine Learning vs Deep Learning

31
Deep Learning
• Machine learning concerned with algorithms inspired by the
structure and function of the brain called artificial neural
networks.
• A deep neural network (DNN) is an artificial neural Network (ANN)
with multiple hidden layers between the input and output layers.
• DNNs are typically feedforward networks in which data flows from
the input layer to the output layer without looping back.
• where as feedforward with Backpropagation ( is a common
method for training a neural network in back direction with
change of weights according to error) forms a better DNN.
• Recurrent neural networks (RNNs), in which data can flow in any
direction, are used for applications for language purposes.
• Convolutional neural networks (CNNs) are used in computer
vision. CNNs have been applied to acoustic modeling for automatic
speech recognition
32
Beyond Machine Learning - Deep
Learning and bio-inspired adaptive
systems
• many researchers started training bigger and
bigger models, built with several different
layers (that's why this approach is called deep
learning), to solve new challenging problems.
• The availability of cheap and fast computers
allowed them to get results in acceptable
timeframes and to use very large datasets
(made up of images, texts, and animations).

33
• The idea behind these techniques is to create
algorithms that work like a brain
• neurosciences and cognitive psychology.
• In particular, there's a growing interest in
pattern recognition and associative memories
whose structure and functioning are similar to
what happens in the neocortex. Such an
approach also allows simpler algorithms called
model- free
• It is based on generic learning techniques and
repeating experiences.
• testing different architectures and
optimization algorithms is quite simpler 34
Common Deep learning applications include:
• Image classification, Real-time visual tracking

• Autonomous car driving , Logistic optimization
• Bioinformatics, Speech recognition

35
36
Data Format
• Labeled data: Data consisting of a set
of training examples, where each example is
a pair consisting of an input and a desired
output value (also called the supervisory
signal, labels, etc)
• Classification: The goal is to predict discrete
values, e.g. {1,0}, {True, False}, {spam, not
spam}.
• Regression: The goal is to predict continuous
values, e.g. home prices.
37
Important Elements in Machine
Learning
• Data formats
• In a supervised learning problem, there will
always be a dataset, defined as a finite set of
real vectors with m features each:

38
• Feature vector: A typical setting for machine learning
is to be given a collection of objects (or data points),
each of which is characterised by several different
features.
• Features can be of different sorts: e.g., they might be
continuous (say, real- or integer-valued) or
categorical (for instance, a feature for colour can
have values like green, blue, red ).
• A vector containing all of the feature values for a
given data point is called the feature vector;
• if this is a vector of length m, then one can think of
each data point as being mapped to a m-dimensional
vector space (in the case of real-valued features, this
is R m ), called the feature space. 39
• This means all variables belong to the same
distribution D, and considering an arbitrary
subset of m values, it happens that:

• The corresponding output values can be both

numerical-continuous or categorical. In the
first case, the process is called regression,
while in the second, it is called classification.
Examples of numerical outputs are:

40
• Categorical examples are

• We define generic regressor, a vector-valued

function which associates an input value to a
continuous output and generic classifier, a vector-
values function whose predicted output is
categorical (discrete).
• If they also depend on an internal parameter vector
which determines the actual instance of a generic
predictor, the approach is called parametric
learning:

41
interpretation can be expressed in terms of
additive noise:

In unsupervised learning, we normally only have an input

set X with m-length vectors, and we define clustering
function (with n target clusters) with the following
expression:

In most scikit-learn models, there is an instance variable coef_ which

contains all trained parameters
42
Multiclass strategies
• When the number of output classes is greater than
one, there are two main possibilities to manage a
classification problem:
• One-vs-all- If there are n output classes, n
classifiers will be trained in parallel considering there
is always a separation between an actual class and
the remaining ones.
• This approach is relatively lightweight (at most, n-1
checks are needed to find the right class, so it has an
O(n) complexity) and, for this reason, it's normally
the default choice and there's no need for further
actions.
43
• One-vs-one
• The alternative to one-vs-all is training a
model for each pair of classes.
• The complexity is no longer linear (it's
O(n2) indeed) and the right class is
determined by a majority vote.
• In general, this choice is more expensive
and should be adopted only when a full
dataset comparison is not preferable.
44
Learnability

45
• there's an example of a dataset whose points
must be classified as red (Class A) or blue (Class
B).
• Three hypotheses are shown: the first one (the
middle line starting from left) misclassifies one
sample,
• while the lower and upper ones misclassify 13
and 23 samples respectively:
• the first hypothesis is optimal and should be
selected; however, it's important to understand
an essential concept which can determine a
potential overfitting
46
47
• The blue classifier is linear while the red one is cubic.
At a glance, non-linear strategy seems to perform
better, because it can capture more expressivity,
thanks to its concavities.
• However, if new samples are added following the
trend defined by the last four ones (from the right),
they'll be completely misclassified.
• In fact, while a linear function is globally better but
cannot capture the initial oscillation between 0 and
4, a cubic approach can fit this data almost perfectly
but, at the same time, loses its ability to keep a
global linear trend.

48
Underfitting and overfitting
• Underfitting: It means that the model isn't able to
capture the dynamics shown by the same training
set (probably because its capacity is too limited).
• Overfitting: the model has an excessive capacity
and it's not more able to generalize considering the
original dynamics provided by the training set. It
can associate almost perfectly all the known
samples to the corresponding output values, but
when an unknown input is presented, the
corresponding prediction error can be very high.

49
low-capacity (underfitting), normal-capacity
(normal fitting), and excessive capacity
(overfitting):

50
Error measures
• In general, when working with a supervised
scenario, we define a non-negative error
measure em which takes two arguments
(expected & predicted output ) and allows us
to compute a total error value over the whole
dataset (made up of n samples):

51
• This value is also implicitly dependent on the
specific hypothesis H through the parameter
set, therefore optimizing the error implies
finding an optimal hypothesis

• it's useful to consider the mean square error

(MSE):

52
53
• This measure is also called loss function
because its value must be minimized through
an optimization problem.
• When it's easy to determine an element which
must be maximized, the corresponding loss
function will be its reciprocal.
• Another useful loss function is called zero-
one-loss and it's particularly efficient for
binary classifications (also for one-vs-rest
multiclass strategy):

54
• generic (and continuous) loss function can be
expressed in terms of potential energy:
•

• The predictor is like a ball upon a rough surface: starting

from a random point where energy (=error) is usually
rather high, it must move until it reaches a stable
equilibrium point where its energy (relative to the
global minimum) is null. In the following figure, there's
a schematic representation of some different situations:
55
56
• the starting point is stable without any external
perturbation, so to start the process, it's needed to
provide initial kinetic energy.
• However, if such an energy is strong enough, then
after descending over the slope the ball cannot stop in
the global minimum.
• The residual kinetic energy can be enough to
overcome the ridge and reach the right valley. If there
are not other energy sources, the ball gets trapped in
the plain valley and cannot move anymore.
• avoid local minima. However, every situation must
always be carefully analyzed to understand what level
of residual energy (or error) is acceptable, or whether
it's better to adopt a different strategy
57
Statistical learning approaches
• Imagine that you need to design a spam-filtering
algorithm starting from this initial (over-
simplistic) classification based on two
parameters:
Parameter Spam emails (X1) Regular emails (X2)

P1 Contains > 5 blacklisted

words 80 20
p2 - Message length < 20 75 25
characters
• We have collected 200 email messages (X) (for
simplicity, we consider p1 and p2 mutually exclusive)
and we need to find a couple of probabilistic hypotheses
(expressed in terms of p1 and p2), to determine:

58
• For example, we could think about rules (hypotheses)
like: "If there are more than five blacklisted words" or "If
the message is less than 20 characters in length" then
"the probability of spam is high" (for example, greater
than 50 percent). However, without assigning
probabilities, it's difficult to generalize when the dataset
changes (like in a real world antispam filter). We also
want to determine a partitioning threshold (such as
green, yellow, and red signals) to help the user in
deciding what to keep and what to trash.
• As the hypotheses are determined through the dataset
X, we can also write (in a discrete form):

59
• In this example, it's quite easy to determine the value of each
term. However, in general, it's necessary to introduce the
Bayes formula

• In the previous equation, the first term is

called a posteriori (which comes after)
probability, because it's determined by a
marginal Apriori (which comes first)
probability multiplied by a factor which is
called likelihood.
60
61
MAP learning(maximum a posteriori )
• When selecting the right hypothesis, a Bayesian approach is
normally one of the best choices,
• For example, a real coin is a very short cylinder, so, in tossing
a coin, we should also consider the probability of even.
• Let's say, it‘s 0.001. It means that we have three possible
outcomes: P(head) = P(tail) = (1.0 - 0.001) / 2.0 and P(even) =
0.001. The latter event is obviously unlikely, but in Bayesian
learning it must be considered (even if it'll be squeezed by the
strength of the other terms).
• An alternative is picking the most probable hypothesis in
terms of a posteriori probability:
•

62
Maximum-likelihood learning
• We have defined likelihood as a filtering term
in the Bayes formula. In general, it has the
form of:

• Here the first term expresses the actual likelihood of

a hypothesis, given a dataset X. As you can imagine,
in this formula there are no more Apriori
probabilities, so, maximizing it doesn't imply
accepting a theoretical preferential hypothesis, nor
considering unlikely ones. A very common approach,
known as expectation-maximization and used in
many algorithms
63
• A log-likelihood (normally called L) is a useful
trick that can simplify gradient calculations. A
generic likelihood expression is:

• As all parameters are inside hi, the gradient is

a complex expression which isn't very
manageable. However our goal is maximizing
the likelihood, but it's easier minimizing its
reciprocal:

64
Elements of information theory
• A machine learning problem can also be analyzed in
terms of information transfer or exchange. Our
dataset is composed of n features, which are
considered independent (for simplicity, even if it's
often a realistic assumption) drawn from n different
statistical distributions.
• Therefore, there are n probability density functions
pi(x) which must be approximated through other n
qi(x) functions.
• In any machine learning task, it's very important to
understand how two corresponding distributions
diverge and what is the amount of information we
lose when approximating the original dataset. 65
The most useful measure is called
entropy:

This value is proportional to the uncertainty of X and it's

measured in bits (if the logarithm has another base, this
unit can change too). For many purposes, a high entropy
is preferable, because it means that a certain feature
contains more information. For example, in tossing a coin
(two possible outcomes), H(X) = 1 bit, but if the number
of outcomes grows, even with the same probability, H(X)
also does because of a higher number of different values
and therefore increased variability. It's possible to prove
66
that for a Gaussian distribution (using natural logarithm):
• So, the entropy is proportional to the variance, which
is a measure of the amount of information carried by
a single feature.
• low variance implies low information level and a
model could often discard all those features.
• If we have a target probability distribution p(x),
which is approximated by another distribution q(x), a
useful measure is cross-entropy between p and q

67
• In order to understand how a machine learning
approach is performing, it's also useful to introduce a
conditional entropy or the uncertainty of X given the
knowledge of Y:

• it's possible to introduce the idea of mutual

information, which is the amount of information
shared by both variables and therefore, the
reduction of uncertainty about X provided by the
knowledge of Y:

68
• Intuitively, when X and Y are independent,
they don't share any information. However, in
machine learning tasks, there's a very tight
dependence between an original feature and
its prediction, so we want to maximize the
information shared by both distributions.
• If the conditional entropy is small enough (so
Y is able to describe X quite well), the mutual
information gets close to the marginal entropy
H(X), which measures the amount of
information we want to learn.
69
References
• Russel S., Norvig P., Artificial Intelligence: A Modern Approach, Pearson
• Valiant L., A Theory of the Learnable, Communications of the ACM, Vol. 27, No. 11
(Nov. 1984)
• Hastie T., Tibshirani R., Friedman J., The Elements of Statistical Learning: Data
Mining, Inference and, Prediction, Springer
• Aleksandrov A.D., Kolmogorov A.N, Lavrent'ev M.A., Mathematics: Its contents,
Methods, and Meaning, Courier Corporation
• https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-
algorithms

Machine Learning Interview Questions
From Everand
Machine Learning Interview Questions
Tech Interviews
4.5/5 (2)
LAB 13-PE-Lab
No ratings yet
LAB 13-PE-Lab
6 pages
Machine Learning
No ratings yet
Machine Learning
39 pages
Machine Learning-Lecture 01
No ratings yet
Machine Learning-Lecture 01
28 pages
Lect3 Machine Learning
No ratings yet
Lect3 Machine Learning
27 pages
UNIT I-Part 1
No ratings yet
UNIT I-Part 1
52 pages
Mlfa Autumn 22 Lec 01
No ratings yet
Mlfa Autumn 22 Lec 01
43 pages
Unit-1 MLT
No ratings yet
Unit-1 MLT
51 pages
Ch7 Introduction to Machine Learning
No ratings yet
Ch7 Introduction to Machine Learning
29 pages
Machine Learning KTU Module 1
No ratings yet
Machine Learning KTU Module 1
77 pages
Chapter 1 - Introduction
No ratings yet
Chapter 1 - Introduction
28 pages
Concept Learning
No ratings yet
Concept Learning
85 pages
UNit 1 Introduction To ML
No ratings yet
UNit 1 Introduction To ML
225 pages
W1_ Introduction to ML
No ratings yet
W1_ Introduction to ML
57 pages
CHP 1
No ratings yet
CHP 1
47 pages
L3 - Supervised and Unsupervised Learning
100% (3)
L3 - Supervised and Unsupervised Learning
24 pages
Lecture 1
No ratings yet
Lecture 1
47 pages
Lec 7_8_Machine Learning Introduction
No ratings yet
Lec 7_8_Machine Learning Introduction
55 pages
Chapter Five
No ratings yet
Chapter Five
178 pages
Machine Learning
No ratings yet
Machine Learning
46 pages
1.Introduction
No ratings yet
1.Introduction
24 pages
Chapter 1 Introduction To Machine Learning
No ratings yet
Chapter 1 Introduction To Machine Learning
29 pages
Unit1-2
No ratings yet
Unit1-2
101 pages
Fundamentals of Machine Learning II
No ratings yet
Fundamentals of Machine Learning II
13 pages
Ai Chapter 5
No ratings yet
Ai Chapter 5
45 pages
1. U1 ML Intro and Applications
No ratings yet
1. U1 ML Intro and Applications
123 pages
CE880_lecture5_slides
No ratings yet
CE880_lecture5_slides
32 pages
Unit I MACHINE LEARNING
No ratings yet
Unit I MACHINE LEARNING
87 pages
21CSC305P ML_ Unit 1-E.pptx
No ratings yet
21CSC305P ML_ Unit 1-E.pptx
137 pages
Machine Learning
No ratings yet
Machine Learning
26 pages
Lecture01 Introduction To Machine Learning (Chapter1)
No ratings yet
Lecture01 Introduction To Machine Learning (Chapter1)
64 pages
Machine Learning and Soft Computing: CSCC53 Mca V Sem 2020
No ratings yet
Machine Learning and Soft Computing: CSCC53 Mca V Sem 2020
33 pages
Lecture 1 - Introduction
No ratings yet
Lecture 1 - Introduction
49 pages
unit 01
No ratings yet
unit 01
32 pages
01 LecIntro
No ratings yet
01 LecIntro
23 pages
An Overview of Machine Learning
No ratings yet
An Overview of Machine Learning
20 pages
Module 1
No ratings yet
Module 1
175 pages
Introduction To Machine Learning
No ratings yet
Introduction To Machine Learning
24 pages
Machine Learning
No ratings yet
Machine Learning
74 pages
Introduction To ML
No ratings yet
Introduction To ML
48 pages
mlintro-4
No ratings yet
mlintro-4
28 pages
1 Leaning Introduction
No ratings yet
1 Leaning Introduction
29 pages
UNIT III
No ratings yet
UNIT III
39 pages
L02 Fundamentals of ML
No ratings yet
L02 Fundamentals of ML
46 pages
Unit 1 - Machine Learning - NOTES1 - ML
No ratings yet
Unit 1 - Machine Learning - NOTES1 - ML
52 pages
1 Lecture 1: Introduction To Machine Learning
No ratings yet
1 Lecture 1: Introduction To Machine Learning
12 pages
01 - ML - Introduction (1)
No ratings yet
01 - ML - Introduction (1)
65 pages
MLP Unit-I
No ratings yet
MLP Unit-I
62 pages
Unit_1_ML - Copy
No ratings yet
Unit_1_ML - Copy
96 pages
Lecture 1- Introduction to Machine Learning-HO - Ch0
No ratings yet
Lecture 1- Introduction to Machine Learning-HO - Ch0
44 pages
University Institute of Engineering Department of Computer Science and Engg
No ratings yet
University Institute of Engineering Department of Computer Science and Engg
27 pages
Lecture Compiled
No ratings yet
Lecture Compiled
224 pages
Lec1 Intoduction
No ratings yet
Lec1 Intoduction
34 pages
Unit 1&2
No ratings yet
Unit 1&2
270 pages
Supervised and Deep Learning
No ratings yet
Supervised and Deep Learning
83 pages
Machine-Learning NOTE2025 2
No ratings yet
Machine-Learning NOTE2025 2
331 pages
Module 1 ML
No ratings yet
Module 1 ML
51 pages
Introduction to machine learning
No ratings yet
Introduction to machine learning
33 pages
mlintro-2
No ratings yet
mlintro-2
28 pages
4. Ai_foundations of Machine Learning i
No ratings yet
4. Ai_foundations of Machine Learning i
40 pages
Machine Learning: Fundamentals and Applications
From Everand
Machine Learning: Fundamentals and Applications
Fouad Sabry
No ratings yet
Face Recognition System Using Containers For Embedded Devices On A Customized Operating System
No ratings yet
Face Recognition System Using Containers For Embedded Devices On A Customized Operating System
26 pages
CS252 Graduate Computer Architecture
No ratings yet
CS252 Graduate Computer Architecture
34 pages
LP-III Lab Manual
No ratings yet
LP-III Lab Manual
49 pages
Hci Unit Test 2 QP
No ratings yet
Hci Unit Test 2 QP
1 page
Assignment 1 - ICS
No ratings yet
Assignment 1 - ICS
1 page
A Good Site Should Always Begin With The User
No ratings yet
A Good Site Should Always Begin With The User
10 pages
Directory of Development Organizations: EDITION 2008 Volume I.A / Africa
No ratings yet
Directory of Development Organizations: EDITION 2008 Volume I.A / Africa
423 pages
Entrepreneurship, Professional Ethics and Communication Epecpsb301
No ratings yet
Entrepreneurship, Professional Ethics and Communication Epecpsb301
36 pages
Zeal Polytechnic Pune: Micro Project
No ratings yet
Zeal Polytechnic Pune: Micro Project
16 pages
Nadimpalli Venkat Varma
No ratings yet
Nadimpalli Venkat Varma
2 pages
CodeCharge Studio Manual B3
No ratings yet
CodeCharge Studio Manual B3
142 pages
CG Communications_1732118916.1960301
No ratings yet
CG Communications_1732118916.1960301
1 page
m16 Script
No ratings yet
m16 Script
4 pages
S7220D Catalog
No ratings yet
S7220D Catalog
4 pages
2 CNN-Motivation
No ratings yet
2 CNN-Motivation
17 pages
Samsung Network Video Recorder SRN-473S - Manual
No ratings yet
Samsung Network Video Recorder SRN-473S - Manual
192 pages
CCNP Enterprise-Roadmap
No ratings yet
CCNP Enterprise-Roadmap
5 pages
EPAS 10 - Q2 - Mod2
No ratings yet
EPAS 10 - Q2 - Mod2
24 pages
US Blank Download Utility Bill
No ratings yet
US Blank Download Utility Bill
2 pages
Datasheet
No ratings yet
Datasheet
28 pages
Document - Onl Alarmes Erros e Eventos Fiberhome
No ratings yet
Document - Onl Alarmes Erros e Eventos Fiberhome
212 pages
Ser Man GEM
No ratings yet
Ser Man GEM
156 pages
FARISH K B - Marketing-Cv
No ratings yet
FARISH K B - Marketing-Cv
1 page
LFI Vulnerability
100% (1)
LFI Vulnerability
7 pages
Typical Beme For A Building Project
No ratings yet
Typical Beme For A Building Project
10 pages
AI Governance A Consolidated Reference
No ratings yet
AI Governance A Consolidated Reference
124 pages
Acceptance of Visual Docking Guidence
No ratings yet
Acceptance of Visual Docking Guidence
11 pages
UTPC Industrial Company Profile
No ratings yet
UTPC Industrial Company Profile
11 pages
40ft Standard - Open Top
No ratings yet
40ft Standard - Open Top
1 page
Aramco Interview Questions
No ratings yet
Aramco Interview Questions
7 pages
TVL-CSS11-Q3-M3-CABLE-ROUTES-OF-PEER-TO-PEER-NETWORK-DESIGN-edited
No ratings yet
TVL-CSS11-Q3-M3-CABLE-ROUTES-OF-PEER-TO-PEER-NETWORK-DESIGN-edited
15 pages
Question Bank PDC
No ratings yet
Question Bank PDC
1 page
SnapScan Payments Guide
No ratings yet
SnapScan Payments Guide
7 pages
Order Scheduling Status Change Scenarios
No ratings yet
Order Scheduling Status Change Scenarios
7 pages
Aldebahran - Estudos de Astrologia A Lua Nas Casas
No ratings yet
Aldebahran - Estudos de Astrologia A Lua Nas Casas
20 pages

Unit 1 ML

Uploaded by

Unit 1 ML

Uploaded by

Machine Learning

(BE Computer 2015 PAT)

The image above roughly explains how

• First we provide the dataset to the system i.e we

• The corresponding output values can be both

• We define generic regressor, a vector-valued

In unsupervised learning, we normally only have an input

In most scikit-learn models, there is an instance variable coef_ which

• it's useful to consider the mean square error

• The predictor is like a ball upon a rough surface: starting

P1 Contains > 5 blacklisted

• In the previous equation, the first term is

• Here the first term expresses the actual likelihood of

• As all parameters are inside hi, the gradient is

This value is proportional to the uncertainty of X and it's

• it's possible to introduce the idea of mutual

You might also like