2023-24_ML_NOTES_1

The document provides a comprehensive overview of machine learning, including its definitions, key concepts, and applications across various fields. It categorizes machine learning into three types: supervised learning, unsupervised learning, and reinforcement learning, detailing their characteristics and examples. Additionally, it discusses the fundamental components of the learning process, such as data storage, abstraction, generalization, and evaluation.

Uploaded by

pm.xvi.xii.mmiii

MACHINE LEARNING

NOTES

Summary of Key concepts

Faculty-in-charge: Prof Ruchi K Sharma , NMIMS,Mumbai


Machine Learning Fundamentals

Introduction

Definition of Machine Learning

Arthur Samuel, an early American leader in the field of computer gaming and artificial
intelligence, coined the term “Machine Learning” in 1959 while at IBM. He defined
machine learning as “the field of study that gives computers the ability to learn without
being explicitly programmed.” However, there is no universally accepted definition for
machine learning. Different authors define the term differently. We give below two more
definitions.

• Machine learning is programming computers to optimize a performance criterion using
example data or past experience. We have a model defined up to some parameters, and
learning is the execution of a computer program to optimize the parameters of the model
using the training data or past experience.
• The field of study known as machine learning is concerned with the question of how to
construct computer programs that automatically improve with experience.

• In the above definitions we have used the term “model”, and we will be using this term
in several contexts later. There appears to be no universally accepted one-sentence
definition of this term. Loosely, it may be understood as some mathematical expression
or equation, or some mathematical structure such as a graph or tree, or a division of
sets into disjoint subsets, or a set of logical “if . . . then . . . else . . .” rules, or some
such thing. It may be noted that this is not an exhaustive list.

• One of the main motivations for developing (computer) programs is to automate various
kinds of (often tedious) processes. Originally, machine learning was developed as a
subfield of Artificial Intelligence (AI), and one of the goals behind machine learning was
to replace the need for developing computer programs “manually.” Considering that
programs are being developed to automate processes, we can think of machine learning
as the process of “automating automation.”

• In other words, machine learning lets computers “create” programs (often, the
intent for developing these programs is making predictions) themselves. We
can say that machine learning is the process of turning data into programs
(Figure 1).

• In the machine learning community, it is broadly accepted that the term
machine learning was first coined by Arthur Lee Samuel, a pioneer in the AI
field, in 1959 [1].

One quotation that almost every introductory machine learning resource cites is
the following, which summarizes the concept behind machine learning nicely
and concisely:

Machine learning is the field of study that gives computers the ability to
learn without being explicitly programmed. [2]
– Arthur L. Samuel, AI pioneer, 1959
Now, before we introduce machine learning more formally, here is what some
other people said about the field:

The field of machine learning is concerned with the question of how to
construct computer programs that automatically improve with experience.
– Tom Mitchell, Professor of Machine Learning at Carnegie Mellon University
and author of the popular “Machine Learning” textbook

[1] Arthur L. Samuel. “Some studies in machine learning using the game of
checkers”. In: IBM Journal of Research and Development 3.3 (1959), pp. 210–229.
[2] This is not a direct quote but a paraphrased version of Samuel’s sentence
“Programming computers to learn from experience should eventually eliminate
the need for much of this detailed programming effort.”

Machine learning is the hot new thing.
– John L. Hennessy, President of Stanford (2000–2016)

A breakthrough in machine learning would be worth ten Microsofts.
– Bill Gates, Microsoft Co-Founder

The Traditional Programming Paradigm:

[Figure not reproduced; labels: Inputs (observations), Programmer, Program, Computer, Outputs.]


A bit more concrete is Tom Mitchell’s description from his Machine Learning book [3]:

A computer program is said to learn from experience E with respect to some
class of tasks T and performance measure P, if its performance at tasks in T, as
measured by P, improves with experience E.
– Tom Mitchell, Machine Learning Professor at Carnegie Mellon University

Definition of learning

A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks T, as measured by P, improves with experience
E.

Examples:

i) Handwriting recognition learning problem

• Task T: Recognising and classifying handwritten words within images

• Performance measure P: Percent of words correctly classified

• Training experience E: A dataset of handwritten words with given classifications

ii) A robot driving learning problem

• Task T: Driving on highways using vision sensors

• Performance measure P: Average distance traveled before an error

• Training experience E: A sequence of images and steering commands recorded while
observing a human driver

iii) A chess learning problem

• Task T: Playing chess

• Performance measure P: Percent of games won against opponents

• Training experience E: Playing practice games against itself

A computer program which learns from experience is called a machine learning program or
simply a learning program. Such a program is sometimes also referred to as a learner.

How machines learn

Basic components of learning process:

The learning process, whether by a human or a machine, can be divided into four components,
namely, data storage, abstraction, generalization and evaluation.

Data storage: Facilities for storing and retrieving huge amounts of data are an important
component of the learning process. Humans and computers alike utilize data storage as a
foundation for advanced reasoning.

• In a human being, the data is stored in the brain and data is retrieved using electrochemical
signals.

• Computers use hard disk drives, flash memory, random access memory and similar devices to
store data and use cables and other technology to retrieve data.

Abstraction

The second component of the learning process is known as abstraction. Abstraction is the process
of extracting knowledge about stored data. This involves creating general concepts about the data
as a whole. The creation of knowledge involves application of known models and creation of
new models.

The process of fitting a model to a dataset is known as training. When the model has been
trained, the data is transformed into an abstract form that summarizes the original information.

Generalization

The third component of the learning process is known as generalization. The term generalization
describes the process of turning the knowledge about stored data into a form that can be utilized
for future action. These actions are to be carried out on tasks that are similar, but not identical,
to those that have been seen before. In generalization, the goal is to discover those properties of
the data that will be most relevant to future tasks.

Evaluation

Evaluation is the last component of the learning process. It is the process of giving feedback to
the user to measure the utility of the learned knowledge. This feedback is then utilised to effect
improvements in the whole learning process.


Applications of machine learning

Application of machine learning methods to large databases is called data mining. In data
mining, a large volume of data is processed to construct a simple model with valuable use, for
example, having high predictive accuracy.

The following is a list of some of the typical applications of machine learning.

1. In retail business, machine learning is used to study consumer behaviour.

2. In finance, banks analyze their past data to build models to use in credit applications, fraud
detection, and the stock market.

3. In manufacturing, learning models are used for optimization, control, and troubleshooting.

4. In medicine, learning programs are used for medical diagnosis.

5. In telecommunications, call patterns are analyzed for network optimization and maximizing
the quality of service.

6. In science, large amounts of data in physics, astronomy, and biology can only be analyzed fast
enough by computers. The World Wide Web is huge; it is constantly growing and searching for
relevant information cannot be done manually.

7. In artificial intelligence, it is used to teach a system to learn and adapt to changes so that the
system designer need not foresee and provide solutions for all possible situations.

8. It is used to find solutions to many problems in vision, speech recognition, and robotics.

9. Machine learning methods are applied in the design of computer-controlled vehicles to
steer correctly when driving on a variety of roads.

10. Machine learning methods have been used to develop programmes for playing games
such as chess, backgammon and Go.

Additional Applications of Machine Learning


Since the field of machine learning was “founded” more than half a century ago, its applications
have spread to almost every aspect of our lives. Popular applications of machine learning
include the following:


• Email spam detection

• Face detection and matching (e.g., iPhone X, Windows laptops, etc.)

• Web search (e.g., DuckDuckGo, Bing, Baidu, Google)

• Sports predictions

• Post office (e.g., sorting letters by zip codes)

• ATMs (e.g., reading checks)

• Credit card fraud

• Stock predictions

• Smart assistants (Apple Siri, Amazon Alexa, . . . )

• Product recommendations (e.g., Walmart, Netflix, Amazon)

• Self-driving cars (e.g., Uber, Tesla)

• Language translation (Google Translate)

• Sentiment analysis

• Drug design

• Medical diagnoses

• ...


TYPES OF MACHINE LEARNING


In general, machine learning algorithms can be classified into three types.

Overview of the Categories of Machine Learning

The three broad categories of machine learning are summarized in Figure 3:
(1) supervised learning, (2) unsupervised learning, and (3) reinforcement learning.

Figure 3: Categories of machine learning (Source: Raschka & Mirjalili: Python
Machine Learning, 3rd Ed.). [Figure not reproduced; surviving labels: labeled data, direct
feedback, outcome/future (supervised learning); no feedback (unsupervised learning); reward
system (reinforcement learning).]

Supervised Learning
Supervised learning is the subcategory of machine learning that focuses on learning a
classification (Figure 4), or regression model (Figure 5), that is, learning from labeled
training data (i.e., inputs that also contain the desired outputs or targets; basically,
“examples” of what we want to predict).

Figure 4: Illustration of a binary classification problem with two feature variables (x1 and
x2); plus and minus signs denote class labels. (Source: Raschka & Mirjalili:
Python Machine Learning, 3rd Ed.).

Figure 5: Illustration of a linear regression model with one feature variable (x1) and the target
variable y. The dashed-line indicates the functional form of the linear regression model. (Source:
Raschka & Mirjalili: Python Machine Learning, 3rd Ed.).

Supervised learning is the machine learning task of learning a function that maps an input to an
output based on example input-output pairs.

In supervised learning, each example in the training set is a pair consisting of an input object
(typically a vector) and an output value. A supervised learning algorithm analyzes the training
data and produces a function, which can be used for mapping new examples. In the optimal case,
the function will correctly determine the class labels for unseen instances. Both classification and
regression problems are supervised learning problems.
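The idea of learning a function from input-output pairs can be sketched in a few lines of Python. This is only an illustration: the training pairs (age and resting heart rate, labelled "healthy" or "sick") are invented, and the 1-nearest-neighbour rule is just one simple supervised method chosen for brevity, not a method prescribed by these notes.

```python
# Hypothetical labeled training set: (age, resting heart rate) -> label.
# The numbers are invented purely for illustration.
training_set = [
    ((25, 62), "healthy"),
    ((31, 70), "healthy"),
    ((58, 95), "sick"),
    ((64, 102), "sick"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def fit_nearest_neighbour(examples):
    """'Training' a 1-nearest-neighbour model simply stores the labeled examples.

    Returns the learned function that maps a new input to a predicted label.
    """
    def predict(x):
        _, label = min(examples, key=lambda pair: squared_distance(pair[0], x))
        return label

    return predict


predict = fit_nearest_neighbour(training_set)
print(predict((29, 66)))  # nearest stored example is labeled "healthy"
print(predict((61, 99)))  # nearest stored example is labeled "sick"
```

The returned `predict` function plays the role of the learned mapping: it is produced from the training pairs and can then be applied to inputs that were not in the training set.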

A wide range of supervised learning algorithms are available, each with its strengths and
weaknesses. There is no single learning algorithm that works best on all supervised learning
problems.

Supervised learning is so called because the process of an algorithm learning from the training
dataset can be thought of as a teacher supervising the learning process. We know the correct
answers (that is, the correct outputs); the algorithm iteratively makes predictions on the training
data and is corrected by the teacher. Learning stops when the algorithm achieves an acceptable
level of performance.

Example :

Consider the following data regarding patients entering a clinic. The data consists of the gender
and age of the patients and each patient is labelled as “healthy” or “sick”.

Unsupervised learning
In contrast to supervised learning, unsupervised learning is a branch of machine learning
that is concerned with unlabeled data. Common tasks in unsupervised learning are clustering
analysis (assigning group memberships; Figure 6) and dimensionality reduction
(compressing data onto a lower-dimensional subspace or manifold).

Figure 6: Illustration of clustering, where the dashed lines indicate potential group
membership assignments of unlabeled data points; the axes are the two feature variables
x1 and x2. (Source: Raschka & Mirjalili: Python Machine Learning, 3rd Ed.).


Unsupervised learning is a machine learning technique in which models are not supervised
using a labeled training dataset. Instead, the models themselves find hidden patterns and
insights in the given data. It can be compared to the learning that takes place in the human
brain while learning new things.

• Unsupervised learning cannot be directly applied to a regression or classification
problem because, unlike supervised learning, we have the input data but no corresponding
output data. The goal of unsupervised learning is to find the underlying structure of the
dataset, group the data according to similarities, and represent the dataset in a
compressed format.
• It is a type of machine learning algorithm used to draw inferences from datasets consisting
of input data without labeled responses.
• In unsupervised learning algorithms, a classification or categorization is not included in
the observations. There are no output values and so there is no estimation of functions.
Since the examples given to the learner are unlabeled, the accuracy of the structure that is
output by the algorithm cannot be evaluated against known answers.

• The most common unsupervised learning method is cluster analysis, which is used for
exploratory data analysis to find hidden patterns or groupings in data.

Example :

Consider the following data regarding patients entering a clinic. The data consists of the
gender and age of the patients.

Based on this data, can we infer anything regarding the patients entering the clinic?

Example: Suppose the unsupervised learning algorithm is given an input dataset containing
images of different types of cats and dogs. The algorithm is not trained on labels for the given
dataset, which means it has no prior idea about the features of the dataset. The task of the
unsupervised learning algorithm is to identify the image features on its own. The algorithm
performs this task by clustering the image dataset into groups according to the similarities
between images.


Importance of Unsupervised Learning:

o Unsupervised learning is helpful for finding useful insights from the data.
o Unsupervised learning is similar to how a human learns to think from their own
experiences, which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data, which makes it all
the more important.
o In the real world, we do not always have input data with corresponding outputs, so to
solve such cases we need unsupervised learning.

The working of unsupervised learning can be understood from the diagram below:

Here, we have taken unlabeled input data, which means it is not categorized and the
corresponding outputs are not given. This unlabeled input data is fed to the machine
learning model in order to train it. First, the model interprets the raw data to find hidden
patterns in it, and then applies a suitable algorithm such as k-means clustering or hierarchical
clustering.

Once the suitable algorithm is applied, it divides the data objects into groups according to the
similarities and differences between the objects.

o Clustering: Clustering is a method of grouping objects into clusters such that objects
with the most similarities remain in one group and have few or no similarities with the
objects of another group. Cluster analysis finds the commonalities between the data objects
and categorizes them according to the presence and absence of those commonalities.
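As a rough illustration of clustering, the following sketch implements plain k-means on two-dimensional points. The points and the starting centroids are invented for the example, and the function name `kmeans` is only a label chosen here; a real project would more likely rely on a library implementation.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled data forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

No labels are supplied anywhere: the grouping emerges only from the distances between points, which is the sense in which the objects in one cluster are more similar to each other than to objects in another cluster.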

Reinforcement learning
• Reinforcement learning is the process of learning from rewards while performing a series of
actions.
• In reinforcement learning, we do not tell the learner or agent (for example, a
robot) which action to take but merely assign a reward to each action and/or the
overall outcome. Instead of having “correct/false” labels for each step, the learner
must discover or learn a behavior that maximizes the reward for a series of actions.
In that sense, it is not a supervised setting.
• Reinforcement learning (RL) is somewhat related to unsupervised learning; however, it
really is its own category of machine learning.
• Reinforcement learning is the problem of getting an agent to act in the world so as to
maximize its rewards. A learner (the program) is not told what actions to take, as in most
forms of machine learning, but instead must discover which actions yield the most
reward by trying them. In the most interesting and challenging cases, actions may affect
not only the immediate reward but also the next situations and, through that, all
subsequent rewards.

• Typical applications of reinforcement learning involve playing games (chess, Go,
Atari video games) and some form of robots, e.g., drones, warehouse robots, and
more recently self-driving cars.

Figure 7: Illustration of reinforcement learning (Source: Raschka & Mirjalili: Python Machine
Learning, 3rd Ed.). [Figure not reproduced; labels: Environment, Reward, Action, Agent.]

• For example, consider teaching a dog a new trick: we cannot tell it what to do, but we can
reward/punish it if it does the right/wrong thing. It has to find out what it did that made it get
the reward/punishment. We can use a similar method to train computers to do many tasks,
such as playing backgammon or chess, scheduling jobs, and controlling robot limbs.
Reinforcement learning is different from supervised learning, which is learning from examples
provided by a knowledgeable expert.
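To make the agent, action and reward loop concrete, here is a minimal sketch of tabular Q-learning on a made-up five-state corridor. Everything about the environment (the states, the two actions, the reward of 1 at the right-hand end) is an assumption for illustration, and for determinism the sketch sweeps over every state-action pair instead of exploring randomly, which is a simplification of how an agent would normally gather experience.

```python
N_STATES = 5            # states 0..4; state 4 is the goal
ACTIONS = (-1, +1)      # move left or move right
GAMMA = 0.9             # discount factor for future rewards
ALPHA = 0.5             # learning rate


def step(state, action):
    """Hypothetical environment: reward 1 only when the goal state is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


# Q-table: estimated long-term reward of taking action a in state s.
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(200):
    for s in range(N_STATES - 1):          # the goal state is terminal
        for a in ACTIONS:
            s2, r = step(s, a)
            best_next = 0.0 if s2 == N_STATES - 1 else max(q[(s2, b)] for b in ACTIONS)
            # Q-learning update: nudge the estimate towards
            # (immediate reward + discounted best future value).
            q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])

# Greedy policy: in each non-terminal state pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that moving right is correct; it only sees rewards, and the preference for moving right in every state emerges from the learned values.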

Semi-supervised learning
• It can be described as a mix between supervised and unsupervised learning.
• In semi-supervised learning tasks, some training examples contain outputs, but
some do not.
• We then use the labeled training subset to label the unlabeled portion of the
training set, which we then also utilize for model training.
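The labeling procedure described above is often called self-training or pseudo-labeling. The sketch below assumes a very simple nearest-class-mean classifier on one-dimensional data; the data values, the labels "low" and "high", and the helper name `centroid_classifier` are all invented for illustration.

```python
def centroid_classifier(labeled):
    """Fit a nearest-class-mean classifier on (value, label) pairs."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    # The returned function predicts the label whose class mean is closest to x.
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


# Hypothetical data: a small labeled subset and a larger unlabeled portion.
labeled = [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")]
unlabeled = [1.5, 2.5, 8.0, 9.5]

# Step 1: train on the labeled subset only.
model = centroid_classifier(labeled)
# Step 2: use that model to pseudo-label the unlabeled examples.
pseudo_labeled = [(x, model(x)) for x in unlabeled]
# Step 3: retrain on the labeled and pseudo-labeled data together.
model = centroid_classifier(labeled + pseudo_labeled)

print(pseudo_labeled)
print(model(4.0))
```

In practice, pseudo-labels can be wrong, so implementations often keep only the predictions the model is most confident about; that refinement is omitted here.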


The Relationship between Machine Learning and Other Fields

Machine Learning and Data Mining

• Data mining focuses on the discovery of patterns in datasets or “gaining knowledge
and insights” from data. Often, this involves a heavy focus on computational
techniques, working with databases, etc. (nowadays, the term is more or less
synonymous with “data science”). We can then think of machine learning algorithms as
tools within a data mining project.
• Data mining is not “just” machine learning; it also emphasizes data processing,
visualization, and tasks that are traditionally not categorized as “machine learning”
(for example, association rule mining).

Machine Learning, AI, and Deep Learning

• Artificial intelligence (AI) was created as a subfield of computer science focusing on
solving tasks that humans are good at (for example, natural language processing,
image recognition). In other words, the goal of AI is to mimic human intelligence.

• There are two subtypes of AI: artificial general intelligence (AGI) and narrow AI.
AGI refers to an intelligence that equals humans in several tasks, i.e., multi-purpose
AI. In contrast, narrow AI is more narrowly focused on solving a particular
task that humans are traditionally good at (e.g., playing a game, or driving a car; I
would not go so far as to refer to “image classification” as AI).

• In general, AI can be approached in many ways. One approach is to write a
computer program that implements a set of rules devised by domain experts. Now,
hand-crafting rules can be very laborious and time consuming.

[13] David H. Wolpert. “The lack of a priori distinctions between learning
algorithms”. In: Neural Computation 8.7 (1996), pp. 1341–1390.


• The field of machine learning then emerged as a subfield of AI. It was concerned with
the development of algorithms so that computers can automatically learn (predictive)
models from data.
• Assume we want to develop a program that can recognize handwritten digits from
images. One approach would be to look at all of these images and come up with a set of
(nested) if-this-then-that rules to determine which digit is displayed in a particular image
(for instance, by looking at the relative locations of pixels). Another approach would be
to use a machine learning algorithm, which can fit a predictive model based on
thousands of labeled image samples that we may have collected in a database.
• Now, there is also deep learning, which in turn is a subfield of machine learning,
referring to a particular subset of models that are particularly good at certain tasks such as
image recognition and natural language processing.
• In short, machine learning (and deep learning) can definitely be helpful with
developing “AI”; however, AI doesn’t necessarily have to be developed using machine
learning, although machine learning makes “AI” much more convenient.

Figure 14: Relationship between machine learning, deep learning, and artificial intelligence. Note that there is
also overlap between machine learning and data mining, data science, statistics, etc. (not shown).
[Figure not reproduced; labels: AI, a system that is “intelligent” through rules; Machine Learning,
self-learning algorithms that learn models from “data”; Deep Learning, particular, multi-layered models
that learn representations of data with multiple levels of abstraction.]

Machine Learning Concepts Summary



REGRESSION

Linear regression:
The linear regression algorithm models a linear relationship between a dependent variable (y) and
one or more independent variables (x), hence the name linear regression.
The linear regression model provides a sloped straight line representing the relationship between
the variables. Consider the below image:

Mathematically, we can represent a linear regression as:

$y = a_0 + a_1 x + \varepsilon$

Here,
$y$ = dependent variable (target variable)
$x$ = independent variable (predictor variable)
$a_0$ = intercept of the line (gives an additional degree of freedom)
$a_1$ = linear regression coefficient (scale factor applied to each input value)
$\varepsilon$ = random error

The values of the x and y variables are the training data for the linear regression model.
Regression Models
Linear regression can be further divided into two types:
o Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable,
the algorithm is called Simple Linear Regression.
o Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical dependent
variable, the algorithm is called Multiple Linear Regression.

In simple terms, linear regression answers the question “How can I use X to predict Y?”,
where X is some information that you have and Y is some information that you want.
Let’s say you want to sell a house and you want to know how much you can sell it for.
The information you have about the house is your X, and the selling price that you want to
know is your Y.
Linear regression creates an equation into which you input your given numbers (X) and which
outputs the target variable that you want to find out (Y).
Linear Regression model representation
Linear regression is such a useful and established algorithm that it is both a statistical model
and a machine learning model. Linear regression tries to draw a best-fit line that is close to the
data by finding the slope and intercept.
The linear regression equation is
$y = a + bx$
In this equation:

• y is the output variable. It is also called the target variable in machine learning or the
dependent variable.
• x is the input variable. It is also referred to as the feature in machine learning or it is called the
independent variable.
• a is the constant (the intercept)
• b is the coefficient of the independent variable (the slope)
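One standard way to find the constant a and the coefficient b is ordinary least squares, which has a closed-form solution for a single input variable. The sketch below assumes that method; the house data is hypothetical and was generated from y = 5 + 2x so the expected result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Return (a, b) for the line y = a + b*x using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: house area (X) and selling price (Y), generated from y = 5 + 2x.
areas = [10, 12, 15, 20]
prices = [25, 29, 35, 45]

a, b = fit_line(areas, prices)
print(a, b)          # intercept and slope of the best-fit line
print(a + b * 18)    # predicted price for a house of area 18
```

With noisy real data the points would not lie exactly on a line, and the same formulas would return the line that minimizes the sum of squared vertical distances to the points.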

Multiple linear regression


Multiple linear regression assumes there is a linear relationship between two or more
independent variables and one dependent variable.
The formula for multiple linear regression:
$Y = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_n X_n + e$

• $Y$ = the predicted value of the dependent variable
• $B_0$ = the y-intercept (value of Y when all independent variables are set to 0)
• $B_1 X_1$ = the regression coefficient ($B_1$) of the first independent variable ($X_1$)
• $B_n X_n$ = the regression coefficient ($B_n$) of the last independent variable ($X_n$)
• $e$ = model error
Cost function

The cost function measures the difference, or error, between the actual values and the predicted
values for the current values of the model parameters, expressed as a single real number.
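The notes do not say how the cost is minimized. One common choice, assumed here purely for illustration, is gradient descent on the mean squared error: starting from arbitrary parameters, each step moves the parameters a little in the direction that reduces the cost. The data below is hypothetical and was generated from Y = 1 + 2·X1 + 3·X2, and the learning rate and step count are arbitrary values that happen to work for this small example.

```python
def predict(weights, bias, x):
    """Multiple linear regression prediction: bias + sum of weight_j * x_j."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def mse_cost(weights, bias, X, y):
    """Cost: mean squared difference between predictions and actual values."""
    return sum((predict(weights, bias, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(y)


def gradient_descent(X, y, learning_rate=0.1, steps=5000):
    """Minimize the MSE cost by repeatedly stepping against its gradient."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(steps):
        errors = [predict(weights, bias, xi) - yi for xi, yi in zip(X, y)]
        # Partial derivatives of the MSE with respect to each weight and the bias.
        grad_w = [2 / n * sum(e * xi[j] for e, xi in zip(errors, X)) for j in range(d)]
        grad_b = 2 / n * sum(errors)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Hypothetical data generated from Y = 1 + 2*X1 + 3*X2 (no noise).
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]

weights, bias = gradient_descent(X, y)
print(weights, bias)
print(mse_cost(weights, bias, X, y))
```

The cost printed at the end is the single real number the definition refers to: it summarizes how far the current parameters are from fitting the data.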

• Mean Squared Error represents the average of the squared difference between the original
and predicted values in the data set. It measures the variance of the residuals.

• Root Mean Squared Error is the square root of Mean Squared Error. It measures the
standard deviation of residuals.

• The coefficient of determination or R-squared represents the proportion of the variance in the
dependent variable which is explained by the linear regression model. It is a scale-free score, i.e.
irrespective of whether the values are small or large, the value of R squared will be at most one.
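These metrics are short enough to write out directly. The following sketch uses the standard definitions, with R squared computed as one minus the ratio of the residual sum of squares to the total sum of squares; the actual and predicted values are made up for the example.

```python
import math


def mse(actual, predicted):
    """Mean Squared Error: average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the MSE, in the units of the target."""
    return math.sqrt(mse(actual, predicted))


def mae(actual, predicted):
    """Mean Absolute Error: average of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted values.
actual = [3, 5, 7, 9]
predicted = [2, 5, 8, 9]

print(mse(actual, predicted))        # 2 squared errors of 1 over 4 points = 0.5
print(rmse(actual, predicted))
print(mae(actual, predicted))
print(r_squared(actual, predicted))
```

Note that R squared can become negative when a model fits worse than simply predicting the mean, which is why "at most one" is the safe statement about its range.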


Evaluation Metrics

• Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) penalize large prediction errors
more heavily than Mean Absolute Error (MAE). However, RMSE is more widely used than MSE to
evaluate the performance of a regression model against other models, as it has the same units as
the dependent variable (Y-axis).

• MSE is a differentiable function, which makes it easier to work with mathematically than a
function like MAE, which is not differentiable at zero. Therefore, in many models, RMSE is
used as the default metric for calculating the loss function despite being harder to interpret than
MAE. Lower values of MAE, MSE, and RMSE imply higher accuracy of a regression model,
whereas a higher value of R squared is considered desirable.

• R Squared and Adjusted R Squared are used to explain how well the independent variables
in the linear regression model explain the variability in the dependent variable. The R Squared
value never decreases when independent variables are added, which might lead to the addition
of redundant variables to the model. Adjusted R Squared addresses this problem by penalizing
the number of independent variables. For comparing the accuracy of different linear regression
models, RMSE is a better choice than R Squared.
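The usual formula for adjusted R squared is 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of independent variables; these symbols are not used elsewhere in the notes and are introduced here only for the sketch. The numbers in the example are invented.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R squared: R squared penalized for the number of predictors."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Same hypothetical R squared of 0.9 on 10 observations, with more and more predictors.
for p in (1, 3, 5):
    print(p, adjusted_r_squared(0.9, n_samples=10, n_predictors=p))
```

With R squared held fixed, the adjusted value drops as predictors are added, so an extra variable only raises the adjusted score if it improves the fit by more than the penalty.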

Logistic Regression
o Logistic regression is one of the most popular machine learning algorithms and comes
under the supervised learning technique. It is used for predicting a categorical dependent
variable using a given set of independent variables.
o Logistic regression predicts the output of a categorical dependent variable, so the
outcome must be a categorical or discrete value. It can be Yes or No, 0 or 1, True or
False, etc., but instead of giving the exact values 0 and 1, it gives probabilistic values
which lie between 0 and 1.
o Logistic regression is very similar to linear regression except in how it is used.
Linear regression is used for solving regression problems, whereas logistic regression is
used for solving classification problems.
o In logistic regression, instead of fitting a regression line, we fit an "S"-shaped logistic
function, whose output is bounded by the two limiting values 0 and 1.
o The curve from the logistic function indicates the likelihood of something, such as whether
cells are cancerous or not, or whether a mouse is obese or not based on its weight.
o Logistic regression is a significant machine learning algorithm because it can provide
probabilities and classify new data using both continuous and discrete datasets.
o Logistic regression can be used to classify observations using different types of data and
can easily determine the most effective variables for the classification. The below image
is showing the logistic function:


Logistic Function (Sigmoid Function):


o The sigmoid function is a mathematical function used to map the predicted values to
probabilities.
o It maps any real value to a value between 0 and 1.
o The output of logistic regression must lie between 0 and 1 and cannot go beyond these
limits, so it forms an "S"-shaped curve. This S-shaped curve is called the sigmoid function
or the logistic function.
o In logistic regression, we use the concept of a threshold value to turn the probability into a
class label of 0 or 1: values above the threshold are assigned to class 1, and values below the
threshold are assigned to class 0.
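A small sketch of the sigmoid function and the thresholding step may help. The function 1 / (1 + e^(−z)) is the standard definition; the default threshold of 0.5 and the helper name `classify` are assumptions made for the example.

```python
import math


def sigmoid(z):
    """Map any real number z to a value strictly between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use an equivalent form that avoids overflow in exp().
    ez = math.exp(z)
    return ez / (1.0 + ez)


def classify(z, threshold=0.5):
    """Turn the sigmoid probability into a class label using a threshold."""
    return 1 if sigmoid(z) >= threshold else 0


for z in (-4, -1, 0, 1, 4):
    print(z, round(sigmoid(z), 3), classify(z))
```

The printed values trace out the S shape: large negative inputs give probabilities near 0, large positive inputs give probabilities near 1, and an input of 0 gives exactly 0.5.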
Assumptions for Logistic Regression:
o The dependent variable must be categorical in nature.
o The independent variables should not exhibit multicollinearity.
Logistic Regression Equation:
The Logistic regression equation can be obtained from the Linear Regression equation. The
mathematical steps to get Logistic Regression equations are given below:
o We know the equation of a straight line can be written as:

$y = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_n X_n$

o In logistic regression y can be between 0 and 1 only, so let us divide y by (1 - y):

$\frac{y}{1-y}$, which is 0 for y = 0 and tends to infinity as y approaches 1

o But we need a range from -infinity to +infinity, so we take the logarithm, and the equation
becomes:

$\log\left[\frac{y}{1-y}\right] = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_n X_n$

The above equation is the final equation for Logistic Regression.
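Writing z for the linear combination of the inputs, the final equation says that the log-odds log(y / (1 − y)) equals z. Solving for y gives y = 1 / (1 + e^(−z)), which is the sigmoid function. The sketch below checks this numerically; the function names `logit` and `inverse_logit` are conventional labels rather than anything defined in these notes.

```python
import math


def logit(y):
    """Log-odds of a probability y, defined for 0 < y < 1."""
    return math.log(y / (1 - y))


def inverse_logit(z):
    """Solve log(y / (1 - y)) = z for y; this is the sigmoid function."""
    return 1.0 / (1.0 + math.exp(-z))


for y in (0.1, 0.5, 0.8):
    z = logit(y)
    # Applying the inverse should recover the original probability.
    print(y, round(z, 4), round(inverse_logit(z), 4))
```

A probability of 0.5 corresponds to log-odds of 0, probabilities below 0.5 give negative log-odds and probabilities above 0.5 give positive log-odds, so the left-hand side can take any real value, as a linear expression requires.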

Type of Logistic Regression:


On the basis of the categories, logistic regression can be classified into three types:
o Binomial: In binomial logistic regression, the dependent variable can have only two
possible types, such as 0 or 1, Pass or Fail, etc.
o Multinomial: In multinomial logistic regression, there can be 3 or more possible
unordered types of the dependent variable, such as "cat", "dog", or "sheep".
o Ordinal: In ordinal logistic regression, there can be 3 or more possible ordered types of
the dependent variable, such as "Low", "Medium", or "High".
Multinomial Logistic Regression
Multinomial Logistic Regression is a classification technique that extends the logistic regression
algorithm to solve multiclass possible outcome problems, given one or more independent variables.
Example for Multinomial Logistic Regression:
(a) Which Flavor of ice cream will a person choose?
Dependent Variable:
 Vanilla
 Chocolate
 Butterscotch
 Black Currant
Independent Variables:
 Gender
 Age
 Occasion
 Happiness
 Etc.
Multinomial Logistic Regression is also known as multiclass logistic regression, softmax regression,
polytomous logistic regression, multinomial logit, maximum entropy (MaxEnt) classifier and
conditional maximum entropy model.


Dependent Variable:
The dependent variable can have two or more possible outcomes/classes.
The dependent variable is nominal in nature, meaning there is no ordering among the target
classes, i.e. these classes cannot be meaningfully ordered.
The dependent variable to be predicted belongs to a limited, predefined set of items.
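In softmax regression, the model produces one score per class and the softmax function turns those scores into probabilities that sum to one. The sketch below shows only that final step for the ice cream example; the scores are invented, whereas in a fitted model each would come from a linear combination of the independent variables.

```python
import math


def softmax(scores):
    """Convert a list of real-valued class scores into probabilities that sum to 1."""
    # Subtracting the maximum score does not change the result but avoids overflow.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


flavours = ["Vanilla", "Chocolate", "Butterscotch", "Black Currant"]
scores = [2.0, 1.0, 0.1, -1.0]   # hypothetical per-class scores for one person

probs = softmax(scores)
predicted = flavours[probs.index(max(probs))]

for flavour, p in zip(flavours, probs):
    print(flavour, round(p, 3))
print("Predicted:", predicted)
```

The class with the highest score always receives the highest probability, and with only two classes the softmax reduces to the sigmoid function used in binomial logistic regression.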
Basic Steps

The basic steps of the SVM (support vector machine) are:

1. select two parallel hyperplanes (lines in 2D) which separate the data with no points between
them (red lines)
2. maximize their distance (the margin)
3. the average line (here the line halfway between the two red lines) will be the decision
boundary

This is conceptually simple, but finding the best margin is a non-trivial optimization problem: it is
easy in 2D, when we have only two attributes, but much harder when we have N dimensions with N a
very big number.
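For reference, the optimization problem is usually written as follows in the hard-margin case, that is, when the two classes can be separated perfectly. The notation is standard textbook notation rather than anything defined in these notes: w is the normal vector of the hyperplanes, b the offset, x_i the training points, y_i ∈ {−1, +1} their class labels, and m the number of training points. The two hyperplanes are w·x + b = +1 and w·x + b = −1, and the distance between them is 2/‖w‖, so maximizing the margin is the same as minimizing ‖w‖.

```latex
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b) \ge 1,\qquad i = 1,\dots,m
```

The constraints say that every training point lies on the correct side of its class's hyperplane, with no points in between. The decision boundary is the hyperplane w·x + b = 0 halfway between the two.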


Reference Textbooks :
• Saikat Dutt, Subramanian Chandramouli, Amit Kumar Das, Machine Learning, Pearson
• Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar, Foundations of Machine Learning,
MIT Press
• Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012
• Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning,
Springer, 2009
• Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques,
Morgan Kaufmann
