Lecture2 - General Concepts For ML

UET

Since 2004

ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN


VNU-University of Engineering and Technology

INT3405 - Machine Learning


Lecture 2: General Concepts for ML

Hanoi, 09/2024
Recap: Traditional Programming vs Machine Learning

Source: https://images.techhive.com/images/article/2017/05/traditional-programming-vs-machine-learning-100723299-large.jpg

FIT-CS INT3405 - Machine Learning 2


Recap: Machine Learning vs. Deep Learning

Source: https://www.linkedin.com/pulse/lets-understand-difference-between-machine-learning-vs-gauri-bapat


Recap: Machine Learning vs. Deep Learning vs AI

Source: https://www.ibm.com/blogs/systems/ai-machine-learning-and-deep-learning-whats-the-difference/


Recap: Types of Machine Learning

Source: https://medium.com/marketing-and-entrepreneurship/10-companies-using-machine-learning-in-cool-ways-887c25f913c3


Outline
● Statistics - Probability
● Typical Data Distribution
● Typical Measurements
○ Entropy, Cross Entropy
○ Mutual Information
○ Kullback-Leibler Divergence
● Learning Theory



What animal is it?

A cat?
or a tiger?



Game: Toss a coin



Random Variable (RV)
• A random variable (RV) is a variable whose possible values are
numerical outcomes of a random phenomenon.
• Random variables are either discrete (DRV) or continuous (CRV).


Discrete Random Variable (DRV)
• Toss a coin twice and let X be the number of heads.
How many values can X take?

• Roll a die twice and let X be the sum of the two rolls.
How many values can X take?
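
The two questions above can be checked by brute-force enumeration. The following is a minimal sketch, assuming a two-sided coin labelled "H"/"T" and a six-sided die; the variable names are illustrative only.

```python
from itertools import product

# Toss a coin twice; X = number of heads (H = head, T = tail).
coin_values = {outcome.count("H") for outcome in product("HT", repeat=2)}

# Roll a six-sided die twice; X = sum of the two rolls.
dice_values = {a + b for a, b in product(range(1, 7), repeat=2)}

print(sorted(coin_values), len(coin_values))
print(sorted(dice_values), len(dice_values))
```

Collecting the values in a set keeps each possible value of X once, so the set size is the number of distinct values X can take.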



DRV - Probability Density Function (1)
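• A discrete random variable X takes values in $x_1, x_2, \dots, x_n$.
• Probability density function (for a discrete variable this is usually called the probability mass function, PMF):
• Notation: $p_i = f(x_i) = P(X = x_i)$
• Conditions: $f(x_i) \ge 0$ for all $i$, and $\sum_{i=1}^{n} f(x_i) = 1$
• Distribution table:

X      $x_1$      $x_2$      …   $x_{n-1}$      $x_n$
f(x)   $f(x_1)$   $f(x_2)$   …   $f(x_{n-1})$   $f(x_n)$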


DRV – Probability Density Function (2)
Example: Toss two coins; X denotes the number of heads.
There are 4 possible outcomes (H = head, S = tail): SS, SH, HS, HH.

x    Outcomes    P(x)
0    SS          1/4 = .25
1    SH, HS      2/4 = .50
2    HH          1/4 = .25

[Figure: bar chart of the probability P(x) for x = 0, 1, 2]
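
The table for the two-coin example can be reproduced by counting outcomes. This is a small sketch, assuming both coins are fair so that all four outcomes are equally likely; it also checks the two conditions a discrete distribution must satisfy (non-negative values that sum to 1).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product("SH", repeat=2))            # SS, SH, HS, HH
counts = Counter(o.count("H") for o in outcomes)    # number of heads per outcome
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

# Conditions on a discrete distribution.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

for x, p in pmf.items():
    print(x, p, float(p))
```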


Continuous Random Variable (CRV)

Examples: weight and height; time to complete a task


CRV – Probability Density Function (1)
• f(x) is the probability density function of a continuous random
variable X if:
  $f(x) \ge 0$ for all $x$, and $\int_{-\infty}^{\infty} f(x)\,dx = 1$


Cumulative Probability Distribution (1)
• Consider a random variable X; its cumulative probability function
F(x) is defined as follows:
  $F(x) = P(X \le x)$

• The probability that X falls in (a, b] is:
  $P(a < X \le b) = F(b) - F(a)$
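
As an illustration of the cumulative function and the interval rule, the sketch below reuses the two-coin distribution (X = number of heads, fair coins assumed). The helper names `cdf` and `prob_interval` are hypothetical and chosen for this example only.

```python
from fractions import Fraction

# PMF of X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    """F(x) = P(X <= x): sum the PMF over all values not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)

def prob_interval(a, b):
    """P(a < X <= b) = F(b) - F(a)."""
    return cdf(b) - cdf(a)

print(cdf(0), cdf(1), cdf(2))
print(prob_interval(0, 2))
```

Here `prob_interval(0, 2)` covers the values 1 and 2, so it equals P(X = 1) + P(X = 2).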


Cumulative Probability Distribution (2)

4) Probability density function: f(x) = F'(x), wherever the derivative exists


Cumulative Probability Distribution (3)
• Discrete Random Variable: $F(x) = \sum_{x_i \le x} f(x_i)$

• Continuous Random Variable: $F(x) = \int_{-\infty}^{x} f(t)\,dt$


Expected Value of Random Variable
• Discrete Random Variable: $E[X] = \sum_{i=1}^{n} x_i f(x_i)$

• Continuous Random Variable: $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$


Expected Value - Properties
1) $E[C] = C$, C: constant
2) $E[CX] = C \cdot E[X]$
3) $E[X + Y] = E[X] + E[Y]$
4) $E[XY] = E[X] \cdot E[Y]$ if X and Y are independent
5) Given a function h(x), we have
   $E[h(X)] = \sum_{i} h(x_i) f(x_i)$ if X is discrete
   $E[h(X)] = \int_{-\infty}^{\infty} h(x) f(x)\,dx$ if X is continuous
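
The properties of the expected value can be verified numerically on a small discrete example. The sketch below assumes a fair six-sided die (not part of the slides) and a hypothetical helper `expect` that evaluates $E[h(X)] = \sum_x h(x) f(x)$.

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}  # fair six-sided die (assumed)

def expect(pmf, h=lambda x: x):
    """E[h(X)] = sum over x of h(x) * f(x) for a discrete X."""
    return sum(h(x) * p for x, p in pmf.items())

mean = expect(die)
second_moment = expect(die, lambda x: x * x)

# Joint distribution of two independent dice: p(x, y) = p(x) p(y).
two = {(x, y): px * py for x, px in die.items() for y, py in die.items()}

assert expect(die, lambda x: 3 * x) == 3 * mean                 # E[CX] = C E[X]
assert expect(two, lambda xy: xy[0] + xy[1]) == 2 * mean        # E[X + Y]
assert expect(two, lambda xy: xy[0] * xy[1]) == mean * mean     # E[XY], independent

print(mean, second_moment)
```

Exact fractions are used so the equalities hold without floating-point tolerance.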


Variance of Random Variable



Variance - Properties
1) $\mathrm{Var}[C] = 0$, C: constant
2) $\mathrm{Var}[CX] = C^2\,\mathrm{Var}[X]$
   $\mathrm{Var}[X + C] = \mathrm{Var}[X]$
3) $\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y]$ if X and Y are independent.
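
The variance properties can be checked the same way. This sketch assumes the standard definition $\mathrm{Var}[X] = E[(X - E[X])^2]$ and a fair six-sided die; `transform` and `sum_independent` are hypothetical helpers introduced for the example.

```python
from collections import defaultdict
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}  # fair six-sided die (assumed)

def variance(pmf):
    """Var[X] = E[(X - E[X])^2] for a discrete X."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

def transform(pmf, g):
    """PMF of g(X)."""
    out = defaultdict(Fraction)
    for x, p in pmf.items():
        out[g(x)] += p
    return dict(out)

def sum_independent(pmf_x, pmf_y):
    """PMF of X + Y when X and Y are independent."""
    out = defaultdict(Fraction)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] += px * py
    return dict(out)

var_x = variance(die)
assert variance(transform(die, lambda x: 3 * x)) == 9 * var_x   # Var[CX] = C^2 Var[X]
assert variance(transform(die, lambda x: x + 10)) == var_x      # Var[X + C] = Var[X]
assert variance(sum_independent(die, die)) == 2 * var_x         # independent sum
print(var_x)
```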


Three Rules of Probability

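• Sum rule: $p(x) = \sum_{y} p(x, y)$
• Product rule: $p(x, y) = p(y \mid x)\,p(x)$
• Bayes' rule (Bayesian Rule): $p(y \mid x) = \frac{p(x \mid y)\,p(y)}{p(x)}$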


Joint Probability



Marginal Probability
• Discrete random variable: $p(x) = \sum_{y} p(x, y)$

• Continuous random variable: $p(x) = \int p(x, y)\,dy$


Conditional Probability
• Bayes' rule: $p(y \mid x) = \frac{p(x \mid y)\,p(y)}{p(x)}$

• Conditional probability: $p(y \mid x) = \frac{p(x, y)}{p(x)}$, for $p(x) > 0$


Posterior Probability, Prior Probability & Likelihood

In Bayes' rule $p(y \mid x) = \frac{p(x \mid y)\,p(y)}{p(x)}$:
• $p(y \mid x)$ is the posterior probability
• $p(x \mid y)$ is the likelihood
• $p(y)$ is the prior probability
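
To make the roles of prior, likelihood and posterior concrete, here is a minimal sketch for a binary hypothesis. The scenario and all numbers (a diagnostic test with a 1% prior, 95% true-positive rate and 5% false-positive rate) are hypothetical and not taken from the lecture.

```python
def posterior(prior, likelihood, likelihood_if_false):
    """Bayes' rule for a binary hypothesis y given an observation x.

    prior               = p(y)
    likelihood          = p(x | y)
    likelihood_if_false = p(x | not y)
    """
    # Evidence p(x), obtained with the sum and product rules.
    evidence = likelihood * prior + likelihood_if_false * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers, for illustration only.
p = posterior(prior=0.01, likelihood=0.95, likelihood_if_false=0.05)
print(round(p, 3))
```

Even with a fairly accurate test, the posterior stays well below the likelihood because the prior is small.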


Independence & Dependence
• If x and y are independent, then: $p(x, y) = p(x)\,p(y)$

• Bayes' rule then gives: $p(x \mid y) = \frac{p(x, y)}{p(y)} = p(x)$




Typical Data Distributions



Uniform Distribution

P(X) = 1 / total number of possible outcomes



Uniform Distribution – Example

The probability of doomsday being Monday is 1/7

The probability that a digit printed on a 500k VNĐ note is any particular numeral (0 to 9) is 1/10


Bernoulli Distribution



Bernoulli Distribution – Example

The probability that a newborn baby is male or female


Binomial Distribution


Binomial Distribution – Example
The probability of getting X heads in 10 tosses of a coin:

X     P(X)
0     0.000977
1     0.009766
2     0.043945
3     0.117188
4     0.205078
5     0.246094
6     0.205078
7     0.117188
8     0.043945
9     0.009766
10    0.000977
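
The values in the table match a binomial distribution with n = 10 tosses and success probability p = 0.5 (a fair coin is an assumption, inferred from the symmetry of the table). A short sketch that recomputes them from $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$:

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

table = [binomial_pmf(k, 10, 0.5) for k in range(11)]
for k, prob in enumerate(table):
    print(f"{k:2d}  {prob:.6f}")
```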


Categorical Distribution

It is the general form of the Bernoulli distribution,
with more than 2 possible outcomes


Categorical Distribution - Example



Univariate Normal Distribution

Normal distribution = Gaussian distribution



Univariate Normal Distribution - Example

Examples: score distribution of a university entrance exam; IQ distribution


Multivariate Normal Distribution



Exponential Distribution



Exponential Distribution - Example

Distribution of the duration of calls over 15 minutes at a telecom company




Typical Measurement - Entropy
• Measures the 'uncertainty' or 'surprise' in data

• Discrete random variable: $H(X) = -\sum_{x} p(x) \log p(x)$

• Continuous random variable: $H(X) = -\int p(x) \log p(x)\,dx$
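
A small sketch of the discrete entropy, assuming base-2 logarithms (so the result is in bits) and the usual convention that terms with p(x) = 0 contribute nothing. The example distributions are illustrative.

```python
import math

def entropy(probs):
    """H(X) = -sum p(x) log2 p(x), in bits; zero-probability terms are skipped."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

fair = entropy([0.5, 0.5])      # a fair coin: maximal uncertainty for two outcomes
biased = entropy([0.9, 0.1])    # a biased coin is less surprising
certain = entropy([1.0, 0.0])   # a certain outcome carries no uncertainty

print(fair, round(biased, 3), certain)
```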


Typical Measurement – Mutual Information
• Measures the mutual dependence between two random variables:
  $I(X; Y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}$
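
Mutual information can be computed directly from a joint probability table using $I(X; Y) = \sum_{x,y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}$. The sketch below assumes base-2 logarithms and a joint table given as a list of rows; the two example tables are hypothetical.

```python
import math

def mutual_information(joint):
    """I(X; Y) in bits, where joint[i][j] = p(x_i, y_j)."""
    px = [sum(row) for row in joint]            # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]      # marginal of Y (column sums)
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

independent = [[0.25, 0.25], [0.25, 0.25]]   # p(x, y) = p(x) p(y)
identical = [[0.5, 0.0], [0.0, 0.5]]         # Y always equals X

print(mutual_information(independent), mutual_information(identical))
```

Independent variables share no information, while two identical fair bits share exactly one bit.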


Typical Measurement – Cross Entropy
• Measures the difference between two distributions P and Q

• Discrete random variable: $H(P, Q) = -\sum_{x} p(x) \log q(x)$

• Continuous random variable: $H(P, Q) = -\int p(x) \log q(x)\,dx$


Typical Measurement – KL Divergence
• Measures the difference between two distributions P and Q

• Discrete random variable: $D_{KL}(P \| Q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}$

• Continuous random variable: $D_{KL}(P \| Q) = \int p(x) \log \frac{p(x)}{q(x)}\,dx$
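
Cross entropy and KL divergence are linked by the identity $H(P, Q) = H(P) + D_{KL}(P \| Q)$. The sketch below checks this numerically for two discrete distributions; it assumes base-2 logarithms and that q(x) > 0 wherever p(x) > 0. The distributions `p` and `q` are illustrative.

```python
import math

def entropy(p):
    """H(P) = -sum p(x) log2 p(x)."""
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(P, Q) = -sum p(x) log2 q(x); assumes q(x) > 0 wherever p(x) > 0."""
    return sum(-pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """D_KL(P || Q) = sum p(x) log2(p(x) / q(x)); same assumption on q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]

ce = cross_entropy(p, q)
kl = kl_divergence(p, q)
assert abs(ce - (entropy(p) + kl)) < 1e-12   # H(P, Q) = H(P) + D_KL(P || Q)
print(round(ce, 3), round(kl, 3))
```

Note that the KL divergence is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.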


Quiz
1. What is the probability that the total of two dice is greater
than 9, given that the first die shows a 5?

2. Suppose a school's students are 60% boys and 40% girls.
The girls wear trousers or skirts in equal numbers; all
boys wear trousers. An observer sees a (random) student from a
distance; all the observer can see is that this student is wearing
trousers. What is the probability that this student is a girl?

Duc-Trong Le INT3405 - Machine Learning 48


Quiz
3. Suppose we want to predict whether the next day will be rainy or sunny.
Which distribution can we use to model the target?

4. Suppose we want to predict the GPA of a UET student.
Which distribution can be used to model the GPA?

5. What happens to the KL divergence $D_{KL}(P \| Q)$ if the supports of P
and Q do not overlap?




Inductive Learning



The Statistical Learning Framework

Train/Test/Validation



Cross Validation



Generalization Error & Empirical Error


Supervised Learning Workflow



Unsupervised Learning Workflow



Summary
● Statistics - Probability
● Typical Data Distribution
● Typical Measurements
○ Entropy, Cross Entropy
○ Mutual Information
○ Kullback-Leibler Divergence
● Learning Theory



Summary
Reading:
Chapter 2 and Chapter 3 of PML.
For the next lectures: Sections 3.1 to 3.2 of ESL; Section 11.2 of PML.


Homework: Parameter Estimation

Thank you
Email me
[email protected]
