L2_ Mathematical Preliminaries

The document outlines mathematical preliminaries essential for data science applications, covering basic mathematics, probability concepts, and random variables. It discusses key topics such as vectors, matrices, linear algebra, and different probability approaches including Bayesian and Frequentist methods. Additionally, it explains the significance of random variables, their distributions, and specific types like binomial and continuous random variables.

Uploaded by

carolnjeri0123


ICT583 Data Science Applications

TOPIC 2: Mathematical Preliminaries

Outline
• Basic maths
• Probability
➢ Bayesian vs. Frequentist
➢ compound events
➢ conditional probability
➢ random variables
Basic Mathematics
Basic symbols and terminology
• Vectors: an object with both magnitude and
direction. It is a 1-dimensional array
representing a series of numbers.
• We use index notation to denote an
element of the vector:
Basic symbols and terminology
• Matrix: 2-dimensional representation of
arrays of numbers.
• n x m (n by m) denotes the dimensions of a
matrix: it has n rows and m columns.
• If a matrix has the same number of rows
and columns, it is called a square matrix.
Basic symbols and terminology
• Three offices in different locations, each with
the same three departments: HR,
engineering, and management.
Arithmetic symbols
• The uppercase sigma ∑ symbol is the
standard symbol for summation (repeated addition).
• Whatever is to the right of the sigma symbol
is usually something iterable, meaning that
we can go over it one by one (for example, a
vector).
X = [1, 2, 3, 4, 5]
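As a minimal sketch, the summation of X can be checked in Python. The variable name total is only illustrative.

```python
# The vector from the slide
X = [1, 2, 3, 4, 5]

# Sigma notation: go over X one element at a time and add each x_i
total = sum(x_i for x_i in X)
print(total)  # 15
```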
Arithmetic symbols
• The dot product is an operator like addition and
multiplication. It combines two vectors of the same
length into a single number (a scalar) by multiplying
matching elements and adding the products.

• Let's say we have a vector that represents a customer's

sentiments toward three genres of movies: comedy,
romance, and action. On a scale of 1-5, a customer
loves comedies, hates romantic movies, and is alright
with action movies.

In the order (comedy, romance, action), this vector is (5, 1, 3).
Here, 5 denotes their love for comedies, 1 their hatred of romantic
movies, and 3 the customer's indifference toward action movies.
Arithmetic symbols
• Assume that we have two new movies, one of which is a
romantic comedy and the other is a funny action movie. The
movies would have their own vectors of qualities:

Here, m1 is our romantic comedy and m2 is our funny action
movie.
Let's compute the recommendation score for each movie. For
movie 1,
Arithmetic symbols
• The answer we obtain is 28, but what does this number
mean? On what scale is it?
• The best score anyone can ever get is when all values
are 5: 5(5) + 5(5) + 5(5) = 75.
• The lowest possible score is when all values are 1:
1(1) + 1(1) + 1(1) = 3.
• So a score of 28 sits on a scale from 3 to 75.

Arithmetic symbols
• How about movie 2?

So, between movie 1 and movie 2, we would definitely
recommend movie 2 to our user.
This is, in essence, how most movie prediction engines
work.
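The recommendation scores can be sketched in a few lines of Python. The customer vector (5, 1, 3) follows from the text. The two movie vectors are not reproduced in the text, so the values below are assumptions, chosen only so that movie 1 scores 28 and movie 2 scores higher, as the slides state.

```python
def dot(u, v):
    """Dot product: multiply matching elements and add the products."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

# Customer preferences in the order (comedy, romance, action).
user = [5, 1, 3]

# ASSUMED movie vectors (not given in the text), picked so that m1 scores 28.
m1 = [4, 5, 1]  # romantic comedy
m2 = [5, 1, 5]  # funny action movie

score_m1 = dot(user, m1)  # 5*4 + 1*5 + 3*1 = 28
score_m2 = dot(user, m2)  # 5*5 + 1*1 + 3*5 = 41

# The scale: every value at 5 versus every value at 1.
best = dot([5, 5, 5], [5, 5, 5])   # 75
worst = dot([1, 1, 1], [1, 1, 1])  # 3
```

Under these assumed vectors, movie 2 outranks movie 1, which matches the recommendation in the slides.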
Linear algebra
• It is an area of mathematics that deals with the math of
matrices and vectors.
• Matrix multiplication

• Movie recommendation example: Recall the user's movie
genre preferences of comedy, romance, and action.
• Now suppose we have 10,000 movies, all with a rating for
these three categories. Can you visualize the matrix
multiplication? A 10,000 x 3 matrix of movie ratings times
a 3 x 1 preference vector gives 10,000 scores, one per movie.
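A small sketch of that matrix-vector product, using three movies instead of 10,000. The ratings matrix is hypothetical, and mat_vec is an illustrative helper rather than a library function.

```python
def mat_vec(matrix, vector):
    """Multiply an (n x m) matrix by a length-m vector, giving n scores."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

user = [5, 1, 3]  # (comedy, romance, action) preferences

# HYPOTHETICAL ratings: one row per movie, one column per genre.
movies = [
    [4, 5, 1],
    [5, 1, 5],
    [1, 1, 5],
]

scores = mat_vec(movies, user)  # one dot product per row
best_movie = max(range(len(scores)), key=scores.__getitem__)
print(scores, best_movie)
```

With 10,000 rows the code is unchanged. Only the number of rows in the matrix grows.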
Logarithms/exponents
• An exponent tells you how many times to
multiply a number by itself.
• A logarithm is the number that answers the question
"what exponent gets me from the base to this other
number?"
Logarithms/exponents
• The exponent form and the logarithm form say the same
thing: \( b^x = y \) is equivalent to \( \log_b y = x \).

• Examples:
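A few illustrative pairs, sketched with Python's standard math module. The specific numbers are assumptions chosen for the illustration, not the examples from the original slide.

```python
import math

# Exponent form: 2 multiplied by itself 4 times gives 16.
power = 2 ** 4  # 16

# Logarithm form: which exponent takes base 2 to 16?
exponent = math.log2(16)  # 4.0

# The same pairing with base 10 and with base e.
log10_of_1000 = math.log10(1000)       # close to 3
ln_of_e_cubed = math.log(math.e ** 3)  # close to 3
```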
Logarithms/exponents
• Example: the number e is approximately 2.718 and has many practical
applications. A very common one is calculating interest on savings.
Suppose you have $5,000 deposited in a bank with continuously
compounded interest at a rate of 3%. Then the following
formula models the growth of your deposit: \( A = Pe^{rt} \)

A denotes the final amount
P denotes the principal investment (5000)
e denotes a constant (approximately 2.718)
r denotes the rate of growth (0.03)
t denotes the time (in years)
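The continuous-compounding formula A = P e^(rt) can be sketched as follows. The principal and rate come from the example. The time horizons and the function name continuous_growth are assumptions for illustration.

```python
import math

def continuous_growth(principal, rate, years):
    """Final amount A = P * e^(r * t) under continuous compounding."""
    return principal * math.exp(rate * years)

P = 5000  # principal from the example
r = 0.03  # 3% rate from the example

for t in (1, 5, 10):  # ASSUMED time horizons, in years
    print(t, round(continuous_growth(P, r, t), 2))
```

After 10 years the deposit has grown to about $6,749.29.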
Introduction to probability
Basic definitions
• Procedure: A procedure is an act that leads to a result,
for example, throwing a die or visiting a website.
• Event: A collection of the outcomes of a procedure,
such as getting a head on a coin flip or leaving a
website after only 4 seconds.
• Sample space: the set of all possible simple events
of a procedure.
For example, an experiment is performed in which a
coin is flipped three times in succession. What is the
size of the sample space for this experiment?
The answer is eight. The result could be any one of the
possibilities in the following sample space: {HHH, HHT, HTT,
HTH, TTT, TTH, THH, THT}.
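The three-flip sample space can be enumerated directly. This is a small sketch using itertools.product. The event "exactly two heads" is an added illustration, not part of the original example.

```python
from itertools import product

# Each flip is H or T; three flips in succession give 2 * 2 * 2 outcomes.
sample_space = ["".join(flips) for flips in product("HT", repeat=3)]
print(len(sample_space))  # 8

# An event is a collection of outcomes, e.g. "exactly two heads".
two_heads = [outcome for outcome in sample_space if outcome.count("H") == 2]
```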
Probability
• The probability of an event represents the
frequency, or chance, or the likelihood that
the event will happen.
• If A is an event, P(A) is the probability of the
occurrence of the event.
• Actual probability of an event A

• The maximum probability of any event is 1.
Frequentist approach
In a Frequentist approach, the probability of an
event is calculated through experimentation. It
uses the past in order to predict the future chance
of an event.

Core idea: the relative frequency of an event is how
often the event occurs divided by the total number
of observations.
Frequentist approach
Example: You are interested in ascertaining how often a
person who visits your website is likely to return on a later
date (rate of the repeat visitors).
Event A: a visitor coming back to the site.
Using the Frequentist approach, we can take the visitor logs
and calculate the relative frequency of event A. Suppose
we had 1,458 unique visitors in the past week, of whom 452 were
repeat visitors. We can calculate this as:
\( P(A) = 452 / 1458 \approx 0.31 \)

The law of large numbers: If we repeat a procedure over
and over, the relative frequency will approach
the actual probability.
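A rough simulation of the law of large numbers, assuming a fair coin whose actual probability of heads is 0.5. The seed and flip counts are arbitrary choices, and the last line simply recomputes the repeat-visitor relative frequency from the example.

```python
import random

def relative_frequency(n_flips, rng):
    """Fraction of heads observed in n_flips simulated fair-coin flips."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n, rng))

# Relative frequency from the visitor-log example: 452 of 1,458 visitors.
repeat_rate = 452 / 1458
```

The printed frequencies should drift toward 0.5 as n grows, although any single small run can land well away from it.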
Compound Events

• A compound event is an event that
combines two or more simple events.
• Given events A and B: the probability that both
A and B occur is P(A ∩ B); the probability that either A or B
(or both) occurs is P(A ∪ B).
• Example: our universe is 100 people who
showed up for an experiment in which a
new test for cancer is being developed
• The red circle, A, represents 25 people who
actually have cancer.
• The circle B contains people for whom the
test was positive (it claimed that they had
cancer) - 30 people
• A ∩ B are people for whom the test claimed
they were positive for cancer (B), and who
actually do have cancer (A) - 20 people
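The counts in this example can be modeled with Python sets. The participant labels 0..99 are arbitrary assumptions, arranged only so that the two sets overlap in exactly 20 people.

```python
from fractions import Fraction

universe = set(range(100))  # 100 participants, labeled 0..99 (arbitrary labels)
A = set(range(0, 25))       # 25 people who actually have cancer
B = set(range(5, 35))       # 30 people who tested positive

both = A & B    # A ∩ B: have cancer AND tested positive
either = A | B  # A ∪ B: have cancer OR tested positive (or both)

def prob(event):
    """Probability of an event when every person is equally likely to be picked."""
    return Fraction(len(event), len(universe))

p_both = prob(both)      # 20/100
p_either = prob(either)  # 35/100, equal to P(A) + P(B) - P(A ∩ B)
```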
Conditional Probability

Example: Let's pick an arbitrary person from the
study of 100 people. You are told that their test
result was positive. What is the probability of
them actually having cancer?
So, we are told that event B has already taken
place. The question is no longer P(A), but the
probability that they have cancer given that B occurred.
This is called the conditional probability of A
given B, or P(A|B).
Conditional Probability

The conditional probability P(A|B) is defined:
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} \]
Conditional probabilities get interesting only
when events are not independent, otherwise:
\[ P(A \mid B) = P(A) \]
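Applying the standard definition P(A|B) = P(A ∩ B) / P(B) to the cancer-test numbers gives the following sketch. Exact fractions are used to avoid rounding, and the variable names are illustrative.

```python
from fractions import Fraction

p_A = Fraction(25, 100)        # has cancer
p_B = Fraction(30, 100)        # tested positive
p_A_and_B = Fraction(20, 100)  # has cancer and tested positive

# Definition: P(A|B) = P(A ∩ B) / P(B)
p_A_given_B = p_A_and_B / p_B
print(p_A_given_B)  # 2/3
```

So a person with a positive result has a 2/3 chance of having cancer, much higher than the unconditional P(A) = 1/4.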
Compound Events and Independence

To calculate P(A ∩ B) = P(A and B), we use the
following formula:
P(A ∩ B) = P(A and B) = P(A)P(B|A)
If events A and B are independent, then P(B|A) = P(B), so:
P(A ∩ B) = P(A)P(B)
Independence (which implies zero correlation) is good for
simplifying calculations but bad for prediction.
➢ Correlations are the driving force behind
predictive models.
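As a quick check on the cancer-test numbers, the sketch below verifies the general product rule and tests whether the independence shortcut P(A ∩ B) = P(A)P(B) would hold. The variable names are illustrative.

```python
from fractions import Fraction

p_A = Fraction(25, 100)        # has cancer
p_B = Fraction(30, 100)        # tested positive
p_A_and_B = Fraction(20, 100)  # both

# General rule: P(A ∩ B) = P(A) * P(B|A)
p_B_given_A = p_A_and_B / p_A  # 4/5
assert p_A * p_B_given_A == p_A_and_B

# Independence would require P(A ∩ B) = P(A) * P(B).
independent = (p_A_and_B == p_A * p_B)  # False: 1/5 versus 3/40
```

The events are strongly dependent, which is exactly what makes the test result useful for predicting cancer.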
Bayesian ideas

• Three things and how they
interact with each other:
✓ a prior distribution
✓ a posterior distribution (what we are trying
to find)
✓ a likelihood
• Another way to say it: data shapes and
updates our belief
Bayes' Theorem

Let's try thinking about Bayes using the terms
hypothesis and data. Suppose H = your hypothesis
about the given data and D = the data that you are
given.
Bayes can be interpreted as trying to figure out
P(H|D) (the probability that our hypothesis is
correct, given the data at hand).
Bayes' Theorem

\[ P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)} \]
P(H) is the probability of the hypothesis before we
observe the data, called the prior probability, or just
prior
P(H|D) is what we want to compute, the probability of
the hypothesis after we observe the data, called the
posterior
P(D|H) is the probability of the data under the given
hypothesis, called the likelihood
P(D) is the probability of the data under any hypothesis,
called the normalizing constant
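A minimal sketch of Bayes' theorem, P(H|D) = P(D|H) P(H) / P(D), reusing the cancer-test numbers from the earlier example. It assumes H = "the person has cancer" and D = "the test is positive". The function name posterior is illustrative.

```python
def posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(H|D) = P(D|H) * P(H) / P(D)."""
    if evidence <= 0:
        raise ValueError("P(D) must be positive")
    return likelihood * prior / evidence

prior = 25 / 100      # P(H): 25 of 100 people have cancer
likelihood = 20 / 25  # P(D|H): positives among those who have cancer
evidence = 30 / 100   # P(D): 30 of 100 people tested positive

p_h_given_d = posterior(prior, likelihood, evidence)
print(round(p_h_given_d, 4))  # 0.6667
```

This agrees with computing the conditional probability directly as 20 / 30.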
Bayes' Theorem

It is not only a powerful tool in the field of
probability, but is also widely used in
machine learning. Examples include maximum a posteriori (MAP)
estimation, a probabilistic framework for fitting a model to a
training dataset, and classification models such as Naive Bayes.
Random variables
Distributions of Random Variables

Random variables are numerical functions whose
values come with probabilities.
Probability density functions (pdfs) represent
RVs, essentially as histograms.
Here V is the sum of two dice.
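The distribution of V can be built by enumerating the 36 equally likely outcomes of two fair dice, as in this sketch. The text histogram is only a rough stand-in for the plot on the slide.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (die1, die2) outcomes, tallied by their sum V.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {v: Fraction(n, 36) for v, n in sorted(counts.items())}

for v, p in pmf.items():
    print(v, p, "#" * counts[v])  # a crude text histogram
```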
Probability/Cumulative Distributions

• The cdf is the running sum of the pdf:
\[ P(X \le k) = \sum_{x \le k} P(X = x) \]
The pdf and cdf contain
exactly the same
information, one being
the integral/derivative
of the other.
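For the two-dice sum, the running-sum relationship can be sketched with itertools.accumulate. The closed-form count 6 - |v - 7| for each sum v is a known shortcut for two fair dice and is used here as an assumption for brevity.

```python
from fractions import Fraction
from itertools import accumulate

# Probabilities for the sum of two dice, V = 2..12.
values = list(range(2, 13))
pmf = [Fraction(6 - abs(v - 7), 36) for v in values]

# cdf: running sum of the pmf, so cdf[i] = P(V <= values[i]).
cdf = list(accumulate(pmf))

# Going back: successive differences of the cdf recover the pmf.
recovered = [cdf[0]] + [b - a for a, b in zip(cdf, cdf[1:])]
```

Because each list can be rebuilt from the other, the two carry exactly the same information.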
Discrete random variables
• A discrete random variable only takes on a
countable number of possible values, such as the
outcome of a die roll.
Discrete random variables
Properties:
• Expected value (mean) of a random variable:
the mean value of a long run of repeated
samples of the random variable.
\[ E[X] = \mu_x = \sum_i x_i \, p_i \]
Discrete random variables
Properties:
• Variance of a random variable represents the
spread of the variable. It quantifies how far the
variable's values tend to fall from the expected value.
\[ V[X] = \sigma_x^2 = \sum_i (x_i - \mu_x)^2 \, p_i \]
➢ \( \mu_x \) represents the expected value of the variable.
➢ \( \sigma_x \) is the standard deviation, which is defined
simply as the square root of the variance.
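As a worked illustration, the sketch below computes the expected value, variance, and standard deviation of a single fair die roll from their definitions. The choice of a die as the example is an assumption.

```python
from fractions import Fraction
import math

values = [1, 2, 3, 4, 5, 6]   # outcomes of one fair die roll
probs = [Fraction(1, 6)] * 6  # each outcome equally likely

# Expected value: probability-weighted sum of the outcomes.
mean = sum(x * p for x, p in zip(values, probs))  # 7/2

# Variance: probability-weighted squared distance from the mean.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))  # 35/12

std_dev = math.sqrt(variance)  # square root of the variance
```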
Binomial random variables
(discrete)
• Look at a setting in which a single event happens
over and over and we try to count the number of
times the result is positive.
• A binomial setting has four conditions:
➢ The possible outcomes are either success or
failure
➢ The outcome of one trial cannot affect the outcome
of another trial
➢ The number of trials is set in advance (a fixed sample size)
➢ The chance of success of each trial must always
be p
Binomial random variables
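• The PMF for a binomial random variable is as
follows:
\[ P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \]
X counts the number of successes in a binomial
setting. n = the number of trials, p = the chance of
success of each trial, and k = the number of successes
whose probability we want.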
Binomial random variables
• Example: Blood types
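A couple has a 25% (p) chance of having a child
with type O blood. What is the chance that three (X)
of their five (n) kids have type O blood?
\[ P(X = 3) = \binom{5}{3} (0.25)^3 (0.75)^2 \approx 0.088 \]
\[ \text{Expected value} = E[X] = \mu_x = \sum_i x_i \, p_i = 1.25 \]
\[ \text{Variance} = V[X] = \sigma_x^2 = \sum_i (x_i - \mu_x)^2 \, p_i = 0.9375 \]
So, this family can expect to have probably one or two kids with type
O blood.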
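The blood-type question can be checked with a short sketch of the binomial PMF, P(X = k) = C(n, k) p^k (1 - p)^(n - k), using math.comb from the standard library. The function name binomial_pmf is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.25  # five children, 25% chance of type O for each

p_three = binomial_pmf(3, n, p)  # about 0.0879

# The full distribution over 0..5 children with type O blood.
distribution = [binomial_pmf(k, n, p) for k in range(n + 1)]
for k, pk in enumerate(distribution):
    print(k, round(pk, 4))
```

So the chance that exactly three of the five children have type O blood is a little under 9%.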
Binomial random variables
• Binomial random variables have special
calculations for the exact values of the
expected values and variance.
E(X) = np
V(X) = np(1 − p)
• In the previous example, we can use the
formulas to calculate an exact expected value
and variance:
E(X) = 5(0.25) = 1.25
V(X) = 1.25(0.75) = 0.9375
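The shortcut formulas can be compared against the definition-based sums. In this sketch the first case is the blood-type example, the other (n, p) pairs are assumed test cases, and binomial_moments is an illustrative helper.

```python
from math import comb, isclose

def binomial_moments(n, p):
    """Mean and variance computed by summing over the binomial PMF."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    return mean, var

# ASSUMED test cases; the first is the blood-type example.
for n, p in [(5, 0.25), (10, 0.5), (20, 0.1)]:
    mean, var = binomial_moments(n, p)
    assert isclose(mean, n * p)           # E(X) = np
    assert isclose(var, n * p * (1 - p))  # V(X) = np(1 - p)
```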
Continuous random variables

• Unlike a discrete random variable, a continuous
random variable can take on any value in a range,
an uncountably infinite set, not just a countable list of values.
• We call the function that describes the
distribution a probability density function
(PDF) instead of a PMF.
• If X is a continuous random variable, then there
is a function, f(x), such that for any constants a and b:
\[ P(a \le X \le b) = \int_a^b f(x)\, dx \]
• The f(x) function is known as the probability density function
(PDF).
Standard normal distribution

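• The PDF of a normal distribution is as follows:
\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}} \]
μ is the mean of the variable and σ is the standard
deviation. The standard normal distribution is the
special case μ = 0, σ = 1.

[Figure: normal density curve with μ = 5, σ = 5]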
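A sketch of the normal density and of the rule that probabilities of a continuous variable are areas under the PDF. It uses the μ = 5, σ = 5 values from the plot label. The trapezoidal integration, the step count, and the function names are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def prob_between(a, b, mu, sigma, steps=10_000):
    """Approximate P(a <= X <= b) as the area under the PDF (trapezoidal rule)."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    total += sum(normal_pdf(a + i * h, mu, sigma) for i in range(1, steps))
    return total * h

mu, sigma = 5, 5
within_one_sigma = prob_between(mu - sigma, mu + sigma, mu, sigma)  # about 0.68
```

The result reflects the familiar fact that roughly 68% of a normal distribution lies within one standard deviation of its mean.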
