Statistics book of Mcom ch1
Uploaded by mr.subrata825882

Statistics

• Statistics is a body of knowledge that is useful for collecting, organizing, presenting, analyzing,
and interpreting data (collections of any number of related observations) and numerical facts.
• Descriptive statistics deals with the presentation and organization of data. These types of
statistics summarize numerical information.
• Inferential statistics deals with the use of sample data to infer, or reach, general
conclusions about a much larger population.

Data Description and Presentation


• What is Data?
Situation 1
Given: “Kamal is 30 years old and has two children.”
Is it data?
• Subject: Kamal • Age: 30 years • No. of Children: 02 • Gender: ???????
Sources of Data
• After identifying a research problem and selecting the appropriate statistical methodology,
researchers must collect the data that they will then go on to analyze. There are two
sources of data:
• Primary data are data collected specifically for the study in question.
• Secondary data were not originally collected for the specific purpose of the study at hand
but rather for some other purpose.
Population & Sample
• Statistical Population ⇒ A set whose every element is characterized by K characteristics is called a K-variate population, where K is a positive integer.
K = 1 ⇒ the population is univariate.
K = 2 ⇒ the population is bivariate, etc.
Note in this context that this notion of population is not always the same as the notion of
population in ordinary conversation.
• It is the reference set based on which statistical hypotheses are made.
• Usually, a statistical population is a set of measurable quantities.
Population & Sample ……….
We are basically interested in measuring some characteristics of the population. This can be
done in two ways.
(1) Examine every element of the population known as census or complete enumeration.
(2) Some population parameters are inferred by studying only a part of the population known
as sampling.
Population & Sample ……….
Population: The set of all elements being considered.
Sample: The set of items chosen for study. ⇒ A sample is a part of the whole population.
Parameter: Any characteristic of a population is known as a parameter.
Statistic: Any characteristic of a sample is known as a statistic.
Sampling Frame
The complete list of all the members/units of the population from which each sampling unit is selected is known as the sampling frame. It should be free from error.
A perfect sampling frame will contain each unit only once. The basic characteristics of a perfect sampling frame are as follows:
(i) Presence of all legitimate units in the frame improves its accuracy.
(ii) Absence of non-existing units in the frame improves its accuracy.
(iii) The structure of the frame should make it adequate in terms of covering the entire population.
(iv) The units of the frame must be up to date in terms of their content.

Types of Sampling
Probability Sampling (Random Sampling):
The decision whether a particular element is included in the sample or not is governed by chance.
⇓ Each element in the population has an equal probability (>0) of being incorporated in the sample.
Non-Probability Sampling (Non-random Sampling):
The decision whether a particular element is included in the sample or not is NOT governed by chance.
⇓ Each element in the population may not have an equal chance of being incorporated in the sample.

Sampling Methods
Probability Sampling:
(i) Simple Random Sampling: (a) with replacement; (b) without replacement.
(ii) Systematic Sampling.
(iii) Stratified Sampling.
(iv) Cluster Sampling.
Non-probability Sampling:
(i) Convenience Sampling.
(ii) Purposive Sampling.
(iii) Judgment Sampling.
(iv) Quota Sampling.
Data Description
• The systematic gathering of raw information regarding any subject relevant to our decision-making problem.

Nature of Data
Qualitative (Categorical):
• Categorical: Do you practice yoga? (Yes/No)
• Ordinal: How do you rank the TV channels? (Star Sports: Rank 1; Ten Sports: Rank 2)
Numerical (Quantitative):
• Discrete: How many books do you have in your library? (a count, e.g., 56789)
• Continuous: What is your monthly income? (a range of values)

Measurement Scale
Nominal (Order: No; Distance: No; Unique origin: No):
Identifies groups which cannot be ordered. Examples: gender, nationality, place of work.
Ordinal (Order: Yes; Distance: No; Unique origin: No):
Numbers allow ranking but no arithmetic. Example: preferences.
Interval (Order: Yes; Distance: Yes; Unique origin: No):
Intervals between numbers are meaningful. Example: temperature.
Ratio (Order: Yes; Distance: Yes; Unique origin: Yes):
Has an absolute measurement point and a unique zero origin. Used when exact figures on objective factors are desired.

Level of Measurement and Choice of Statistical Measures


Major Determinants of Choice of Statistical Method
a) Number of variables [One or two or more than two]
b) Level of Measurement of Variables [Numerical OR Categorical]
c) Distribution of the variable [Parametric OR Non-parametric]
d) Nature of the Hypothesis [Hypotheses of Association/ Causation OR
Hypotheses of Differences]
e) Dependence and Independence Structure [Independent OR Paired]
f) Sample Size [n<=29 OR n>29]

Statistical Methods for Univariate Data

Statistical Methods for Bi-variate Data

Statistical Methods for Multivariate Data

Syllabus
• Probability and Probability Distributions
• Sampling Distributions and Estimation
• Hypothesis Testing
• Chi-square Tests and Factorial Experiments
• Index Number and Forecasting Techniques
UNIT 1-Probability and Probability Distributions
Content
• Definition of Probability- Unconditional Probability statement
• Conditional Probability statement and its implications
• Bayes’ Theorem and its applications
• Joint Probabilities and its implications
• Mathematical Expectations
• Theoretical Probability Distributions – Binomial, Poisson and Normal (Their
Characteristics and Applications in Business).

Introduction
• The concept of ‘Probability’ is as old as civilization itself; it originated from an age-old malaise, gambling, where it was used to make bets.
• The probability theory was first applied to Gambling and later to other socio-economic
problems.
• Lately, quantitative analysis has become the backbone of statistical application in business
decision making and research.

• Statistics and especially the theory of probability play a vital role in making decisions in
situations, when there is a lack of certainty.

There are 3 basic problems:


(1) To describe the situation, i.e., to specify the set on which probability statements are made;
(2) To define a numerical measure for a probability statement; and
(3) To evaluate numerically the probabilities for a particular event.

Terminology
Experiment 🡺 An act which can be repeated under some given conditions.
Random 🡺 Depending on chance. A Random Experiment is an experiment whose result depends on chance.

Properties:
• It can be repeated physically or conceptually.
• The outcome set, ‘S’, is specified in advance.
• Repetition may not yield the same result.
Outcome / sample space (S)
Random experiment ⇒ Result ⇒ Outcome. Examples:
(1) Tossing a coin ⇒ Result: H or T ⇒ Sample space = {H, T}
(2) Throwing a die ⇒ Result: 1, 2, 3, 4, 5 or 6 ⇒ Sample space = {1, 2, 3, 4, 5, 6}
Trial: Conducting an experiment once is a trial.
Events: A possible outcome or combination of outcomes is termed an event.
Example: Trial: tossing a coin;
Event: occurrence of either H or T.
Events
1) Elementary: cannot be decomposed into further simpler events.
Example: tossing a coin; sample space = {H, T}.
2) Composite: can be decomposed into further simpler events.
Example: throwing a die; sample space = {odd no., even no.}, which decomposes further into {1, 3, 5} and {2, 4, 6}.
• Equally likely events: Two or more events are said to be equally likely if none of them is expected to occur in preference to the others.
• Mutually exclusive events: Two or more events are said to be mutually exclusive if they cannot occur simultaneously.
• Exhaustive events: Two or more events are said to be exhaustive if at least one of them must occur.

Set Theory
S: universal set / sample space / outcome set; φ: null set.
Let Ai ⊆ S, ∀ i = 1, 2, with
A1 = {x | x ∈ A1, A1 ⊆ S}
A2 = {x | x ∈ A2, A2 ⊆ S}
Then, A1 ∪ A2 = {x | x ∈ A1 and/or x ∈ A2}; A1 ∩ A2 = {x | x ∈ A1 and x ∈ A2};
Ᾱ1 = {x | x ∉ A1, A1 ⊆ S}
• Result 1: A1 ∪ Ᾱ1 = S (exhaustive).
• Result 2: A1 ∩ Ᾱ1 = φ (mutually exclusive).
• Result 3 [combining (1) & (2)]: A1 and Ᾱ1 form a disjoint partition of the sample space.
• Disjoint partition of the sample space (for A1, A2): mutually exclusive if A1 ∩ A2 = φ; exhaustive if A1 ∪ A2 = S.
• Result 4: For every A1, A2 ⊆ S: A1 ∩ A2 ⊆ A1 and A1 ∩ A2 ⊆ A2; A1, A2 ⊆ (A1 ∪ A2).
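Results 1 and 2 can be checked with concrete sets. The sketch below (an illustration, not part of the original slides) uses Python's built-in set operations on an assumed universal set of the first ten integers:

```python
# Universal set S and a subset A1, illustrating Results 1 and 2 above.
S = set(range(1, 11))                 # S = {1, 2, ..., 10}
A1 = {x for x in S if x % 2 == 0}     # A1 = the even numbers in S
A1_bar = S - A1                       # complement of A1 with respect to S

print((A1 | A1_bar) == S)             # True: Result 1, A1 ∪ Ᾱ1 = S (exhaustive)
print((A1 & A1_bar) == set())         # True: Result 2, A1 ∩ Ᾱ1 = φ (exclusive)
```

Together the two checks confirm Result 3: A1 and Ᾱ1 form a disjoint partition of S.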

Probability of an event
Let A ⊆ S be any event.

Definition: The probability of occurrence of an event A, denoted by P(A), is defined as a real number satisfying the following conditions:
(i) 0 ≤ P(A) ≤ 1;
(ii) P(A ∪ B) = P(A) + P(B) when A ∩ B = φ;
(iii) P(S) = 1, when S is a sure event.
Approaches to probability
There are three approaches to determine probability. These are:
➔ • Classical Approach.
➔ • Relative Frequency of Occurrence Approach.
➔ • Subjective Probability Approach.

Classical Approach to Probability

Result:
When there is symmetry in the elementary events of an outcome set containing a finite number of elements, the probability of any event A is

P(A) = (Number of elementary events favorable to A) / (Total number of elementary events)
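The classical formula can be sketched in a few lines of Python (an illustration, not part of the original slides), using exact fractions so the symmetry assumption is visible:

```python
from fractions import Fraction

def classical_probability(favorable, total):
    # Classical rule: favorable elementary events / total elementary events,
    # valid only when the elementary events are symmetric (equally likely).
    return Fraction(favorable, total)

# Event A: an even number on a fair die; S = {1, 2, 3, 4, 5, 6}
sample_space = [1, 2, 3, 4, 5, 6]
favorable = [x for x in sample_space if x % 2 == 0]   # A = {2, 4, 6}
print(classical_probability(len(favorable), len(sample_space)))  # 1/2
```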

Method of Relative Frequency


🡺 The classical approach offers no answer to probability when there is a lack of symmetry among the elementary events.
🡺 Against this, the method of relative frequency is an empirical method of estimating the probability of an event.
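The relative-frequency idea can be demonstrated by simulation. The sketch below (an assumed fair-coin experiment, not from the slides) estimates a probability as the proportion of occurrences over many repeated trials:

```python
import random

def relative_frequency(n_trials, experiment):
    # Empirical probability: number of occurrences / number of trials
    hits = sum(1 for _ in range(n_trials) if experiment())
    return hits / n_trials

random.seed(42)  # fixed seed so the run is reproducible
# Simulated fair-coin toss; the estimate approaches 1/2 as trials grow.
estimate = relative_frequency(100_000, lambda: random.random() < 0.5)
print(round(estimate, 2))  # approximately 0.5
```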

Subjective Probability
🡺 Probability is based on the experience and judgment of the person making the estimate.
🡺 It may differ from person to person, depending on one’s perception of the situation and past experience.
Sometimes, subjective probability is also termed the set-theoretic approach to probability.

Rules of Probability
Rule 1: Let A1, A2 ⊆ S. If A1 and A2 are mutually exclusive events, then A1 ∩ A2 = φ
⇒ P(A1 ∪ A2) = P(A1) + P(A2).
Rule 2: For any A1, A2 ⊆ S: P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2).

Rules of Probability…………Some Corollary


Corollary 1: Let A ⊆ S be any event. Then P(Ᾱ) = 1 − P(A).
Corollary 2: Let A1, A2 ⊆ S be any two events. If A1 ⊆ A2, then P(A1) ≤ P(A2).

Conditional Probability
To discuss conditional probability, it is better to start from the concept of unconditional probability.
Under unconditional probability, whenever an event A is considered, an outcome set S with A ⊆ S is already implied.
Hence, it would be more appropriate to call the probability of A the probability of A given S. This may be written as P(A) = P(A/S).
Conditional Probability …………………………………
Definition: For A1, A2 ⊆ S with P(A2) > 0, the conditional probability of A1 given A2 is
P(A1/A2) = P(A1 ∩ A2) / P(A2).

Conditional Probability
Independence of Events:
Two events A1 and A2 are said to be independent iff P(A1 ∩ A2) = P(A1) ⋅ P(A2).
• Result 1: If A1 and A2 are independent events, then Ᾱ1 and Ᾱ2 are independent.
• Result 2: If A1 and A2 are independent events, then Ᾱ1 and A2 are independent.
• Result 3: If A1 and A2 are independent events, then A1 and Ᾱ2 are independent.
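The independence definition and Result 1 can be verified exhaustively on a small symmetric outcome set. The sketch below (an assumed two-dice experiment, not from the slides) checks P(A1 ∩ A2) = P(A1) ⋅ P(A2) with exact fractions:

```python
from fractions import Fraction
from itertools import product

# Outcome set for throwing two fair dice: 36 equally likely pairs
S = list(product(range(1, 7), repeat=2))

def P(event):
    # Classical probability of an event over the symmetric outcome set S
    return Fraction(sum(1 for s in S if event(s)), len(S))

A1 = lambda s: s[0] % 2 == 0          # first die shows an even number
A2 = lambda s: s[1] % 2 == 0          # second die shows an even number

# Independence: P(A1 ∩ A2) = P(A1) · P(A2)
print(P(lambda s: A1(s) and A2(s)) == P(A1) * P(A2))  # True
# Result 1: the complements are independent as well
print(P(lambda s: not A1(s) and not A2(s)) == (1 - P(A1)) * (1 - P(A2)))  # True
```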

Pairwise and Mutual Independence of Events:


Definition: A set of events {A1, A2, …, An} is said to be pairwise independent
if P(Ai ∩ Aj) = P(Ai) ⋅ P(Aj), ∀ i ≠ j = 1, 2, …, n.
On the other hand, a set of events {A1, A2, …, An} is said to be mutually independent if
P(Ai1 ∩ Ai2 ∩ … ∩ Aik) = P(Ai1) ⋅ P(Ai2) ⋯ P(Aik) for every sub-collection {Ai1, …, Aik}, 2 ≤ k ≤ n.

Bayes’ Rule
Lemma (Total Probability): If A1, A2, …, An denote a disjoint partition of the outcome set S and P(Aj) ≠ 0, ∀ j = 1, 2, …, n, then for any event B,

P(B) = Σj P(Aj) ⋅ P(B/Aj), j = 1, 2, …, n.

Bayes’ Rule
Rule: If A1, A2, …, An is a disjoint partition of the outcome set S and B is any event
with P(Aj) ≠ 0 ∀ j = 1, 2, …, n and P(B) ≠ 0, then

P(Ai/B) = P(Ai) ⋅ P(B/Ai) / Σj P(Aj) ⋅ P(B/Aj).
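Bayes’ rule can be sketched directly from the formula. The example below is hypothetical (three machines and their defect rates are assumed numbers, not from the slides); the denominator is the total-probability lemma:

```python
def bayes(priors, likelihoods):
    # Posterior P(A_i | B) for a disjoint partition {A_1, ..., A_n}:
    # prior * likelihood, normalized by P(B) from the total probability lemma.
    total = sum(p * l for p, l in zip(priors, likelihoods))  # P(B)
    return [p * l / total for p, l in zip(priors, likelihoods)]

# Hypothetical example: machines produce 50%, 30%, 20% of output (priors P(A_j));
# their defect rates are P(B | A_j) = 1%, 2%, 3%.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.01, 0.02, 0.03]
posterior = bayes(priors, likelihoods)
print([round(p, 4) for p in posterior])  # posteriors sum to 1
```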

Probability Distribution
• A probability distribution is essentially an extension of the theory of probability.
• To discuss the concept of a probability distribution, we should start by distinguishing a frequency distribution from a probability distribution.
• A probability distribution is essentially an extension of the relative frequency distribution approach.

Probability Distribution
Consider the sample space associated with tossing a coin twice.
The sample space S = {HH, HT, TH, TT}.
Let X be the variable which denotes the number of heads that appear.

No. of Heads (x) | Frequency (f) | Relative frequency / Probability
0 | 1 | 1/4
1 | 2 | 1/2
2 | 1 | 1/4
Total | N = 4 | 100% = 1

Probability Distribution
x    | 0   | 1   | 2   | Total
P(x) | 1/4 | 1/2 | 1/4 | 1

Here, the variable X has a special characteristic: corresponding to each and every value of X, ∃ an associated probability.
⇒ X is known as a random variable, as the occurrence of any value depends on chance.
Definition: A real random variable X is a variable defined on the outcome set of a random experiment such that the probability statement P{X ≤ C} is defined for every real number C.

Outcome set of Random experiment
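The construction above, defining X on the outcome set and tabulating its probabilities, can be sketched as follows (an illustration, not part of the original slides):

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# Outcome set of the random experiment: tossing a coin twice
S = ["".join(t) for t in product("HT", repeat=2)]   # ['HH', 'HT', 'TH', 'TT']

def X(outcome):
    # Random variable: number of heads in the outcome
    return outcome.count("H")

counts = Counter(X(s) for s in S)
pmf = {x: Fraction(c, len(S)) for x, c in sorted(counts.items())}
for x, p in pmf.items():
    print(x, p)   # 0 1/4, then 1 1/2, then 2 1/4
```

The result reproduces the table above: P(X = 0) = 1/4, P(X = 1) = 1/2, P(X = 2) = 1/4.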

Notation
• P {a ≤ x ≤ b} ⇒ Probability that x lies between ‘a’ and ‘b’ (both ‘a’ and ‘b’ inclusive)
• P {a ≤ x <b} ⇒ Probability that x lies between ‘a’ and ‘b’ (only ‘a’ inclusive)
• P {x ≤ c} ⇒ Probability that ‘x’ takes value less than or equal to c
• P {x ≥ d} ⇒ Probability that x takes the value greater than or equal to ‘d’
Mathematical Expectation
Definition: If X is a random variable with probability function P(x), and φ(X) is a function of X which is again a random variable, then the mathematical expectation of φ(X) is defined as:
E{φ(X)} = Σ φ(x) ⋅ P(x), summed over −∞ < x < +∞, if X is discrete.
Case 1: φ(X) = X ⇒ E(X) = Σ x ⋅ P(x) = μ.
Case 2: φ(X) = X² ⇒ E(X²) = Σ x² ⋅ P(x).

Define σ² = Var(X) = E{X − E(X)}² = E(X²) − {E(X)}². So, standard deviation = σ.
Standardized variable Z = {X − E(X)} / s.d.(X)
⇒ E(Z) = 0; Var(Z) = 1.
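Cases 1 and 2 and the variance identity can be sketched on the two-coin-toss pmf from the earlier table (an illustration, not part of the original slides):

```python
from fractions import Fraction

def expectation(pmf, phi=lambda x: x):
    # E[phi(X)] = sum of phi(x) * P(x) over the support of a discrete X
    return sum(phi(x) * p for x, p in pmf.items())

# pmf of X = number of heads in two tosses of a fair coin
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
mu = expectation(pmf)                               # Case 1: E(X)
var = expectation(pmf, lambda x: x**2) - mu**2      # E(X^2) - {E(X)}^2
print(mu, var)  # 1 1/2
```

So E(X) = 1 and Var(X) = 1/2 for this distribution, matching the identity σ² = E(X²) − {E(X)}².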
Discrete Probability Distribution
• Binomial Probability Distribution
• Poisson Probability Distribution
Note: Poisson Probability Distribution is a limiting case of Binomial
Probability Distribution
Binomial Probability Distribution
The binomial distribution describes discrete data resulting from a random experiment known as a Bernoulli process. The binomial distribution is a probability distribution expressing the probability of one set of dichotomous alternatives, i.e., success or failure.
The binomial probability model is appropriate in the following experimental situations. Suppose that a random experiment is such that:
(i) any trial results in a success or a failure;
(ii) there are n repeated trials which are independent;
(iii) the probability of success p in any trial satisfies 0 < p < 1 ⇒ probability of failure = (1 − p).
Binomial Probability Distribution
Let X denote the exact number of successes in n trials.
Then X ~ Bin(n, p).
The corresponding probability function is:
P(X = x) = nCx ⋅ p^x ⋅ (1 − p)^(n − x), x = 0, 1, 2, …, n.

Properties of Binomial Probability Distribution

E(X) = np; Var(X) = np(1 − p); s.d.(X) = √(np(1 − p)).

Binomial Proportion
Define the sample proportion p̂ = X/n. Then E(p̂) = p and Var(p̂) = p(1 − p)/n.

Recurrence Relation under Binomial Model

p(x + 1) = [(n − x)/(x + 1)] ⋅ [p/(1 − p)] ⋅ p(x), x = 0, 1, …, n − 1, with p(0) = (1 − p)^n.

If we know n and p, then p(0) is available; accordingly, we can get p(1), p(2), …, p(n).
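The recurrence can be sketched and checked against the direct pmf (an illustration, not part of the original slides):

```python
from math import comb

def binomial_pmf_recurrence(n, p):
    # Build p(0..n) starting from p(0) = (1-p)^n via
    # p(x+1) = ((n - x)/(x + 1)) * (p/(1 - p)) * p(x)
    probs = [(1 - p) ** n]
    for x in range(n):
        probs.append(probs[-1] * (n - x) / (x + 1) * p / (1 - p))
    return probs

n, p = 8, 0.5
rec = binomial_pmf_recurrence(n, p)
direct = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
print(all(abs(a - b) < 1e-12 for a, b in zip(rec, direct)))  # True
```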


Fitting a Binomial Distribution


Step 1: Determine p.
Step 2: Expand the binomial {p + (1 − p)}^n.
Step 3: Multiply each term of the expanded binomial by N (the total observed frequency) to get the expected frequency in each category.
Example
Eight coins are tossed at a time, 256 times. The number of heads observed at each trial is recorded below. Find the expected frequencies. What are the theoretical values of the mean and s.d.? Also calculate the mean and s.d. of the observed frequencies.
Fitting a Binomial Distribution…………
No. of heads (x) | Observed frequency (f) | Expected frequency (fe)
0 | 02 | 01
1 | 06 | 08
2 | 30 | 28
3 | 52 | 56
4 | 67 | 70
5 | 56 | 56
6 | 32 | 28
7 | 10 | 08
8 | 01 | 01
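The expected-frequency column can be reproduced by the fitting steps above, taking p = 1/2 for a fair coin (an illustration, not part of the original slides):

```python
from math import comb

def fit_binomial(n, p, total_frequency):
    # Expected frequency for each x: N * nCx * p^x * (1-p)^(n-x)
    return [round(total_frequency * comb(n, x) * p**x * (1 - p)**(n - x))
            for x in range(n + 1)]

# 8 coins tossed 256 times, p = 1/2 for a head
expected = fit_binomial(8, 0.5, 256)
print(expected)  # [1, 8, 28, 56, 70, 56, 28, 8, 1]
```

The theoretical mean is np = 4 and the theoretical s.d. is √(np(1 − p)) = √2 ≈ 1.41.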

Poisson Probability Distribution

The Poisson probability law is a limiting case of the Binomial law. Let X be the random variable; its probability function is
P(X = x) = e^(−λ) ⋅ λ^x / x!, x = 0, 1, 2, …

Poisson Probability Distribution

Characteristics
• Nature of variable: discrete.
• Range of variable: x = 0, 1, 2, …, ∞.
• Parameter: λ.
• E(X) = λ; Var(X) = λ; s.d.(X) = √λ.
• Shape of distribution: positively skewed.

Poisson Probability Distribution………………..


Comment
• Under the Poisson law, p → 0 and n → ∞ such that np = λ remains finite, which is why it is known as the law of rare events.
• Moreover, the Poisson law approximates the Binomial law particularly well when n is very large and p is very small. The problem now is which cases are to be considered rare events.
Rule of Thumb
⮚ A satisfactorily good approximation is available under the Binomial law for n ≤ 20, provided np < 5;
⮚ whereas if n > 20 and p ≤ 0.05, it is better to use the Poisson approximation.
Fitting of a Poisson distribution
Step 1: Determine λ such that np = λ (for observed data, λ is estimated by the sample mean, λ = Σf⋅x / Σf).

Step 2: Determine the probabilities of the various values of the random variable X using the p.m.f.

Step 3: Multiply each probability in Step 2 by N (the total frequency) to get the expected frequencies.
Fitting of a Poisson distribution……………

No. of defects (x) | Observed frequency (f) | Expected frequency fe = N ⋅ P(X = x)
0 | 214 | 212.62
1 | 92 | 93.34
2 | 20 | 20.49
3 | 03 | 3.00
4 | 01 | 0.33
Total | N = 330 |
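The table above can be reproduced by the three fitting steps: estimate λ by the sample mean, apply the Poisson p.m.f., and scale by N. A sketch (an illustration, not part of the original slides; small differences from the table come from rounding λ):

```python
from math import exp, factorial

def fit_poisson(observed):
    # Step 1: estimate lambda by the sample mean of the observed counts.
    # Steps 2-3: expected frequency = N * e^(-lam) * lam^x / x!
    N = sum(observed)
    lam = sum(x * f for x, f in enumerate(observed)) / N
    return lam, [N * exp(-lam) * lam**x / factorial(x) for x in range(len(observed))]

observed = [214, 92, 20, 3, 1]           # defect counts from the table above
lam, expected = fit_poisson(observed)
print(round(lam, 4))                     # 0.4394
print([round(e, 2) for e in expected])   # close to 212.62, 93.34, 20.49, 3.00, 0.33
```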

Continuous Probability Distribution


⮚ Among the continuous univariate probability models, the Normal distribution, also known as the Gaussian distribution, is the most versatile of all the continuous probability models.
⮚ It is useful in statistical inference, in characterizing uncertainty in many real-life situations, and in approximating other probability models.
Normal Distribution
Parameters: Only two parameters (μ, σ²) are needed to define the Normal distribution.
Let X be a continuous random variable. Range of X: −∞ < x < +∞.
E(X) = μ, Var(X) = σ².
Unimodal ⇒ bell shaped.
Symmetric ⇒ skewness = 0.
The two tails approach the horizontal axis asymptotically.
Normal Distribution
Let X ~ N(μ, σ²).
Then the probability function is
f(x) = [1 / (σ√(2π))] ⋅ e^(−(x − μ)² / (2σ²)), −∞ < x < +∞.

Standard Normal Distribution

How to Use the Standard Normal Distribution

Let X ~ N(μ, σ²).
So, E(X) = μ and Var(X) = σ².
Define the standardized variate

Z = (X − μ) / σ ~ N(0, 1).

Step 1: If X ~ N(μ, σ²), convert it into Z = (X − μ)/σ ~ N(0, 1).
Step 2: The total area under the Normal curve is 1; moreover, f(z) = f(−z).
Step 3: Identify the required probability from the standard Normal table.
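The three steps can be sketched in Python, using the error function in place of a printed Normal table (an illustration, not part of the original slides):

```python
from math import erf, sqrt

def phi(z):
    # Standard normal CDF via the error function: Phi(z) = (1 + erf(z/sqrt(2))) / 2
    return (1 + erf(z / sqrt(2))) / 2

def normal_prob(a, b, mu, sigma):
    # P(a <= X <= b) for X ~ N(mu, sigma^2), via standardization Z = (X - mu)/sigma
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

print(round(phi(0), 2))                          # 0.5 (symmetry: half the area lies left of the mean)
print(round(normal_prob(-1.96, 1.96, 0, 1), 2))  # 0.95
```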

SUBRATA SUTRADHAR
