Statistics theory( Soyaib)

Uploaded by

alif2201062

Statistics (Theory)

2020:

Binomial Distribution: A random variable X is said to follow binomial distribution if it assumes only non-negative
integer values and its probability mass function is given by

P(X = x) = C(n, x) p^x q^(n - x),  x = 0, 1, 2, ..., n,  where q = 1 - p and C(n, x) = n! / (x! (n - x)!)

It is characterized by two parameters: the number of trials n and the probability of success p on each trial.
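
As a small illustration, the standard binomial probability mass function P(X = x) = C(n, x) p^x (1 - p)^(n - x) can be evaluated directly. This is only a sketch: the function name binomial_pmf and the values n = 5, p = 0.5 are assumptions chosen for the example, not taken from the notes.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Return P(X = x) for X ~ Binomial(n, p)."""
    if not 0 <= x <= n:
        return 0.0  # values outside 0..n have probability zero
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Illustrative values (assumed): 5 trials, success probability 0.5.
n, p = 5, 0.5
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(probabilities)       # one probability for each x = 0..n
print(sum(probabilities))  # the probabilities add up to 1 (up to rounding)
```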

Median: In statistics, the median is the middle value in a set of ordered data. To calculate the median, we first order
the data from least to greatest. If the data set has an odd number of values, the median is the middle value. If the data
set has an even number of values, the median is the average of the two middle values.

Mode: The mode is the most frequent value in a data set. A data set with one mode is unimodal, one with two modes is
bimodal, and one with more than two is multimodal. If no value occurs more often than the others, the data set has no mode.
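
The median and mode rules can be checked with Python's standard statistics module. The data values below are made up for illustration.

```python
from statistics import median, multimode

odd_data = [7, 3, 9, 1, 5]        # sorted: 1, 3, 5, 7, 9
even_data = [7, 3, 9, 1, 5, 11]   # sorted: 1, 3, 5, 7, 9, 11

print(median(odd_data))    # 5, the middle value
print(median(even_data))   # 6.0, the average of 5 and 7

# multimode returns every value that shares the highest frequency.
print(multimode([2, 3, 3, 5, 7]))   # [3]
print(multimode([1, 1, 2, 2, 3]))   # [1, 2]
```

Note that multimode returns all values when every value occurs equally often, so it is up to the reader to interpret that case as "no mode".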

Sample space: A sample space is the set of all possible outcomes of an experiment. For example, the sample space for
flipping a coin is {heads, tails}. The sample space for rolling a die is {1, 2, 3, 4, 5, 6}.
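Ex: Suppose you have a bag containing 3 red balls and 2 blue balls. If only the colour is recorded, the sample space for
drawing a ball is {red, blue}. If the balls are treated as distinct, the sample space is {R1, R2, R3, B1, B2}, where each
outcome is equally likely.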
Conditional probability: Conditional probability is the probability of an event happening given that another event has
already happened. It is denoted by P(A|B), where A is the event that we are interested in and B is the event that has
already happened. It is calculated as P(A|B) = P(A and B) / P(B), provided P(B) > 0.
EX: Suppose we draw a ball from the bag in the previous example (3 red, 2 blue), it is red, and we do not put it back.
What is the probability that the next ball drawn will also be red? Knowing that the first ball was red, one red ball has
been removed, so the bag now holds 2 red balls and 2 blue balls. The probability of drawing a red ball from the
remaining four is 2/4, or 1/2.
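
The conditional probability for the bag of 3 red and 2 blue balls can be checked by listing every possible pair of draws. This sketch assumes the first ball is not returned to the bag; the labels R1, R2, R3, B1, B2 are hypothetical names for the individual balls.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R1", "R2", "R3", "B1", "B2"]  # hypothetical labels for the 5 balls

# Every ordered pair of distinct balls: two draws without replacement.
draws = list(permutations(balls, 2))

# Condition on the first ball being red, then count pairs where the second is red too.
first_red = [d for d in draws if d[0].startswith("R")]
both_red = [d for d in first_red if d[1].startswith("R")]

p_second_red_given_first_red = Fraction(len(both_red), len(first_red))
print(p_second_red_given_first_red)  # 1/2
```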

Event: An event in statistics is a set of outcomes in a sample space, that is, a subset of the sample space. For example,
if the balls in the bag of the previous example are labelled R1, R2, R3, B1, B2, the event "drawing a red ball" is the set {R1, R2, R3}.
Ex: Suppose we roll a die twice and record the results. The sample space for this experiment is {(1, 1), (1, 2), ..., (6, 6)}.
The event "rolling a double" is a set that contains the outcomes {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)}.
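
Because an event is a subset of the sample space, the two-dice example can be written with Python sets. This sketch assumes two fair dice, so that all 36 outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first roll, second roll).
sample_space = set(product(range(1, 7), repeat=2))

# Event "rolling a double": the outcomes where both rolls match.
doubles = {(a, b) for (a, b) in sample_space if a == b}

print(len(sample_space))                           # 36
print(sorted(doubles))                             # (1, 1) through (6, 6)
print(Fraction(len(doubles), len(sample_space)))   # 1/6
```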

Importance of Poisson distribution:

The Poisson distribution is a discrete probability distribution that models the number of events that occur in a given interval of
time or space. It is important because it can be used to model a wide variety of real-world phenomena, including:

• The number of goals scored in a soccer game

• The number of customers arriving at a store in a given hour
• The number of radioactive particles emitted from a source in a given second
• The number of typos in a page of text
• The number of mutations in a DNA sequence

The Poisson distribution is also important because it is mathematically tractable, meaning that it can be easily analyzed and
manipulated. This makes it a valuable tool for scientists and engineers who need to model complex systems.

Here are some specific examples of how the Poisson distribution is used in the real world:

• Insurance companies use the Poisson distribution to model the number of claims they are likely to receive in a given
period of time. This information helps them to set rates and develop risk management strategies.
• Traffic engineers use the Poisson distribution to model the number of cars that will arrive at an intersection in a given
period of time. This information helps them to design traffic signals and other traffic control measures.
• Quality control engineers use the Poisson distribution to model the number of defects in a batch of products. This
information helps them to develop quality control procedures and identify areas where the manufacturing process can
be improved.
• Biologists use the Poisson distribution to model the number of mutations that will occur in a DNA sequence. This
information helps them to study the evolution of life and to develop new diagnostic and therapeutic tools.
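
For reference, the Poisson probability of observing k events when the average rate is λ is P(X = k) = e^(-λ) λ^k / k!. The sketch below evaluates it; the rate of 2.5 goals per game is a made-up value, not a figure from the notes.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 2.5  # assumed average number of goals per game (illustrative only)
for k in range(6):
    print(k, round(poisson_pmf(k, lam), 4))
```

Both the mean and the variance of a Poisson distribution equal λ, which is one reason it is convenient to work with.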

Limitations of Normal distribution:

The normal distribution is a continuous probability distribution that is symmetrical around the mean, with most of the
values concentrated around the center and tails that extend indefinitely in both directions while approaching zero. It is
one of the most important and widely used distributions in statistics.

However, the normal distribution also has some limitations. Here are a few:

• It is not always appropriate for modeling data that is skewed or has outliers. Skewed data means that
the distribution is not symmetrical, with more values on one side of the mean than the other. Outliers are
values that are far from the mean.
• The normal distribution assumes that the data is continuous. This means that the values can take on any
value within a certain range. However, some data is discrete, meaning that the values can only take on certain
values.
• The normal distribution is not always the most accurate model for data. In some cases, other
distributions may be more accurate.

Comment on the following: the mean of a binomial distribution is 3 and the variance is 4
The mean of a binomial distribution is equal to the product of the number of trials and the probability of success on
each trial. The variance of a binomial distribution is equal to the product of the number of trials, the probability of
success on each trial, and the probability of failure on each trial.

Therefore, if the mean of a binomial distribution is 3 and the variance is 4, then we must have:

np = 3
npq = 4

where n is the number of trials, p is the probability of success on each trial and q = 1 - p is the probability of failure
on each trial.

Dividing the second equation by the first gives q = 4/3, which is greater than 1. A probability cannot exceed 1, so no
binomial distribution can have mean 3 and variance 4.

In general, since q < 1 whenever p > 0, the variance npq of a binomial distribution is always less than its mean np.
The given statement is therefore incorrect.
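
Whether a given mean and variance can belong to a binomial distribution can be checked mechanically from np = mean and npq = variance. This is a sketch only; the helper name binomial_parameters is an assumption made for this example.

```python
from fractions import Fraction

def binomial_parameters(mean, variance):
    """Solve np = mean and npq = variance; return (n, p), or None if impossible."""
    mean, variance = Fraction(mean), Fraction(variance)
    if mean <= 0 or variance < 0:
        return None
    q = variance / mean          # q = npq / np
    if not 0 <= q < 1:           # q is a probability, and p = 1 - q must be positive
        return None
    p = 1 - q
    n = mean / p
    if n.denominator != 1:       # the number of trials must be a whole number
        return None
    return int(n), p

print(binomial_parameters(3, 4))                # None, because q would be 4/3
print(binomial_parameters(3, Fraction(3, 4)))   # (4, Fraction(3, 4))
```

The second call shows that n = 4 and p = 3/4 correspond to mean 3 and variance 3/4, not variance 4.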

What is meant by Correlation and Regression?

Correlation is a measure of the linear relationship between two variables. It is calculated using a formula that takes
into account the means, standard deviations, and covariance of the two variables. Correlation can range from -1 to
1, with a value of 0 indicating no linear relationship and a value of 1 indicating a perfect positive linear relationship. A
value of -1 indicates a perfect negative linear relationship.

Regression is a statistical technique that can be used to model the relationship between two variables. It is used to
predict the value of one variable (the dependent variable) based on the value of the other variable (the independent
variable). Regression analysis can be used to identify the factors that contribute to a particular outcome and to make
predictions about future outcomes.
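
The two ideas can be illustrated with Pearson's correlation coefficient and a least-squares straight line. The sketch below uses made-up data, and the function names pearson_r and fit_line are assumptions for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares regression of y on x; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

xs = [1, 2, 3, 4, 5]   # independent variable (illustrative values)
ys = [2, 4, 5, 4, 5]   # dependent variable (illustrative values)

r = pearson_r(xs, ys)
slope, intercept = fit_line(xs, ys)
print(round(r, 4))                           # 0.7746
print(round(slope, 4), round(intercept, 4))  # 0.6 2.2
print(round(slope * 6 + intercept, 4))       # predicted y at x = 6: 5.8
```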
2019:
Mention the use of standard deviation:

Standard deviation is used in a wide variety of fields, including:

• Quality control: Standard deviation can be used to monitor the quality of products and processes. For
example, a manufacturer might use standard deviation to track the variation in the weight of their products.
• Financial analysis: Standard deviation can be used to measure the risk of an investment. For example, an
investor might use standard deviation to compare the risk of two different stocks.
• Scientific research: Standard deviation is used to measure the variability of experimental results. For
example, a scientist might use standard deviation to compare the results of two different treatment groups.

In general, standard deviation can be used to:

• Identify outliers in a data set

• Determine the reliability of a measurement
• Compare the variability of two or more data sets
• Make predictions about future outcomes

Here are some specific examples of how standard deviation is used in the real world:

• A pharmaceutical company uses standard deviation to ensure that the amount of active ingredient in their
drugs is consistent from batch to batch.
• A financial analyst uses standard deviation to determine how risky it is to invest in a particular company.
• A teacher uses standard deviation to assess the performance of their students on a test.
• A scientist uses standard deviation, together with the sample size, to judge whether differences between experimental results are statistically significant.

Coefficient of variation: The coefficient of variation (CV), also known as the relative standard deviation, is a
statistical measure of the dispersion of data points around the mean, relative to the mean. It is defined as the ratio of
the standard deviation to the mean, and is often expressed as a percentage:

CV = standard deviation / mean
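
A minimal sketch of the calculation follows. It assumes the sample standard deviation (divisor n - 1) and uses made-up data; a population standard deviation could be used instead if the data cover the whole population.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """CV = standard deviation / mean, using the sample standard deviation."""
    return stdev(data) / mean(data)

weights_g = [10, 12, 14]       # illustrative values
weights_kg = [100, 120, 140]   # same shape, ten times the scale

print(round(coefficient_of_variation(weights_g), 4))   # 0.1667
print(round(coefficient_of_variation(weights_kg), 4))  # 0.1667
```

The two data sets have different standard deviations but the same CV, which is why the CV is useful for comparing variability across different scales or units.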

What are the essential characteristics of an ideal average?

An ideal average should have the following characteristics:

• Rigidly defined: The definition of the average should be clear and unambiguous.
• Easy to calculate and understand: The average should be easy to calculate and understand, both for
laypeople and for statisticians.
• Based on all items: The average should be based on all of the items in the data set.
• Suitable for further algebraic treatment: The average should be amenable to further algebraic
treatment, such as addition, subtraction, multiplication, and division.
• Stable: The average should be stable, meaning that it should not be unduly affected by small changes
in the data set.
• Resistant to outliers: The average should be resistant to outliers, meaning that it should not be overly
influenced by extreme values in the data set.
• Uniquely defined: The average should be uniquely defined for a given data set, meaning that there
should be only one "correct" average for a given data set.
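
Difference between raw moment and central moment in group data:

• Raw moment: The r-th raw moment of grouped data is taken about the origin or an arbitrary point A. It equals Σ f (x - A)^r / N, where x is the class midpoint, f is the class frequency and N = Σ f.
• Central moment: The r-th central moment is taken about the mean. It equals Σ f (x - mean)^r / N. The first central moment is always 0 and the second central moment is the variance.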

Statistical estimate: A statistical estimate is a value that is calculated from a sample and used to estimate
the value of a population parameter.

Normal Distribution: The normal distribution is a continuous probability distribution that is symmetrical
around the mean, with most of the values concentrated around the center and tails that extend indefinitely in both directions while approaching zero.

Null and Alternative hypothesis:

The null hypothesis and alternative hypothesis are two competing statements about a population. A statistical
test weighs the sample evidence to decide whether the null hypothesis should be rejected in favour of the alternative.

• Null hypothesis (H0): There is no effect or relationship between variables.

• Alternative hypothesis (Ha or H1): There is an effect or relationship between variables.

Here is an example of a null hypothesis and alternative hypothesis:

• Null hypothesis: The average height of men and women is the same.
• Alternative hypothesis: The average height of men is greater than the average height of
women.
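
The height example can be turned into a small one-sided test. This is only a sketch under stated assumptions: the height values are invented, and the test uses a large-sample normal approximation (a z-test), whereas a t-test would normally be preferred for samples this small.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def one_sided_z_test(a, b):
    """Test H0: mean(a) = mean(b) against H1: mean(a) > mean(b).

    Returns (z, p_value) using a normal approximation.
    """
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # area in the upper tail
    return z, p_value

men = [172, 168, 180, 175, 169, 178]     # hypothetical heights in cm
women = [165, 170, 162, 171, 166, 168]   # hypothetical heights in cm

z, p_value = one_sided_z_test(men, women)
print(round(z, 2), round(p_value, 4))
print("reject H0" if p_value < 0.05 else "do not reject H0")
```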
2018:
Explain skewness and kurtosis:

Skewness is a measure of the asymmetry of a distribution.

• Positive skewness occurs when the tail of the distribution extends to the right. Most values lie to the left of the
mean, and the mean is usually greater than the median.
• Negative skewness occurs when the tail of the distribution extends to the left. Most values lie to the right of the
mean, and the mean is usually less than the median.

Kurtosis is a measure of the peakedness and tail weight of a distribution relative to a normal distribution.

• Positive kurtosis (leptokurtic) occurs when the distribution is more peaked than a normal distribution and has
heavier tails. This means that extreme values are more likely than under a normal distribution.
• Negative kurtosis (platykurtic) occurs when the distribution is flatter than a normal distribution and has
lighter tails. This means that extreme values are less likely than under a normal distribution.
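
Both measures can be computed from central moments. The sketch below uses the moment-based definitions, skewness = m3 / m2^1.5 and excess kurtosis = m4 / m2^2 - 3 (so that a normal distribution scores 0 on both); the data sets are made up for illustration.

```python
def central_moment(data, r):
    """r-th central moment: the average of (x - mean)^r."""
    m = sum(data) / len(data)
    return sum((x - m) ** r for x in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 10]   # one large value pulls the tail to the right
left_tailed = [-10, -2, -2, -1, -1]

print(skewness(symmetric))                     # zero (up to rounding): symmetric data
print(skewness(right_tailed) > 0)              # True: positive skew
print(skewness(left_tailed) < 0)               # True: negative skew
print(round(excess_kurtosis(symmetric), 4))    # -1.3: flatter than a normal curve
```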

Explain Normal curve and skew curve:

• Normal curve: A normal curve is a bell-shaped probability distribution that is symmetrical around the mean.
Most of the values are clustered around the mean, and the tails of the curve extend indefinitely in both directions while approaching zero.
• Skew curve: A skew curve is a probability distribution that is not symmetrical around the mean. It can be skewed
to the left or to the right.

Level of confidence and level of significance:

• Level of confidence is the proportion of confidence intervals, built by the same method from repeated samples, that
would contain the true population parameter. It is written as 1 - α; for example, a 95% confidence level corresponds to α = 0.05.
• Level of significance (α) is the probability of rejecting the null hypothesis when it is actually true (a Type I error).
It is fixed before the test, and the null hypothesis is rejected when the p-value is smaller than α.

Difference between simple correlation and rank correlation:

• Simple correlation: Simple correlation measures the linear relationship between two variables. It ranges from -1
to 1, where -1 is a perfect negative correlation, 0 is no correlation, and 1 is a perfect positive correlation.
• Rank correlation: Rank correlation measures the association between two rankings of the same variables. It also
ranges from -1 to 1, where the same interpretations apply.
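
Rank correlation is commonly computed with Spearman's formula, rho = 1 - 6 Σd² / (n(n² - 1)), where d is the difference between the two ranks of each observation. The sketch below assumes there are no tied values; the data are made up for illustration.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]   # y = x cubed: increasing but not linear

print(spearman_rho(x, y))                  # 1.0
print(spearman_rho([1, 2, 3], [3, 2, 1]))  # -1.0
```

Because y always increases with x, the rank correlation is exactly 1 even though the relationship is not a straight line; simple (Pearson) correlation for the same data would be below 1.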

2017:
Histogram: A chart of adjacent bars, one per class interval, that shows the frequency distribution of continuous data.
Ogive curve: A graph of cumulative frequency plotted against the class boundaries of a distribution.
Frequency polygon: A line graph that connects the midpoints of the tops of the bars of a histogram.
Pie chart: A circular graph that shows the proportion of a whole that each category represents.
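
The numbers behind these charts come from a frequency table. The sketch below builds the class frequencies (histogram bar heights), class midpoints (frequency polygon points) and cumulative frequencies (ogive points); the data, class width and function name frequency_table are assumptions for this example.

```python
from itertools import accumulate

def frequency_table(data, start, width, classes):
    """Return (counts, midpoints, cumulative) for equal-width classes.

    Class i covers start + i*width up to (but not including) start + (i+1)*width.
    """
    counts = [0] * classes
    for x in data:
        i = int((x - start) // width)
        if 0 <= i < classes:
            counts[i] += 1
    midpoints = [start + width * (i + 0.5) for i in range(classes)]
    cumulative = list(accumulate(counts))
    return counts, midpoints, cumulative

data = [1, 2, 2, 3, 5, 6, 7, 7, 8, 9]   # illustrative values
counts, midpoints, cumulative = frequency_table(data, start=0, width=5, classes=2)

print(counts)      # [4, 6]      heights of the histogram bars
print(midpoints)   # [2.5, 7.5]  x-positions for the frequency polygon
print(cumulative)  # [4, 10]     heights of the ogive
```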

Limitations of Binomial distribution:

• Applicable to only 2 possible outcomes
• Trials are independent
• Probability of success is the same for each trial

Simple event: A simple event is an event that cannot be broken down into any smaller events.
• Example: Flipping a coin and getting heads is a simple event.

Mutually exclusive event: Two events are mutually exclusive if they cannot both happen at the same time.
• Example: In a single coin flip, the events "heads" and "tails" are mutually exclusive, because both cannot occur on the same flip.

Conditional probability: The probability of event A happening given that event B has already happened.
• Example: The probability of drawing a red card from a deck of cards after a black card has been drawn and not
replaced is a conditional probability. It equals 26/51.

Sample space: The set of all possible outcomes of an experiment.

• Example: The sample space of flipping a coin is {heads, tails}.

Importance of Normal distribution:

The normal distribution is the most important probability distribution in statistics. It is characterized by its bell-shaped
curve, which is symmetrical around the mean and therefore has zero skewness. The normal distribution approximately
describes the distribution of values for many natural phenomena, such as human height, IQ scores, and errors in
measurement. It is applied in many fields:

• Finance: The normal distribution is used to model the prices of stocks and bonds, as well as the risk and return
of investments.
• Quality control: The normal distribution is used to monitor the quality of production processes and to identify
outliers.
• Medicine: The normal distribution is used to design clinical trials and to analyze data from medical studies.
• Science: The normal distribution is used to analyze data from experiments and to make inferences about
populations.

The normal distribution is foundational to many statistical tools and models and has a wide range of applications in
diverse fields due to its bell-shaped curve and symmetry around its mean.
2016:
Write down three properties of i) Negative Binomial distribution ii) Normal distribution

Normal Distribution:

o Continuous probability distribution

o Symmetrical bell-shaped curve
o Mean, median, and mode are all equal

Negative binomial distribution:

o It is a discrete probability distribution

o The probability of success in each trial is constant
o The probability of failure in each trial is constant

Necessary assumptions for the binomial distribution:

1. Each trial has only two possible outcomes.

2. The probability of success is the same for each trial.
3. Each trial is independent.
4. A fixed number of trials is performed.

2015:
Point estimate: A point estimate is a single value that is used to estimate the population parameter.
Interval estimate: An interval estimate is a range of values that is likely to contain the population parameter.
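
The difference between the two kinds of estimate can be seen by computing both for a population mean. This is a sketch under assumptions: the sample values are invented, and the interval uses a normal approximation (z-value), whereas a t-value is usually preferred for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (point_estimate, (lower, upper)) for the population mean."""
    n = len(sample)
    point = mean(sample)                           # point estimate: a single value
    z = NormalDist().inv_cdf(0.5 + confidence / 2) # about 1.96 for 95% confidence
    margin = z * stdev(sample) / sqrt(n)
    return point, (point - margin, point + margin) # interval estimate: a range

sample = [4.9, 5.1, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9]  # hypothetical measurements

point, (lower, upper) = mean_confidence_interval(sample)
print(round(point, 3))
print(round(lower, 3), round(upper, 3))
```

Raising the confidence level, for example to 0.99, widens the interval: more confidence is bought at the cost of less precision.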

Difference between Histogram and Frequency polygon:

• Type of data: Histograms are typically used for continuous data, while frequency polygons can be used for
both continuous and discrete data.
• Visual representation: Histograms use adjacent bars to represent the frequency of each class interval, while
frequency polygons use line segments joining points plotted at the class midpoints.
• Shape of the distribution: Histograms are better for displaying the shape of a single distribution, while frequency
polygons are better for comparing several distributions on the same axes.

What are the different measures of central tendency?

• Mean: The mean is the most commonly used measure of central tendency. It is calculated by adding
up the values of all the data points and then dividing by the number of data points.
• Median: The median is the middle value in a data set that has been ordered from highest to lowest or
lowest to highest.
• Mode: The mode is the most frequent value in a data set.

- 2101106
