
Unit 5 SP(Notes Questionbank)

Uploaded by

sharanabasavasm1
Copyright
© © All Rights Reserved

Applications of Statistics

Statistical analysis in various fields (business, healthcare, social sciences), Design and interpretation
of experiments, Quality control and process improvement, Introduction to regression analysis.

Contents
Statistical analysis in various fields (business, healthcare, social sciences)
    Statistical Analysis Methods for Business
    Statistical analysis in healthcare
    Statistical analysis in social science
Design and interpretation of experiments
Quality control and process improvement
Introduction to regression analysis
    Why do we use Regression Analysis?
    Linear Regression
Regression Equation
    Regression equations using regression coefficients (actual values of X and Y series)
Question Bank

Statistical analysis in various fields (business, healthcare, social sciences)
Statistical analysis is the process of collecting and analyzing data in order to discern patterns and
trends. It is a method for removing bias from evaluating data by employing numerical analysis. This
technique is useful for collecting the interpretations of research, developing statistical models, and
planning surveys and studies.

Statistical analysis is a scientific tool in AI and ML that helps collect and analyze large amounts of data
to identify common patterns and trends and convert them into meaningful information. In simple
words, statistical analysis is a data analysis tool that helps draw meaningful conclusions from raw and
unstructured data.

Conclusions drawn from statistical analysis facilitate decision-making and help businesses make
predictions on the basis of past trends. It can be defined as the science of collecting and analyzing
data to identify trends and patterns and presenting them. Statistical analysis involves working with
numbers and is used by businesses and other institutions to derive meaningful information from data.

Types of Statistical Analysis

Given below are the 6 types of statistical analysis:

• Descriptive Analysis

Descriptive statistical analysis involves collecting, interpreting, analyzing, and summarizing data to
present them in the form of charts, graphs, and tables. Rather than drawing conclusions, it simply
makes complex data easy to read and understand.

• Inferential Analysis

Inferential statistical analysis focuses on drawing meaningful conclusions on the basis of the data
analyzed. It studies the relationship between different variables or makes predictions for the whole
population.

• Predictive Analysis

Predictive statistical analysis analyzes data to identify past trends and predict future events on the
basis of them. It uses machine learning algorithms, data mining, data modelling, and artificial
intelligence to conduct the statistical analysis of data.

• Prescriptive Analysis

Prescriptive analysis analyzes data and prescribes the best course of action based on the results.
It is a type of statistical analysis that helps you make an informed decision.

• Exploratory Data Analysis

Exploratory analysis is similar to inferential analysis, but the difference is that it involves exploring
unknown data associations. It analyzes the potential relationships within the data.

• Causal Analysis

Causal statistical analysis focuses on determining the cause-and-effect relationship between
different variables within the raw data. In simple words, it determines why something happens and
its effect on other variables. Businesses can use this methodology to determine the reason for a
failure.

Importance of Statistical Analysis

Statistical analysis eliminates unnecessary information and catalogs important data in an
uncomplicated manner, making the monumental work of organizing inputs far more manageable. Once
the data has been collected, statistical analysis may be utilized for a variety of purposes. Some of
them are listed below:

• Statistical analysis aids in summarizing enormous amounts of data into clearly digestible
chunks.

• Statistical analysis aids in the effective design of laboratory, field, and survey
investigations.

• Statistical analysis may help with solid and efficient planning in any subject of study.

• Statistical analysis aids in establishing broad generalizations and forecasting how much of
something will occur under particular conditions.

• Statistical methods, which are effective tools for interpreting numerical data, are applied in
practically every field of study. Statistical approaches have been created and are increasingly
applied in physical and biological sciences, such as genetics.

• Statistical approaches are used in the work of a businessman, a manufacturer, and a
researcher. Statistics departments can be found in banks, insurance businesses, and
government agencies.

• A modern administrator, whether in the public or commercial sector, relies on statistical data
to make correct decisions.

• Politicians can utilize statistics to support and validate their claims while also explaining the
issues they address.

Statistical Analysis Methods for Business

Hypothesis Testing
Hypothesis testing is a statistical method used to substantiate a claim about a population. This is
done by formulating and testing two hypotheses: the null hypothesis and the alternative
hypothesis.
The null hypothesis (denoted by H₀) is a statement about the issue at hand, generally based on
historical data and conventional wisdom. A hypothesis test always starts by assuming the null
hypothesis is true and then testing to see if it can be rejected.
The alternative hypothesis (denoted by H₁) represents the theory or assumption being tested and
is the opposite of the null hypothesis. If the data provide enough evidence to reject the null
hypothesis, then the alternative hypothesis is supported.
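
The procedure described above can be sketched in code. This is only a minimal illustration, assuming a two-sided one-sample z-test with a known population standard deviation; the sample values, the hypothesized mean mu0, the value of sigma, and the 5% significance level are made-up assumptions and do not come from these notes.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test of H0: population mean == mu0."""
    n = len(sample)
    # Standardized distance between the sample mean and the mean claimed by H0.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Probability of seeing a |z| at least this large if H0 were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Reject H0 when the p-value falls below the significance level.
    return z, p_value, p_value < alpha


# Hypothetical daily sales figures; H0 claims the true mean is 50.
sample = [52, 55, 49, 58, 54, 53, 57, 56, 51, 55]
z, p_value, reject_h0 = z_test(sample, mu0=50, sigma=4)
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

A small p-value means the observed data would be unlikely if H₀ were true, which is the sense in which the null hypothesis is rejected in favour of H₁.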

Single Variable Linear Regression

Linear regression analysis is used for two main purposes: to identify and evaluate the relationship
between two variables and to forecast a variable based on its relationship to another one.
In single variable linear regression analysis, the relationship between a dependent variable and an
independent variable is evaluated by identifying the line of best fit.

Multiple Regression
Whereas single variable linear regression analysis studies the relationship between two
variables (a dependent variable and an independent variable), multiple regression
analysis investigates the relationship between a dependent variable and multiple independent
variables.
Forecasting with multiple regression analysis is similar to using single variable linear regression.
However, instead of entering only one value for an independent variable, a value is input for each
independent variable.

Statistical analysis in healthcare

1. Epidemiology

• Tracks the incidence and prevalence of diseases.

• Identifies risk factors and patterns of disease spread.

• Examples: COVID-19 infection rates, cancer prevalence studies.

2. Clinical Trials

• Evaluates the efficacy and safety of new treatments or drugs.

• Involves hypothesis testing, randomization, and control groups.

• Example: Comparing the effectiveness of two medications.

3. Patient Outcomes

• Measures and analyzes patient recovery rates, mortality, and satisfaction.

• Helps tailor treatment plans for better results.

• Example: Analyzing post-surgery recovery rates based on demographic factors.

4. Predictive Analytics

• Uses statistical models to predict future health outcomes.

• Example: Forecasting hospital readmission rates or the likelihood of developing chronic
diseases.

5. Healthcare Operations

• Optimizes hospital workflows, staffing, and resource utilization.

• Example: Analyzing patient admission data to improve scheduling and reduce wait times.

6. Genomic Studies

• Analyzes genetic data to identify correlations with diseases.

• Enables personalized medicine based on genetic predispositions.

• Example: Identifying genes associated with diabetes or heart disease.

Common Statistical Techniques in Healthcare

• Descriptive Statistics: Summarizes data using measures like mean, median, and mode.

• Inferential Statistics: Makes predictions or inferences about a population based on sample
data.

• Regression Analysis: Identifies relationships between variables (e.g., age and disease risk).

• Survival Analysis: Examines time-to-event data, such as patient survival times.

• Machine Learning and AI: Advanced models for analyzing complex datasets.
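
As a small illustration of the descriptive statistics named in the list above, the sketch below summarizes a set of recovery times with Python's standard statistics module. The recovery_days values are invented for the example and are not taken from any study.

```python
from statistics import mean, median, mode, stdev

# Hypothetical post-surgery recovery times in days (illustrative only).
recovery_days = [5, 7, 7, 8, 10, 12, 7]

summary = {
    "mean": mean(recovery_days),      # arithmetic average
    "median": median(recovery_days),  # middle value of the sorted data
    "mode": mode(recovery_days),      # most frequent value
    "stdev": stdev(recovery_days),    # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The mean and median differ here because a few long recoveries pull the average upward, which is why both are usually reported together.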

Statistical analysis in social science

1. Public Policy Development

• Evaluates the impact of policies on populations.

• Example: Analyzing the effects of minimum wage increases on employment rates.

2. Education Research

• Studies factors influencing student performance and learning outcomes.

• Example: Assessing the relationship between class size and academic achievement.

3. Sociology

• Examines social structures, inequalities, and cultural phenomena.

• Example: Investigating income inequality and its correlation with education levels.

4. Psychology

• Analyzes behavioral data to understand mental health, cognition, and emotions.

• Example: Evaluating the effectiveness of therapy techniques using experimental designs.

5. Political Science

• Studies voter behavior, election outcomes, and policy preferences.

• Example: Predicting election results based on demographic data.

6. Market and Consumer Research

• Explores consumer preferences and social influences on purchasing behavior.

• Example: Analyzing the impact of social media marketing on brand loyalty.

Common Statistical Techniques in Social Science

1. Descriptive Statistics:

o Summarizes data using measures like mean, median, standard deviation, and
frequency distributions.

o Example: Demographic statistics summarizing age, gender, and income levels.

2. Inferential Statistics:

o Draws conclusions about populations based on sample data.

o Techniques: t-tests, ANOVA, chi-square tests.

o Example: Determining if a new teaching method improves test scores.

3. Regression Analysis:

o Examines relationships between independent and dependent variables.

o Example: Analyzing how education levels influence income.

4. Factor Analysis:

o Reduces data dimensions to uncover latent variables.

o Example: Identifying underlying factors of job satisfaction.

5. Cluster Analysis:

o Groups individuals or entities with similar characteristics.

o Example: Segmenting populations based on political ideologies.

6. Longitudinal Analysis:

o Studies changes over time.

o Example: Tracking changes in public opinion about climate change.

7. Structural Equation Modeling (SEM):

o Tests complex relationships among variables.

o Example: Exploring the impact of socioeconomic status on educational attainment.

Design and interpretation of experiments

What is design of experiments?

Design of experiments (DOE) is a systematic, efficient method that enables scientists and engineers
to study the relationship between multiple input variables (aka factors) and key output variables (aka
responses). It is a structured approach for collecting data and making discoveries.

When to use DOE?

• To determine whether a factor, or a collection of factors, has an effect on the response.

• To determine whether factors interact in their effect on the response.

• To model the behavior of the response as a function of the factors.

• To optimize the response.

Ronald Fisher first introduced four enduring principles of DOE in 1926: the factorial principle,
randomization, replication and blocking. In the past, generating and analyzing these designs relied
primarily on hand calculation; more recently, practitioners have started using computer-generated
designs for a more effective and efficient DOE.

DOE is useful:

• In driving knowledge of cause and effect between factors.

• To experiment with all factors at the same time.

• To run trials that span the potential experimental region for our factors.

• In enabling us to understand the combined effect of the factors.

Types of Experimental Designs

There are different types of experimental research designs. They are:

• Pre-experimental Research Design

• True-experimental Research Design

• Quasi-Experimental Research Design

Pre-experimental Research Design

The simplest form of experimental research design in Statistics is the pre-experimental research
design. In this method, a group or various groups are kept under observation after some factors are
recognised for the cause and effect. This method is usually conducted in order to understand
whether further investigation is needed for the targeted group. That is why this process is
considered to be cost-effective. This method is classified into three types, namely,

• Static Group Comparison

• One-group Pretest-posttest Experimental Research Design

• One-shot Case Study Experimental Research Design

True-experimental Research Design

This is the most accurate form of experimental research design, as it relies on statistical
analysis to support or reject a hypothesis. It is the method most commonly used in the
physical sciences. True experimental research design is the only method that
establishes the cause-and-effect relationship within the groups. The factors which need to be
satisfied in this method are:

• Random assignment of participants to groups

• A variable that can be manipulated by the researcher

• Control group (participants who are comparable to the experimental group, but to whom the
experimental treatment is not applied)

• Experimental group (research participants to whom the experimental treatment is applied)

Quasi-Experimental Design

A quasi-experimental design is similar to a true experimental design, but there is a difference
between the two.

In a true experimental design, the participants of the groups are randomly assigned. So, every unit has
an equal chance of getting into the experimental group.

In a quasi-experimental design, the participants of the groups are not randomly assigned, typically
because random assignment is not possible. As a result, the researcher cannot draw a firm
cause-and-effect conclusion.
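
The random assignment that distinguishes a true experiment can be sketched as follows. This is a minimal illustration, assuming a simple shuffle-and-split into two equal groups; the function name randomly_assign, the participant labels, and the fixed seed are hypothetical choices made for the example.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)      # a seeded generator makes the split reproducible
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # First half receives the treatment, second half serves as the control group.
    return shuffled[:half], shuffled[half:]


# Hypothetical participant identifiers.
participants = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]
experimental_group, control_group = randomly_assign(participants, seed=42)
print("Experimental:", experimental_group)
print("Control:", control_group)
```

Because the shuffle gives every participant the same chance of landing in either group, differences between the groups after treatment can be attributed to the treatment rather than to how the groups were formed.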

Quality control and process improvement

Statistical methods in quality improvement are defined as the use of collected data and quality
standards to find new ways to improve products and services. They are a formalized body of
techniques characteristically involving attempts to infer the properties of a large collection of data.

Importance of Quality Control and Process Improvement

1. Consistency: Ensures that products and processes meet predefined quality standards.

2. Customer Satisfaction: Maintains and improves product quality to meet or exceed customer
expectations.

3. Cost Reduction: Minimizes waste, defects, and operational inefficiencies.

4. Decision-Making: Provides data-driven insights for improving processes and addressing
quality issues.

Key Concepts in Quality Control

1. Defects and Variability:

o Identifying and minimizing defects in products.

o Understanding and reducing variability in processes.

2. Standards and Specifications:

o Defining acceptable limits or tolerances for product attributes (e.g., weight, size,
performance).

3. Continuous Improvement:

o Implementing iterative changes to processes for better outcomes (e.g., Lean, Six
Sigma).

Applications of Statistical Methods

1. Statistical Process Control (SPC)

• Monitors and controls processes using statistical tools.

• Control Charts:

o Tools to detect process variability over time.

o Types:

    • X-bar and R Charts: For variables data (e.g., temperature, length).

    • P and C Charts: For attribute data (e.g., defects per unit, pass/fail rates).

o Example: Monitoring the diameter of manufactured bolts to ensure it stays within
tolerance.
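
The bolt-diameter example can be sketched as an X-bar and R chart calculation. This is a minimal illustration, assuming samples of five bolts each and the standard control-chart constants for that subgroup size (A2 = 0.577, D3 = 0, D4 = 2.114); the diameter values are invented for the example.

```python
# Standard control-chart constants for subgroups of size 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (LCL, center line, UCL) for the X-bar chart and for the R chart."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)    # center line of the X-bar chart
    mean_range = sum(ranges) / len(ranges)  # center line of the R chart
    xbar_chart = (grand_mean - A2 * mean_range, grand_mean, grand_mean + A2 * mean_range)
    r_chart = (D3 * mean_range, mean_range, D4 * mean_range)
    return xbar_chart, r_chart


# Hypothetical bolt diameters in mm, five bolts per sample.
subgroups = [
    [10.01, 9.98, 10.02, 10.00, 9.99],
    [10.03, 10.00, 9.97, 10.01, 10.02],
    [9.99, 10.01, 10.00, 9.98, 10.02],
]
xbar_chart, r_chart = xbar_r_limits(subgroups)

# Indices of samples whose mean falls outside the X-bar control limits.
out_of_control = [
    i for i, g in enumerate(subgroups)
    if not xbar_chart[0] <= sum(g) / len(g) <= xbar_chart[2]
]
print("X-bar chart (LCL, CL, UCL):", xbar_chart)
print("R chart (LCL, CL, UCL):", r_chart)
print("Out-of-control samples:", out_of_control)
```

In practice the limits are usually estimated from 20 or more samples; three are used here only to keep the example short.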

2. Process Capability Analysis

• Assesses whether a process can consistently produce products within specification limits.

• Key Metrics:

o Cp: Measures potential process capability.

o Cpk: Adjusts Cp for process mean shifts.

• Example: Evaluating a production line to ensure products meet size requirements.
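
The two metrics above are commonly computed as Cp = (USL − LSL) / 6σ and Cpk = min(USL − μ, μ − LSL) / 3σ, where LSL and USL are the lower and upper specification limits. The sketch below applies these formulas; the measurements and the limits 9.90 and 10.10 are hypothetical, and σ is estimated with the sample standard deviation.

```python
from statistics import mean, stdev


def process_capability(measurements, lsl, usl):
    """Return (Cp, Cpk) for measurements against lower/upper specification limits."""
    mu = mean(measurements)
    sigma = stdev(measurements)  # sample standard deviation as the estimate of sigma
    # Cp compares the specification width with the natural process spread (6 sigma).
    cp = (usl - lsl) / (6 * sigma)
    # Cpk uses the distance from the mean to the nearer limit, so it drops
    # when the process mean drifts away from the center of the specification.
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk


# Hypothetical part sizes in mm with specification limits 9.90 to 10.10.
measurements = [9.98, 10.00, 10.02, 10.00, 10.00]
cp, cpk = process_capability(measurements, lsl=9.90, usl=10.10)
print(f"Cp = {cp:.3f}, Cpk = {cpk:.3f}")
```

Cpk can never exceed Cp; the two are equal only when the process mean sits exactly midway between the specification limits, as in this centered example.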

3. Root Cause Analysis

• Identifies underlying causes of defects or quality issues.

• Tools:

o Pareto Charts: Highlights the most common causes of problems.

o Fishbone Diagrams (Ishikawa): Maps potential causes of a defect.
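
The data behind a Pareto chart is simply a list of causes sorted by frequency with a running cumulative percentage. The sketch below builds that table; the defect categories and counts are invented for the example.

```python
from collections import Counter


def pareto_table(defects):
    """Return (cause, count, cumulative percent) rows, most frequent cause first."""
    counts = Counter(defects).most_common()  # sorted by descending count
    total = sum(count for _, count in counts)
    table, running = [], 0
    for cause, count in counts:
        running += count
        table.append((cause, count, 100 * running / total))
    return table


# Hypothetical inspection log: one entry per defective unit.
defects = (["scratch"] * 12 + ["dent"] * 5
           + ["misalignment"] * 2 + ["discoloration"] * 1)
table = pareto_table(defects)
for cause, count, cumulative in table:
    print(f"{cause:<14} {count:>3} {cumulative:6.1f}%")
```

Reading down the cumulative column shows how few causes account for most of the defects, which is what the chart is used to highlight.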

4. Hypothesis Testing

• Tests assumptions about process changes or improvements.

• Example: Determining whether a new material reduces defect rates compared to the current
material.

5. Design of Experiments (DOE)

• Systematically investigates the effects of multiple variables on a process.

• Example: Optimizing baking time and temperature to improve cookie texture.

6. Six Sigma and DMAIC

• Uses statistical tools to improve process performance.

• DMAIC:

o Define: Identify the problem.

o Measure: Collect data and establish baselines.

o Analyze: Identify root causes of defects.

o Improve: Implement solutions.

o Control: Maintain improvements.

7. Reliability Analysis

• Ensures products perform as expected over time.

• Example: Calculating the mean time to failure (MTTF) for electronic components.

Challenges in Quality Control and Process Improvement

1. Data Collection:

o Ensuring accurate and sufficient data for meaningful analysis.

2. Resistance to Change:

o Overcoming organizational resistance to process adjustments.

3. Complex Processes:

o Managing variability in multifaceted or automated systems.

4. Balancing Costs:

o Implementing quality improvements without excessive expenditure.

Introduction to regression analysis

Regression analysis is a statistical method to model the relationship between a dependent (target)
variable and one or more independent (predictor) variables. More specifically,
regression analysis helps us to understand how the value of the dependent variable changes
with respect to an independent variable when the other independent variables are held fixed. It
predicts continuous/real values such as temperature, age, salary, price, etc.

Example: Suppose there is a marketing company A, which runs various advertisements every year and
gets sales from them. The below list shows the advertisements made by the company in the last 5 years
and the corresponding sales:

Now, the company wants to spend $200 on advertisement in the year 2019 and wants to know the
predicted sales for this year. To solve this type of prediction problem in machine
learning, we need regression analysis.

Why do we use Regression Analysis?

Regression analysis helps in the prediction of a continuous variable. There are various scenarios in
the real world where we need future predictions, such as weather conditions, sales predictions,
marketing trends, etc. In such cases we need a technique which can make predictions more
accurately.

Reasons for using Regression analysis:

o Regression estimates the relationship between the target and the independent variable.

o It is used to find the trends in data.

o It helps to predict real/continuous values.

o By performing the regression, we can determine the most important factor, the
least important factor, and how each factor is affecting the other factors.

Linear Regression:
o Linear regression is a statistical regression method which is used for predictive analysis.

o It is one of the simplest regression algorithms and shows the
relationship between continuous variables.

o It is used for solving regression problems in machine learning.

o Linear regression shows the linear relationship between the independent variable (X-axis)
and the dependent variable (Y-axis), hence the name linear regression.

o If there is only one input variable (x), it is called simple linear
regression. If there is more than one input variable, it is
called multiple linear regression.

o The relationship between variables in the linear regression model can be explained using the
below image. Here we are predicting the salary of an employee on the basis of the years of
experience.

o Below is the mathematical equation for linear regression:

Y = aX + b

Here, Y = dependent variable (target variable),
X = independent variable (predictor variable),
a and b are the linear coefficients (a is the slope of the line and b is its intercept).
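
The salary-versus-experience idea mentioned above can be sketched as a least-squares fit of Y = aX + b. This is a minimal illustration; the function name fit_line and the experience and salary figures are hypothetical and chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope a and intercept b in Y = aX + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations divided by sum of squared X-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    b = mean_y - a * mean_x
    return a, b


# Hypothetical data: years of experience and salary (in thousands).
years = [1, 2, 3, 4, 5]
salary = [30, 35, 40, 45, 50]

a, b = fit_line(years, salary)
predicted = a * 6 + b  # forecast for 6 years of experience
print(f"Y = {a:.1f}X + {b:.1f}; predicted salary at 6 years: {predicted:.1f}")
```

With real data the points will not fall exactly on the line, and the same function returns the line that minimizes the sum of squared vertical distances to the points.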

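Regression Equation:

Regression Equations of X on Y

∑x = Na + b∑y
∑xy = a∑y + b∑y²

Regression Equations of Y on X

∑y = Na + b∑x
∑xy = a∑x + b∑x²

In these normal equations, N is the number of observations, a is the constant term (intercept), and
b is the slope of the fitted line. Note that this labelling differs from Y = aX + b, where a multiplies X.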
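
For reference, the normal equations of Y on X follow from the least-squares criterion. The derivation below is a standard sketch; the symbol S for the sum of squared errors is introduced here for illustration and does not appear in the notes.

```latex
% Sum of squared errors for the line y = a + b x over N observations
S(a, b) = \sum_{i=1}^{N} (y_i - a - b x_i)^2

% Setting the partial derivative with respect to a to zero
\frac{\partial S}{\partial a} = -2 \sum_{i=1}^{N} (y_i - a - b x_i) = 0
\;\Longrightarrow\; \sum y = N a + b \sum x

% Setting the partial derivative with respect to b to zero
\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{N} x_i (y_i - a - b x_i) = 0
\;\Longrightarrow\; \sum xy = a \sum x + b \sum x^2
```

Swapping the roles of x and y in the same argument gives the normal equations of X on Y.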

Question: Calculate the regression equation of X on Y from the following data by the method of
least squares:

X    1    2    3    4    5
Y    2    5    3    8    7

Solution:

Note: if the question asks you to solve using the "least squares" method, use the normal
equations.

The normal equations for X on Y are:

∑x = Na + b∑y ............ (1)
∑xy = a∑y + b∑y² ......... (2)

Compute the required sums from the table:

X        Y        XY        Y²
1        2        2         4
2        5        10        25
3        3        9         9
4        8        32        64
5        7        35        49
∑x=15    ∑y=25    ∑xy=88    ∑y²=151

Placing these values (with N = 5) in the equations:

(1) => 15 = 5a + 25b ........... (3)
(2) => 88 = 25a + 151b ......... (4)

Solving (3) and (4): multiplying (3) by 5 gives 75 = 25a + 125b. Subtracting this from (4) gives
13 = 26b, so b = 0.5. Substituting b = 0.5 in (3) gives 15 = 5a + 12.5, so a = 0.5.

The regression equation of X on Y is: x = 0.5 + 0.5y
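
The result of this worked example can be checked numerically. The sketch below builds the normal equations of X on Y from the data and solves the 2×2 system with Cramer's rule; the function name fit_x_on_y is a hypothetical helper introduced for the illustration.

```python
def fit_x_on_y(xs, ys):
    """Solve the normal equations of X on Y for a (constant) and b (slope)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_yy = sum(y * y for y in ys)
    # System to solve (Cramer's rule):
    #   sum_x  = n * a     + b * sum_y
    #   sum_xy = a * sum_y + b * sum_yy
    det = n * sum_yy - sum_y ** 2
    a = (sum_x * sum_yy - sum_y * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b


a, b = fit_x_on_y([1, 2, 3, 4, 5], [2, 5, 3, 8, 7])
print(f"x = {a} + {b}y")  # prints: x = 0.5 + 0.5y
```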

Homework:
Obtain the regression equation of Y on X by the least squares method for the following data. Also
estimate the value of y when x = 10.

X    1    2    3    4    5
Y    9    9    10   12   11

Regression equations using regression coefficients (actual values of X and Y series):

Here x' and y' denote the means of the X and Y series.

X on Y:
(x-x') = bxy(y-y')
bxy is the regression coefficient of X on Y:
bxy = (N∑xy - ∑x.∑y) ÷ (N∑y² - (∑y)²)

Y on X:
(y-y') = byx(x-x')
byx is the regression coefficient of Y on X:
byx = (N∑xy - ∑x.∑y) ÷ (N∑x² - (∑x)²)

Question:
Calculate the regression equations of X on Y and Y on X from the following data using regression
coefficients:

X    1    2    3    4    5
Y    2    5    3    8    7

Solution:

X on Y
(x-x') = bxy(y-y')
bxy is the regression coefficient of X on Y:
bxy = (N∑xy - ∑x.∑y) ÷ (N∑y² - (∑y)²)

X        Y        XY        Y²         X²
1        2        2         4          1
2        5        10        25         4
3        3        9         9          9
4        8        32        64         16
5        7        35        49         25
∑x=15    ∑y=25    ∑xy=88    ∑y²=151    ∑x²=55

x' = (∑x)/N = 15/5 = 3
y' = (∑y)/N = 25/5 = 5
(∑x)² = 15² = 225
(∑y)² = 25² = 625
bxy = (5*88 - 15*25) ÷ (5*151 - 625) = 65 ÷ 130 = 0.5

(x-3) = 0.5*(y-5)
After placing the values, the equation is:
x = 0.5y + 0.5

Y on X
(y-y') = byx(x-x')
byx = (N∑xy - ∑x.∑y) ÷ (N∑x² - (∑x)²)
byx = (5*88 - 15*25) ÷ (5*55 - 225) = 65 ÷ 50 = 1.3

(y-5) = 1.3*(x-3)
y = 1.3x + 1.1
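
Both regression lines from this example can be reproduced with a short script. This is a minimal sketch of the regression-coefficient method; the function name regression_coefficients and the variable names are hypothetical.

```python
def regression_coefficients(xs, ys):
    """Return (bxy, byx, mean of X, mean of Y) for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    # Both coefficients share the same numerator.
    numerator = n * sum_xy - sum_x * sum_y
    bxy = numerator / (n * sum_yy - sum_y ** 2)  # regression coefficient of X on Y
    byx = numerator / (n * sum_xx - sum_x ** 2)  # regression coefficient of Y on X
    return bxy, byx, sum_x / n, sum_y / n


bxy, byx, mean_x, mean_y = regression_coefficients([1, 2, 3, 4, 5], [2, 5, 3, 8, 7])

# (x - x') = bxy (y - y')  rearranges to  x = bxy*y + (x' - bxy*y')
x_constant = mean_x - bxy * mean_y
# (y - y') = byx (x - x')  rearranges to  y = byx*x + (y' - byx*x')
y_constant = mean_y - byx * mean_x

print(f"X on Y: x = {bxy:.1f}y + {x_constant:.1f}")
print(f"Y on X: y = {byx:.1f}x + {y_constant:.1f}")
```

The same function can be applied to the homework data by replacing the two lists.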

Question (H/W):
Calculate the regression equations of X on Y and Y on X from the following data using regression
coefficients:

X    -1    5    3    2    1    1    7    3
Y    -6    1    0    0    1    2    1    5

Question Bank

Sl   Questions                                                                          Marks

1    What is the importance of quality control and process improvement?                 5
2    What is design of experiments?                                                     3
3    What is hypothesis testing?                                                        3
4    What are the types of experimental designs?                                        5
5    Calculate the regression equation of X on Y from the following data by             3
     the method of least squares:

     X    1    2    3    4    5
     Y    3    6    4    9    8
6    Obtain the regression equation of Y on X by the least squares method for the       3
     following data. Also estimate the value of y when x = 10.

     X    1    2    3    4    5
     Y    9    9    10   12   11
7    Simple linear regression vs multiple linear regression?                            3
8    Calculate the regression equations of X on Y and Y on X from the following data    5
     using regression coefficients:

     X    -1    5    3    2    1    1    7    3
     Y    -6    1    0    0    1    2    1    5
9    Calculate the regression equations of X on Y and Y on X from the following data    5
     using regression coefficients:

     X    1    2    3    4    5
     Y    2    5    3    8    7
10   Explain regression and the uses of regression.                                     5
