unit 7 8614

Uploaded by

Saheefah Rahman

EDUCATIONAL STATISTICS
INFERENTIAL STATISTICS: CORRELATION AND REGRESSION
Unit 7
Learning objectives
After completion of the unit, the students will be able to:
1. Briefly explain correlation and the uses of correlation.
2. Write considerations in interpreting correlation.
3. Know which formula is used to calculate the Pearson correlation.
4. Know which formula is used to calculate the Spearman correlation.
5. Understand the term “regression”.
6. Explain the use of regression analysis.
7. Write down the types of regression.
8. Mention the p-value and the interpretation of the p-value.
Correlation
• Correlation is used to test relationships between quantitative variables or categorical variables. In other words, it is a measure of how things are related. The study of how variables are correlated is called correlation analysis.
• A correlation is a relationship between two variables.
• The purpose of using correlation in research is to determine the degree to which a relationship exists between two or more variables.
• Correlation is important in research because several hypotheses are stated in terms of correlation or lack of correlation between two variables, so correlational studies are directly related to such hypotheses.
Characteristics of Relationship That Correlation Measures
• In a positive correlation both variables tend to change in the same direction. When variable X increases, variable Y also increases, and if variable X decreases, variable Y also decreases.
• In a negative correlation the variables do not tend to change in the same direction. They move in opposite directions.
1. The Direction of the Relationship
The correlation measure tells us about the direction of the relationship between the two variables.
Positive correlation
In a positive relationship both variables tend to move in the same direction: if one variable increases, the other also tends to increase.
Negative correlation
In a negative relationship the variables tend to move in opposite directions: if one variable increases, the other tends to decrease, and vice versa.
2. The Form of the Relationship
The form of correlation measures how well the data fit the specific form being considered. A linear correlation measures how well the data points fit on a straight line.
3. The Degree (Strength) of the Relationship
• The degree of relationship is measured by the numerical value of the correlation.
• This value varies from +1.00 to –1.00. A perfect correlation is always identified by a magnitude of 1.00 and indicates a perfect fit: +1.00 indicates a perfect positive correlation and –1.00 indicates a perfect negative correlation.
• A correlation of 0 indicates no correlation or no fit at all.
The Pearson Correlation
• The most commonly used correlation is the Pearson correlation.
• It is also known as the Pearson product-moment correlation.
• It measures the degree and the direction of the linear relationship between two variables.
• It is denoted by r.
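The slides do not write out the formula for r, so the following is only a minimal sketch of the standard product-moment calculation: the sum of products of deviations divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the data (hours studied and test scores for five students) are hypothetical and chosen for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # SP: sum of products of deviations; SS: sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical data, not taken from the slides.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

The sign of r gives the direction of the relationship and its magnitude gives the degree, matching the characteristics listed earlier.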
Using and Interpreting Pearson Correlation
• Prediction
• Validity
• Reliability
• Theory Verification
Prediction
• If two variables are known to be related in some systematic way, it is possible to use one variable to make predictions about the other.
Validity
• Suppose a researcher develops a new test for measuring intelligence.
• The researcher must show that this new test is valid and truly measures what it claims to measure.
• One common technique for demonstrating validity is to use correlation.
Reliability
• Apart from determining validity, correlations are also used to determine reliability.
• A measurement procedure is reliable if it produces stable and consistent measurements.
• This means a reliable measurement procedure will produce the same scores when the same individuals are measured under the same conditions.
Theory Verification
• Many psychological theories make specific predictions about the relationship between two variables.
• A theory may predict a relationship between brain size and learning ability, between the parent's IQ and the child's IQ, etc. In each case, the prediction of the theory could be tested by determining the correlation between the two variables.
The Spearman Correlation
• The most commonly used measure of relationship is the Pearson correlation. It measures the degree of linear relationship between two variables and is used with interval or ratio data. Other measures of correlation have been developed for non-linear relationships and for other types of data.
• One such measure is the Spearman correlation. The Spearman correlation is used in two situations.
• Whereas the Pearson correlation measures the degree of linear relationship between two variables, the Spearman correlation measures the consistency of the relationship.
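As a rough illustration of the difference between linear fit and consistency, the sketch below computes the Spearman correlation in one common way: convert each variable to ranks (tied values share the average rank) and apply the Pearson formula to the ranks. The function names and the data are hypothetical, and the slides themselves do not state which formula they intend.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]
print("Pearson: ", round(pearson_r(x, y), 3))
print("Spearman:", round(spearman_rho(x, y), 3))
```

Because y rises every time x rises, the relationship is perfectly consistent and the Spearman value reaches its maximum, while the Pearson value stays below 1 because the points do not lie on a straight line.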
Regression
• Regression is a statistical method used in finance, investing, and other disciplines that attempts to determine the strength and character of the relationship between one dependent variable (usually denoted by Y) and a series of other variables (known as independent variables).
• Whereas correlation measures the degree of relationship, regression finds the best line that predicts the dependent variable from the independent variable.
• The decision of which variable is called dependent and which is called independent is an important matter in regression, as we will get a different best-fit line if we exchange the two variables, i.e. dependent to independent and independent to dependent.
• The line that best predicts the independent variable from the dependent variable will not be the same as the line that predicts the dependent variable from the independent variable.
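The point about exchanging the two variables can be checked numerically. The sketch below assumes ordinary least squares as the meaning of "best line" (the slides do not name the criterion) and uses hypothetical data; fit_line is an illustrative helper, not a library function.

```python
def fit_line(x, y):
    """Least-squares line predicting y from x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]

a_yx, b_yx = fit_line(x, y)  # Y treated as dependent: y = a_yx + b_yx * x
a_xy, b_xy = fit_line(y, x)  # X treated as dependent: x = a_xy + b_xy * y

# Re-express the second line as y against x so the two slopes are comparable.
slope_xy_on_same_axes = 1 / b_xy

print("slope when predicting Y from X:", round(b_yx, 3))
print("slope of the X-from-Y line, drawn on the same axes:",
      round(slope_xy_on_same_axes, 3))
```

The two slopes agree only when the points fall exactly on a straight line; otherwise the choice of dependent variable changes the fitted line.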
Objectives of Regression Analysis
Regression analysis is used to:
• explain variability in the dependent variable by means of one or more independent variables;
• analyze relationships among variables to answer the question of how much the dependent variable changes with changes in the independent variables;
• forecast or predict the value of the dependent variable based on the values of the independent variables.
The primary objective of regression is to develop a relationship between a response variable and the explanatory variable for the purpose of prediction. Regression assumes that a functional relationship exists between them; where this assumption does not hold, alternative approaches may be superior.
Why do we use Regression Analysis?
• Regression analysis estimates the relationship between two or more variables and is used for forecasting or for finding cause-and-effect relationships between the variables.
• There are multiple benefits of using regression analysis.
• It indicates significant relationships between the independent and dependent variables.
• It indicates the strength of impact of multiple variables on the dependent variable.
Types of Regression
1. Linear Regression. It is one of the most widely known modeling techniques. ...
2. Logistic Regression.
3. Polynomial Regression.
4. Stepwise Regression.
5. Ridge Regression.
6. Lasso Regression.
7. Elastic Net Regression.
1. Linear Regression
• It is the most commonly used type of regression. In this technique the dependent variable is continuous, the independent variable can be continuous or discrete, and the nature of the regression line is linear.
• Linear regression establishes a relationship between a dependent variable (Y) and one or more independent variables (X) using a best-fit straight line (also known as the regression line).
2. Logistic Regression
Logistic regression is used to describe data and to explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval or ratio-level independent variables.
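To make the idea of a binary dependent variable concrete, here is a minimal sketch of logistic regression with a single independent variable, fitted by plain gradient ascent on the log-likelihood. The fitting method, the learning rate, the number of steps and the pass/fail data are all assumptions made for illustration; statistical software normally uses more refined procedures.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, learning_rate=0.1, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        # Observed outcome minus predicted probability for every case.
        errors = [yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(x, y)]
        # Move each coefficient along the gradient of the mean log-likelihood.
        b0 += learning_rate * sum(errors) / n
        b1 += learning_rate * sum(e * xi for e, xi in zip(errors, x)) / n
    return b0, b1


# Hypothetical data: hours studied and whether the student passed (1) or not (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(hours, passed)
for h in (1.0, 3.0, 5.0):
    print(f"estimated P(pass | {h} hours) = {sigmoid(b0 + b1 * h):.2f}")
```

The output is a probability between 0 and 1 for each case rather than a predicted score, which is what distinguishes this model from linear regression.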
3. Polynomial Regression
• It is a form of regression analysis in which the relationship between the independent variable X and the dependent variable Y is modeled as an nth-degree polynomial in X.
• This type of regression fits a non-linear relationship between the values of X and the corresponding values of Y.
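A small sketch may help show what "modeled as an nth-degree polynomial" means in practice. It assumes a least-squares fit obtained from the normal equations, solved with a hand-written Gaussian elimination so that no external library is needed; the helper names and the data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_polynomial(x, y, degree):
    """Least-squares coefficients [b0, b1, ..., bn] of y = b0 + b1*x + ... + bn*x**n."""
    size = degree + 1
    # Normal equations (X'X) b = X'y, where column j of X holds x**j.
    xtx = [[sum(v ** (i + j) for v in x) for j in range(size)]
           for i in range(size)]
    xty = [sum((v ** i) * t for v, t in zip(x, y)) for i in range(size)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated from y = 1 + 2x + 3x^2, so the fit should recover it.
x = [-2, -1, 0, 1, 2, 3]
y = [1 + 2 * v + 3 * v ** 2 for v in x]
coeffs = fit_polynomial(x, y, degree=2)
print([round(c, 6) for c in coeffs])
```

The model is non-linear in X but still linear in the coefficients, which is why the same least-squares machinery used for a straight line applies.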
4. Stepwise Regression
• It is a method of fitting a regression model in which the choice of predictive variables is carried out by an automatic procedure.
• In each step, a variable is considered for addition to or subtraction from the set of explanatory variables based on some pre-specified criteria.
• The general idea behind this procedure is that we build our regression model from a set of predictor variables by entering and removing predictors in our model, in a stepwise manner, until there is no justifiable reason to enter or remove any more.
5. Ridge Regression
• It is a technique for analyzing multiple regression data that suffer from multicollinearity (independent variables are highly correlated).
• When multicollinearity occurs, least squares estimates are unbiased, but their variances are large, so they may be far from the true value.
• By adding a degree of bias to the regression estimates, ridge regression reduces the standard errors.
6. LASSO Regression
• LASSO or lasso stands for Least Absolute Shrinkage and Selection Operator.
• It is a method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the statistical model it produces.
• This type of regression uses shrinkage.
• Shrinkage is where data values are shrunk towards a central point, like the mean; in lasso, the coefficient estimates are shrunk towards zero.
7. Elastic Net Regression
• This type of regression is a hybrid of the lasso and ridge regression techniques.
• It is useful when there are multiple features which are correlated.
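The shrinkage performed by ridge, lasso and elastic net can be seen in the simplest possible setting: one centered predictor, where the penalized slope has a closed form. The sketch below assumes the objective 0.5 * sum of squared errors + l1 * |b| + 0.5 * l2 * b^2. The penalty names l1 and l2, this particular scaling and the data are illustrative assumptions; real uses involve several correlated predictors and software that chooses the penalties.

```python
def soft_threshold(value, threshold):
    """Move value towards zero by threshold, stopping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def penalized_slope(x, y, l1=0.0, l2=0.0):
    """Slope b minimizing 0.5*sum((yc - b*xc)**2) + l1*|b| + 0.5*l2*b**2,
    where xc and yc are x and y centered on their means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    # l1 = l2 = 0 gives least squares; l1 alone gives lasso; l2 alone gives ridge.
    return soft_threshold(sp, l1) / (ss_x + l2)


# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]

ols = penalized_slope(x, y)
ridge = penalized_slope(x, y, l2=10.0)
lasso = penalized_slope(x, y, l1=4.0)
lasso_strong = penalized_slope(x, y, l1=12.0)
elastic = penalized_slope(x, y, l1=4.0, l2=10.0)
print(ols, ridge, lasso, lasso_strong, elastic)
```

Ridge shrinks the slope but never sets it exactly to zero, a large enough lasso penalty removes the variable altogether (variable selection), and elastic net applies both effects at once.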
P-Value
• The p-value is the level of marginal significance within a statistical hypothesis test, representing the probability of occurrence of a given event.
• This value is used as an alternative to rejection points to provide the smallest level of significance at which the null hypothesis would be rejected.
• A p-value is used in hypothesis testing to help the researcher support or reject the null hypothesis.
• It is evidence against the null hypothesis: the smaller the p-value, the stronger the evidence.
• A relatively simple way to interpret p-values is to think of them as representing how likely a result is to occur by chance alone, that is, if the null hypothesis were true.
• For a calculated p-value of .01, we can say that the observed outcome would be expected to occur by chance only 1 time in 100 in repeated tests on different samples of the population.
• Similarly, a p-value of .05 would represent an outcome expected to occur by chance only 5 times out of 100 in repeated tests.
• In the case of a p-value of .01, the researcher is 99% confident of getting similar results if the same test is repeated 100 times.
• Similarly, in the case of a p-value of .05 the researcher is 95% confident, and in the case of a p-value of .001 he is 99.9% confident, of getting similar results if the same test is repeated 100 times and 1000 times respectively.
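The "by chance" reading of a p-value can be illustrated with a permutation test, which is one of several ways to obtain a p-value and is not necessarily the method the unit has in mind. The sketch below pairs the X scores with every possible reordering of the Y scores and counts how often the reordered data give a correlation at least as strong as the observed one. The function names and the data are hypothetical.

```python
import itertools
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)


def permutation_p_value(x, y):
    """Two-sided exact permutation p-value for the Pearson correlation.

    Enumerates every reordering of y, so it is only practical for small samples.
    """
    observed = abs(pearson_r(x, y))
    as_extreme = 0
    total = 0
    for shuffled in itertools.permutations(y):
        total += 1
        # The small tolerance guards against floating-point rounding.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Hypothetical data for six students (6! = 720 reorderings).
x = [1, 2, 3, 4, 5, 6]
y = [55, 52, 61, 72, 70, 80]
p = permutation_p_value(x, y)
print(f"r = {pearson_r(x, y):.3f}, p = {p:.4f}")
```

A small p here means that very few random pairings of the scores produce a correlation as strong as the one observed, which is the sense in which the result is unlikely to be due to chance.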
Assessment
• Q. 1 Briefly explain correlation and the uses of correlation.
• Q. 2 Write down the considerations to keep in mind while interpreting correlation.
• Q. 3 Which formula is used to calculate the Pearson correlation?
• Q. 4 Which formula is used to calculate the Spearman correlation?
• Q. 5 What do you understand by “regression”?
• Q. 6 Why do we use regression analysis?
• Q. 7 Write down the types of regression.
• Q. 8 Write a brief note on the p-value.
