
Bivariate Statistics: by The End of This Sub-Topic, Learners Should Be Able To

1) Bivariate statistics analyzes the relationship between two variables (X and Y) using methods like correlation and simple linear regression. 2) Correlation determines the strength and direction of the relationship between variables using Spearman's rank-order correlation for ordinal/interval/ratio data and Pearson's correlation coefficient for interval/ratio data. 3) Simple linear regression finds the best-fitting straight line through data points plotted on a scatter plot to determine the relationship between an independent and dependent variable.

Uploaded by

Prince Mpofu
Copyright
© All Rights Reserved

BIVARIATE STATISTICS

BY THE END OF THIS SUB-TOPIC, LEARNERS SHOULD BE ABLE TO:
1. Outline the methods used to test bivariate statistics.

2. Explain how each method is carried out.

3. Describe geographic data using correlation and simple linear regression methods.

4. Apply bivariate statistics to given geographic phenomena.

Introduction
 Bivariate statistics is used when analysing a data set with two
variables which are usually denoted as X and Y.

 It is used to find out if there is a relationship between two variables.

 For example, when studying weather patterns of Zimbabwe one may
choose to focus on precipitation and temperature distributions only
to see if there is any relationship between the two.

 Thus temperature and precipitation become your two variables.

 Bivariate statistics is also used for two sets of variables that are
dependent on each other such as volume of traffic flow into a
town’s central business district and time.

 After carrying out a bivariate analysis, its findings are recorded in a
two-column data table such as the one below.
 One variable will be an independent variable whilst another will be a
dependent variable.

 An independent variable is a variable whose variation is not
affected by that of the other variable.

 It is used to determine its effects on the dependent variable and is
usually denoted by X.

 A change in it has direct effects on the dependent variable.

 A dependent variable, on the other hand, is the variable whose
response is being analysed during the experiment/investigation.

 It is denoted by Y in a data table.

 Graphs such as the scatter plot and box plot are used to plot
bivariate data.

 The focus in this subtopic will be on correlation and simple linear
regression.

Correlation (Spearman and Pearson’s Correlation)


 This is used to measure relationships between two variables.

 Two methods are used to determine the correlation of two variables,
namely Spearman’s and Pearson’s.

Spearman’s Rank-order Correlation


 It measures both the strength and direction of a relationship
between two variables.

 Spearman’s rank-order correlation is used with ordinal, interval or
ratio data.

 It is represented by ρ or r_s and the following formula is used to
calculate it:

$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
Where n = number of values in the sample
d_i = difference in paired ranks

 The values range from -1 to +1.

 -1 indicates a perfect negative relationship between the variables
whilst 0 shows no relationship.

 A +1 value shows a perfect positive association between the two
variables.

 A positive correlation coefficient occurs when the value of Y
(dependent variable) increases as X (independent variable)
increases; a negative coefficient occurs when Y decreases as X
increases.

 Before calculating Spearman’s rank-order correlation, you must
first rank the values of both variables.

 The values are ranked in a descending order with the highest value
given a rank of 1.
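
The ranking step can be sketched in Python as follows. This is only an illustrative sketch: the function name `rank_descending` and the sample values are made up for this example, and the handling of ties (tied values share the mean of the ranks they occupy) is a common convention assumed here, not something stated in these notes.

```python
def rank_descending(values):
    """Rank values so that the highest value gets rank 1.

    Assumption: tied values share the mean of the ranks they occupy.
    """
    # Indices of the values, ordered from the highest value to the lowest
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions are 0-based, ranks are 1-based
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical values, made up for illustration only
sample = [120, 85, 85, 40, 200]
print(rank_descending(sample))  # [2.0, 3.5, 3.5, 5.0, 1.0]
```

The two tied values of 85 occupy ranks 3 and 4, so each receives 3.5.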

 Example 2.3.1 shows how Spearman’s rank-order correlation is
calculated:

A study was carried out to see if there is a relationship
between income levels and the size of families. The
findings were recorded in Table 2.3.2.

Example 2.3.1
Table 2.3.2

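$$\sum d_i^2 = 16 + 4 + 0 + 4 + 16 = 40$$
Now we put this into the formula.
$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6(40)}{5(5^2 - 1)} = 1 - \frac{240}{120} = 1 - 2 = -1$$
Therefore, there is a perfect negative relationship between the values: the ranks of one variable are the exact reverse of the ranks of the other.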
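
The arithmetic above can be checked with a short Python sketch. The two rank lists below are hypothetical: they are not the values from Table 2.3.2 and were chosen only because they give the same squared differences (16, 4, 0, 4, 16). The function name `spearman_rho` is also just an illustrative choice.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman's rho from two lists of paired ranks: 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(ranks_x)
    # d is the difference between the paired ranks
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Hypothetical ranks (not the data from Table 2.3.2)
ranks_x = [1, 2, 3, 4, 5]
ranks_y = [5, 4, 3, 2, 1]

print(spearman_rho(ranks_x, ranks_y))  # -1.0
```

A result of -1 sits at the perfect negative end of the scale, which is what exactly reversed ranks should produce.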

Pearson’s Correlation Coefficient


 It is also referred to as the Pearson product-moment correlation
coefficient.

 It measures the strength and direction of a linear relationship
between two variables.

 Pearson’s correlation coefficient is used with interval and ratio
data.

 In an equation, it is denoted by r as shown in its formula below:

$$r_{xy} = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$$

Where n = number of values in the sample data
x = values of the independent variable
y = values of the dependent variable
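
As a rough illustration, the formula can be written out term by term in Python. The function name `pearson_r` and the sample lists are assumptions made for this sketch; the data are made up and are not taken from any study in these notes.

```python
import math


def pearson_r(x, y):
    """Pearson's r computed from the raw sums in the formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    # The denominator is zero if either variable never changes,
    # in which case r is undefined and this raises ZeroDivisionError.
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Made-up data in which y is always exactly twice x
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(pearson_r(x, y))  # 1 up to floating-point rounding
```

Because every point in this made-up data lies exactly on a rising straight line, the result is 1 apart from possible rounding in the last decimal place.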

 The correlation coefficient r shows how close the data points lie
to the line of best fit.

 It is therefore linked to the line of best fit drawn through the
scatter points of the two variables.

 Just like Spearman’s, its values range from -1 to 1.

 1 shows a perfect positive linear relationship.

 0 shows that there is no linear relationship between the variables
whilst -1 shows a perfect negative linear relationship.

 It is important to note that the nearer the scatter points are to a
straight line, the higher the strength of the relationship between the
variables.

 The example below shows how Pearson’s correlation coefficient is
calculated from a given data set.

Example
A study into the number of reported outbreaks of typhoid
in Mbare for the past 7 years shows the following data.

Simple Linear Regression


 This is a statistical method used to summarise and study
relationships between two variables.
 In other words, it investigates bivariate relations between variables.

 This model therefore tries to explain the investigated relationship
between two variables using a straight line.

 For example, finding out if there is a relationship between the rain
season and road accidents.

 In a simple linear regression model, one variable is denoted as X
whilst the other is represented by Y.

 X represents the independent variable whilst Y represents the
dependent variable.

 In regression models, the independent variable is called the
predictor whilst the dependent is called the response.

 It is called a simple linear regression because there is only one
independent/predictor variable.

 The values of the variables are plotted on a scatter plot.

 The method then finds the best-fitting straight line through the
plotted points.

 This line is called the regression line.

 This line allows a researcher to see if there is any relationship.
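
One common way to find such a line is the least-squares method, sketched below in Python. These notes do not say which fitting method is intended, so least squares is an assumption here, as are the function name `fit_line` and the made-up data.

```python
def fit_line(x, y):
    """Least-squares straight line y = slope * x + intercept (assumed method)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sum of squared deviations of x
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx  # undefined if every x value is the same
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data for illustration only
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
slope, intercept = fit_line(x, y)
print(slope, intercept)  # 2.0 1.0
```

A positive slope means the response rises as the predictor rises, a negative slope means it falls, and a slope near zero suggests little or no linear relationship.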

Example
 A study was carried out in Harare to determine if there is any
relationship between typhoid outbreaks and the rain season of
every year. The following table shows the findings.
 Use this information to find out if typhoid outbreaks are on
the increase every rain season.

Solution

 From the scatter plot above, there is an increase in typhoid
outbreaks every year.
