Bivariate Data Analysis Olympics Project Lef-2

This document analyzes the relationship between the year of the Olympic Games and the winning distance in women's shot put. It finds a moderate, positive, nonlinear relationship. A linear regression line is not appropriate because distances increased and then fluctuated around 20-21 meters. The 2020/2021 winning distance follows this trend. With coefficients rounded to two decimal places, the regression line predicts a 2024 distance of 20.52 meters. The 1996 residual is small and positive. The r-value of 0.686 indicates a moderate correlation, and r-squared is 0.471.


Liesel Fazekas

Mrs. Jones

Bivariate Data Analysis Olympics Project: Women’s Shot Put

27 August 2021

Winning Distances of Women’s Shot Put in the Olympics

Year (Olympic Games)    Distance (m)
1948                    13.75
1952                    15.28
1956                    16.59
1960                    17.32
1964                    18.14
1968                    19.61
1972                    21.03
1976                    21.16
1980                    22.41
1984                    20.48
1988                    22.24
1992                    21.06
1996                    20.56
2000                    20.56
2004                    19.59
2008                    20.56
2012                    20.70
2016                    20.63
2020                    20.58
Sources:

https://olympics.com/tokyo-2020/olympic-games/en/results/athletics/result-women-s-shot-put-fnl-000100-.htm

https://olympics.com/en/olympic-games/olympic-results

The relationship between the year and the winning distance is moderate, positive, and nonlinear.
Source: https://istats.shinyapps.io/LinearRegression/

A linear model is not appropriate for the relationship between the year of the Olympic Games and the winning distance because the distance increases at first and then fluctuates around the same value in the later years. From 1948 through 1980, each winning distance exceeded the one before it, but once the distance reached about 20-21 meters it stopped increasing. The scatterplot shows a slightly curved, nonlinear shape, and the residual plot shows an even clearer curve. Therefore, a linear model does not fit this data well.

The winning distance for the 2020 Olympic Games (held in 2021) was 20.58 meters, slightly less than the 20.63 meters of the Games before. I don't think the delay of the Games due to COVID affected my event much, statistically. This year's value follows the trend of winning distances staying around 20-21 meters. An extra year gave the athletes more time to prepare and train for the event, but COVID also made it difficult to train in a normal setting.

Least Squares Regression Line:

Explanatory variable (x): Years/Olympic Games

Response variable (y): Distance in meters

Ŷ = -121.16 + 0.07x
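
As a rough check, the coefficients can be recomputed from the table above. The following is a minimal sketch, not the method of the applet cited as a source; the function name fit_line and the variable names are chosen here for illustration.

```python
# Winning distances (meters) from the table above, one per Olympic year.
years = list(range(1948, 2021, 4))  # 1948, 1952, ..., 2020
distances = [13.75, 15.28, 16.59, 17.32, 18.14, 19.61, 21.03, 21.16, 22.41,
             20.48, 22.24, 21.06, 20.56, 20.56, 19.59, 20.56, 20.70, 20.63, 20.58]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


intercept, slope = fit_line(years, distances)
print(f"y-hat = {intercept:.2f} + {slope:.4f}x")
```

Run as written, this prints y-hat = -121.16 + 0.0709x, so the 0.07 in the equation above is the slope rounded to two decimal places.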

Slope interpretation:

For every additional year, the model predicts an increase of about 0.07 meters in the winning distance in Olympic women’s shot put, or about 0.28 meters from one Olympic Games to the next.


Y-intercept interpretation:

When the year is 0, the model predicts a winning distance of -121.16 meters. This has no practical meaning: a distance cannot be negative, and year 0 is far outside the range of the data (1948 to 2020), long before women's shot put was an Olympic event.

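Predicted winning distance for 2024:

With the coefficients rounded to two decimal places: Ŷ = -121.16 + 0.07(2024) = 20.52 meters

With the unrounded coefficients computed from the table (slope ≈ 0.070943, intercept ≈ -121.1588): Ŷ = -121.1588 + 0.070943(2024) ≈ 22.43 meters

The second value is what the least squares line actually predicts. The two differ by almost 2 meters because the small rounding error in the slope is multiplied by 2024. Since the winning distances have leveled off around 20-21 meters, the line's prediction of 22.43 meters is probably too high, which is another sign that a linear model does not suit this data.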
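
Because 2024 is a large x-value, the prediction is sensitive to how far the slope is rounded. The sketch below recomputes the line from the table and compares the two predictions; the variable names are illustrative only.

```python
# Winning distances (meters) from the table, one per Olympic year.
years = list(range(1948, 2021, 4))
distances = [13.75, 15.28, 16.59, 17.32, 18.14, 19.61, 21.03, 21.16, 22.41,
             20.48, 22.24, 21.06, 20.56, 20.56, 19.59, 20.56, 20.70, 20.63, 20.58]

n = len(years)
x_mean = sum(years) / n
y_mean = sum(distances) / n
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, distances))
sxx = sum((x - x_mean) ** 2 for x in years)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Prediction with full-precision coefficients.
pred_unrounded = intercept + slope * 2024
# Prediction with both coefficients rounded to two decimals, as in the report.
pred_rounded = round(intercept, 2) + round(slope, 2) * 2024

print(f"unrounded coefficients: {pred_unrounded:.2f} m")
print(f"rounded coefficients:   {pred_rounded:.2f} m")
```

The gap between the two results comes entirely from rounding the slope, since the rounding error is multiplied by 2024.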

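Residual for 1996 winning distance:

Residual = observed y - predicted y

The prediction uses the unrounded coefficients computed from the table (slope ≈ 0.070943, intercept ≈ -121.1588), because rounding the slope to 0.07 shifts the predicted value by almost 2 meters.

Ŷ = -121.1588 + 0.070943(1996) ≈ 20.44 = predicted y

20.56 = observed y

Residual = 20.56 - 20.44

Residual for 1996 ≈ 0.12

This residual is positive, meaning the observed point lies above the LSRL. A residual of about 0.12 meters is small, so the 1996 point lies close to the line and is consistent with the positive association.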
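
The residual for every year can be computed the same way. The sketch below uses full-precision coefficients recomputed from the table, since a residual is sensitive to rounding in the slope; the dictionary name residuals is chosen here for illustration.

```python
# Winning distances (meters) from the table, one per Olympic year.
years = list(range(1948, 2021, 4))
distances = [13.75, 15.28, 16.59, 17.32, 18.14, 19.61, 21.03, 21.16, 22.41,
             20.48, 22.24, 21.06, 20.56, 20.56, 19.59, 20.56, 20.70, 20.63, 20.58]

n = len(years)
x_mean = sum(years) / n
y_mean = sum(distances) / n
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, distances))
sxx = sum((x - x_mean) ** 2 for x in years)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Residual = observed y - predicted y, for every Olympic year.
residuals = {x: y - (intercept + slope * x) for x, y in zip(years, distances)}

print(f"1996 residual: {residuals[1996]:+.2f} m")
for year, res in residuals.items():
    print(year, f"{res:+.2f}")
```

With these coefficients the residuals are negative from 1948 to 1964, positive from 1968 to 1996, and negative again from 2000 to 2020. That curved pattern is what a residual plot shows when a linear model does not fit.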

R-value:

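r = 0.686

This r value indicates that the linear relationship between the year of the Olympic Games and the winning women's shot put distance is positive and moderate in strength. By the guideline used here, a moderate correlation has an r value between 0.5 and 0.9, and 0.686 falls inside that range. Because r measures only linear association, it does not reflect the curved pattern in the data.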


R-squared value:

r^2 = 0.471

The r-squared value is the proportion of the variability in the response variable that is explained by the LSRL. Here, 47.1% of the variability in the winning distances is explained by the least squares regression line relating year to distance.
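
One way to see what "variability explained" means is to compute r-squared as 1 - SSE/SST, where SST is the total squared deviation of the distances from their mean and SSE is the squared deviation left over after fitting the line. This is a sketch using the table values; the names sse and sst are labels chosen here.

```python
# Winning distances (meters) from the table, one per Olympic year.
years = list(range(1948, 2021, 4))
distances = [13.75, 15.28, 16.59, 17.32, 18.14, 19.61, 21.03, 21.16, 22.41,
             20.48, 22.24, 21.06, 20.56, 20.56, 19.59, 20.56, 20.70, 20.63, 20.58]

n = len(years)
x_mean = sum(years) / n
y_mean = sum(distances) / n
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, distances))
sxx = sum((x - x_mean) ** 2 for x in years)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Total variability of the distances around their mean.
sst = sum((y - y_mean) ** 2 for y in distances)
# Variability left unexplained by the fitted line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, distances))

r_squared = 1 - sse / sst
r = r_squared ** 0.5  # positive root, because the slope is positive

print(f"r^2 = {r_squared:.3f}, r = {r:.3f}")
```

The result should agree with the reported 0.471 and 0.686 up to rounding.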

Mean and Standard Deviation of X & Y:

Explanatory-

Mean = 1984

Standard Deviation = 22.51

Response-

Mean = 19.59

Standard Deviation = 2.33

Proving Slope:

b = 0.07

b = r (Sy/Sx)

0.686(2.33/22.51) ≈ 0.071, which rounds to 0.07

0.07 = 0.07
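
The same identity can be checked at full precision. The sketch below computes the slope directly and again as r(Sy/Sx), using sample standard deviations from Python's statistics module; the variable names are illustrative.

```python
import math
import statistics

# Winning distances (meters) from the table, one per Olympic year.
years = list(range(1948, 2021, 4))
distances = [13.75, 15.28, 16.59, 17.32, 18.14, 19.61, 21.03, 21.16, 22.41,
             20.48, 22.24, 21.06, 20.56, 20.56, 19.59, 20.56, 20.70, 20.63, 20.58]

x_mean = statistics.mean(years)
y_mean = statistics.mean(distances)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, distances))
sxx = sum((x - x_mean) ** 2 for x in years)
syy = sum((y - y_mean) ** 2 for y in distances)

# Sample standard deviations (n - 1 in the denominator).
sx = statistics.stdev(years)
sy = statistics.stdev(distances)

r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
slope_direct = sxy / sxx         # least squares slope
slope_from_r = r * sy / sx       # slope via b = r(Sy/Sx)

print(f"Sx = {sx:.2f}, Sy = {sy:.2f}, r = {r:.3f}")
print(f"direct slope = {slope_direct:.5f}, r*Sy/Sx = {slope_from_r:.5f}")
```

Without rounding, the two slopes match to within floating-point error; the small difference between 0.071 and the slope in the equation comes from using rounded values of r, Sy, and Sx.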

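Proving LSRL passes through (x mean, y mean):

X mean = 1984

Y mean = 19.59

Ŷ = -121.16 + 0.07x

With the slope rounded to 0.07: Ŷ = -121.16 + 0.07(1984) = 17.72, which does not equal 19.59.

With the unrounded coefficients computed from the table (slope ≈ 0.070943, intercept ≈ -121.1588): Ŷ = -121.1588 + 0.070943(1984) ≈ 19.59

19.59 = 19.59

The LSRL does pass through the point (1984, 19.59). A least squares regression line always passes through (x mean, y mean), whether or not the relationship is linear. The gap of 1.87 meters in the rounded calculation comes only from rounding the slope, because a rounding error of about 0.0009 is multiplied by 1984.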
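
The general reason can be sketched in one line. Writing a for the intercept, b for the slope, and x̄ and ȳ for the two means (symbols introduced here for illustration), the least squares intercept is defined as a = ȳ - b x̄, so the prediction at the mean x-value is:

```latex
\hat{y}(\bar{x}) = a + b\,\bar{x} = (\bar{y} - b\,\bar{x}) + b\,\bar{x} = \bar{y}
```

This holds for any data set, linear or not, as long as a and b are not rounded before substituting.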
