Project 1 - Descriptive Statistics

The document provides definitions and analysis of descriptive statistics for 3 variables: unemployment (X1), inflation (X2), and net savings (X3). Key findings include: X1 has a right skewed distribution with mean > median, high standard deviation, and some outliers. X2 is approximately symmetric with multiple modes. X3 also has a right skewed distribution with mean > median and a high standard deviation. The document analyzes measures of central tendency, dispersion, shape, and outliers to characterize the distributions of each variable.

Uploaded by

Ana Chikovani


Topic: Descriptive Statistics

Task: We have been given 3 variables to analyze. They are: X1 – unemployment level, X2 –
inflation level, X3 – Net savings level. Our job is to provide basic descriptive statistics analysis
on the three variables.

Content
 Definitions of the terms
 Analysis of the given Data
 Conclusion

Definitions of the terms:

Measures of location:
 Mean – the average value of a variable. The mean provides a measure of central location
for the data.
 Std. Error of Mean – the standard error of the mean, or simply standard error, indicates
how different the population mean is likely to be from a sample mean. It tells you how
much the sample mean would vary if you were to repeat a study using new samples from
within a single population.
 Mode – the value that occurs with the greatest frequency.
 Median – the median is the value in the middle when the data are arranged in ascending
order (smallest value to largest value). With an odd number of observations, the median
is the middle value. An even number of observations has no single middle value, so the
median is the average of the two middle values.
 Percentiles – A percentile provides information about how the data are spread over the
interval from the smallest value to the largest value. For a data set containing n
observations, the pth percentile divides the data into two parts: approximately p% of the
observations are less than the pth percentile, and approximately (100 – p) % of the
observations are greater than the pth percentile.
 Quartiles – it is often desirable to divide a data set into four parts, with each part
containing approximately one-fourth, or 25%, of the observations. These division points
are referred to as the quartiles and are defined as follows.
Q1 = first quartile, or 25th percentile
Q2 = second quartile, or 50th percentile (also the median)
Q3 = third quartile, or 75th percentile
Because quartiles are specific percentiles, the procedure for computing percentiles can be
used to compute the quartiles.
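
The measures of location defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the values in `sample` are hypothetical (they are not the project data), and the quartile convention (`method="inclusive"`) is one of several in use, so other software may report slightly different quartiles.

```python
import math
import statistics

# Hypothetical values, chosen only for illustration (not the project data).
sample = [4.2, 9.1, 9.7, 9.7, 10.3, 11.0, 12.7, 15.4, 29.1]

n = len(sample)
mean = statistics.mean(sample)            # average value
median = statistics.median(sample)        # middle value of the sorted data
modes = statistics.multimode(sample)      # every value tied for the highest frequency

# Standard error of the mean = sample standard deviation / sqrt(n).
std_error = statistics.stdev(sample) / math.sqrt(n)

# Quartiles are the 25th, 50th and 75th percentiles.
q1, q2, q3 = statistics.quantiles(sample, n=4, method="inclusive")

print(f"mean={mean:.3f}  median={median}  modes={modes}")
print(f"std. error of mean={std_error:.3f}")
print(f"Q1={q1}  Q2={q2}  Q3={q3}")
```

In this made-up sample the single large value pulls the mean above the median, which is the same pattern the report later uses as a sign of right skewness.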

Measures of variability:
 Range – is the difference between the largest value and the smallest value. It is
determined by only the two extreme data values.
 Interquartile Range – is the measure of variability that overcomes the dependency on
extreme values. This measure of variability is the difference between the third quartile,
Q3, and the first quartile, Q1. In other words, the interquartile range is the range for the
middle 50% of the data.
 Variance – a measure of dispersion around the mean, equal to the sum of squared
deviations from the mean divided by one less than the number of cases. The variance is
measured in units that are the square of those of the variable itself.
 Standard Deviation – is defined to be the positive square root of the variance.
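
As a rough sketch of the four measures of variability above, the snippet below computes them for a small hypothetical data set (the values are assumptions for illustration, not the project data). The variance is written out by hand first so that the "divide by one less than the number of cases" step is visible, and then compared with the library result.

```python
import statistics

# Hypothetical values, chosen only for illustration (not the project data).
data = [4.2, 9.1, 9.7, 9.7, 10.3, 11.0, 12.7, 15.4, 29.1]

# Range: depends only on the two extreme values.
data_range = max(data) - min(data)

# Interquartile range: spread of the middle 50% of the data.
# The "inclusive" quartile convention is an assumption; others exist.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
mean = statistics.mean(data)
manual_variance = sum((x - mean) ** 2 for x in data) / (len(data) - 1)
variance = statistics.variance(data)

# Standard deviation: positive square root of the variance.
std_dev = statistics.stdev(data)

print(f"range={data_range:.2f}  IQR={iqr:.2f}")
print(f"variance={variance:.3f}  std. deviation={std_dev:.3f}")
```

Here the range is much larger than the interquartile range because one extreme value stretches it, while the IQR ignores that value.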

Measures of Distribution Shape, Relative Location, and Outliers:

 Skewness – is a measure of the asymmetry of a distribution. A distribution is
asymmetrical when its left and right side are not mirror images. A distribution can have
right (or positive), left (or negative), or zero skewness.
 Std. Error of Skewness – measures how much the sample skewness would vary from sample to
sample. Skewness values between -1 and +1 are commonly treated as the normal range. The
ratio of skewness to its standard error can be used as a test of normality (that is, you can
reject normality if the ratio is less than -2 or greater than +2). A large positive value for
skewness indicates a long right tail; an extreme negative value indicates a long left tail.
 Histogram – is a graphical representation of data points organized into user-specified
ranges. Similar in appearance to a bar graph, the histogram condenses a data series into
an easily interpreted visual by taking many data points and grouping them into logical
ranges or bins.
 Minimum – The smallest value of a numeric variable.
 Maximum – The largest value of a numeric variable.
 Five-number summary– A five-number summary is especially useful in descriptive
analyses or during the preliminary investigation of a large data set. A summary consists
of five values: the most extreme values in the data set (the maximum and minimum
values), the lower and upper quartiles, and the median. These values are presented
together and ordered from lowest to highest: minimum value, lower quartile (Q1),
median value (Q2), upper quartile (Q3), maximum value.

 Box Plots – a box plot is a type of chart often used in exploratory data analysis. Box plots
visually show the distribution of numerical data and its skewness by displaying the data
quartiles (or percentiles) and averages. Box plots show the five-number summary of a set
of data: the minimum score, first (lower) quartile, median, third (upper) quartile, and
maximum score.
 Outliers – An outlier is an observation that lies an abnormal distance from other values
in a random sample from a population. In a sense, this definition leaves it up to the
analyst (or a consensus process) to decide what will be considered abnormal. Before
abnormal observations can be singled out, it is necessary to characterize normal
observations.
 z-Scores – a z-score measures how many standard deviations an observation lies from the
mean. Z-scores reveal to statisticians and traders whether a score is typical for a specified
data set or atypical. They also make it possible for analysts to standardize scores from
different data sets so that they can be compared with one another more accurately.
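
The skewness, its standard error, and z-scores can be sketched as follows. Two assumptions are made here: the skewness uses the adjusted Fisher-Pearson formula, and the standard error uses the usual large-sample formula sqrt(6n(n-1) / ((n-2)(n+1)(n+3))). For n = 59 this formula gives about 0.311, which agrees with the value in the statistics table of this report, but the report itself does not state which formulas the software used. The data values are hypothetical.

```python
import math
import statistics

def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (assumed convention)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

def skewness_std_error(n):
    """Standard error of skewness; depends only on the sample size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def z_scores(values):
    """Number of sample standard deviations each value lies from the mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return [(x - mean) / s for x in values]

# Hypothetical values, chosen only for illustration (not the project data).
data = [4.2, 9.1, 9.7, 9.7, 10.3, 11.0, 12.7, 15.4, 29.1]

skew = sample_skewness(data)
ratio = skew / skewness_std_error(len(data))   # compare against -2 and +2
z = z_scores(data)

print(f"skewness={skew:.3f}  ratio to std. error={ratio:.2f}")
print("z-scores:", [round(v, 2) for v in z])
print(f"std. error of skewness for n=59: {skewness_std_error(59):.3f}")
```

A z-score far from 0 marks a value that is atypical for the data set; in this made-up sample only the largest value stands out.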

Analytics:
Statistics

Statistic                         x1          x2          x3
N (valid)                         59          59          59
N (missing)                        0           0           0
Mean                         11.3441      9.1453     12.9002
Std. Error of Mean           0.53791     0.61191     0.86699
Median                       10.2800      8.8800     11.2300
Mode                            9.70       2.70a      11.23a
Std. Deviation               4.13173     4.70016     6.65945
Variance                      17.071      22.091      44.348
Skewness                       2.151       0.392       1.065
Std. Error of Skewness         0.311       0.311       0.311
Range                          24.87       20.17       31.90
Minimum                         4.24        1.35        3.28
Maximum                        29.11       21.52       35.18
Sum                           669.30      539.57      761.11
25th percentile               9.0900      5.1300      8.1300
50th percentile              10.2800      8.8800     11.2300
75th percentile              12.7200     12.7200     17.0000

a. Multiple modes exist. The smallest value is shown.

We have 59 Data Points for each variable (x1, x2, x3). None of them are missing.
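
Several entries in the statistics table are tied together by the definitions given earlier, so they can be cross-checked. The sketch below copies the reported values for each variable and checks four relations: standard error = standard deviation / sqrt(N), variance = standard deviation squared, range = maximum - minimum, and sum = mean × N. The tolerances are assumptions that allow for the rounding in the reported figures.

```python
import math

N = 59  # valid observations per variable, as reported in the table

# Values copied from the statistics table.
table = {
    "x1": dict(mean=11.3441, sem=0.53791, sd=4.13173, var=17.071,
               rng=24.87, lo=4.24, hi=29.11, total=669.30),
    "x2": dict(mean=9.1453, sem=0.61191, sd=4.70016, var=22.091,
               rng=20.17, lo=1.35, hi=21.52, total=539.57),
    "x3": dict(mean=12.9002, sem=0.86699, sd=6.65945, var=44.348,
               rng=31.90, lo=3.28, hi=35.18, total=761.11),
}

def check(row):
    """Return True for each relation that holds within rounding tolerance."""
    return {
        "sem = sd / sqrt(N)": abs(row["sd"] / math.sqrt(N) - row["sem"]) < 1e-4,
        "var = sd^2": abs(row["sd"] ** 2 - row["var"]) < 2e-3,
        "range = max - min": abs(row["hi"] - row["lo"] - row["rng"]) < 5e-3,
        "sum = mean * N": abs(row["mean"] * N - row["total"]) < 1e-2,
    }

results = {name: check(row) for name, row in table.items()}
for name, checks in results.items():
    print(name, checks)
```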
x1:

 Mean (11.34) and Median (10.28) values are not that far from each other. Because the mean
is bigger than the median, we can conclude that the distribution is skewed right.
 Mode is 9.7, and it is observed 2 times in the whole data set.
 Skewness is 2.151, so the curve is highly skewed right. This means that there are some data
that lie far to the right of the majority of the other data (as we will see in the histogram
below). This also might imply that there is an outlier in the data set.
 Percentiles – as we see, Q1, Q2 (median) and Q3 are close to each other. The middle 50% of
the data lies between 9.09 (Q1) and 12.72 (Q3).
 Range (maximum – minimum) is 24.87. Compared to Q3 – Q1 (3.63) this number is high,
which means that we have some data scattered at the beginning and at the end.
Histogram – x1 is a visual representation of what we discussed above.

x2

 Mean (9.15) and Median (8.88) values are really close. Because the mean is bigger than the
median, we can conclude that the distribution is skewed a little to the right.
 There are multiple modes in this case, so the mode is not a helpful measure for our analysis.
 Skewness is 0.39, which means that the curve is almost symmetric, a little skewed to the
right.
 Percentiles – as we see, Q1, Q2 (median) and Q3 are moderately close to each other. The
middle 50% of the data lies between 5.13 (Q1) and 12.72 (Q3).
 Range (maximum – minimum) is 20.17. Compared to Q3 – Q1 (7.59) this number is high,
which means that we have some data scattered at the beginning and at the end.

Histogram – x2 is a visual representation of what we discussed above.

x3

 Mean (12.90) and Median (11.23) values are fairly close. Because the mean is bigger than
the median, we can conclude that the distribution is skewed right.
 There are multiple modes in this case, so the mode is not a helpful measure for our analysis.
 Skewness is 1.065, which means that the curve is moderately skewed right.
 Percentiles – as we see, Q1, Q2 (median) and Q3 are moderately close to each other. The
middle 50% of the data lies between 8.13 (Q1) and 17.00 (Q3).
 Range (maximum – minimum) is 31.90. Compared to Q3 – Q1 (8.87) this number is high,
which means that we have some data scattered at the beginning and at the end.

Histogram – x3 is a visual representation of what we discussed above.
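
The normality rule from the definitions (reject normality when skewness divided by its standard error is below -2 or above +2) can be applied directly to the reported values. The sketch below does this for all three variables using the skewness and standard error from the statistics table; the variable and function names are illustrative only.

```python
# Skewness and its standard error, copied from the statistics table.
SKEWNESS = {"x1": 2.151, "x2": 0.392, "x3": 1.065}
STD_ERROR_OF_SKEWNESS = 0.311   # the same for all three variables (N = 59)

def rejects_normality(skewness, std_error, limit=2.0):
    """Apply the +/-2 rule to the ratio of skewness to its standard error."""
    ratio = skewness / std_error
    return ratio < -limit or ratio > limit

ratios = {name: s / STD_ERROR_OF_SKEWNESS for name, s in SKEWNESS.items()}
rejected = {name: rejects_normality(s, STD_ERROR_OF_SKEWNESS)
            for name, s in SKEWNESS.items()}

for name in SKEWNESS:
    verdict = "reject normality" if rejected[name] else "cannot reject normality"
    print(f"{name}: ratio = {ratios[name]:.2f} -> {verdict}")
```

Under this rule the ratios for x1 and x3 are above +2 while the ratio for x2 is below it, which is consistent with describing x1 and x3 as skewed right and x2 as almost symmetric.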

Box plot
x1 – we see that there are 2 outliers (26, 29). The mean is bigger than the median, which means
we have positive skewness. Also, the box is comparatively short, which shows that there is
lower variability. The distances from the median to Q1 and to Q3 look almost the same. Also,
x1 has an almost normal distribution if we exclude the outliers (but excluding outliers needs
further investigation).

Q-Q plot x1 – like the box plot, the Q-Q plot shows that there are 2 outliers. If we exclude the
outliers, the points follow the straight line, which suggests that the remaining data are
approximately normally distributed.

x2 – we see that there are no outliers. The mean is a little bigger than the median. Also, the
box is comparatively large, which shows that there is higher variability. The distance from Q1
to the median is smaller than the distance from the median to Q3.

Q-Q plot x2 – as the box plot shows, there are no outliers. The points follow the straight line,
which suggests that the data are approximately normally distributed.

x3 – from the x3 box plot we see that there is 1 outlier (about 36 on the plot; the recorded
maximum is 35.18). The mean is bigger than the median, so we have positive skewness. Also, the
box is comparatively large, which shows that there is higher variability. The distance from Q1
to the median is smaller than the distance from the median to Q3.
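
The outliers read off the box plots can be cross-checked against the reported quartiles. The sketch below assumes the conventional 1.5 × IQR rule for box plot whiskers; the report does not state which rule the software applied, and software that computes quartiles differently may place the fences slightly differently. Because only the minimum and maximum are available, the check can show whether at least one value lies beyond a fence, not how many.

```python
# Quartiles and extremes copied from the statistics table.
SUMMARY = {
    "x1": dict(q1=9.09, q3=12.72, lo=4.24, hi=29.11),
    "x2": dict(q1=5.13, q3=12.72, lo=1.35, hi=21.52),
    "x3": dict(q1=8.13, q3=17.00, lo=3.28, hi=35.18),
}

def fences(q1, q3, k=1.5):
    """Lower and upper fences under the assumed k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

flags = {}
for name, s in SUMMARY.items():
    lower, upper = fences(s["q1"], s["q3"])
    flags[name] = {
        "below_lower_fence": s["lo"] < lower,   # minimum is a low outlier
        "above_upper_fence": s["hi"] > upper,   # maximum is a high outlier
    }
    print(f"{name}: fences = ({lower:.3f}, {upper:.3f})  {flags[name]}")
```

Under this assumption the maximum of x1 and the maximum of x3 fall above the upper fence, while both extremes of x2 stay inside the fences, in line with what the box plots show.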

Q-Q plot x3 – as the box plot shows, there is one outlier. The other points follow the straight
line, which suggests that, apart from the outlier, the data are approximately normally distributed.

Conclusion
We analyzed three variables: x1 – unemployment level, x2 – inflation level and x3 – net savings level.

 x1 – The curve is highly skewed right (2.15). There are 2 outliers. It still needs further
analysis to decide whether or not to exclude these outliers.
 x2 – There are no outliers. The curve is skewed a little bit right, with skewness (0.392) almost 0.
This data can be treated as valid as it stands.
 x3 – The curve is moderately skewed to the right, with skewness 1.065. There is one outlier, so
this data might need further analysis.
