
EECM3724_Unit_1_Ch2_slides_2022

The document discusses methods for summarizing both categorical and quantitative data, including frequency distributions, relative frequencies, and graphical representations like pie charts and histograms. It emphasizes the importance of using descriptive statistics and visual tools to derive insights from raw data, illustrated through examples such as guest ratings at a lodge and parts costs at an auto repair shop. Additionally, it covers concepts like Simpson's paradox and the selection of class limits and widths for quantitative data analysis.

Uploaded by

johannesbotle


Unit 1 (continued):

Graphical and Numerical of Data

Anderson et al., ch. 2

21/07/2022
Overview
From chapter 2:

• Summarizing categorical data

• Summarizing quantitative data

• Summarizing relationships between two categorical variables

• Summarizing relationships between two quantitative variables

• Explaining Simpson’s paradox (use your own numerical example)

Objectives
• Use descriptive statistics to describe and summarize qualitative and quantitative data.

• Construct and interpret graphs representing the distribution of variables in a data set.

• Construct and interpret cross-tabulations and scatter diagrams.

• Explain Simpson’s paradox.

Summarizing qualitative data
• Frequency distribution

• Relative frequency distribution

• Percentage frequency distribution

• Bar charts

• Pie charts

These are the commonly used ways of summarizing this type of data. In this unit, we describe the techniques and illustrate them. Because the reader must not struggle to get the sense of the summary of the data, the choice of which technique to use, and when, is always important.
Frequency distribution
• A frequency distribution is a tabular summary of data showing the number (frequency) of items in each of several non-overlapping classes.

• The objective of using a frequency distribution is to provide insights about the data that cannot be quickly obtained by looking only at the raw data.

• We find the frequency of qualitative data by counting the number of observations that fall into a given class/category.

• Think of your close family relatives. How many are female? How many are male? The counts you get for these two categories are the frequencies.
Example: Bains game lodge
• Mandisa was hired by Bains game lodge to conduct a survey of the quality of accommodation offered by the lodge. She designed a questionnaire and asked guests staying at the lodge to rate the quality of their accommodation as excellent, above average, average, below average, or poor.

• The ratings provided by a sample of 20 guests are:

Below Average    Average          Above Average
Above Average    Above Average    Above Average
Above Average    Below Average    Below Average
Average          Poor             Poor
Above Average    Excellent        Above Average
Average          Above Average    Average
Above Average    Average

• What insights can Mandisa draw from these responses?

• Mandisa cannot quickly get meaningful insights from the raw data above.
• What if she constructs a frequency distribution?
Frequency distribution

Rating           Frequency
Poor                 2
Below Average        3
Average              5
Above Average        9
Excellent            1
Total               20

• Now Mandisa has a story to tell the management of the lodge.

• The ‘Above Average’ category is the mode, i.e. the rating with the highest frequency.
• ‘Excellent’ is the least-selected rating: only one guest chose it.
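The tally above can be reproduced in a few lines. This is a minimal sketch, assuming the 20 ratings are typed in exactly as listed in the example; the variable names are illustrative only.

```python
from collections import Counter

# The 20 guest ratings from the Bains game lodge example, row by row.
ratings = [
    "Below Average", "Average", "Above Average",
    "Above Average", "Above Average", "Above Average",
    "Above Average", "Below Average", "Below Average",
    "Average", "Poor", "Poor",
    "Above Average", "Excellent", "Above Average",
    "Average", "Above Average", "Average",
    "Above Average", "Average",
]

# Counter tallies how many observations fall in each category.
frequency = Counter(ratings)

# Print the classes in their natural order, worst to best.
order = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]
for rating in order:
    print(f"{rating:<15}{frequency[rating]:>3}")
print(f"{'Total':<15}{sum(frequency.values()):>3}")

# The mode is the category with the highest frequency.
mode_rating, mode_count = frequency.most_common(1)[0]
print("Mode:", mode_rating, mode_count)
```

Counting by machine matters little for 20 observations, but the same steps apply unchanged to a survey of thousands of guests.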
Relative frequency distribution
• The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.

• Relative frequency of each class:

\[ \text{Relative frequency of a class} = \frac{\text{Frequency of the class}}{n} \]

• n is the number of observations.

• The relative frequency distribution is a tabular summary of a set of data showing the
relative frequency for each class.

In the next two slides, we will provide relative frequency for the Bains Game Lodge data. Before you get to those
slides, can you attempt to calculate relative frequencies for each class/category of ratings?
Percentage frequency distribution

• The percentage frequency of a class is the relative frequency multiplied by 100.

• Percentage frequency of each class = Relative frequency × 100

• A percentage frequency distribution is a tabular summary of a set of data showing the percentage
frequency for each class.

In the next slide, we will provide percentage relative frequency distribution for the Bains Game Lodge data. Before
you get to the slide, can you attempt to calculate percentage relative frequencies for each class/category of ratings?
Relative Frequency and
Percentage Frequency Distributions

Rating           Relative Frequency    Percentage Frequency
Poor                   0.10                    10
Below Average          0.15                    15
Average                0.25                    25
Above Average          0.45                    45
Excellent              0.05                     5
Total                  1.00                   100

Worked entries: for Excellent, 1/20 = 0.05; for Poor, 0.10(100) = 10.
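Both columns follow mechanically from the frequencies. The sketch below assumes the Bains game lodge frequency distribution as its input; the dictionary names are illustrative.

```python
# Frequencies from the Bains game lodge frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

n = sum(frequency.values())  # number of observations

# Relative frequency = class frequency / n.
relative = {rating: f / n for rating, f in frequency.items()}

# Percentage frequency = relative frequency x 100.
percentage = {rating: 100 * f / n for rating, f in frequency.items()}

for rating in frequency:
    print(f"{rating:<15}{relative[rating]:>6.2f}{percentage[rating]:>6.0f}")
```

A quick sanity check on any such table: the relative frequencies should sum to 1 and the percentage frequencies to 100.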
Display of qualitative data
• Pie chart
• Bar chart

For both, first find the frequency distribution.

• These graphical tools are most appropriate when the raw data can be naturally categorized in a meaningful manner.
• “A picture is worth a thousand words”
• These charts provide important insights into the data.
Pie Chart
• The pie chart is a very popular tool for presenting relative frequency distributions for qualitative data.

• The pie chart is a circle subdivided into a number of slices corresponding to the relative frequency for each class.

• In other words, the size of each slice is proportional to the percentage corresponding to the category it represents.

When the Finance Minister presents a budget, he is sharing the economic cake among different functions. The cake is the revenue. Each function, e.g. learning and culture, gets its share.
Pie Chart
• Half of the customers surveyed gave Bains a quality rating of “above average” or higher (45% + 5%).
• 10% of the guests gave a poor rating, while 15% gave a below average rating.
• Given these findings, what can Mandisa tell the manager of Bains game lodge?

[Figure: pie chart, “Bains game lodge Quality Ratings”. Slices: Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.]
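To draw the pie chart by hand, each relative frequency has to be converted into a slice angle. The slides do not show this step, so the sketch below is an assumption about how one might do it: a full circle is 360 degrees, so each slice gets 360 times its relative frequency.

```python
# Frequencies from the Bains game lodge frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}
n = sum(frequency.values())

# Slice angle in degrees = 360 x relative frequency of the class.
angles = {rating: 360 * f / n for rating, f in frequency.items()}

for rating, angle in angles.items():
    print(f"{rating:<15}{angle:>6.1f} degrees")

# Share of guests who rated the lodge "above average" or higher.
top_share = (frequency["Above Average"] + frequency["Excellent"]) / n
print("Above average or higher:", top_share)
```

The angles sum to 360, which mirrors the percentages summing to 100.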
Bar Chart
• A bar chart, or bar graph, is a graphical tool for depicting qualitative data.

• On one axis (usually the horizontal axis), we specify the labels that are used for each of the classes.

• A frequency, relative frequency, or percentage frequency scale can be used for the other axis (usually the vertical axis).

• Using a bar of fixed width drawn above each class label, we extend the height appropriately.

• The bars are separated to emphasize the fact that each class is a separate category.
Bar Graph
• From raw data to:
  • Frequency distribution
  • Relative frequency
  • Percentage relative frequency
  • Pie chart
  • Bar graph
• This explains why we must have this course in our degree programme.

[Figure: bar graph, “Bains game lodge Quality Ratings”. Horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent). Vertical axis: Frequency, scaled 1 to 10.]
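Where no plotting tool is at hand, a rough bar chart can be printed as text. This is only a stand-in sketch for the graph described here; `text_bar_chart` is a hypothetical helper name, and the bars are laid out sideways (one row per class) rather than vertically.

```python
# Frequencies from the Bains game lodge frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}


def text_bar_chart(freq, symbol="#"):
    """Return one line per class: label, a bar of symbols, and the count."""
    width = max(len(label) for label in freq)
    return [
        f"{label:<{width}} | {symbol * count} {count}"
        for label, count in freq.items()
    ]


chart = text_bar_chart(frequency)
print("\n".join(chart))
```

Each class gets its own separate row, echoing the point that bars in a bar chart stand for distinct categories.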
Summarizing Quantitative Data
• Frequency Distribution

• Relative Frequency Distribution

• Percentage Frequency Distribution

• Histogram

• Cumulative Distributions

As you work through this part, see the similarities and differences with the categorical data.
Example: Bloem Auto Repair
• Daniel is the manager of Bloem Auto and he would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop.

• He examines 50 customer invoices for tune-ups.

• The costs of parts, rounded to the nearest Rand, for the 50 customers are listed below:

 91  78  93  57  75  52  99  80  97  62
 71  69  72  89  66  75  79  75  72  76
104  74  62  68  97 105  77  65  80 109
 85  97  88  68  83  68  71  69  67  74
 62  82  98 101  79 105  79  69  62  73

• Daniel cannot quickly get insight from the raw data above.
• What can Daniel do? Can he construct a frequency distribution?
• But then there are no classes/categories. How do we solve that problem?
Frequency Distribution for quantitative data
• Guidelines for Selecting Number of Classes

• Use between 5 and 20 classes.

• Data sets with a larger number of elements usually require a larger number of classes.

• Smaller data sets usually require fewer classes

• Use enough classes to show the variation in the data.

• Do not use so many classes that some contain only a few data items or nothing at all.
Frequency Distribution – Number of classes
• To determine the classes for a frequency distribution with quantitative data we need to determine (1) the number of non-overlapping classes; (2) the width of each class; and (3) the class limits.

• The number of classes is generally determined by trial and error, but to minimize the trials one can use the $2^k$ rule.

• The rule states that $2^k \ge n$, where $k$ is the number of classes and $n$ is the number of observations; we take the smallest $k$ that satisfies it. From Maths class, how do you solve for $k$? Remember: $k \log 2 \ge \log n$.

• For our example, $n = 50$ and by trial: $2^5 = 32$; $2^6 = 64$. So $2^6 \ge 50$, meaning $k = 6$.
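The trial-and-error search and the logarithm shortcut can both be written out. This is a sketch only; `classes_by_2k_rule` is a hypothetical helper name, not something from the textbook.

```python
import math


def classes_by_2k_rule(n):
    """Return the smallest k with 2**k >= n (trial and error, as on the slide)."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


n = 50
k = classes_by_2k_rule(n)

# Same answer from k * log(2) >= log(n), i.e. k >= log(n) / log(2).
# The loop uses exact integer arithmetic; the log version can be affected
# by floating-point rounding when n is an exact power of 2.
k_from_logs = math.ceil(math.log(n) / math.log(2))

print(k, k_from_logs)
```

For the Bloem Auto Repair data both routes give 6 classes, since log(50)/log(2) is about 5.64 and is rounded up.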
Frequency Distribution – Class width

• Guidelines for Selecting Width of Classes

• The choices for the number of classes and class width are not independent, as more classes mean a smaller class width.
• For consistency, and to reduce the chances of inappropriate interpretations, use the same width for every class.
• The number of observations is n = 50.
• The minimum = 52 and the maximum = 109.
• $L$ is the approximate class width:

\[ L = \frac{\text{largest value} - \text{smallest value}}{\text{number of classes}} = \frac{109 - 52}{6} = 9.5 \approx 10 \]

• We round the approximate class width up to 10.

• Now we need to create the frequency distribution….
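The class-width arithmetic is short enough to check directly. In this sketch the rounding is done with `math.ceil`, which is an assumption about how to round: rounding up guarantees that 6 classes cover the whole range from 52 to 109.

```python
import math

smallest, largest = 52, 109   # minimum and maximum parts cost
number_of_classes = 6         # from the 2^k rule with n = 50

# Approximate class width L = (largest - smallest) / number of classes.
approx_width = (largest - smallest) / number_of_classes

# Round up to a convenient whole number.
class_width = math.ceil(approx_width)

print(approx_width, class_width)
```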
Frequency Distribution

For Bloem Auto Repair, if we choose 6 classes, with a class width of 10:

Parts Cost (R)    Frequency
50-59                  2
60-69                 13
70-79                 16
80-89                  7
90-99                  7
100-109                5
Total                 50

There is some picture now. Where do we see high frequency? Low frequency?

You can proceed and draw a bar graph with the parts cost classes on the x-axis. The other axis can show frequency, relative frequency, or percentage relative frequency. Do not hesitate to ask if you do not understand.
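Counting the values in each class can be sketched as follows, assuming the 50 parts costs are entered exactly as listed in the Bloem Auto Repair example and that classes start at 50 with a width of 10.

```python
# The 50 parts costs (in Rand) from the Bloem Auto Repair example.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

class_width = 10
frequency = {}

# Lower class limits 50, 60, ..., 100; each upper limit is lower + 9.
for lower in range(50, 110, class_width):
    upper = lower + class_width - 1
    label = f"{lower}-{upper}"
    # Count the costs that fall inside this class (limits inclusive).
    frequency[label] = sum(lower <= cost <= upper for cost in costs)

for label, count in frequency.items():
    print(f"{label:<10}{count:>3}")
print(f"{'Total':<10}{sum(frequency.values()):>3}")
```

Because the costs are whole Rand, the inclusive limits 50-59, 60-69 and so on leave no gaps and no overlaps, so every invoice lands in exactly one class.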
Frequency Distribution – Class limit
⚫ Class limits must be chosen so that each data item belongs to one and only one class.
⚫ The lower class limit identifies the smallest possible data value assigned to the class.
⚫ The upper class limit identifies the largest possible data value assigned to the class.
⚫ In the table above, 52 is the lowest value in the data, thus we select 50 as the lower class limit and 59 as the upper class limit of the first class.
⚫ The difference between the lower class limits of adjacent classes is the class width.
⚫ Using the last two lower class limits, 90 and 100, we see that the class width is 100 - 90 = 10.
⚫ To obtain the frequency distribution, we count the number of values in each class.
Frequency Distribution – Class midpoint
• To derive some of the statistics, we need the midpoint of each class in a frequency distribution for quantitative data.
• The class midpoint is the value halfway between the lower and upper class limits.
• For example, for the class 90-99, the midpoint is (90 + 99)/2 = 94.5.

But what are those other statistics that we need midpoint values for? We will talk about them soon.
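Applying the halfway definition to the class limits used for Bloem Auto Repair (50-59 up to 100-109) gives one midpoint per class. A minimal sketch:

```python
# (lower limit, upper limit) for each Bloem Auto Repair class.
classes = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99), (100, 109)]

# Midpoint = value halfway between the lower and upper class limits.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

for (lower, upper), midpoint in zip(classes, midpoints):
    print(f"{lower}-{upper}: {midpoint}")
```

Adjacent midpoints differ by 10, the class width.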
Relative Frequency and
Percentage Relative Frequency Distributions

Parts Cost (R)    Relative Frequency    Percentage Frequency
50-59                   0.04                     4
60-69                   0.26                    26
70-79                   0.32                    32
80-89                   0.14                    14
90-99                   0.14                    14
100-109                 0.10                    10
Total                   1.00                   100

Worked entries: for 50-59, 2/50 = 0.04 and 0.04(100) = 4; for the classes with frequency 7, 7/50 = 0.14 and 0.14(100) = 14.

• Only 4% of the parts costs are in the R50-59 class.
• 30% of the parts costs are under R70.
• The greatest percentage (32%, or almost one-third) of the parts costs are in the R70-79 class.
• 10% of the parts costs are R100 or more.
Graphical Techniques for Quantitative Data - Histogram

• A common graphical presentation of quantitative data is a histogram.

• The variable of interest is placed on the horizontal axis.

• A rectangle is drawn above each class interval with its height corresponding to the interval’s
frequency, relative frequency, or percentage frequency.

• Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.

• The bars within a histogram do not correspond to named categories, as in the bar chart.

• In the histogram the bars correspond to an interval on the number line.

• These intervals are constructed so that they are all of equal width.

Graphical Techniques for Quantitative Data - Histogram
• In constructing a histogram, avoid choosing class widths of awkward lengths.

• If the class width is made too narrow, the histogram looks “spikey”, and if the width is too wide, the histogram is “blurred”.
Histogram - Bloem Auto Repair: 6 classes, with class width of 10

[Figure: histogram, “Tune-up Parts Cost”. Horizontal axis: Parts Cost (R), classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109. Vertical axis: Frequency, scaled 2 to 18.]

What can you say about this histogram? Apart from drawing it, you should be able to interpret what you see in the graph.
Histogram
• From analyzing a histogram we can get information about the shape of the distribution of the data.

• This can be symmetrical, negatively (left) skewed or positively (right) skewed.

• A histogram is negatively skewed if its tail extends further to the left.

• It is positively skewed if the tail extends to the right.

• It is symmetrical if the right tail mirrors the left tail.

• While a symmetrical distribution is the ideal, histograms for real data are never perfectly symmetrical; they may only be roughly symmetrical.
Histogram
• Symmetrical
• Left tail is the mirror image of the right tail
• Examples: heights and weights of people
• The mean and the median are equal

[Figure: symmetrical histogram. Vertical axis: Relative Frequency, scaled 0 to .35.]
Histogram
• Moderately Skewed Left (negative skewness)
• A longer tail to the left
• Example: exam scores
• The mean is less than the median, and they are both less than the mode

Actual calculations will confirm this aspect of mean < median < mode.

[Figure: moderately left-skewed histogram. Vertical axis: Relative Frequency, scaled 0 to .35.]
Histogram
• Moderately Skewed Right (positive skewness)
• A longer tail to the right
• Example: housing values
• The mean is the largest, while the mode is the smallest.

Actual calculations will confirm this aspect of mean > median > mode.

[Figure: moderately right-skewed histogram. Vertical axis: Relative Frequency, scaled 0 to .35.]
Histogram
• Highly Skewed Right (Positive Skewness)
• A very long tail to the right
• Example: executive salaries

[Figure: highly right-skewed histogram. Vertical axis: Relative Frequency, scaled 0 to .35.]
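The slides say that actual calculations will confirm the ordering of mean, median and mode. As a hedged illustration, the sketch below runs that check on the Bloem Auto Repair parts costs using Python's standard `statistics` module. Comparing the three measures is a rule of thumb for the direction of skewness, not a formal test.

```python
import statistics

# The 50 parts costs (in Rand) from the Bloem Auto Repair example.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

mean = statistics.mean(costs)      # arithmetic average
median = statistics.median(costs)  # middle value of the sorted data
mode = statistics.mode(costs)      # most frequent single value

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")

# Rule of thumb: mean > median hints at a longer right tail.
if mean > median:
    print("mean > median: consistent with positive (right) skewness")
elif mean < median:
    print("mean < median: consistent with negative (left) skewness")
else:
    print("mean = median: consistent with symmetry")
```

For these data the mean (78.98) exceeds the median (75.5), which in turn exceeds the mode (62), the ordering the slides associate with a right-skewed distribution.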
Bell shaped histogram

▪ Many statistical techniques require that the population be bell-shaped.

▪ A bell shape suggests that the data is normally distributed

▪ Drawing the histogram helps verify the shape of the population in question

Using real data, do you think we can get a bell-shaped histogram? For real data, it is never perfectly symmetrical; it might be roughly symmetrical.
Cumulative Distributions
• Cumulative frequency distribution - shows the number of items with values less than or equal to
the upper limit of each class

• Cumulative relative frequency distribution – shows the proportion of items with values less than
or equal to the upper limit of each class.

• Cumulative percentage frequency distribution – shows the percentage of items with values less
than or equal to the upper limit of each class.

• The cumulative frequency distribution uses the number of classes, class width and class limits
adopted for the frequency distribution.

• The last entry in the cumulative relative frequency distribution always equals 1, and the last entry in the cumulative percentage frequency distribution always equals 100.
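All three cumulative distributions are running totals of the ordinary frequency distribution. A minimal sketch for the Bloem Auto Repair classes, using `itertools.accumulate` from the standard library:

```python
from itertools import accumulate

# Upper class limits and class frequencies for Bloem Auto Repair.
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]
n = sum(frequency)

# Running total: number of items less than or equal to each upper limit.
cumulative = list(accumulate(frequency))

# Divide by n for proportions, then scale by 100 for percentages.
cum_relative = [c / n for c in cumulative]
cum_percentage = [100 * c / n for c in cumulative]

for limit, c, r, p in zip(upper_limits, cumulative, cum_relative, cum_percentage):
    print(f"<= {limit:<5}{c:>4}{r:>7.2f}{p:>6.0f}")
```

The final row is a useful check: the cumulative frequency must equal n, the cumulative relative frequency 1, and the cumulative percentage 100.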
Cumulative Distributions
• Bloem Auto Repair

Cost (R)    Cumulative Frequency    Cumulative Relative Frequency    Cumulative Percentage Frequency
≤ 59                  2                       0.04                               4
≤ 69                 15                       0.30                              30
≤ 79                 31                       0.62                              62
≤ 89                 38                       0.76                              76
≤ 99                 45                       0.90                              90
≤ 109                50                       1.00                             100

Worked entries for ≤ 69: 2 + 13 = 15; 15/50 = 0.30; 0.30(100) = 30.
Cross-tabulations
• So far, we have focused on presentations that are used to summarize the data for one variable at a time.

• A cross-tabulation is a tabular summary of data for two variables simultaneously.

• It allows us to determine the relationship between two variables.

• Cross-tabulation can be used when:

• One variable is qualitative, and the other is quantitative

• Both variables are qualitative

• Both variables are quantitative

Cross-tabulations - Textbook example p 37

• Cross-tabulation of quality rating and meal price for 300 restaurants in Bloemfontein
• The left and top margin labels define the classes for the two variables.

Rows: Quality Rating (qualitative variable). Columns: Meal Price (quantitative variable).

Quality Rating    R100-190    R200-290    R300-390    R400-490    Total
Good                 42          40           2           0         84
Very Good            34          64          46           6        150
Excellent             2          14          28          22         66
Total                78         118          76          28        300

The Total column (right margin) is the frequency distribution for the quality rating. The Total row (bottom margin) is the frequency distribution for the meal price.
Insights Gained from Preceding Cross-tabulation.
• The greatest number of restaurants (64) have a very good rating and a meal price of R200-290.

• Of the most expensive restaurants (R400-490), none has a good rating and most have an excellent rating (22/28 = 78.6%).

• Of the 78 least expensive restaurants (R100-190), only 2 have an excellent rating (2/78 = 2.6%), while 42 have a good rating (42/78 = 53.8%).

• The right and bottom margins of the cross-tabulation provide the frequency distributions of quality rating and meal price.

• The right margin (in red) is the frequency distribution of quality rating and the bottom margin (in green) is the frequency distribution of meal price.

• Dividing the totals in the bottom margin (78, 118, 76 and 28) by the total for that row (300) provides relative and percentage frequency distributions for meal price.

• Looking at R100-190: 78/300 = 0.26, implying 26% of the restaurants were charging a meal price of R100-190.
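The margins of the cross-tabulation are just row and column sums. The sketch below assumes the restaurant counts are stored as one list per quality rating, in meal-price order; the variable names are illustrative.

```python
# Cell counts: one row per quality rating, one column per meal price class.
prices = ["R100-190", "R200-290", "R300-390", "R400-490"]
counts = {
    "Good":      [42, 40, 2, 0],
    "Very Good": [34, 64, 46, 6],
    "Excellent": [2, 14, 28, 22],
}

# Right margin: frequency distribution of quality rating.
row_totals = {rating: sum(row) for rating, row in counts.items()}

# Bottom margin: frequency distribution of meal price.
col_totals = [sum(column) for column in zip(*counts.values())]

grand_total = sum(row_totals.values())

# Relative frequency distribution of meal price from the bottom margin.
price_relative = [total / grand_total for total in col_totals]

for price, total, rel in zip(prices, col_totals, price_relative):
    print(f"{price:<10}{total:>5}{rel:>7.2f}")
```

The two margins must agree on the grand total (300 here); a mismatch signals a data-entry error in the cells.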
Cross-tabulations

• The frequency and relative frequency distributions derived from the margins of a cross-
tabulation provide information about each variable.

• They do not shed light on the relationship between the two variables.

• To explore the relationship between the two variables, we need to convert the entries in a cross-tabulation into row or column percentages.

• The example in the textbook looked at the relationship based on row percentages.

• I will do the column percentages to have a complete picture of this example.
Cross-tabulation: Column Percentages

• Each column in the table is a percentage frequency distribution of quality rating for one of the meal prices (for the first column, 100 × (42/78) = 53.8).

• You are expected to know how to derive the row and column percentage tables and interpret the results.
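Deriving both percentage tables from the restaurant cross-tabulation can be sketched as below. Each cell is divided by its column total for column percentages, or by its row total for row percentages; values are rounded to one decimal place, which is a presentation choice made here.

```python
# Cell counts: one row per quality rating, one column per meal price class.
counts = {
    "Good":      [42, 40, 2, 0],
    "Very Good": [34, 64, 46, 6],
    "Excellent": [2, 14, 28, 22],
}

col_totals = [sum(column) for column in zip(*counts.values())]

# Column percentages: each cell as a percentage of its column total.
col_pct = {
    rating: [round(100 * cell / total, 1) for cell, total in zip(row, col_totals)]
    for rating, row in counts.items()
}

# Row percentages: each cell as a percentage of its row total.
row_pct = {
    rating: [round(100 * cell / sum(row), 1) for cell in row]
    for rating, row in counts.items()
}

print("Column percentages")
for rating, row in col_pct.items():
    print(f"{rating:<10}", row)

print("Row percentages")
for rating, row in row_pct.items():
    print(f"{rating:<10}", row)
```

In the column-percentage table each column sums to about 100, so it answers "among restaurants at this price, how are ratings distributed?". The row-percentage table answers the reverse question.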
Cross-tabulation: Column Percentages

• Of the restaurants charging a low price (R100-190), the greatest proportion are rated good (54%).

• For restaurants charging a high price (R400-490), the greatest proportion are rated highly (79% have an excellent rating).

• Based on this, it seems that higher meal prices are positively associated with higher quality ratings.

• In doing this example, did you see the mix of qualitative (quality) and quantitative (price) variables?

• While qualitative variables are categorical already, for quantitative variables we need to create classes before using the variable in a crosstab.
Cross-tabulation: Simpson’s Paradox

• Simpson’s Paradox: A phenomenon in statistics in which the conclusions based upon an aggregated cross-tabulation can be completely reversed if we look at the disaggregated data.

• Data in two or more cross-tabulations are often aggregated to produce a summary cross-tabulation.

• Patterns seen in the aggregated data may be reversed or disappear altogether in the disaggregated data.

• For example, the results of a cross-tabulation of health status and youth status might change
once we disaggregate youth status by gender.

• Health status of male and female youths is likely to differ.

• We must be careful in drawing conclusions about the relationship between the two variables in
the aggregated cross-tabulation.
Simpson’s paradox
[Figure: side-by-side tables of aggregated data and disaggregated data]
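A small numerical example makes the reversal concrete. The numbers below are entirely hypothetical, invented for illustration and not taken from the slides or the textbook: two lodges, with guest satisfaction recorded separately for business and leisure guests.

```python
# HYPOTHETICAL counts: (satisfied guests, guests surveyed) per lodge and guest type.
data = {
    "Lodge A": {"business": (80, 100), "leisure": (18, 20)},
    "Lodge B": {"business": (15, 20), "leisure": (85, 100)},
}


def rate(satisfied, surveyed):
    """Percentage of surveyed guests who were satisfied."""
    return 100 * satisfied / surveyed


# Disaggregated view: satisfaction rate within each guest type.
by_group = {
    lodge: {group: rate(*cell) for group, cell in groups.items()}
    for lodge, groups in data.items()
}

# Aggregated view: pool both guest types for each lodge.
aggregated = {}
for lodge, groups in data.items():
    satisfied = sum(s for s, _ in groups.values())
    surveyed = sum(t for _, t in groups.values())
    aggregated[lodge] = rate(satisfied, surveyed)

for lodge in data:
    print(lodge, by_group[lodge], f"overall {aggregated[lodge]:.1f}%")
```

Lodge A has the higher satisfaction rate among business guests (80% against 75%) and among leisure guests (90% against 85%), yet Lodge B looks better once the groups are pooled (about 83.3% against 81.7%). The reversal arises because most of Lodge A's guests fall in the harder-to-please business group.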
Scatter Diagram and Trend Line

• A scatter diagram is a graphical presentation of the relationship between two quantitative variables.

• One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.

• The general pattern of the plotted points suggests the overall relationship between the variables.

• A trend line is an approximation of the relationship, which can be positive, negative, or absent.

A scatter diagram can indicate three possible relationships between two variables:
• A positive relationship, shown by an upward-sloping trend line
• A negative relationship, shown by a downward-sloping trend line
• No apparent relationship, shown by a trend line that is close to horizontal
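The slides do not say how the trend line is fitted. One common choice, assumed here, is the least-squares line; its slope then tells us which of the three cases applies. The data points below are hypothetical, and `least_squares_line` is an illustrative helper name.

```python
# HYPOTHETICAL paired observations for two quantitative variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares trend line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(x, y)

# Classify the relationship by the sign of the slope.
if slope > 0:
    direction = "positive"
elif slope < 0:
    direction = "negative"
else:
    direction = "no apparent relationship"

print(f"trend line: y = {intercept:.2f} + {slope:.2f}x ({direction})")
```

In practice a slope that is merely close to zero, rather than exactly zero, is read as no apparent relationship; where to draw that line is a judgment call.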
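Scatter Diagram and Trend Line

• A Positive Relationship

[Figure: scatter diagram of y against x with an upward-sloping trend line]

Scatter Diagram and Trend Line

• A Negative Relationship

[Figure: scatter diagram of y against x with a downward-sloping trend line]

Scatter Diagram and Trend Line

• No Apparent Relationship

[Figure: scatter diagram of y against x with a trend line that is close to horizontal]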
Summary: Tabular and Graphical Procedures

Data

Qualitative Data
  Tabular Methods
    • Frequency Distribution
    • Relative Freq. Distribution
    • Percent Freq. Distribution
    • Cross-tabulation
  Graphical Methods
    • Bar Graph
    • Pie Chart

Quantitative Data
  Tabular Methods
    • Frequency Dist.
    • Rel. Freq. Dist.
    • % Freq. Dist.
    • Cum. Freq. Dist.
    • Cum. Rel. Freq. Distribution
    • Cum. % Freq. Distribution
    • Cross-tabulation
  Graphical Methods
    • Dot Plot
    • Histogram
    • Stem-and-Leaf Display
    • Scatter Diagram
End of Chapter 2

Attempt questions provided at the end of the chapter in the textbook.

Next: UNIT 1, CONT…..
UNIT 1: DESCRIPTIVE STATISTICS
