1. Descriptive Statistics I

The document provides an overview of descriptive statistics, focusing on graphical techniques and methods for summarizing data. It discusses frequency distributions, various types of graphs such as histograms and pie charts, and emphasizes the importance of good presentation skills in conveying data effectively. Additionally, it covers the construction of frequency distribution tables and the use of logarithms in statistical analysis.

Uploaded by

stilesmoonlight

QUANTITATIVE ANALYSIS FOR BUSINESS (BUSI1006)

DESCRIPTIVE STATISTICS I

Dr Jing Zhang
Room C27, Business School South Building


[email protected]
DESCRIPTIVE STATISTICS: GRAPHICAL TECHNIQUES
DESCRIBING DATA (DESCRIPTIVE STATISTICS)
Graphical methods (charts and graphs), for example:
[Figure: Smoking prevalence, percentage, Great Britain, 1948-2012. Source: Cancer Research UK]
Numerical methods (summary measures), for example:
98.7% of undergraduates from Nottingham University Business School had secured work or further study within six months of graduation. The average starting salary was £24,000, with the highest being £42,000.
DESCRIPTIVE STATISTICS
Methods of organizing, summarizing, and presenting data in a convenient and informative way (graphical and numerical methods).

Detailed data may not make much sense on its own.

We need to summarize the data and extract meaningful information from it.

Graphical techniques are one way of summarizing the data.

Graphical techniques: the use of tables, graphs and charts.

DATA TO DESCRIPTIVE STATISTICS
Data → Frequency distribution → Histogram (graph)
1. FREQUENCY DISTRIBUTIONS
1000 individuals were surveyed and asked about their age and whether they smoke. 273 responded that they smoke.

[Table: a frequency distribution table, constructed from raw data collected on the age of smokers]

Frequency distribution: a grouping of raw data into mutually exclusive categories, showing the number of observations that fall into each category (an interval).
Mutually exclusive: if an observation falls into one category, it cannot fall into another (>=15 and <20; >=20 and <30; …).
1. FREQUENCY DISTRIBUTIONS

Some terms/concepts used in frequency distributions:

Class Midpoint: A point that divides a class into two equal parts. This is the average of the upper and lower class limits (or the average of the lower class limits of two consecutive classes).
Class Frequency: The number of observations in each class.
Class Interval: The class interval is obtained by subtracting the lower limit of a class from the lower limit of the next class.

Example:

Age            Number (class frequency)
15 up to 20    12
20 up to 30    37

For the first class, the lower class limit is 15 and the upper class limit is 20.
Class interval: 20 - 15 = 5
Class midpoint: (15 + 20) / 2 = 17.5
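
The two calculations above can be sketched in a few lines of Python. This is only an illustration of the definitions; the function names are made up for this example.

```python
def class_midpoint(lower, upper):
    """Average of the lower and upper class limits."""
    return (lower + upper) / 2


def class_interval(lower, next_lower):
    """Lower limit of the next class minus the lower limit of this class."""
    return next_lower - lower


# First class in the age example: 15 up to 20; the next class starts at 20.
midpoint = class_midpoint(15, 20)
interval = class_interval(15, 20)
print(midpoint, interval)  # 17.5 5
```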


CONSTRUCTING A FREQUENCY DISTRIBUTION
Example: Hours Studied
We have an unordered data set containing 30 observations on hours spent studying QAB.

[Data table not shown; the minimum and maximum values are marked.]

Task: construct a frequency distribution table.
How many classes? What is the class interval? How should the class limits be set?
We need to follow some steps to construct the frequency distribution table.
CONSTRUCTING A FREQUENCY DISTRIBUTION
Step 1 - Decide on the number of classes using the formula:

$2^k > n$

where k = number of classes, n = number of observations

We can find the value of k by trial and error:
There are 30 observations, so n = 30.
$2^4 = 16$ and $2^5 = 32$, so we should have at least 5 classes.
Using logarithms:
Taking natural logs on both sides of the inequality: $\ln(2^k) > \ln 30$.
Using the calculation rule of logs, we get $k \ln 2 > \ln 30$, then $k > \ln 30 / \ln 2 \approx 4.907$.

Note: 5 classes is not the final number of classes.
We can confirm this only after having worked out the interval width from step 2.
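
As a rough check of Step 1, the sketch below finds k both ways: by trial and error and via logarithms. It assumes the rule is the strict inequality 2^k > n; the function names are illustrative only.

```python
import math


def classes_by_trial(n):
    """Smallest whole number k with 2**k > n, found by trial and error."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def log_bound(n):
    """The bound ln(n) / ln(2); k must be larger than this value."""
    return math.log(n) / math.log(2)


k = classes_by_trial(30)
bound = log_bound(30)
print(k, round(bound, 3))  # 5 4.907
```

The trial-and-error loop uses whole-number arithmetic only, so it avoids rounding problems when n happens to be an exact power of 2.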
CONSTRUCTING A FREQUENCY DISTRIBUTION
Step 2 – Determine the class interval or width using the formula:

$i \ge (H - L) / k$

where i = class interval, H = highest number, L = lowest number, k = no. of classes (from step 1)

Thus, with k = 5, $i \ge (H - L) / 5$.
Round up to the nearest whole number, for an interval of 5 hours.
Set the lower limit of the first class at 7.5 hours (less than the lowest number).
This gives a total of 6 classes (at least 5 classes, as required by step 1).
Note: some judgement is needed on the width of the interval.
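
A minimal sketch of Step 2 follows. The data table is not reproduced here, so the highest and lowest values below (33 and 10 hours) are hypothetical, chosen only so that the result matches the interval of 5 hours and the 6 classes stated above. The function names are also illustrative.

```python
import math


def class_width(highest, lowest, k):
    """(H - L) / k, rounded up to the nearest whole number."""
    return math.ceil((highest - lowest) / k)


def classes_needed(highest, first_lower_limit, width):
    """Number of classes of the given width, starting at first_lower_limit,
    needed so that the highest value falls inside the last class."""
    return math.floor((highest - first_lower_limit) / width) + 1


# Hypothetical values, NOT taken from the original data set.
H, L, k = 33, 10, 5
width = class_width(H, L, k)                 # ceil(23 / 5) = ceil(4.6)
n_classes = classes_needed(H, 7.5, width)    # classes start at 7.5 hours
print(width, n_classes)  # 5 6
```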
CONSTRUCTING A FREQUENCY DISTRIBUTION
Step 3 – Define the individual class limits, tally and count the number of items in each class, and arrange them into a table:

[Table: frequency distribution of hours studied]

The first category will contain the smallest number in the data set.
The last category will contain the largest number in the data set.
Our raw data has been summarised into a frequency distribution table.
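
The tallying in Step 3 can be sketched as below. The ten observations are hypothetical (the original 30 observations are not reproduced here); the class limits follow the text: first lower limit 7.5 hours, width 5 hours, 6 classes. Each class is treated as "lower up to upper", so a value equal to an upper limit goes into the next class.

```python
def frequency_table(data, first_lower, width, n_classes):
    """Tally observations into equal-width classes [lower, upper)."""
    counts = [0] * n_classes
    for x in data:
        index = int((x - first_lower) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{x} falls outside the class limits")
        counts[index] += 1
    return [
        (first_lower + i * width, first_lower + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]


# Hypothetical hours studied, NOT the original data set.
hours = [10, 12, 14.5, 18, 19, 21, 23, 23.5, 26, 33]
table = frequency_table(hours, first_lower=7.5, width=5, n_classes=6)
for lower, upper, count in table:
    print(f"{lower} up to {upper}: {count}")
```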
CONSTRUCTING A FREQUENCY DISTRIBUTION
A Relative Frequency Distribution table shows the proportion / percent of observations in each class.

Note: to compute relative frequencies, we simply divide each frequency by the total number of observations (30).
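
As a small illustration of this division, the sketch below uses hypothetical class frequencies that sum to 30; they are not the counts from the original table.

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the total number of observations."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


# Hypothetical class frequencies summing to 30 observations.
freqs = [4, 6, 9, 6, 3, 2]
rel = relative_frequencies(freqs)
percentages = [round(100 * r, 1) for r in rel]
print(percentages)  # [13.3, 20.0, 30.0, 20.0, 10.0, 6.7]
```

The relative frequencies always sum to 1 (or 100 in percentage terms), which is a quick way to check the table.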
2. GRAPHS
Information is more easily understood through graphs and pictures.
Some commonly used graphs:
Graphical representations of frequency distributions based on quantitative data:
I. Histograms
II. Frequency polygons
III. Ogive curves
Other graphs:
IV. Line graphs – changes over time
V. Bar charts – commonly used to present qualitative data
VI. Pie charts – proportion of each class in the total
I) HISTOGRAMS

A Histogram is a graph in which the class midpoints (or limits in some textbooks) are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars and the bars are drawn adjacent to each other.
It is a graphical representation of a frequency distribution.
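
To show the idea without any plotting software, the sketch below prints a rough text histogram. Note that it is rotated: class midpoints run down the page and bar lengths (rather than heights) show the class frequencies. The frequencies are hypothetical and the function name is illustrative.

```python
def text_histogram(lower_limits, width, frequencies):
    """One line per class: the class midpoint, then a bar of '#' marks."""
    lines = []
    for lower, freq in zip(lower_limits, frequencies):
        midpoint = lower + width / 2
        lines.append(f"{midpoint:>5.1f} | " + "#" * freq)
    return "\n".join(lines)


# Hypothetical classes: 7.5 up to 12.5, 12.5 up to 17.5, ...
lower_limits = [7.5, 12.5, 17.5, 22.5, 27.5, 32.5]
freqs = [4, 6, 9, 6, 3, 2]
chart = text_histogram(lower_limits, 5, freqs)
print(chart)
```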
II) FREQUENCY POLYGONS

A Frequency Polygon consists of line segments connecting the points formed by the class midpoint and the class frequency.
A frequency polygon is simpler than its histogram counterpart.
It sketches an outline of the data pattern more clearly.
It makes comparisons easier.
The polygon becomes increasingly smooth and curve-like as we increase the number of classes and the number of observations.
III) LINE GRAPHS
A line graph is typically used to show the change or trend in a variable over time (days, weeks, months, years, etc.).

[Figure: Net variations of the values for FTSE100, HSBC and Tesco, 2017-2021. Source: London Stock Exchange]
IV) BAR CHARTS
A Bar Chart shows measures across different categories. It
can be used to depict any of the levels of measurement
(nominal, ordinal, interval, or ratio).
It is similar to a histogram but is not a measure of distribution.
IV) BAR CHARTS – HORIZONTAL BARS
Long labels are easier to display and read.
Many categories are easier to display.

[Figure: Total excess deaths from the start of the Covid-19 outbreaks until August 2021. Source: ft.com]
V) PIE CHARTS
A Pie Chart is useful for displaying a relative frequency
distribution. A circle is divided proportionally to the relative
frequency and portions of the circle are allocated for the
different groups.
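
Dividing the circle "proportionally to the relative frequency" means each group gets its relative frequency times 360 degrees. A minimal sketch, using hypothetical frequencies:

```python
def pie_angles(frequencies):
    """Angle of each slice in degrees: relative frequency times 360."""
    total = sum(frequencies)
    return [360 * f / total for f in frequencies]


# Hypothetical group frequencies summing to 30.
freqs = [4, 6, 9, 6, 3, 2]
angles = pie_angles(freqs)
print(angles)  # [48.0, 72.0, 108.0, 72.0, 36.0, 24.0]
```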
3. CROSS TABULATIONS: CATEGORICAL DATA THAT COME IN PAIRS
Used to represent two or more variables in categories.
Example: data collected on the survival of start-ups show that 63 out of the total 78 in services survived, while 33 out of the 82 start-ups in manufacturing failed. Tabulate this information.

             Services   Manufacturing   Total
Survived        63            49         112
Failed          15            33          48
Total           78            82         160
3. CROSS TABULATIONS (ROW PERCENTAGES)
We can work out percentages to allow comparisons across rows:
% of services among survivors = 63/112 = 56%
% of services in total = 78/160 = 49%
Report row percentages and row totals in the table.
3. CROSS TABULATIONS (COLUMN PERCENTAGES)
Percentages allow comparisons across columns as well:
% of services surviving = 63/78 = 81%
% of survivors in total = 112/160 = 70%
Report column percentages and column totals in the table.
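
The start-up example can be checked with a short sketch. The four cell counts follow from the figures in the example (78 - 63 = 15 service start-ups failed; 82 - 33 = 49 manufacturing start-ups survived). The variable names are illustrative only.

```python
# Cell counts: rows are outcomes, columns are sectors.
counts = {
    "Survived": {"Services": 63, "Manufacturing": 82 - 33},
    "Failed": {"Services": 78 - 63, "Manufacturing": 33},
}
sectors = ["Services", "Manufacturing"]

row_totals = {outcome: sum(row.values()) for outcome, row in counts.items()}
col_totals = {s: sum(counts[o][s] for o in counts) for s in sectors}
grand_total = sum(row_totals.values())


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Row percentages (within the "Survived" row, and for the total row).
services_among_survivors = pct(counts["Survived"]["Services"], row_totals["Survived"])
services_in_total = pct(col_totals["Services"], grand_total)

# Column percentages (within the "Services" column, and for the total column).
services_surviving = pct(counts["Survived"]["Services"], col_totals["Services"])
survivors_in_total = pct(row_totals["Survived"], grand_total)

print(services_among_survivors, services_in_total)  # 56 49
print(services_surviving, survivors_in_total)       # 81 70
```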
GOOD PRESENTATION SKILLS
Identify the table/graph that is most appropriate for your data.

Label your graphs/tables clearly.

Be as informative as possible: a graph should be able to convey meaningful information at a glance.

Yet avoid crowding in too much or irrelevant information. It can confuse the reader and defeat the purpose of summarising information.
GOOD PRESENTATION SKILLS
Sales of who/what?
Vertical scale missing
No informative title
PRESENTATION SKILLS?????
EXCEL EXAMPLE (ON MOODLE)
The Height and Shoe Size dataset contains information on height (in inches), dress shoe size, and gender for 408 college students.

Make a histogram of students' heights.


TAKEAWAYS
Extract meaningful information from large masses of data in summary form, usually in terms of:

Tables (e.g. frequency distributions or cross-tabulations)

Graphs (e.g. histograms, frequency polygons, pie charts, etc.)

Good presentation skills are important.


APPENDIX

Descriptive Statistics: Graphical Techniques


LOGARITHMS - THE INVERSE OF POWERS/EXPONENTIATION
$M = b^n$ (n = power, b = base > 0)
We can write this as $\log_b M = n$
n is the “logarithm of M to the base of b”.
b > 0, so M > 0
Example: , we solve n using ; and 

The natural exponential is denoted by: $e^x$

The natural logarithm is denoted by: $\ln x = \log_e x$

i.e. if $e^x = y$ then $\ln y = x$
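
The inverse relationship between powers and logarithms can be tried out with Python's math module. The numbers below (base 2, power 5, and x = 1.5) are arbitrary illustrations, not the values from the slide.

```python
import math

# Power: M = b ** n
b, n = 2, 5
M = b ** n                      # 32

# Logarithm: recovers the power n from M and the base b
recovered_n = math.log(M, b)    # approximately 5, up to floating-point rounding

# Natural exponential and natural logarithm undo each other
x = 1.5
y = math.exp(x)
recovered_x = math.log(y)       # approximately 1.5

print(M, recovered_n, recovered_x)
```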
LOGARITHM RULE AND CALCULATION
One of the rules in logarithm calculations is: $\ln(a^b) = b \ln a$

To solve $2^k > n$, where n = 30 in our example:

We take ln on both sides of the inequality to get $\ln(2^k) > \ln 30$.
Using the above rule on the left-hand side, we get $k \ln 2 > \ln 30$.
Solve it by dividing both sides by $\ln 2$, i.e. $k > \ln 30 / \ln 2$.
Use a scientific calculator to find the values of $\ln 30$ and $\ln 2$; then you obtain $k > 4.907$.

You can guess the value of k for small samples, but for large samples it would be better to use this logarithm method: $k > \ln n / \ln 2$
CUMULATIVE FREQUENCY DISTRIBUTIONS AND
OGIVE CURVES
A Cumulative Frequency Distribution table shows the cumulative addition of frequencies.

Note: to compute cumulative frequencies, we simply add the frequencies cumulatively. If we cumulatively add the relative frequencies, we obtain cumulative relative frequencies.
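
A minimal sketch of the running totals, using hypothetical class frequencies that sum to 30. Here the cumulative relative frequencies are obtained by dividing each cumulative frequency by the total, which gives the same result as cumulatively adding the relative frequencies.

```python
from itertools import accumulate

# Hypothetical class frequencies summing to 30 observations.
freqs = [4, 6, 9, 6, 3, 2]

# Running total of the class frequencies.
cumulative = list(accumulate(freqs))

# Cumulative relative frequencies: running total divided by the grand total.
total = sum(freqs)
cumulative_relative = [c / total for c in cumulative]

print(cumulative)  # [4, 10, 19, 25, 28, 30]
```

The last cumulative frequency always equals the total number of observations, and the last cumulative relative frequency equals 1.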
CUMULATIVE FREQUENCY DISTRIBUTIONS AND
OGIVE CURVES
An Ogive Curve is a graph of cumulative frequencies. In
percentage terms, it allows comparisons between
distributions.
