Lecture-02 Data Organization and Presentation
Presentation of Data
Graphical Presentation
• Bar chart
• Pie chart
• Histograms
• Frequency polygons
• Ogives
• Stem-and-Leaf Display
Example
Calculating Percentage
Example
Bar Graph
Pie Charts
Definition
A circle divided into portions that represent the relative
frequencies or percentages of a population or a sample belonging
to different categories is called a pie chart.
Figure 2.2 shows the pie chart for the percentage distribution of
Table 2.5, which uses the angle sizes calculated in Table 2.6.
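The angle of each slice is the category's relative frequency multiplied by 360 degrees. A minimal Python sketch of that calculation follows; the category names and counts are hypothetical, since the values of Tables 2.5 and 2.6 are not reproduced here, and `pie_angles` is an illustrative helper name.

```python
def pie_angles(frequencies):
    """Return {category: (percentage, angle_in_degrees)} for a dict of counts."""
    total = sum(frequencies.values())
    result = {}
    for category, count in frequencies.items():
        relative = count / total  # relative frequency of this category
        result[category] = (relative * 100, relative * 360)
    return result


# Hypothetical counts, for illustration only (not the data of Table 2.5).
counts = {"A": 10, "B": 25, "C": 15}
for category, (percent, angle) in pie_angles(counts).items():
    print(f"{category}: {percent:.1f}% -> {angle:.1f} degrees")
```

The angles always add up to 360 degrees, which is a quick check on the arithmetic.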
Organizing and Graphing for Quantitative Data
Frequency Distributions
$2^k \ge n$.
$i \ge \frac{H - L}{k}$
or
$\text{Approximate Class Width} = \frac{\text{largest value} - \text{smallest value}}{\text{number of classes}}$
• Exclusive method
• Inclusive method
Exclusive Method
When the class intervals are fixed so that the upper limit of one
class is the lower limit of the next class, it is known as the
'Exclusive' method of classification. The following data are
classified on this basis:
Inclusive method
Number of classes
Here n = 30.
We use $2^k \ge n$.
If k = 5, then $2^5 = 32 \ge n$,
so the class interval is $i \ge \frac{H - L}{k} = \frac{29 - 5}{5} = 4.8 \approx 5$.
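The same calculation can be sketched in Python. The function names are illustrative, and rounding the width up to the next whole number is an assumption that matches the 4.8 ≈ 5 step above.

```python
import math


def number_of_classes(n):
    """Smallest positive integer k with 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


def class_width(high, low, k):
    """Lower bound on the class interval: (H - L) / k."""
    return (high - low) / k


n, high, low = 30, 29, 5           # values from the example above
k = number_of_classes(n)           # 2**5 = 32 >= 30, so k = 5
i = class_width(high, low, k)      # (29 - 5) / 5 = 4.8
print(k, i, math.ceil(i))          # width rounded up to a whole number
```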
Histograms
Definition
Figures 2.3 and 2.4 show the frequency and the relative frequency
histograms, respectively, for the data of Tables 2.9 and 2.10 of
Sections 2.3.2 and 2.3.3. The two histograms look alike because
they represent the same data. A percentage histogram can be
drawn for the percentage distribution of Table 2.10 by marking
the percentages on the vertical axis. In Figures 2.3 and 2.4, we
have used class limits to mark classes on the horizontal axis.
However, we can show the intervals on the horizontal axis by
using the class boundaries instead of the class limits.
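A short Python sketch can show why the frequency, relative frequency, and percentage histograms have the same shape: each is the same set of bar heights divided or multiplied by a constant. The class frequencies below are hypothetical, not the data of Tables 2.9 and 2.10.

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the total number of observations."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


# Hypothetical class frequencies, for illustration only.
frequencies = [3, 7, 12, 6, 2]
relative = relative_frequencies(frequencies)
percentages = [100 * r for r in relative]

# Each bar is drawn relative to the tallest bar. That ratio is the same
# whether heights are frequencies, relative frequencies, or percentages.
tallest = max(frequencies)
for f, r, p in zip(frequencies, relative, percentages):
    bar = "#" * round(20 * f / tallest)
    print(f"{f:>3} {r:>6.3f} {p:>6.1f}%  {bar}")
```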
Class Boundary
We adjust the classes by deducting 0.5 from each lower limit and
adding 0.5 to each upper limit of all the classes.
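As a rough illustration, the adjustment can be written as a small Python function. The class limits are hypothetical, and the 0.5 adjustment assumes whole-number limits with a gap of 1 between consecutive classes.

```python
def class_boundaries(limits, adjustment=0.5):
    """Convert (lower limit, upper limit) pairs into class boundaries."""
    return [(lower - adjustment, upper + adjustment) for lower, upper in limits]


# Hypothetical inclusive class limits, for illustration only.
limits = [(5, 9), (10, 14), (15, 19)]
for (lower, upper), (lb, ub) in zip(limits, class_boundaries(limits)):
    print(f"{lower}-{upper}  ->  {lb} to {ub}")
```

After the adjustment, the upper boundary of each class equals the lower boundary of the next, so the histogram bars touch.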
Example
4.95 5.80 4.50 4.85 6.992 12.35 7.75 10.45 21.77 18.00
25.99 8.00 2.99 16.60 9.00 15.75 9.50 3.05 5.65 21.00
Solution
Shapes of Histogram
Symmetric
Skewed
Uniform or Rectangular
Symmetric Histogram
Skewed histogram
Uniform histogram
Stem-and-Leaf Display
Definition
75 52 80 96 65 79 71 87 93 95
69 72 81 61 76 86 79 68 50 92
83 84 77 64 71 87 72 92 57 98
Solution
After we have listed the stems, we read the leaves for all scores
and record them next to the corresponding stems on the right side
of the vertical line. The complete stem-and-leaf display for scores
is shown in Figure 2.14.
Features of distributions: using stem-and-leaf plot
When you assess the overall pattern of any distribution (which is the
pattern formed by all values of a particular variable), look for these
features:
number of peaks
centre
spread
Number of peaks
Line graphs are useful because they readily reveal some characteristic
of the data. The first characteristic that can be readily seen from a line
graph is the number of high points or peaks the distribution has.
While most distributions that occur in statistical data have only one
main peak (unimodal), other distributions may have two
peaks (bimodal) or more than two peaks (multimodal).
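One simple way to count peaks is to look for class counts that are higher than both neighbours. The Python sketch below does this for hypothetical counts. It is a simplification: it treats every local maximum as a peak and ignores flat tops, so small bumps in real data would need judgment.

```python
def count_peaks(counts):
    """Count entries strictly greater than their neighbours (local maxima)."""
    peaks = 0
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else -1                  # no left neighbour at the start
        right = counts[i + 1] if i < len(counts) - 1 else -1   # no right neighbour at the end
        if c > left and c > right:
            peaks += 1
    return peaks


def describe(counts):
    """Label a distribution by its number of peaks."""
    n = count_peaks(counts)
    if n == 1:
        return "unimodal"
    if n == 2:
        return "bimodal"
    return "multimodal" if n > 2 else "no clear peak"


# Hypothetical counts per class, for illustration only.
print(describe([1, 2, 3, 7, 9, 14, 5]))
print(describe([2, 8, 3, 1, 6, 9, 4]))
```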
The amount of distribution spread and any large deviations from the
The results of 41 students' math tests (with a best possible score of 70) are
recorded below:
31, 49, 19, 62, 50, 24, 45, 23, 51, 32, 48
55, 60, 40, 35, 54, 26, 57, 37, 43, 65, 50
55, 18, 53, 41, 50, 34, 67, 56, 44, 4, 54
57, 39, 52, 45, 35, 51, 63, 42
1. Prepare an ordered stem and leaf plot for the data and briefly describe
what it shows.
a. number of peaks
b. symmetry
Solution:
A test score is a discrete variable. For example, it is not possible to have a
test score of 35.74542341....
The lowest value is 4 and the highest is 67. Therefore, the stem and leaf
plot that covers this range of values looks like this:
Stem | Leaf
0 | 4
1 | 8 9
2 | 3 4 6
3 | 1 2 4 5 5 7 9
4 | 0 1 2 3 4 5 5 8 9
5 | 0 0 0 1 1 2 3 4 4 5 5 6 7 7
6 | 0 2 3 5 7
The result of 4 could be an outlier, since there is a large gap between this
and the next result, 18.
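The ordered plot can be reproduced with a short Python sketch. It assumes non-negative whole-number scores below 100, so that the tens digit is the stem and the units digit is the leaf; `stem_and_leaf` is an illustrative helper name.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    plot = defaultdict(list)
    for v in sorted(values):          # sorting first gives ordered leaves
        plot[v // 10].append(v % 10)
    lowest, highest = min(values) // 10, max(values) // 10
    # Keep every stem in the range, including any that have no leaves.
    return {stem: plot.get(stem, []) for stem in range(lowest, highest + 1)}


# The 41 test scores from the example above.
scores = [
    31, 49, 19, 62, 50, 24, 45, 23, 51, 32, 48,
    55, 60, 40, 35, 54, 26, 57, 37, 43, 65, 50,
    55, 18, 53, 41, 50, 34, 67, 56, 44, 4, 54,
    57, 39, 52, 45, 35, 51, 63, 42,
]

for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```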
If the stem and leaf plot is turned on its side, it will look like the
following:
Line charts, especially useful in the fields of statistics and science, are more
popular than all other graphs combined because their visual characteristics
reveal data trends clearly and these charts are easy to create.
Line charts compare two variables: one is plotted along the x-axis
(horizontal) and the other along the y-axis (vertical). The y-axis in a line
chart usually indicates quantity (e.g. dollars, litres) or percentage, while the
horizontal x-axis often measures units of time. As a result, the line chart is
often viewed as a time series graph. For example, if you wanted to graph
the height of a baseball pitch over time, you could measure the time
variable along the x-axis, and the height along the y-axis. Although they do
not present specific data as well as tables do, line charts are able to show
relationships more clearly than tables do. Line charts can also depict
multiple series and hence are usually the best candidate for time series data
and frequency distribution.
Chart 5.5.1 shows one obvious trend, the fluctuation in the labour force
from January to July. The number of students at Andrew's high school who
are members of the labour force is scaled using intervals on the y-axis,
while the time variable is plotted on the x-axis.
Chart 5.5.2 is a single line chart comparing two items. In this example, time
is not a factor. The chart compares the average number of dollars donated
by the age of the donors. According to the trend in the chart, the older the
donor, the more money he or she donates. The 17-year-old donors donate,
on average, $84. For the 19-year-olds, the average donation increased by
$26 to make the average donation of that age group $110.