
Lecture notes Probability and Statistics (Stat 2071)

Chapter one

CHAPTER 1
1. Basic Concepts, methods of data collection and presentation

1.1 INTRODUCTION
Definition and classifications of statistics
Definition:
We can define statistics in two ways.
1. Plural sense (layman's definition).
It is an aggregate or collection of numerical facts.
2. Singular sense (formal definition)
Statistics is defined as the science of collecting,
organizing, presenting, analyzing and interpreting
numerical data for the purpose of assisting in making a
more effective decision.
Classifications:
Depending on how the data are used, statistics is
divided into two main areas or branches.
1. Descriptive Statistics: is concerned with summary
calculations, graphs, charts and tables.
2. Inferential Statistics: is a method used to generalize
from a sample to a population. For example, the average
income of all families (the population) in Ethiopia can be
estimated from figures obtained from a few hundred (the
sample) families.
 It is important because statistical data usually arises
from sample.
 Statistical techniques based on probability theory are
required.

Stages in Statistical Investigation


There are five stages or steps in any statistical investigation.
1. Collection of data: the process of measuring, gathering,
assembling the raw data up on which the statistical
investigation is to be based.
 Data can be collected in a variety of ways; one of the
most common methods is through the use of survey.


Surveys can also be conducted by different methods; three
of the most common methods are:
 Telephone survey
 Mailed questionnaire
 Personal interview.
Exercise: discuss the advantage and disadvantage of
the above three methods with respect to each other.
2. Organization of data: Summarization of data in some
meaningful way, e.g. in table form.
3. Presentation of the data: The process of re-organization,
classification, compilation, and summarization of data to
present it in a meaningful form.
4. Analysis of data: The process of extracting relevant
information from the summarized data, mainly through
the use of elementary mathematical operation.
5. Inference of data: The interpretation and further
observation of the various statistical measures through
the analysis of the data by implementing those methods
by which conclusions are formed and inferences made.
 Statistical techniques based on probability theory are
required.

Definitions of some terms


a. Statistical Population: It is the collection of all possible
observations of a specified characteristic of interest
(possessing certain common property) and being under
study. An example is all of the students at AAU who
take the Stat 2011 course.
b. Sample: It is a subset of the population, selected using
some sampling technique in such a way that they
represent the population.
c. Sampling: The process or method of sample selection
from the population.
d. Sample size: The number of elements or observation to
be included in the sample.
e. Census: Complete enumeration or observation of the
elements of the population. Or it is the collection of data
from every element in a population


f. Parameter: Characteristic or measure obtained from a
population.
g. Statistic: Characteristic or measure obtained from a
sample.
h. Variable: It is an item of interest that can take on many
different numerical values.
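
The difference between a parameter and a statistic can be illustrated with a short Python sketch. The income figures, the sample size of 4 and the variable names below are made-up assumptions for illustration only; they are not data from these notes.

```python
import random
from statistics import mean

# Hypothetical population: monthly incomes of 12 families (made-up values).
population = [1200, 1500, 900, 2000, 1750, 1100,
              1300, 1600, 950, 1800, 1400, 1250]

# Parameter: a measure computed from every element of the population (a census).
population_mean = mean(population)

# Statistic: the same measure computed from a sample of the population.
random.seed(1)                          # fixed seed so the sketch is repeatable
sample = random.sample(population, 4)   # sample size n = 4, drawn without replacement
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The sample mean generally differs from the population mean, which is why inferential techniques based on probability theory are needed to judge how far off it may be.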

Types of Variables or Data:

1. Qualitative Variables are nonnumeric variables and
can't be measured. Examples include gender, religious
affiliation, and state of birth.
2. Quantitative Variables are numerical variables and can
be measured. Examples include balance in checking
account, number of children in family. Note that quantitative
variables are either discrete (which can assume only certain
values, and there are usually "gaps" between the values,
such as the number of bedrooms in your house) or
continuous (which can assume any value within a specific
range, such as the air pressure in a tire.)

Applications, Uses and Limitations of statistics


Applications of statistics:
 In almost all fields of human endeavor.
 Almost all human beings in their daily life are subjected
to obtaining numerical facts, e.g. about prices.
 Applicable in some process e.g. invention of certain
drugs, extent of environmental pollution.
 In industries especially in quality control area.
Uses of statistics:
The main function of statistics is to enlarge our knowledge of
complex phenomena. The following are some uses of
statistics:
1. It presents facts in a definite and precise form.
2. Data reduction.
3. Measuring the magnitude of variations in data.
4. Furnishes a technique of comparison


5. Estimating unknown population characteristics.
6. Formulating and testing hypotheses.
7. Studying the relationship between two or more
variables.
8. Forecasting future events.
Limitations of statistics
As a science statistics has its own limitations. The following
are some of the limitations:
 Deals with only quantitative information.
 Deals with only aggregate of facts and not with
individual data items.
 Statistical data are only approximately and not
mathematically correct.
 Statistics can be easily misused and therefore should
be used by experts.

Scales of measurement
Proper knowledge about the nature and type of data to be
dealt with is essential in order to specify and apply the
proper statistical method for their analysis and inferences.
Measurement scale refers to the property of value assigned
to the data based on the properties of order, distance and
fixed zero.

In mathematical terms measurement is a functional mapping
from the set of objects {Oi} to the set of real numbers
{M(Oi)}.


The goal of measurement systems is to structure the rule for
assigning numbers to objects in such a way that the
relationship between the objects is preserved in the numbers
assigned to the objects. The different kinds of relationships
preserved are called properties of the measurement system.

Order

The property of order exists when an object that has more of
the attribute than another object is given a bigger number
by the rule system. This relationship must hold for all objects
in the "real world".

The property of ORDER exists when, for all i, j:
if Oi > Oj, then M(Oi) > M(Oj).

Distance

The property of distance is concerned with the relationship
of differences between objects. If a measurement system
possesses the property of distance, it means that the unit of
measurement means the same thing throughout the scale of
numbers. That is, an inch is an inch, no matter where it falls:
immediately ahead or a mile down the road.


More precisely, an equal difference between two numbers
reflects an equal difference in the "real world" between the
objects that were assigned the numbers. In order to define
the property of distance in the mathematical notation, four
objects are required: Oi, Oj, Ok, and Ol. The difference
between objects is represented by the "-" sign; Oi - Oj refers
to the actual "real world" difference between object i and
object j, while M(Oi) - M(Oj) refers to differences between
numbers.

The property of DISTANCE exists if, for all i, j, k, l:
if Oi - Oj ≥ Ok - Ol, then M(Oi) - M(Oj) ≥ M(Ok) - M(Ol).

Fixed Zero

A measurement system possesses a rational zero (fixed
zero) if an object that has none of the attribute in question is
assigned the number zero by the system of rules. The object
does not need to really exist in the "real world", as it is
somewhat difficult to visualize a "man with no height". The
requirement for a rational zero is this: if objects with none of
the attribute did exist, would they be given the value zero?
Defining O0 as the object with none of the attribute in
question, the definition of a rational zero becomes:

The property of FIXED ZERO exists if M(O0) = 0.

The property of fixed zero is necessary for ratios between
numbers to be meaningful.

SCALE TYPES

Measurement is the assignment of numbers to objects or
events in a systematic fashion. Four levels of measurement
scales are commonly distinguished: nominal, ordinal,
interval, and ratio, each possessing different properties of
measurement systems.

Nominal Scales


Nominal scales are measurement systems that possess none
of the three properties stated above.

 Level of measurement which classifies data into


mutually exclusive, all inclusive categories in which no
order or ranking can be imposed on the data.
 No arithmetic and relational operation can be applied.

Examples:

o Political party preference (Republican, Democrat,
or Other)
o Sex (Male or Female)

o Marital status (married, single, widowed, divorced)

o Country code

o Regional differentiation of Ethiopia.

Ordinal Scales

Ordinal Scales are measurement systems that possess the
property of order, but not the property of distance. The
property of fixed zero is not important if the property of
distance is not satisfied.

 Level of measurement which classifies data into


categories that can be ranked. Precise differences between the
ranks do not exist.
 Arithmetic operations are not applicable but relational
operations are applicable.

 Ordering is the sole property of ordinal scale.

Examples:

o Letter grades (A, B, C, D, F).

o Rating scales (Excellent, Very good, Good, Fair,
Poor).


o Military status.

Interval Scales

Interval scales are measurement systems that possess the
properties of order and distance, but not the property of
fixed zero.

 Level of measurement which classifies data that can be


ranked and differences are meaningful. However, there is
no meaningful zero, so ratios are meaningless.
 All arithmetic operations except division are applicable.
 Relational operations are also possible.

Examples:

o IQ
o Temperature in °F.
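
Why ratios are meaningless on an interval scale can be checked numerically. The sketch below is only an illustration: the three temperature readings are made-up values, and the helper name `fahrenheit_to_celsius` is an assumption introduced here. It shows that equal differences survive a change of unit, while the ratio of two readings does not.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9

# Three hypothetical readings in °F and the same readings in °C.
a_f, b_f, c_f = 50.0, 68.0, 86.0
a_c, b_c, c_c = (fahrenheit_to_celsius(t) for t in (a_f, b_f, c_f))

# Distance property: gaps that are equal in °F are still equal in °C.
print("equal gaps in °F:", b_f - a_f == c_f - b_f)
print("equal gaps in °C:", b_c - a_c == c_c - b_c)

# No fixed zero: the ratio of two readings depends on the unit chosen.
print("ratio in °F:", c_f / a_f)
print("ratio in °C:", c_c / a_c)
```

Because the two ratios disagree, a statement such as "twice as hot" has no unit-independent meaning for temperatures measured in °F.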

Ratio Scales

Ratio scales are measurement systems that possess all three
properties: order, distance, and fixed zero. The added power
of a fixed zero allows ratios of numbers to be meaningfully
interpreted; i.e. the ratio of Bekele's height to Martha's
height is 1.32, whereas this is not possible with interval
scales.

 Level of measurement which classifies data that can be


ranked, differences are meaningful, and there is a true
zero. True ratios exist between the different units of
measure.
 All arithmetic and relational operations are applicable.

Examples:

o Weight
o Height

o Number of students


o Age

The following is a list of different attributes and rules
for assigning numbers to objects. Try to classify the different
measurement systems into one of the four types of scales.
(Exercise)

1. Your checking account number as a name for your
account.
2. Your checking account balance as a measure of the
amount of money you have in that account.
3. The order in which you were eliminated in a spelling
bee as a measure of your spelling ability.
4. Your score on the first statistics test as a measure of
your knowledge of statistics.
5. Your score on an individual intelligence test as a
measure of your intelligence.
6. The distance around your forehead measured with a
tape measure as a measure of your intelligence.
7. A response to the statement "Abortion is a woman's
right" where "Strongly Disagree" = 1, "Disagree" = 2,
"No Opinion" = 3, "Agree" = 4, and "Strongly Agree" =
5, as a measure of attitude toward abortion.
8. Times for swimmers to complete a 50-meter race
9. Months of the year Meskerm, Tikimit…
10. Socioeconomic status of a family when classified
as low, middle and upper classes.
11. Blood type of individuals, A, B, AB and O.
12. Pollen counts provided as numbers between 1 and
10 where 1 implies there is almost no pollen and 10
that it is rampant, but for which the values do not
represent actual counts of grains of pollen.
13. Region numbers of Ethiopia (1, 2, 3 etc.)
14. The number of students in a college.
15. The net wages of a group of workers.
16. The height of the men in the same town.

INTRODUCTION TO METHODS OF DATA COLLECTION


There are two sources of data:
1. Primary Data
 Data measured or collected by the investigator or
the user directly from the source.
 Two activities involved: planning and measuring.
a) Planning:
 Identify source and elements of the
data.
 Decide whether to consider sample or
census.
 If sampling is preferred, decide on
sample size, selection method,… etc
 Decide measurement procedure.
 Set up the necessary organizational
structure.
b) Measuring: there are different options.
 Focus Group
 Telephone Interview
 Mail Questionnaires
 Door-to-Door Survey
 Mall Intercept
 New Product Registration
 Personal Interview and
 Experiments are some of the sources
for collecting the primary data.
2. Secondary Data
 Data gathered or compiled from published and
unpublished sources or files.
 When our source is secondary data check that:
 The type and objective of the situations.
 The purpose for which the data were
collected is compatible with the
present problem.
 The nature and classification of data is
appropriate to our problem.
 There are no biases and misreporting in
the published data.
Note: Data which are primary for one may be secondary for
the other.


1.2. METHODS OF DATA PRESENTATION

Having collected and edited the data, the next important step is to organize it, that is, to
present it in a readily comprehensible, condensed form that helps in drawing
inferences from it. It is also necessary that like items be separated from unlike ones.

The presentation of data is broadly classified in to the following two categories:

 Tabular presentation
 Diagrammatic and Graphic presentation.

The process of arranging data into classes or categories according to similarities
is technically called classification.

Classification is a preliminary step that prepares the ground for proper presentation of data.

Definitions:

 Raw data: recorded information in its original collected form, whether it be


counts or measurements, is referred to as raw data.
 Frequency: is the number of values in a specific class of the distribution.

 Frequency distribution: is the organization of raw data in table form using classes
and frequencies.

There are three basic types of frequency distributions

 Categorical frequency distribution


 Ungrouped frequency distribution

 Grouped frequency distribution

There are specific procedures for constructing each type.

1) Categorical frequency Distribution:

Used for data that can be placed in specific categories, such as nominal or ordinal data. E.g.
marital status.

Example: A social worker collected the following data on marital status for 25
persons. (M=married, S=single, W=widowed, D=divorced)


M S D W D

S S M M M

W D S M M

W D D S S

S W W D D

Solution:

Since the data are categorical, discrete classes can be used. There are four types of marital
status: M, S, D, and W. These types will be used as the classes for the distribution. We follow
the procedure below to construct the frequency distribution.

Step 1: Make a table as shown.

Class    Tally    Frequency    Percent
(1)      (2)      (3)          (4)
M
S
D
W

Step 2: Tally the data and place the result in column (2).

Step 3: Count the tally and place the result in column (3).

Step 4: Find the percentage of values in each class by using

$$\text{Percent} = \frac{f}{n} \times 100$$

where f = frequency of the class and n = total number of values.

Percentages are not normally a part of a frequency distribution, but they can be added since
they are used in certain types of diagrams, such as pie charts.

Step 5: Find the total for column (3) and (4).


Combining all the steps, one can construct the following frequency distribution.

Class    Tally       Frequency    Percent
(1)      (2)         (3)          (4)
M        ///// /     6            24
S        ///// //    7            28
D        ///// //    7            28
W        /////       5            20
Total                25           100
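
Steps 2 to 5 above can be sketched in Python. This is a minimal illustration, not part of the original notes: it tallies the 25 marital status codes exactly as listed in the example using `collections.Counter` and applies the percentage formula from step 4. The variable names are assumptions chosen for readability.

```python
from collections import Counter

# Marital status of the 25 persons, row by row as listed in the example.
data = ("M S D W D "
        "S S M M M "
        "W D S M M "
        "W D D S S "
        "S W W D D").split()

n = len(data)            # total number of values
freq = Counter(data)     # frequency f of each class (steps 2 and 3)

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for cls in ("M", "S", "D", "W"):
    f = freq[cls]
    percent = f / n * 100            # step 4: percent = f / n * 100
    print(f"{cls:<6}{f:>10}{percent:>9.0f}")
print(f"{'Total':<6}{n:>10}{100:>9}")   # step 5: column totals
```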

2) Ungrouped frequency Distribution:

-Is a table of all the potential raw score values that could possibly occur in the data along
with the number of times each actually occurred.

-Is often constructed for a small data set or for data on a discrete variable.

Constructing ungrouped frequency distribution:


 First find the smallest and largest raw score in the collected data.
 Arrange the data in order of magnitude and count the frequency.
 To facilitate counting one may include a column of tallies.
Example:
The following data represent the mark of 20 students.

80 76 90 85 80
70 60 62 70 85
65 60 63 74 75
76 70 70 80 85

Construct a frequency distribution, which is ungrouped.


Solution:

Step 1: Find the range, Range=Max-Min=90-60=30.


Step 2: Make a table as shown
Step 3: Tally the data.
Step 4: Compute the frequency.
Mark    Tally    Frequency
60      //       2
62      /        1
63      /        1
65      /        1
70      ////     4
74      /        1
75      /        1
76      //       2
80      ///      3
85      ///      3
90      /        1
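
The same table can be produced mechanically, which is a useful check on hand tallies. The following is a small Python sketch under the assumption that one slash per observation is an acceptable tally notation; it counts the 20 marks listed in the example and prints them in order of magnitude.

```python
from collections import Counter

# Marks of the 20 students, as listed in the example.
marks = [80, 76, 90, 85, 80,
         70, 60, 62, 70, 85,
         65, 60, 63, 74, 75,
         76, 70, 70, 80, 85]

freq = Counter(marks)    # number of times each distinct mark occurs

print(f"{'Mark':<6}{'Tally':<8}{'Frequency':>9}")
for mark in sorted(freq):                       # order of magnitude
    tally = "/" * freq[mark]                    # one slash per observation
    print(f"{mark:<6}{tally:<8}{freq[mark]:>9}")
```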

Each individual value is presented separately, which is why it is called an ungrouped
frequency distribution.

3) Grouped frequency Distribution:

-When the range of the data is large, the data must be grouped in to classes that are more than
one unit in width.

Definitions:

 Grouped Frequency Distribution: a frequency distribution when several numbers


are grouped in one class.
 Class limits: Separates one class in a grouped frequency distribution from another.
The limits could actually appear in the data and have gaps between the upper limits of
one class and lower limit of the next.

 Units of measurement (U): the distance between two possible consecutive measures.
It is usually taken as 1, 0.1, 0.01, 0.001, -----.

 Class boundaries: Separates one class in a grouped frequency distribution from


another. The boundaries have one more decimal place than the raw data and
therefore do not appear in the data. There is no gap between the upper boundary of
one class and lower boundary of the next class. The lower class boundary is found by
subtracting U/2 from the corresponding lower class limit and the upper class
boundary is found by adding U/2 to the corresponding upper class limit.

 Class width: the difference between the upper and lower class boundaries of any
class. It is also the difference between the lower limits of any two consecutive classes
or the difference between any two consecutive class marks.

 Class mark (Mid points): it is the average of the lower and upper class limits or the
average of upper and lower class boundary.

 Cumulative frequency: is the number of observations less than/more than or equal to


a specific value.

 Cumulative frequency above: it is the total frequency of all values greater than or
equal to the lower class boundary of a given class.

 Cumulative frequency below: it is the total frequency of all values less than or equal
to the upper class boundary of a given class.

 Cumulative Frequency Distribution (CFD): it is the tabular arrangement of class
interval together with their corresponding cumulative frequencies. It can be more than
or less than type, depending on the type of cumulative frequency used.

 Relative frequency (rf): it is the frequency divided by the total frequency.

 Relative cumulative frequency (rcf): it is the cumulative frequency divided by the


total frequency.

Guidelines for classes

1. There should be between 5 and 20 classes.


2. The classes must be mutually exclusive. This means that no data value can fall
into two different classes
3. The classes must be all inclusive or exhaustive. This means that all data values
must be included.
4. The classes must be continuous. There are no gaps in a frequency distribution.
5. The classes must be equal in width. The exception here is the first or last class. It
is possible to have a "below ..." or "... and above" class. This is often used with
ages.

Steps for constructing Grouped frequency Distribution

1. Find the largest and smallest values


2. Compute the Range(R) = Maximum - Minimum
3. Select the number of classes desired, usually between 5 and 20, or use Sturges' rule
$k = 1 + 3.32 \log_{10} n$, where k is the number of classes desired and n is the total number of
observations.
4. Find the class width by dividing the range by the number of classes and rounding
up, not off.
5. Pick a suitable starting point less than or equal to the minimum value. The starting
point is called the lower limit of the first class. Continue to add the class width to
this lower limit to get the rest of the lower limits.
6. To find the upper limit of the first class, subtract U from the lower limit of the
second class. Then continue to add the class width to this upper limit to find the
rest of the upper limits.
7. Find the boundaries by subtracting U/2 units from the lower limits and adding U/2
units to the upper limits. The boundaries are also half-way between the upper
limit of one class and the lower limit of the next class. It may not be necessary to
find the boundaries.
8. Tally the data.
9. Find the frequencies.
10. Find the cumulative frequencies. Depending on what you're trying to accomplish,
it may not be necessary to find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative frequencies
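
Steps 1 to 7 can be sketched as a small Python function. This is a hedged illustration, not a definitive procedure: it assumes whole-number data (U = 1), starts the first class at the minimum value, and the function name `class_intervals` and the `ages` data are made up for this sketch. It also adds one safeguard that is not stated in the steps above: when the range divided by k is already a whole number, the width is increased by U so that the last class still reaches the maximum.

```python
import math

U = 1  # unit of measurement; this sketch assumes whole-number data

def class_intervals(data):
    """Return (k, w, limits, boundaries) following steps 1 to 7."""
    low, high = min(data), max(data)                   # step 1
    data_range = high - low                            # step 2
    k = math.ceil(1 + 3.32 * math.log10(len(data)))    # step 3: Sturges' rule
    w = math.ceil(data_range / k)                      # step 4: round up, not off
    # Assumption added for this sketch: widen by U if the last class
    # would otherwise stop short of the maximum value.
    if low + k * w - U < high:
        w += U
    lowers = [low + i * w for i in range(k)]           # step 5: lower limits
    limits = [(lo, lo + w - U) for lo in lowers]       # step 6: upper limits
    boundaries = [(lo - U / 2, up + U / 2) for lo, up in limits]  # step 7
    return k, w, limits, boundaries

ages = [23, 45, 31, 52, 38, 27, 49, 60, 35, 41, 29, 57, 33, 44, 50]  # made-up data
k, w, limits, boundaries = class_intervals(ages)
print("number of classes k =", k, " class width w =", w)
for (lo, up), (lb, ub) in zip(limits, boundaries):
    print(f"{lo} - {up}    {lb} - {ub}")
```

Steps 8 to 11 (tallying, frequencies and cumulative frequencies) then only require counting how many values fall between each pair of limits.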

Example*:

Construct a frequency distribution for the following data.

11 29 6 33 14 31 22 27 19 20

18 17 22 38 23 21 26 34 39 27

Solutions:

Step 1: Find the highest and the lowest value H=39, L=6

Step 2: Find the range; R=H-L=39-6=33

Step 3: Select the number of classes desired using Sturges' formula;

$k = 1 + 3.32 \log_{10}(20) = 5.32$, which gives k = 6 (rounding up)

Step 4: Find the class width; w=R/k=33/6=5.5=6 (rounding up)

Step 5: Select the starting point, let it be the minimum observation.

 6, 12, 18, 24, 30, 36 are the lower class limits.

Step 6: Find the upper class limits; e.g. the first upper class limit = 12 - U = 12 - 1 = 11

 11, 17, 23, 29, 35, 41 are the upper class limits.

So combining step 5 and step 6, one can construct the following classes.

Class limits

6 – 11

12 – 17

18 – 23

24 – 29

30 – 35

36 – 41

Step 7: Find the class boundaries;

E.g. for class 1, lower class boundary = 6 - U/2 = 5.5

Upper class boundary = 11 + U/2 = 11.5

 Then continue adding w to both boundaries to obtain the remaining boundaries. By
doing so one can obtain the following classes.

Class boundary

5.5 – 11.5

11.5 – 17.5

17.5 – 23.5

23.5 – 29.5

29.5 – 35.5

35.5 – 41.5

Step 8: tally the data.

Step 9: Write the numeric values for the tallies in the frequency column.

Step 10: Find cumulative frequency.

Step 11: Find relative frequency or/and relative cumulative frequency.

The complete frequency distribution follows:


Class      Class          Class    Tally       Freq.   Cf (less      Cf (more      rf      rcf (less
limit      boundary       mark                         than type)    than type)            than type)
6 – 11     5.5 – 11.5     8.5      //          2       2             20            0.10    0.10
12 – 17    11.5 – 17.5    14.5     //          2       4             18            0.10    0.20
18 – 23    17.5 – 23.5    20.5     ///// //    7       11            16            0.35    0.55
24 – 29    23.5 – 29.5    26.5     ////        4       15            9             0.20    0.75
30 – 35    29.5 – 35.5    32.5     ///         3       18            5             0.15    0.90
36 – 41    35.5 – 41.5    38.5     //          2       20            2             0.10    1.00
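
The frequency, cumulative frequency and relative frequency columns can be recomputed from the raw data as a check. The Python sketch below takes the class limits from step 5 and step 6 as given; the variable names are assumptions made for this illustration.

```python
from itertools import accumulate

data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]
n = len(data)

# Frequency: number of values lying between the limits of each class.
freq = [sum(lo <= x <= up for x in data) for lo, up in limits]

# Less-than type: running total from the first class downwards.
cf_less = list(accumulate(freq))
# More-than type: running total from the last class upwards.
cf_more = list(accumulate(reversed(freq)))[::-1]

marks = [(lo + up) / 2 for lo, up in limits]   # class marks (mid points)
rf = [f / n for f in freq]                     # relative frequency
rcf = [c / n for c in cf_less]                 # relative cumulative frequency

for (lo, up), m, f, cl, cm, r, rc in zip(limits, marks, freq,
                                         cf_less, cf_more, rf, rcf):
    print(f"{lo:>2} - {up:<2}  {m:>5}  {f:>2}  {cl:>2}  {cm:>2}  {r:.2f}  {rc:.2f}")
```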

Diagrammatic and Graphic presentation of data.

These are techniques for presenting data in visual displays using geometric figures and pictures.

Importance:

 They have greater attraction.


 They facilitate comparison.

 They are easily understandable.

-Diagrams are appropriate for presenting discrete data.

-The three most commonly used diagrammatic presentations for discrete as well as qualitative
data are:

 Pie charts
 pictogram

 Bar charts

Pie chart

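- A pie chart is a circle that is divided into sections or wedges according to the
percentage of frequencies in each category of the distribution. The angle of the
sector is obtained using:

$$\text{Angle of sector} = \frac{f}{n} \times 360^\circ$$

where f is the frequency of the category and n is the total frequency.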


Example: Draw a suitable diagram to represent the following population in a town.

Men Women Girls Boys

2500 2000 4000 1500

Solutions:

Step 1: Find the percentage.

Step 2: Find the number of degrees for each class.

Step 3: Using a protractor and compass, graph each section and write its name and corresponding
percentage.

Class Frequency Percent Degree

Men 2500 25 90

Women 2000 20 72

Girls 4000 40 144

Boys 1500 15 54
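
Steps 1 and 2 amount to two proportions of the same total, one scaled to 100 and one scaled to 360 degrees. A minimal Python sketch, with variable names chosen for this illustration only:

```python
# Population of the town by class, as given in the example.
population = {"Men": 2500, "Women": 2000, "Girls": 4000, "Boys": 1500}
total = sum(population.values())

rows = []
print(f"{'Class':<7}{'Frequency':>10}{'Percent':>9}{'Degree':>8}")
for cls, f in population.items():
    percent = f / total * 100    # step 1: share of the total in percent
    degrees = f / total * 360    # step 2: angle of the sector in degrees
    rows.append((cls, f, percent, degrees))
    print(f"{cls:<7}{f:>10}{percent:>9.0f}{degrees:>8.0f}")
```

The angles of all sectors must add up to 360 degrees, which is a quick way to catch an arithmetic slip before drawing.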


[Figure: pie chart titled "CLASS" with sectors for Boys, Men, Girls and Women]

Pictogram
-In this diagram, we represent data by means of picture symbols. We decide on
a suitable picture to represent a definite number of units in which the variable is
measured.
Bar Charts:
- A set of bars (thick lines or narrow rectangles) representing some magnitude over
time or space.
- They are useful for comparing aggregates over time or space.
- Bars can be drawn either vertically or horizontally.
- There are different types of bar charts. The most common being :
 Simple bar chart
 Deviation or two way bar chart
 Broken bar chart
 Component or sub divided bar chart.
 Multiple bar charts.

Simple Bar Chart

-Are used to display data on one variable.


-They are thick lines (narrow rectangles) having the same breadth. The magnitude of a quantity
is represented by the height /length of the bar.
Example: The following data represent the sales by product, 1957-1959, of a given company for three
products A, B, C.

Product    Sales ($)    Sales ($)    Sales ($)
           in 1957      in 1958      in 1959
A          12           14           18
B          24           21           18
C          24           35           54

Solutions:

[Figure: simple bar chart "Sales by product in 1957"; X axis: product (A, B, C); Y axis: sales in $, scaled from 0 to 30]

Component Bar chart


-When there is a desire to show how a total (or aggregate) is divided in to its component parts, we
use component bar chart.
-The bars represent total value of a variable with each total broken in to its component parts and
different colours or designs are used for identifications
Example:
Draw a component bar chart to represent the sales by product from 1957 to 1959.
Solutions:

[Figure: component bar chart "SALES BY PRODUCT 1957-1959"; X axis: year of production (1957, 1958, 1959); Y axis: sales in $, scaled from 0 to 100; components: Product A, Product B, Product C]
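
The heights at which each component of a bar starts and ends can be computed from the sales table in the simple bar chart example. The sketch below is an illustration only; it assumes Product A is drawn at the bottom of each bar, followed by B and then C, and the names `sales` and `segments` are made up here.

```python
# Sales ($) by product and year, from the table in the example.
sales = {
    "A": {1957: 12, 1958: 14, 1959: 18},
    "B": {1957: 24, 1958: 21, 1959: 18},
    "C": {1957: 24, 1958: 35, 1959: 54},
}
years = [1957, 1958, 1959]

segments = {}   # year -> list of (product, bottom, top) for that bar
for year in years:
    bottom = 0
    segments[year] = []
    for product in ("A", "B", "C"):     # assumed stacking order
        top = bottom + sales[product][year]
        segments[year].append((product, bottom, top))
        bottom = top                    # next component starts where this one ends

for year in years:
    total = segments[year][-1][2]       # top of the last component = bar height
    print(year, segments[year], "total:", total)
```

The total height of each bar is the aggregate sales for that year, which is what distinguishes a component bar chart from a multiple bar chart.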

Multiple Bar charts

- These are used to display data on more than one variable.
- They are used for comparing different variables at the same time.
Example:
Draw a multiple bar chart to represent the sales by product from 1957 to 1959.
Solutions:

[Figure: multiple bar chart "Sales by product 1957-1959"; X axis: year of production (1957, 1958, 1959); Y axis: sales in $, scaled from 0 to 60; one bar per product (Product A, Product B, Product C) for each year]

Graphical Presentation of data


- The histogram, frequency polygon and cumulative frequency graph or ogive are the
most commonly applied graphical representations for continuous data.
Procedures for constructing statistical graphs:
 Draw and label the X and Y axes.
 Choose a suitable scale for the frequencies or cumulative frequencies and label it on the
Y axes.
 Represent the class boundaries for the histogram or ogive or the mid points for the
frequency polygon on the X axes.
 Plot the points.
 Draw the bars or lines to connect the points.

Histogram

A graph which displays the data by using vertical bars of various heights to represent frequencies.
Class boundaries are placed along the horizontal axis. Class marks and class limits are sometimes
used as the quantity on the X axis.

Example: Construct a histogram to represent the previous data (example *).


Frequency Polygon:
- A line graph. The frequency is placed along the vertical axis and class mid points
are placed along the horizontal axis. It is customary to add the next higher and lower
class intervals with a corresponding frequency of zero; this is to make it a complete
polygon.

Example: Draw a frequency polygon for the above data (example *).
Solutions:
[Figure: frequency polygon; X axis: class mid points (2.5, 8.5, 14.5, 20.5, 26.5, 32.5, 38.5, 44.5); Y axis: frequency, scaled from 0 to 8]
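
The points to plot for the frequency polygon can be listed with a few lines of Python. This sketch takes the class marks, frequencies and class width from example * as given and adds one zero-frequency class on each side, as described above; the variable names are assumptions for this illustration.

```python
marks = [8.5, 14.5, 20.5, 26.5, 32.5, 38.5]   # class marks from example *
freq = [2, 2, 7, 4, 3, 2]                      # class frequencies from example *
w = 6                                          # class width from example *

# One extra class with frequency zero on each side closes the polygon.
x = [marks[0] - w] + marks + [marks[-1] + w]
y = [0] + freq + [0]

points = list(zip(x, y))   # (class mid point, frequency) pairs to join by lines
for mid, f in points:
    print(mid, f)
```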

Ogive (cumulative frequency polygon)


- A graph showing the cumulative frequency (less than or more than type) plotted
against upper or lower class boundaries respectively. That is class boundaries are
plotted along the horizontal axis and the corresponding cumulative frequencies are
plotted along the vertical axis. The points are joined by a free hand curve.
Example: Draw an ogive curve(less than type) for the above data.
(Example *)
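
The points of the less than type ogive follow directly from the upper class boundaries and the less than cumulative frequencies of example *. The Python sketch below is illustrative; starting the curve at the lower boundary of the first class with a cumulative frequency of zero is a common convention assumed here, not something stated in the notes.

```python
from itertools import accumulate

upper_boundaries = [11.5, 17.5, 23.5, 29.5, 35.5, 41.5]   # from example *
freq = [2, 2, 7, 4, 3, 2]                                  # from example *

cf_less = list(accumulate(freq))   # less than type cumulative frequencies

# Assumed starting point: no observation lies below the first lower boundary.
points = [(5.5, 0)] + list(zip(upper_boundaries, cf_less))
for boundary, cf in points:
    print(boundary, cf)
```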
