Course Code & Number: FET201
E-mail: [email protected]
Syllabus
1- Introduction:
Definition of statistics, types of statistics, population, sample, variables and types of
variables, boundaries of a continuous variable.
2- Frequency distributions and graphs:
Categorical frequency distribution
Grouped frequency distribution
Ungrouped frequency distribution
Histogram, frequency polygon, ogive, stem and leaf plots.
Other types of graphs (pie graph, bar graph, Pareto chart, time series graph)
3- Data description:
Measures of central tendency
Measures of variation
Measures of position
4- Probability:
Basic concepts: probability experiment, outcome, sample space, event, tree diagram
Probability of an event, complement of an event, mutually exclusive events
Addition rule, multiplication rules, conditional probability.
5- Discrete probability distributions:
Probability distributions
Mean, variance, standard deviation, and expectation
The binomial distribution
6- The Normal Distribution:
Properties of the normal distribution
The standard normal distribution
Applications of the normal distribution
7- Correlation and Regression:
Correlation: scatter plot, linear correlation coefficient, levels of correlation
Regression: equation of regression
1- Introduction and Basic Concepts
Statistics: the science of conducting studies to collect,
organize, summarize, analyze, and draw conclusions from data.
A population: all subjects (human or otherwise) that
are being studied.
Example: All students who registered in the university last year.
A sample: a group of subjects selected from a population.
Example: A group of students who registered in the department of
IT.
Types of statistics
Types of data (variables)
Qualitative (categorical)
Quantitative (numerical): discrete or continuous
Example
Determine the correct data type (quantitative or qualitative).
Indicate whether quantitative data are continuous or discrete.
a. the number of pairs of shoes you own
b. the type of car you drive
c. the distance it is from your home to the nearest grocery
store
d. the number of classes you take per school year.
e. the type of calculator you use
f. weights of sumo wrestlers
g. number of correct answers on a quiz
h. IQ (intelligence quotient)
Solution
a. quantitative, discrete
b. qualitative (categorical)
c. quantitative, continuous
d. quantitative, discrete
e. qualitative (categorical)
f. quantitative, continuous
g. quantitative, discrete
h. quantitative, continuous
The boundaries of a continuous variable
The boundaries of a continuous variable are given in one
additional decimal place and always end with the digit 5.
Example:
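As an illustration of the rule, the sketch below computes the boundaries of a recorded measurement. The values 15 and 8.6 are assumed for the illustration and are not taken from the slides, and the helper name `boundaries` is hypothetical. The boundaries lie half a unit of the last recorded decimal place below and above the value, so they carry one more decimal place and end in 5.

```python
from decimal import Decimal


def boundaries(recorded: str) -> tuple[Decimal, Decimal]:
    """Return the (lower, upper) boundaries of a measurement given as text."""
    value = Decimal(recorded)
    # Half of one unit in the last recorded decimal place:
    # 0.5 for "15", 0.05 for "8.6".
    half_unit = Decimal(1).scaleb(value.as_tuple().exponent) / 2
    return value - half_unit, value + half_unit


# Assumed example values (not from the slides).
print(boundaries("15"))
print(boundaries("8.6"))
```

The value is passed as text so that the number of recorded decimal places is preserved exactly.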
Relative frequency      Percent
5/25 = 0.20             20
7/25 = 0.28             28
9/25 = 0.36             36
4/25 = 0.16             16
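The arithmetic in this table can be checked with a short sketch. It takes only the four class frequencies shown above (5, 7, 9, 4) and divides each by the total; the variable names are illustrative.

```python
frequencies = [5, 7, 9, 4]   # class frequencies from the table above
total = sum(frequencies)     # total number of observations

# Relative frequency = class frequency / total; percent = relative frequency * 100.
relative = [f / total for f in frequencies]
percents = [round(100 * f / total) for f in frequencies]

for f, rel, pct in zip(frequencies, relative, percents):
    print(f"{f}/{total} = {rel:.2f}  ({pct}%)")
```

The relative frequencies should always add up to 1 (and the percents to 100), which is a quick check on the table.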
Grouped Frequency Distribution
Example: The following data represent the record high
temperatures for each of the 50 states. Construct a grouped
frequency distribution for the data using 7 classes.
Solution
Determine the classes.
Determine the lowest value (L), L = 100, and the
highest value (H), H = 134.
Find the range (R). Range = highest value - lowest value
R = H - L = 134 - 100 = 34.
Find the class width.
Class width = Range / number of classes
= 34/7 ≈ 4.9, which is rounded up to 5
Rounding rule: always round up if there is a remainder.
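The range and class width calculation can be sketched as follows, using the values from this example (L = 100, H = 134, 7 classes). `math.ceil` plays the role of the rounding rule here; the variable names are illustrative.

```python
import math

lowest, highest = 100, 134   # L and H from the example
num_classes = 7

data_range = highest - lowest                         # R = H - L
# Round up whenever the division leaves a remainder.
class_width = math.ceil(data_range / num_classes)

print(f"Range = {data_range}, class width = {class_width}")
```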
Constructing a Grouped Frequency Distribution
For convenience, we choose the lowest data value, 100, as the
first lower class limit.
The subsequent lower class limits are found by adding the width
to the previous lower class limit.
The first upper class limit is one less than the next lower class limit.
The subsequent upper class limits are found by adding the width
to the previous upper class limit.
Class Limits
100 - 104
105 - 109
110 - 114
115 - 119
120 - 124
125 - 129
130 - 134
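The construction of the class limits can be sketched in a few lines. The sketch assumes whole-number data, so that each upper limit is one less than the next lower limit; the variable names are illustrative.

```python
lowest, width, num_classes = 100, 5, 7

# Each lower limit is the previous lower limit plus the class width.
lower_limits = [lowest + i * width for i in range(num_classes)]
# For whole-number data, each upper limit is one less than the next lower limit.
upper_limits = [low + width - 1 for low in lower_limits]

classes = list(zip(lower_limits, upper_limits))
for low, high in classes:
    print(f"{low} - {high}")
```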
Exercise: Find the midpoint of each class in the previous example.
Rules for Classes in Grouped Frequency Distributions
1. There should be 5-20 classes.
2. The classes must be mutually exclusive.
3. The classes must be continuous.
4. The classes must be exhaustive.
5. The classes must be equal in width (except in open-ended
distributions).
Cumulative Frequency
A cumulative frequency distribution is a distribution that
shows the number of data values less than or equal to a specific
value (usually an upper boundary).
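Class Boundaries      Cumulative Frequency
Less than 99.5        0
Less than 104.5       2
Less than 109.5       10
Less than 114.5       28
Less than 119.5       41
Less than 124.5       48
Less than 129.5       49
Less than 134.5       50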
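Cumulative frequencies are running totals of the class frequencies. A minimal sketch, using the class frequencies of the record high temperature distribution (2, 8, 18, 13, 7, 1, 1) and the standard library function `itertools.accumulate`:

```python
from itertools import accumulate

frequencies = [2, 8, 18, 13, 7, 1, 1]   # record high temperature classes

# Start at 0 (no values lie below the first lower boundary),
# then add each class frequency to the running total.
cumulative = [0] + list(accumulate(frequencies))

print(cumulative)
```

The last cumulative frequency equals the total number of data values, here 50.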
Ungrouped Frequency Distribution
When the range of the data values is relatively small, a frequency distribution
can be constructed using single data values for each class. This type of
distribution is called an ungrouped frequency distribution.
Example
The data shown here represent the number of miles per gallon (mpg) that 30
selected four-wheel-drive sports utility vehicles obtained in city driving.
Construct a frequency distribution.
Solution
STEP 1 Determine the classes.
Determine the lowest value (L), L=12, highest value (H), H=19.
Find the range (R), R=H-L=19-12=7.
2-2 Graphs
The Three Most Common Graphs in Research
1. Histogram
2. Frequency Polygon
3. Cumulative Frequency Polygon (Ogive)
1- Histograms
The histogram is a graph that displays the data by using
contiguous (unless the frequency of a class is 0) vertical bars of
various heights to represent the frequencies of the classes.
Steps
1: Draw and label the x and y axes. The x axis is
always the horizontal axis, and the y axis is always
the vertical axis.
2: Represent the class boundaries on the x axis
and the frequencies on the y axis.
3: Using the frequencies as the heights, draw vertical
bars for each class.
Example 2-4
Construct a histogram to represent the data for
the record high temperatures for each of the 50
states (see Example 2–2 for the data).
Class Limits   Frequency
100 - 104 2
105 - 109 8
110 - 114 18
115 - 119 13
120 - 124 7
125 - 129 1
130 - 134 1
Histograms
Histograms use class boundaries and
frequencies of the classes.
Class Limits    Class Boundaries    Frequency
100 - 104       99.5 - 104.5        2
105 - 109       104.5 - 109.5       8
110 - 114       109.5 - 114.5       18
115 - 119       114.5 - 119.5       13
120 - 124       119.5 - 124.5       7
125 - 129       124.5 - 129.5       1
130 - 134       129.5 - 134.5       1
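The class boundaries in this table, and a rough text version of the histogram, can be produced with the sketch below. It assumes the temperatures are recorded to the nearest whole degree, so each boundary lies 0.5 beyond the corresponding class limit. The `#` bars are only a stand-in for the drawn graph.

```python
class_limits = [(100, 104), (105, 109), (110, 114), (115, 119),
                (120, 124), (125, 129), (130, 134)]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Assumption: whole-number data, so boundaries sit 0.5 outside the limits.
boundaries = [(low - 0.5, high + 0.5) for low, high in class_limits]

# One row per class; the bar length is the class frequency.
for (lower, upper), freq in zip(boundaries, frequencies):
    print(f"{lower:5.1f} - {upper:5.1f} | {'#' * freq}")
```

Because each upper boundary equals the next lower boundary, the bars of the histogram touch.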
Frequency Polygon
The frequency polygon is a graph that displays the data by using
lines that connect points plotted for the frequencies at the class
midpoints. The frequencies are represented by the heights of the
points.
Steps
1: Draw and label the x and y axes.
2: Represent the class midpoints on the x axis.
3: Choose a suitable scale for the frequencies, and label it on the y
axis.
4: Connect adjacent points with line segments. Draw a line back to
the x axis at the beginning and end of the graph, at the same
distance that the previous and next midpoints would be located.
Example 2-5
Construct a frequency polygon to represent the
data for the record high temperatures for each of
the 50 states.
Class Limits   Frequency
100 - 104 2
105 - 109 8
110 - 114 18
115 - 119 13
120 - 124 7
125 - 129 1
130 - 134 1
Frequency Polygons
Frequency polygons use class midpoints
and frequencies of the classes.
Class Limits    Class Midpoints    Frequency
100 - 104       102                2
105 - 109       107                8
110 - 114       112                18
115 - 119       117                13
120 - 124       122                7
125 - 129       127                1
130 - 134       132                1
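The points of the frequency polygon can be computed as a sketch like this. Each midpoint is the average of the class limits, and the polygon is closed on the x axis one class width before the first midpoint and one class width after the last, as described in step 4. The variable names are illustrative.

```python
class_limits = [(100, 104), (105, 109), (110, 114), (115, 119),
                (120, 124), (125, 129), (130, 134)]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Midpoint = (lower limit + upper limit) / 2.
midpoints = [(low + high) / 2 for low, high in class_limits]
points = list(zip(midpoints, frequencies))

# Close the polygon where the previous and next midpoints would be.
width = class_limits[1][0] - class_limits[0][0]
closed = [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]

for x, y in closed:
    print(x, y)
```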
An Ogive (Cumulative Frequency Polygon)
The ogive is a graph that represents the cumulative
frequencies for the classes in a frequency distribution.
Steps
1: Draw and label the x and y axes.
2: Represent the class boundaries on the x axis.
3: Choose a suitable scale for the cumulative frequencies, and
label it on the y axis.
4: Plot the cumulative frequency at each upper class boundary,
then connect adjacent points with line segments.
Example 2-6
Construct an ogive to represent the data for the
record high temperatures for each of the 50
states (see Example 2–2 for the data).
Class Limits   Frequency
100 - 104 2
105 - 109 8
110 - 114 18
115 - 119 13
120 - 124 7
125 - 129 1
130 - 134 1
Solution
Ogives use upper class boundaries and
cumulative frequencies of the classes.
Class Limits    Class Boundaries    Frequency    Cumulative Frequency
100 - 104       99.5 - 104.5        2            2
105 - 109       104.5 - 109.5       8            10
110 - 114       109.5 - 114.5       18           28
115 - 119       114.5 - 119.5       13           41
120 - 124       119.5 - 124.5       7            48
125 - 129       124.5 - 129.5       1            49
130 - 134       129.5 - 134.5       1            50
Ogives
Ogives use upper class boundaries and
cumulative frequencies of the classes.
Class Boundaries   Cumulative Frequency
Less than 99.5 0
Less than 104.5 2
Less than 109.5 10
Less than 114.5 28
Less than 119.5 41
Less than 124.5 48
Less than 129.5 49
Less than 134.5 50
2.2 Relative Frequency Graphs
If proportions are used instead of frequencies, the
graphs are called relative frequency graphs.
Example 2-7 Page #57
Construct a histogram, frequency polygon, and ogive
using relative frequencies for the distribution (shown
here) of the miles that 20 randomly selected runners
ran during a given week.
Class Boundaries   Frequency
5.5 - 10.5 1
10.5 - 15.5 2
15.5 - 20.5 3
20.5 - 25.5 5
25.5 - 30.5 4
30.5 - 35.5 3
35.5 - 40.5 2
Histograms
The following is a frequency distribution of
miles run per week by 20 selected runners.
Class Boundaries    Frequency    Relative Frequency
5.5 - 10.5          1            1/20 = 0.05
10.5 - 15.5         2            2/20 = 0.10
15.5 - 20.5         3            3/20 = 0.15
20.5 - 25.5         5            5/20 = 0.25
25.5 - 30.5         4            4/20 = 0.20
30.5 - 35.5         3            3/20 = 0.15
35.5 - 40.5         2            2/20 = 0.10
Σf = 20                          Σrf = 1.00
Divide each frequency by the total frequency to get the relative frequency.
Histograms
Use the class boundaries and the
relative frequencies of the classes.
Frequency Polygons
The following is a frequency distribution of
miles run per week by 20 selected runners.
Class Class Relative
Boundaries Midpoints Frequency
5.5 - 10.5 8 0.05
10.5 - 15.5 13 0.10
15.5 - 20.5 18 0.15
20.5 - 25.5 23 0.25
25.5 - 30.5 28 0.20
30.5 - 35.5 33 0.15
35.5 - 40.5 38 0.10
Frequency Polygons
Use the class midpoints and the
relative frequencies of the classes.
Ogives
The following is a frequency distribution of
miles run per week by 20 selected runners.
Class Boundaries    Frequency    Cumulative Frequency    Cum. Rel. Frequency
5.5 - 10.5          1            1                       1/20 = 0.05
10.5 - 15.5         2            3                       3/20 = 0.15
15.5 - 20.5         3            6                       6/20 = 0.30
20.5 - 25.5         5            11                      11/20 = 0.55
25.5 - 30.5         4            15                      15/20 = 0.75
30.5 - 35.5         3            18                      18/20 = 0.90
35.5 - 40.5         2            20                      20/20 = 1.00
Σf = 20
Ogives
Ogives use upper class boundaries and
cumulative frequencies of the classes.
Class Boundaries      Cum. Rel. Frequency
Less than 5.5         0
Less than 10.5        0.05
Less than 15.5        0.15
Less than 20.5        0.30
Less than 25.5        0.55
Less than 30.5        0.75
Less than 35.5        0.90
Less than 40.5        1.00
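The points of the relative frequency ogive can be reproduced with the sketch below. It uses the runners' class boundaries and frequencies from this example, accumulates the frequencies, and divides each running total by the total frequency. The variable names are illustrative.

```python
from itertools import accumulate

boundaries = [5.5, 10.5, 15.5, 20.5, 25.5, 30.5, 35.5, 40.5]
frequencies = [1, 2, 3, 5, 4, 3, 2]      # miles run per week, 20 runners
total = sum(frequencies)

# Running totals, starting at 0 for the first lower boundary.
cumulative = [0] + list(accumulate(frequencies))
# Cumulative relative frequency = cumulative frequency / total frequency.
cum_rel = [c / total for c in cumulative]

ogive_points = list(zip(boundaries, cum_rel))
for boundary, value in ogive_points:
    print(f"Less than {boundary:4.1f}: {value:.2f}")
```

The last point of a relative frequency ogive is always at 1.00.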
Ogives
Use the upper class boundaries and the
cumulative relative frequencies.
Shapes of Distributions
Other Types of Graphs
Stem and Leaf Plots
A stem and leaf plot is a data plot that uses part of a data
value as the stem and part of the data value as the leaf to
form groups or classes.
Example
25 31 20 32 13
14 43 2 57 23
36 32 33 32 44
32 52 44 51 45
Solution
Step 1 Arrange the data in order:
02, 13, 14, 20, 23, 25, 31, 32, 32, 32,
32, 33, 36, 43, 44, 44, 45, 51, 52, 57
Step 2 Separate the data according to the first digit, as shown.
02
13, 14
20, 23, 25
31, 32, 32, 32, 32, 33, 36
43, 44, 44, 45
51, 52, 57
Stem | Leaf
0 | 2
1 | 3 4
2 | 0 3 5
3 | 1 2 2 2 2 3 6
4 | 3 4 4 5
5 | 1 2 7
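Both steps can be sketched in code using the 20 data values of this example. The tens digit is taken as the stem and the units digit as the leaf, which `divmod(value, 10)` returns in one call; the variable names are illustrative.

```python
from collections import defaultdict

data = [25, 31, 20, 32, 13,
        14, 43, 2, 57, 23,
        36, 32, 33, 32, 44,
        32, 52, 44, 51, 45]

plot = defaultdict(list)
for value in sorted(data):             # Step 1: arrange the data in order
    stem, leaf = divmod(value, 10)     # Step 2: tens digit = stem, units digit = leaf
    plot[stem].append(leaf)

for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```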
Example
An insurance company researcher conducted a survey on the number of car
thefts in a large city for a period of 30 days last summer. The raw data are
shown. Construct a stem and leaf plot by using classes 50–54, 55–59, 60–64,
65–69,70–74, and 75–79.
52 62 51 50 69
58 77 66 53 57
75 56 55 67 73
79 59 68 65 72
57 51 63 69 75
65 53 78 66 55
Solution
Step 1 Arrange the data in order:
Step 2 Separate the data according to the classes.
Stem Leaf
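One way to carry out both steps for the car theft data is sketched below. Because the classes are 50-54, 55-59, and so on, each stem is split into two rows: one for leaves 0 to 4 and one for leaves 5 to 9. The key layout `(stem, half)` is a choice made for this sketch, not something prescribed by the example.

```python
data = [52, 62, 51, 50, 69,
        58, 77, 66, 53, 57,
        75, 56, 55, 67, 73,
        79, 59, 68, 65, 72,
        57, 51, 63, 69, 75,
        65, 53, 78, 66, 55]

rows = {}
for value in sorted(data):                    # Step 1: arrange the data in order
    stem, leaf = divmod(value, 10)
    half = 0 if leaf <= 4 else 1              # 0: leaves 0-4, 1: leaves 5-9
    rows.setdefault((stem, half), []).append(leaf)

# Step 2: print one row per class, repeating the stem for its second half.
for (stem, _half), leaves in sorted(rows.items()):
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```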
The Pie Graph:
Pie graphs are used extensively in statistics. The
purpose of the pie graph is to show the
relationship of the parts to the whole.
The pie graph is used to represent a nominal
(categorical) variable.
A pie graph is a circle that is divided into
sections according to the percentage of
frequencies in each category of the distribution.
Example: Construct a pie graph showing the blood
types of the army inductees described in Example 2–1.
The frequency distribution is repeated here.
Step 3 Using a protractor, graph each section and write its name and
corresponding percentage, as shown in the following figure.
Example
The average amounts spent by college freshmen for
school items are shown. Construct a pie graph.
Electronics/computers   $728
Dorm items              $344
Clothing                $141
Shoes                   $72
Solution:
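The numbers needed for this pie graph can be computed with the sketch below. It assumes the usual construction: each category's share of the total is multiplied by 100 to get its percentage and by 360 to get the angle of its section in degrees. The variable names are illustrative.

```python
spending = {
    "Electronics/computers": 728,
    "Dorm items": 344,
    "Clothing": 141,
    "Shoes": 72,
}
total = sum(spending.values())

# Share of the whole -> percentage of the circle and angle of the section.
percents = {item: amount / total * 100 for item, amount in spending.items()}
degrees = {item: amount / total * 360 for item, amount in spending.items()}

for item in spending:
    print(f"{item:22s} {percents[item]:5.1f}%  {degrees[item]:6.1f} degrees")
```

The angles should add up to 360 degrees and the percentages to 100, apart from rounding.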
Bluman, Chapter 2
Pareto Charts
A Pareto chart is used to represent a frequency
distribution for a categorical variable, and the
frequencies are displayed by the heights of vertical bars,
which are arranged in order from highest to lowest.
Example:
Solution
Step 1 Arrange the data from the largest to smallest
according to frequency.
Solution
There was a slight decrease in the years ’04, ’05, and
’06, compared to ’03, and again an increase in ’07. The
largest decrease occurred in ’08.