
Chapter 2 Modelling Distribution of Data

Uploaded by

王一荣
Copyright
© © All Rights Reserved

Modeling Distributions of Continuous Data
ILO (Intended Learning Outcomes)
This section covers…

• Percentile and Cumulative frequency graph

• Standardized score

• Coded data

• Distribution of continuous variables (N and Z distributions)


1. Percentile and cumulative frequency data

• The nth percentile of a distribution is the value with n percent of the
observations less than it.
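The definition above can be turned into a small calculation. The following is a minimal sketch; the function name `percentile_of` and the list of scores are invented for illustration and do not come from the slides.

```python
def percentile_of(value, data):
    """Percent of the observations in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

# Hypothetical test scores, invented for illustration only
scores = [62, 67, 70, 72, 75, 78, 80, 84, 88, 93]

# 7 of the 10 scores are below 84, so 84 sits at the 70th percentile
print(percentile_of(84, scores))  # 70.0
```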
1. Percentile and cumulative frequency data

The box-and-whisker graph is similar
4. Display of quantitative data with Numbers

• Cumulative frequency is the total frequency of all values less than a given value.

Which of the following cumulative frequency graphs (A–D) could represent the same set of data
as each of the histograms (1–4)?
1. Percentile and cumulative frequency data

• Cumulative relative frequency graph for the ages of U.S. presidents at
inauguration
1. Percentile and cumulative frequency data

• The 65th percentile of the distribution is the age with cumulative relative frequency 65%.

• The value on the horizontal axis is about 58. So about 65% of all U.S. presidents were younger
than 58 when they took office.
1. Percentile and cumulative frequency data

The graph displays the cumulative relative frequency of the lengths of phone calls made
from the mathematics department office at Gabalot High last month.
About what percent of calls lasted less than 30 minutes? 30 minutes or more?
Estimate Q1, Q3, and the IQR of the distribution.
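Reading Q1 and Q3 off a cumulative relative frequency graph amounts to finding the values at 25% and 75% on the vertical axis. A minimal sketch of that reading, using straight-line interpolation between plotted points, is given here. The points are hypothetical and are NOT the Gabalot High data, which appears only in the original figure; the function name `value_at_percentile` is also an assumption.

```python
def value_at_percentile(p, points):
    """Interpolate the value whose cumulative relative frequency is p percent.

    points: (value, cumulative percent) pairs, sorted by value.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= p <= c1:
            if c1 == c0:
                return x0
            return x0 + (p - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("p is outside the range of the graph")

# Hypothetical graph points: (call length in minutes, cumulative %)
points = [(0, 0), (10, 20), (20, 50), (30, 80), (40, 95), (50, 100)]

q1 = value_at_percentile(25, points)   # value at the 25th percentile
q3 = value_at_percentile(75, points)   # value at the 75th percentile
iqr = q3 - q1
print(round(q1, 1), round(q3, 1), round(iqr, 1))
```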
4 Representation of continuous data: cumulative frequency graphs

A fashion company selected 100 12-year-old boys and 100 12-year-old girls to
audition as models. The heights, h cm, of the selected children are represented
in the following graph.

I. Estimate the number of girls who are taller than the shortest 50 boys.
II. What is the significance of the value of h where the graphs intersect?
The shortest 75 boys and tallest 75 girls were recalled for a second audition.
III. On a cumulative frequency graph, show the heights of the children who were not recalled.
2. Standardized score

• A standardized score: $z = \dfrac{x - \text{mean}}{\text{standard deviation}}$

• The process of converting observations from their original values to standard
deviation units is known as standardizing; the resulting value is called the
z-score.
2. Standardized score

• A z-score tells us how many standard deviations from the mean an
observation falls, and in what direction.
2. Standardized score-Practice

Jenny earned an 82 on Mr. Goldstone’s chemistry test. The distribution of
scores was fairly symmetric with a mean of 76 and a standard deviation of 4.
On which test did Jenny perform better relative to the class? Justify your
answer.
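As a quick check of the standardizing step, here is a minimal sketch that computes the z-score for the chemistry test from the numbers given above. The helper name `z_score` is an assumption, and only the chemistry test is computed because it is the only test whose figures appear on this slide.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Chemistry test: score 82, class mean 76, standard deviation 4
jenny_chemistry = z_score(82, 76, 4)
print(jenny_chemistry)  # 1.5, i.e. 1.5 standard deviations above the mean
```

The same function can be applied to the other test's score, mean, and standard deviation; the larger z-score marks the better performance relative to the class.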
2. Standardized score-Practice

Brent is a member of the school’s basketball team. The mean height of the
players on the team is 76 inches. Brent’s height of 70 inches translates to a
z-score of −0.85 in the team’s height distribution.
What is the standard deviation of the team members’ heights?
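One way to approach this is to rearrange z = (x − mean) / SD into SD = (x − mean) / z. The sketch below does that arithmetic with the values from the problem; the variable names are illustrative.

```python
# Values stated in the problem
height = 70        # Brent's height in inches
team_mean = 76     # mean height of the team in inches
z = -0.85          # Brent's z-score

# Rearranging z = (x - mean) / sd gives sd = (x - mean) / z
sd = (height - team_mean) / z
print(round(sd, 2))  # 7.06 inches
```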
3. Transform data

• Sometimes we transform (code) our raw data so that it is easier to work with.

• For example, to find the mean of 101, 103, 104, 109 and 113, we can use the
values 1, 3, 4, 9 and 13, then add 100 to the result.
What is the effect of such a transformation on the shape, center, and spread of
a distribution of data?
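The coded-data shortcut for the mean can be checked directly. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean

raw = [101, 103, 104, 109, 113]

# Code the data by subtracting 100 from every value
coded = [x - 100 for x in raw]          # [1, 3, 4, 9, 13]

# Mean of the coded values, then undo the coding by adding 100 back
mean_from_coded = mean(coded) + 100

print(mean_from_coded, mean(raw))       # both give 106
```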
3. Transform data

Effect of adding the same positive number a to (subtracting a from) each
observation

• adds a to (subtracts a from) measures of center and location
(mean, median, quartiles, percentiles), but

• does not change the shape (symmetry) of the distribution or measures
of spread (range, IQR, standard deviation).
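A small numerical check of these two bullets, assuming a made-up list of heights and a shift of a = 6 (the data and the helper name `summary` are hypothetical):

```python
from statistics import mean, median, pstdev

def summary(data):
    """Center and spread measures for a list of numbers."""
    return {
        "mean": mean(data),
        "median": median(data),
        "range": max(data) - min(data),
        "sd": pstdev(data),
    }

# Hypothetical heights in inches, invented for illustration
heights = [60, 62, 63, 65, 66, 68, 71]
shifted = [h + 6 for h in heights]      # add a = 6 to every observation

before = summary(heights)
after = summary(shifted)

# Mean and median move up by 6; range and standard deviation stay the same
print(before)
print(after)
```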
3. Transform data-Practice

If Mrs. Navard had the entire class stand on a 6-inch-high platform and
then had the students measure the distance from the top of their heads to
the ground, how would the shape, center, and spread of this distribution
compare with the original height distribution?
3. Transform data

Multiplying (or dividing) each observation by the same positive number b

• multiplies (divides) measures of center and location (mean, median,
quartiles, percentiles) by b,

• multiplies (divides) measures of spread (range, IQR, standard deviation)
by b, but

• does not change the shape of the distribution.
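The scaling rule can be checked the same way. The sketch below converts a made-up list of heights from inches to centimetres (b = 2.54); the data are hypothetical.

```python
from statistics import mean, median, pstdev

# Hypothetical heights in inches, invented for illustration
inches = [60, 62, 63, 65, 66, 68, 71]
b = 2.54                                 # centimetres per inch
cm = [x * b for x in inches]

# Ratios of each measure after / before the conversion
ratio_mean = mean(cm) / mean(inches)
ratio_median = median(cm) / median(inches)
ratio_range = (max(cm) - min(cm)) / (max(inches) - min(inches))
ratio_sd = pstdev(cm) / pstdev(inches)

# Every ratio is (up to rounding) equal to b
print(ratio_mean, ratio_median, ratio_range, ratio_sd)
```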


3. Transform data

What does all this transformation business have to do with z-scores?

• If we standardize every observation in a distribution, the resulting set of
z-scores has mean 0 and standard deviation 1.
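Standardizing subtracts the mean (which moves the center to 0) and then divides by the standard deviation (which rescales the spread to 1). A minimal sketch with hypothetical data, using the population standard deviation `pstdev` throughout as an assumption:

```python
from statistics import mean, pstdev

# Hypothetical observations, invented for illustration
data = [60, 62, 63, 65, 66, 68, 71]

m = mean(data)
s = pstdev(data)

# Standardize every observation
z_scores = [(x - m) / s for x in data]

z_mean = mean(z_scores)     # 0 up to floating-point rounding
z_sd = pstdev(z_scores)     # 1 up to floating-point rounding
print(z_mean, z_sd)
```

The same result holds with the sample standard deviation, provided it is used both to standardize and to measure the spread of the z-scores.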


4. Density Curve and Normal Distribution

• From histogram to density curve


4. Density Curve and Normal Distribution

A density curve is a curve that

• is always on or above the horizontal axis, and

• has area exactly 1 underneath it.

• The area under the curve and above any interval of values on the
horizontal axis is the proportion of all observations that fall in that
interval.
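To make the area idea concrete, the sketch below uses a simple made-up density curve, f(x) = x/2 on the interval from 0 to 2, and approximates areas with the trapezoidal rule. The curve and the function names are assumptions chosen for illustration, not taken from the slides.

```python
def density(x):
    """A hypothetical triangular density curve: f(x) = x/2 for 0 <= x <= 2."""
    return x / 2 if 0 <= x <= 2 else 0.0

def area_under(f, a, b, n=1000):
    """Trapezoidal-rule approximation of the area under f between a and b."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

total_area = area_under(density, 0, 2)   # whole curve: area 1
lower_half = area_under(density, 0, 1)   # proportion of observations between 0 and 1

print(round(total_area, 6), round(lower_half, 6))  # 1.0 0.25
```

For this curve, 25% of all observations fall between 0 and 1, even though that interval covers half of the horizontal axis.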
4. Density Curve and Normal Distribution

• The median of a density curve is the equal-areas point, the point that
divides the area under the curve in half.

• The mean of a density curve is the balance point, at which the curve
would balance if made of solid material.

• The median and mean are the same for a symmetric density curve. They
both lie at the center of the curve. The mean of a skewed curve is pulled
away from the median in the direction of the long tail.
4. Density Curve and Normal Distribution
The mean and median are marked.
Which of the following lines is the mean? Is the mean above or below the
median?
4. Density Curve and Normal Distribution

Normal distributions.

• All Normal curves have the same overall shape: symmetric, single-peaked,
and bell-shaped.

• Any specific Normal curve is completely described by giving its mean and
its standard deviation.
4. Density Curve and Normal Distribution

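• If the random variable X is normally distributed with mean $\mu$ and
variance $\sigma^2$, then its equation is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

The parameters that define a normally distributed random variable are $\mu$
and $\sigma^2$.

We say $X \sim N(\mu, \sigma^2)$ describes a normally distributed random variable.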


4. Density Curve and Normal Distribution
• The mean determines the location (center) of the curve
• The standard deviation determines the spread: a larger standard deviation gives a wider, flatter curve
4. Density Curve and Normal Distribution

• The 68–95–99.7 Rule
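The rule states that, in any Normal distribution, about 68%, 95%, and 99.7% of observations fall within 1, 2, and 3 standard deviations of the mean. A minimal sketch that checks these figures with `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

Z = NormalDist()   # standard Normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3
within = [Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)]

for k, area in zip((1, 2, 3), within):
    print(f"within {k} SD: {area:.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```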


4. The Standard Normal Distribution

• Standardize transformation: $z = \dfrac{x - \mu}{\sigma}$

• This new distribution formed is called the standard Normal distribution

• The standard Normal distribution is the Normal distribution with mean 0
and standard deviation 1

• $N(0, 1)$, or the Z distribution
4. The Standard Normal Distribution
The standard Normal table
• Table A is a table of areas under the standard Normal curve. The table
entry for each value z is the area under the curve to the left of z.
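A table entry can be reproduced in software. The sketch below uses `statistics.NormalDist().cdf`, which returns the area to the left of z under the standard Normal curve; the wrapper name `table_a` is an assumption, and rounding to four decimal places mimics a typical printed table.

```python
from statistics import NormalDist

def table_a(z):
    """Area under the standard Normal curve to the left of z, to 4 decimal places."""
    return round(NormalDist().cdf(z), 4)

print(table_a(0))      # 0.5, half the area lies to the left of the mean
print(table_a(1.5))    # 0.9332
print(table_a(-0.85))  # 0.1977
```

Areas to the right of z follow from 1 minus the table entry, and areas between two z-values from the difference of two entries.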
4. Normal Distribution Calculations

• Step 1: Construct a Normal distribution (expression)

Justify that the distribution is Normal; find its mean and standard deviation

• Step 2: Standardization (draw)

Find z-scores; define the area under the Z curve

• Step 3: Calculation (cal)

Simple math; check your answer


4. The Standard Normal Distribution

• The mass of a newborn baby in a certain region is normally distributed
with mean 3.35 kg and variance 0.0858 kg². Estimate how many of the 1356
babies born last year had masses of less than 3.5 kg.
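One possible way to work through this problem is sketched below with `statistics.NormalDist`. Note that the problem gives the variance, so the standard deviation is its square root. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean_mass = 3.35       # kg
variance = 0.0858      # kg squared, as given in the problem
n_babies = 1356

# The standard deviation is the square root of the variance
masses = NormalDist(mu=mean_mass, sigma=sqrt(variance))

z = (3.5 - mean_mass) / masses.stdev      # standardized value of 3.5 kg
proportion = masses.cdf(3.5)              # area to the left of 3.5 kg
expected_count = n_babies * proportion    # expected number of babies under 3.5 kg

print(round(z, 2), round(proportion, 4), round(expected_count))
```

Under these figures the estimate comes out at roughly 943 babies.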

[Diagram labels: Information, Distribution, Area, Value]
4. Normal Distribution Calculations

On the driving range, Tiger Woods practices his swing with a particular
club by hitting many, many balls. Suppose that when Tiger hits his
driver, the distance the ball travels follows a Normal distribution with
mean 304 yards and standard deviation 8 yards.
What percent of Tiger’s drives travel at least 290 yards? (Page 118)
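A minimal sketch of the calculation, following the standardize-then-find-area steps with `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

drives = NormalDist(mu=304, sigma=8)   # driving distance in yards

# Standardize the boundary value: z = (290 - 304) / 8 = -1.75
z = (290 - drives.mean) / drives.stdev

# "At least 290 yards" is the area to the RIGHT of 290, i.e. 1 minus the area to the left
at_least_290 = 1 - drives.cdf(290)

print(z)                       # -1.75
print(f"{at_least_290:.4f}")   # 0.9599, about 96% of drives
```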
4. Normal Distribution Calculations

High levels of cholesterol in the blood increase the risk of heart disease. For
14-year-old boys, the distribution of blood cholesterol is approximately Normal
with mean μ = 170 milligrams of cholesterol per deciliter of blood (mg/dl) and
standard deviation σ = 30 mg/dl.
4. Normal Distribution
Are the data close to Normal?

Method 1: Graph the data, then check the 68–95–99.7 rule


4. Normal Distribution*
Method 2: A Normal probability plot
• If the points on a Normal probability plot lie close to a straight line,
the data are approximately Normally distributed.
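The idea behind a Normal probability plot can be sketched without drawing anything: pair each sorted observation with the z-score expected at its percentile, then see how close the pairs are to a straight line. In the sketch below the data are hypothetical, the plotting position (i − 0.5)/n is one common convention chosen as an assumption, and the correlation coefficient stands in for judging straightness by eye.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_scores(n):
    """Expected z-scores for n sorted observations, using positions (i - 0.5) / n."""
    return [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for illustration
data = sorted([48, 52, 55, 57, 59, 60, 61, 63, 65, 68, 72])
scores = normal_scores(len(data))

# Points (data value, expected z-score) are what the plot would show
r = correlation(data, scores)
print(round(r, 3))   # a value close to 1 suggests a roughly straight-line pattern
```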