AP Stats Module 1 Notes

The document outlines methods for analyzing and visualizing categorical and quantitative data, emphasizing the importance of selecting appropriate graphical displays and statistical measures based on data type. It discusses various graphical representations such as pie charts, bar charts, histograms, and box plots, along with guidelines for describing distributions, including center, spread, and identifying outliers. Additionally, it highlights common mistakes in data analysis and the significance of using comparative language when discussing distributions.

Uploaded by

bjs63624

Categorical (Qualitative) Data*
*Can be arranged into categories (e.g. hair color or genre).

Single Variable (Univariate)
• Pie Chart: best when data adds up to 100%.
• Bar Chart: always a great choice.

Two or More Variables (Bivariate or Multivariate)
• Segmented Bar Chart
• Two-Way Table
• Mosaic Plot

Quantitative Data*
*Always have units. For example, height (in.), age (years), or SAT (points).

• Dot Plot: quick display! Don’t forget to label and title the graph!
• Stemplot (stem-and-leaf): don’t forget to include the key! Variations: back-to-back; split stems (0-4, 5-9).
• Histogram: make sure that everything in a bin is less than the bin limit.
• Frequency Display (ogive): always a cumulative display.
• Box plot (box and whisker): best for summary stats! Uses the 5 # Summary (Min, Q1, Med, Q3, Max).
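The stemplot reminders above (include a key, optionally split stems into 0-4 and 5-9) can be sketched in a few lines of Python. This is only an illustration: the function name `stemplot`, the sample data, and the restriction to non-negative whole numbers are assumptions made for the example, not part of the notes.

```python
def stemplot(values, split=False):
    """Return the lines of a stem-and-leaf display for non-negative integers.

    The stem is the tens part and the leaf is the ones digit.
    With split=True each stem gets two rows: leaves 0-4, then leaves 5-9.
    """
    ordered = sorted(values)
    lines = []
    # Walk every stem from smallest to largest so empty stems still appear.
    for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1):
        leaves = [v % 10 for v in ordered if v // 10 == stem]
        if split:
            groups = [[l for l in leaves if l <= 4],
                      [l for l in leaves if l >= 5]]
        else:
            groups = [leaves]
        for group in groups:
            lines.append(f"{stem} | {' '.join(map(str, group))}".rstrip())
    # A generic sample key; the display is incomplete without one.
    lines.append("Key: 4 | 2 means 42")
    return lines


times = [42, 47, 51, 53, 58, 60, 75]  # made-up data
print("\n".join(stemplot(times)))
print()
print("\n".join(stemplot(times, split=True)))
```

Splitting stems is mainly useful when a plain stemplot crowds most of the leaves onto only a few rows.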

Describing Quantitative Distributions

Shape
• Always start with the shape of the distribution: Symmetric or Skewed (left or right).

Outlier
• Be sure to use the 1.5 × IQR rule in order to determine outliers.
• Lower fence: Q1 - 1.5 × IQR
• Upper fence: Q3 + 1.5 × IQR
• Always show your work and round to four decimal places!

Center
• Typical values.
• Unless the shape of the distribution is symmetric, use the median instead of the mean.
• The mean is sensitive to outliers, whereas the median is resistant.

Spread
• Variability in the data.
• Report the IQR with the median and the Standard Deviation with the mean.
• Range and Standard Deviation are sensitive to outliers, whereas the IQR is resistant.
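As a worked illustration of the 1.5 × IQR rule above, here is a minimal Python sketch. The helper names and the data are made up for the example. The quartiles use the "inclusive" convention from Python's `statistics` module, which is an assumption: a graphing calculator may use a different quartile convention and give slightly different fences.

```python
import statistics


def iqr_fences(data):
    """Return (q1, q3, lower_fence, upper_fence) using the 1.5 * IQR rule."""
    # Assumption: "inclusive" quartiles; other conventions can differ slightly.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1, q3, q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data):
    """Return the values that fall below the lower fence or above the upper fence."""
    _q1, _q3, lower_fence, upper_fence = iqr_fences(data)
    return [x for x in data if x < lower_fence or x > upper_fence]


scores = [2, 4, 5, 5, 6, 7, 8, 9, 30]  # made-up data
q1, q3, lower_fence, upper_fence = iqr_fences(scores)
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {q3 - q1}")
print(f"Lower fence = {lower_fence}, upper fence = {upper_fence}")
print(f"Outliers: {find_outliers(scores)}")
```

With this data and this quartile convention, the fences are 0.5 and 12.5, so 30 is the only value flagged as an outlier.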

Linear Transformations
• Adding/Subtracting a constant will only change measures of center and position (e.g. Mean, Median, Q1, Q3, Mode). Spread and shape do not change.
• Multiplying/Dividing by a constant will change measures of center and spread (e.g. Standard Deviation, IQR, Range).
• Unless multiplying by a negative, the shape of the distribution will not change.

Measures of Position

Z-Scores
• Z-Scores: How many standard deviations a data point is away from the mean.
• Having a negative z-score is not always a bad thing (e.g. golf and swimming). Always read and answer the question in context of the problem!

Percentiles
• The pth percentile of a distribution is the value with p% of observations less than or equal to it.

Empirical Rule (68-95-99.7)

Area Under Normal Curve
• NormalCDF(): from z-scores to percentage of observations.
• When stating NormalCDF, you must state what each value represents: NormalCDF(lower = , upper = , mean = , standard deviation = )
• InvNorm(): from percentile to z-score.
• When stating InvNorm, you must state what each value represents: InvNorm(percentile = , mean = , standard deviation = )

Common mistakes
• Always show your work!
• Round to four decimal places!
• When comparing distributions of quantitative variables, it is not enough to list each of the values in SOCS. Wording such as “greater than”, “less than” or “about the same as” must be used to show comparison.
• Always write your answers in the context of the problem. For example, The Avengers were able to save approximately 50% of the population.

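The linear transformation rules and the z-score definition above can be checked numerically. The sketch below uses made-up heights and hypothetical helper names (`summarize`, `z_score`), and it uses the sample standard deviation from Python's `statistics` module.

```python
import statistics

heights_in = [60, 62, 65, 67, 71]  # made-up heights in inches


def summarize(data):
    """Return (mean, sample standard deviation) of the data."""
    return statistics.mean(data), statistics.stdev(data)


def z_score(value, mean, sd):
    """How many standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


shifted = [x + 2 for x in heights_in]    # add a constant, e.g. 2-inch shoes
scaled = [x * 2.54 for x in heights_in]  # multiply by a constant, inches to cm

for label, data in [("original", heights_in), ("plus 2", shifted), ("times 2.54", scaled)]:
    mean, sd = summarize(data)
    print(f"{label:>10}: mean = {mean:.4f}, sd = {sd:.4f}")

mean, sd = summarize(heights_in)
print(f"z-score of 71 in. = {z_score(71, mean, sd):.4f}")
```

Adding 2 moves the mean from 65 to 67 and leaves the standard deviation alone, while multiplying by 2.54 scales both the mean and the standard deviation by 2.54.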
Sections 1.01-1.04
• The selection of an appropriate data analysis method depends on whether the data are categorical or quantitative. This is one of the
first things you should think about when you encounter a new data set.

• Don’t confuse bar charts with histograms. These two types of graphical displays look similar, but bar charts are used with categorical
data and histograms are used with numerical data.

• When constructing graphical displays, make sure to label the axes and mark them with appropriate scales (including units).

• The legend/key is an important part of a stem-and-leaf display. Make sure to always include one whenever you make a stem-and-leaf
display.

• If the class intervals are not all the same width, be sure to use the density scale on the vertical axis when making a histogram.

• Center and variability are two important aspects of a distribution. When describing the distribution of a numerical data set, be sure
to address both center and variability (spread).

• The measures used to describe center and variability for distributions that are approximately symmetric (mean and standard deviation)
are different from those used to describe center and variability for distributions that are skewed or that have outliers (median
and IQR). It’s the difference between reporting resistant statistics vs. sensitive ones.

• For distributions that are skewed or that have outliers, you should consider using the median to describe the center.

• The lower quartile, the median, and the upper quartile divide the data set into four parts, with 25% of the data in each part.

• Outliers convey important information about a data set. For this reason, boxplots that show outliers (modified boxplots) are usually
preferred over boxplots that do not show outliers.

• When asked to compare distributions, don’t forget to use comparative language (more than, less than, same as, etc.)
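For the point above about unequal class widths, a common way to compute the density scale is density = relative frequency ÷ class width, which makes each bar's area equal to its share of the data. The sketch below assumes that definition. The function name, class boundaries, and counts are made up for illustration.

```python
def density_heights(counts, edges):
    """Return the bar height (density) for each class interval.

    counts[i] is the number of observations in the interval from
    edges[i] up to (but not including) edges[i + 1].
    """
    total = sum(counts)
    densities = []
    for i, count in enumerate(counts):
        width = edges[i + 1] - edges[i]
        relative_frequency = count / total
        densities.append(relative_frequency / width)
    return densities


edges = [0, 10, 20, 50]  # class widths of 10, 10, and 30
counts = [5, 10, 5]      # made-up frequencies

for i, height in enumerate(density_heights(counts, edges)):
    print(f"{edges[i]} to <{edges[i + 1]}: density = {height:.4f}")
```

The first and last classes hold the same number of observations, but the wide class gets a bar one third as tall. Plotting raw counts would overstate how concentrated the data are in the wide class.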

Sections 1.05-1.06
• Not all distributions are mound shaped. Using the Empirical Rule in situations where you are not convinced that the data distribution
is mound shaped and approximately symmetric can lead to incorrect statements.

• z-scores indicate both direction and distance from the mean, measured in standard deviations.

• Adding/Subtracting a constant ONLY affects measures of center, NOT spread.

• Only use the empirical rule (68-95-99.7 Rule) when specifically told to do so.

• When using your calculator to determine the area under the curve of a Normal/Approximately Normal distribution, you MUST
indicate what each value that was entered into the calculator represents. For example, NormalCDF(lower= 89, upper= 105,
mean=99, standard deviation=5)

• NEVER refer to the shape of a distribution as Normal. Random variables rarely have a shape that is perfectly normal. At the very best,
they are approximately normal.

• High percentiles are NOT always a good thing! Examples include: golf scores, swimming races, and student debt.

• Only use NormalCDF and InvNorm when the shape of the distribution is Normal/Approximately Normal.
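The calculator commands above can be mirrored with `NormalDist` from Python's standard library. This is only a sketch for checking work, not a replacement for the labeling the notes require. The wrapper names `normal_cdf` and `inv_norm` are hypothetical, and the mean of 99 and standard deviation of 5 reused in the `inv_norm` call are assumptions carried over from the NormalCDF example in the notes.

```python
from statistics import NormalDist


def normal_cdf(lower, upper, mean, sd):
    """Area under the Normal curve between lower and upper."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)


def inv_norm(percentile, mean, sd):
    """Value with the given proportion (0 to 1) of the area to its left."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(percentile)


# The example from the notes, with every argument labeled.
area = normal_cdf(lower=89, upper=105, mean=99, sd=5)
print(f"NormalCDF(lower=89, upper=105, mean=99, sd=5) = {area:.4f}")

# 90th percentile of the same distribution (percentile written as a proportion).
cutoff = inv_norm(percentile=0.90, mean=99, sd=5)
print(f"InvNorm(percentile=0.90, mean=99, sd=5) = {cutoff:.4f}")
```

The first call gives an area of about 0.8622. Passing the arguments by keyword serves the same purpose as the labeling rule: a reader can see what each number represents.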
