Ch1 Basic Terms
Winter 2025
STA 215 Basic Terms Winter 2025
1.1 Data and Its Role in Statistics
Statistics
• The science of collecting, organizing, summarizing, and interpreting data for making decisions.
Statistics is about using the information in data to help guide decisions.
Definition: Raw Data
Raw data can come in various forms, depending on the source and purpose.
Raw Data
• Example (Figure 1.1): Play-by-play results of a baseball game before any analysis is done.
• Depicts play-by-play results of the May 6, 2010, baseball game between the Los Angeles
Angels and Boston Red Sox.
• Requires familiarity with baseball and the Box Score Explanation on the website for
interpretation.
• Details such as pitch locations are omitted but major game characteristics are captured.
Raw Data
Conclusion:
• In both examples, raw data must be transformed into a usable format to enable
meaningful data analysis.
Definition: Data
Data
Definition: Individual
Individual
Definition: Variable
Definition: Observation
Example: The weight, age, and white blood cell count of a mouse in an experiment.
Summary
Individuals in the experiment: Mice are the individuals being studied in a medical
experiment.
Variables collected: Data includes weight, age, white blood cell count before
treatment, and white blood cell count after one week of treatment with a new
cancer medication.
Observation: An observation consists of all the variables (weight, age, and white
blood cell counts) recorded for a particular mouse.
Multiple observations: Data is typically collected on multiple mice, resulting in more
than one observation.
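The terms in this summary can be made concrete with a minimal Python sketch. Only the variable names come from the slide. The mouse IDs, the units (grams, weeks), and every measurement value are hypothetical and invented purely for illustration.

```python
# Hypothetical records: IDs, units, and all values are invented for illustration.
mice = [
    {"id": "M1", "weight_g": 21.4, "age_weeks": 10, "wbc_before": 7.2, "wbc_after": 6.1},
    {"id": "M2", "weight_g": 19.8, "age_weeks": 12, "wbc_before": 8.0, "wbc_after": 6.9},
    {"id": "M3", "weight_g": 22.1, "age_weeks": 11, "wbc_before": 7.5, "wbc_after": 7.0},
]

# Individuals: the mice themselves (one record per mouse).
individuals = [mouse["id"] for mouse in mice]

# Variables: the characteristics recorded for every mouse.
variables = [name for name in mice[0] if name != "id"]

# Observation: all the variable values recorded for one particular mouse.
first_observation = {name: mice[0][name] for name in variables}

print(individuals)
print(variables)
print(first_observation)
```

Each dictionary plays the role of one observation, and the list as a whole is the data set with multiple observations.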
Example 1.1: Structural Health of Bridges in Michigan
Scenario: A state report on the structural health of bridges.
Questions:
1. What is an individual in this data set?
Individual: A Michigan bridge.
2. Name each variable included in the data set.
Variables: Bridge, County, Route, NHS, Age, Inspection, Deficient,
Obsolete.
3. Describe the data connected to the first individual in the data set.
Data: B1 (Bridge), Alcona (County), US-23 (Route), Yes (NHS), 77
(Age), 9/22/2010 (Inspection), No (Deficient), No (Obsolete).
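As a rough sketch, the answers to Example 1.1 map directly onto a single Python record. The values are the ones listed in the answer above. Storing the row as a dictionary, keeping the date as a plain string, and the name `first_bridge` are choices made here for illustration, not part of the state report.

```python
# First individual in the bridge data set, using the values given in Example 1.1.
# The dictionary layout and the name `first_bridge` are illustrative assumptions.
first_bridge = {
    "Bridge": "B1",
    "County": "Alcona",
    "Route": "US-23",
    "NHS": "Yes",
    "Age": 77,
    "Inspection": "9/22/2010",
    "Deficient": "No",
    "Obsolete": "No",
}

# The keys are the variables; the values together form the observation
# for this one individual (a Michigan bridge).
variables = list(first_bridge.keys())

print(variables)
print(first_bridge["County"], first_bridge["Age"])
```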
Data and Its Role in Statistics
• Data is essential for informed decisions: Statistical tools help analyze data and
provide insights beyond anecdotes.
• Data beats anecdotes: Anecdotal evidence, like exceptional stories, is unreliable
for decision-making. Statistics focuses on both the typical and exceptional to
provide a balanced understanding.
• Statistical techniques are objective: They reduce subjectivity by relying on data
rather than personal opinions, though subjectivity still exists in selecting
individuals, variables, and methods.
• Statistics accounts for uncertainty: Statistical thinking quantifies uncertainties in
outcomes, helping make decisions even when results vary among individuals.
Core Concepts
Definition: Distribution of a Variable
Distribution of a Variable
Example 1.2: High School GPA Data for Incoming Freshmen
Scenario: High school GPAs of all incoming freshmen at GVSU for fall 2020 (made-up data).
Questions:
1. What are the individuals in the data set?
Individuals: The incoming freshmen.
2. What does the distribution of the variable "First Name" mean?
Distribution of the variable “First Name”: All the different first names in the data
set and how many students have each first name.
3. What does the distribution of the variable "HSGPA" mean?
Distribution of the variable “HSGPA”: All the different high school GPA values in
the data set and how many students have each GPA value.
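The idea of a distribution (each distinct value together with how often it occurs) can be sketched in Python with `collections.Counter`. The student names and GPA values below are hypothetical and are not the data from the example.

```python
from collections import Counter

# Hypothetical students: every name and GPA is invented for illustration.
students = [
    ("Emma", 3.5),
    ("Liam", 3.8),
    ("Emma", 3.2),
    ("Noah", 3.5),
    ("Ava", 3.8),
    ("Liam", 3.5),
]

# Distribution of "First Name": each distinct name and how many students have it.
first_name_distribution = Counter(name for name, _ in students)

# Distribution of "HSGPA": each distinct GPA value and how many students have it.
hsgpa_distribution = Counter(gpa for _, gpa in students)

for value, count in sorted(hsgpa_distribution.items()):
    print(value, count)
```

The same counting step works for both variables, which mirrors how the two answers above share the same wording.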
Population and Sample
Definition: Population
Definition: Sample
Census
Definition: Parameter
Summary
Types of Variables
• Categorical (qualitative)
• Quantitative (numeric)
Definition: Categorical Variable
Definition: Quantitative Variable
Measurement scale
Example 1.6: Library's Survey
Sometimes researchers convert categorical measurements into numerical measurements.
Scenario: A local library wants to gauge patrons' attitudes using a survey. The survey
employs a Likert scale with the responses: 1 = Strongly Disagree, 2 = Disagree, 3 = Neither
Agree nor Disagree, 4 = Agree, and 5 = Strongly Agree.
Questions:
1. Should we consider this to be quantitative data?
Answer:
Reading Assignment
Chapter 1: Basic Terms, pp. 9–23
Textbook: STA 215 @ GVSU: Introductory Applied Statistics, 2023, by John Gabrosek
and Diann Reischman.