Introduction To Statistics: Learning Objectives
Introduction to Statistics
By
Prof. Vishal Singh Patyal
Learning Objectives
• What is Statistics
• Why Statistics
• Basic vocabulary used in Statistics
• Sources of data and its types
• Types of Variables
• Level of Measurement
7/7/2021
What is Statistics?
Statistics: Data → Information
• Data: Facts, especially numerical facts, collected together for reference or information.
• Information: Knowledge communicated concerning some particular fact.
Statistics is a tool for creating new understanding from a set of numbers.
Statistics in Business
• Accounting — auditing and cost estimation
• Economics — local, regional, national, and
international economic performance
• Finance — investments and portfolio management
• Management — human resources, compensation, and
quality management
• Management Information Systems — performance of
systems which gather, summarize, and disseminate
information to various managerial levels
• Marketing — market analysis and consumer research
• International Business — market and demographic
analysis
Population
• A population consists of all the items
or objects about which you want to
draw a conclusion.
• The objects can be people, animals,
plants, etc.
• Population size is usually very large
(human beings) but can be very small
also (panda).
• A study that involves the entire population is
called a census.
• The population size is denoted by N.
Sample
Population
RD1 Red 12
RD2 Red 10
RD3 Red 13
RD4 Red 10
RD5 Red 13
BL1 Blue 27
BL2 Blue 24
GR1 Green 35
GR2 Green 35
GY1 Gray 15
GY2 Gray 18
GY3 Gray 17
Sample (4 items selected from the population above)
RD2 Red 10
RD5 Red 13
GR1 Green 35
GY2 Gray 18
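As a rough sketch, the twelve items listed under Population and the four selected items can be used to contrast a parameter with a statistic. The slides do not name the numeric column, so the code simply calls it `value`; that name, and the choice of the mean as the summary measure, are assumptions made for illustration.

```python
from statistics import mean

# Population rows as listed on the slide: (id, color, value).
# The column name "value" is an assumption; the slide does not label it.
population = [
    ("RD1", "Red", 12), ("RD2", "Red", 10), ("RD3", "Red", 13),
    ("RD4", "Red", 10), ("RD5", "Red", 13),
    ("BL1", "Blue", 27), ("BL2", "Blue", 24),
    ("GR1", "Green", 35), ("GR2", "Green", 35),
    ("GY1", "Gray", 15), ("GY2", "Gray", 18), ("GY3", "Gray", 17),
]

# The four items that appear in the sample on the slide.
sample_ids = {"RD2", "RD5", "GR1", "GY2"}
sample = [row for row in population if row[0] in sample_ids]

N = len(population)  # population size
n = len(sample)      # sample size

population_mean = mean(value for _, _, value in population)  # a parameter
sample_mean = mean(value for _, _, value in sample)          # a statistic

print(f"N = {N}, population mean = {population_mean:.2f}")
print(f"n = {n}, sample mean = {sample_mean:.2f}")
```

The sample mean is close to, but not equal to, the population mean, which is why a statistic is treated as an estimate of the parameter.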
1. Population (parameter μ)
2. Select a random sample
3. Sample (statistic x̄)
4. Use x̄ to estimate μ
Statistics in Business
Variable
• A variable is some characteristic of a population or sample.
E.g. student grades. Typically denoted with a capital letter: A,
B, C…
• The values of the variable are the range of possible values for
a variable.
E.g. student marks (0..100)
Data
• Data are the observed values of a variable.
• Data are the different values associated with a variable.
E.g. student marks: {67, 74, 71, 83, 93, 55, 48}
Example
IIMV Institute dean is interested in learning about the average age of
faculty. Identify the basic terms in this situation.
The population is the age of all (30) faculty members at the Institute.
A sample is any subset of that population. For example, we might
select 5 faculty members and determine their age.
The variable is the “age” of each faculty member.
The data would be the set of values in the sample.
The parameter of interest is the “average” age of all faculty at the
Institute.
The statistic is the “average” age for all faculty in the sample.
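The terms in this example can be illustrated with a short sketch. The slide gives no actual ages, so the 30 ages below are invented with a seeded random generator purely for illustration; the age range of 28 to 65 and the seed are assumptions.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical ages for the 30 faculty members (not real data).
faculty_ages = [rng.randint(28, 65) for _ in range(30)]

# Sample: a random subset of 5 faculty members, drawn without replacement.
sample_ages = rng.sample(faculty_ages, 5)

parameter = mean(faculty_ages)  # average age of all 30 faculty
statistic = mean(sample_ages)   # average age of the 5 sampled faculty

print(f"Data (sample values): {sample_ages}")
print(f"Parameter (population mean age): {parameter:.1f}")
print(f"Statistic (sample mean age): {statistic:.1f}")
```

Drawing a different sample changes the statistic, while the parameter stays fixed for a given population.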
Statistics
Descriptive Statistics
Collect data
ex. Survey
Present data
ex. Tables and graphs
Characterize data
ex. Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Collect
Organize
Summarize
Display
Analyze
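As a small illustration of characterizing data, the sample mean can be computed for the student marks listed earlier under Data. Treating those seven marks as a sample is an assumption made for this sketch.

```python
# Student marks from the Data slide, treated here as a sample of size n.
marks = [67, 74, 71, 83, 93, 55, 48]

n = len(marks)
sample_mean = sum(marks) / n  # sum of the observations divided by n

print(f"n = {n}, sample mean = {sample_mean:.2f}")  # n = 7, sample mean = 70.14
```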
Inferential Statistics
• Estimation
‒ ex. Estimate the population mean weight using the sample
mean weight
• Hypothesis testing
‒ ex. Test the claim that the population mean weight is 120
pounds
Uses of inferential statistics:
• Predict and forecast values of population parameters
• Test hypotheses about values of population parameters
• Make decisions
Sources of Data
Data come from two kinds of sources: primary and secondary.
• Primary Sources: The data collector is the one using the data for analysis
– Data from a political survey
– Data collected from an experiment
– Observed data
• Secondary Sources: The person performing data analysis is not the data collector
– Analyzing census data
– Examining data from print journals or data published on
the internet
Primary Data
Merits Demerits
Secondary Data
Merits
• It is readily available
• It is much less expensive as compared to primary data
• It is less time consuming as compared to primary data
Demerits
• The data may not have been collected through proper procedure
• They may have been influenced by biased investigation
• They may be out of date and not suitable for the present period
Types of Variables
Data
• Categorical (defined categories)
– Examples: Marital Status, Political Party, Eye Color
• Numerical
– Discrete (counted items). Examples: Number of Children, Defects per hour
– Continuous (measured characteristics). Examples: Weight, Voltage
Categorical
• Qualitative variables have values that can only be placed into
categories, such as “yes” and “no.”
• A variable that categorizes or describes an element of a
population.
Note: Arithmetic operations, such as addition and averaging, are not
meaningful for data resulting from a qualitative variable.
Numerical
• Quantitative variables have values that represent quantities.
• A variable that quantifies an element of a population.
Note: Arithmetic operations, such as addition and averaging, are
meaningful for data resulting from a quantitative variable.
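The two notes above can be sketched in code: a categorical variable is summarized by counting categories, while a numerical variable can be averaged. The eye colors and weights below are hypothetical values invented for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical observations (not from the slides).
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]  # categorical
weights_kg = [61.5, 72.0, 58.3, 80.1, 69.4, 75.2]                  # numerical

# Categorical data: counting category membership is meaningful.
color_counts = Counter(eye_colors)
most_common_color, count = color_counts.most_common(1)[0]

# Numerical data: arithmetic such as averaging is meaningful.
average_weight = mean(weights_kg)

print(f"Most common eye color: {most_common_color} ({count} of {len(eye_colors)})")
print(f"Average weight: {average_weight:.2f} kg")
```

There is no sensible "average eye color", so the sketch reports the most frequent category instead.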
Example
Identify each of the following examples as attribute (categorical) or
numerical variables.
Question?
Identify each of the following as examples of Categorical or
Numerical variables:
• The temperature in Barrow, Alaska at 12:00 pm on any
given day.
• The make of automobile driven by each faculty member.
• Whether or not a 6-volt lantern battery is defective.
• Models of cell phones.
• The length of time billed for a long-distance telephone call.
• The brand of cereal children eat for breakfast.
• The type of book taken out of the library by an adult.
Level of Measurement
NOIR, from lowest to highest level:
• Nominal
• Ordinal
• Interval
• Ratio
Nominal scale
• A nominal scale classifies data into distinct categories in
which no ranking is implied.
• There must be distinct classes but these classes have no
quantitative properties. Therefore, no comparison can be
made in terms of one category being higher than the
other.
– Example: there are two classes for the variable gender,
males and females. There are no quantitative properties for
this variable or these classes and, therefore, gender is a nominal
variable.
– Another example is religion: Hindu, Catholic, Protestant,
Muslim, etc.
• Sometimes numbers are used to designate category
membership.
Example
Ordinal scale
• An ordinal scale classifies data into distinct categories in
which ranking is implied.
• There are distinct classes but these classes have a natural
ordering or ranking. The differences can be ordered on
the basis of magnitude.
– Example: a gold medal reflects superior performance to a
silver or bronze medal in the Olympics. You can’t say a gold and
a bronze medal average out to a silver medal, though.
– Preference scales are typically ordinal: how much do you like
this cereal? Like it a lot, somewhat like it, neutral, somewhat
dislike it, dislike it a lot.
• An ordinal scale does not assume that the intervals between
numbers are equal.
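A minimal sketch of working with ordinal data, using the cereal preference scale above: the categories can be sorted and a middle response identified, but the rank positions are not averaged. The survey responses are hypothetical, and the integer positions are only an encoding of order, not measured quantities.

```python
from statistics import median_low

# Preference scale from the slide, listed from least to most favourable.
SCALE = [
    "dislike it a lot",
    "somewhat dislike it",
    "neutral",
    "somewhat like it",
    "like it a lot",
]
# Map each label to its position; only the order of these numbers matters.
rank = {label: position for position, label in enumerate(SCALE)}

# Hypothetical survey responses (not from the slides).
responses = [
    "neutral",
    "like it a lot",
    "somewhat like it",
    "somewhat dislike it",
    "somewhat like it",
]

# Ordering is meaningful for ordinal data.
ordered = sorted(responses, key=rank.__getitem__)

# The median is a category that actually occurs; no averaging of ranks.
middle = SCALE[median_low(rank[r] for r in responses)]

print("Ordered responses:", ordered)
print("Median response:", middle)
```

`median_low` is used so that the result is always one of the observed categories rather than a value halfway between two ranks.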
Example
Interval scale
• An interval scale is an ordered scale in which the
difference between measurements is a meaningful
quantity but the measurements do not have a true
zero point; that is, values can go below zero.
• It is possible to compare differences in magnitude, but
importantly the zero point does not have a natural
meaning.
• It captures the properties of nominal and ordinal
scales and is used by most psychological tests.
• Example
– Percentage change in employment
– Percentage return on a stock
– Dollar change in stock price
Example
• The difference between 1 and 2 years of age is the same
amount as the difference between 21 and 22 years of
age, or 50 and 51, or 65 and 66.
• We can see that the same difference exists between 10 °C
and 20 °C as between 25 °C and 35 °C. But we cannot say that
20 °C is twice as hot as 10 °C.
• Celsius temperature is an interval variable. It is
meaningful to say that 25 degrees Celsius is 3 degrees
hotter than 22 degrees Celsius, and that 17 degrees
Celsius is the same amount hotter (3 degrees) than 14
degrees Celsius.
• Notice, however, that 0 degrees Celsius does not have a
natural meaning. That is, 0 degrees Celsius does not
mean the absence of heat!
Ratio Scale
• Highest level of measurement
– Relative magnitude of numbers is meaningful
– Differences between numbers are comparable
– Location of origin, zero, is absolute (natural)
– Vertical intercept of the unit-of-measure transform
function is zero (converting units only rescales the values)
• Examples
– Measurements like Height, Weight, and Volume
– Monetary variables like Profit and Loss, Revenues,
Expenses
– Financial ratios like P/E Ratio, Inventory Turnover,
and Quick Ratio
Interval Vs Ratio
• In an interval scale, you can take difference of two values.
You may not be able to take ratios of two values.
• Example: temperature in Celsius.
– You can say that if temperature in Delhi is 40 deg Celsius and
that in Shimla is 20 deg Celsius, then Delhi is 20 deg Celsius
hotter than Shimla (taking difference).
– But you cannot say Delhi is twice as hot as Shimla (not
allowed to take ratio).
• In a ratio scale, you can take a ratio of two values.
• Example
– 40 kg is twice as heavy as 20 kg (taking ratios).
– Also, “0” on ratio scale means the absence of that physical
quantity. “0” on interval scale doesn't mean the same.
– 0 kg means the absence of weight.
– 0 deg Celsius doesn't mean absence of heat.
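The Delhi and Shimla comparison can be checked numerically. The sketch below converts the same two temperatures to kelvin and the same two weights to pounds; the helper names and the rounded conversion factor 2.20462 lb per kg are choices made for this illustration.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Shift the zero point: an interval-scale conversion with a nonzero intercept."""
    return celsius + 273.15


def kg_to_lb(kilograms: float) -> float:
    """Pure rescaling: a ratio-scale conversion with a zero intercept."""
    return kilograms * 2.20462  # rounded conversion factor


delhi_c, shimla_c = 40.0, 20.0

# Differences survive the change of temperature scale.
diff_c = delhi_c - shimla_c
diff_k = celsius_to_kelvin(delhi_c) - celsius_to_kelvin(shimla_c)

# Ratios do not: "twice as hot" in Celsius is not twice in kelvin.
ratio_c = delhi_c / shimla_c
ratio_k = celsius_to_kelvin(delhi_c) / celsius_to_kelvin(shimla_c)

# On a ratio scale, the ratio is the same in any unit.
ratio_kg = 40.0 / 20.0
ratio_lb = kg_to_lb(40.0) / kg_to_lb(20.0)

print(f"Temperature difference: {diff_c:.1f} C, {diff_k:.1f} K")
print(f"Temperature ratio: {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (kelvin)")
print(f"Weight ratio: {ratio_kg:.3f} (kg) vs {ratio_lb:.3f} (lb)")
```

The temperature ratio changes with the unit because the Celsius zero is arbitrary, whereas the weight ratio stays at 2 because 0 kg and 0 lb both mean the absence of weight.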
Nominal
Ordinal
Interval
Ordinal: Attributes can be ordered
Ratio
Interval: Distance is meaningful
Example
• Many changes continue to occur in the
healthcare industry.
• Because of increased competition for patients
among providers and the need to determine how
providers can better serve their clientele,
hospital administrators sometimes administer a
quality satisfaction survey to their patients after
the patient is released.
• The following types of questions are sometimes
asked on such a survey.
• These questions will result in what level of data
measurement?
Sample Questions
• How long ago were you released from the hospital?
• Which type of unit were you in for most of your stay?
– Coronary care
– Intensive care
– Maternity care
– Medical unit
– Pediatric/children’s unit
– Surgical unit
• In choosing a hospital, how important was the
hospital’s location? (circle one)
Very Important / Somewhat Important / Not Very Important / Not at All Important
Level of Measurement: Characteristics
Level of Measurement: Statistical Tests
Exercise
• The Rathburn Manufacturing Company makes electric wiring,
which it sells to contractors in the construction industry.
Approximately 900 electric contractors purchase wire from
Rathburn annually.
• Rathburn’s director of marketing wants to determine electric
contractors’ satisfaction with Rathburn’s wire.
• He developed a questionnaire that yields a satisfaction score
between 10 and 50 for participant responses.
• A random sample of 35 of the 900 contractors is asked to
complete a satisfaction survey. The satisfaction scores for the
35 participants are averaged to produce a mean satisfaction
score.
Questions
• What is the population for this study?
• What is the sample for this study?
• What is the statistic for this study?
• What would be a parameter for this
study?
Example
Identify each of the following as examples of (1) nominal, (2)
ordinal, (3) discrete, or (4) continuous variables:
Class Exercise
Q 1: Determine whether the variable is categorical or
numerical. If numerical, determine whether the
variable is discrete or continuous. Determine the
level of measurement.
Class Exercise
Q 2: Determine whether the variable is categorical or
numerical. If numerical, determine whether the
variable is discrete or continuous. Determine the
level of measurement.
Exercise
Q 3: Suppose the following information is collected
from Mr. Ajay on his application for a home loan at
HDFC Bank. Classify each of the responses by type of
data and measurement scale.
a. Monthly payments: $1,927.22
b. Number of jobs in past 10 years
c. Annual family income: $76,000.30
d. Marital status: Married
Class Exercise
Q 4: A manufacturer of dog food was planning to
survey households in India to determine the purchasing
habits of dog owners. Among the variables to be
collected are:
• The primary place of purchase of dog food
• Whether dry or moist food can be purchased
• Number of dogs living in the household
• Whether the dog is pedigreed
Thank You