Week-1 Why Do We Need Statistics

The document discusses various topics related to data analysis in hospitality and tourism including the research process, types of data analysis methods, initial observations, generating and testing theories, collecting data to test theories, levels of measurement, measurement error, analyzing data through histograms, properties of frequency distributions, central tendency measures, dispersion measures, normal probability distributions, and z-scores.

Uploaded by

cgamorano
Copyright
© All Rights Reserved

Data Analysis in Hospitality and Tourism

Week 1
Tadayuki Hara, PhD
The Research Process

2
Types of Data Analysis

Quantitative Methods
Testing theories using numbers
Qualitative Methods
Testing theories using language
 Magazine articles/Interviews
 Conversations
 Newspapers
 Media broadcasts

3
Initial Observation
Find something that needs explaining
Observe the real world
Read other research
Test the concept: collect data
Collect data to see whether your guess is correct
To do this, you need to define variables
 Anything that can be measured and can differ across entities or time.

4
The Research Process

5
Generating and Testing Theories
Theories
A hypothesized general principle or set of principles that explains known findings about a topic and from which new hypotheses can be generated.
Hypothesis
A prediction from a theory.
A person with increased disposable income is more likely to travel.
Guests who are satisfied with hotel services are more likely to come back.
A hotel employee who is more satisfied with his or her job (salary, bosses, benefits) is less likely to quit than comparable employees.

6
The Research Process

7
Collect Data to Test Your Theory
Hypothesis:
 Propose a hypothesis
Independent Variable
 Possible associations
 A predictor variable
 A manipulated variable (in experiments)
Dependent Variable
 The proposed effect
 An outcome variable
 Measured, not manipulated (in experiments)

8
Levels of Measurement
 Categorical (entities are divided into distinct categories):
 Binary variable: There are only two categories
 e.g. dead or alive.
 Nominal variable: There are more than two categories (the order of the categories has no meaning)
 e.g. race, nationality
 Ordinal variable: The same as a nominal variable but the categories have a logical order (e.g. Likert scale)
 e.g. “How do you like the streaming video course so far?”
 (5. Like it very much, 4. Like it, 3. Neutral, 2. Do not like it, 1. Hate it; ordinal responses can be ranked)

 Continuous (entities get a distinct score):
 Interval variable: Equal intervals on the variable represent equal differences in the property being measured
 e.g. the difference between 6 and 8 is equivalent to the difference between 13 and 15 (shoe size: 8, 8.5, 9.0…)
 Ratio variable: The same as an interval variable, but the ratios of scores on the scale must also make sense
 e.g. a score of 16 on an anxiety scale means that the person is, in reality, twice as anxious as someone scoring 8 (length of time worked, temperature in kelvin, etc.; temperature in °C or °F is only interval, because its zero point is arbitrary)

9
Measurement Error
Measurement error
 The discrepancy between the actual value we’re trying to measure and the number we use to represent that value.
Example:
 You (in reality) weigh 80 kg.
 You stand on your bathroom scales and they say 83 kg.
 The measurement error is 3 kg.

10
The Research Process

11
Analysing Data: Histograms

Frequency Distributions (aka Histograms)
A graph plotting values of observations on the horizontal axis, with a bar showing how many times each value occurred in the data set.
The ‘Normal’ Distribution
Bell shaped
Symmetrical around the centre
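
A frequency distribution can be sketched in a few lines of Python. The example below is only an illustration: it borrows the 11 Facebook friend counts used later in these slides, and the bin width of 50 is an assumption chosen to keep the output short. Bins with no observations are simply not listed.

```python
from collections import Counter

# Friend counts for the 11 Facebook users that appear later in these slides.
friends = [57, 40, 103, 234, 93, 53, 116, 98, 108, 121, 22]

BIN_WIDTH = 50  # assumed bin width, chosen only for illustration


def frequency_table(scores, bin_width):
    """Count how many scores fall into each bin [start, start + bin_width)."""
    counts = Counter((score // bin_width) * bin_width for score in scores)
    return dict(sorted(counts.items()))


table = frequency_table(friends, BIN_WIDTH)

# Text histogram: one '#' per observation in the bin.
for start, count in table.items():
    print(f"{start:>3}-{start + BIN_WIDTH - 1:<3} {'#' * count}")
```

Most scores land in the 50-99 and 100-149 bins, while the single score of 234 sits alone in the 200-249 bin.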

12
The Normal Distribution
Properties of Frequency Distributions
Skew
A measure of how asymmetrical the distribution is.
Positive skew (scores bunched at low values with the tail pointing to high values).
Negative skew (scores bunched at high values with the tail pointing to low values).
Kurtosis
The ‘heaviness’ of the tails.
Leptokurtic = heavy tails.
Platykurtic = light tails.
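
Skew and kurtosis can both be expressed as numbers. The sketch below uses simple moment-based formulas, which is an assumption on our part: statistical packages often apply small-sample corrections, so their values may differ slightly. The helper names are hypothetical, and the friend counts are the ones used later in these slides.

```python
def central_moment(scores, k):
    """Average of (score - mean) ** k over all scores."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** k for x in scores) / len(scores)


def skewness(scores):
    """Moment-based skew: 0 for symmetric data, above 0 for a long right tail."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5


def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3


friends = [57, 40, 103, 234, 93, 53, 116, 98, 108, 121, 22]
symmetric = [1, 2, 3, 4, 5]  # made-up, perfectly symmetric scores

print(skewness(friends))           # positive: the 234 pulls the tail to the right
print(skewness(symmetric))         # zero: no skew
print(excess_kurtosis(symmetric))  # negative: lighter tails than a normal curve
```

A positive excess kurtosis corresponds to the leptokurtic (heavy-tailed) case and a negative value to the platykurtic (light-tailed) case.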

14
Skew

15
Kurtosis

16
Central tendency: The Mode

Mode
The most frequent score
Bimodal
Having two modes
Multimodal
Having several modes
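
As a small illustration, Python's standard library can list every mode of a data set. The two score lists below are made up for this example and do not come from the slides.

```python
from statistics import multimode

# Hypothetical scores, invented for illustration only.
one_mode = [3, 4, 4, 5, 4, 2]
two_modes = [1, 2, 2, 3, 3, 4]


def describe_modes(scores):
    """Label a data set by how many most-frequent scores it has."""
    modes = multimode(scores)
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label


print(describe_modes(one_mode))
print(describe_modes(two_modes))
```

One caveat: if every score occurs exactly once, `multimode` returns all of them, so the label is only meaningful when some scores repeat.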

17
Bimodal and Multimodal Distributions

18
Central Tendency: The Median
Here are the numbers of friends of 11 Facebook users.
The Median is the middle score when scores are ordered:

Data: 57 40 103 234 93 53 116 98 108 121 22

Ordered Data: 22 40 53 57 93 98 103 108 116 121 234

Median: 98 (the 6th of the 11 ordered scores)
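
The same steps (sort the scores, then take the middle one) can be sketched in Python using the 11 scores above. The manual indexing only works for an odd number of scores; with an even number, the median is the average of the two middle scores, which `statistics.median` handles automatically.

```python
from statistics import median

friends = [57, 40, 103, 234, 93, 53, 116, 98, 108, 121, 22]

# Step 1: order the scores from lowest to highest.
ordered = sorted(friends)

# Step 2: with 11 scores, the middle one is the 6th (index 5).
middle_index = len(ordered) // 2
manual_median = ordered[middle_index]

print(ordered)
print(manual_median)    # 98
print(median(friends))  # 98, same answer from the standard library
```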

19
Central Tendency: The Mean
Mean
 The sum of scores divided by the number of scores.
 Number of friends of 11 Facebook users.
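
The equation from this slide did not survive extraction. As a reconstruction, assuming the 11 friend counts are the same ones used on the median slide, the mean works out as follows (the symbols are our choice of notation):

$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{57 + 40 + 103 + 234 + 93 + 53 + 116 + 98 + 108 + 121 + 22}{11} = \frac{1045}{11} = 95$$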

20
The Dispersion: Range
The Range
 The smallest score subtracted from the largest
 For our Facebook friends data the highest score is 234 and the lowest is 22; therefore the range is:
 234 − 22 = 212

21
The Dispersion: The Interquartile Range
Quartiles
 The three values that split the sorted data into four equal parts.
 Second Quartile = median.
 Lower quartile = median of lower half of the data
 Upper quartile = median of upper half of the data
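
The quartile definitions above can be sketched directly in Python. This is one reading of the slide, assuming the median itself is left out of both halves when the number of scores is odd. Other software may use slightly different quartile rules. The function name is hypothetical, and the data are the Facebook friend counts from the earlier slides.

```python
from statistics import median


def quartiles(scores):
    """Return (lower quartile, median, upper quartile) using the median-of-halves rule.

    Assumes at least two scores. With an odd count, the middle score
    is excluded from both halves.
    """
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return median(lower_half), median(ordered), median(upper_half)


friends = [57, 40, 103, 234, 93, 53, 116, 98, 108, 121, 22]

lower, middle, upper = quartiles(friends)
iqr = upper - lower  # interquartile range: spread of the middle 50% of scores

print(lower, middle, upper)  # 53 98 116
print(iqr)                   # 63
```

Unlike the range (212), the interquartile range is not affected by the single extreme score of 234.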

22
Deviance

We can calculate the spread of scores by looking at how different each score is from the center of a distribution, e.g. the mean:
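
A short sketch of this idea, assuming the Facebook friend counts from the earlier slides: each deviance is a score minus the mean. The deviances always add up to zero, because the negative and positive ones cancel out. That cancellation is the reason for squaring them before summing.

```python
friends = [57, 40, 103, 234, 93, 53, 116, 98, 108, 121, 22]

mean = sum(friends) / len(friends)  # 1045 / 11 = 95.0

# Deviance of each score: how far it lies from the mean.
deviations = [x - mean for x in friends]
total_deviance = sum(deviations)

for score, deviation in zip(friends, deviations):
    print(f"{score:>3}  {deviation:+.0f}")

print(total_deviance)  # 0.0: positive and negative deviances cancel
```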

23
Sum of Squared Errors, SS
Indicates the total dispersion, or total deviance, of scores from the mean:

Its size is dependent on the number of scores in the data.
It is more useful to work with the average dispersion, known (that is, calculated) as the variance:
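
The equations from this slide were lost in extraction. A reconstruction in our own notation, using the Facebook friend counts (mean 95, N = 11), is given below. The N − 1 denominator is an inference: it is the one that reproduces the variance of 3224.6 quoted on the Standard Deviation slide.

$$SS = \sum_{i=1}^{N} (x_i - \bar{X})^2 = (57 - 95)^2 + (40 - 95)^2 + \dots + (22 - 95)^2 = 32246$$

$$s^2 = \frac{SS}{N - 1} = \frac{32246}{10} = 3224.6$$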

24
Standard Deviation
The variance gives us a measure in units squared.
In our Facebook example we would have to say that the average error in our data was 3224.6 friends squared.
This problem is solved by taking the square root of the variance, which is known as the standard deviation:
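
The chain from sum of squares to variance to standard deviation can be checked with a short Python sketch. It assumes the Facebook friend counts from the earlier slides and the sample (N − 1) form of the variance, which is the form that matches the 3224.6 figure above.

```python
from math import sqrt
from statistics import stdev, variance

friends = [57, 40, 103, 234, 93, 53, 116, 98, 108, 121, 22]

mean = sum(friends) / len(friends)

# Sum of squared errors: total squared deviance from the mean.
sum_of_squares = sum((x - mean) ** 2 for x in friends)

# Variance: average squared deviance, dividing by N - 1 (sample form).
sample_variance = sum_of_squares / (len(friends) - 1)

# Standard deviation: square root of the variance, back in units of friends.
sample_sd = sqrt(sample_variance)

print(sum_of_squares)   # 32246.0
print(sample_variance)  # 3224.6
print(round(sample_sd, 2))

# The standard library gives the same results.
print(variance(friends), stdev(friends))
```

The standard deviation comes out at roughly 56.79 friends, which is far easier to interpret than 3224.6 friends squared.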

25
Using a Frequency Distribution to go Beyond the Data

26
Important Things to Remember
The Sum of Squares, Variance, and Standard Deviation represent the same thing with different expressions:
The ‘Fit’ of the mean to the data
The variability in the data
How well the mean represents the observed data
Error

27
The Normal Probability Distribution

28
Going beyond the data: Z‐scores
Z‐scores
 Standardising a score with respect to the other scores in the group.
 Expresses a score in terms of how many standard deviations it is away from the mean.
 The distribution of z‐scores has a mean of 0 and SD = 1.

$$z = \frac{X - \bar{X}}{s}$$
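
As an illustration, a z-score is obtained by subtracting the mean from a score and dividing by the standard deviation. The sketch below applies this to the Facebook friend counts from the earlier slides. The function name is hypothetical, and the sample (N − 1) standard deviation is assumed.

```python
from statistics import mean, stdev

friends = [57, 40, 103, 234, 93, 53, 116, 98, 108, 121, 22]


def z_scores(scores):
    """Convert each score to the number of standard deviations from the mean."""
    centre = mean(scores)
    spread = stdev(scores)  # sample standard deviation (N - 1)
    return [(x - centre) / spread for x in scores]


friend_z = z_scores(friends)

for score, z in zip(friends, friend_z):
    print(f"{score:>3}  z = {z:+.2f}")

# After standardising, the scores have mean 0 and standard deviation 1.
print(round(mean(friend_z), 10), round(stdev(friend_z), 10))
```

The user with 234 friends ends up about 2.45 standard deviations above the mean, which marks that score as unusual within this group.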
29
Probability Density Function of a Normal Distribution

30
Properties of z‐scores
1.96 cuts off the top 2.5% of the distribution.
−1.96 cuts off the bottom 2.5% of the distribution.
As such, 95% of z‐scores lie between −1.96 and 1.96.
99% of z‐scores lie between −2.58 and 2.58.
99.9% of them lie between −3.29 and 3.29.
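
These cut-off values can be checked against the standard normal distribution. The sketch below uses `NormalDist` from Python's standard library (Python 3.8 or later). The helper name `central_area` is our own.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def central_area(z):
    """Proportion of z-scores lying between -z and +z."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)


for z in (1.96, 2.58, 3.29):
    print(f"between -{z} and +{z}: {central_area(z):.4f}")

# Going the other way: the z-score that cuts off the top 2.5%.
print(round(standard_normal.inv_cdf(0.975), 2))  # 1.96
```

The printed proportions are approximately 0.95, 0.99 and 0.999, matching the values listed above.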

These values are important, and they will matter to you once you start to read academic research papers.

31
