Chapter 7

Chapter Seven discusses data analysis and interpretation, emphasizing the conversion of raw data into meaningful results through various statistical techniques. It covers data processing steps, statistical measures, and the importance of hypothesis testing and interpretation in research. The chapter highlights the role of software like SPSS in data analysis and the necessity of careful interpretation to draw accurate conclusions.

Uploaded by

Milkii Santa
Chapter Seven: Data Analysis and Interpretation
❑ Major Points for Discussion
• Meaning of data analysis
• Data processing
• Statistical techniques or methods of data analysis
• Meaning and techniques of interpretation
Data Analysis
• Analysis of data refers to the conversion of raw data, after
proper treatment, into a meaningful form from which results
can be drawn.
• Analysis of data includes comparing the outcomes of the
various treatments applied to several groups and making
appropriate decisions about whether the goals of the
research have been achieved.
Data Analysis…
➢ Data, facts and figures are silent: they never speak for
themselves, yet they contain complexities.
➢ It is only by organizing, analyzing and interpreting the
data that we can know their important features,
inter-relationships and cause-and-effect relationships.
➢ The trends and sequences inherent in the phenomena can
be elaborated by means of generalization.
Data Analysis…
➢ Analysis of data involves a number of closely related
operations that are performed with the purpose of
summarizing the collected data and organizing them in
such a manner that they yield answers to the research
questions or test the hypotheses.
➢ It also refers to seeing the data in the light of the hypotheses
or research questions and the prevailing theories, and drawing
conclusions that are as amenable to theory formation as
possible.
Data Analysis…
➢ These days, with the availability of computers and
statistical software, we can analyze large and complex data
sets within a short period of time.
• One of the most popular statistical packages, which can
perform highly complex data manipulation and analysis with
simple instructions, is SPSS, an acronym for
Statistical Package for the Social Sciences.
Data analysis…
➢ Since it is a highly skilled and technical job, data analysis
should be carried out by the researcher or under the
researcher's close supervision.
➢ The researcher should also possess sound judgment and the
ability to generalize, and should be familiar with the
background, objectives and hypotheses of the study.
Important Questions in Data Analysis
• What is the nature of the data?

• How will the data be categorized?

• Will standardized editing and coding procedures be used?

• How many variables are to be investigated simultaneously?

• What statistical software will be used?

• What questions need to be answered?


Data Preparation/Processing

• Processing is the stage in which the collected data are
organized and made ready for further analysis and
interpretation.
• It is an intermediary stage between the collection of data
and their analysis and interpretation.
Steps in Data Processing
❖ Processing consists of:
• Editing: checking the data for accuracy.
• Coding: assigning numerical scores or classifying symbols
to the edited data.
• Classification: arranging data into sequences and groups
according to some common characteristics.
• Tabulation: summarizing raw data and displaying them in
compact form.
➢ Statistical tables are orderly arrangements of data in
columns and rows.
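The coding and tabulation steps above can be sketched in a few lines of Python. This is only an illustration: the answers in `raw_answers` are made up, `code_answer` is a hypothetical helper, and the coding scheme (1 = male, 0 = female) is simply one possible choice.

```python
from collections import Counter

# Hypothetical raw survey answers, typed inconsistently by data-entry staff.
raw_answers = ["Male", "female", " Female", "male", "FEMALE"]

# Coding scheme (an assumption for this sketch): 1 = male, 0 = female.
CODES = {"male": 1, "female": 0}


def code_answer(answer):
    """Tidy the text (editing), then map it to its numerical code (coding)."""
    return CODES[answer.strip().lower()]


coded = [code_answer(a) for a in raw_answers]

# Tabulation: count how often each code occurs.
frequency_table = Counter(coded)

for code, count in sorted(frequency_table.items()):
    print(f"code {code}: {count} respondents")
```

The resulting frequency table is the compact, row-and-column summary that tabulation aims for.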
Data Cleaning
❖ Data cleaning is part of data editing and involves three checks:
1. Completeness: checking that there is an answer to
every question.
2. Accuracy: checking, as far as possible, that all
questions are answered accurately.
3. Uniformity: checking that interviewers have
interpreted instructions and questions uniformly.
Data Cleaning…
• Data cleaning helps to get rid of obvious data entry errors:
• Outliers (very high or very low numbers).
Example: Age = 110 (really 10 or 11?)
• Values that do not exist for a variable.
Example: 2 entered where 1 = male, 0 = female
• Missing values.
➢ Did the person not give an answer? Was the answer
accidentally not entered into the database?
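The three checks above (outliers, invalid codes, missing values) can be automated. The sketch below is a minimal illustration with made-up records; the plausible age range in `AGE_RANGE` and the helper name `find_problems` are assumptions, not part of the chapter.

```python
# Hypothetical records; None marks an answer that was never entered.
records = [
    {"id": 1, "age": 10, "sex": 1},
    {"id": 2, "age": 110, "sex": 0},   # suspicious age (10 or 11 mistyped?)
    {"id": 3, "age": 11, "sex": 2},    # 2 is not a defined sex code
    {"id": 4, "age": None, "sex": 0},  # missing age
]

VALID_SEX_CODES = {0, 1}  # 1 = male, 0 = female
AGE_RANGE = (5, 18)       # assumed plausible range for this sample


def find_problems(record):
    """Return a list of data-entry problems found in one record."""
    problems = []
    for field, value in record.items():
        if value is None:
            problems.append(f"missing {field}")
    age = record["age"]
    if age is not None and not (AGE_RANGE[0] <= age <= AGE_RANGE[1]):
        problems.append("age out of range")
    sex = record["sex"]
    if sex is not None and sex not in VALID_SEX_CODES:
        problems.append("invalid sex code")
    return problems


# Keep only the records that have at least one problem.
report = {}
for record in records:
    problems = find_problems(record)
    if problems:
        report[record["id"]] = problems

for record_id, problems in report.items():
    print(record_id, problems)
```

Flagged records should be checked against the original questionnaires rather than corrected by guesswork.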
Statistical Analysis in Research
• Statistical Analysis means the computation of certain indices
or measures along with searching for patterns of relationship
that exist among the data groups.
• Statistical Analysis involves estimating the values of unknown
parameters of the population and testing hypotheses in order
to draw inferences.
• The fundamental question that arises in the mind of the
researcher is: “What technique should be used to analyze the
collected data?”
Types of Statistical Measures
❖The common statistical measures used in statistical analysis may be categorized
into the following types:
1. Measures of central tendency:
➢ Mean, mode, median.
2. Measures of dispersion:
a. Ranges, variance, standard deviation.
b. For comparison purposes, we mostly use the coefficient of standard
deviation or the coefficient of variation. (The t-test and Chi-square
are tests of significance rather than measures of dispersion.)
3. Measures of Association/relations: Correlation, Regression, Factor analysis.
4. Analysis of variance:
➢ One-way ANOVA, Two-way ANOVA, Multivariate analysis and Analysis of
Covariance.
5. Time series analysis:
➢ Seasonal, cyclical, trend and erratic variations.
Types of Statistical Analysis
➢Statistical analysis can broadly be classified into:
• Descriptive analysis (statistics), and
• Inferential analysis (statistics)
a. Descriptive Statistics
• Descriptive statistics is the term given to the analysis of data
that helps describe, show or summarize data in a meaningful way.
• Typically, there are two general types of statistics that are
used to describe data:
1. Measures of central tendency (mean, median and mode)
2. Measures of dispersion/spread/variation (range, standard
deviation, variance)
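The two families of descriptive measures listed above can be computed with Python's standard `statistics` module. The scores below are made-up illustration data; the variance and standard deviation are the sample versions (divisor n - 1).

```python
import statistics

# Hypothetical test scores for eight students.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion
data_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (n - 1)
std_dev = statistics.stdev(scores)      # sample standard deviation

# Coefficient of variation: standard deviation as a percentage of the mean.
coefficient_of_variation = std_dev / mean * 100

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance:.2f}, sd={std_dev:.2f}")
print(f"CV={coefficient_of_variation:.1f}%")
```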
Types of statistics…
b. Inferential statistics
• Inferential statistics is concerned with making predictions or
inferences about a population from observations and analyses of a
sample.
• That is, we can take the results of an analysis of a sample and
generalize them to the larger population that the sample
represents.
• There are two areas of statistical inference: (a) statistical
estimation and (b) the testing of hypotheses.
• The t-test, Analysis of Variance (ANOVA), Chi-square, Analysis of
Covariance (ANCOVA), correlation analysis and regression analysis
are commonly used techniques in inferential analysis.
Descriptive & Inferential Statistics
Descriptive Statistics          Inferential Statistics
• Organize                      • Generalize from samples to populations
• Summarize                     • Hypothesis testing
• Simplify                      • Relationships among variables
• Presentation of data
Describing data                 Making predictions

Hypothesis Testing
❖ Comparing Means: t-test
• The t-test is an inferential technique used to determine whether
there is a significant difference between the means of two
groups.
• There are three common types of t-tests:
➢ One sample t-test (the mean of one sample compared with the
population mean),
➢ Independent samples t-test (the means of two independent
samples),
➢ Dependent (paired) samples t-test (the means of the same
sample measured on two occasions).

Parametric Tests…
a. One Sample t-test
• The one sample t-test is used to compare the mean of a single
sample with the population mean.
• Example
• Comparing the per capita income of Oromia with the
national average.
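As a minimal sketch of the calculation behind this test, the t statistic is the difference between the sample mean and the population mean divided by the standard error of the sample mean. The income figures and the national average below are made-up illustration values, and `one_sample_t` is a hypothetical helper name.

```python
import math
import statistics


def one_sample_t(sample, population_mean):
    """Return the t statistic and degrees of freedom for a one sample t-test."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)        # sample SD (n - 1)
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - population_mean) / standard_error
    return t, n - 1


# Hypothetical per capita incomes (in thousands) from eight districts.
regional_incomes = [2, 4, 4, 4, 5, 5, 7, 9]
national_average = 4  # assumed value for illustration

t_stat, df = one_sample_t(regional_incomes, national_average)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The t value is then compared with the critical value for the given degrees of freedom; a package such as SPSS reports the corresponding p-value directly.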
Parametric Tests…
b. Independent Samples t-test
• Used when we are interested in comparing two populations using a
random sample from each.
• Used to detect differences between the means of two independent
groups, hence the name independent samples test.
❖Examples
a. Comparing the per capita incomes of two different regions.
b. Comparing boys and girls on a reading ability test.
c. Testing the hypothesis that there is no difference between men
and women in mid-exam scores.
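A sketch of the calculation for two independent groups is given below. It uses the classical pooled-variance form of the test, which assumes the two groups have roughly equal variances; the reading scores are made up and `independent_t` is a hypothetical helper name.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n1, n2 = len(group_a), len(group_b)
    mean1, mean2 = statistics.mean(group_a), statistics.mean(group_b)
    var1, var2 = statistics.variance(group_a), statistics.variance(group_b)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Hypothetical reading-test scores.
boys = [1, 2, 3, 4, 5]
girls = [3, 4, 5, 6, 7]

t_stat, df = independent_t(boys, girls)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

A negative t here simply means the first group's mean is lower than the second's.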
Dependent (Paired) Samples t-test
• When observations are made on the same sample at two different
times, the dependent (paired) samples t-test is used.
• Examples
➢ The HR manager wants to know if a particular training program
had any impact in increasing the motivation level of the
employees.
➢ The production manager wants to know if a new method of
handling machines helps in reducing the breakdown period.
➢ An educator wants to know if interactive teaching helps
students learn more than one-way lecturing.
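The paired test works on the difference between each person's two measurements, so it reduces to a one sample t-test of the differences against zero. The motivation scores below are made-up illustration data and `paired_t` is a hypothetical helper name.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic and degrees of freedom for paired (before/after) data."""
    if len(before) != len(after):
        raise ValueError("each subject needs a before and an after score")
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical motivation scores of five employees.
before_training = [5, 6, 7, 8, 9]
after_training = [6, 8, 8, 10, 10]

t_stat, df = paired_t(before_training, after_training)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```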
Comparing Means: Analysis of Variance
• Analysis of Variance (ANOVA) is used to compare the
means of more than two populations.
❖Examples
• Consumer behavior: A researcher wants to investigate the
impact of three different advertising stimuli on the shopping
propensity of males and females as well as consumers of
different age brackets. The dependent variable here is
shopping propensity and independent variables or the factors
are advertising stimuli, gender, and age brackets.
• Marketing management: A marketing manager wants to
investigate the impact of different discount schemes on the
sales of three major brands of edible oil.
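For a single factor (for example, discount scheme), the one-way ANOVA compares the variation between group means with the variation within groups, giving an F ratio. The sketch below uses made-up sales figures and a hypothetical helper `one_way_anova`; it does not cover designs with several factors such as the consumer behavior example.

```python
import statistics


def one_way_anova(groups):
    """Return F, between-groups df and within-groups df for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical weekly sales under three discount schemes.
sales_by_scheme = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

f_ratio, df_between, df_within = one_way_anova(sales_by_scheme)
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

A large F suggests that at least one group mean differs from the others; it does not say which one.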
Correlation Analysis
• Correlation analysis is a method that is used to discover if there is
a relationship between two variables, and how strong that
relationship may be.
• When there are two variables, the correlation between them is
called simple correlation.
• When there are more than two variables and we want to study the
relation between two of them only, holding the others
constant, the relation is called partial correlation.
• When there are more than two variables and we want to study the
relation of one variable with all the other variables together, the
relation is called multiple correlation.
Correlation …
• The marketing manager wants to know if price reduction has any
relationship with increasing sales.
• The production department wants to know if the number of
defective items produced has anything to do with the age of the
machine.
• The HR department wants to know if the productivity of its
workers decreases with the number of hours they work.
• Does attendance have an association with exam score?

Correlation Coefficient
• The correlation coefficient gives a mathematical value for
measuring the strength of the linear relationship between two
variables.
• It can take values from –1 to +1, where:
➢ +1 represents a perfect positive linear relationship (as X
increases, Y increases).
➢ 0 represents no linear relationship (X and Y show no
linear pattern).
➢ –1 represents a perfect negative (inverse) linear relationship
(as X increases, Y decreases).
➢ The closer the value is to –1 or +1, the stronger the
association between the variables.
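The coefficient described here is usually Pearson's r. As a minimal sketch (the small data sets are made up and `pearson_r` is a hypothetical helper name), it can be computed from the deviations of X and Y around their means:

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


r_positive = pearson_r([1, 2, 3], [2, 4, 6])  # Y rises with X
r_negative = pearson_r([1, 2, 3], [6, 4, 2])  # Y falls as X rises
r_curved = pearson_r([1, 2, 3], [1, 0, 1])    # U-shaped pattern

print(round(r_positive, 3), round(r_negative, 3), round(r_curved, 3))
```

The third case is a reminder that r measures only linear association: the U-shaped data have a clear pattern, yet r is zero.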
Regression analysis
• Regression analysis is a statistical process for estimating the
relationships among variables.
• It includes many techniques for modeling and analyzing several
variables.
• It focuses on the relationship between a dependent variable and
one or more independent variables.
• It is commonly used in cause-and-effect types of research.

Regression examples
• The marketing manager wants to know if sales depend on
factors such as advertising budget, number of products
introduced, number of sales personnel, etc.
• The HR department wants to predict the efficiency of
management trainees based on their academic performance,
leadership abilities, IQ level, etc.
Types of Regression Analysis
• Simple Linear Regression: a statistical model that uses one
quantitative independent variable “X” to predict one quantitative
dependent variable “Y”.
• Multiple Linear Regression: a statistical model that uses two
or more quantitative or qualitative explanatory variables
(X1, X2, …, Xp) to predict a quantitative dependent variable.
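For the simple linear case, the line Y = a + bX is usually fitted by ordinary least squares. The sketch below shows that calculation under the assumption of made-up data; `fit_line` is a hypothetical helper name, and statistical packages perform the same computation (plus significance tests) automatically.

```python
import statistics


def fit_line(x, y):
    """Least-squares intercept and slope for the model Y = a + b*X."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: advertising budget (X) and sales (Y).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

intercept, slope = fit_line(advertising, sales)
print(f"Y = {intercept:.2f} + {slope:.2f} X")
```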
Multiple Regression Types
• There are three major types of multiple regression: standard
multiple regression, hierarchical or sequential regression and
stepwise or statistical regression.
• In standard multiple regression, all the independent variables
(IVs) are entered into the equation together.
• In hierarchical regression, IVs are entered in an order
pre-specified by the researcher on the basis of theoretical
considerations.
• In stepwise regression, the order of entry of the variables is
based solely on statistical criteria. Independent variables are
entered according to some order:
– By size of correlation with the dependent variable
– In order of significance
Regression Coefficient
• A regression coefficient is a measure of how strongly each
IV (also known as a predictor variable) predicts the DV
(dependent variable).
• There are two types of regression coefficients: unstandardized
coefficients and standardized coefficients, also known as
beta values.
• The unstandardized coefficients can be used in the
equation as coefficients of the different IVs, along with the
constant term, to predict the value of the DV.
• The standardized coefficient (beta) is, however, measured
in standard deviations.
• A beta value of 2 associated with a particular IV indicates
that a change of 1 standard deviation in that IV is associated
with a change of 2 standard deviations in the DV, holding the
other IVs constant. (In practice, beta values usually lie
between –1 and +1.)
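The link between the two kinds of coefficient can be illustrated for the one-predictor case, where beta equals the unstandardized slope multiplied by the ratio of the standard deviations of X and Y. The data and the helper name `standardized_beta` are assumptions for this sketch; with several predictors a statistical package computes the betas.

```python
import statistics


def standardized_beta(x, y):
    """Standardized slope (beta) for a regression of y on a single x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    unstandardized_b = sum_xy / sum_xx
    # Rescale the slope from raw units to standard-deviation units.
    return unstandardized_b * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical data.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

beta = standardized_beta(x_values, y_values)
print(f"beta = {beta:.3f}")
```

With a single predictor, beta is identical to the correlation coefficient between X and Y.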
Example: Output from SPSS
• Key regression table:
Y = -6.66 + 0.36x (p-value < 0.001)
where Y is birth weight (lbs) and x is gestational age (weeks).
• As p < 0.05, gestational age is a significant predictor of
birth weight. Predicted birth weight increases by 0.36 lbs for
each additional week of gestation.
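The fitted equation can be used directly for prediction. The sketch below only plugs values into the equation reported above; the function name `predict_birth_weight` and the example of 40 weeks are illustrative assumptions.

```python
# Coefficients taken from the regression output above.
INTERCEPT = -6.66  # constant term (lbs)
SLOPE = 0.36       # change in birth weight (lbs) per week of gestation


def predict_birth_weight(gestational_weeks):
    """Predicted birth weight in lbs for a given gestational age in weeks."""
    return INTERCEPT + SLOPE * gestational_weeks


predicted = predict_birth_weight(40)
print(f"Predicted weight at 40 weeks: {predicted:.2f} lbs")
```

Predictions are only trustworthy within the range of gestational ages present in the original sample.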
How reliable are predictions? R²
• How much of the variation in birth weight is explained by the
model including gestational age?
• Proportion of the variation in birth weight explained by the
model: R² = 0.499 ≈ 50%.
• Predictions using the model are fairly reliable.
• Which variables may help improve the fit of the model?
• Compare models using adjusted R².
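Adjusted R² penalizes R² for the number of predictors, which is why it is preferred when comparing models of different sizes. A minimal sketch of the standard formula follows; the sample size of 40 is an assumption for illustration, since the chapter does not report it, and `adjusted_r2` is a hypothetical helper name.

```python
def adjusted_r2(r2, n_observations, n_predictors):
    """Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    n, p = n_observations, n_predictors
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# R2 = 0.499 comes from the example above; n = 40 is a made-up sample size.
adj = adjusted_r2(0.499, n_observations=40, n_predictors=1)
print(f"Adjusted R2 = {adj:.3f}")
```

Adding a predictor always raises R² slightly, but adjusted R² rises only if the new variable improves the fit by more than chance would.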
Assumptions of multiple regression

1. Adequate sample size
2. Normality
3. Linearity
4. Homoscedasticity
5. Others (e.g. independence of residuals, no multicollinearity, no
extreme outliers)
Logistic regression
• In multiple regression, we explored a technique to assess the impact
of a set of predictors on a dependent variable.
• In that case the dependent variable was measured as a continuous
variable.
• There are many research situations, however, when the dependent
variable of interest is categorical/binary in nature (e.g. win/lose;
fail/pass; dead/alive).
• Unfortunately, multiple regression is not suitable when the
dependent variable is categorical.
• In logistic regression the dependent variable is
categorical/binary (having two categories), and the independent
variables can be continuous or categorical.
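Logistic regression handles a binary outcome by modeling the probability of one category with the logistic (S-shaped) function, so predictions always lie between 0 and 1. The sketch below only illustrates that function; the intercept and slope are made-up values, not estimates from data, and `pass_probability` is a hypothetical helper name.

```python
import math


def logistic(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1 / (1 + math.exp(-z))


def pass_probability(study_hours, intercept=-4.0, slope=1.0):
    """Probability of passing (vs. failing) for a given number of study hours.

    The intercept and slope are hypothetical coefficients for illustration.
    """
    return logistic(intercept + slope * study_hours)


for hours in (2, 4, 6):
    print(hours, round(pass_probability(hours), 3))
```

In a real analysis the coefficients are estimated from the data by the statistical package, and a case is typically classified into the category whose predicted probability exceeds 0.5.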
Interpretation of Research Findings

• Interpretation refers to the task of drawing
inferences from the collected facts after an
analytical task is done.
• The task of drawing conclusions or inferences and of
explaining their significance after a careful analysis
of selected data is known as interpretation.
Interpretation…
➢ For any successful study, the tasks of analysis and
interpretation should be designed before the data
are actually collected, with the exception of
formulative (exploratory) studies, where the researcher
has no idea what kind of answer to expect.
➢ Otherwise there is always a danger of being too late
and of missing important relevant data.
Interpretation…
➢ Since analysis and interpretation of data are
interwoven, interpretation should more properly
be conceived of as a special aspect of analysis rather
than a distinct operation.
➢ Interpretation is the process of establishing the
relationships between variables that are expressed
in the findings and explaining why such relationships exist.
Interpretation…
➢ It must, therefore, be clear that if the methods of
statistics are not properly and correctly applied as the
science demands, then the inferences drawn from a
set of data will also be wrong.
➢ On the other hand, if the data have been collected and
analyzed properly according to accepted principles of
the science, there is no reason why the conclusions
that emerge from such data should not be
fairly accurate.
➢ Hence, it is said that “statistics are like clay of which you
can make either a god or a devil.”
Need of interpretation
➢ Interpretation is considered a basic component of the
research process for the following reasons:
• It is through interpretation that the researcher can
properly understand the abstract principle that works
beneath the findings.
• It leads to the establishment of explanatory
concepts that can serve as a guide for further
research.
End of Chapter
