Unit 3 Research Methods
Introduction:
There are three categories of analysis to be aware of:
• Univariate analysis, which looks at just one variable
• Bivariate analysis, which analyses two variables
• Multivariate analysis, which looks at more than two variables
1. Univariate data
This type of data consists of only one variable. Analysis of univariate data
is the simplest form of analysis, since it deals with only one quantity that
changes. It does not deal with causes or relationships; its main purpose is
to describe the data and find patterns within it. An example of univariate
data is a set of height measurements.
2. Bivariate data
This type of data involves two different variables. Analysis of this type of
data deals with causes and relationships, and is carried out to find the
relationship between the two variables. An example of bivariate data is
temperature and ice cream sales during the summer.
Suppose temperature and ice cream sales are the two variables of a
bivariate data set (figure 2). The data show that temperature and sales are
positively related: as the temperature increases, sales also increase. Thus
bivariate data analysis involves comparisons, relationships, causes and
explanations. The two variables are often plotted on the X and Y axes of a
graph to make the data easier to understand; one of the variables is
independent while the other is dependent.
3. Multivariate data
When the data involve three or more variables, they are categorized as
multivariate. For example, suppose an advertiser wants to compare the
popularity of four advertisements on a website. Their click rates could be
measured for both men and women, and the relationships between the
variables could then be examined.
Multivariate analysis is similar to bivariate analysis but can involve more
than one dependent variable. The way the analysis is performed depends on
the goals to be achieved. Some of the techniques are regression analysis,
path analysis, factor analysis and multivariate analysis of variance (MANOVA).
Disadvantages
• The main disadvantage of multivariate analysis (MVA) is that it requires
rather complex computations to arrive at a satisfactory conclusion.
Multivariate analysis techniques can be classified into two broad categories. The
classification depends on the question: are some of the variables involved dependent
on others?
If the answer is yes: we have dependence methods.
If the answer is no: we have interdependence methods.
Multiple Regression
Multiple regression is an extension of simple linear regression. It is used when we
want to predict the value of a variable based on the values of two or more other
variables. The variable we want to predict is called the dependent variable (or
sometimes the outcome, target, or criterion variable); the variables used to predict
it are the independent variables. Each observation therefore consists of several “x”
values and one outcome, for example ((x1)1, (x2)1, (x3)1, Y1) for the first observation.
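As a rough illustration of how such a model can be fitted, the sketch below estimates the coefficients by ordinary least squares using only the Python standard library. The function name `fit_multiple_regression` and the height, age and weight figures are made up for this example; in practice a statistics package would normally be used.

```python
def fit_multiple_regression(rows, y):
    """Ordinary least squares for y = b0 + b1*x1 + ... + bk*xk; returns [b0, ..., bk]."""
    # Add a leading 1 to every row so that b0 acts as the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Build the normal equations (X'X) b = X'y as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up data, generated exactly from weight = 5 + 0.5*height + 0.2*age.
heights = [150, 160, 170, 180, 165]
ages = [20, 35, 30, 45, 50]
weights = [5 + 0.5 * h + 0.2 * a for h, a in zip(heights, ages)]

coef = fit_multiple_regression(list(zip(heights, ages)), weights)
print("intercept, height coefficient, age coefficient:", [round(c, 3) for c in coef])
```

Because the made-up outcome was built without noise, the fitted coefficients should recover the values used to generate it; with real data they would only be estimates.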
Conjoint analysis
Conjoint analysis is a survey-based statistical technique used in market research
that helps determine how people value the different attributes (feature, function,
and so on) that make up a product or service.
There are multiple conjoint techniques, including CBC (choice-based conjoint) and
ACBC (adaptive CBC).
In discriminant analysis, the weights assigned to each independent variable are
corrected for the interrelationships among all the variables. These weights are
referred to as discriminant coefficients.
Binary outcomes are everywhere: whether a person died or not, broke a hip, or has
hypertension or diabetes, etc.
We typically want to understand the probability of the binary outcome given the
explanatory variables.
We could use our linear model to do so, and the result is simple to interpret:
the parameters give the change in the probability of Y when X changes by one unit
(or for a small change in X). For example, if we model died = β0 + β1 × age + ε, we
could interpret β1 as the change in the probability of death for an additional year of age.
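The interpretation of β1 can be made concrete with a small sketch of a linear model fitted to a 0/1 outcome. The ages and outcomes below are hypothetical and `fit_line` is a helper written for this example only.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: age in years and whether the person died (1) or not (0).
age = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
died = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_line(age, died)
print(f"estimated change in P(death) per extra year of age: {b1:.4f}")
# Fitted values are read as probabilities, but nothing keeps them inside [0, 1].
print(f"fitted value at age 30: {b0 + b1 * 30:.3f}")
```

With these made-up numbers the fitted value at age 30 falls below zero, which is one reason a model designed for binary outcomes, such as logistic regression, is often preferred to the plain linear model.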
Structural equation modelling (SEM) can, in a single analysis, assess both the
assumed causation among a set of dependent and independent constructs (validation
of the structural model) and the loadings of observed items (measurements) on their
expected latent variables (constructs), i.e. validation of the measurement model.
Analysing the measurement model and the structural model together allows the
measurement errors of the observed variables to be analysed as an integral part of
the model, and combines factor analysis and hypothesis testing in one operation.
Interdependence Technique
Interdependence techniques are used when the variables cannot be
classified as either dependent or independent.
Factor Analysis
Factor analysis is a way to condense the data in many variables into just a few
variables. For this reason, it is also sometimes called “dimension reduction”. It
groups together variables that are highly correlated. Factor analysis includes
techniques such as principal component analysis and common factor analysis.
This type of technique is often used as a pre-processing step to transform the data
before using other models. When the data have too many variables, multivariate
techniques do not perform at their best, as patterns are more difficult to find. By
using factor analysis, the patterns become less diluted and easier to analyse.
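To give a feel for dimension reduction, the sketch below finds the first principal component of three correlated variables, using principal component analysis as mentioned above. It uses power iteration on the covariance matrix, which is only one simple way to obtain the leading component; the data and the function name are invented for this illustration.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, share): the first principal axis and its share of total variance."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centred = [[r[j] - means[j] for j in range(k)] for r in rows]
    # Sample covariance matrix of the centred variables.
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(k)]
           for i in range(k)]
    # Power iteration: repeatedly multiply by the covariance matrix and renormalise.
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v divided by the total variance of all variables.
    along = sum(v[i] * cov[i][j] * v[j] for i in range(k) for j in range(k))
    total = sum(cov[i][i] for i in range(k))
    return v, along / total


# Made-up measurements of three variables that tend to rise and fall together.
data = [
    (2.0, 4.1, 1.0),
    (3.0, 6.2, 1.4),
    (4.0, 7.9, 2.1),
    (5.0, 10.1, 2.4),
    (6.0, 11.8, 3.1),
]
direction, share = first_principal_component(data)
print("first component direction:", [round(x, 3) for x in direction])
print(f"share of total variance captured: {share:.3f}")
```

When the variables are this strongly correlated, a single component carries almost all of the variance, so the three original variables could be replaced by one derived variable with little loss of information.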
Cluster analysis
Cluster analysis is a class of techniques used to classify objects or cases into
relatively homogeneous groups called clusters. In cluster analysis, there is no prior
information about the group or cluster membership of any of the objects.
• In cluster analysis, we first partition the data set into groups
based on data similarity and then assign labels to the groups.
• The main advantage of clustering over classification is that it is adaptable
to changes and helps single out useful features that distinguish different
groups.
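The partition-then-label idea can be sketched with k-means, one common clustering technique (the text does not name a specific algorithm, so this choice is an assumption). The points and starting centres below are made up, and the function is a bare-bones version without any convergence check.

```python
def k_means(points, centres, iterations=20):
    """Plain k-means: returns (labels, centres) after a fixed number of passes."""
    centres = [tuple(c) for c in centres]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre (squared distance).
        labels = [
            min(range(len(centres)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centres[c])))
            for point in points
        ]
        # Update step: move each centre to the mean of its members.
        for c in range(len(centres)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centres


# Made-up 2-D observations forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centres = k_means(points, centres=[points[0], points[3]])
print("cluster label for each point:", labels)
print("cluster centres:", centres)
```

No group membership is supplied in advance: the labels come out of the similarity-based partition, which is the sense in which clustering differs from classification.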
Multidimensional Scaling
Multidimensional scaling (MDS) is a technique that creates a map displaying the
relative positions of several objects, given only a table of the distances between
them. The map may consist of one, two, three, or even more dimensions. An MDS
program calculates either a metric or a non-metric solution. The table of
distances is known as the proximity matrix. It arises either directly from
experiments or indirectly, for example as a correlation matrix.
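As a minimal sketch of the metric case, the code below builds a one-dimensional map from a table of distances using classical MDS: the squared distances are double-centred and the leading eigenvector is scaled by the square root of its eigenvalue. The four towns and their distances are hypothetical, the function name is invented here, and the sketch assumes the distances really can be laid out on a line.

```python
import math


def classical_mds_1d(dist, iterations=500):
    """One-dimensional classical (metric) MDS from a symmetric distance table."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared distances (the table is symmetric,
    # so column means equal row means).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the dominant eigenvector of the centred matrix.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    # Coordinates are the eigenvector scaled by the square root of its eigenvalue.
    return [x * math.sqrt(eigenvalue) for x in v]


# Hypothetical proximity matrix: road distances between four towns on one road.
distances = [
    [0, 3, 7, 12],
    [3, 0, 4, 9],
    [7, 4, 0, 5],
    [12, 9, 5, 0],
]
coords = classical_mds_1d(distances)
print("map positions:", [round(c, 3) for c in coords])
```

The recovered positions are only determined up to a shift and a reflection, which is why an MDS map shows relative rather than absolute positions.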
Correspondence analysis
Correspondence analysis is a method for visualizing the rows and columns of a table
of non-negative data as points in a map, with a specific spatial interpretation. The
data are usually counts in a cross-tabulation, although the method has been extended
to many other types of data using appropriate data transformations. For
cross-tabulations, the method can be considered to explain the association between
the rows and columns of the table as measured by the Pearson chi-square statistic.
The method has several similarities to principal component analysis, in that it
situates the rows or the columns in a high-dimensional space and then finds a
best-fitting subspace, usually a plane, in which to approximate the points.
Dependence methods
Dependence methods are used when one or some of the variables are dependent on
others. Dependence looks at cause and effect; in other words, can the values of two or
more independent variables be used to explain, describe, or predict the value of
another, dependent variable? To give a simple example, the dependent variable of
“weight” might be predicted by independent variables such as “height” and “age.”
In machine learning, dependence techniques are used to build predictive models. The
analyst enters input data into the model, specifying which variables are independent
and which ones are dependent—in other words, which variables they want the model to
predict, and which variables they want the model to use to make those predictions.
Interdependence methods
Hypothesis testing
The final step of a hypothesis test is to analyse the results and either reject the null
hypothesis, or state that the null hypothesis is plausible, given the data.
For example, suppose the null hypothesis is that a penny has a 50% chance of landing
on heads. A random sample of 100 coin flips is taken, and the null hypothesis is then
tested. If the 100 flips come out as 40 heads and 60 tails, a result this far from
50/50 is unlikely under the null hypothesis (an exact two-sided test gives a P-value
of about 0.057). The analyst would doubt that the penny has a 50% chance of landing on
heads and, at any significance level above that P-value, would reject the null
hypothesis in favour of the alternative hypothesis.
If, on the other hand, there were 48 heads and 52 tails, then it is plausible that the coin
could be fair and still produce such a result. In cases such as this, where the null
hypothesis is "accepted" (more precisely, not rejected), the analyst states that the
difference between the expected results (50 heads and 50 tails) and the observed
results (48 heads and 52 tails) is "explainable by chance alone."
Hypothesis testing is the use of statistics to assess whether observed data are
consistent with a given hypothesis. The usual process of hypothesis testing consists
of four steps.
1. Formulate the null hypothesis H0, read "H nought" (commonly, that the observations
are the result of pure chance) and the alternative hypothesis Ha (commonly, that the
observations show a real effect combined with a component of chance variation).
2. Identify a test statistic that can be used to assess the truth of the null hypothesis.
3. Compute the P-value, which is the probability that a test statistic at least as extreme
as the one observed would be obtained assuming that the null hypothesis were true.
The smaller the P-value, the stronger the evidence against the null hypothesis.
4. Compare the P-value to an acceptable significance level alpha (sometimes called
an alpha value). If P <= alpha, the observed effect is statistically significant, the null
hypothesis is rejected, and the alternative hypothesis is favoured.
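The four steps can be traced on the coin-flip example with a short sketch. It assumes a fair-coin null hypothesis, uses the number of heads as the test statistic, and takes alpha = 0.05 as an illustrative significance level (the text does not fix one); the function name is invented for this example.

```python
from math import comb


def two_sided_p_value(heads, flips):
    """Exact two-sided P-value for H0: the coin is fair, P(heads) = 0.5."""
    observed = comb(flips, heads)
    # Under H0, P(k heads) = C(flips, k) / 2**flips, so counts can be compared directly.
    # Sum the probabilities of all outcomes no more likely than the one observed.
    as_extreme = sum(comb(flips, k) for k in range(flips + 1)
                     if comb(flips, k) <= observed)
    return as_extreme / 2 ** flips


alpha = 0.05  # assumed significance level, chosen only for illustration
p_40 = two_sided_p_value(40, 100)
p_48 = two_sided_p_value(48, 100)
for heads, p in ((40, p_40), (48, p_48)):
    decision = "reject H0" if p <= alpha else "do not reject H0"
    print(f"{heads} heads out of 100: P = {p:.4f} -> {decision}")
```

Under these assumptions the 40-heads result has a P-value of roughly 0.057, so whether it counts as statistically significant depends on the alpha chosen: it is not rejected at 0.05 but would be at 0.10. The 48-heads result is comfortably consistent with a fair coin at either level.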
Measures of Association:
Methods of analysis
Pearson’s correlation coefficient
A typical example for quantifying the association between two variables measured on
an interval/ratio scale is the analysis of relationship between a person’s height and
weight. Each of these two characteristic variables is measured on a continuous scale.
Correlation coefficients that differ from 0 but are not −1 or +1 indicate a linear
relationship, although not a perfect linear relationship. In practice, ρ (the population
correlation coefficient) is estimated by r, which is the correlation coefficient derived
from sample data.
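A minimal sketch of how the sample coefficient r can be computed from paired measurements is given below. The height and weight figures are made up for illustration and `pearson_r` is a helper defined here, not a library function.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up heights (cm) and weights (kg) for six people.
heights_cm = [152, 160, 165, 171, 178, 185]
weights_kg = [51, 58, 63, 66, 77, 82]

r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.3f}")
```

A value of r close to +1 here would indicate a strong, though not perfect, positive linear relationship between the two measurements.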
Spearman rho and Kendall tau
As an example of when Spearman rho would be appropriate, consider the case where
there are seven substantial health threats to a community. Health officials wish to
determine a hierarchy of threats in order to most efficiently deploy their resources.
They ask two credible epidemiologists to rank the seven threats from 1 to 7, where 1 is
the most significant threat. The Spearman rho or Kendall tau may be calculated to
measure the degree of association between the epidemiologists’ rankings, thereby
indicating the collective strength of a potential action plan. If there is a significant
association between the two sets of ranks, health officials may feel more confident in
their strategy than if a significant association is not evident.
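For two complete rankings without ties, Spearman rho can be computed from the rank differences as 1 - 6 * sum(d^2) / (n * (n^2 - 1)). The sketch below applies this to two hypothetical sets of rankings of the seven threats; the rankings themselves are invented for illustration.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two complete rankings of the same items, with no ties."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical rankings of the seven health threats (1 = most significant).
epidemiologist_1 = [1, 2, 3, 4, 5, 6, 7]
epidemiologist_2 = [2, 1, 3, 5, 4, 7, 6]

rho = spearman_rho(epidemiologist_1, epidemiologist_2)
print(f"Spearman rho = {rho:.3f}")
```

A rho near +1 means the two experts largely agree on the ordering; whether the association is statistically significant with only seven items would still need a separate test.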
Chi-square test
The chi-square test for association (contingency) is a standard measure for association
between two categorical variables. The chi-square test, unlike Pearson’s correlation
coefficient or Spearman rho, is a measure of the significance of the association rather
than a measure of the strength of the association.
A simple and generic example follows. If scientists were studying the relationship
between gender and political party, then they could count people from a random sample
belonging to the various combinations: female-Democrat, female-Republican, male-
Democrat, and male-Republican. The scientists could then perform a chi-square test to
determine whether there was a significant disproportionate membership among those
groups, indicating an association between gender and political party.
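The gender and party example can be sketched as a 2x2 chi-square test. The counts below are hypothetical, the function is written for this illustration only, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and P-value (1 degree of freedom) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows = female, male; columns = Democrat, Republican.
counts = [[60, 40], [45, 55]]
statistic, p_value = chi_square_2x2(counts)
print(f"chi-square = {statistic:.3f}, P = {p_value:.4f}")
```

A small P-value suggests that party membership is not distributed the same way for both genders, but it says nothing about how strong the association is, which is the distinction drawn above.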
Presenting Insights and findings using written reports and oral presentation:
Oral presentation:
Two words that are capable of striking fear into the hearts of even the most confident
student. But should they? Though not all of us can ever hope to reach the heady heights
of oratory genius achieved by the likes of Barack Obama or Martin Luther King Jr, there
are steps we can take to help us to present our point of view strongly.
Step 1: Research
Find out as much as you can about your chosen topic. The key skills for presenting
argument in the VCE English Study Design clearly state that you need to ‘conduct
research to support the development of arguments on particular issues and
acknowledge sources accurately and appropriately where relevant’. You are expected to
research your chosen topic so that you have a deep and nuanced understanding of the
issues and arguments. Read from multiple sources that present various points of view,
and take notes on the arguments used.
So, before you start writing, take the time to think carefully about the following aspects
of your presentation.
Your contention
Where do you stand on the issue? Why? Express this in a clear and direct sentence.
Avoid statements such as ‘Greyhound racing is bad’. This is a vague and general opinion,
not a contention. A contention on this issue would be something like ‘The cruel and
abusive practice of greyhound racing should be banned immediately’.
Your purpose
What do you want your imagined audience to think, feel or do? Do you wish to inform or
educate them? To create alarm? To effect change? Your purpose should be closely
related to your contention.
Your tone
What feelings are you seeking to communicate and to evoke in the audience? What
mood are you trying to generate? Will you be using humour to relax your audience? Will
you be hostile? Sympathetic? Will your tone change at any point and, if so, why?
All of the above are important factors to consider, as they will affect your language
choices and the persuasive language techniques you employ.
Ask yourself:
What persuasive language techniques will I use?
What evidence will I present?
Try to vary your chosen techniques, and remember Aristotle’s principles of rhetoric –
logos (appeal to logic and reason), ethos (character of the speaker)
and pathos (emotional influence of the speaker). A strong argument will address all
three elements in varying degrees.
Anecdote – this is a great way to highlight a personal connection to the issue or to strike
a sympathetic tone.
Inclusive language – if you want to create a shared sense of purpose, make it clear to
your audience that they are part of this issue, and that how they feel matters.
Once you have your audience’s attention, introduce yourself (or your persona), clarify
the issue, state your contention and signpost your main arguments.
For each body paragraph, ensure that you create strong topic sentences that clearly
highlight your main arguments, and then develop each argument using your carefully
selected language and evidence.
There are a few things that you should keep in mind as you write:
Cohesion is king! Keep your line of argument consistent and use connectives
throughout.
Analyse the evidence! Don’t just present a raft of statistics or evidence and expect them
to make the argument for you. Analyse their importance in relation to the debate.
Include some rebuttal! An issue has two sides – you need to rebut some or all arguments
from the opposing point of view.
To ensure that you finish on a powerful note, consider using an appeal, a rhetorical
question, or a call to action.