1-Types of Research - Research Methods Notes
Quantitative research
It includes designs, techniques and measures that produce discrete numerical or quantifiable
data.
FORMULATING HYPOTHESES
A hypothesis is a researcher's prediction regarding the outcome of the study. It states possible
differences, relationships or causes between two variables or concepts. Hypotheses are derived
from or based on existing theories, previous research, personal observations or experiences. The
test of a hypothesis involves collection and analysis of data that may either support or fail to
support the hypothesis. If the results fail to support a stated hypothesis, it does not mean that the
study has failed but it implies that the existing theories or principles need to be revised or
retested under various situations.
Purpose of hypothesis
It provides direction by bridging the gap between the problem and the evidence needed
for its solution.
It ensures collection of the evidence necessary to answer the question posed in the
statement of the problem.
It enables the investigator to assess the information he or she has collected from the
standpoint of both relevance and organisation.
It sensitizes the investigator to certain aspects of the situation that are relevant regarding
the problem at hand.
It permits the researcher to understand the problem with greater clarity and use the data to
find solutions to problems.
It guides the collection of data and provides the structure for their meaningful
interpretation in relation to the problem under investigation.
It forms the framework for the ultimate conclusions as solutions.
LITERATURE REVIEW
The review of literature involves the systematic identification, location and analysis of
documents containing information related to the research problem being investigated. It should be
extensive and thorough because it is aimed at obtaining detailed knowledge of the topic being
studied.
ETHICS IN RESEARCH
Ethics are norms or standards of behaviour that guide moral choices about our behaviour and our
relationship with others. Ethics differ from legal constraints, in which generally accepted
standards have defined penalties that are universally enforced. The goal of ethics in research is to
ensure that no one is harmed or suffers adverse consequences from research activities.
As the research is designed, several ethical considerations must be balanced e.g.
Protect the rights of the participant or subject.
Ensure the sponsor receives ethically conducted and reported research.
Follow ethical standards when designing research
Protect the safety of the researcher and team
Ensure the research team follows the design
Deception occurs when the respondents are told only part of the truth or when the truth is fully
compromised. The benefits to be gained by deception should be balanced against the risks to the
respondents. When possible, an experiment or interview should be designed to reduce reliance
on deception. In addition, the respondent's rights and well-being must be adequately protected. In
instances where deception in an experiment could produce anxiety, a subject's medical condition
should be checked to ensure that no adverse physical harm follows.
c) Rights to privacy
All individuals have a right to privacy and researchers must respect that right. The privacy
guarantee is important not only to retain validity of the research but also to protect respondents.
Once the guarantee of confidentiality is given, protecting that confidentiality is essential. The
researcher can protect respondent's confidentiality in several ways, which include: -
Obtaining signed nondisclosure documents
Restricting access to respondent identification.
Revealing respondent information only with written consent.
Restricting access to data instruments where the respondent is identified.
Nondisclosure of data subsets.
Researchers should restrict access to information that reveals names, telephone numbers, addresses
or other identifying features. Only researchers who have signed nondisclosure and confidentiality
forms should be allowed access to the data. Links between the data or database and the
identifying information file should be weakened. Individual interview response sheets should be
inaccessible to everyone except the editors and data entry personnel.
Occasionally, data collection instruments should be destroyed once the data are in a data file.
Data files that make it easy to reconstruct the profiles or identification of individual respondents
should be carefully controlled. For very small groups, data should not be made available because
it is often easy to pinpoint a person within the group. Employee-satisfaction survey feedback in
small units can be easily used to identify an individual through descriptive statistics.
Privacy is more than confidentiality. A right to privacy means one has the right to refuse to be
interviewed or to refuse to answer any question in an interview. Potential participants have a
right to privacy in their own homes, including not admitting researchers and not answering
telephones. They have the right to engage in private behaviour in private places without fear of
observation. To address these rights, ethical researchers can do the following:-
Inform respondents of their right to refuse to answer any questions or participate in the
study.
Obtain permission to interview respondents
Schedule field and phone interviews.
Limit the time required for participation.
Restrict observation to public behaviour only.
2. Ethics and the sponsor
There are ethical considerations to keep in mind when dealing with the research client or
sponsor. Whether undertaking product, market, personnel, financial or other research, a sponsor
has the right to receive ethically conducted research.
(a) Confidentiality
Sponsors have a right to several types of confidentiality including sponsor nondisclosure,
purpose nondisclosure and findings nondisclosure.
Sponsor nondisclosure: Companies have a right to dissociate themselves from the
sponsorship of a research project. Due to the sensitive nature of the management dilemma
or the research question, sponsors may hire an outside consulting or research firm to
complete research projects. This is often done when a company is testing a new product
idea, to prevent potential consumers from being influenced by the company's current image
or industry standing. If a company is contemplating entering a new market, it may not
wish to reveal its plans to competitors. In such cases, it is the responsibility of the
researcher to respect this desire and devise a plan to safeguard the identity of the sponsor.
Purpose nondisclosure: It involves protecting the purpose of the study or its details. A
research sponsor may be testing a new idea that is not yet patented and may not want the
competitor to know his plans. It may be investigating employee complaints and may not
want to spark union activity. The sponsor might also be contemplating a new public stock
offering, where advance disclosure would spark the interest of authorities or cost the firm
thousands of shillings.
Findings nondisclosure: Even if a sponsor feels no need to hide its identity or the study's
purpose, most sponsors want research data and findings to be confidential, at least until
the management decision is made.
CLASSIFICATIONS OF DESIGNS
Research can be classified using eight different descriptors as shown in the table below:
Category | Options
The degree to which the research question has been crystallized | Exploratory study; Formal study
The method of data collection | Monitoring; Interrogation/communication
The power of the researcher to produce effects in the variables in the study | Experimental; Ex post facto
The purpose of the study | Descriptive; Causal
The time dimension | Cross-sectional; Longitudinal
The topical scope (breadth and depth) of the study | Case; Statistical study
The research environment | Field setting; Laboratory research; Simulation
The participants' perception of research activity | Actual routine; Modified routine
1. Degree to which the research question has been crystallized
Occasionally, research specialists may be asked by sponsors to participate in unethical
behaviour. Compliance by the researcher would be a breach of ethical standards. Some examples
to be avoided are:
Violating respondent confidentiality
Changing data or creating false data to meet a desired objective
Changing data presentations or interpretations.
Interpreting data from a biased perspective.
Omitting sections of data analysis and conclusions.
Making recommendations beyond the scope of the data collected
The ethical course often requires confronting the sponsor's demand and taking the following
actions:
Educate the sponsor on the purpose of research
Explain the researcher's role in fact finding versus the sponsor's role in decision-making.
Explain how distorting the truth or breaking faith with respondents leads to future
problems
Failing moral suasion, terminate the relationship with the sponsor.
8. Participants' perceptions
The usefulness of a design may be reduced when people in a disguised study perceive that
research is being conducted. Participants' perceptions influence the outcomes of the research in
subtle ways. There are three levels of perception:
Participants perceive no deviations from everyday routines
Participants perceive deviations, but as unrelated to the researcher.
Participants perceive deviations as researcher-induced.
In all research environments and control situations, researchers need to be vigilant to effects that
may alter their conclusions. Participants' perceptions serve as a reminder to classify one's study
by type, to examine validation strengths and weaknesses and to be prepared to qualify results
accordingly.
When we consider the scope of qualitative research, several approaches are adaptable for
exploratory investigations of management questions:
In-depth interviewing - usually conversational rather than structured.
Participant observation - to perceive firsthand what participants in the setting experience
Films, photographs and videotapes - to capture the life of the group under study.
Case studies - for an in-depth contextual analysis of a few events or conditions
Document analysis - to evaluate historical or contemporary confidential or public records,
reports, government documents and opinions.
Where these approaches are combined, four exploratory techniques emerge with wide
applicability for the management researcher: -
i. Secondary data analysis
ii. Experience surveys
iii. Focus groups
iv. Two-stage designs
Exploratory research is finished when the researchers have achieved the following:
Established the major dimensions of the research task
Defined a set of subsidiary investigative questions that can be used as a guide to a
detailed research design.
Developed several hypotheses about possible causes of a management dilemma.
Learned that certain other hypotheses are such remote possibilities that they can be safely ignored
in any subsequent study.
Concluded additional research is not needed or is not feasible.
Sampling procedures:
There are two major ways of selecting samples:
Probability sampling methods
Non-probability sampling methods
1. Probability Sampling Methods
Samples are selected in such a way that each item or person in the population has a known
(nonzero) likelihood of being included in the sample.
Types of Probability sampling methods
a) Simple Random Sampling:
A sample is selected so that each item or person in the population has the same chance of being
included.
Advantages
Easy to implement with automatic dialing and with computerized voice response
systems.
Disadvantages
Requires a listing of population elements.
Takes more time to implement
Uses larger sample sizes
Produces larger errors
Expensive
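As a minimal sketch of simple random sampling, the snippet below draws a sample from a listing of population elements so that every element has the same chance of inclusion. The sampling frame of 500 employee IDs, the function name and the seed are illustrative assumptions, not part of these notes.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; each element of the frame is equally likely to be chosen."""
    rng = random.Random(seed)      # a seed makes the draw reproducible
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: a listing of 500 employee IDs
frame = list(range(1, 501))
sample = simple_random_sample(frame, 25, seed=42)
print(sorted(sample))
```

The need for `frame` in the sketch mirrors the first disadvantage listed above: the method cannot be applied without a listing of population elements.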
Disadvantages
More error (lower statistical efficiency) due to subgroups being homogeneous rather than
heterogeneous.
Advantage
Widely used by pollsters, marketers and other researchers.
Disadvantage
It gives no assurance that the sample is representative of the variables being studied.
The data used to provide controls may be outdated or inaccurate.
There is a practical limit on the number of simultaneous controls that can be applied to
ensure precision.
Since the choice of subjects is left to field workers, they may choose only friendly
looking people.
Sampling error
It's the difference between a sample statistic and its corresponding population parameter. The
sampling distribution of the sample means is a probability distribution of possible sample means
of a given sample size.
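The two ideas above can be illustrated with a small sketch. It assumes a hypothetical population of four values and lists every possible sample of size 2 (drawn without replacement), the mean of each sample, and the sampling error of each mean.

```python
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8]    # hypothetical population values
mu = mean(population)        # population parameter (the population mean)

# Sampling distribution of the sample mean for samples of size 2
sample_means = [mean(s) for s in combinations(population, 2)]

# Sampling error = sample statistic minus population parameter
sampling_errors = [m - mu for m in sample_means]

print(sample_means)      # one mean per possible sample
print(sampling_errors)   # errors are spread around zero
```

In this small case the mean of all possible sample means equals the population mean, even though individual samples miss it by up to 2 units.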
Statistical Inference
Sample information is used to shed some light on the population characteristics i.e. we infer
population properties based on findings on the sample. Statistical inference falls into two main
areas i.e. statistical estimation and hypothesis testing.
Statistical Estimation: The characteristics of the sample (sample statistic) are used to estimate
or approximate some unknown population characteristics.
Hypothesis testing: The population characteristics are known or assumed. The sample
characteristics are used to verify or ascertain this assumed or known population characteristic.
The assignment of values to a population parameter based on a sample is called estimation.
The value assigned to a population parameter based on the value of a sample statistic is called
an estimate of the population parameter. The sample statistic used to estimate a population
parameter is called an estimator. Estimation can be undertaken in two forms namely, Point
estimation or Interval estimation
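A short sketch may help separate the two forms. It assumes a normal population with a known standard deviation, and the sample values, the value of sigma and the function name are hypothetical. The point estimate is a single number (the sample mean), while the interval estimate is a range around it.

```python
from math import sqrt
from statistics import NormalDist, mean

def estimate_mean(sample, sigma, confidence=0.95):
    """Return (point estimate, interval estimate) of the population mean, sigma assumed known."""
    n = len(sample)
    x_bar = mean(sample)                                  # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # two-sided critical value
    half_width = z * sigma / sqrt(n)
    return x_bar, (x_bar - half_width, x_bar + half_width)

# Hypothetical sample of nine task times, with sigma assumed to be 3
times = [52, 48, 50, 51, 49, 50, 53, 47, 50]
point, (lower, upper) = estimate_mean(times, sigma=3)
print(point, round(lower, 2), round(upper, 2))
```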
Selecting the sample size to estimate a population mean
One of the most common questions asked of statisticians is, how large should the sample taken
in a survey be? The answer to this question depends on three factors:-
i. The parameter to be estimated
ii. The desired confidence level of the interval estimator
iii. The maximum error of estimation, where the error of estimation is the absolute difference
between the point estimator and the parameter, e.g. the point estimator
of µ is x̄, so that the error of estimation = |x̄ - µ|
The maximum error of estimation is also called the error bound and is denoted B. Suppose the
parameter of interest in an experiment is the population mean µ. The confidence interval
estimator (assuming a normal population, with the population variance known) is
$\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. If we want to estimate µ to within a certain specified
bound B, we will want the confidence interval estimator to be $\bar{x} \pm B$. As a consequence,
we have $z_{\alpha/2}\,\sigma/\sqrt{n} = B$. Solving for n, we get the following result:
$n = \left( z_{\alpha/2}\,\sigma / B \right)^2$
A popular method of approximating σ is to begin by approximating the range of the random
variable. A conservative estimate of σ is the range divided by 4, i.e. σ ≈ Range/4.
This produces a larger value of σ, which results in a larger value of n, which then
estimates µ with an interval at least as good as was specified.
Examples
1. A production manager would like to estimate the mean time required for workers to
complete a task on an assembly line. Assume that she knows that σ is 80 seconds. How
large a sample should she draw to estimate µ to within 5 seconds with (i) 90% confidence
(ii) 95% confidence (iii) 99% confidence
2. Find n, given that we want to estimate µ to within 10 units with 95% confidence,
assuming that σ = 100
3. The operations manager of a large production plant would like to estimate the average
amount of time a worker takes to assemble a new electronic component. After observing
a number of workers assembling similar devices, she noted that the shortest time taken
was 10 minutes and longest time taken was 22 minutes. How large a sample of workers
should she take if she wants to estimate the mean assembly time to within 20 seconds?
Assume that the confidence level is to be 99%.
4. Determine the sample size necessary to estimate µ to within 10 units with 99%
confidence. We know that the range of the population is 200 units.
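The examples above can be checked numerically with the result n = (z σ / B)², rounded up to the next whole number, together with the rule σ ≈ Range/4 when only the range is known. The sketch below is one way to do this; the function names are illustrative, and the critical values come from Python's standard normal distribution rather than from a printed table, so an answer may differ by one unit from a hand calculation that uses a rounded z value.

```python
from math import ceil
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_for_mean(sigma, bound, confidence):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) does not exceed the bound B."""
    return ceil((z_critical(confidence) * sigma / bound) ** 2)

def sigma_from_range(low, high):
    """Conservative approximation: sigma is roughly range / 4."""
    return (high - low) / 4

# Example 1: sigma = 80 seconds, B = 5 seconds
for level in (0.90, 0.95, 0.99):
    print(level, sample_size_for_mean(80, 5, level))

# Example 3: times from 10 to 22 minutes, converted to seconds; B = 20 seconds
sigma_3 = sigma_from_range(10 * 60, 22 * 60)
print(sample_size_for_mean(sigma_3, 20, 0.99))
```

Note that Example 3 mixes units: the range is given in minutes and the bound in seconds, so both are converted to seconds before the formula is applied.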
Selecting the sample size to estimate a population proportion
1. The manager of a bank feels that 35% of branches will have enhanced yearly collection of
deposits after introducing a hike in interest rate. Determine the sample size such that the
mean proportion is within plus or minus 0.06 at a confidence level of (i) 90% (ii) 95% and
(iii) 99%.
2. How large a sample should be taken in order to estimate p to within 0.01 with 95% confidence?
Assume that
a) You have no information about the value of p
b) p is believed to be approximately 0.10
c) p is believed to be approximately 0.90
3. The director of a management school feels that 55% of students will have enhanced
performance if additional input is given to them. Determine the sample size such that the
mean proportion is within plus or minus 0.10 at a confidence level of 95%.
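These notes do not state the formula for the proportion case. The sketch below assumes the standard large-sample result n = z² p(1 - p) / B², rounded up, and uses p = 0.5 when nothing is known about p because that value maximises p(1 - p). The function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(bound, confidence, p=0.5):
    """Smallest n so that z_{alpha/2} * sqrt(p * (1 - p) / n) does not exceed the bound B.

    p defaults to 0.5, the most conservative planning value.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / bound ** 2)

# Example 1: p believed to be 0.35, B = 0.06
for level in (0.90, 0.95, 0.99):
    print(level, sample_size_for_proportion(0.06, level, p=0.35))

# Example 2: B = 0.01 at 95% confidence
print(sample_size_for_proportion(0.01, 0.95))          # (a) no information about p
print(sample_size_for_proportion(0.01, 0.95, p=0.10))  # (b)
print(sample_size_for_proportion(0.01, 0.95, p=0.90))  # (c) same as (b), since p(1 - p) is symmetric
```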
MEASUREMENT
Introduction
While people measure things casually in daily life, research measurement is more precise and
controlled. In measurement, one settles for measuring properties of the objects rather than the
objects themselves. An event is measured in terms of its duration i.e. what happened during it,
who was involved, where it occurred etc. Measurement is the basis for all systematic inquiry
because it provides us with the tools for recording differences in the outcome of variable change.
Definition of Measurement
Measurement is the procedure by which we assign numerals, numbers, or other distinguishing
values to variables according to rules. These rules help us determine the kinds of values we will
assign to certain observable phenomena or variables. They also determine the quality of
measurement. Precision and exactness in measurement are vitally important. The measures are
what are actually used to test the hypotheses. A researcher needs good measures for both
independent and dependent variables.
(d) The data collection instrument: a defective instrument can cause distortion in two
major ways:
It can be too confusing and ambiguous e.g. the use of complex words, leading
questions, ambiguous meanings, multiple questions.
It can lead to poor selection from the universe of content items. Seldom does the
instrument explore all the potentially important issues.
TYPES OF VARIABLES
A variable is a measurable characteristic that assumes different values among the
subjects. According to Mugenda and Mugenda (2003), variables can be classified into the
following categories: -
1. Independent variables / Predictor variables
It is a variable that a researcher manipulates in order to determine its effect or influence on
another variable. Independent variables predict the amount of variation that occurs in other variables.
2. Experimenter variables are characteristics of the persons conducting the experiment which
might influence how a person behaves. Gender, the presence of racial discrimination, language,
or other factors may qualify as such variables.
3. Situational variables are features of the environment in which the study or research was
conducted, which have a bearing on the outcome of the experiment in a negative way. Included
are the air temperature, level of activity, lighting, and the time of day.
4. Control variables / concomitant / covariate or blocking variables
They are extraneous variables that are built into the study. Extraneous variables are variables,
which influence the results of a study when they are not controlled.
Once the major extraneous variables are identified, the researcher can control them by:-
i. Building the extraneous variable into the study: i.e. including it as an independent variable.
E.g. in determining the effect of alcohol on reaction time, sex may influence reaction
time. Therefore, sex can be introduced as an independent variable. Using regression, one
can measure the effect of alcohol on reaction time, controlling for sex.
ii. Including them in the study but only at one level e.g. time is the dependent variable, alcohol
level - the independent and sex the extraneous variable. Sex can be controlled by
sampling only females or males of a given age. The disadvantage of this method is that
generalizations are limited to a smaller population.
iii. By removing the effects of the extraneous variables by statistical procedures i.e. by
siphoning its effects on the dependent variable. This can be done by:
Analysis of co-variance
Partial correlation.
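Partial correlation can be sketched with the alcohol and reaction-time example used above. The snippet assumes the standard first-order formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz²)(1 - r_yz²)), which these notes do not state, and the data are invented purely for illustration: reaction time rises by 2 units per drink, and one sex group is 10 units slower overall.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z from both."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical data
alcohol  = [1, 2, 3, 4, 1, 2, 3, 4]          # independent variable (drinks)
sex      = [0, 0, 0, 0, 1, 1, 1, 1]          # extraneous variable (coded 0/1)
reaction = [2, 4, 6, 8, 12, 14, 16, 18]      # dependent variable

print(round(pearson_r(alcohol, reaction), 3))        # raw correlation
print(round(partial_r(alcohol, reaction, sex), 3))   # correlation controlling for sex
```

With these invented numbers the raw correlation is modest because the difference between the sex groups adds variation unrelated to alcohol. Once sex is controlled for statistically, the alcohol and reaction-time relationship appears at full strength.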
5. Intervening variables
They are a special case of extraneous variables. The difference between the intervening and
extraneous variables is in the assumed relationship among the variables. An intervening
variable is a hypothetical internal state that is used to explain relationships between observed
variables, such as independent and dependent variables, in empirical research. With an
extraneous variable, there is no causal link between the independent and dependent variable, but
they are independently associated with a third variable - the extraneous variable. An intervening
variable is recognized as being caused by the independent variable and as being a determinant of
the dependent variable.
The choice of the right intervening variables helps one not only to determine accurately the total
effects of an independent variable on the dependent variable but also partition the total effects
into direct and indirect.
They do not interfere with the established relationship between an independent and dependent
variable but clarify the influence that precedes such a relationship.
E.g. political stability - attracts investors - increased job opportunities - high standards of living -
reduction of poverty.
7. Suppressor variables
It is an extraneous variable which when not controlled for, removes a relationship between the
two variables. When a suppressor variable is introduced in the study as a control variable, a true
relationship emerges.
8. Distorter variables
It is a variable that converts what was thought of as a positive relationship into a negative
relationship and vice-versa. Its effects lead a researcher into drawing erroneous conclusions from
the data. When the distorter variable is controlled, a true relationship is obtained. Consideration
of distorter variables in a study reduces the chances of making a type I error (rejecting a true null
hypothesis) or a type II error (failing to reject a false null hypothesis).
[Figure: path diagram linking variables A, B, C and D]
C and D are called endogenous variables. Each endogenous variable is caused or explained by
the variables that precede it. E.g. D is caused by A, B and C.
A and B are called exogenous variables. They lack hypothesized causes in the model.
Validity and Reliability in Research
The quality of a research study depends to a large extent on the accuracy of the data collection
procedures. Reliability and validity measure the relevance and correctness of the data.
Reliability
Reliability is the extent to which an experiment, test, or any measuring procedure yields the same
result on repeated trials. Without the agreement of independent observers able to replicate
research procedures, or the ability to use research tools and procedures that yield consistent
measurements, researchers would be unable to satisfactorily draw conclusions, formulate
theories, or make claims about the generalizability of their research. In addition to its important
role in research, reliability is critical for many parts of our lives, including manufacturing,
medicine and sports. Reliability is such an important concept that it has been defined in terms of
its application to a wide range of activities.
Reliability is influenced by random error. Random error is the deviation from a true
measurement due to factors that have not effectively been addressed by the researcher. As
random error increases, reliability decreases.
An example of stability reliability would be the method of maintaining weights used by the
Kenya Bureau of Standards. Platinum objects of fixed weight (one kilogram, half kilogram,
etc...) are kept locked away. Once a year they are taken out and weighed, allowing scales to be
reset so they are "weighing" accurately. Keeping track of how much the scales are off from year
to year establishes stability reliability for these instruments. In this instance, the platinum
weights themselves are assumed to have a perfectly fixed stability reliability.
Disadvantages
Subjects may be sensitized by the first testing and hence do better in the second test.
Difficulty in establishing a reasonable period between the two testing sessions.
2. Equivalent form
Equivalency reliability is the extent to which two items measure identical concepts at an identical
level of difficulty. Equivalency reliability is determined by relating two sets of test scores to one
another to highlight the degree of relationship or association. In quantitative studies and
particularly in experimental studies, a correlation coefficient, statistically referred to as r, is used
to show the strength of the correlation between a dependent variable (the subject under study),
and one or more independent variables, which are manipulated to determine effects on the
dependent variable. An important consideration is that equivalency reliability is concerned with
correlational, not causal, relationships.
For example, a researcher studying university Bachelor of Commerce students happened to notice
that when some students were studying for finals, their holiday shopping began. Intrigued by
this, the researcher attempted to observe how often, or to what degree, these two behaviors co-
occurred throughout the academic year. The researcher used the results of the observations to
assess the correlation between studying throughout the academic year and shopping for gifts. The
researcher concluded there was poor equivalency reliability between the two actions. In other
words, studying was not a reliable predictor of shopping for gifts.
Two instruments are used. Specific items in each form are different but they are designed to
measure the same concept. They are the same in number, structure and level of difficulty e.g.
TOEFL, GRE
Advantages
Estimates the stability of the data as well as the equivalence of the items in the two forms
Disadvantages
Difficulty in constructing two tests, which measure the same concept (time and
resources).
3. Internal consistency technique
Internal consistency is the extent to which tests or procedures assess the same characteristic, skill
or quality. It is a measure of the precision between the observers or of the measuring instruments
used in a study. This type of reliability often helps researchers interpret data and predict the
value of scores and the limits of the relationship among variables.
For example, a researcher designs a questionnaire to find out about college students'
dissatisfaction with a particular textbook. Analyzing the internal consistency of the survey items
dealing with dissatisfaction will reveal the extent to which items on the questionnaire focus on
the notion of dissatisfaction.
4. Interrater reliability
Interrater reliability is the extent to which two or more individuals (coders or raters) agree.
Interrater reliability addresses the consistency of the implementation of a rating system.
A test of interrater reliability would be the following scenario: Two or more researchers are
observing a high school classroom. The class is discussing a movie that they have just viewed as
a group. The researchers have a sliding rating scale (1 being most positive, 5 being most
negative) with which they are rating the students' oral responses. Interrater reliability assesses the
consistency of how the rating system is implemented. For example, if one researcher gives a "1"
to a student response, while another researcher gives a "5," the interrater reliability
would clearly be low.
Interrater reliability is dependent upon the ability of two or more individuals to be consistent.
Training, education and monitoring skills can enhance interrater reliability.
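One simple way to put a number on interrater reliability is the proportion of responses on which two raters give exactly the same rating. The sketch below uses that measure as an assumption, since these notes do not prescribe a particular statistic, and the two lists of ratings on the 1 to 5 scale are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters assigned exactly the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Hypothetical ratings of eight student responses (1 = most positive, 5 = most negative)
rater_a = [1, 2, 2, 3, 5, 4, 1, 2]
rater_b = [1, 2, 3, 3, 5, 4, 2, 2]

agreement = percent_agreement(rater_a, rater_b)
print(agreement)   # the raters agree on 6 of the 8 responses
```

Exact agreement ignores how far apart two ratings are and makes no correction for agreement that would occur by chance, so it is best read as a rough first check.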
Types of validity
(a) Construct validity
Construct validity seeks agreement between a theoretical concept and a specific measuring
device or procedure. For example, a researcher inventing a new IQ test might spend a great deal
of time attempting to "define" intelligence in order to reach an acceptable level of construct
validity.
Construct validity can be broken down into two sub-categories: convergent validity and
discriminant validity. Convergent validity is the actual general agreement among ratings,
gathered independently of one another, where measures should be theoretically related.
Discriminant validity is the lack of a relationship among measures which theoretically should not
be related.
To understand whether a piece of research has construct validity, three steps should be followed.
First, the theoretical relationships must be specified. Second, the empirical relationships
between the measures of the concepts must be examined. Third, the empirical evidence must be
interpreted in terms of how it clarifies the construct validity of the particular measure being
tested.
Content validity
Content Validity is based on the extent to which a measurement reflects the specific intended
domain of content.
Content validity can be illustrated using the following examples: Researchers aim to study
mathematical learning and create a survey to test for mathematical skill. If these researchers only
tested for multiplication and then drew conclusions from that survey, their study would not show
content validity because it excludes other mathematical functions. Although the establishment of
content validity for placement-type exams seems relatively straight-forward, the process
becomes more complex as it moves into the more abstract domain of socio-cultural studies. For
example, a researcher needing to measure an attitude like self-esteem must decide what
constitutes a relevant domain of content for that attitude. For socio-cultural studies, content
validity forces the researchers to define the very domains they are attempting to study.
The usual procedure in assessing the content validity of a measure is to use professionals or
experts in the particular field. The instrument is given to two groups of experts. One group is
requested to assess what concept the instrument is trying to measure. The other group is asked to
determine whether the set of items or checklist accurately represents the concept under study.
Criterion-related validity, also referred to as instrumental validity, is used to demonstrate the
accuracy of a measure or procedure by comparing it with another measure or procedure which
has been demonstrated to be valid. For example, imagine a hands-on driving test has been shown
to be an accurate test of driving skills. By comparing the scores on the written driving test with
the scores from the hands-on driving test, the written test can be validated by using a criterion
related strategy in which the hands-on driving test is compared to the written test.
Types
Predictive validity - refers to the degree to which obtained data predicts the future
behaviour of subjects e.g. B. Com graduates
Concurrent validity- refers to the degree to which data are able to predict the behaviour
of subjects in the present and not in the future e.g. psychiatry
Internal and external validity
Researchers should be concerned with both external and internal validity.
External validity refers to the extent to which the results of a study are generalizable or
transferable. External validity is the degree to which research findings can be generalized
to populations and environments outside the experimental setting. It has to do with
representativeness of the sample with regard to the target population.
Internal validity refers to (1) the rigor with which the study was conducted (e.g., the
study's design, the care taken to conduct measurements, and decisions concerning what
was and wasn't measured) and (2) the extent to which the designers of a study have taken
into account alternative explanations for any causal relationships they explore. In studies
that do not explore causal relationships, only the first of these definitions should be
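considered when assessing internal validity. Internal validity depends on the degree to
which extraneous variables have been controlled for in the study. Internal and external
validity tend to be inversely related to each other: the tighter the control over extraneous
variables, the less the study setting resembles the settings to which results are generalized.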
Threats to internal validity
History - refers to occurrence of events that influence experimental units during the
course of the study
Maturation - refers to the biological or psychological processes which occur among the
subjects in a relatively short time and which influence research findings
Instrumentation -
Pre-testing - solution - use equivalent form tests
Statistical regression
Attrition - subjects dropping out of the study before completion, which leads to error and bias in
the sample
Differential selection - occurs when subjects are systematically selected for a study, e.g.
volunteers versus non-volunteers; the resulting bias leads to error
Selection - maturation interaction
Ambiguity - when correlation is taken for causation
Apprehension - when people are scared to respond to your study
Demoralization - when people get bored with your measurements
Diffusion - when people figure out your test and start mimicking symptoms
Threats to external validity
Accessible and target population
Control of extraneous variables
Pre-test treatment interaction
Explicit description of the sample
Multi-treatment interference
RESEARCH INSTRUMENTS
The research instruments that are widely used include
Questionnaires
Interviews
Observations
QUESTIONNAIRES
Each item in the questionnaire is developed to address a specific objective, research question or
hypothesis of the study. The researcher must also know how information obtained from each
questionnaire item will be analysed.
Types of questions used in questionnaires
1. Structured or closed-ended questions
They are questions, which are accompanied by a list of possible alternatives from which
respondents select the answer that best describes their situation.
Advantages of Structured or closed-ended questions
They are easier to analyse since they are in an immediate usable form
They are easier to administer
They are economical to use in terms of time and money
Disadvantages of Structured or closed-ended questions
They are more difficult to construct
Responses are limited and the respondent is compelled to answer questions according to
the researcher's choices
2. Unstructured or open - ended questions
They refer to questions which give the respondent complete freedom of response. The amount
of space provided is usually an indicator of whether a brief or lengthy answer is desired.
Advantages of Unstructured or open - ended questions
They permit a greater depth of response
They are simple to formulate
The respondent's responses may give an insight into his or her feelings, background, hidden
motives, interests and decisions.
Disadvantages of Unstructured or open - ended questions
There is a tendency for respondents to provide information which does not answer the
stipulated research questions or objectives.
The responses given may be difficult to categorize and hence difficult to analyze
quantitatively.
Responding to open-ended questions is time consuming, which may put some respondents
off.
3. Contingency questions
In particular cases, certain questions are applicable to certain groups of respondents. In such
cases, follow-up questions are needed to get further information from the relevant sub-group
only. These subsequent questions, which are asked after the initial questions, are called
'contingency questions' or 'filter questions'. The purpose of these kinds of questions is to probe
for more information. They also simplify the respondent's task, in that they will not be required
to answer questions that are not relevant to them.
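As an illustration of how a filter question routes respondents, here is a minimal sketch. The question identifiers (Q1_owns_vehicle, Q1a_vehicle_type, Q1b_years_owned) and the function name are hypothetical and not taken from any real instrument.

```python
def follow_up_questions(answers):
    """Return the follow-up question IDs that apply to one respondent.

    `answers` maps question IDs to the answers given so far.
    """
    follow_ups = []
    # Filter question: only respondents who answered "yes" see Q1a and Q1b.
    if answers.get("Q1_owns_vehicle") == "yes":
        follow_ups.extend(["Q1a_vehicle_type", "Q1b_years_owned"])
    return follow_ups


print(follow_up_questions({"Q1_owns_vehicle": "yes"}))
print(follow_up_questions({"Q1_owns_vehicle": "no"}))
```

Respondents who answer "no" skip the follow-ups entirely, which is how contingency questions spare them irrelevant items.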
4. Matrix Questions
These are questions which share the same set of response categories. They are used whenever
scales like the Likert scale are being used.
Advantages of matrix questions
When questions or items are presented in matrix form, they are easier to complete and
hence the respondent is unlikely to be put off.
Space is used efficiently
It is easy to compare responses given to different items.
Disadvantages of matrix questions
Some respondents, especially the ones that may not be too keen to give right responses,
might form a pattern of agreeing or disagreeing with statements.
Some researchers use them when in fact the kind of information being sought could better
be obtained in another format.
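The response-pattern problem noted above can be screened for after data collection. The sketch below flags respondents who gave the same rating to every item in a matrix block. This is a deliberately simple rule chosen for illustration, and the respondent IDs and ratings are hypothetical.

```python
def is_straight_lined(ratings):
    """True when every item in a matrix block received the same rating."""
    return len(ratings) > 1 and len(set(ratings)) == 1


# Hypothetical Likert ratings (1 = strongly disagree ... 5 = strongly agree).
responses = {
    "R01": [4, 2, 5, 3, 4],
    "R02": [5, 5, 5, 5, 5],  # identical answer to every item
    "R03": [1, 2, 1, 3, 2],
}

flagged = [rid for rid, ratings in responses.items() if is_straight_lined(ratings)]
print(flagged)
```

A flagged respondent is not necessarily careless, so the flag is best treated as a prompt for review rather than grounds for automatic exclusion.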
Rules for constructing questionnaires and questionnaire items
1. List the objectives that you want the questionnaire to accomplish before constructing the
questionnaire.
2. Determine how information obtained from each questionnaire item will be analyzed.
3. Ensure clarity and avoid ambiguity.
4. If a concept has several meanings and that concept must be used in a question, the
intended meaning must be defined.
5. Construct short questions.
6. Items should be stated as positively as possible.
7. Double-barreled items should be avoided.
8. Leading and biased questions should be avoided.
9. Very personal and sensitive questions should be avoided.
10. Simple words that are easily understandable should be used.
11. Questions that assume facts with no evidence should be avoided.
12. Avoid psychologically threatening questions.
13. Include enough information in each item so that it is meaningful to the respondent.
Tips on how to organize or order items in a questionnaire
1. Begin with non-threatening, interesting items.
2. It is not advisable to put important questions at the end of a long questionnaire.
3. Have some logical order when putting items together.
4. Arrange the questions according to themes being studied.
5. If the questionnaire is arranged into content sub-sections, each section should be
introduced with a short statement concerning its content and purpose.
6. Socio-economic questions should be asked at the end because respondents may be put off
by personal questions at the beginning of the questionnaire.
Presentation of the questionnaire
1. Make the questionnaire attractive by using quality paper. It increases the response rate.
2. Organize and lay out the questions so that the questionnaire is easy to complete.
3. All the pages and items in a questionnaire should be numbered.
4. Brief but clear instructions must be included.
5. Make your questionnaire short.
Pretesting the questionnaire
The questionnaire should be pretested on a selected sample which is similar to the actual sample
that the researcher plans to study. This is important because:
Questions that are vague will be revealed in the sense that the respondents will interpret
them differently.
Comments and suggestions made by respondents during pretesting should be seriously
considered and incorporated.
Pretesting will reveal deficiencies in the questionnaire.
It helps to test whether the methods of analysis are appropriate
Ways of administering questionnaires
i. Self-administered questionnaires
Questionnaires are sent to the respondents through mail or hand-delivery, and they
complete them on their own.
ii. Researcher administered questionnaires
The researcher can decide to use the questionnaire to interview the respondents. This is
mostly done when the subjects may not have the ability to easily interpret the questions
probably because of their educational level.
iii. Use of the internet
The people sampled for the research receive and respond to the questionnaires through
their web sites or e-mail addresses.
The letter of transmittal / Cover letter
The letter of transmittal / Cover letter should accompany every questionnaire.
Contents of a letter of transmittal
It should explain the purpose of the study.
It should explain the importance and significance of the study.
A brief assurance of confidentiality should be included in the letter.
If the study is affiliated to a certain institution or organisation, it is advisable to have an
endorsement from such an institution or organisation.
In a sensitive research, it may be necessary to assure the anonymity of respondents.
The letter should contain specific deadline dates by which the completed questionnaire is
to be returned.
Follow-up techniques
Sending a follow-up letter which should be polite, and asking the subjects to respond
A questionnaire and a follow-up letter
Response rate
It refers to the percentage of subjects who respond to questionnaires. Many authors believe that a
response rate of 50% is adequate for analysis and reporting. If the response rate is low, the
researcher must question the representativeness of the sample.
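A minimal sketch of the response-rate calculation, using the 50% figure quoted above as the adequacy threshold. The counts are hypothetical.

```python
ADEQUATE_RATE = 50.0  # percentage threshold quoted in these notes


def response_rate(returned, distributed):
    """Percentage of distributed questionnaires that were returned."""
    if distributed <= 0:
        raise ValueError("distributed must be a positive count")
    return 100.0 * returned / distributed


# Hypothetical counts: 240 questionnaires sent out, 132 returned.
rate = response_rate(returned=132, distributed=240)
print(f"Response rate: {rate:.1f}%")
if rate < ADEQUATE_RATE:
    print("Low response rate: question the representativeness of the sample.")
```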
INTERVIEWS
An interview is an oral (face to face) administration of a questionnaire or an interview schedule.
To obtain accurate information through interviews, a researcher needs to obtain the maximum
co-operation from respondents. Interviews are particularly useful for getting the story behind a
participant's experiences. The interviewer can pursue in-depth information around a topic.
Interviews may be useful as follow-up to certain respondents to questionnaires, e.g., to further
investigate their responses. Usually open-ended questions are asked during interviews.
Guidelines for preparation for Interview
1. Choose a setting with little distraction. Avoid bright lights or loud noises, ensure the
interviewee is comfortable (you might ask them if they are), etc. Often, they may feel
more comfortable at their own places of work or homes.
2. Explain the purpose of the interview.
3. Address terms of confidentiality. Note any terms of confidentiality. (Be careful here.
Rarely can you absolutely promise anything. Courts may get access to information, in
certain circumstances.) Explain who will get access to their answers and how their
answers will be analyzed. If their comments are to be used as quotes, get their written
permission to do so.
4. Explain the format of the interview. Explain the type of interview you are conducting and
its nature. If you want them to ask questions, specify if they're to do so as they have them
or wait until the end of the interview.
5. Indicate how long the interview usually takes.
6. Tell them how to get in touch with you later if they want to.
7. Ask them if they have any questions before you both get started with the interview.
8. Don't count on your memory to recall their answers. Ask for permission to record the
interview or bring along someone to take notes.
Types of Interviews approaches
(a) Informal, conversational interview - no predetermined questions are asked, in order to
remain as open and adaptable as possible to the interviewee's nature and priorities; during the
interview, the interviewer "goes with the flow".
(b) General interview guide approach - the guide approach is intended to ensure that the
same general areas of information are collected from each interviewee; this provides more focus
than the conversational approach, but still allows a degree of freedom and adaptability in getting
information from the interviewee.
(c) Standardized, open-ended interview - here, the same open-ended questions are asked to
all interviewees (an open-ended question is where respondents are free to choose how to answer
the question, i.e., they don't select "yes" or "no" or provide a numeric rating, etc.); this approach
facilitates faster interviews that can be more easily analyzed and compared
(d) Closed, fixed-response interview - where all interviewees are asked the same questions and
asked to choose answers from among the same set of alternatives. This format is useful for those
not practiced in interviewing.
Sequence of Questions
1. Get the respondents involved in the interview as soon as possible.
2. Before asking about controversial matters (such as feelings and conclusions), first ask
about some facts. With this approach, respondents can more easily engage in the
interview before warming up to more personal matters.
3. Intersperse fact-based questions throughout the interview to avoid long lists of fact-based
questions, which tends to leave respondents disengaged.
4. Ask questions about the present before questions about the past or future. It's usually
easier for them to talk about the present and then work into the past or future.
5. The last questions might be to allow respondents to provide any other information they
prefer to add and their impressions of the interview.
Wording of Questions
Wording should be open-ended. Respondents should be able to choose their own terms
when answering questions.
Questions should be as neutral as possible. Avoid wording that might influence answers,
e.g., evocative, judgmental wording.
Questions should be asked one at a time.
Questions should be worded clearly. This includes knowing any terms particular to the
program or the respondents' culture.
Be careful asking "why" questions. This type of question implies a cause-effect relationship
that may not truly exist. These questions may also cause respondents to feel defensive,
e.g., that they have to justify their response, which may inhibit their responses to this and
future questions.
While Carrying Out Interview
Occasionally verify the tape recorder (if used) is working.
Ask one question at a time.
Attempt to remain as neutral as possible. That is, don't show strong emotional reactions to
their responses. Patton suggests to act as if "you've heard it all before."
Encourage responses with occasional nods of the head, "uh huh"s, etc.
Be careful about the appearance when note taking. That is, if you jump to take a note, it
may appear as if you're surprised or very pleased about an answer, which may influence
answers to future questions.
Provide transition between major topics, e.g., "we've been talking about (some topic) and
now I'd like to move on to (another topic)."
Don't lose control of the interview. This can occur when respondents stray to another
topic, take so long to answer a question that time begins to run out, or even begin asking
questions to the interviewer.
Immediately After Interview
Verify if the tape recorder, if used, worked throughout the interview.
Make any notes on your written notes, e.g., to clarify any scratchings, ensure pages are
numbered, fill out any notes that don't make sense, etc.
Write down any observations made during the interview. For example, where did the
interview occur and when? Was the respondent particularly nervous at any time? Were
there any surprises during the interview? Did the tape recorder break?
Creating a non-response sample and weighting results from this sample
Substituting another individual for the missing non-participant
(c) Response error
Occurs when the data reported differ from the actual data. It can occur during the interview or
during preparation of data analysis.
Participant-initiated error occurs when the participant fails to answer fully and accurately
either by choice or because of inaccurate or incomplete knowledge. Can be solved by
using trained interviewers who are knowledgeable about such problems.
Interviewer error can be caused by:
- Failure to secure full participant cooperation
- Failure to consistently execute interview procedures
- Failure to establish appropriate interview environment
- Falsification of individual answers or whole interviews
- Inappropriate influencing behaviour
- Failure to record answers accurately and completely
- Physical presence bias.
Advantages of Personal interviews
Good cooperation from the respondents
Interviewer can answer questions about survey, probe for answers, use follow-up
questions and gather information by observation.
Special visual aids and scoring devices can be used.
Illiterate and functionally illiterate respondents can be reached
Interviewer can prescreen respondent to ensure he / she fits the population profile.
Responses can be entered directly into a portable microcomputer to reduce error and cost
when using computer assisted personal interviewing.
Disadvantages of Personal interviews
High costs
Need for highly trained interviewers
Longer period needed in the field collecting data
May be wide geographic dispersion
Follow-up is labour intensive
Not all respondents are available or accessible
Some respondents are unwilling to talk to strangers in their homes
Some neighbourhoods are difficult to visit
Questions may be altered or respondent coached by interviewers.
Telephone interviews
People selected to be part of the sample are interviewed on the telephone by a trained
interviewer.
Advantages of Telephone interviews
Lower costs than personal interviews
Expanded geographic coverage without dramatic increase in costs
Uses fewer, more highly skilled interviewers
Reduced interview bias
Personal interviews
People selected to be part of the sample are interviewed in person by a trained interviewer.
Requirements for success
Three broad conditions must be met in order to have a successful personal interview:
The participant must possess the information being targeted by the investigative
questions
The participant must understand his or her role in the interview as the provider of
accurate information
The participant must perceive adequate motivation to cooperate
Increasing the participant's receptiveness
The first goal in an interview is to establish a friendly relationship with the participant. Three
factors will help increase participant receptiveness. The participant must:
Believe that the experience will be pleasant and satisfying
Believe that answering the survey is an important and worthwhile use of his or her time
Dismiss any mental reservations that he or she might have about participation.
The technique of stimulating participants to answer more fully and relevantly is termed probing.
Since it presents a great potential for bias, a probe should be neutral and appear as a natural part
of the conversation. Appropriate probes should be specified by the designer of the data collection
instrument. There are several probing styles e.g.
A brief assertion of understanding and interest e.g. comments such as "I see" "yes".
An expectant pause
Repeating the question
Repeating the participant's reply
A neutral question or comment
Question clarification.
Problems likely to be encountered during personal interviews
In personal interviews, the researcher must deal with bias and cost. Biased results arise from
three types of error:
(a) Sampling error
It's the difference between a sample statistic and its corresponding population parameter. The
sampling distribution of the sample means is a probability distribution of possible sample means
of a given sample size.
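To make the idea of a sampling distribution concrete, the sketch below enumerates every possible sample of size 2 from a tiny hypothetical population and records each sample mean and its sampling error. Real populations are far too large to enumerate, so this is purely illustrative.

```python
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8, 10]        # hypothetical population values
population_mean = mean(population)   # the population parameter
sample_size = 2

# Mean of every possible sample of the given size (drawn without replacement).
sample_means = [mean(s) for s in combinations(population, sample_size)]

# Sampling error of each sample = sample statistic - population parameter.
sampling_errors = [m - population_mean for m in sample_means]

# Sampling distribution: probability of each possible sample mean.
distribution = {
    m: sample_means.count(m) / len(sample_means)
    for m in sorted(set(sample_means))
}

for value, probability in distribution.items():
    print(f"sample mean {value}: probability {probability:.1f}")
print("mean of all sample means:", mean(sample_means))
```

Individual samples over- or under-estimate the population mean, but the sampling errors cancel out across all possible samples, so the mean of the sample means equals the population mean.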
(b) Non-response error
This occurs when the responses of participants differ in some systematic way from the responses
of non-participants. It occurs when the researcher:
Cannot locate the person to be studied
Is unsuccessful in encouraging that person to participate
Solutions to reduce errors of non-response are:
Establishing and implementing callback procedures
Faster completion time
Better access to hard-to-reach respondents through repeated callbacks
Can use computerized random digit dialing
Responses can be entered directly into a computer file to reduce error and cost when
using computer assisted telephone interviewing.
Disadvantages of Telephone interviews
Response rate is lower than for personal interview
Higher costs if interviewing geographically dispersed sample
Interview sample must be limited
Many phone numbers are unlisted or not working, making directory listings unreliable
Some target groups are not available by phone
Responses may be less complete
Illustrations cannot be used.
Respondents may not be honest with their responses since it is not a face to face situation
Rules pertaining to interviews
The interviewer must
Be pleasant
Show genuine interest in getting to know respondents without appearing like spies.
Be relaxed and friendly.
Be very familiar with the questionnaire or the interview guide.
Have a guide which indicates what questions are to be asked and in what order.
Interact with the respondent as an equal.
Pretest the interview guide before using it to check for vocabulary, language level and
how well the questions will be understood.
Inform the respondent about the confidentiality of the information given.
Not ask leading questions
Remain neutral in an interview situation in order to be as objective as possible.
An interview schedule
It's a set of questions that the interviewer asks when interviewing. It makes it possible to obtain
data required to meet specific objectives of the study.
Note taking during interviews
It refers to the method of recording in which the interviewer records the respondent's responses
during the interview.
Advantages
It facilitates data analysis since the information is readily accessible and already classified
into appropriate categories.
If taken well, no information is left out.
Disadvantages of note taking
It may interfere with the communication between the respondent and the interviewer.
It might upset the respondent if the answers are personal and sensitive.
If it is delayed, important details may be forgotten.
It makes the interview lengthy and boring.
Tape recording
The interviewer's questions and the respondent's answers are recorded either using a tape
recorder or a video tape.
Advantages
It reduces the tendency for the interviewer to make unconscious selection of data in the
course of the recording.
The tape can be played back and studied more thoroughly.
A person other than the interviewer can evaluate and categorize responses.
It speeds up the interview.
Communication is not interrupted.
Disadvantages
It changes the interview situation since respondents get nervous.
Respondents may be reluctant to give sensitive information if they know they are being
taped.
Transcribing the tapes before analysis is time consuming and tedious.
Advantages of interviews
They provide in-depth data, which is not possible to get using a questionnaire.
They make it possible to obtain data required to meet specific objectives of the study.
They are more flexible than questionnaires because the interviewer can adapt to the situation
and get as much information as possible.
Very sensitive and personal information can be extracted from the respondent.
The interviewer can clarify and elaborate the purpose of the research and effectively
convince respondents about the importance of the research.
They yield higher response rates
Disadvantages of interviews
They are expensive - traveling costs
It requires a higher level of skill
Interviewers need to be trained to avoid bias
Not appropriate for large samples
Responses may be influenced by the respondent's reaction to the interviewer.
OBSERVATION
Observation is one of the few options available for studying records, mechanical processes, small
children and complex interactive processes. Data can be gathered as the event occurs.
Observation includes a variety of monitoring situations that cover non-behavioural and
behavioural activities.
The observer-participant relationship
Interrogation presents a clear opportunity for interviewer bias. The problem is less pronounced
with observation but is still real. The relationship between observer and participant may be
viewed from three perspectives:
Whether the observation is direct or indirect
Whether the observer's presence is known or unknown to the participant
What role the observer plays
Guidelines for the qualification and selection of observers
Concentration: Ability to function in a setting full of distractions
Detail-oriented: Ability to remember details of an experience
Unobtrusive: Ability to blend with the setting and not be distinctive
Experience level: Ability to extract the most from an observation study
Advantages of observation
Enables one to:
Secure information about people or activities that cannot be derived from experiment or
surveys
Reduce obtrusiveness
Avoid participant filtering and forgetfulness
Secure environmental context information
Optimize the naturalness of the research setting
Limitations of observation
Difficulty of waiting for long periods to capture the relevant phenomena
The expense of observer costs and equipment
Reliability of inferences from surface indicators
The problem of quantification and disproportionately large records
Observation forms, schedules or checklists
The researcher must define the behaviours to be observed and then develop a detailed list of
behaviours. During data collection, the researcher checks off each as it occurs. This permits the
observer to spend time thinking about what is occurring rather than on how to record it and this
enhances the accuracy of the study.
DATA ANALYSIS: DATA PREPARATION AND DESCRIPTION
Once the data begins to flow in, attention turns to data analysis. If the project has been done
correctly, the analysis planning is already done.
Data preparation
This includes editing, coding and data entry. These activities ensure the accuracy of the data and
their conversion from raw form to reduced and classified forms that are more appropriate for
analysis.
Editing
Editing detects errors and omissions, corrects them when possible and certifies that minimum
data quality standards have been achieved. The editor's purpose is to guarantee that data are:
Accurate
Consistent with intent of the question and other information in the survey
Uniformly entered
Complete
Arranged to simplify coding and tabulation
Field editing
In large projects, field editing review is a responsibility of the field supervisor. It should be done
soon after the data have been gathered. During the stress of data collection, the researcher often
uses ad hoc abbreviations and special symbols. Soon after the interview, experiment or
observation, the investigator should review the reporting forms. It is difficult to complete what
was abbreviated or written in shorthand or noted illegibly if the entry is not caught that day.
When entry gaps are present from interviews, a call back should be made rather than guessing
what the respondent 'probably would have said'. Self-interviewing has no place in quality
research.
Central editing
For a small study, the use of a single editor produces maximum consistency. In large studies, the
tasks may be broken down so that each editor can deal with one entire section. This approach
will not identify inconsistencies between answers in different sections. However, this problem
can be handled by identifying points of possible inconsistency and having one editor check
specifically for them.
Rules to guide editors in their work
Be familiar with instructions given to interviewers and coders
Do not destroy, erase or make illegible the original entry by the interviewer; original
entries should be crossed out with a single line so that they remain legible.
Make all entries on an instrument in some distinctive colour and in a standardized form.
Initial all answers changed or supplied.
Place initials and date of editing on each instrument completed.
Coding
Coding involves assigning numbers or other symbols to answers so the responses can be grouped
into a limited number of classes or categories. The classifying of data into limited categories
sacrifices some data detail but is necessary for efficient analysis. Coding helps the researcher to
reduce several thousand replies to a few categories containing the critical information needed for
analysis. In coding, categories are the partitioning of a set and categorization is the process of
using rules to partition a body of data.
Coding rules
The categories should be:
Appropriate to the research problem and purpose: Categories must provide the best
partitioning of data for testing hypotheses and showing relationships.
Exhaustive
Mutually exclusive
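A minimal sketch of coding a closed question against a codebook. The categories, code numbers and raw answers are hypothetical. The catch-all "other" category is one simple way to keep the scheme exhaustive, and giving each category a distinct code keeps the categories mutually exclusive.

```python
# Hypothetical codebook for an employment-status question.
CODEBOOK = {
    "employed": 1,
    "self-employed": 2,
    "unemployed": 3,
    "student": 4,
    "other": 9,  # catch-all category so that every answer receives a code
}


def code_response(answer):
    """Map a raw answer to its numeric code; unlisted answers become 'other'."""
    key = answer.strip().lower()
    return CODEBOOK.get(key, CODEBOOK["other"])


raw_answers = ["Employed", "student ", "Retired", "unemployed"]
coded = [code_response(a) for a in raw_answers]
print(coded)

# Each category has its own code, so no answer can fall into two classes.
assert len(set(CODEBOOK.values())) == len(CODEBOOK)
```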
The objective of descriptive statistical analysis is to develop sufficient knowledge to describe a
body of data. This is accomplished by understanding the data levels for the measurements we
choose, their distributions and characteristics of location, spread and shape. The discovery of
miscoded values, missing data and other problems in the data set is enhanced with descriptive
statistics.
There are three general areas that make up the field of statistics: descriptive statistics, relational
statistics, and inferential statistics.
DESCRIPTIVE STATISTICS
Descriptive statistics fall into one of two categories: measures of central tendency (mean,
median, and mode) or measures of dispersion (standard deviation and variance). Their purpose is
to explore hunches that may have come up during the course of the research process, but most
people compute them to check whether their data are approximately normally distributed. Examples include descriptive
analysis of sex, age, race, social class, and so forth.
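The measures named above can be computed with Python's standard statistics module. The ages below are hypothetical, and the sample (n - 1) versions of variance and standard deviation are used on the assumption that the data are a sample rather than a whole population.

```python
from statistics import mean, median, mode, stdev, variance

ages = [22, 25, 25, 28, 30, 34, 41]  # hypothetical respondent ages

summary = {
    # Measures of central tendency
    "mean": mean(ages),
    "median": median(ages),
    "mode": mode(ages),
    # Measures of dispersion (sample versions, dividing by n - 1)
    "variance": variance(ages),
    "std_dev": stdev(ages),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```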
VISUAL DISPLAYS OF DATA
In addition to numerical summaries of location, spread and shape, visual displays can be used to
provide a complete and accurate impression of distribution and variable relationships.
Frequency tables array data from highest to lowest values with counts and percentages.
They are most useful for inspecting the range of responses and their repeated occurrence.
Bar charts and pie charts are appropriate for relative comparisons of nominal data.
Histograms are optimally used with continuous variables where intervals group the
responses.
Stem and leaf displays present actual data values using a histogram type device that
allows inspection of spread and shape.
Box plots use the five-number summary to convey a detailed picture of a distribution's
main body, tails and outliers.
Control charts display sequential measurements of a process together with a centre line
and control limits. The selection of a control chart depends on the level of data one is
measuring. It helps managers focus on special causes of variation by revealing whether a
system is under control and substantiating results from improvements.
The Pareto diagram is a bar chart whose percentages sum to 100 percent. The causes of
the problem under investigation are sorted in decreasing importance with bar height
descending from left to right. Its pictorial array reveals the highest concentration of
quality improvement potential in the fewest number of remedies.
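The numbers behind three of these displays can be sketched without any plotting library: a frequency table, the Pareto ordering with cumulative percentages, and the five-number summary that a box plot draws. The complaint categories and scores are hypothetical. Quartile conventions differ between textbooks and packages; this sketch uses the "inclusive" method of statistics.quantiles.

```python
from collections import Counter
from statistics import quantiles

# Frequency table and Pareto ordering for nominal data.
complaints = [
    "late delivery", "damaged", "late delivery", "wrong item",
    "late delivery", "damaged", "billing", "late delivery",
]
counts = Counter(complaints)
total = sum(counts.values())

pareto = []  # rows of (cause, count, percent, cumulative percent)
cumulative = 0.0
for cause, count in counts.most_common():  # sorted by decreasing count
    percent = 100.0 * count / total
    cumulative += percent
    pareto.append((cause, count, percent, cumulative))

for cause, count, percent, cum in pareto:
    print(f"{cause:<15}{count:>3}{percent:>8.1f}%{cum:>8.1f}%")

# Five-number summary (minimum, Q1, median, Q3, maximum) for a box plot.
scores = [12, 15, 17, 19, 22, 24, 30]
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
five_number = (min(scores), q1, q2, q3, max(scores))
print(five_number)
```

The Pareto rows show at a glance which few causes account for most of the complaints, which is the point the diagram makes graphically.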
INFERENTIAL STATISTICS
Hypothesis: It's a statement about a population parameter developed for the purpose of testing.
Hypothesis testing: It's a procedure based on sample evidence and probability theory to
determine whether the hypothesis is a reasonable statement.
Coding categories should also be derived from one classification principle.
Coding closed questions
The responses to closed questions include scaled items and others for which answers can be
anticipated. When codes are established early in the research process, it is possible to pre-code
the questionnaire. Pre-coding is particularly helpful for data entry because it makes the
intermediate step of completing a coding sheet unnecessary. The data are accessible directly
from the questionnaire. A respondent, interviewer, field supervisor or researcher is able to assign
an appropriate numerical response on the instrument by checking, circling or printing it in the
proper coding location.
Coding open-ended questions
Open-ended questions are used where insufficient information or lack of a hypothesis
prohibits preparing response categories in advance, or where there is a need to measure sensitive
or disapproved behaviour, discover salience or encourage natural modes of expression. Content analysis is
commonly used to analyse open-ended questions. Converse and Presser (1986) define content
analysis as a research technique for the objective, systematic and quantitative description of the
manifest content of a communication.
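A very rough sketch of counting manifest content in open-ended answers. It assumes a keyword-based category scheme, which is a simplification: in practice the scheme is built from the research question and applied by trained coders. The categories, keywords and responses are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical category scheme: category -> keywords that signal it.
CATEGORIES = {
    "price": {"price", "cost", "expensive", "cheap"},
    "service": {"service", "staff", "helpful", "rude"},
    "quality": {"quality", "durable", "broken"},
}


def categorize(response):
    """Return the sorted list of categories whose keywords appear in a response."""
    words = set(re.findall(r"[a-z]+", response.lower()))
    return sorted(cat for cat, keywords in CATEGORIES.items() if words & keywords)


responses = [
    "The staff were helpful but it was too expensive.",
    "Good quality, fair price.",
    "Delivery took a long time.",
]

# Count how many responses mention each category.
tally = Counter(cat for r in responses for cat in categorize(r))
print(tally)
```

Responses that match no category (the third one here) signal that the scheme is not yet exhaustive and may need a further category.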
1. A study by the Coca-Cola Company showed that the typical adult Kenyan consumes 18
gallons of Coca-Cola each year. According to the same survey, the standard deviation of the
number of gallons consumed is 3.0. A random sample of 64 college students showed they
consumed an average (mean) of 17 gallons of cola last year. At the 0.05 significance level,
can we conclude that there is a significant difference between the mean consumption rate of
college students and other adults?
2. The manager of a departmental store is thinking about establishing a new billing system for
the stores credit customers. After a thorough financial analysis, she determines that the new
system will not be cost effective if the average monthly account is less than 70,000. A
random sample of 200 monthly accounts is drawn, for which the mean monthly account is
Sh. 66,000. With a = 0.05, is there sufficient evidence to conclude that the new system will
not be cost effective? Assume that the population standard deviation is Sh. 30,000.
3. Past experience indicates that the monthly long distance telephone bill per household in a
particular community is normally distributed, with a mean of Sh. 1012 and a standard
deviation of Sh. 327. After an advertising campaign that encouraged people to make long
distance telephone calls more frequently, a random sample of 57 households revealed that the
mean monthly long distance bill was Sh. 1098. Can we conclude at the 10% significance
level that the advertising campaign was successful?
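The mechanics of a test about a mean with known population standard deviation can be sketched as below, using the figures from exercise 1. This is an illustrative sketch, not a model answer: it assumes a two-tailed test, relies on Python's standard `statistics.NormalDist` for the normal distribution, and the function name `z_test_mean` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n):
    """Return the z statistic for a sample mean when sigma is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


# Figures from exercise 1: mu0 = 18, sigma = 3.0, n = 64, sample mean = 17.
z = z_test_mean(17, 18, 3.0, 64)
alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, critical = +/-{critical:.3f}, p = {p_value:.4f}")
print("reject H0" if abs(z) > critical else "do not reject H0")
```

Exercises 2 and 3 ask about a change in one direction only, so a one-tailed critical value, `NormalDist().inv_cdf(1 - alpha)`, would be the natural choice there.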
Examples
Testing the population proportion
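The null and alternate hypotheses of tests of proportions are set up in the same way as the
hypotheses of tests about the mean and variance. The test statistic for p is
$$z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}}$$
where $\hat{p}$ is the sample proportion, p is the proportion stated in the null hypothesis and n is the sample size.
Confidence interval estimator of p
$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$$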
Example:
An inventor has developed a system that allows visitors to museums, zoos and other attractions
to get information at the touch of a digital code. For example, zoo patrons can listen to an
announcement (recorded on a microchip) about each animal they see. It is anticipated that the
device would rent for $3.00 each. The installation cost for the complete system is expected to be
about $400,000. The ABC zoo is interested in having the system installed, but the management is
uncertain about whether to take the risk. A financial analysis of the problem indicates that if
more than 10% of the zoo visitors rent the system, the zoo will make a profit. To help make the
decision, a random sample of 400 zoo visitors is given details of the system's capabilities and
cost. If 48 people say that they would rent the device, can the management of the zoo conclude at
the 5% significance level that the investment would result in a profit?
2. In a random sample of 100 units from an assembly line, 22 were defective.
(a) Does this provide sufficient evidence at the 10% significance level to allow us to
conclude that the defective rate among all units exceeds 10%?
(b) Find a 99% confidence interval estimate of the defective rate.
3. A manufacturer of computer chips claims that more than 90% of his products conform to
specifications. In a random sample of 1,000 chips drawn from a large production run, 75 were
defective. Do the data provide sufficient evidence at the 1% level of significance to enable us to
conclude that the manufacturer's claim is true?
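The proportion test and the confidence interval estimator can be sketched in Python using the zoo example and exercise 2(b). The function names are hypothetical, the large-sample normal approximation is assumed to hold, and the sketch is meant to show the arithmetic rather than to serve as a worked solution.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(successes, n, p0):
    """z statistic for H0: p = p0, using p0 in the standard error."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def proportion_interval(successes, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Zoo example: 48 of 400 visitors would rent; H0: p = 0.10, H1: p > 0.10.
z = z_test_proportion(48, 400, 0.10)
critical = NormalDist().inv_cdf(0.95)            # one-tailed, 5% level
print(f"z = {z:.3f}, critical = {critical:.3f}")
print("reject H0" if z > critical else "do not reject H0")

# Exercise 2(b): 22 defectives in 100 units, 99% interval.
low, high = proportion_interval(22, 100, 0.99)
print(f"99% CI: ({low:.3f}, {high:.3f})")
```

Note that the test uses the hypothesised proportion in the standard error, while the interval uses the sample proportion, because no hypothesised value is available when estimating.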
Chi-square test of a multinomial experiment
A multinomial experiment is a generalized version of a binomial experiment that allows for more
than two possible outcomes on each trial of the experiment.
Properties of a multinomial experiment
> The experiment consists of a fixed number n of trials.
> The outcome of each trial can be classified into exactly one of k categories called cells
> The probability $p_i$ that the outcome of a trial will fall into cell i remains constant
for each trial, for i = 1, 2, ..., k. Moreover, $p_1 + p_2 + \dots + p_k = 1$.
> Each trial of the experiment is independent of the other trials.
Rejection region
Example
Two companies A and B have recently conducted aggressive advertising campaigns in order to
maintain and possibly increase their respective shares of the market for a particular product.
These two companies enjoy a dominant position in the market. Before advertising campaigns
began, the market share for Company A was 45% while Company B had a market share of 40%.
Other competitors accounted for the remaining market share of 15%. To determine whether these
market shares changed after the advertising campaigns, a marketing analyst solicited the
preferences of a random sample of 200 consumers of this product. Of the 200 consumers, 100
indicated a preference for Company A's product, 85 preferred Company B's product and the
remainder preferred one or another of the products distributed by other competitors. Conduct a
test to determine, at the 5% level of significance, whether the market shares have changed from
the levels they were at before the advertising campaigns occurred.
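The chi-square statistic for this example can be computed as sketched below. The statistic used is the usual sum of (observed - expected)^2 / expected over the k cells, with k - 1 degrees of freedom; the critical value 5.991 is the standard table value for a 5% test with 2 degrees of freedom. The function name `chi_square_gof` is hypothetical.

```python
def chi_square_gof(observed, proportions):
    """Chi-square goodness-of-fit statistic and its degrees of freedom."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Market-share example: 100 chose A, 85 chose B, the remaining 15 chose others.
observed = [100, 85, 15]
proportions = [0.45, 0.40, 0.15]          # shares before the campaigns (H0)

statistic, df = chi_square_gof(observed, proportions)
CRITICAL_5PCT_DF2 = 5.991                 # chi-square table value, alpha = 0.05, df = 2

print(f"chi-square = {statistic:.3f}, df = {df}")
print("reject H0" if statistic > CRITICAL_5PCT_DF2 else "do not reject H0")
```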
To determine if a single die is balanced, or fair, the die was rolled 600 times. The observed
frequencies with which each of the six sides of the die turned up are recorded in the following
table:
Is there sufficient evidence to conclude at the 5% level of significance that the die is not fair?
The operations manager at a shirt manufacturing plant has been concerned about the large
number of defects that the company's three shifts have been producing. There appear to be three
types of defects: Improper stitching, buttons not aligned with button holes and inconsistent
colouring. The manager decides to investigate the problem. As a first step to improving the
quality, she wants to know if the number and type of defects are the same for all three shifts. A
random sample of one day's shirt production is taken. The number of each type of defect and the
number of perfect shirts for each are shown in the following table.
Do these results allow the operations manager to conclude that at the 10% significance level,
there are differences in quality among the three shifts?
There are three distinct types of hardware wholesalers: independents (independently owned), wholesaler
voluntaries (groups of independents acting together) and retailer cooperatives (retailer owned). In
a random sample of 137 retailers, the retailers were categorized according to the type of
wholesaler they primarily used and according to their store location as shown in the table below.
At the 5% significance level, is there sufficient evidence to conclude that the type of wholesaler
primarily used by a retailer is related to the retailer's location?
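Questions of this kind are tested with a chi-square test on a contingency table. Since the tables for these exercises are not reproduced here, the sketch below uses a small hypothetical 2 x 2 table purely to show the calculation; the function name `chi_square_independence` is also an assumption.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total x column total / n.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical 2 x 2 counts, for illustration only (not the tables in the notes).
counts = [[10, 20],
          [20, 10]]
statistic, df = chi_square_independence(counts)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

The statistic is then compared with the chi-square table value for the chosen significance level and (rows - 1) x (columns - 1) degrees of freedom.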
RELATIONAL STATISTICS
Relational statistics fall into one of three categories: univariate, bivariate, and multivariate
analysis. Univariate analysis is the study of one variable for a sub-population. Bivariate analysis
is the study of a relationship between two variables. Multivariate analysis is the study of
relationship between three or more variables. The relational statistics include correlation,
regression, discriminant analysis, conjoint analysis, factor analysis and cluster analysis.
Discriminant analysis: It is used to classify people or objects into groups based on several
predictor variables. The groups are defined by a categorical variable with two or more values,
whereas the predictors are metric. The effectiveness of the discriminant equation is based not
only on its statistical significance but also on its success in correctly classifying cases to groups.
Conjoint analysis: It is a technique that typically handles non-metric independent variables. It
allows the researcher to determine the importance of product or service attributes and the levels
or features that are most desirable. Respondents provide preference data by ranking or rating
cards that describe products. These data become utility weights of product characteristics by
means of optimal scaling and log linear algorithms.
> Factor analysis: It attempts to reduce the number of variables and discover the underlying
constructs that explain the variance. A correlation matrix is used to derive a factor matrix from
which the best linear combination of variables may be extracted.
> Cluster analysis: It is a set of techniques for grouping similar objects or people. The cluster
procedure starts with an undifferentiated group of people, events or objects and attempts to
reorganize them into homogeneous sub-groups.
REGRESSION ANALYSIS
Regression involves developing a mathematical equation that analyses the relationship between
the variable to be forecast (dependent variable) and the variables that the statistician believes are
related to the forecast variable (independent variable). Regression is the estimation of unknown
values or the prediction of one variable from known values of other variables.
Types of regression
> Simple linear regression: Involves a relationship between two variables only.
> Multiple regression: Analyses or considers the relationship between three or more variables
Simple Regression
The first step in establishing the relationship between X and Y is to obtain observations on the
two variables and analyze the data using a scatter diagram to indicate whether a positive or
negative relationship exists between X and Y. The relationship can be approximated by a straight
line. Algebraically, the relationship is $Y_t = b_0 + b_1 X_t$.
The above function is deterministic since it gives an exact relationship between X and Y. When the
line is plotted, not all the points will fall on the line, for the following reasons:
> Omission of other explanatory variables from the function
> Random behavior of human beings
> Imperfect specification of the functional form of the model
> Errors of aggregation
> Errors of measurement
To account for the deviations of some points from the straight line, the error term is introduced.
The introduction of the error term makes the function stochastic: $Y_t = b_0 + b_1 X_t + e_t$. To estimate
the values of the coefficients $b_0$ and $b_1$, we need
observations on Y, X and the error term. However, the error term is not observable and therefore
we make assumptions about the error term.
Assumptions of the error term
> The error term is a real random variable which has a mean of zero and constant variance
(assumption of homoscedasticity)
> The error term is normally distributed
> The error terms corresponding to different values of X for different periods are not correlated
(assumption of no autocorrelation)
> There is no relationship between the explanatory variables and the error term
> The explanatory variables are measured without error. The error term absorbs the influence of omitted
variables and errors of measurement in the dependent variable.
All the above assumptions are called stochastic assumptions.
Other assumptions
> The explanatory variables are not perfectly linearly related or correlated (No
multicollinearity)
> The variables are correctly aggregated
> The relation being estimated is identified
> The relationship is correctly specified
The regression equation of Y on X
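> It is used to predict the values of Y from the given values of X.
> It is expressed as follows: $Y = b_0 + b_1 X$
> To determine the values of $b_0$ and $b_1$, the following two normal equations are solved
simultaneously:
$$\sum Y = n b_0 + b_1 \sum X$$
$$\sum XY = b_0 \sum X + b_1 \sum X^2$$
Alternatively, the values of $b_0$ and $b_1$ can be obtained using the following formulae:
$$b_1 = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X}$$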
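The least squares coefficients can be computed directly from the sums of X, Y, XY and X squared. The sketch below uses the standard closed-form solution of the normal equations (slope = (n ΣXY - ΣX ΣY) / (n ΣX² - (ΣX)²), intercept = mean of Y minus slope times mean of X). The data are hypothetical and the function name `least_squares` is an assumption.

```python
def least_squares(x, y):
    """Return (b0, b1) for the regression of Y on X."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Slope from the closed-form solution of the two normal equations.
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: the fitted line passes through the point of means.
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


# Hypothetical observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares(x, y)
print(f"Y = {b0:.2f} + {b1:.2f} X")
```

A prediction for a new value of X is then simply `b0 + b1 * x_new`.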
Example:
A random sample of eight auto drivers insured with a company and having similar auto
insurance policies was selected. The following table lists their driving experience (in years) and
the monthly insurance premium (in Sh.000) paid by them.
i. Find the least squares regression line by identifying the appropriate dependent and
independent variable.
ii. Interpret the meaning of the constants calculated in part (i) above.
iii. Compute the coefficient of correlation and coefficient of determination and interpret
their values.
4. A farmer wanted to find out the relationship between the amount of fertilizer used and the
yield of corn. He selected seven acres of his land on which he used different amounts of fertilizer.
i. Find the least squares regression line by identifying the appropriate dependent and
independent variable.
ii. Interpret the meaning of the constants calculated in part (i) above.
iii. Compute the coefficient of correlation and coefficient of determination and interpret their
values.
iv. Predict the yield of corn per acre for 105 kg of fertilizer used.
5. In an attempt to get a better idea of some of the determinants of medical expenditures by
families, a social worker collected data on family size and average weekly medical bills, with the
results shown in the following table:
i. Find the least squares regression line by identifying the appropriate dependent and
independent variable.
ii. Interpret the meaning of the constants calculated in part (i) above.
iii. Compute the coefficient of correlation and coefficient of determination and interpret their
values.
CORRELATION
Definition: It is the existence of some definite relationship between two or more variables.
Correlation analysis is a statistical tool used to describe the degree to which one variable is
linearly related to another variable.
Types of Correlation
Correlation may be classified in the following ways:
Positive and negative correlation
Correlation is said to be positive if two series move in the same direction, otherwise it is negative
(opposite direction).
Linear and non-linear correlation
Correlation is linear if the amount of change in one variable tends to bear a constant ratio to the
amount of change in the other variable, otherwise it is non-linear.
Simple, partial and multiple correlation
Simple correlation is where two variables are studied, while partial or multiple correlation involves three or
more variables.
Methods of calculating simple correlation
1. Scatter diagram
2. Karl Pearson's coefficient of correlation
3. Spearman's rank correlation coefficient
4. Method of least squares
Scatter diagram
It is a chart that portrays the relationship between two variables.
Advantages
> It is a simple and non-mathematical method of studying correlation between variables.
> It is not influenced by the size of extreme values.
Limitation
> One cannot establish the exact degree of correlation between the variables.
Karl Pearson's coefficient of correlation (Product moment coefficient of correlation)
The coefficient of correlation (r) is a measure of strength of the linear relationship between two
variables.
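The product moment coefficient is commonly computed from raw sums as r = (n ΣXY - ΣX ΣY) / sqrt[(n ΣX² - (ΣX)²)(n ΣY² - (ΣY)²)]. The sketch below applies that formula to hypothetical data; the function name `pearson_r` is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation coefficient from raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.4f}")
```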
Interpretation of the coefficient of correlation
> When r = +1, there is a perfect positive correlation between the variables.
> When r = -1, there is a perfect negative correlation between the variables.
> When r = 0, there is no correlation between the variables.
The closer r is to +1 or to -1, the closer the relationship between the variables, and the closer r is
to 0, the less close the relationship. The closeness of the relationship is not proportional
to r.
The following table lists the interpretation for various correlation coefficients
Advantage
> It summarizes in one figure the degree of correlation and whether it is positive or negative.
Limitations
> It assumes linear relationship regardless of the fact whether that assumption is true or not.
> The coefficient can be misinterpreted.
> The value of the coefficient is unduly affected by the extreme values.
> It is time consuming.
Method of least squares
Spearman's Rank Correlation
Definition
> It is the correlation between the ranks assigned to individuals for two different characteristics.
> It is a non-parametric technique for measuring the strength of the relationship between paired
observations of two variables when the data are in ranked form.
It is denoted by R or ρ.
Merits of the Rank method
> It is simpler to understand and easier to apply compared to Karl Pearson's method.
> Where the data are of a qualitative nature, like honesty, efficiency, intelligence etc., the
method can be used with great advantage.
> It is the only method that can be used where we are given the ranks and not the actual
values.
Limitations
> The method cannot be used for finding out correlation in a grouped frequency distribution.
> Where the number of observations exceeds 30, the calculations become quite tedious and require
a lot of time.
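When there are no tied ranks, Spearman's coefficient is usually computed as R = 1 - 6 Σd² / (n(n² - 1)), where d is the difference between the two ranks given to each individual. The sketch below assumes untied ranks and uses hypothetical rankings; the function name `spearman_rank` is an assumption.

```python
def spearman_rank(rank_a, rank_b):
    """Spearman's coefficient from two rankings with no tied ranks."""
    n = len(rank_a)
    # d is the difference between the two ranks of the same individual.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks given to five candidates by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
R = spearman_rank(judge_1, judge_2)
print(f"R = {R:.2f}")
```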
Coefficient of determination ($r^2$)
It is the square of the correlation coefficient. It shows the proportion of the total variation in the
dependent variable Y that is explained or accounted for by the variation in the independent
variable X. For example, if r = 0.9 then $r^2$ = 0.81, which means 81% of the variation in the
dependent variable has been explained by the independent variable.
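The "proportion of variation explained" reading can be checked numerically: for a simple regression, one minus the ratio of unexplained to total variation equals the square of r. The sketch below uses hypothetical data and a hypothetical function name, `coefficient_of_determination`.

```python
def coefficient_of_determination(x, y):
    """Share of the variation in y explained by the least-squares line on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    # Unexplained variation: squared gaps between observed and fitted values.
    unexplained = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation: squared gaps between observed values and their mean.
    total = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - unexplained / total


# Hypothetical paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2 = coefficient_of_determination(x, y)
print(f"r^2 = {r2:.2f}")
```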
REPORT WRITING TECHNIQUES
A quality presentation of research findings can have an inordinate effect on a reader's or a
listener's perceptions of a study's quality. Recognition of this fact should prompt a researcher to
make a special effort to communicate skillfully and clearly. Research reports contain findings,
analysis, interpretations, conclusions and recommendations. Research reports differ depending
on their aims and their readership. Reports should be clearly organized, physically inviting and
easy to read. Writers can achieve these goals if they are careful with mechanical details, writing
style and comprehensibility.
Chapter Two
2.0 Literature Review
The purpose of the literature review is to situate your research in the context of what is already
known about a topic. It need not be exhaustive; it needs to show how your work will benefit the
whole. It should provide the theoretical basis for your work, show what has been done in the area
by others, and set the stage for your work.
In a literature review you should give the reader enough ties to the literature that they feel
confident that you have found, read, and assimilated the literature in the field. It should probably
move from the more general to the more focused studies, but need not be exhaustive, only
relevant.
The literature review should clearly present the holes in the knowledge that need to be plugged
and by so doing, situate your work. It is the place where you establish that your work will fit in
and be significant to the discipline.
Chapter Three
3.0 Research Methodology
This section should make clear to the reader the way that you intend to approach the research
question and the techniques and logic that you will use to address it.
3.1 Research design
The coverage of the design must be adapted to the purpose. In an experimental study, the
materials, tests, equipment, control conditions and other devices should be described. In
descriptive or ex post facto designs, it may be sufficient to cover the rationale for using one
design instead of competing alternatives. The strengths and weaknesses of the design can be
identified and the instrumentation and materials discussed.
3.2 The target population
The researcher should explicitly define the target population being studied.
3.3 Sampling strategy
Explanations of the sampling methods, uniqueness of the chosen parameters or other points that
need explanation should be covered with brevity.
3.4 Data Collection Tools and Techniques
This part of the report describes the specifics of gathering the data. Its contents depend on the
design. This might include the data that you anticipate collecting and a description of the
instruments you will use. Detailed copies of the data collection tools e.g. questionnaires,
interview schedule or observation schedule should be attached as an appendix.
3.5 Data Analysis
This section summarizes the methods used to analyze the data. It describes data handling,
preliminary analysis, statistical tests, computer programs and other technical information. The
rationale for the choice of analysis approaches should be clear. A brief commentary on
assumptions and appropriateness of use should be presented.
Chapter Four
4.0 Data analysis and Findings
The objective is to explain the data, rather than draw interpretations or conclusions. When
quantitative data can be presented, it should be done as simply as possible with charts, graphics
and tables. The data need not include everything collected. Only material important to the
reader's understanding of the problem and the findings should be included. Findings that
support the hypothesis and findings that do not should both be included.
Chapter Five
5.0 Summary and Conclusions
The summary is a brief statement of the essential findings. Sectional summaries may be used if
there are many specific findings. These may be combined into an overall summary. Conclusions
represent inferences drawn from the findings. Conclusions may be presented in a tabular form
for easy reading and reference. Summary findings may be subordinated under the related
conclusion statement.
Recommendations
There are usually a few ideas about corrective actions. In academic research, the
recommendations are often further study suggestions that broaden or test understanding of the
subject area. In applied research, the recommendations will usually be for managerial action
rather than research action. The writer may offer several alternatives with justifications.
References
The use of secondary data requires a reference or a bibliography. Proper citation, style
and formats are unique to the purpose of the report. The
Appendixes
The appendixes are the place for complex tables, statistical tests, supporting documents, copies
of forms and questionnaires, detailed descriptions of the methodology, instructions to field
workers and other evidence important for later support. The reader who wishes to learn about
technical aspects of the study and to look at statistical breakdowns will want a complete
appendix.
Time schedule
It is a listing of the major activities and the anticipated time each will take to
accomplish. The time is usually given in months. Activities to be undertaken may
overlap.
Budget
A budget is a list of items that will be required to carry out the research and their approximate
cost. It should be detailed enough and precise on items needed, prices per-unit and total cost.
Details of requirements in each budget will be governed by the type of research.
Characteristics of a Good Proposal:
The need for the proposed activity is clearly established, preferably with data.
The most important ideas are highlighted and repeated in several places.
The objectives of the project are given in detail.
There is a detailed schedule of activities for the project, or at least sample portions of
such a complete project schedule.
Collaboration with all interested groups in planning of the proposed project is evident in
the proposal.
The commitment of all involved parties is evident, e.g., letters of commitment in the
appendix and cost sharing stated in both the narrative of the proposal and the budget.
The budget and the proposal narrative are consistent.
The uses of money are clearly indicated in the proposal narrative as well as in the budget.
All of the major matters indicated in the proposal guidelines are clearly addressed in the
proposal.
The agreement of all project staff and consultants to participate in the project was
acquired and is so indicated in the proposal.
All governmental procedures have been followed with regard to matters such as civil
rights compliance and protection of human subjects.
Appropriate detail is provided in all portions of the proposal.
All of the directions given in the proposal guidelines have been followed carefully.
Appendices have been used appropriately for detailed and lengthy materials which the
reviewers may not want to read but are useful as evidence of careful planning, previous
experience, etc.
The length is consistent with the proposal guidelines and/or funding agency
expectations.
The budget explanations provide an adequate basis for the figures used in building the
budget.
If appropriate, there is a clear statement of commitment to continue the project after
external funding ends.
The qualifications of project personnel are clearly communicated.
The writing style is clear and concise. It speaks to the reader, helping the reader
understand the problems and proposal. Summarizing statements and headings are used to
lead the reader.
Guidelines for writing a good research report
Break large units of text into smaller units with headings to show organisation of the
topics.
Relieve difficult text with visual aids when possible
Emphasize important material and de-emphasize secondary material through sentence
construction and judicious use of italicizing, underlining, capitalizing and parentheses.
Use ample space and wide margins to create a positive psychological effect on the
reader.
Choose words carefully, opting for the known and short rather than the unknown and
long.
Repeat and summarize critical and difficult ideas so readers can have time to absorb
them.
Review the writing to ensure the tone is appropriate.
Proofread the final document to correct any errors.
Use short paragraphs
Indent parts of text that represent listings, long quotations or examples.
Use headings and subheadings to divide the report and its major sections into
homogeneous topical parts.