Module 4
Data Collection
Data Collection: Primary and Secondary Data. Primary data collection methods - Observations,
survey, Interview and Questionnaire, Qualitative Techniques of data collection. Questionnaire
design – Meaning - process of designing questionnaire. Secondary data -Sources – advantages
and disadvantages
Measurement and Scaling Techniques: Basic measurement Scales-Nominal scale, Ordinal
scale, Interval scale, Ratio scale. Attitude measurement scale - Likert’s Scale, Semantic
Differential Scale, Thurstone scale, Multi-Dimensional Scaling.
Primary data
Primary data are also called first-hand data.
Primary data are originated by a researcher for the specific purpose of addressing the
problem at hand.
Obtaining primary data can be expensive and time consuming.
Secondary data
Secondary data are data that have already been collected for purposes other than the
problem at hand.
These data can be located quickly and inexpensively.
Secondary data means data that are already available i.e., they refer to the data which
have already been collected and analyzed by someone else. When the researcher utilizes
secondary data, then he has to look into various sources from where he can obtain them.
In this case he is certainly not confronted with the problems that are usually associated
with the collection of original data.
Secondary data may either be published data or unpublished data. Usually published data
are available in:
(a) various publications of the central, state or local governments;
(b) various publications of foreign governments or of international bodies
and their subsidiary organizations;
(c) technical and trade journals;
(d) books, magazines and newspapers;
(e) reports and publications of various associations connected with
business and industry, banks, stock exchanges, etc.;
(f) reports prepared by research scholars, universities, economists, etc. in
different fields; and
(g) public records and statistics, historical documents, and other sources of
published information.
The sources of unpublished data are many; they may be found in diaries, letters,
unpublished biographies and autobiographies and also may be available with scholars and
research workers, trade associations, labour bureaus and other public/ private individuals
and organizations.
Advantages of Secondary Data
Secondary data are easily accessible, relatively inexpensive and quickly obtained.
Available on topics where it would not be feasible for a firm to collect primary data.
Disadvantages of Secondary Data
Mismatch between the original purpose and the purpose of the current study.
Because secondary data have been collected for purposes other than the problem at hand,
their usefulness to the current problem may be limited in several important ways,
including relevance and accuracy.
The objectives, nature and methods used to collect the secondary data may not be
appropriate to the present situation.
Secondary data may be lacking in accuracy, or they may not be completely current or
dependable.
Figure: Classification of secondary data (published data, computerized databases, syndicated services)
Internal data:
Data available within the organization conducting the research are termed internal data. Their
main advantage is that they are easily available, and greater reliance can be placed on their
accuracy and relevance to the study.
External data
External data are those generated by sources outside the organization. These are mainly
a) published data,
b) computerized databases
c) Syndicated services.
a) Published Data: these are the most popular of the external sources of data. Different
sources of published data are:
a. Guides: Excellent source of standard information. A guide may help identify
other important sources such as directories, trade associations and trade
publications.
Some of the useful guides are:
Vancouver India business guide
Encyclopedia of business information sources
A Guide to Consumer Markets
Business information sources
b. Directories: Helpful for identifying individuals or organizations that collect
specific data. Eg. Research Services Directory.
www.webdir.biz/Human_Resources/; forex directories, retail direct. Small
business, real estate, fortune 500 directory, leading public and private companies,
Standard directories of advertisers.
c. Indexes – It is possible to locate information on a particular topic in several
different publications by using an index. Indexes can, therefore, increase the
efficiency of the search process.
Ex: Business Periodical Index, The Wall Street Journal Index,
Librarian’s Internet Index (www.lli.org)
d. Statistical Data – Published statistical data are of great interest to researchers.
Graphic and statistical analyses can be performed on these data to draw important
insights. Ex: A Guide to Consumer Markets, Standard and Poor’s Statistical
Service.
e. Government Sources
Census Data – Provides detailed view of the human population, their income
and education level. The quality of census data is high and the data are often
extremely detailed. Important census data include Census of Housing, Census
of Manufacturers, Census of Population, Census of Retail Trade, Census of
Service Industries and Census of Wholesale Trade.
Other Government Publications – In addition to the census, the government
collects and publishes a great deal of statistical data. The more useful
publications are Business Conditions Digest and Survey of Current Business.
b) Computerized databases
Online Databases- Databases, stored in computers, which require a telecommunications
network to access.
Internet Databases – Internet databases can be accessed, searched and analyzed on the
internet. It is also possible to download data from the internet and store them in the
computer or an auxiliary storage device.
c) Syndicated services
Syndicated sources, also referred to as syndicated services, are companies that collect and
sell common pools of data of known commercial value, designed to serve information needs
shared by a number of clients. These data are not collected for the purpose of marketing
research problems specific to individual clients, but the data and reports supplied to client
companies can be personalized to fit particular needs.
Primary data, also called first-hand data, contain information that has been collected specifically
for the purpose of the investigation at hand.
The data directly collected by the researcher with respect to the problem under study are known as
primary data.
Primary data collection involves greater effort on the part of the researcher and is time
consuming and expensive.
OBSERVATIONS
It refers to monitoring and recording the behavioural and non-behavioural activities and
conditions in a systematic manner to obtain information about the phenomena of interest.
In the observation method, only present or current behaviour can be studied. Therefore, many
researchers feel that this is a great disadvantage.
Non-Participant Observation
When the observer observes the group passively from a distance without participating in the
group activities, it is known as non-participant observation.
Advantages of Non-Participant Observation:
(a) Objectivity: In non-participant observation, objectivity or neutrality can be maintained.
The observer in this type of observation gives a detached and unbiased view about the group.
(b) Command respect and co-operation: In non-participant observation the researcher
plays an impartial role. Therefore, every member of the group gives him a special status and
co-operates with his study.
(c) Careful analysis: In participant observation, because of great familiarity with the events,
the observer sometimes does not realize the significance of some events and neglects them. But
in non-participant observation the researcher does not miss even a minute detail. He carefully
judges the merits and demerits of each and every phenomenon under study.
(d) Freedom from groupism: In non-participant observation the researcher always maintains
his unbiased status.
Disadvantages of Non-Participant Observation:
(a) Subjectivity: In non-participant observation the observer does not have clarity about certain
events or activities. He cannot clear his doubts by asking questions of the group
members. Therefore, he has to simply interpret what he sees. This lack of
understanding may make some of his findings biased and coloured by his personal predictions,
beliefs and preconceptions.
(b) Inadequate observation: The observer can observe only those events which take place in
front of him. But that is not enough: it covers only a part of the phenomena, whereas the
research requires a vast range of information. He can learn many more things about the group when he
participates in the group and interacts with the group members.
(c) Unnatural and formal information: The members of a group become suspicious of a
person who observes them objectively. In front of an outsider or stranger they feel self-conscious and
provide only some formal information in an unnatural way. This creates bias, and what the
observer collects is not the actual or normal behaviour but only formal information.
(d) Inconvenience to the respondents: The members of a particular group always feel
uncomfortable when they know that their behaviour is being critically analyzed by an outsider.
Therefore, in some cases tribal groups do not allow an outsider to watch their socio-cultural
activities. It is always better for a researcher to become a member of the group in order to learn
much about it.
4. Natural vs. Contrived Observation:
Conducting the study in a natural setting essentially means that the observer is simply observing the
subjects in their "real life" environments. Because the observer has no way of influencing what the
subjects are doing, it can be time consuming to gather the information that the project
specifically requires. On the other hand, data collected in a natural
setting are more accurate in reflecting "real life" behavior rather than "contrived"
behavior.
A contrived setting is one where the specific situation being studied is created by the observer.
The contrived setting offers the observer greater control over the gathering of data and
specifically enables the information to be gathered more quickly and efficiently. However, it
may be questionable whether the data collected truly reflect a "real life"
situation.
Mechanical Observation – Mechanical devices, rather than human observers, record the
phenomenon being observed. Devices are:
Audiometer – Attached to a television set to continually record what channel the set is tuned to.
Turnstiles – Record the number of people entering or leaving a building.
On-site cameras – Used by retailers to assess package designs, counter space, floor displays and
traffic flow patterns.
Eye-tracking monitors – Record the gaze movements of the eye. Used to determine how a
respondent reads an advertisement and views a TV commercial.
Voice pitch analysis – Measures emotional reactions through changes in the respondent’s voice.
Response latency – Time a respondent takes before answering a question.
Audit – The researcher collects data by examining physical records or performing an inventory
analysis of brands, quantities and package sizes in a consumer’s home or at a retail
store.
Content Analysis – Objective, systematic and quantitative description of the clear content of a
communication. The unit of analysis may be words, characters, themes, space and time measures
or topics.
Trace Analysis – Data collection is based on physical traces or evidence of past behavior. E.g.
No. of different fingerprints on a page was used to find out the readership of various
advertisements in a magazine.
An interview is a verbal conversation between two people with the objective of collecting
research-relevant information.
Types of Interview
1. Personal interview.
2. Telephone interview.
3. Focus group interview.
4. Depth interview.
5. Projective Techniques.
Synergism- A group of people produces better insights into a problem than an
individual.
Snowballing- One person's response initiates a chain of responses.
Stimulation- As the interview progresses, respondents are more and more encouraged to
give responses.
Security- Since an individual generally finds somebody in the group who might endorse
his opinion, he feels secure in answering.
Spontaneity- Since no pre-designed questions are asked, responses are
spontaneous.
Often new ideas are generated.
Scientific scrutiny- Since the proceedings are recorded, they can be analysed
scientifically in great detail.
Flexible and in-depth responses.
Data is collected quickly.
Results can be wrongly interpreted since the response is not to any specific question.
Coding and analyzing data is difficult.
It is difficult to find a moderator who can conduct these interviews successfully.
4. Depth interview: A depth interview, like a focus group interview, is an unstructured type of
interview used to collect qualitative data. However, it involves a one-to-one interaction
between the interviewer and the respondent.
The depth interview can be non-directive in nature, where the respondent is given freedom to
answer within the boundaries of the topic of interest.
The other form of depth interview is 'semi-structured' in nature, where the interviewer covers
a specific list of topics although the linking, the sequence and the wording of each question are
left to the interviewer's discretion.
In depth interviews, the interviewer asks the initial questions and thereafter it is the response
of the respondents from which further questions may be generated. The interviewer using
probing techniques looks for more elaboration.
It suffers from the drawbacks of being expensive and time consuming and of demanding a skilled
interviewer.
i) Rapport Building
ii) Introduction
iii) Probing
iv) Recording
v) Closing
A questionnaire is a form containing a set of questions, which are filled in by the respondents. In general, the
questionnaire refers to a device for securing answers to questions by using a form which the
respondent fills in himself.
Importance of Questionnaire
Characteristics of Questionnaire
Here the questions are structured so as to obtain the facts. The interviewer will ask the
questions strictly in accordance with the pre-arranged order.
In the non-disguised type, the purpose of the questionnaire is known to the respondent.
Here the purpose of the study is clear, but the responses to the question are open-
ended.
No fixed questions.
Suitable for conducting depth interview
Subject matter can be questioned in great detail
Coding, tabulation etc. are difficult.
Not a very frequently used method.
Process of Questionnaire Design
1. Specify the information sought: The researcher should be able to specify the list of
information needs. Generally, this task has already been accomplished when the research
proposal or the research design was developed. The hypothesis developed earlier is the
guiding light in stating the information requirement. The hypothesis establishes the
relationship between the variable and the researcher can ideally develop the data that is
required to be collected to prove or disprove the hypothesis.
2. Determine the communication approach: it refers to the decision on the method used to
conduct the survey i.e. personal interview, depth interview, telephone, mail, computer etc.
This decision on the method to be used will have a bearing on the type of questionnaire to
be designed. The choice of communication approach is influenced by factors like location
of the respondent, time and funds available, nature of study etc.
3. Select the type of questionnaire: in this step the researcher specifies how the data will
be gathered by stating the type of questionnaire required. The questionnaire can be of four
types,
a. Structured- undisguised
b. Unstructured- disguised
c. Structured- disguised
d. Unstructured- undisguised
4. Determine question content: this involves the task of framing the questions which would
yield the data required for study. While framing the questions certain things should be
kept in mind:
dichotomous questions
Do you own a digital camera?
o Yes
o No
multichotomous questions
o Sony
o Canon
o Nikon
o Kodak
Checklist questions
Scale questions etc
6. Determine the wording of each question: this stage is concerned with phrasing of each
question. The researcher needs to use utmost caution in framing the question. Following
things should be kept in mind while wording a question
a. Use simple words
b. Avoid technical jargon
c. Avoid using ambiguous questions
d. Avoid biased wordings
e. The level of personalization should be controlled
Phase 3: Drafting and refining questionnaire
7. Decide on the question sequence: from this step we enter the stage of drafting the
questionnaire and the ordering of questions is an important aspect. The following things
should be kept in mind:
a. Use simple and interesting questions first
b. The questions should be arranged in a logical order
c. Classification questions should be asked later on
d. Difficult and sensitive questions should not be asked in the beginning
e. Branching of questions should be done with care.
8. Determine the physical characteristics of questionnaire
Physical appearance affects the way respondents react to the questionnaire. Hence the
following points should be observed.
Use a good quality paper with high definition ink so that it can be read easily
It should look professional and easy to answer
Questionnaire should be accompanied by an introduction letter
Size of the questionnaire is important. It should not be too lengthy or too short
Introduction should be written politely and clearly
Pretesting of the questionnaire is done to detect any flaws that might be present. One of the
prime conditions for pretesting is that the sample chosen for pretesting should be similar to
the respondents who are ultimately going to participate.
Advantages of questionnaire
Disadvantages of questionnaire
1. The level of questionnaire may not match the intelligence level of respondents
2. The non-response rate is high in questionnaire
3. Since the questionnaire is filled in by the respondent in his own hand, the writing
is often not legible
4. In studies where an immediate response is required, the questionnaire method is not suitable
5. Clarification of doubts is not possible
Mail questionnaire
Advantages
Limitations
SCHEDULES
Measurement is essential in a research process because measurement alone will help us gather
some kind of conclusive and quantitative data.
Measurement refers to the assignment of numerals to the objects to represent the amount of
property or characteristics possessed by the object.
1. Order: numbers are placed in a logical sequence and the sequence has some meaning
2. Distance: the differences between the numbers are ordered
3. Origin: the number system has a unique origin indicated by number zero
Based on these 3 characteristics, measurement can be done using 4 different types of system
1. Nominal scale
A nominal scale is a figurative labeling scheme in which the numbers serve only as labels
or tags for identifying and classifying objects. – Description
The numbers assigned to the respondents in a study constitute a nominal scale.
For eg.
University Registration Numbers assigned to students,
Bus Route Numbers and
Numbers on the jerseys of cricket players are examples of Nominal scale.
Coding male as 1 and female as 2
Control group-1 and experimental group- 2
The numbers used in nominal scales serve only the purpose of counting and the idea is to
make sure that no two persons or objects receive the same number.
Number does not reflect any characteristic of the store
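The idea that nominal numbers are only labels can be illustrated with a minimal Python sketch. The male = 1 / female = 2 coding comes from the example above; the ten responses are hypothetical, invented only for illustration.

```python
from collections import Counter

# Hypothetical nominal codes for ten respondents: 1 = male, 2 = female
# (the coding follows the notes; the responses themselves are made up).
gender_codes = [1, 2, 2, 1, 2, 2, 1, 2, 1, 2]
labels = {1: "male", 2: "female"}

# Counting how many objects fall in each category is meaningful.
counts = Counter(labels[c] for c in gender_codes)

# The mode (most frequent category) is the only sensible "average".
mode_label, mode_count = counts.most_common(1)[0]

# The arithmetic mean of the codes can be computed, but it has no meaning:
# relabeling female as 7 instead of 2 would change it without changing the data.
meaningless_mean = sum(gender_codes) / len(gender_codes)

print(counts)
print(mode_label, mode_count)
```

Only the counts and the mode survive a relabeling of the categories, which is why they are the statistics usually reported for nominal data.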
2. Ordinal scale
An ordinal scale is a ranking scale in which numbers are assigned to objects to indicate
the relative extent to which the objects possess some characteristic.
Numbers indicate the relative positions of the objects but not magnitude of differences
between them
Ordinal Scales are used to ascertain the consumer perceptions, preferences etc.
For e.g., the respondents may be given a list of suitable brands and
asked to rank them on an ordinal scale of 1-5.
a)Lux b)Liril c)Cinthol d)Dove e) Pears
In market research, we often ask respondents to rank items, e.g., soft
drinks based upon flavor or color. In such a case, the ordinal scale is used.
The object ranked first has more of the characteristic than the object ranked
second, but whether the object ranked second is a close second or a poor second is not known.
Measurement of this type include greater than or less than judgment from the respondents
Common examples – quality rankings, ranking of teams in a tournament, preference
ranking, market position
3. Interval scale
In an interval scale, numerically equal distances on the scale represent equal values in the
characteristic being measured. An interval scale contains all the information of an ordinal scale,
but it also allows you to compare the differences between objects.
Temperature scales, i.e., Centigrade and Fahrenheit, are also interval scales, e.g. the temperatures of
four cities are:
It can be said that the difference in the temperature of Delhi and Shimla is the same as difference
in the temperature of Jaipur and Bangalore. However, we cannot say that Delhi is two times
warmer than Shimla.
4. Ratio scale
It is a special kind of interval scale that has a meaningful zero point. With this scale, length,
weight or distance can be measured. In this scale, it is possible to say how many times greater or
smaller one object is compared to another.
It is the highest scale: it allows the researcher to identify or classify objects and compare intervals or
differences.
It possesses all the properties of nominal, ordinal and interval scales and in addition an absolute
zero point.
Eg. Sales this year for product A are twice the sales of the same product last year.
2. Ordinal scale
The lowest level of the ordered scale that is commonly used is the ordinal scale.
The ordinal scale places events in order, but there is no attempt to make the intervals of
the scale equal in terms of some rule. Rank orders represent ordinal scales and are
frequently used in research relating to qualitative phenomena. A student’s rank in his
graduation class involves the use of an ordinal scale. One has to be very careful in
making statement about scores based on ordinal scales.
For instance, if Ram’s position in his class is 10 and Mohan’s position is 40, it cannot be said
that Ram’s position is four times as good as that of Mohan. The statement would make no sense
at all.
Ordinal scales only permit the ranking of items from highest to lowest. Ordinal measures
have no absolute values, and the real differences between adjacent ranks may not be
equal. All that can be said is that one person is higher or lower on the scale than another,
but more precise comparisons cannot be made.
Thus, the use of an ordinal scale implies a statement of ‘greater than’ or ‘less than’ (an
equality statement is also acceptable) without our being able to state how much greater or
less. The real difference between ranks 1 and 2 may be more or less than the difference
between ranks 5 and 6.
Since the numbers of this scale have only a rank meaning, the appropriate measure of
central tendency is the median.
A percentile or quartile measure is used for measuring dispersion.
Correlations are restricted to various rank order methods.
Measures of statistical significance are restricted to the non-parametric methods.
Ex: respondent is asked to rank 3 books on the content matter. They may give the following
ranks
Book Rank
Book – A 2
Book – B 3
Book – C 1
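The statistics named above for ordinal data (median, quartiles, rank correlation) can be sketched in Python as follows. The first ranking of the three books is taken from the example; the second respondent's ranking and the list of class ranks are hypothetical, added only so that there is something to correlate and summarize. The formula used is Spearman's rank correlation for rankings without ties.

```python
from statistics import median, quantiles

# Ranks given to three books by respondent 1 come from the example above;
# respondent 2's ranks are hypothetical, added only to illustrate rank correlation.
ranks_r1 = {"A": 2, "B": 3, "C": 1}
ranks_r2 = {"A": 1, "B": 3, "C": 2}

def spearman_rho(x, y):
    """Spearman rank correlation for two rankings without ties."""
    n = len(x)
    d_squared = sum((x[k] - y[k]) ** 2 for k in x)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

rho = spearman_rho(ranks_r1, ranks_r2)

# Hypothetical class ranks of seven students; the median and the quartiles
# depend only on the order of the values, not on the distances between them.
class_ranks = [3, 10, 12, 18, 25, 31, 40]
mid = median(class_ranks)
q1, q2, q3 = quantiles(class_ranks, n=4)

print(rho, mid, (q1, q2, q3))
```

Neither calculation ever treats the gap between rank 1 and rank 2 as a measured distance, which is what makes them suitable for ordinal scales.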
3. Interval scale
In the case of interval scale, the intervals are adjusted in terms of some rule that has been
established as a basis for making the units equal. The units are equal only in so far as one
accepts the assumptions on which the rule is based.
Interval scales can have an arbitrary zero, but it is not possible to determine for them
what may be called an absolute zero or the unique origin.
The primary limitation of the interval scale is the lack of a true zero; it does not have the
capacity to measure the complete absence of a trait or characteristic.
The Fahrenheit scale is an example of an interval scale and illustrates what
one can and cannot do with it. One can say that an increase in temperature from 30° to
40° involves the same increase in temperature as an increase from 60° to 70°, but one
cannot say that a temperature of 60° is twice as warm as a temperature of 30°, because
both numbers depend on where the zero of the scale is set, and that zero is arbitrary
(on the Fahrenheit scale, water freezes at 32°, not at 0°). The ratio of the two
temperatures, 30° and 60°, means nothing because zero is an arbitrary point.
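The difference between comparing intervals and comparing ratios can be checked numerically. The sketch below, in Python, converts the four Fahrenheit readings from the passage above to Celsius using the standard conversion formula; the variable names are illustrative only.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit reading to Celsius."""
    return (f - 32) * 5 / 9

# Readings from the passage above, in degrees Fahrenheit.
t30, t40, t60, t70 = 30.0, 40.0, 60.0, 70.0

# Differences: equal in Fahrenheit, and still equal after conversion.
diff_f = (t40 - t30, t70 - t60)
diff_c = (
    fahrenheit_to_celsius(t40) - fahrenheit_to_celsius(t30),
    fahrenheit_to_celsius(t70) - fahrenheit_to_celsius(t60),
)

# Ratios: 60/30 is 2 in Fahrenheit, but the same two temperatures
# give a different (here negative) ratio once expressed in Celsius.
ratio_f = t60 / t30
ratio_c = fahrenheit_to_celsius(t60) / fahrenheit_to_celsius(t30)

print(diff_f, diff_c)
print(ratio_f, ratio_c)
```

Because the ratio changes when only the unit and zero point change, it carries no information about the temperatures themselves, whereas the equality of the two intervals is preserved.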
Interval scales provide more powerful measurement than ordinal scales because the interval scale
also incorporates the concept of equality of interval.
As such more powerful statistical measures can be used with interval scales.
Mean is the appropriate measure of central tendency, while standard deviation is the most
widely used measure of dispersion. Product moment correlation techniques are
appropriate and the generally used tests for statistical significance are the ‘t’ test and ‘F’ test.
Ex: if we are measuring the performance of 3 students A, B &C on an interval scale and we get
the score like 1,3,7 then it can be graphically depicted as follows
4. Ratio scale
Ratio scales have an absolute or true zero of measurement.
The term ‘absolute zero’ is not as precise as it was once believed to be. We can conceive
of an absolute zero of length and similarly we can conceive of an absolute zero of time.
For example, the zero point on a centimeter scale indicates the complete absence of
length or height.
The number of minor traffic-rule violations and the number of incorrect letters in a page
of type script represent scores on ratio scales. Both these scales have absolute zeros and
as such all minor traffic violations and all typing errors can be assumed to be equal in
significance.
With ratio scales involved one can make statements like “Jyoti’s” typing performance
was twice as good as that of “Reetu.” The ratio involved does have significance and
facilitates a kind of comparison which is not possible in case of an interval scale.
Ratio scale represents the actual amounts of variables. Measures of physical dimensions
such as weight, height, distance, etc. are examples.
Generally, all statistical techniques are usable with ratio scales and all manipulations that
one can carry out with real numbers can also be carried out with ratio scale values.
Multiplication and division can be used with this scale but not with other scales
mentioned above.
Geometric and harmonic means can be used as measures of central tendency and
coefficients of variation may also be calculated.
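As a rough illustration of the measures mentioned for ratio scales, the following Python sketch uses the standard library `statistics` module. The sales figures are hypothetical and were chosen only so that the arithmetic is easy to follow.

```python
from statistics import geometric_mean, harmonic_mean, mean, stdev

# Hypothetical yearly sales (in units) of one product; a value of zero
# would mean no sales at all, so the scale has a true zero.
sales = [100.0, 200.0, 400.0]

# Ratio statements are meaningful: sales in the third year are twice
# the sales in the second year.
growth_last_year = sales[2] / sales[1]

gm = geometric_mean(sales)       # cube root of 100 * 200 * 400
hm = harmonic_mean(sales)        # 3 / (1/100 + 1/200 + 1/400)
cv = stdev(sales) / mean(sales)  # coefficient of variation (sample based)

print(growth_last_year, gm, hm, cv)
```

All three measures rely on multiplication or division of the raw values, which is why they are reserved for ratio-scaled data.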
Thus, proceeding from the nominal scale (the least precise type of scale) to the ratio scale (the most
precise), increasingly relevant information is obtained. If the nature of the variables permits, the
researcher should use the scale that provides the most precise description. Researchers in the
physical sciences have the advantage of being able to describe variables in ratio scale form, but the behavioural
sciences are generally limited to describing variables in interval scale form, a less precise type of
measurement.
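The progression from nominal to ratio scale can be summarized as a small lookup structure. The Python sketch below collects the statistics these notes name for the ordinal, interval and ratio levels; the entry for the nominal level (frequency count and mode) is an assumption based on standard practice, and the function name is hypothetical.

```python
# Statistics named in these notes for each scale level; the nominal entry
# (frequency count and mode) is an assumption based on standard practice.
SCALE_ORDER = ["nominal", "ordinal", "interval", "ratio"]
NEW_STATISTICS = {
    "nominal": ["frequency count", "mode"],
    "ordinal": ["median", "percentiles", "rank correlation"],
    "interval": ["mean", "standard deviation", "product moment correlation"],
    "ratio": ["geometric mean", "harmonic mean", "coefficient of variation"],
}

def permissible_statistics(scale):
    """Return every statistic usable at the given scale level and below."""
    level = SCALE_ORDER.index(scale)
    allowed = []
    for name in SCALE_ORDER[: level + 1]:
        allowed.extend(NEW_STATISTICS[name])
    return allowed

print(permissible_statistics("ordinal"))
```

Each level inherits everything permitted at the levels below it, which mirrors the statement that the ratio scale possesses all the properties of the nominal, ordinal and interval scales.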
ATTITUDE MEASUREMENT SCALES
1. LIKERT’S SCALE
2. SEMANTIC DIFFERENTIAL SCALE
3. THURSTONE SCALE
4. MULTI- DIMENSIONAL SCALING
Attitude may be defined as the degree of positive or negative affect associated with some
psychological object. It is a pre-disposition of the individuals to evaluate some object or symbol
or aspect of his world in a favourable or unfavourable manner. Attitude comprises three
components.
For example, an individual having a favourable attitude towards a product may not buy it
because of economic considerations. For the purpose of marketing decision, the attitude
behaviour relationship relates to measuring of cognitive and affective components and being able
to predict future behaviour.
1. Likert Scale – It is a widely used rating scale that requires the respondents to indicate a
degree of agreement or disagreement with each of a series of statements about the stimulus
objects. Typically, each scale item has five response categories ranging from “strongly
disagree” to “strongly agree”.
Example of Likert Scale
Listed below are different opinions about Spar Hypermarket. Please indicate how strongly
you agree or disagree with each by using the following scale:
-2 to + 2 or 1 to 5
-2 or 1= Strongly disagree
-1 or 2=Disagree
0 or 3=Neither agree nor disagree
1 or 4=Agree
2 or 5=Strongly agree
Score
Total Score 27
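A minimal Python sketch of how a summated Likert score could be computed under the two equivalent codings described above. The six responses are hypothetical and are not the data behind the total shown in the notes; the helper name `to_one_to_five` is likewise illustrative.

```python
def to_one_to_five(score):
    """Map a response coded -2..+2 to the equivalent 1..5 coding."""
    if score not in (-2, -1, 0, 1, 2):
        raise ValueError("score must be an integer from -2 to 2")
    return score + 3

# Hypothetical responses of one respondent to six statements,
# coded from -2 (strongly disagree) to +2 (strongly agree).
responses = [2, 1, 0, -1, 2, 1]

# The respondent's total score is the sum over all statements.
total_minus2_to_2 = sum(responses)
total_1_to_5 = sum(to_one_to_five(r) for r in responses)

print(total_minus2_to_2, total_1_to_5)
```

The two totals differ only by a constant (3 times the number of statements), so either coding ranks respondents in the same order.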