
BUSINESS RESEARCH METHODS

UNIT-IV
Data Preparation:
• A set of methods and techniques used to obtain information and insights from data
• Helps avoid erroneous judgments and conclusions
• Can constructively influence the research objectives and the research design

Preparing the Data for Analysis:

• Processing and analysis of the data

• Data editing

• Coding

• Statistically adjusting the data

PROCESSING AND ANALYSIS OF DATA:


Processing: Editing, coding, classification and tabulation of collected data so that they are amenable to analysis.

Analysis: Computation of certain measures, along with searching for patterns of relationship that exist among the data groups.

Data Editing:
• A process of examining the collected raw data to detect errors and omissions and to correct these when possible.

• Identifies omissions, ambiguities, and errors in responses.

• Conducted in the field by the interviewer and field supervisor, and by the analyst prior to data analysis.

• Involves careful scrutiny of the completed questionnaires/schedules.

• Ensures that data are accurate, consistent, uniform, complete and well arranged to facilitate coding and tabulation.

(i) Field editing

(ii) Central editing

Field Editing:

• While data collection is in progress, the supervisor is responsible for ensuring that correct information is collected.

• Normally done on the same day or the next day. Here, two types of problems may be avoided: faulty recording of answers to questions and illegible handwriting, and at times translation problems as well.

Central Editing:

• Done at the office, mostly after data collection and at times during data collection.

• The editor must keep the following points in mind while editing the data:

– Be familiar with the instructions given to the interviewers and coders.

– While crossing out an original entry, use just one line, so that the original remains visible.

– Entries should be made in a distinctive colour and in a standard form.

– The editor should initial all answers which they change or supply.

– The editor's initials and the date of editing should be marked on each completed form.

Preparing the Data for Analysis:

Problems Identified with Data Editing

• Interviewer Error

• Omissions

• Ambiguity

• Inconsistencies

• Lack of Cooperation

• Ineligible Respondent

Coding:
The process of assigning numerals or other symbols to answers so that responses
can be put into a limited number of categories or classes.
(i) Hand coding

(ii) Computer coding.

• Coding closed-ended questions involves specifying how the responses are to be entered (illustrated in the sketch below).

• Open-ended questions are difficult to code:

– A lengthy list of possible responses is generated.
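To make the coding step concrete, here is a minimal Python sketch of computer coding for a closed-ended satisfaction question; the question, response labels and numeric codes are hypothetical, not taken from any particular study.

```python
# Hypothetical coding frame for a closed-ended question:
# each response category is assigned a numeral.
satisfaction_codes = {
    "Very dissatisfied": 1,
    "Dissatisfied": 2,
    "Neutral": 3,
    "Satisfied": 4,
    "Very satisfied": 5,
}

raw_responses = ["Satisfied", "Neutral", "Very satisfied", "Dissatisfied", ""]

# Apply the coding frame; blank or unexpected answers get a "missing" code (9).
coded = [satisfaction_codes.get(answer, 9) for answer in raw_responses]
print(coded)  # [4, 3, 5, 2, 9]
```

Open-ended answers would first have to be grouped into a (possibly lengthy) list of categories before a coding frame like this could be applied.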

Preparing the Data for Analysis:


Classification:
The process of arranging data in groups or classes on the basis of common
characteristics.

(i) According to attributes.

Descriptive – sex, literacy, honesty.

Numerical – Weight, height, income.

(ii) According to class-intervals.

• Codebook Construction:

– A codebook or coding scheme contains each variable in the study and specifies the application of coding rules to the variable.

• Coding closed questions:

– Pre-coding:

• Helpful for manual data entry

• Codes for variable categories are accessible directly from the questionnaire
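A codebook can also be represented directly in software. The sketch below shows one possible structure in Python; the variable names, codes and column positions are invented for illustration.

```python
# Hypothetical codebook: one entry per variable, recording the label,
# level of measurement, valid codes and record position used for keying.
codebook = {
    "Q3_gender": {
        "label": "Gender of respondent",
        "type": "nominal",
        "codes": {1: "Male", 2: "Female", 9: "Not stated"},
        "column": "12",
    },
    "Q7_income": {
        "label": "Monthly household income (class interval)",
        "type": "ordinal",
        "codes": {1: "< 20,000", 2: "20,000-49,999", 3: ">= 50,000", 9: "Missing"},
        "column": "18-19",
    },
}

# A coder (or a data-entry program) consults the codebook to decide what to key in.
print(codebook["Q3_gender"]["codes"][2])  # Female
```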

Types of Content:

– Syntactical units can be words, phrases, sentences or paragraphs.

– Referential units are described by words, phrases and sentences; they may be objects, events or persons.

– Propositional units are assertions about an object, event or person.

– Thematic units are topics contained within (and across) texts.

– Don’t know (DK) responses:

– A special problem for data preparation.

– Dealing with undesired DK responses.

– Missing data:

– Missing data are information from a participant or case that is not available for one or more variables of interest.

– Mechanisms for missing data:

– MCAR – data missing completely at random

– MAR – data missing at random

– NMAR – data not missing at random
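As a rough illustration of handling DK and missing responses before analysis, here is a small Python/pandas sketch; the data, the DK code of 9, and the two treatments shown are assumptions for the example, and the appropriate treatment in practice depends on whether the data are MCAR, MAR or NMAR.

```python
import numpy as np
import pandas as pd

# Hypothetical survey extract: 9 codes a "don't know" answer, NaN marks a blank.
df = pd.DataFrame({
    "age":    [25, 34, np.nan, 41, 29],
    "rating": [4, 9, 3, np.nan, 5],
})

# Treat the undesired DK code as missing before any analysis.
df["rating"] = df["rating"].replace(9, np.nan)

# Two common but simplistic treatments of missing data:
listwise = df.dropna()                               # listwise deletion
imputed = df.fillna(df.mean(numeric_only=True))      # mean imputation
print(listwise, imputed, sep="\n\n")
```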

Data entry

• Keyboarding

• Database development

• Spreadsheet

• Optical recognition

– Optical character recognition (OCR)

– Optical mark recognition (OMR)


– Optical scanning

• Voice recognition

• Digital

• Barcode

Tabulation:
The process of summarizing raw data and displaying the same in compact form
(form of statistical table) for further analysis. The process of arranging the data
in some kind of concise or logical order.

(i) Hand

(ii) Mechanical/Electronic devices
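To show what mechanical/electronic tabulation can look like, here is a minimal pandas sketch producing a simple (one-way) table and a cross (two-way) table from hypothetical coded responses.

```python
import pandas as pd

# Hypothetical coded responses: gender of respondent and preferred brand.
data = pd.DataFrame({
    "gender": ["M", "F", "F", "M", "F", "M", "F"],
    "brand":  ["A", "A", "B", "B", "A", "A", "B"],
})

simple_table = data["brand"].value_counts()                              # one-way table
cross_table = pd.crosstab(data["gender"], data["brand"], margins=True)   # two-way table
print(simple_table, cross_table, sep="\n\n")
```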

Validity of Data:

• Estimating the values of unknown parameters of the population and testing hypotheses for drawing inferences.

Category:

Inferential analysis / statistical analysis – concerned with the various tests of significance for testing hypotheses, in order to determine with what validity the data can be said to indicate some conclusion.

- Estimation of population values.

- The task of drawing inferences and conclusions is performed.

Quantitative and Qualitative Data Analysis:

• Quantitative data – expressed as numbers

• Qualitative data – difficult to measure sensibly as numbers, e.g. counting the number of words to measure dissatisfaction

• Quantitative analysis – numerical methods to ascertain size, magnitude, amount

• Qualitative analysis – expresses the nature of elements and is represented as themes, patterns, stories

• Be careful how you manipulate data and numbers!

Simple quantitative analysis:

• Averages

– Mean: add up values and divide by number of data points

– Median: middle value of data when ranked

– Mode: figure that appears most often in the data

• Percentages

• Graphical representations give overview of data

[Charts omitted: two plots of the number of errors made per user, and a breakdown of Internet use (< once a day, once a day, once a week, 2 or 3 times a week, once a month).]
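As a small worked example of the averages and percentages listed above, the Python sketch below uses a hypothetical series of error counts like the ones charted.

```python
import statistics

# Hypothetical "number of errors made" per user, as in the charts above.
errors = [2, 4, 3, 5, 2, 8, 2, 6, 4, 3]

print("mean:  ", statistics.mean(errors))    # add up values, divide by their number -> 3.9
print("median:", statistics.median(errors))  # middle value when ranked -> 3.5
print("mode:  ", statistics.mode(errors))    # value that appears most often -> 2

# Percentage of users who made more than 4 errors.
share = 100 * sum(e > 4 for e in errors) / len(errors)
print(f"{share:.0f}% of users made more than 4 errors")  # 30%
```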

Simple qualitative analysis:

• Unstructured – not directed by a script. Rich but not replicable.

• Structured – tightly scripted, often like a questionnaire. Replicable but may lack richness.

• Semi-structured – guided by a script, but interesting issues can be explored in more depth. Can provide a good balance between richness and replicability.

Simple qualitative analysis:

Recurring patterns or themes:


– Emergent from data, dependent on observation framework if used

Categorizing data:

– Categorization scheme may be emergent or pre-specified

Looking for critical incidents:

– Helps to focus in on key events
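Once qualitative material has been hand-coded, recurring themes and critical incidents can be tallied very simply; the Python sketch below uses hypothetical theme codes.

```python
from collections import Counter

# Hypothetical theme codes assigned to interview excerpts during coding.
coded_excerpts = [
    "price", "service", "price", "delivery",
    "service", "price", "critical_incident:refund_refused", "delivery",
]

# Frequent codes suggest recurring themes; rare ones may flag critical incidents.
theme_counts = Counter(coded_excerpts)
print(theme_counts.most_common())
```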

Tools of data analysis:

• Descriptive and causal analyses

– Univariate analysis

• Measures of central tendency, measures of skewness, measures of dispersion

– Bivariate analysis

• Correlation

• Regression

• ANOVA

– Multivariate analysis

• Multiple correlation and regression, discriminant analysis, MANOVA, cluster analysis, factor analysis, canonical analysis

Bivariate analysis:

• Bivariate analysis focuses on the relationship between two variables.

• Looks at associations/relationships between two variables.

• Looks at measures of the strength of the relationship between two variables.

• Tests hypotheses about relationships between two nominal or ordinal level variables.

CHI-SQUARE TEST:
• Used to judge the significance of population variance

- To test the goodness of fit.

- To test the significance of association between two attributes.

- To test the homogeneity or the significance of population variance.

Tests:

1. Test of goodness of fit – to see how well the assumed theoretical distribution fits the observed data.

2. Test of independence – to explain whether or not two attributes are associated.

Conditions:

1. Observations recorded and used are collected on a random basis.

2. All the items in a sample must be independent.

3. No group should contain very few items, i.e., fewer than 10.

4. The overall number of items must be reasonably large.

5. The constraints must be linear.

Steps:

1. Calculate the expected frequencies on the basis of the given hypothesis or of the null hypothesis. Expected frequency of any cell = [(row total for the row of that cell) × (column total for the column of that cell)] / grand total.

2. Obtain the difference between observed and expected frequencies and find the squares of such differences: (Oij − Eij)².

3. Divide the quantity obtained by the corresponding expected frequency: (Oij − Eij)²/Eij.

4. Find the summation, i.e., Σ (Oij − Eij)²/Eij, which is the chi-square value.
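These steps can be checked against a library routine. The sketch below uses scipy.stats.chi2_contingency on a hypothetical 2×2 table; correction=False keeps the computation equal to the Σ(O − E)²/E formula above.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: gender (rows) x brand preference (columns).
observed = np.array([[30, 20],
                     [25, 45]])

# correction=False so the statistic matches the plain sum of (O - E)^2 / E.
chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)

print("expected frequencies:\n", expected)  # (row total x column total) / grand total
print("chi-square:", round(chi2, 3), " df:", dof, " p-value:", round(p_value, 4))
# A p-value below the chosen significance level would lead us to reject independence.
```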

Correlation:

• Correlation is the study of linear relationship between two variables

• Types of correlation

– On the basis of direction

• Positive correlation, Negative correlation

– On the basis of number of variables

• Simple correlation, Multiple correlation, Partial correlation

– On the basis of ratio of change direction

• Linear correlation, Non linear correlation

• Karl Pearson's coefficient of correlation: measures the degree of relationship between two variables.

• Multiple correlation analysis: analysis of the relationship between a dependent variable and two or more independent variables.

• Partial correlation analysis: measures the relation between a dependent variable and a particular independent variable while holding all other variables constant, i.e., the effect of that independent variable on the dependent variable.

• Simple regression analysis: determination of a statistical relationship between two variables (a dependent and an independent variable).
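For illustration, the sketch below computes Pearson's r and fits a simple regression line with scipy.stats; the paired observations are hypothetical.

```python
from scipy import stats

# Hypothetical paired observations: advertising spend (x) and sales (y).
x = [2, 4, 5, 7, 8, 10]
y = [20, 27, 32, 40, 41, 52]

# Karl Pearson's coefficient of correlation.
r, p_value = stats.pearsonr(x, y)
print("r =", round(r, 3), " p =", round(p_value, 4))

# Simple regression: y = a + b * x.
fit = stats.linregress(x, y)
print("intercept a =", round(fit.intercept, 2), " slope b =", round(fit.slope, 2))
```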

ANALYSIS OF VARIANCE AND CO-VARIANCE:

• ANOVA is a useful technique in the fields of economics, biology, psychology, sociology, business/industry and other disciplines.

• It tests for differences among the means of the populations by examining the amount of variation within each of the samples, relative to the amount of variation between the samples.

• One-way ANOVA – considers only one factor; the factor is of interest because several possible types of samples can occur within it, and we then determine whether there are differences within that factor.

• Two-way ANOVA – considers two factors.

• ANOCOVA – the influence of uncontrolled variables is usually removed by a simple linear regression method, and the residual sums of squares are used to provide variance estimates, which in turn are used to make tests of significance.
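A minimal one-way ANOVA sketch with scipy.stats.f_oneway, using hypothetical sales figures for one factor (display layout) with three samples:

```python
from scipy.stats import f_oneway

# Hypothetical weekly sales under three display layouts (a single factor).
layout_a = [28, 32, 30, 27, 31]
layout_b = [35, 38, 34, 36, 33]
layout_c = [29, 30, 28, 31, 27]

# Compares variation between the samples with variation within each sample.
f_stat, p_value = f_oneway(layout_a, layout_b, layout_c)
print("F =", round(f_stat, 2), " p =", round(p_value, 4))
# A small p-value suggests that at least one layout mean differs from the others.
```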

MULTIVARIATE ANALYSIS:

• Multivariate techniques are appropriate for analyzing data when there are two or more measurements on each observation and the variables are to be analyzed simultaneously.

• These techniques focus upon, and bring out in bold relief, the structure of simultaneous relationships among three or more phenomena.

VARIABLES:

• Explanatory variables
• Criterion variable

• Observed variables

• Latent variables

• Discrete variable and continuous variable

• Dummy variable (Pseudo variable)

CLASSIFICATION:

• Dependence methods

– Multiple regression

– Discriminant analysis

– Multivariate analysis of variance (MANOVA)

– Canonical analysis

• Interdependence methods

– Factor analysis

– Cluster analysis

– Multidimensional scaling

– Latent structure analysis

Cluster Analysis:

Cluster analysis is a class of techniques used to classify objects or cases into relatively homogeneous groups called clusters. Objects in each cluster tend to be similar to each other and dissimilar to objects in the other clusters. Cluster analysis is also called classification analysis or numerical taxonomy.

• A cluster consists of variables that correlate highly with one another and have comparatively low correlations with variables in other clusters.

• The objective is to determine how many mutually exclusive and exhaustive groups or clusters, based on similarities of profiles among entities, really exist in the population, and then to state the composition of such groups.

Statistics Associated with Cluster Analysis:

• Agglomeration schedule. An agglomeration schedule gives information on the objects or cases being combined at each stage of a hierarchical clustering process.

• Cluster centroid. The cluster centroid is the mean values of the variables for all the cases or objects in a particular cluster.

• Cluster centers. The cluster centers are the initial starting points in non-hierarchical clustering. Clusters are built around these centers, or seeds.

• Cluster membership. Cluster membership indicates the cluster to which each object or case belongs.
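The non-hierarchical case can be sketched with scikit-learn's KMeans; the customer profiles and the choice of two clusters are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customer profiles: [annual spend in thousands, store visits per month].
X = np.array([[2, 1], [3, 2], [2, 2],     # lighter buyers
              [9, 8], [10, 9], [9, 7]])   # heavier buyers

# Non-hierarchical clustering built around k = 2 cluster centers (seeds).
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print("cluster membership:", km.labels_)            # cluster to which each case belongs
print("cluster centroids:\n", km.cluster_centers_)  # mean profile of each cluster
```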
Applications:

• Marketing research (Segmentation, product positioning, NPD, test


marketing)

• Social network analysis

• Image Segmentation

• Data Mining

• Search Result Grouping

• Grouping of Shopping Items

• Petroleum Geology
• Physical Geography

• Crime Analysis

Discriminant Analysis:

• Discriminant analysis helps to identify the independent variables that discriminate between the groups of a categorically scaled dependent variable of interest.

• It uses a linear combination of the independent variables.

• Independent variables measured on an interval or ratio scale discriminate between the groups of interest to the study.

• Discriminant analysis requires prior knowledge of the cluster or group membership for each object or case included, in order to develop the classification rule. In contrast, in cluster analysis there is no a priori information about the group or cluster membership for any of the objects: groups or clusters are suggested by the data, not defined a priori.
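A minimal sketch with scikit-learn's LinearDiscriminantAnalysis; the data and group labels (buyer vs non-buyer) are hypothetical, and note that group membership must be known in advance, unlike in cluster analysis.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical cases: [income in thousands, family size], with known group
# membership (1 = buyer, 0 = non-buyer) supplied a priori.
X = np.array([[45, 2], [50, 3], [60, 4], [20, 1], [25, 2], [30, 1]])
y = np.array([1, 1, 1, 0, 0, 0])

lda = LinearDiscriminantAnalysis().fit(X, y)

print("discriminant weights:", lda.coef_)              # linear combination of the IVs
print("predicted group for [40, 2]:", lda.predict([[40, 2]]))
```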
Applications:

• Classifying and describing the subject

• Tool for improving the quality

• Determine descriptive variables

• Serves as check for diagnoses

• Gauges the distance between groups

Conjoint Analysis:

• Conjoint analysis attempts to determine the relative importance consumers attach to salient attributes and the utilities they attach to the levels of attributes.

• The respondents are presented with stimuli that consist of combinations of attribute levels and asked to evaluate these stimuli in terms of their desirability.

• Conjoint procedures attempt to assign values to the levels of each attribute, so that the resulting values or utilities attached to the stimuli match, as closely as possible, the input evaluations provided by the respondents.
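One common way (among several) to estimate such utilities is dummy-coded least-squares regression of the ratings on the attribute levels; the sketch below does this with NumPy for one hypothetical respondent and a 2 (brand) × 2 (price) design.

```python
import numpy as np

# Hypothetical full-profile design matrix.
# Columns: intercept, brand = B (vs A), price = high (vs low).
X = np.array([
    [1, 0, 0],   # brand A, low price
    [1, 0, 1],   # brand A, high price
    [1, 1, 0],   # brand B, low price
    [1, 1, 1],   # brand B, high price
])
ratings = np.array([9, 5, 7, 2])  # one respondent's desirability ratings

# Least-squares coefficients serve as estimated part-worth utilities.
utilities, *_ = np.linalg.lstsq(X, ratings, rcond=None)
print("base utility:", round(utilities[0], 2))
print("part-worth of brand B vs A:", round(utilities[1], 2))
print("part-worth of high vs low price:", round(utilities[2], 2))
```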
