RM-4
ANALYSIS
• In research methodology, data preparation and analysis are crucial
steps that ensure the validity, reliability, and meaningful
interpretation of research findings.
• These steps help transform raw data into useful insights through
systematic processing and statistical evaluation.
DATA PREPARATION
Step 1: Data Collection
Editing is the process of reviewing and correcting collected data to ensure completeness, accuracy, and consistency.
Types of Editing
Types of Coding
• Pre-coding – Assigning codes before data collection, typically for structured questionnaires with fixed
response options. Example:
• Gender: Male = 1, Female = 2, Other = 3
• Post-coding – Done after data collection for open-ended responses by categorizing similar answers.
Example:
• Reasons for choosing a brand:
• Quality = 1
• Price = 2
• Availability = 3
• Thematic Coding – Used in qualitative research to identify recurring themes and patterns.
Coding Process
Basic Tools:
• Microsoft Excel / Google Sheets – Useful for small datasets and basic analysis.
• Google Forms / SurveyMonkey – Automatically collects and structures responses.
Statistical Software:
• SPSS – Used for structured data entry and advanced statistical analysis.
• Stata – Helpful for econometric and social science research.
• R / Python – For large datasets and advanced data processing.
A. Internal Validity
• Ensures that changes in the dependent variable are due to the independent variable and not other
factors.
• Example: In a study on a new teaching method, internal validity ensures that improvements in student
performance are due to the method, not external factors like prior knowledge.
B. External Validity
• Determines if the study results can be generalized to a broader population.
• Example: A study on employee motivation in one company may not apply to all industries.
C. Content Validity
• Checks if the research covers all relevant aspects of the topic.
• Example: A job satisfaction survey should include factors like salary, work environment, and job security.
D. Construct Validity
• Ensures that a test or tool measures the intended concept (construct).
• Example: A stress questionnaire should measure psychological stress, not just
physical fatigue.
E. Criterion Validity
• Compares study results with an external standard.
• Types:
• Concurrent validity: Comparing new measures with established ones (e.g., a new IQ test vs.
an existing IQ test).
• Predictive validity: Checking if a measure predicts future outcomes (e.g., SAT scores
predicting college performance).
QUALITATIVE VS. QUANTITATIVE DATA ANALYSIS
Qualitative and quantitative data analysis differ in their methods, purposes, and interpretations.
1. Quantitative Data Analysis
Quantitative analysis deals with numerical data and focuses on statistical or mathematical interpretations.
Characteristics:
✅ Uses numbers and measurable data
✅ Structured and objective
✅ Often uses large sample sizes
✅ Results are generalizable
Examples:
• A survey measuring customer satisfaction on a scale of 1-10.
• Analyzing student test scores to compare performance across schools.
2. Qualitative Data Analysis
Qualitative analysis deals with non-numerical data such as text, audio, video, and open-ended responses.
It focuses on understanding meanings, patterns, and themes.
Characteristics:
✅ Uses words, descriptions, and subjective interpretations
✅ Exploratory and flexible
✅ Often based on small, in-depth samples
✅ Results are context-specific
Methods of Qualitative Analysis:
• Thematic Analysis: Identifying patterns and themes in textual data.
• Content Analysis: Categorizing and coding words, phrases, or sentences.
• Narrative Analysis: Studying stories and personal experiences.
• Discourse Analysis: Examining how language is used in communication.
Examples:
• Analyzing interview responses on job satisfaction.
• Studying social media comments to understand customer opinions.
Key Differences: Qualitative vs. Quantitative
Aspect | Quantitative Analysis | Qualitative Analysis
Data Type | Numbers, statistics | Words, descriptions, themes
T-tests | Compares the means of two groups to check if they are significantly different. | Comparing male vs. female job satisfaction levels; testing the effectiveness of a new drug compared to a placebo.
Factor Analysis | Reduces a large number of variables into fewer underlying factors. | Identifying key dimensions of customer satisfaction; grouping psychological traits into broader personality factors.
Cluster Analysis | Groups similar observations into clusters based on shared characteristics. | Market segmentation (e.g., grouping customers by purchasing behavior); identifying different student learning styles.
Discriminant Analysis | Classifies data into predefined groups based on multiple predictors. | Predicting whether a customer is likely to default on a loan; classifying patients into risk categories for a disease.
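The t-test entry above can be made concrete with a short calculation. This is a minimal sketch that assumes Welch's version of the test (which does not require equal group variances); the ratings are hypothetical and the function name is illustrative.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two independent samples."""
    se2_a = variance(sample_a) / len(sample_a)  # squared standard error of mean A
    se2_b = variance(sample_b) / len(sample_b)  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df

# Hypothetical job-satisfaction ratings (1-10) for two groups of employees.
group_a = [7, 8, 6, 9, 7]
group_b = [5, 6, 5, 7, 4]
t_stat, dof = welch_t(group_a, group_b)
print(round(t_stat, 3), round(dof, 1))
```

The resulting t statistic is then compared with the critical value of the t distribution for the computed degrees of freedom to judge significance.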
• Exploratory factor analysis is used when researchers do not know the structure of relationships between variables.
• It helps discover underlying factors without predefined expectations.
• Example: Identifying dimensions of job satisfaction (e.g., work environment, salary, career growth).
• Linear discriminant analysis is used when groups are well-separated and assumptions of normality are met.
• It creates a linear boundary between groups.
• Example: Classifying students as pass or fail based on attendance, study hours, and test scores.
• Quadratic discriminant analysis is used when group separation is nonlinear (more flexible but requires more data).
• Example: Distinguishing between different types of cancer based on genetic markers.
Steps in Discriminant Analysis
Customer ID | Annual Income | Spending Score | Purchase Frequency | Loyalty Status (1 = Loyal, 0 = Non-loyal)
001 | 50,000 | 75 | 30 | 1
002 | 35,000 | 40 | 15 | 0
003 | 60,000 | 80 | 45 | 1
004 | 25,000 | 35 | 10 | 0
Step 2: Check Assumptions
D = b1X1 + b2X2 + ... + bnXn + C
Where:
• D = Discriminant score
• Xn = Independent variables
• bn = Coefficients
• C = Constant
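The discriminant score D defined above is a weighted sum of the predictors plus a constant, and a case is assigned to a group by comparing D with a cutoff. The sketch below applies this to the four customers in the table. The coefficients, the constant, and the cutoff of 0 are hypothetical values chosen for illustration; a real analysis would estimate them from a much larger sample.

```python
# Customer records from the table:
# (income in thousands, spending score, purchase frequency, loyalty status)
customers = {
    "001": (50, 75, 30, 1),
    "002": (35, 40, 15, 0),
    "003": (60, 80, 45, 1),
    "004": (25, 35, 10, 0),
}

# Hypothetical coefficients b1..b3 and constant C (assumptions, not estimates).
COEFFICIENTS = (0.02, 0.03, 0.04)
CONSTANT = -3.5

def discriminant_score(features):
    """D = b1*X1 + b2*X2 + b3*X3 + C."""
    return sum(b * x for b, x in zip(COEFFICIENTS, features)) + CONSTANT

def classify(features, cutoff=0.0):
    """Assign group 1 (loyal) when the score exceeds the cutoff, else group 0."""
    return 1 if discriminant_score(features) > cutoff else 0

predictions = {cid: classify(row[:3]) for cid, row in customers.items()}
for cid, row in customers.items():
    # customer, score, predicted group, observed group
    print(cid, round(discriminant_score(row[:3]), 2), predictions[cid], row[3])
```

Comparing predicted with observed groups, as in the last column of the printout, is the basis for checking classification accuracy.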
Field Application
Unlike classification, where groups are predefined, clustering finds hidden patterns in
data.
A. Hierarchical Clustering
• Creates a tree-like structure (dendrogram) that shows how data points are merged
into clusters.
• Two types:
• Agglomerative (Bottom-Up) → Starts with individual points and merges them into clusters.
• Divisive (Top-Down) → Starts with all points in one cluster and splits them into smaller
clusters.
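The agglomerative (bottom-up) procedure can be sketched in a few lines. This example assumes single linkage (cluster distance = distance between the closest members) and uses hypothetical 2-D observations; other linkage rules would change which clusters merge.

```python
import math

def single_linkage(cluster_a, cluster_b, points):
    """Distance between two clusters = distance between their closest members."""
    return min(math.dist(points[i], points[j]) for i in cluster_a for j in cluster_b)

def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start: every point is its own cluster
    merges = []  # merge history, i.e. the information a dendrogram draws
    while len(clusters) > n_clusters:
        pairs = [(a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))]
        a, b = min(
            pairs,
            key=lambda p: single_linkage(clusters[p[0]], clusters[p[1]], points),
        )
        merges.append((clusters[a], clusters[b]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges

# Hypothetical observations: two tight groups and one outlier.
points = [(1.0, 1.0), (1.5, 1.2), (8.0, 8.0), (8.3, 7.9), (20.0, 1.0)]
clusters, merges = agglomerative(points, n_clusters=2)
print(clusters)
print(merges)
```

The `merges` list records the order in which clusters were joined, which is what a dendrogram displays from bottom to top.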
• Unlike K-Means, where each point belongs to one cluster, this allows
points to belong to multiple clusters with probabilities.
• Example: Identifying customers who fall into multiple market
segments.
Steps in Cluster Analysis
Field Application
Customer ID | Annual Income | Spending Score
001 | 50,000 | 70
002 | 20,000 | 30
003 | 80,000 | 90
004 | 35,000 | 40
Process:
2. Clusters Formed:
   1. Cluster 1 (High Income, High Spending) → Luxury Buyers
   2. Cluster 2 (Low Income, Low Spending) → Budget Shoppers
   3. Cluster 3 (Mid Income, Mid Spending) → Occasional Buyers
3. Marketing Strategy:
   1. Target Cluster 1 with premium products.
   2. Offer discounts to Cluster 2 to increase spending.
   3. Improve engagement with Cluster 3.
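How customers end up in such clusters can be sketched with a small K-Means run on the four customers listed above. This is an illustration under stated assumptions, not the procedure behind the slide: K-Means is assumed as the method, features are min-max scaled, K = 3, and the first three customers serve as starting centroids.

```python
import math

# Customer data from the table: ID -> (annual income, spending score)
customers = {
    "001": (50_000, 70),
    "002": (20_000, 30),
    "003": (80_000, 90),
    "004": (35_000, 40),
}

def min_max_scale(rows):
    """Rescale each column to [0, 1] so income does not dominate the distance."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def k_means(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then recompute the centroids."""
    labels = []
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

points = min_max_scale(list(customers.values()))
# Assumption: the first three customers are the starting centroids (K = 3).
labels, centroids = k_means(points, centroids=list(points[:3]))
print(dict(zip(customers, labels)))
```

With these assumptions, customers 002 and 004 share a cluster while 001 and 003 each form their own. Which segment name fits each group remains a judgement for the analyst.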
MULTIPLE REGRESSION AND CORRELATION IN RESEARCH
What is Multiple Regression?
Example:
• A company wants to predict employee performance (Y) based on years of
experience (X1), education level (X2), and training hours (X3).
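The Multiple Regression Equation:
Y = b0 + b1X1 + b2X2 + b3X3 + ϵ
Where:
• Y = Dependent variable (employee performance)
• X1, X2, X3 = Independent variables (years of experience, education level, training hours)
• b0 = Intercept (constant)
• b1, b2, b3 = Regression coefficients
• ϵ = Error term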
B. Stepwise Regression
• Variables are entered one at a time, based on statistical significance.
• Example: Selecting the best predictors for student exam scores.
C. Hierarchical Regression
• Variables are entered in a predefined order based on theory.
• Example: Testing how social media influences consumer purchases while controlling for income.
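Whichever entry strategy is used, the coefficients of the multiple regression equation Y = b0 + b1X1 + b2X2 + b3X3 + ϵ are typically estimated by ordinary least squares. The sketch below solves the normal equations with the standard library only. The employee data are hypothetical and generated without an error term, so the known coefficients should be recovered; statistical software would also report standard errors and significance tests.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(rows, y):
    """Estimate b0..bk by least squares via the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 carries the intercept b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical employees: (years of experience X1, education level X2, training hours X3)
predictors = [(1, 1, 10), (3, 2, 5), (5, 1, 20), (7, 3, 15), (9, 2, 30), (4, 3, 25)]
# Performance generated from Y = 50 + 2*X1 + 3*X2 + 0.5*X3 with no error term.
performance = [50 + 2 * x1 + 3 * x2 + 0.5 * x3 for x1, x2, x3 in predictors]

b0, b1, b2, b3 = fit_ols(predictors, performance)
print(round(b0, 3), round(b1, 3), round(b2, 3), round(b3, 3))
```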
Correlation
• Correlation measures the strength and direction of the relationship between two variables.
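Strength and direction can both be read from a single coefficient. The following sketch computes Pearson's r, one common correlation measure, on hypothetical student data; values near +1 or -1 indicate a strong relationship and the sign gives the direction.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data for six students.
study_hours = [2, 4, 6, 8, 10, 12]
exam_scores = [55, 60, 68, 70, 82, 85]
absences = [9, 8, 6, 5, 3, 1]

r_scores = pearson_r(study_hours, exam_scores)  # positive: scores rise with hours
r_absences = pearson_r(study_hours, absences)   # negative: absences fall as hours rise
print(round(r_scores, 3), round(r_absences, 3))
```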
Types of Correlation
✅ Multidimensional scaling (MDS) is used when distances or dissimilarities between objects are known but explicit features are not.
Example:
• A marketing team wants to understand how customers perceive different brands. MDS can create a
map where brands placed closer together are more similar in perception.
Types of MDS
B. Non-Metric MDS
• Works with ranked similarities or dissimilarities instead of exact distances.
• Focuses on ordinal relationships (preserving order rather than exact values).
• Example: Mapping people's perceptions of different smartphone brands based
on survey rankings.
Steps in MDS Analysis
Object | A | B | C | D
A | 0 | 3 | 6 | 9
B | 3 | 0 | 5 | 8
C | 6 | 5 | 0 | 4
D | 9 | 8 | 4 | 0
Step 2: Choose Number of Dimensions (Typically 2 or 3)
• Fewer dimensions = easier visualization but potential loss of
accuracy.
• More dimensions = more accuracy but harder to interpret.
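The trade-off between dimensions and accuracy can be explored on the dissimilarity matrix for objects A to D above. This is a rough sketch of metric MDS that minimizes raw stress (the squared gap between map distances and given dissimilarities) by plain gradient descent. The starting layouts, learning rate, and iteration count are arbitrary assumptions, and dedicated software uses more refined algorithms.

```python
import math

# Dissimilarity matrix for objects A, B, C, D.
DISSIM = [
    [0, 3, 6, 9],
    [3, 0, 5, 8],
    [6, 5, 0, 4],
    [9, 8, 4, 0],
]

def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def stress(coords):
    """Raw stress: squared gap between map distances and the given dissimilarities."""
    n = len(coords)
    return sum(
        (distance(coords[i], coords[j]) - DISSIM[i][j]) ** 2
        for i in range(n) for j in range(i + 1, n)
    )

def mds(start, learning_rate=0.05, iterations=500):
    """Move the points downhill on the stress surface by gradient descent."""
    coords = [list(p) for p in start]
    n, dims = len(coords), len(coords[0])
    for _ in range(iterations):
        grads = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = distance(coords[i], coords[j]) or 1e-9  # avoid division by zero
                scale = 2 * (d - DISSIM[i][j]) / d
                for k in range(dims):
                    grads[i][k] += scale * (coords[i][k] - coords[j][k])
        for i in range(n):
            for k in range(dims):
                coords[i][k] -= learning_rate * grads[i][k]
    return coords

start_2d = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # arbitrary starting layout
start_1d = [(0.0,), (1.0,), (2.0,), (3.0,)]
map_2d, map_1d = mds(start_2d), mds(start_1d)
print("stress in 2 dimensions:", round(stress(map_2d), 3))
print("stress in 1 dimension: ", round(stress(map_1d), 3))
```

Comparing the two stress values shows how much fit is lost or gained by changing the number of dimensions. Gradient descent can stop in a local minimum, so results depend on the starting layout.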
Field Application
• Conjoint Analysis is a statistical technique used to understand how people value different
attributes of a product or service. It helps businesses and researchers determine which
features are most important in decision-making.
Example:
• A smartphone company wants to know whether customers value battery life, camera
quality, or price more when buying a phone.
Example: Conjoint Analysis for Fast Food Menu
A fast-food chain wants to launch a new burger and needs to decide on
pricing, patty type, and portion size.
• Attributes & Levels:
Attribute | Level 1 | Level 2 | Level 3
Price | $5 | $7 | $9
Attribute | Importance Score
Price | 40%
Patty Type | 35%
Portion Size | 25%
Conclusion: Customers care most about price, followed by patty type. The company should focus on offering a
competitively priced burger with preferred patty options.
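Importance scores like these are commonly derived from part-worth utilities: each attribute's importance is the range of its level utilities divided by the sum of ranges across attributes. The sketch below illustrates that calculation. The utilities are hypothetical and were chosen so that the result matches the 40% / 35% / 25% split; the level names for patty type and portion size are placeholders.

```python
# Hypothetical part-worth utilities per attribute level (assumed, not estimated).
part_worths = {
    "Price": {"$5": 0.40, "$7": 0.00, "$9": -0.40},
    "Patty Type": {"Level 1": 0.35, "Level 2": 0.00, "Level 3": -0.35},
    "Portion Size": {"Level 1": -0.25, "Level 2": 0.00, "Level 3": 0.25},
}

def attribute_importance(utilities):
    """Importance = an attribute's utility range as a share of all ranges, in percent."""
    ranges = {
        attr: max(levels.values()) - min(levels.values())
        for attr, levels in utilities.items()
    }
    total = sum(ranges.values())
    return {attr: 100 * r / total for attr, r in ranges.items()}

importance = attribute_importance(part_worths)
for attr, share in importance.items():
    print(f"{attr}: {share:.0f}%")
```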
Types of Conjoint Analysis
Attribute | Level 1 | Level 2 | Level 3
Price | $500 | $800 | $1200
Battery Life | 12 hours | 24 hours | 36 hours
Camera Quality | 12 MP | 24 MP | 48 MP
Step 2: Create Product Profiles
• Combine different attribute levels to generate product profiles.
• Example:
Product Option | Price | Battery Life | Camera Quality
A | $500 | 12 hours | 12 MP
B | $800 | 24 hours | 24 MP
C | $1200 | 36 hours | 48 MP
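Options A to C are three of the profiles that the attribute levels allow. As a sketch, the complete set (a full factorial design) can be generated by taking one level per attribute in every possible combination; whether the study used all of them is not stated, and studies often present respondents with only a subset.

```python
from itertools import product

attributes = {
    "Price": ["$500", "$800", "$1200"],
    "Battery Life": ["12 hours", "24 hours", "36 hours"],
    "Camera Quality": ["12 MP", "24 MP", "48 MP"],
}

# Full factorial design: every combination of one level per attribute.
profiles = [dict(zip(attributes, combo)) for combo in product(*attributes.values())]

print(len(profiles))  # 3 x 3 x 3 combinations
print(profiles[0])
```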
Step 3: Collect Responses
Attribute | Importance Score
Price | 50%
Field Application
H. Data Visualization
✅ Software Used: Tableau, Python (Matplotlib, Seaborn), R (ggplot2)
✅ Used For:
• Creating charts, heatmaps, histograms
• Making data insights easier to interpret
Choosing the Right Statistical Software
Need | Best Software