Data Preparation: March 6, 2010

The document outlines the key steps in preparing data for analysis: 1) checking questionnaires for completeness and quality, 2) editing unsatisfactory responses by returning to respondents, assigning missing values, or discarding respondents, 3) coding responses numerically, 4) transcribing data electronically, 5) cleaning data by checking for errors and inconsistencies, and 6) statistically adjusting the data through weighting, variable respecification, and scale transformation to make it representative and suitable for analysis.

Uploaded by

Atisha Juneja

Data Preparation

March 6, 2010
Data Preparation Process
Prepare Preliminary Plan of Data Analysis

Check Questionnaire

Edit

Code

Transcribe

Clean Data

Statistically Adjust the Data

Select Data Analysis Strategy

Questionnaire Checking
A questionnaire returned from the field may be unacceptable
for several reasons.
– Parts of the questionnaire may be incomplete.
– The pattern of responses may indicate that the
respondent did not understand or follow the instructions.
– The responses show little variance.
– One or more pages are missing.
– The questionnaire is received after the preestablished
cutoff date.
– The questionnaire is answered by someone who does not
qualify for participation.
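The checks above could be automated roughly as follows. This is a minimal sketch, not a prescribed procedure: the record fields (`answers`, `received`, `qualified`), the cutoff date, and the variance threshold are all hypothetical and do not come from the slides.

```python
from datetime import date
from statistics import pvariance

# Hypothetical thresholds; the slides do not specify any values.
CUTOFF = date(2010, 3, 1)
MIN_VARIANCE = 0.25

def check_questionnaire(q):
    """Return the reasons a returned questionnaire may be unacceptable."""
    problems = []
    answered = [a for a in q["answers"] if a is not None]
    if len(answered) < len(q["answers"]):
        problems.append("incomplete")
    # Near-identical answers to every item suggest little variance.
    if len(answered) >= 2 and pvariance(answered) < MIN_VARIANCE:
        problems.append("little variance")
    if q["received"] > CUTOFF:
        problems.append("late")
    if not q["qualified"]:
        problems.append("unqualified respondent")
    return problems

sample_q = {
    "answers": [4, 4, 4, None, 4],   # None marks an unanswered item
    "received": date(2010, 3, 5),
    "qualified": True,
}
print(check_questionnaire(sample_q))
```

A questionnaire that returns an empty list passes these automated checks; missing pages and misunderstood instructions would still need a manual review.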
Editing
Treatment of Unsatisfactory Responses
– Returning to the Field – The questionnaires with
unsatisfactory responses may be returned to the field,
where the interviewers recontact the respondents.
– Assigning Missing Values – If returning the questionnaires
to the field is not feasible, the editor may assign missing
values to unsatisfactory responses.
– Discarding Unsatisfactory Respondents – In this
approach, the respondents with unsatisfactory responses
are simply discarded.
Coding
Coding means assigning a code, usually a number, to each possible
response to each question. The code includes an indication of the
column position (field) and data record it will occupy.

Coding Questions

• Fixed field codes, which mean that the number of records for each
respondent is the same and the same data appear in the same
column(s) for all respondents, are highly desirable.
• If possible, standard codes should be used for missing data. Coding of
structured questions is relatively simple, since the response options
are predetermined.
• In questions that permit a large number of responses, each possible
response option should be assigned a separate column.
Coding
Guidelines for coding unstructured questions:
• Category codes should be mutually exclusive and collectively
exhaustive.
• Only a few (10% or less) of the responses should fall into the
“other” category.
• Category codes should be assigned for critical issues even if
no one has mentioned them.
• Data should be coded to retain as much detail as possible.
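The 10% guideline for the "other" category is easy to verify once open-ended responses have been coded. The sketch below assumes a hypothetical list of category codes in which code 9 stands for "other"; neither the codes nor the data come from the slides.

```python
from collections import Counter

# Hypothetical category codes assigned to 20 open-ended responses.
coded = [1, 2, 2, 3, 1, 9, 2, 1, 3, 2, 1, 2, 3, 1, 2, 1, 3, 2, 1, 2]
OTHER = 9  # assumed code for the "other" category

def other_share(codes, other_code=OTHER):
    """Fraction of responses that fell into the 'other' category."""
    return Counter(codes)[other_code] / len(codes)

share = other_share(coded)
print(f"'other' share: {share:.0%}")
if share > 0.10:
    print("Consider adding category codes to reduce the 'other' share.")
```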
Codebook
A codebook contains coding instructions and the necessary
information about variables in the data set. A codebook
generally contains the following information:
• column number
• record number
• variable number
• variable name
• question number
• instructions for coding
Coding Questionnaires
• The respondent code and the record number appear on each
record in the data.
• The first record contains the additional codes: project code,
interviewer code, date and time codes, and validation code.
• It is a good practice to insert blanks between parts.
An Illustrative Computer File
                        Fields (Column Numbers)
Records        1-3    4    5-6    7-8    ...    26 ... 35     ...    77

Record 1       001    1    31     01            6544234553           5
Record 11      002    1    31     01            5564435433           4
Record 21      003    1    31     01            4655243324           4
Record 31      004    1    31     01            5463244645           6
Record 2701    271    1    31     55            6652354435           5
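A fixed-field record like those in the illustrative file can be read by slicing each line at known column positions. The sketch below is illustrative only: the column positions follow the table, but the field names (`respondent`, `record`, `project`, `interviewer`, `ratings`, `last_item`) and the 80-column line width are assumptions.

```python
# Assumed layout: 1-based, inclusive column positions taken from the table.
# The field names are guesses made for illustration.
FIELDS = {
    "respondent": (1, 3),
    "record": (4, 4),
    "project": (5, 6),
    "interviewer": (7, 8),
    "ratings": (26, 35),
    "last_item": (77, 77),
}

def build_line(values, width=80):
    """Place each value at its column position in a blank fixed-width line."""
    line = [" "] * width
    for name, (start, end) in FIELDS.items():
        text = values[name]
        if len(text) != end - start + 1:
            raise ValueError(f"{name} must be {end - start + 1} characters wide")
        line[start - 1:end] = text
    return "".join(line)

def parse_line(line):
    """Slice a fixed-width line back into named fields."""
    return {name: line[start - 1:end] for name, (start, end) in FIELDS.items()}

row = {
    "respondent": "001", "record": "1", "project": "31",
    "interviewer": "01", "ratings": "6544234553", "last_item": "5",
}
line = build_line(row)
parsed = parse_line(line)
print(parsed)
```

Because the same data always occupy the same columns, one layout dictionary is enough to parse every respondent's record.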
Data Transcription
Raw Data
– Transcription methods: CATI/CAPI, keypunching via CRT terminal, mark sense forms, optical scanning, computerized sensory analysis.
– Verification: correct keypunching errors.
– Storage: computer memory, disks, magnetic tapes.
Transcribed Data
Data Cleaning
Consistency Checks

Consistency checks identify data that are out of range,
logically inconsistent, or have extreme values.
– Computer packages like SPSS, SAS, EXCEL and MINITAB can
be programmed to identify out-of-range values for each
variable and print out the respondent code, variable code,
variable name, record number, column number, and
out-of-range value.
– Extreme values should be closely examined.
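An out-of-range check of the kind described above can also be written in a few lines of general-purpose code. This is a sketch under stated assumptions: the variable names and valid ranges are hypothetical, and the report includes only the respondent code, variable, and offending value.

```python
# Hypothetical valid ranges for each variable (inclusive).
VALID_RANGES = {"q1": (1, 7), "q2": (1, 7), "age": (18, 99)}

def out_of_range(records):
    """Return (respondent code, variable, value) for every out-of-range value."""
    flagged = []
    for rec in records:
        for var, (low, high) in VALID_RANGES.items():
            value = rec.get(var)
            if value is not None and not (low <= value <= high):
                flagged.append((rec["respondent"], var, value))
    return flagged

records = [
    {"respondent": "001", "q1": 6, "q2": 5, "age": 34},
    {"respondent": "002", "q1": 9, "q2": 4, "age": 17},
]
flagged = out_of_range(records)
for item in flagged:
    print(item)
```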
Data Cleaning
Treatment of Missing Responses
• Substitute a Neutral Value – A neutral value, typically the mean
response to the variable, is substituted for the missing responses.
• Substitute an Imputed Response – The respondent's pattern of
responses to other questions is used to impute or calculate a
suitable response to the missing questions.
• Casewise Deletion – Cases, or respondents, with any missing
responses are discarded from the analysis.
• Pairwise Deletion – Instead of discarding all cases with any missing
values, the researcher uses only the cases or respondents with
complete responses for each calculation.
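Three of the treatments above can be compared on a toy data set. The sketch assumes hypothetical variables `q1` and `q2` with `None` marking a missing response. Imputation from other answers is omitted because it depends on a model the slides do not specify.

```python
from statistics import mean

# Toy data; None marks a missing response.
data = [
    {"q1": 5, "q2": 3},
    {"q1": None, "q2": 4},
    {"q1": 3, "q2": None},
    {"q1": 4, "q2": 5},
]

def mean_substitute(rows, var):
    """Neutral value: replace missing responses with the variable's mean."""
    fill = mean(r[var] for r in rows if r[var] is not None)
    return [fill if r[var] is None else r[var] for r in rows]

def casewise(rows):
    """Casewise deletion: keep only respondents with no missing responses."""
    return [r for r in rows if all(v is not None for v in r.values())]

def pairwise_mean(rows, var):
    """Pairwise deletion: use every case that is complete for this calculation."""
    return mean(r[var] for r in rows if r[var] is not None)

print(mean_substitute(data, "q1"))   # missing q1 replaced by the mean of 5, 3, 4
print(len(casewise(data)))           # respondents left after casewise deletion
print(pairwise_mean(data, "q2"))     # mean of q2 over the three cases that answered it
```

Casewise deletion leaves two respondents here, while the pairwise calculation for `q2` still uses three, which shows how pairwise deletion preserves more of the sample.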
Statistically Adjusting the Data
Weighting

• In weighting, each case or respondent in the

database is assigned a weight to reflect its
importance relative to other cases or respondents.
• Weighting is most widely used to make the sample
data more representative of a target population on
specific characteristics.
• Yet another use of weighting is to adjust the sample
so that greater importance is attached to
respondents with certain characteristics.
Statistically Adjusting the Data
Use of Weighting for Representativeness

Years of Education       Sample        Population    Weight
                         Percentage    Percentage

Elementary School
  0 to 7 years              2.49          4.23        1.70
  8 years                   1.26          2.19        1.74

High School
  1 to 3 years              6.39          8.65        1.35
  4 years                  25.39         29.24        1.15

College
  1 to 3 years             22.33         29.42        1.32
  4 years                  15.02         12.01        0.80
  5 to 6 years             14.94          7.36        0.49
  7 years or more          12.18          6.90        0.57

Totals                    100.00        100.00
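The weights in the table are consistent with dividing each population percentage by the corresponding sample percentage. The sketch below reproduces them from the table's figures; the dictionary keys are shorthand labels introduced here for illustration.

```python
# Percentages copied from the table; keys are shorthand labels.
sample_pct = {
    "Elementary, 0 to 7 years": 2.49, "Elementary, 8 years": 1.26,
    "High School, 1 to 3 years": 6.39, "High School, 4 years": 25.39,
    "College, 1 to 3 years": 22.33, "College, 4 years": 15.02,
    "College, 5 to 6 years": 14.94, "College, 7 years or more": 12.18,
}
population_pct = {
    "Elementary, 0 to 7 years": 4.23, "Elementary, 8 years": 2.19,
    "High School, 1 to 3 years": 8.65, "High School, 4 years": 29.24,
    "College, 1 to 3 years": 29.42, "College, 4 years": 12.01,
    "College, 5 to 6 years": 7.36, "College, 7 years or more": 6.90,
}

# Weight = population share / sample share, rounded as in the table.
weights = {
    group: round(population_pct[group] / sample_pct[group], 2)
    for group in sample_pct
}

for group, w in weights.items():
    print(f"{group:28s} {w:.2f}")
```

Groups that are under-represented in the sample receive weights above 1, and over-represented groups receive weights below 1.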
Statistically Adjusting the Data
Variable Respecification

• Variable respecification involves the transformation of data

to create new variables or modify existing variables.
• For example, the researcher may create new variables that are
composites of several other variables.
• Dummy variables are used for respecifying categorical
variables. The general rule is that to respecify a categorical
variable with K categories, K-1 dummy variables are needed.
Statistically Adjusting the Data
Variable Respecification

Product Usage     Original Variable     Dummy Variable Code
Category          Code                  X1     X2     X3

Nonusers          1                     1      0      0
Light users       2                     0      1      0
Medium users      3                     0      0      1
Heavy users       4                     0      0      0

Note that X1 = 1 for nonusers and 0 for all others. Likewise, X2 = 1 for
light users and 0 for all others, and X3 = 1 for medium users and 0 for all
others. In analyzing the data, X1, X2, and X3 are used to represent all
user/nonuser groups.
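The K-1 rule can be expressed as a small function. This sketch follows the table, using the original codes 1 to 4 and treating the last category (heavy users) as the all-zero reference group; the function name `dummy_code` is made up for this example.

```python
# Original variable codes from the table.
CATEGORIES = {1: "Nonusers", 2: "Light users", 3: "Medium users", 4: "Heavy users"}

def dummy_code(code, k=len(CATEGORIES)):
    """Return K-1 dummy variables; category k is the all-zero reference group."""
    return tuple(1 if code == j else 0 for j in range(1, k))

for code, label in CATEGORIES.items():
    x1, x2, x3 = dummy_code(code)
    print(f"{label:13s} {code}  X1={x1} X2={x2} X3={x3}")
```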
Statistically Adjusting the Data
Scale Transformation and Standardization

Scale transformation involves a manipulation of scale values
to ensure comparability with other scales or otherwise make
the data suitable for analysis.

A more common transformation procedure is
standardization. Standardized scores, $Z_i$, may be obtained as:

$Z_i = (X_i - \bar{X}) / s_X$

where $\bar{X}$ is the mean and $s_X$ is the standard deviation of the variable X.
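Standardization can be sketched as below. The slides do not say whether the standard deviation uses an n or n - 1 denominator; this example assumes the sample standard deviation (n - 1), and the scores are invented.

```python
from statistics import mean, stdev

def standardize(xs):
    """Return z-scores: each value minus the mean, divided by the std. deviation."""
    x_bar = mean(xs)
    s_x = stdev(xs)  # assumption: sample standard deviation (n - 1 denominator)
    return [(x - x_bar) / s_x for x in xs]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up responses
z = standardize(scores)
print([round(v, 2) for v in z])
```

After standardization the scores have a mean of 0 and a standard deviation of 1, which makes variables measured on different scales comparable.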
