Nursing Research Methods: Ph.D. in Nursing
Faculty Name
Date:
Subject Code: School of Nursing
Learning Objectives
At the end of this module, students will be able to:
Understand the meaning of data entry
Identify the ten most common problems in terms of the quality
standards of data
Avoid measurement error
Discuss the meaning of validity and reliability
Describe the process of data entry
Explain the basic approaches to data entry
Summary
References
List of Contents
Introduction
Meaning of data entry
Ten most common problems in terms of the quality standards of data
Measurement error
Meaning of validity and reliability
Process of data entry
Basic approaches to data entry
Summary
References
Introduction
After data have been collected from a representative sample of
the population, the next step is to analyze them to test the
research hypotheses. Data analysis is now routinely done with
software programs such as SPSS, SAS, STATPAK, SYSTAT, Excel,
and the like. All are user-friendly and interactive and have the
capability to seamlessly interface with different databases.
Excellent graphs and charts can also be produced through most of
these software programs.
Meaning of data entry
Data entry
Once the data have been returned from the respondents, they
need to be recorded in computer-readable form. This section
provides an overview of different approaches to data entry and
then discusses two of these approaches in more detail.
[Figure: Data entry, flow diagram of data analysis]
Data Preparation: Data Entry
The ten most common problems in terms of the quality standards
of data (contd.)
2. Questions may have accidentally been misprinted due to
technical or organizational imperfections, thereby preventing
respondents from giving appropriate answers.
3. Questions may have been skipped, or not reached, by the
respondents, either in a randomized fashion or in a systematic
way, which results in “gaps” in the data and can lead to
misleading results.
4. Respondents may give two or more responses when only one
answer was allowed, or questions may have been answered in
other unintended ways.
5. Certain data values may not correspond to the coding
specifications or range validation criteria.
6. Answers to open-ended questions may contain outlier codes,
that is, there may be respondents with codes which are
improbably low or high even though they could be valid
answers.
7. The values for certain data variables might not correspond to
the values of certain control variables. (For example, the value
of a control variable may state that a particular student did not
respond to a particular question set, whereas the data variables
for this question set indicate actual responses.)
8. Data from a respondent may contain inconsistent values
(that is, the values for two or more variables may not be in
accord).
9. Inconsistencies between data values from different
respondents who belong to a certain group may occur for
questions which are related to this group. (For example, for
students in the same class there may be different values for
variables which are related to the class.)
10. Inconsistencies may also occur between data values of
different but related datafiles or levels of aggregation.
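Several of the problems listed above can be screened for automatically once the data are in computer-readable form. The following is a minimal sketch, not a definitive procedure: the field names, the assumed valid age range (5 to 20), and the "not administered" code 9 are all hypothetical and would in practice come from the codebook.

```python
# Hypothetical records; field names and codes are illustrative assumptions.
records = [
    {"student": 1, "class_id": 7, "class_size": 30, "age": 9,
     "answered_set_b": 0, "b1": 9},
    {"student": 2, "class_id": 7, "class_size": 32, "age": 83,
     "answered_set_b": 0, "b1": 2},
]

VALID_AGE = range(5, 21)   # assumed range validation criterion (problem 5)
NOT_ADMINISTERED = 9       # assumed code for "question set not answered"


def check_records(records):
    """Return a list of (identifier, message) pairs for suspicious data."""
    problems = []
    for r in records:
        # Problem 5: value outside the range validation criteria.
        if r["age"] not in VALID_AGE:
            problems.append((r["student"], "age outside valid range"))
        # Problem 7: control variable says "not answered" but a response exists.
        if r["answered_set_b"] == 0 and r["b1"] != NOT_ADMINISTERED:
            problems.append((r["student"], "response contradicts control variable"))
    # Problem 9: a class-level variable should be identical within a class.
    sizes = {}
    for r in records:
        sizes.setdefault(r["class_id"], set()).add(r["class_size"])
    for class_id, values in sorted(sizes.items()):
        if len(values) > 1:
            problems.append((class_id, "class-level variable differs within class"))
    return problems


problems = check_records(records)
for identifier, message in problems:
    print(identifier, message)
```

Checks like these only flag candidates for review; deciding whether a flagged value is an error still requires going back to the original instrument.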
Measures to make quality/ standards of data
Measurement:
Reliability
Validity
Type of Measures
Validity
Theory-related Validity
Face validity
– participant believability
Content validity (observable)
– blueprint
– skills list
Construct validity (unobservable)
– group differences
– changes over time
– correlations/factor analysis
Criterion-related Validity
Concurrent
Measure two variables at the same point in time and correlate
them to demonstrate that measure 1 is measuring the same thing
as measure 2.
Predictive
Measure two variables, one now and one in the future, and
correlate them to demonstrate that measure 1 is predictive of
measure 2.
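The "correlate them" step in both forms of criterion-related validity can be sketched with a Pearson correlation. This is an illustrative sketch only: the two score lists are made-up data, and `pearson_r` is a helper written for this example rather than a function from any particular statistics package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of five respondents on a new instrument (measure 1)
# and on an established instrument (measure 2), taken at the same time.
measure_1 = [12, 15, 9, 20, 17]
measure_2 = [14, 16, 10, 22, 18]

r = pearson_r(measure_1, measure_2)
print(round(r, 2))
```

For predictive validity the computation is the same; only the timing differs, with measure 2 collected at a later point.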
Instrument Reliability
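Cronbach’s alpha
\[
\alpha = \frac{n}{n-1}\left(1 - \frac{\sum SD_{\text{items}}^{2}}{SD^{2}}\right)
\]
where n is the number of items, SD_items is the standard
deviation of each item, and SD is the standard deviation of the
total scores.
\[
SD = \sqrt{\frac{\sum (X - m)^{2}}{n-1}}
\]
where m is the mean of the scores X and, in this second formula,
n is the number of scores.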
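As a worked illustration, Cronbach's alpha in its usual form, alpha = n/(n-1) × (1 − sum of item variances / variance of total scores), can be computed in a few lines. This is a sketch under stated assumptions: the score matrix is hypothetical, and `cronbach_alpha` is a helper defined here for illustration.

```python
from statistics import variance  # sample variance, denominator n - 1


def cronbach_alpha(scores):
    """scores: one list of item scores per respondent."""
    n_items = len(scores[0])
    # Variance of each item across respondents.
    item_variances = [
        variance([respondent[i] for respondent in scores])
        for i in range(n_items)
    ]
    # Variance of each respondent's total score.
    total_variance = variance([sum(respondent) for respondent in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: five respondents, three items.
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(scores)
print(round(alpha, 2))
```

Here the items rise and fall together across respondents, so the total-score variance is large relative to the item variances and alpha is high.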
> 0.90: Excellent reliability, required for decision-making at
the individual level.
0.80: Good reliability, required for decision-making at the
group level.
0.70: Adequate reliability, close to unacceptable as there is
too much error in the data. Why?
Internal Consistency: Cronbach’s alpha
Person A: Internally consistent
Person B: Internally inconsistent
Reliability Values
Range: 0 to 1
Reported without negative signs, unlike correlations
Cohen’s kappa and Scott’s pi are typically lower, e.g. 0.50
or 0.60
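One reason Cohen's kappa tends to come out lower than other agreement figures is that it subtracts the agreement expected by chance. A minimal sketch with made-up ratings from two hypothetical raters:

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same cases."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected if both raters assigned categories independently.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten cases.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

percent_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
kappa = cohens_kappa(rater_a, rater_b)
print(percent_agreement, round(kappa, 2))
```

In this example the raters agree on 8 of 10 cases, yet kappa is noticeably lower because about half of that agreement would be expected by chance alone.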
Utility
Things you would like to know about an instrument.
Reporting on Instruments
File construction
1. Specifying a filename
In order to create a new datafile, the programme will first ask
you to give your datafile an alphanumeric name with a length of
up to 8 characters, for example, SAMPLE1.
process of data entry-contd
Variable Type: The next question asks about the type of coding
that is used for the variable. The letter “C” indicates
categorical variables with a fixed set of alphanumeric or
numeric categories. The letter “N” indicates non-categorical
variables with open-ended numerical codes.
Example: Although there is a fixed number of schools, and
therefore only a fixed set of possible school identification
values, the number of possible values is very large and can be
treated as quasi-open-ended, so you should enter “N” into the
second blank field.
Variable Length:
Next, you need to specify the number of digits (including
decimal places) required to code the data values of this
variable.
Assuming that, in our example, there are 150 schools whose
identification codes are the numbers 1 to 150, a three-digit
code is enough to identify the schools, so you would enter “3”
into the codebook field for the length.
The “Carry on” Indicator: The question “Carry data values on
as default?” asks you to specify whether the value of a variable
is carried over as the default value to the next record when
you enter data.
This is useful for variables which remain constant for a number
of records. If the “Carry” indicator is set to “Y” for a particular
variable, then every new record will have the data value from
the previous record as the default value. You can then modify
this default value as required. If the “Carry” indicator is set to
“N”, then the default value for this variable will be the default
code specified for it in the codebook.
As we may be entering many students for the same school,
you should enter “Y”.
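The behaviour of the "Carry" indicator described above can be sketched as a small rule for pre-filling a new record. This is an illustration of the logic only, not the data entry programme's own code; the function name, the variable names, and the default code 999 are assumptions made for the example.

```python
def default_for_new_record(previous_record, variable, carry, default_code):
    """Return the value pre-filled for `variable` when a new record is created."""
    if carry == "Y" and previous_record is not None:
        # Carry the value of the previous record forward as the default.
        return previous_record[variable]
    # Otherwise fall back to the default code from the codebook.
    return default_code


# Hypothetical previous record: school 103, student 10304.
previous = {"IDSCHOOL": 103, "IDSTUD": 10304}

carried = default_for_new_record(previous, "IDSCHOOL", "Y", 999)
not_carried = default_for_new_record(previous, "IDSCHOOL", "N", 999)
first_record = default_for_new_record(None, "IDSCHOOL", "Y", 999)
print(carried, not_carried, first_record)
```

With "Y", the coder only has to retype the school code when the questionnaires from a new school begin.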
Order (File): Similarly, you can specify the sequential position
in which variables will be recorded in the datafiles. If you do
not specify anything, the programme will set these sequential
positions so that the variables appear on the display in the
sequence in which you define them.
Field Label: For the descriptive label you could fill in “School
identification code”.
“Default” Code: You can provide a code that will be used as a
programme default when you create a new record in the
datafile. In the case of the variable IDSCHOOL, you could
leave this codebook field blank or specify 999 as its default
code.
Valid Range: You can specify a valid range that determines
which data values the user is allowed to enter when entering
data. Assuming that, in our example, there are 150 schools
whose identification codes are the numbers 1 to 150, you
would enter the numbers 1 and 150 in the corresponding
codebook fields.
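Taken together, the codebook fields discussed for IDSCHOOL (type, length, carry indicator, label, default code, valid range) can be pictured as one small data structure. The sketch below is a hypothetical representation for illustration; the class and attribute names are not those of the data entry programme, and the `accepts` rule (field length plus valid range) is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodebookEntry:
    name: str
    var_type: str            # "C" = categorical, "N" = non-categorical
    length: int              # number of digits
    carry: str               # "Y" or "N"
    label: str
    default: Optional[int]   # None if the field is left blank
    valid_min: int
    valid_max: int

    def accepts(self, value: int) -> bool:
        """True if the value fits the field length and lies in the valid range."""
        fits_length = len(str(value)) <= self.length
        return fits_length and self.valid_min <= value <= self.valid_max


# The IDSCHOOL example from the text: 150 schools coded 1 to 150.
idschool = CodebookEntry(
    name="IDSCHOOL", var_type="N", length=3, carry="Y",
    label="School identification code", default=999,
    valid_min=1, valid_max=150,
)

print(idschool.accepts(103), idschool.accepts(151))
```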
Improper data entry into Excel
Correct data entry: variable names in the first row, each
variable entered in a separate column
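The reason for this layout is that statistical software can then read each column as one variable. A minimal sketch, using an in-memory CSV export in place of a real spreadsheet file; the column names are hypothetical.

```python
import csv
import io

# Variable names in the first row, one variable per column, one record per row.
sheet = io.StringIO(
    "idschool,idstud,sex,age\n"
    "103,10304,2,8\n"
    "103,10305,1,12\n"
)

rows = list(csv.DictReader(sheet))       # each row becomes a dict keyed by variable name
ages = [int(row["age"]) for row in rows]
print(rows[0]["idschool"], ages)
```

If several values were typed into one cell, or the variable names were missing from the first row, this kind of column-wise access would not be possible without manual clean-up.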
Basic approaches to data entry
Adequate procedures for data entry depend on instrument
design and on the data collection methods. Sometimes in large
scale surveys, data entry procedures are used wherein data are
recorded directly in computer readable form using optical or
magnetic character readers, optical or magnetic mark readers,
or micro-computers during fieldwork.
Examples of this are computer assisted telephone interviewing
(CATI) and computer assisted personal interviewing (CAPI)
systems. Whereas transcription errors can be minimized with
these procedures, the use of such technical innovations requires
careful planning, an expensive technical environment, and
trained respondents.
Key verification procedures, or better still, independent
verification techniques where two coders code and enter the
data independently, can help to ensure the correctness of the
data entered.
While it may be too costly to verify the whole dataset, at
least a reasonably sized sample of the data should be verified
using these techniques in order to estimate the error introduced
and to decide on further corrective measures to ensure sufficient
data quality.
Often it is advantageous to identify the coder who entered
each record so that any errors can be traced back. This can be
done by adding a coder identification code to the datafile.
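Independent verification amounts to comparing two separately keyed versions of the same records and counting the disagreements. A minimal sketch with hypothetical record fragments; the function name is an assumption for this example.

```python
def compare_entries(first_pass, second_pass):
    """Return the positions of mismatching records and the mismatch rate."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both coders must enter the same number of records")
    mismatches = [
        i for i, (a, b) in enumerate(zip(first_pass, second_pass)) if a != b
    ]
    return mismatches, len(mismatches) / len(first_pass)


# Hypothetical sample of three records keyed by two coders.
coder_1 = ["103103042 8", "10310305112", "103104063 9"]
coder_2 = ["103103042 8", "10310305112", "103104062 9"]

mismatches, error_rate = compare_entries(coder_1, coder_2)
print(mismatches, round(error_rate, 2))
```

A mismatch shows that at least one of the two entries is wrong; the original questionnaire is needed to decide which.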
Using a text editor for data entry
For each piece of information in the data collection instruments,
the codebook defines the format and the positions in which it
should be entered into the raw datafile. Following the
definitions in the codebook, it is possible to simply enter the
data into a text editor or word processor.
An example of what such a text file would look like is given
below, using the codebook of the above sample questionnaire.
103103042 83941991019110
103103051124232130110110
103104063 92221241000110
As you can see, each record starts with the School ID (103),
followed by the Student ID (10304), the student’s sex (the 2
indicates a girl), the student’s age (8 years), and so on until all
variables in the codebook have been coded.
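Reading such a fixed-position record back means slicing each line at the column positions given in the codebook. The sketch below covers only the four variables named in the text (School ID in columns 1-3, Student ID in 4-8, sex in 9, age in 10-11); the dictionary keys are hypothetical names, and the remaining columns are left uninterpreted.

```python
# 1-based, inclusive column positions for the first four variables.
LAYOUT = {
    "idschool": (1, 3),
    "idstud": (4, 8),
    "sex": (9, 9),
    "age": (10, 11),
}


def parse_record(line):
    """Slice one fixed-width line into named fields (as stripped strings)."""
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in LAYOUT.items()
    }


record = parse_record("103103042 83941991019110")
print(record)
```

Because the meaning of every digit depends purely on its position, a single missing or extra character changes the interpretation of everything after it.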
However, a great deal of caution must be used when following
this approach, and there is usually a great deal of work involved
in resolving the problems that result from it. To illustrate,
four frequently occurring problems are listed below:
If, by mistake, a coder skips a code or enters a code twice, then
all subsequent codes in the datafile will be shifted and thus
change their implied meaning in the datafile:
Incorrect: 10310304 83941991019110
Correct: 103103042 83941991019110
The student’s age should be coded in columns 10-11. If, as in the
above example for student 10304, the coder puts the code for
the age in the 10th position and then continues in position 11
with the remaining variables, then columns 10-11 would
contain the value 83, and the computer would interpret this as
an age of 83 years in later analyses.
All variables following the student’s age would be
misinterpreted similarly. This can have a dramatic impact on the
statistical results. For example, if we calculate the mean age
and there is an outlier of 83 years in the datafile, the overall
mean can change substantially unless the sample size is large.
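The effect of the shifted record on the mean age can be shown with a short calculation. The two records are the correct and incorrect lines from the example above; the other four ages are made up for illustration.

```python
from statistics import mean


def age_from_record(line):
    """Read the age from columns 10-11 (string indices 9 and 10)."""
    return int(line[9:11])


correct = "103103042 83941991019110"
shifted = "10310304 83941991019110"   # one code skipped, everything moved left

other_ages = [9, 8, 10, 9]             # hypothetical ages of four other students

mean_correct = mean([age_from_record(correct)] + other_ages)
mean_shifted = mean([age_from_record(shifted)] + other_ages)
print(age_from_record(correct), age_from_record(shifted))
print(mean_correct, mean_shifted)
```

In a sample of five, the single shifted record moves the mean age from under 9 to over 23; in a sample of several thousand the same error would barely be visible in the mean, which makes it harder to spot.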
The approach also does not allow you to verify during data
entry whether the data values entered actually conform to the
specifications in the codebook:
Example: 103104063 92221241000110
In this example the position for the student’s sex contains
the value “3”, which is outside the set of permitted values
(“1” for “boy”, “2” for “girl”, and “8” and “9” for the
missing codes) and is obviously a coding error. Besides
losing the information for this student, the error also has,
if undetected, an impact on the results of statistical analyses.
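A plain text editor cannot apply this kind of check, but it is simple to run afterwards on the entered file. A minimal sketch, assuming the sex code sits in column 9 as in the examples; the function name is hypothetical.

```python
PERMITTED_SEX = {"1", "2", "8", "9"}   # 1 = boy, 2 = girl, 8 and 9 = missing codes


def sex_code_is_valid(line):
    """Check the code in column 9 (string index 8) against the permitted set."""
    return line[8] in PERMITTED_SEX


lines = [
    "103103042 83941991019110",
    "103103051124232130110110",
    "103104063 92221241000110",
]

invalid_lines = [line for line in lines if not sex_code_is_valid(line)]
print(invalid_lines)
```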
Furthermore, such an approach does not allow you to verify
the data for internal consistency while the data are being
entered:
Example: 103103042 83941341019110
Using a computer-controlled approach for data
entry, the data entry manager programme
Transcriptive data entry can be greatly facilitated through the
use of interactive data entry software that integrates the
processes of entry, editing, and verification of the data.
Such data entry systems often also come with integrated data
and file management capabilities, including mechanisms for the
transfer of the data to standard statistical analysis systems.
Using such systems, deviations of data values from pre-specified
validation criteria or data verification rules can be
detected quickly, thereby letting the user correct the error while
the original documents are still at hand.
With the FILE menu you can open, create, delete or sort a
datafile. As you have seen earlier in this module, you can use
this menu also to edit the electronic codebook which is
associated with each datafile and which contains all
information about the file structure and the coding schemes
employed.
Furthermore, you can use this menu to print the electronic
codebook or to transform the information in the electronic
codebook into SAS™, SPSS™, or OSIRIS/IDAMS™ control
statements which you can use later in order to convert the
datafiles into SAS™, SPSS™, or OSIRIS/IDAMS™ system
files.
Finally this menu allows you to exit the WinDEM programme.
With the EDIT menu you can enter, modify, or delete data in
data files. You can look at a data file in two different ways:
(a) in record view, you can view the data for one record at a
time with detailed information on each of the variables;
(b) in table view, you can view a data file as a whole in tabular
form with records shown as rows and variables shown as
columns.
The programme will control the processing of the data
entered, interrupting and alerting you when data values fail to
meet the range validation criteria which are specified in the
electronic codebook.
With the SEARCH menu you can search for specific records
using your own search criteria or locate a record with a known
record number.
With the SUBSET menu, you can define a subset of specific
records using your own criteria. This will then restrict your
view of the data to the records which match these criteria.
summary
References
Macnee, Carol Leslie (2008). Understanding Nursing Research:
Using Research in Evidence-based Practice. Lippincott Williams
& Wilkins. ISBN 0781775582, 9780781775588.
Polit, Denise, et al. (2013). Nursing Research: Principles and
Methods, revised edition. Philadelphia: Lippincott.
http://www.vbtutor.net/research/research_chp7.htm#sthash.ZtzDoA7r.dpuf
http://adamowen.hubpages.com/hub/Understanding-The-Different-Types-of-Research-Data
http://www15.uta.fi/FAST/FIN/RESEARCH/sources.html
http://www.medicotips.com/2012/01/datatypes-of-data-and-sources-of-data.html