
Ph.D. in Nursing
Nursing Research Methods

Module No. 7: DATA PREPARATION

Subtopic 7.2: Data Entry, Validity of Data

Faculty Name:
Date:
Subject Code:
School of Nursing
Learning Objectives
At the end of this module, the students will be able to:
 Understand the meaning of data entry
 Identify the ten most common problems concerning the quality standards of data
 Avoid measurement error
 Discuss the meaning of validity and reliability
 Describe the process of data entry
 Explain the basic approaches to data entry
 Summary
 References

List of Contents
 Introduction
 Meaning of data entry
 Ten most common problems concerning the quality standards of data
 Measurement error
 Meaning of validity and reliability
 Process of data entry
 Basic approaches to data entry
 Summary
 References

Introduction
 After data have been collected from a representative sample of
the population, the next step is to analyze them to test the
research hypotheses. Data analysis is
now routinely done with software programs such as SPSS,
SAS, STATPAK, SYSTAT, Excel, and the like. All are user-
friendly and interactive and have the capability to seamlessly
interface with different databases. Excellent graphs and charts
can also be produced through most of these software programs.

Meaning of data entry

 Data entry
 Once the data have been returned from the respondents, the data need to be recorded in computer-readable form. This section provides an overview of different approaches to data entry and then discusses two approaches to data entry in more detail.

Data entry

 However, before we start analyzing the data to test hypotheses,


some preliminary steps need to be completed. These help to
ensure that the data are reasonably good and of assured quality
for further analysis.
Four steps in data analysis:
 (1) getting data ready for analysis,
 (2) getting a feel for the data,
 (3) testing the goodness of data, and
 (4) testing the hypotheses.

We will examine each of these in other modules.

Data entry – flow diagram of data analysis [figure]

Data Preparation: Data Entry

 Data entry converts information gathered by secondary or


primary methods to a medium for reviewing and manipulation.
 Keyboarding remains a mainstay for researchers who need to
create a data file immediately and store it in a minimal space
on a variety of media.
 However, researchers have profited from more efficient ways
of speeding up the research process, especially from bar coding
and optical character and mark recognition.
Entering the Data into the Computer
 We shall discuss entering data into an Excel sheet. Other ways of entering data include entering it directly into SPSS or into other data management software.
 Data entry into the Excel sheet should follow the database structure and coding rules.
 Data entry should be accurate, neat and legible for processing of data for analysis.
 The goal is to transfer all of the study data the investigator has collected into the spreadsheets.
Data Preparation: Data Entry
 Keyboarding: A full screen editor, where an entire data file can
be edited or browsed, is a viable means of data entry for
statistical packages like SPSS or SAS.
 SPSS offers several data entry products, including Data Entry Builder, which enables the development of forms and surveys, and Data Entry Station, which gives centralized entry staff, such as telephone interviewers or online participants, access to the survey.
 Both SAS and SPSS offer software that effortlessly accesses data from databases, spreadsheets, data warehouses, or data marts.
Data Preparation: Data Entry
 Bar-code technology is used to simplify the interviewer's role as a data recorder. When an interviewer passes a bar-code scanner over the appropriate codes, the data are recorded in a small, lightweight unit for translation later.
 Researchers studying magazine readership can scan bar codes to denote a magazine cover that is recognized by an interview participant.
Data Preparation: Data Entry
 Optical Character Recognition (OCR):
 Users of a PC image scanner are familiar with OCR
programs which transfer printed text into computer files in
order to edit and use it without retyping.
 Optical scanning of instruments is efficient for researchers.
 Optical scanners process the mark-sensed questionnaires and store the answers in a file.
 This method has been adopted by researchers for data entry
and preprocessing due to its faster speed, cost savings on
data entry, convenience in charting and reporting data, and
improved accuracy.
 It reduces the number of times data are handled, thereby reducing the number of errors that are introduced.
The ten most common problems concerning the quality/standards of data

 1. Respondents may have been assigned invalid or wrong identification codes either during instrument preparation, field administration or data transcription. This can lead to difficulties if later analyses require linkages between different respondents or between different levels of data aggregation.
The ten most common problems concerning the quality/standards of data (contd.)
 2. Questions may have accidentally been misprinted due to technical or organizational imperfections, thereby preventing respondents from giving appropriate answers.
 3. Questions may have been skipped, or not reached, by the respondents either in a randomized fashion or in a systematic way, which results in "gaps" in the data and may lead to misleading results.
 4. Respondents may give two or more responses when only one answer was allowed, or questions may have been answered in other unintended ways.
The ten most common problems concerning the quality/standards of data (contd.)
 5. Certain data values may not correspond to the coding specifications or range validation criteria.
 6. Answers to open-ended questions may contain outlier codes; that is, there may be respondents with codes which are improbably low or high even though they could be valid answers.
 7. The values for certain data variables might not correspond to the values of certain control variables. (For example, the value of a control variable may state that a particular student did not respond to a particular question set, whereas the data variables for this question set indicate actual responses.)
The ten most common problems concerning the quality/standards of data (contd.)
 8. Data from a respondent may contain inconsistent values. (That is, the values for two or more variables may not be in accord.)
 9. Inconsistencies between data values from different respondents who belong to a certain group may occur for questions which are related to this group. (For example, for students in the same class there may be different values for variables which are related to the class.)
 10. Inconsistencies may also occur between data values of different but related datafiles or levels of aggregation.
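Several of these checks can be automated before analysis begins. The sketch below is an illustration only, not part of the original module: it assumes Python with the pandas library, and the column names (IDSTUD, Q1, ATTEMPTED_Q1) and codes are invented for the example.

```python
# Hypothetical screening for a few of the problems listed above.
import pandas as pd

df = pd.DataFrame({
    "IDSTUD":       [101, 102, 102, 104],   # problem 1: ID 102 is duplicated
    "Q1":           [1, 2, 7, 3],           # problem 5: 7 is outside the valid 1-4 range
    "ATTEMPTED_Q1": [1, 1, 1, 0],           # control variable (problem 7)
})

# Problem 1: invalid or duplicated identification codes
duplicate_ids = df[df["IDSTUD"].duplicated(keep=False)]

# Problem 5: values that do not correspond to the coding specification (valid codes 1-4)
out_of_range = df[~df["Q1"].between(1, 4)]

# Problem 7: a response recorded although the control variable says the item was not attempted
inconsistent = df[(df["ATTEMPTED_Q1"] == 0) & df["Q1"].notna()]

print(duplicate_ids, out_of_range, inconsistent, sep="\n\n")
```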
Measures to make quality/ standards of data

What we measure as Data are:

 Knowledge, Attitudes, Behaviors (KAB)


 Physiological variables
 Symptoms
 Skills
 Costs

Measures to make quality/ standards of data-contd

Classical Measurement Theory:

 Measurement: Observation = Truth (fact) ± Error
 Reliability relates to the error component of an observation; validity relates to how well the observation reflects the truth.
Measures to make quality/ standards of data-contd

Type of Measures

 Standardized – evidence as follows:


1. Systematically developed
2. Evidence for instrument validity
3. Evidence for instrument reliability
4. Evidence for instrument utility – time, scoring, costs, sensitivity to change over time
 Non-standardized

Measures to make quality/ standards of data-contd

Types of Measurement Error

 Systematic - can work to minimize systematic error due to


poor instructions, poor reliability of measures, etc.

 Random - can do nothing about this, always present, we never


measure anything perfectly, there is always some error.

Measures to make quality/ standards of data-contd

Validity

Question: Does the instrument measure what it is supposed


to measure?
 Theory-related validity
 Face validity
 Content validity
 Construct validity
 Criterion-related validity
 Concurrent validity
 Predictive validity

Measures to make quality/ standards of data-contd

Theory-related Validity
 Face validity
– participant believability
 Content validity (observable)
 Blue print
 Skills list
 Construct validity (unobservable)
 Group differences
 Changes over time
 Correlations/factor analysis
Measures to make quality/ standards of data-contd

Criterion-related Validity
 Concurrent
 Measure two variables and correlate them to demonstrate
that measure 1 is measuring the same thing as measure 2 –
same point in time.

 Predictive
 Measure two variables, one now and one in the future, and correlate them to demonstrate that measure 1 is predictive of measure 2, something in the future.
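As a concrete illustration of these two forms of criterion-related validity, the sketch below correlates a new measure with a criterion measured at the same time (concurrent) and with an outcome measured later (predictive). Python with SciPy and the toy scores are assumptions for illustration only.

```python
# Hypothetical toy example of concurrent and predictive validity.
from scipy.stats import pearsonr

new_measure   = [12, 15, 9, 20, 17, 11, 14, 18]   # measure 1 (administered now)
criterion_now = [11, 16, 10, 19, 18, 10, 13, 17]  # measure 2, same point in time
outcome_later = [30, 38, 25, 45, 40, 27, 33, 42]  # measure 2, collected in the future

r_concurrent, p_concurrent = pearsonr(new_measure, criterion_now)
r_predictive, p_predictive = pearsonr(new_measure, outcome_later)

print(f"Concurrent validity: r = {r_concurrent:.2f} (p = {p_concurrent:.3f})")
print(f"Predictive validity: r = {r_predictive:.2f} (p = {p_predictive:.3f})")
```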
Measures to make quality/ standards of data-contd

 Design Validity: Does the research design allow the investigator to answer their hypothesis? (Threats to internal and external validity)
 Instrument Validity: Does the instrument measure what it is supposed to measure?
Measures to make quality/ standards of data-contd

Instrument Reliability

Question: can you trust the data?

 Stability – change over time


 Consistency – within item agreement
 Rater reliability – rater agreement
Measures to make quality/ standards of data-contd

Instrument Reliability

 Test-retest reliability (stability)


 Pearson product moment correlations
 Cronbach’s alpha (consistency) – one point in time, measures
inter-item correlations, or agreements.
 Rater reliability (corrected for chance agreement)
 Inter-rater reliability: Cohen's kappa
 Intra-rater reliability: Scott's pi
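A brief sketch of how these reliability estimates might be computed in practice. Python with SciPy and scikit-learn is an assumption (the module itself works with SPSS/SAS), and the scores and ratings are invented toy values.

```python
# Hypothetical toy example: stability (test-retest) and rater reliability.
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

# Test-retest reliability: the same instrument, the same people, two time points
time1 = [20, 25, 22, 30, 28, 24]
time2 = [21, 24, 23, 31, 27, 25]
r, _ = pearsonr(time1, time2)
print(f"Test-retest reliability (Pearson r): {r:.2f}")

# Inter-rater reliability: two raters classify the same cases; kappa corrects for chance agreement
rater_a = [1, 0, 1, 1, 0, 1, 0, 1]
rater_b = [1, 0, 1, 0, 0, 1, 0, 1]
print(f"Inter-rater reliability (Cohen's kappa): {cohen_kappa_score(rater_a, rater_b):.2f}")
```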
Measures to make quality/ standards of data-contd

Cronbach's alpha

 alpha = [ n / (n − 1) ] × [ 1 − ( Σ SD²(items) / SD²(total) ) ]

 where n = number of items, SD²(items) = the variance of each individual item, and SD²(total) = the variance of the total (summed) score.

 SD = √[ Σ (X − mean)² / (n − 1) ]
Measures to make quality/ standards of data-contd

Cronbach alpha Reliability Estimates:

 > 0.90
 Excellent reliability, required for decision-making at the
individual level.
 0.80
 Good reliability, required for decision-making at the group
level.
 0.70
 Adequate reliability, close to unacceptable as too much error
in the data. Why?
Measures to make quality/ standards of data-contd
Internal Consistency: Cronbach's alpha
Person A: Internally consistent
Person B: Internally inconsistent

             All the    Much of     A little of   Rarely
             time       the time    the time
 Item 1         4           3            2           1
 Item 2         4           3            2           1
 Item 3         4           3            2           1
 Item 4         4           3            2           1

(Person A marks the same response column on every item – internally consistent; Person B's marks shift between columns from item to item – internally inconsistent.)
Measures to make quality/ standards of data-contd

Error in Reliability Estimates

 Error = 1 − (Reliability Estimate)²

 If alpha = 0.90: 1 − (0.90)² = 1 − 0.81 = 0.19 error
 If alpha = 0.70: 1 − (0.70)² = 1 − 0.49 = 0.51 error
 If alpha = 0.70, it is roughly the 50:50 point of error vs. true value
Measures to make quality/ standards of data-contd

Reliability Values

 Range: 0 to 1
 No negative values, unlike correlations
 Cohen's kappa and Scott's pi are always lower, e.g. 0.50, 0.60
Measures to make quality/ standards of data-contd

Utility
Things you would like to know about an instrument.

 Time to complete (subject fatigue)?


 Is it obtrusive to participants?
 Number of items (power analysis)?
 Cultural, gender, ethnic appropriateness?
 Instructions for scoring?
 Normative data available?
Measures to make quality/ standards of data-contd

Reporting on Instruments

 Concept(s) being measured


 Length of instrument or number of items
 Response format (Likert scale, etc.)
 Evidence of validity
 Evidence of reliability
 Evidence of utility
process of data entry

File construction

 1. Specifying a filename
 In order to create a new datafile, the programme will first ask you to give your datafile an alphanumeric name with a length of up to 8 characters, for example, SAMPLE1.
process of data entry-contd

 2. Defining the variables


 The next step is to define the information to be stored in the
 datafile. This can be done in the form of a “dialogue” with
 the computer, where the computer will ask you to specify the
 characteristics of the variables in the datafile.

Process of data entry-contd

 The following pieces of information are essential for the


definition of a variable.
 Unique Variable Name: Each variable must be identified by a
unique variable name
 Ex: if the school identification code is presented in the header of the questionnaire as "IDSCHOOL", you would enter "IDSCHOOL" into the first blank field.
process of data entry-contd
 Variable Type: The next question asks about the type of coding
that is used for the variable. The letter “C” indicates
categorical variables with a fixed set of alphanumeric or
numeric categories. The letter “N” indicates non-categorical
variables with open-ended numerical codes.
 Ex: While there are a fixed number of schools and therefore
only a fixed set of possible school identification values, the
number of possible values is very large and can be understood
as quasi-open-ended, so you should enter “N” into the second
blank field.

process of data entry-contd
 Variable Length:
 Afterwards you need to specify the number of digits
(including decimal places) which are required to code the data
values of this variable.
 Assuming that, in our example, there are 150 schools the
identification codes of which are the numbers 1 to 150, we can
use a three-digit code to identify the schools, so you would
enter “3” into the codebook field for the length.

process of data entry-contd

 Decimals: Afterwards you can specify the number of decimal


places to be used in the codes. In the school identification code
there are no decimal places, so you would leave the “0” in this
codebook field which is the default value and go to the next
codebook field.

process of data entry-contd

 Location in Instrument: The next piece of information will tell the coders where (in the data collection instruments) they will find the question used as the source of information. You can fill in a short description that helps to locate the information quickly. In our example, you could enter "School ID" into this codebook field to indicate that the codes for this variable are found in the identification part of the questionnaire.
process of data entry-contd
 The “Carry on” Indicator: The question “Carry data values on
as default?” asks you to specify whether the value of a variable
is carried as a default value to the next record when you enter
data.
 This is useful for variables which remain constant for a number
of records. If the “Carry” indicator is set to “Y” for a particular
variable, then every new record will have the data value from
the previous record as the default value. You can then modify
this default value as required. If the “Carry” indicator is set to
“N”, then the default value for this variable will be the default
value which was specified for this variable.
 As we may be entering many students for the same school,
you should enter “Y”.
process of data entry-contd

 Order (Display): You can specify the sequential position in


which variables will appear in the WinDEM display during
data entry. If you do not specify anything, the programme will
set these sequential positions so that the variables appear on the
display in the sequence in which you define them.

process of data entry-contd
 Order (File): Similarly, you can specify the sequential position in which variables will be recorded in the datafiles. If you do not specify anything, the programme will set these sequential positions so that the variables are recorded in the datafile in the sequence in which you define them.
 Field Label: For the descriptive label you could fill in "School identification code".
process of data entry-contd

 Code for “Missing” Data: Following the above specifications,


in the case of the variable IDSCHOOL you could enter the
code “999” to indicate missing or omitted data.
 Code for “Not Administered” Data: Correspondingly you could
specify “998” to indicate “not administered” data for the
variable IDSCHOOL.

process of data entry-contd
 “Default” Code: You can provide a code that will be used as a
programme default when you create a new record in the
datafile. In the case of the variable IDSCHOOL, you could
leave this codebook field blank or specify 999 as its default
code.
 Valid Range: You can specify a valid range that determines
which data values the user is allowed to enter when entering
data. Assuming that in our example, there are 150 schools the
identification codes of which are the numbers 1 to 150, you
would enter the numbers 1 and 150 in the corresponding
codebook fields.

process of data entry-contd

 Variable Class: You can classify variables according to their use in later data analyses. Since the variable IDSCHOOL is an identification variable, select the keyword "ID". Note that only when the variable class is "ID" can the programme distinguish these variables as identification variables.
 Comment: You can associate a descriptive comment with the variable which will be printed in the electronic codebook.
process of data entry-contd

 The “Hide variable” Indicator: The question “Allow


modification of variable?” asks you to specify whether a
variable will be visible and editable in the WinDEM display
when you enter data or not. “Y” indicates that the value will be
displayed during the data entry stage, “N” indicates that the
value will not be displayed. As the later users need to enter the
school identification code, you should enter “Y” in this
codebook field.

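The codebook fields described above (variable name, type, length, decimals, missing codes, valid range, and so on) can be pictured as a simple data structure that drives validation during entry. The sketch below is not the WinDEM programme, only a minimal Python illustration of the idea.

```python
# Hypothetical codebook-driven check, loosely modelled on the fields described above.
codebook = {
    "IDSCHOOL": {
        "type": "N",                 # non-categorical, numeric
        "length": 3,                 # up to three digits
        "decimals": 0,
        "missing": 999,              # code for missing/omitted data
        "not_administered": 998,     # code for "not administered" data
        "valid_range": (1, 150),
        "label": "School identification code",
    }
}

def check_value(variable: str, value: int) -> bool:
    """Return True if the entered value is acceptable for this variable."""
    spec = codebook[variable]
    if value in (spec["missing"], spec["not_administered"]):
        return True                          # the special codes are always accepted
    low, high = spec["valid_range"]
    return low <= value <= high              # otherwise enforce the valid range

print(check_value("IDSCHOOL", 103))   # True  - within 1..150
print(check_value("IDSCHOOL", 999))   # True  - missing-data code
print(check_value("IDSCHOOL", 151))   # False - outside the valid range
```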
Improper data entry into Excel [figure]
Correct data entry – variable names in the first row, each variable entered in a separate column [figure]
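A minimal sketch of the "correct" layout shown above: variable names in the first row and one column per variable, so the sheet can be read directly by analysis software. Python with pandas, the filename, and the variable names are assumptions for illustration.

```python
# Hypothetical sketch of the recommended spreadsheet structure:
# first row = variable names, one column per variable, one row per respondent.
import pandas as pd

data = pd.DataFrame({
    "IDSCHOOL": [103, 103, 104],
    "IDSTUD":   [10304, 10305, 10406],
    "SEX":      [2, 1, 2],            # 1 = boy, 2 = girl
    "AGE":      [8, 11, 9],
})

data.to_excel("sample1.xlsx", index=False)   # requires the openpyxl package
print(data)
```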
Basic approaches to data entry

 Data may be collected on free-text notebooks, questionnaires,


optical scanning forms, or micro-computers. All further steps
depend on the quality with which the data entry is completed.
Inaccurate data entry often causes substantial delays in the data
verification and data analysis phases of a survey.

Basic approaches to data entry-Contd
 Adequate procedures for data entry depend on instrument
design and on the data collection methods. Sometimes in large
scale surveys, data entry procedures are used wherein data are
recorded directly in computer readable form using optical or
magnetic character readers, optical or magnetic mark readers,
or micro-computers during fieldwork.
 Examples of this are computer assisted telephone interviewing
(CATI) and computer assisted personal interviewing (CAPI)
systems. Whereas transcription errors can be minimized with
these procedures, the use of such technical innovations requires
careful planning, an expensive technical environment, and
trained respondents.

Basic approaches to data entry-Contd

 The more common approaches for data entry in educational


surveys are transcriptive procedures in which respondents
write their answers onto the instruments. The answers are then
transcribed either to machine readable form or directly into the
computer. Transcription is usually costly, sometimes requiring
up to half of the total data processing costs. If the response
formats are complex or the coding requires specially trained
coding personnel, then

Basic approaches to data entry-Contd

 an additional coding stage may need to be inserted in which the


responses are translated into their codes which are then written
on the instruments or transcribed to special code-sheets.
Although introducing an additional source of error, nonetheless
separating coding from data entry allows faster coding of data
and does not require coding skills for the data entry personnel.

Basic approaches to data entry-Contd
 Key verification procedures, or better still, independent
verification techniques where two coders code and enter the
data independently, can help to ensure the correctness of the
data entered.
 While it may be too costly to verify the whole dataset in this way, at least a reasonably sized sample of the data should be verified using these techniques in order to estimate the error introduced and to decide on further corrective measures to ensure sufficient data quality.
 Often it is advantageous to identify the coder who entered
each record so that any errors can be traced back. This can be
done by adding a coder identification code to the datafile.

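The independent-verification idea can be illustrated with a small comparison of two files keyed by different coders: the files are compared cell by cell and the disagreement rate is used to estimate the entry error. Python with pandas and the toy records are assumptions for illustration.

```python
# Hypothetical sketch of double entry ("independent verification") comparison.
import pandas as pd

entry_coder1 = pd.DataFrame({"IDSTUD": [10304, 10305], "SEX": [2, 1], "AGE": [8, 11]})
entry_coder2 = pd.DataFrame({"IDSTUD": [10304, 10305], "SEX": [2, 1], "AGE": [8, 14]})

mismatches = entry_coder1 != entry_coder2      # True marks a cell where the two entries disagree
error_rate = mismatches.to_numpy().mean()

print(mismatches)
print(f"Estimated entry error rate: {error_rate:.1%}")   # here 1 of 6 cells differs
```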
Basic approaches to data entry-Contd

 It is important to trial test data entry procedures at an early


stage so that resources required for timely entry can be
planned.

Basic approaches to data entry-Contd
 Using a text editor for data entry
 For each piece of information in the data collection instruments
the codebook defines which format and into which positions it
should be entered into the raw datafile. Following the
definitions in the codebook, it is possible to simply enter the
data into a text editor or word processor.
 An example of how such a text file might look is provided below, using the codebook of the above sample questionnaire.
 103103042 83941991019110
 103103051124232130110110
 103104063 92221241000110
Basic approaches to data entry-Contd
 As you can see, each data record starts with the School ID (103), followed by the Student ID (10304), the student's sex (the 2 indicates a girl), the student's age (8 years), and so on until all variables in the codebook have been coded.
 However, a great deal of caution must be used when following this approach, and there is usually a great deal of work involved in resolving problems resulting from it. To give an example, four frequently occurring problems are listed in the following:
 If, by mistake, a coder skips a code or enters a code twice, then
all subsequent codes in the datafile will be shifted and thus
change their implied meaning in the datafile:
 Incorrect: 10310304 83941991019110
 Correct: 103103042 83941991019110
Basic approaches to data entry-Contd
 The student's age should be coded in columns 10-11. If, as in the above example for student 10304, the coder puts the code for the age in the 10th position only and then continues in position 11 with the remaining variables, then columns 10-11 would contain the value 83 and the computer would interpret this as an age of 83 years in later analyses.
 All variables following the student's age would be misinterpreted similarly. This can have a dramatic impact on the statistical results; for example, if we calculate the mean age and there is an outlier of 83 years in the datafile, then the overall mean can change substantially if the sample size is not too large.
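The column positions discussed above can be made explicit with a small parsing sketch. Python is an assumption; the two records are taken from the example above, and only the first four fields are decoded because the layout of the remaining columns is not spelled out in the slides.

```python
# Hypothetical sketch: interpreting a fixed-column record according to the codebook positions.
record_ok  = "103103042 83941991019110"   # correct record from the example above
record_bad = "10310304 83941991019110"    # a code was skipped, so everything shifts left

def parse(record: str) -> dict:
    return {
        "IDSCHOOL": record[0:3],    # columns 1-3
        "IDSTUD":   record[3:8],    # columns 4-8
        "SEX":      record[8:9],    # column 9
        "AGE":      record[9:11],   # columns 10-11
    }

print(parse(record_ok))    # AGE = ' 8'  -> read as 8 years
print(parse(record_bad))   # AGE = '83'  -> misread as 83 years because of the shift
```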
Basic approaches to data entry-Contd

 The approach also does not allow one to verify during data entry whether the data values entered indeed conform to the specifications in the codebook:
 Example: 103104063 92221241000110

Basic approaches to data entry-Contd
 In this example the position for the student sex contains
the value “3” which is outside the set of permitted values
(“1” for “boy”, “2” for “girl”, and “8” and “9” for the
missing codes) and is obviously a coding error. Besides
losing the information for this student it also has, if
undetected, an impact on the results of statistical analyses.
 Furthermore, such an approach does not allow one to verify the data for internal consistency while the data are entered:
 Example: 103103042 83941341019110

Using a computer-controlled approach for data
entry, the data entry manager programme
 Transcriptive data entry can be greatly facilitated through the use of interactive data entry software that integrates the processes of entry, editing, and verification of the data.
 Such data entry systems often also come with integrated data and file management capabilities, including mechanisms for the transfer of the data to standard statistical analysis systems.
 Using such systems, deviations of data values from pre-specified validation criteria or data verification rules can be detected quickly, thereby letting the user correct the error while the original documents are still at hand.
Using a computer-controlled approach for data
entry, the data entry manager programme-contd

 An example of such a programme is WinDEM, which is provided by the IEA and is briefly described in the following.
 This programme has been designed to be used by users with limited experience in computer use. The programme can handle datafiles with more than 1000 variables and data for more than 1 000 000 000 respondents. All datafiles created are fully compatible with the dBASE IV™ standard.
 The WinDEM programme operates through a system of menus
and windows. It contains nine menus with which you can
accomplish different tasks.

Using a computer-controlled approach for data
entry, the data entry manager programme-contd
 With the FILE menu you can open, create, delete or sort a
datafile. As you have seen earlier in this module, you can use
this menu also to edit the electronic codebook which is
associated with each datafile and which contains all
information about the file structure and the coding schemes
employed.
 Furthermore, you can use this menu to print the electronic
codebook or to transform the information in the electronic
codebook into SAS™, SPSS™, or OSIRIS/IDAMS™ control
statements which you can use later in order to convert the
datafiles into SAS™, SPSS™, or OSIRIS/IDAMS™ system
files.
 Finally this menu allows you to exit the WinDEM programme.
Using a computer-controlled approach for data
entry, the data entry manager programme-contd
 With the EDIT menu you can enter, modify, or delete data in
data files. You can look at a data file in two different ways:
 (a) in record view, you can view the data for one record at a
time with detailed information on each of the variables;
 (b) in table view, you can view a data file as a whole in tabular
form with records shown as rows and variables shown as
columns.
 The programme will control the processing of the data
entered, interrupting and alerting you when data values fail to
meet the range validation criteria which are specified in the
electronic codebook.

Using a computer-controlled approach for data
entry, the data entry manager programme-contd

 With the SEARCH menu you can search for specific records
using your own search criteria or locate a record with a known
record number.
 With the SUBSET menu, you can define a subset of specific
records using your own criteria. This will then restrict your
view of the data to the records which match these criteria.

Using a computer-controlled approach for data
entry, the data entry manager programme-contd

 You can use the Print menu to print pre-selected records on a


printer or to a text file.
 You can use the IMPORT/EXPORT menu to generate fixed-form ASCII raw datafiles or free-format datafiles from the WinDEM system files, or to import raw datafiles or free-format datafiles created with other software packages into the WinDEM programme.

summary

 In this module we learned about the process of data entry in detail. In the following module we will learn about data verification, preparation for data analysis, and quantitative and qualitative data analysis techniques.
References
 Carol Leslie Macnee (2008). Understanding Nursing Research: Using Research in Evidence-Based Practice. Lippincott Williams & Wilkins. ISBN 0781775582, 9780781775588.
 Denise F. Polit, et al. (2013). Nursing Research: Principles and Methods, revised edition. Philadelphia: Lippincott.
 http://www.vbtutor.net/research/research_chp7.htm#sthash.ZtzDoA7r.dpuf
 http://adamowen.hubpages.com/hub/Understanding-The-Different-Types-of-Research-Data
 http://www15.uta.fi/FAST/FIN/RESEARCH/sources.html
 http://www.medicotips.com/2012/01/datatypes-of-data-and-sources-of-data.html
Thanks

Next Topic >>
Qualitative vs. Quantitative data analysis techniques
