Excel Data Entry Practices
• Be sure that each variable name is unique (no duplicate variable names).
2. If possible, put variable names in the first row of the Excel sheet.
• Choose readily recognizable names for variables, but keep them short (16
characters or fewer is best).
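The naming rules above can be checked mechanically before data entry begins. The following is a minimal Python sketch and not part of the original guidance: the function name `check_variable_names` and the sample header row are hypothetical, the 16-character limit is the one recommended above, and the case-insensitive comparison is an assumption made because some statistical packages ignore case in variable names.

```python
from collections import Counter

MAX_NAME_LENGTH = 16  # limit recommended above; adjust for your software


def check_variable_names(header_row):
    """Return a list of problems found in the first row of a sheet."""
    problems = []
    # Compare names case-insensitively (an assumption of this sketch).
    counts = Counter(name.strip().lower() for name in header_row)
    for name, n in counts.items():
        if n > 1:
            problems.append(f"duplicate variable name: {name}")
    for name in header_row:
        if len(name.strip()) > MAX_NAME_LENGTH:
            problems.append(
                f"name longer than {MAX_NAME_LENGTH} characters: {name}"
            )
    return problems


# Hypothetical header row with one duplicate and one overly long name.
header = ["id", "sbp", "dbp", "SBP", "date_of_first_clinic_visit"]
for problem in check_variable_names(header):
    print(problem)
```

An empty list means the header row passed both checks.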
3. Use a separate column for each piece of information.
• Enter systolic blood pressure as one variable and diastolic blood pressure as
another variable. Don't enter data as "A,C,D" or "BDF" if there are three possible
answers to a question. Include a separate column for each answer.
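If combined answers have already been entered in a single cell, they can usually be split into one column per answer afterwards. The sketch below is one possible approach in Python, assuming every answer choice is a single letter; the function name `split_answers`, the `q1` column prefix, and the list of choices A to F are hypothetical.

```python
def split_answers(raw, choices, prefix="q1"):
    """Turn a combined answer such as "A,C,D" or "BDF" into one 0/1 value per choice."""
    # Keep only the letters, so commas and spaces between answers are ignored.
    selected = {ch for ch in raw.upper() if ch.isalpha()}
    return {f"{prefix}_{c.lower()}": int(c in selected) for c in choices}


# Hypothetical question with six possible answers.
choices = ["A", "B", "C", "D", "E", "F"]
print(split_answers("A,C,D", choices))
print(split_answers("BDF", choices))
```

Each key of the returned dictionary corresponds to one column in the sheet, with 1 meaning the answer was selected and 0 meaning it was not.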
4. When entering dates, include a 4-digit year.
• Two-digit years can cause problems for statistical software when reading data
from Excel files.
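As an illustration of the problem, the sketch below uses Python's standard `datetime` module, which follows the POSIX convention of mapping two-digit years 69-99 to 1969-1999 and 00-68 to 2000-2068. Other software may use a different cutoff, which is exactly why the result is unpredictable. The helper name `parse_us_date` and the sample dates are hypothetical.

```python
from datetime import datetime


def parse_us_date(text):
    """Parse mm/dd/yyyy, falling back to mm/dd/yy when the year has two digits."""
    try:
        return datetime.strptime(text, "%m/%d/%Y").date()
    except ValueError:
        # Two-digit year: the century is guessed by the software, not the user.
        return datetime.strptime(text, "%m/%d/%y").date()


# A date of birth in 1925 entered with a 2-digit year is read as 2025.
print(parse_us_date("01/03/25").year)    # 2025
print(parse_us_date("01/03/1925").year)  # 1925
```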
• To enter a missing data value, either leave the cell blank or enter an easily
recognizable single-digit or single-character code.
• Be sure, if you use a missing value code, that it cannot be confused with a "real"
data value.
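Whether a missing value code is safe depends on the range of real values the variable can take. The following Python sketch shows one way to check this and to recode missing entries on import; the function names, the code 9, and both example scales are hypothetical.

```python
def code_is_safe(missing_code, valid_values):
    """Return True if the missing-value code cannot be mistaken for real data."""
    return missing_code not in valid_values


def recode_missing(cells, missing_code):
    """Convert blank cells and the missing-value code to None, the rest to int."""
    cleaned = []
    for cell in cells:
        text = cell.strip()
        if text == "" or text == str(missing_code):
            cleaned.append(None)
        else:
            cleaned.append(int(text))
    return cleaned


# Hypothetical 1-5 rating scale: 9 is never a real answer, so it is safe.
print(code_is_safe(9, range(1, 6)))   # True
# Hypothetical 0-10 pain score: 9 is a real value, so it is not safe.
print(code_is_safe(9, range(0, 11)))  # False

print(recode_missing(["3", "", "9", "5"], missing_code=9))  # [3, None, None, 5]
```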
6. Be consistent in your data entry.
Good Example                 Bad Example
 1  12/31/1976  F             1  12/31/1976  f
 2  01/01/1977  M             2  1-Jan-77    m
 4  01/03/1977  F             4  01/03/77    F
 6  01/05/1977  F             6  01/05/1977  F
 8  01/07/1977  M             8  01/07/1977  m
 9  01/08/1977  F             9  08-Jan-77   F
10  01/09/1977  F            10  01/09/1977  f
Notice in the Good Example above that the date variable has the same format (mm/dd/yyyy)
throughout and the sex variable is consistent in both case and type (character variable). In the
Bad Example the date variable appears in several different formats, and not every observation
has a 4-digit year. The sex variable is still a character variable, but statistical software will read
this variable as having six different levels instead of two.
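Inconsistent entries like these can often be detected and repaired after the fact, although it is far easier to enter them consistently in the first place. The Python sketch below counts the distinct spellings of the sex variable and converts the mixed date formats to mm/dd/yyyy. It uses only the Bad Example rows reproduced above, so the number of distinct spellings it reports applies to those rows only. The helper names are hypothetical, the list of accepted date formats is an assumption, and the month abbreviations assume an English locale.

```python
from collections import Counter
from datetime import datetime

# Values taken from the Bad Example rows shown above.
bad_dates = ["12/31/1976", "1-Jan-77", "01/03/77", "01/05/1977",
             "01/07/1977", "08-Jan-77", "01/09/1977"]
bad_sex = ["f", "m", "F", "F", "m", "F", "f"]

# Formats to try, in order (an assumption of this sketch).
DATE_FORMATS = ["%m/%d/%Y", "%m/%d/%y", "%d-%b-%y"]


def normalize_date(text):
    """Return the date as mm/dd/yyyy, trying each known format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def normalize_sex(text):
    """Strip stray spaces and use upper case so 'f' and 'F' are one level."""
    return text.strip().upper()


print(Counter(bad_sex))                            # spellings as entered
print(Counter(normalize_sex(s) for s in bad_sex))  # spellings after cleaning
print([normalize_date(d) for d in bad_dates])
```

Note that the two-digit years are still interpreted by the software's own century rule, so the cleaned dates should be checked against the original records.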
7. Use only one worksheet for your data.
• If you decide to use multiple sheets for your data, follow the variable naming
conventions for the tabs that name the sheets (keep the names simple and
unique).
8. Do not "stack" data on the same sheet.
• It is a good idea to document what your variables are and what they mean in a
data dictionary. The data dictionary should include all of the variable names, the
data type of each variable, a label or longer name that describes the variable
(including the units it is measured in), the codes for any categorical variables, and
any notes on the variable. This can be a separate worksheet or document file.
ID Numeric Patient ID