0% found this document useful (0 votes)
69 views

Excel Data Entry Practices

The document provides best practices for entering data into Excel spreadsheets, including putting variables in columns and observations in rows, using unique variable names and column headers, avoiding special characters in names, keeping consistent formatting, using a single worksheet, documenting data with a data dictionary, and avoiding unnecessary Excel features that could complicate analysis. Consistency and clarity are emphasized to avoid errors and ensure data can be easily analyzed by statistical software.

Uploaded by

Jasbindar Singh
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
69 views

Excel Data Entry Practices

The document provides best practices for entering data into Excel spreadsheets, including putting variables in columns and observations in rows, using unique variable names and column headers, avoiding special characters in names, keeping consistent formatting, using a single worksheet, documenting data with a data dictionary, and avoiding unnecessary Excel features that could complicate analysis. Consistency and clarity are emphasized to avoid errors and ensure data can be easily analyzed by statistical software.

Uploaded by

Jasbindar Singh
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 4

Excel Data Entry Practices:

1. Put variables in columns and observation in rows.

• Include a unique identifying number for each case.

• Be sure that each variable name is unique (no duplicate variable names).
2. If possible, put variable names in the first row of excel sheet.

• Variable names must start with a letter.

• Do not include special characters (#, !, ?, %, etc.) or spaces in your variable


names.

• Choose readily recognizable names for variables - but not too long (<= 16
characters best).
3. Use a separate column for each piece of information.

• Don’t enter data such as "120/80" for blood pressure.

• Enter systolic blood pressure as one variable and diastolic blood pressure as
another variable. Don't enter data as "A,C,D" or "BDF" if there are three possible
answers to a question. Include a separate column for each answer.
4. When entering dates, include a 4 digit year.

• Two-digit years can cause problems for statistical software when reading data
from Excel files.

• The best format for dates is mm/dd/yyyy, where mm is a 2 digit month, dd is a 2


digit day and yyyy is a 4 digit year.
5. Decide on "missingness" conventions.

• Missing data can cause a multitude of problems.

• To enter a missing data value either enter a blank or an easily recognizable single
digit character code.

• Be sure, if you use a missing value code, that it cannot be confused with a "real"
data value.
6. Be consistent in your data entry

• When entering data keep the same format throughout.


Good Example Bad Example

ID DOB Sex ID DOB Sex

1 12/31/1976 F 1 12/31/1976 f

2 01/01/1977 M 2 1-Jan-77 m

3 01/02/1977 F 3 01/02/1977 Female

4 01/03/1977 F 4 01/03/77 F

5 01/04/1977 M 5 01/04/1977 Male

6 01/05/1977 F 6 01/05/1977 F

7 01/06/1977 M 7 1/6/77 12:00 AM M

8 01/07/1977 M 8 01/07/1977 m

9 01/08/1977 F 9 08-Jan-77 F

10 01/09/1977 F 10 01/09/1977 f

Notice in the Good Example above that the date variable has the same format (mm/dd/yyyy)
and the sex variable is consistent throughout in both case and type (character variable). In the
Bad Example the date variable is in different formats without a 4-digit year for all the
observations. The sex variable is still a character variable, but statistical software will read
this variable as having six different levels instead of two.
7. Use only one worksheet for your data.

• If you decide to use multiple sheets for your data, follow the variable naming
conventions for the tabs that name the sheets (keep the names simple and
unique).
8. Do not "stack" data on the same sheets.

• For example, treated versus non-treated patients can be handled by column


variable that has a code for Treated (yes/no).
9. Avoid using "special" Excel features, example, hidden columns, graphs, colors, italics,
bold on the data sheet that is your primary database.

• These features can be used on a separate "subset" or "analysis" spreadsheets.


10. Document your database with a data dictionary and/or codebook.

• It is a good idea to document what your variables are and what they mean. The
data dictionary should include all of the variable names, data type that
corresponds to the variable, a label or longer name that describes the variable
including the units it is measured in, the codes for any categorical variables, and
any notes for the variable. This can be a separate worksheet or document file.

Sample Data Dictionary:


Variable Name Data Description Codes Notes
Type

ID Numeric Patient ID

Treated Numeric Treatment group 1=treated,


2=control

Age Numeric Age (years)

Sex Character Gender M=Male,


F=Female

Height Numeric Height (cm) Blanks=missing data

Weight Numeric Weight (kg) Blanks=missing data

Proc_date Date Procedure date

systolic_BP Numeric Systolic blood pressure Blanks=missing data

diastolic_BP Numeric Diastolic blood pressure Blanks=missing data

You might also like