Kirkpatrick Model Handout
Level 1 Evaluation—Reaction
Level 2 Evaluation—Learning
Level 3 Evaluation—Behavior
Level 4 Evaluation—Results
Level 1—Reaction
The level one questionnaires shown in Exhibits 4.3 and 4.4 are
acceptable. The main changes I suggest are to put “neutral” rather
than “agree” in the center of the 8-point rating scale used in
Exhibit 4.3 (actually, I’d probably recommend using a five-point
rating scale) and to include open-ended items about the program’s
strengths and weaknesses. I don’t recommend the questionnaires
shown in Exhibits 4.1, 4.2, or 4.5.
✓ You will know how the participants felt about the training event.
✓ It may point out content areas that trainees felt were missing from
the training event.
✓ It will tell you how engaged the participants felt by the training
event.
Level 2—Learning
knowledge, skills, and attitudes, and (b) what research design should
be used to demonstrate improvement in level two outcomes?
responses are for only one distractor). Note Kirkpatrick’s brief example
of a knowledge test on page 44.
To give you a better idea of the design issues here, I will review several
experimental research designs.
Visual Depiction
of the Design          Design Name
-------------------------------------------------------------------------
    X  O2              Posttest-only nonequivalent
       O4              control group design
-------------------------------------------------------------------------
O1  X  O2              Pretest-posttest nonequivalent
----------------       control group design
O3     O4
-------------------------------------------------------------------------
R  O1  X  O2           Randomized pretest-posttest
R  O3     O4           control group design (Note: this
                       design has random assignment to groups)
-------------------------------------------------------------------------
Here is the basic logic of analysis for each of the designs just listed.
The counterfactual, discussed in an earlier lecture, is estimated slightly
differently in some of these designs, which means that the comparison
may be different from design to design. Generally, you will check each
of the following comparisons for practical significance.
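As a rough illustration of this logic, here is a minimal Python sketch of how the comparisons for the designs above are commonly computed from group means. It is an assumption based on standard practice, not a reproduction of this handout's own list; the function names and the scores are hypothetical.

```python
from statistics import mean


def posttest_only_effect(o2, o4):
    """Posttest-only nonequivalent control group design.

    The control group's posttest (O4) stands in for the counterfactual,
    so the estimated effect is O2 - O4.
    """
    return mean(o2) - mean(o4)


def pretest_posttest_effect(o1, o2, o3, o4):
    """Pretest-posttest control group designs (nonequivalent or randomized).

    The control group's gain (O4 - O3) stands in for the counterfactual,
    so the estimated effect is (O2 - O1) - (O4 - O3).
    """
    return (mean(o2) - mean(o1)) - (mean(o4) - mean(o3))


# Hypothetical knowledge-test scores, for illustration only.
trained_pre = [60, 65, 70]    # O1
trained_post = [80, 85, 90]   # O2
control_pre = [62, 66, 70]    # O3
control_post = [65, 69, 73]   # O4

print(posttest_only_effect(trained_post, control_post))
print(pretest_posttest_effect(trained_pre, trained_post,
                              control_pre, control_post))
```

The arithmetic is the same for the nonequivalent and the randomized pretest-posttest designs. What random assignment changes is how believable the control group is as a stand-in for the counterfactual.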
Level 3—Behavior
Remember that level one and level two outcomes are still important
because participants generally need to react positively to the training
program (level 1 outcome) and they need to learn the material (level 2
outcome) if they are going to be motivated and able to apply what
they have learned when they return to their jobs.
A. These are some factors in the training program or event that can
help facilitate transfer of learning:
B. These are some factors in the receiving organization that can help
facilitate transfer of learning:
1. Use a control group if possible. That is, use the strongest design that
is feasible.
3. Evaluate both before and after the program if practical. Again, use
the strongest design that is feasible.
Level three is often harder than level one and level two evaluation
because behavior changes at the workplace are often harder to
measure than reaction and learning directly after the training event.
You must give the behavior time to transfer, and then collect data at
the workplace.
Probably the most common design used for level three evaluation is
the one-group pretest-posttest design (i.e., get a baseline measure of
the behavior you plan to train, train the participants, and then
measure the participants’ behavior again after the training). If you are
able to include a control group, you can use the pretest-posttest
nonequivalent control group design (i.e., in addition to measuring the
training participants before and after the training, you also find a set
of similar people who do not undergo the training to serve as the
control group, and you measure their behavior before and after the
training program). Earlier I showed the comparisons you make for the
different designs during data analysis.
Level three outcomes are required for level four outcomes (i.e.,
they are the intervening variables or factors that lead to level
four outcomes); therefore, it is good news when level three
outcomes are found.
Level 4—Results
Here your goal is to find out if the training program led to final results,
especially business results that contribute to the “bottom line” (i.e.,
business profits). Level four outcomes are not limited to return on training
investment (ROI). Level four outcomes can include other major results
that contribute to the effective functioning of an organization. Level four
includes any outcome that most people would agree is “good for the
business.” Level four outcomes are either changes in financial
outcomes (such as positive ROI or increased profits) or changes in
variables that should have a relatively direct effect on financial
outcomes at some point in the future.
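For the financial side of level four, ROI is often computed as net benefits divided by costs, expressed as a percentage. The sketch below assumes that widely used formula, which this handout does not itself spell out; the function names and dollar figures are hypothetical.

```python
def roi_percent(benefits, costs):
    """ROI as a percentage: (benefits - costs) / costs * 100."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return (benefits - costs) / costs * 100


def benefit_cost_ratio(benefits, costs):
    """Benefit-cost ratio: monetary benefits per unit of cost."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return benefits / costs


# Hypothetical figures, for illustration only.
program_costs = 20_000       # total cost of the training program
monetary_benefits = 50_000   # benefits attributed to the program

print(roi_percent(monetary_benefits, program_costs))
print(benefit_cost_ratio(monetary_benefits, program_costs))
```

The hard part in practice is not this arithmetic but deciding how much of the benefit can credibly be attributed to the training, which is where the research designs matter.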
2. Allow time for results to be achieved. In other words, many level four
outcomes will take some time to occur.
3. Measure both before and after the program if practical. Again, use
the strongest experimental design that is feasible.
5. Consider costs versus benefits. You may not want to perform a level
four evaluation if the costs of that evaluation are high in comparison to
the potential benefits or impacts of the training program.
Level four outcomes tend to fall far down outcome lines, which means
that many intervening factors must take place in order for the level
four outcomes to take place. This means that we should not be overly
optimistic in expecting large level four outcomes from single training
programs.