Weiss 1995
NEW APPROACHES TO EVALUATING COMMUNITY INITIATIVES
attack on the social and economic constraints that lock poor children and
families in poverty. They bring local residents into positions of authority
in the local program, along with leaders of the larger community, public
officials, and service providers. Examples of foundation-sponsored initia-
tives include Annie E. Casey Foundation’s New Futures Initiative, Pew
Charitable Trusts’ Children’s Initiative, and the Ford Foundation’s Neighborhood and Family Initiative. Recent federal programs, such as the
Empowerment Zone and Enterprise Community Initiative, include some
parallel features.
A number of evaluations have been undertaken to discover the effects
of the recent initiatives. Much effort has gone into developing appropriate
outcome measures that can indicate the degree of success-or at least
progress-in attaining desirable results. The evaluation strategies being
used and proposed have tended to follow standard evaluation practice,
emphasizing quantitative measurement on available indicators of out-
come, sometimes supplemented by case studies. Influential members ofthe
foundation community have wondered whether these evaluation strategies
fit the complexity of the new community initiatives and the knowledge
needs of their practitioners and sponsors.*
It is in this context that I suggest an alternative mode of evaluation,
theory-based evaluation. In lieu of standard evaluation methods, I advance
the idea of basing evaluation on the “theories of change” that underlie the
initiatives. I begin by describing this evaluative approach and discussing its
advantages. I then make a preliminary attempt to elucidate the theories, or
assumptions, on which current initiatives are based. Although this is a
speculative enterprise, its aim is to suggest the kinds of questions that
evaluation might address in the current case. The paper concludes with
some issues concerning the feasibility of theory-based evaluation and a
discussion of steps that might test its utility for the evaluation of CCIs. The
paper is meant as a contribution to the discussion of how evaluation can
derive the most important and useful lessons from current experience.
THEORY-BASED EVALUATION
Chen 1990; Lipsey 1993). The evaluation should surface those theories
and lay them out in as fine detail as possible, identifying all the assumptions
and sub-assumptions built into the program. The evaluators then con-
struct methods for data collection and analysis to track the unfolding of the
assumptions. The aim is to examine the extent to which program theories
hold. The evaluation should show which of the assumptions underlying
the program break down, where they break down, and which of the several
theories underlying the program are best supported by the evidence.
Let me give a simple example. There is a job-training program for
disadvantaged youth. Its goal is to get the disadvantaged youth into the
work force (thus forestalling crime, welfare dependency, drug use, and so
forth). The program’s activities are to teach “job-readiness skills”-such as
dressing appropriately, arriving on the job promptly, getting along with
supervisors and co-workers, and so on-and to teach job skills. What are
the assumptions-what is the theory-underlying the program?
The theory obviously assumes that youths do not get jobs primarily
because they lack the proper attitudes and habits for the world of work and
they lack skills in a craft. The program’s sponsors may or may not have
considered alternative theories-for instance, that high youth unemploy-
ment rates are caused by forces in the larger economy and by the scarcity
of entry-level jobs with reasonable long-term prospects; or that youth
unemployment is a consequence of youths’ lack of motivation, their
families’ failure to inculcate values of work and orderliness, health prob-
lems, lack of child care, lack of transportation, a lack of faith in the reality
of future job prospects, or ready access to illegal activities that produce
higher financial rewards for less work.
Those responsible for the program may have rejected (implicitly or
explicitly) those alternative theories, or they may believe that alternative
theories are not powerful enough to overwhelm their own theory, or they
may believe that other interventions are concurrently addressing the
factors that their work neglects.
At the program level, the program theory is based on a series of
“micro-steps” that make important assumptions-for example:
• Trainers will offer quality training and they will help youth learn marketable skills.
• Youth will learn the lessons being taught about work habits and work skills.
• Having attained the knowledge and skills, the youth will seek jobs.
• Youth will remain on the job and they will become regular workers with good earnings.
When we examine the theory, we can see how many of the linkages are
problematic. At the program level, we know that the quality of instruction
may be below par. It can be difficult to recruit young people to job-training
programs. Many attendees drop out of the programs; others attend
erratically. In some job-training programs, the promised jobs fail to
materialize; either the skills taught do not match the job market or
employers do not hire the trainees. Many young people get jobs but leave
them in a short time, and so on. There are a host of reasons why the benefits
originally expected from job-training programs are usually so small-in
the best cases resulting in perhaps a 5 to 10 percent higher employment rate
among program participants than among people who do not participate.
The San Diego welfare-to-work program, the Saturation Work Initiative
Model, was heralded in policy circles as a great success after two years, on
the basis of evidence that about 6 percent more of the program participants
than of the control group were employed after two years (Hamilton and
Friedlander 1989). A five-year follow-up indicated that some of the
difference between trainees and controls faded out over time (Friedlander
and Hamilton 1993).
In fact, one reason for the current emphasis on community-based cross-
systems reform is the need to deal with multiple factors at the same time-
education, training, child care, health care, housing, job creation, commu-
nity building, and so on-to increase the chances of achieving desired
effects. The initiatives aim to work on the whole array of needs and con-
straints, including those that create opportunities, connect young people
to opportunities, and prepare them to take advantage of opportunities.
* * * * *
and that the results generalize to other programs of the same type. These
are strong claims, and inasmuch as only a few large-scale theory-based
evaluations have been done to date, it is probably premature to make
grandiose promises. But certainly tracing developments in mini-steps,
from one phase to the next, helps to ensure that the evaluation is focusing
on real effects of the real program and that the often-unspoken assump-
tions hidden within the program are surfaced and tested.
they foresee a process of long-term change; they do not even try to foresee
the ultimate configuration of action. But if we cannot spell out fine-grained
theories of change that would apply generally, we can attempt to identify
certain implicit basic assumptions and hypotheses that underlie the larger
endeavor. That is what the rest of this paper is about.
An Examination of Assumptions
I read a collection of program documents about community-based com-
prehensive cross-sector initiatives for children, youth, and families (Chaskin
1992; Enterprise Foundation 1993; Pew Charitable Trusts, n.d.; Rostow
1993; Stephens et al. 1994; Walker and Vilella-Velez 1992), and here I
outline the theoretical assumptions that I discerned. These assumptions
relate to the service-provision aspects that appear to underlie confidence
that the initiatives will improve the lot of poor people. (I limit attention to
service provision here, even though additional assumptions, including
those about structure and institutional relationships, are also important.)
Some of the assumptions on which the initiatives appear to be based are
well supported in experience; others run counter to the findings of much
previous research. For most of them, evidence is inconclusive to date.
* * * * *
THE PROVISIONALITY OF THE UNDERLYING HYPOTHESES
Some of the hypotheses in the list are well supported by evidence and ex-
perience. Some are contradicted by previous research and evaluation. For
The aim of this paper has been to indicate a style of evaluation that
comprehensive community initiatives might pursue. Evaluators could set
forth a number of hypotheses that underlie the initiatives. After discussing
relevant factors with program participants and reaching agreement on
theories that represent the “sense of the meeting,” the evaluators would
select a few of the central hypotheses and ask: To what extent are these
theories borne out in these cases? What actually happens? When things go
wrong, where along the train of logic and chronology do they go wrong?
Why do they go wrong? When things go right, what are the conditions
associated with going right? Also, the evaluation could track the unfolding
of new assumptions in the crucible of practice. The intent is not so much
to render judgment on the particular initiative as to understand the
viability of the theories on which the initiative is based. The evaluation
provides a variegated and detailed accounting of the why’s and how’s of
obtaining the outcomes that are observed.
But sponsors and participants may also want periodic soundings on
how the local program is faring and how much it is accomplishing. For
purposes of accountability, they may want quantitative reports on progress
toward objectives. Theory-based evaluation does not preclude-in fact, is
perfectly compatible with-the measurement of interim markers and long-term outcomes, such as high school graduation rates, employment rates, or
crime rates. As a matter of fact, if wisely chosen, indicators of interim and
long-term effects can be incorporated into theory-based evaluation.
Indicators can cover a gamut of community conditions before, during,
and after the interventions. Evaluators can collect information on:
• overall crime rates, auto theft rates, arrests of minors, and other crime statistics;
Such data can give some indication of the state of the community before
the initiatives start up, and they can be periodically updated. However,
they represent gross measures of the community, not of individuals in the
community. To find out about individuals (by age, race/ethnicity, income
level, gender, family status, and so on), indicator data can be supplemented
by survey data on a random sample of individuals in the community.
Periodic updates can show whether changes are taking place, in what
domains, and of what magnitude, and they allow comparison of those who
received direct help versus those who did not, two-parent versus one-
parent families, and so forth.
The shortcomings of relying only on indicator data are several-fold:
2. Any changes that show up in the data are not necessarily due to
the initiative. (This is true not only in the case of community-
based indicators, but of survey data on individuals.) Many things
go on in communities other than the intervention. Economic
changes, new government programs or cutbacks of programs,
influx of new residents, outflow of jobs, changes in the birthrate-
all manner of exogenous factors can have enormous consequences
in this volatile time. It would be difficult to justify giving the
credit (or blame) for changes (or no changes) on outcome indica-
tors to the initiatives.
4. One of the key features of CCIs is their belief that it is vital not only
to help individuals but also to strengthen the community, and that
strengthening the community will reciprocally work to trigger,
reinforce, and sustain individual progress. CCIs tend to believe in
the significance of changes at the community level, both in and of
themselves and as a necessary precondition for individual ad-
vancement, just as they believe that individual improvement will
support a revitalized community. But few data are systematically
and routinely collected at the level of the neighborhood, and those
data that are available rarely fit the boundaries of the neighbor-
hood as defined by the CCI. It is problematic how well available
indicators can characterize community-level conditions.
Problems of Theorizing
A first problem is the inherent complexity of the effort. To surface
underlying theories in as complex and multi-participative an environment
as these communities represent will be a difficult task. At the first level, the
level of the individual stakeholder, many program people will find the task
uncongenial. It requires an analytical stance that is different from the
empathetic, responsive, and intuitive stance of many practitioners. They
may find it difficult to trace the mini-assumptions that underlie their
practice, dislike the attempt to pull apart ideas rather than deal with them
in gestalts, and question the utility of the approach.
The next level arrives when agreement is sought among participants
about the theory of the whole CCI. There is likely to be a serious problem
in gaining consensus among the many parties. The assumptions of
different participants are likely to diverge. Unless they have had occasion
before to discuss their different structures of belief, there will be a
confrontation over what the real theory of the CCI is. When the confrontation surfaces widely discrepant views, it may prove to be unsettling, even
threatening. I believe that in the end, the attempt to gain consensus about
the theoretical assumptions will prove to have a beneficial effect on
practice, because if practitioners hold different theories and aim to achieve
different first- and second-order effects, they may actually be working at
cross-purposes. Consensus on theories of change may in the long run be
good not only for the evaluation but for the program as well. But getting
to that consensus may well be painful.
There is a third level, which comes when a CCI goes public with its theoretical statement, whether formally or informally. A CCI may run
political risks in making its assumptions explicit.5 Canny community
actors do not always want to put all their cards on the table. Such revelation
may lay them open to criticism from a variety of quarters. Particularly when
racial and ethnic sensitivities are volatile, even the best-meaning of
assumptions may call forth heated attacks from those who feel slighted or
disparaged as well as from those who dispute the analytical reasoning of the
theories proposed.
Problems of Measurement
Once consensual theories of change are in place, evaluators have to develop
techniques for measuring the extent to which each step has taken place.
Have agencies adapted their procedures in ways that enable them to
function in a multi-agency system? Have practitioners reinterpreted their
roles to be advocates for clients rather than enforcers of agency rules? Some
of the mini-steps in the theories of change will be easy to measure, but
some-like these-are complicated and pose measurement problems.
Whether they will all lend themselves to quantitative measurement is not
clear. My hunch is that some will and some will not.
Whether exclusively quantitative measurement is desirable is also not
clear. To the extent that theory-based evaluation represents a search “for
precise and decomposable causal structures” (Rockman 1994, 148) through
quantitative measurement and statistical analysis, it may be taking too
positivistic a stance. The logic of qualitative analysis may be more compel-
ling, since it allows not only for rich narrative but also for the modification
of causal assumptions as things happen in the field. But since sponsors
often find quantitative data more credible than narrative accounts, efforts
should probably be made to construct measures of key items.
Problems of Interpretation
Even if we should find theories that tend to explain the success of particular
initiatives in particular places, it is uncertain how generalizable they will be.
Will interventions in another community follow the same logic and bring
about the same outcomes? On one level, this is a question of how sufficient
the theories are. It is possible that even when available data seem to support
a theory, unmeasured conditions and attributes in each local case actually
were in part responsible for the success observed. Unless other CCIs
reproduce the same (unmeasured and unknown) conditions, they will be
unable to reproduce the success. Only with time will enough knowledge
accrue to identify all the operative conditions.
On a deeper level, the question involves the generalizability of any
theory in the social sciences. Postmodern critics have voiced disquieting
doubts on this score. But this subject gets us into deeper waters than we can
navigate here.
CONCLUSION
For all its potential problems, theory-based evaluation offers hope for
greater knowledge than past evaluations have generally produced. I believe
that the current comprehensive community initiatives should try out its
possibilities. If we are to make progress in aiding children and families, the
NOTES
REFERENCES
Pew Charitable Trusts. n.d. The Children’s Initiative: Making Systems Work, A Program of The Pew Charitable Trusts. Typescript. Philadelphia: Pew Charitable Trusts.
Rockman, Bert A. 1994. “The New Institutionalism and the Old Institutions.” In New Perspectives on American Politics, ed. L. C. Dodd and C. Jillson, pp.
143-61. Washington, DC: CQ Press.
Rostow, W. W. 1993. “The Austin Project, 1989-1993: An Innovational Exercise
in Comprehensive Urban Development.” Paper prepared for Seminar on
Inner City Poverty, Yale University Institution for Social and Policy Studies,
October 1993.
Schorr, Lisbeth B. 1988. Within Our Reach: Breaking the Cycle of Disadvantage.
New York: Anchor Press/Doubleday.
———. 1991. “Attributes of Effective Services for Young Children: A Brief
Survey of Current Knowledge and its Implications for Program and Policy
Development.” In Effective Services for Young Children, ed. L. B. Schorr,
D. Both, and C. Copple. Washington, DC: National Academy Press.
———. 1994. Personal communication, August 9.
Shadish, W. R., Jr. 1987. “Program Micro- and Macrotheories: A Guide for Social
Change.” In Using Program Theory in Evaluation, ed. Leonard Bickman. New
Directions for Program Evaluation, No. 33, pp. 93-108.
State Reorganization Commission (South Carolina). 1989. An Evaluation of the Human Service Integration Project, 1985-1988. December. Columbia, S.C.:
State Reorganization Commission.
Stephens, S. A., S. A. Leiderman, W. C. Wolf, and P. T. McCarthy. 1994.
Building Capacity for System Reform. October. Bala Cynwyd, Pa.: Center for
Assessment and Policy Development.
Walker, Gary, and Frances Vilella-Velez. 1992. Anatomy of a Demonstration: The Summer Training and Education Program (STEP) from Pilot through Replication and Postprogram Impacts. Philadelphia, Pa.: Public/Private Ventures.
Weiss, Carol H. 1972. Evaluation Research: Methods of Assessing Program Effectiveness. Englewood Cliffs, N.J.: Prentice-Hall.
Wilner, Daniel, and others. 1962. The Housing Environment and Family Life: A
Longitudinal Study of the Effects of Housing on Morbidity and Mental Health.
Baltimore: Johns Hopkins Press.
Wilson, William Julius. 1987. The Truly Disadvantaged: The Inner City, the
Underclass, and Public Policy. Chicago: University of Chicago Press.