Data Driven Decision Making
Effective Data-Driven Decision Making
Philip A. Streifer
Scarecrow Education
Lanham, Maryland • Toronto • Oxford
2004
Published in the United States of America
by ScarecrowEducation
An imprint of The Rowman & Littlefield Publishing Group, Inc.
4501 Forbes Boulevard, Suite 200, Lanham, Maryland 20706
www.scarecroweducation.com
PO Box 317
Oxford
OX2 9RU, UK
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences - Permanence of Paper for Printed Library Materials, ANSI/NISO Z39.48-1992.
Manufactured in the United States of America.
Contents
5 Case Studies/Examples 79
Index 155
Chapter One
know how to ask the right questions."1 At one level, I agree; in fact, my previous book co-published with AASA took that very position, but my further research into this issue reveals that most administrators ask great questions - complex and powerful ones about teaching and learning. Thus, it would appear that the problem is not that school leaders fail to ask the right questions, but that they do not know how to deconstruct their questions into doable analyses (or subqueries) using sound research principles and techniques. This is an important distinction, one that if not solved through training and the development of new decision-support systems, could render this latest round of school improvement efforts another failure in the long list of attempts since A Nation at Risk. An even greater problem is that educators do not have access to essential tools to address their most important and complex questions that typically focus on causality - identifying those variables that predict some outcome. Thus, it appears that collectively we are now at the "data admiration stage," ready to move to the next phase - but what will it really take to move from admiration to action?
My work with school leaders on data-driven decision making reveals that too few educators possess the knowledge and skills needed to address these tougher, more complex questions. For example, to address a straightforward question such as "does our early intervention program have a positive impact on elementary school success?" requires a complex set of procedures to answer. It assumes an awful lot about what educators know and can do in this arena of data-driven decision making. My experience tells me that precious few of us know how to answer this question, have access to the resources to have it analyzed, or even have the required data to do the analysis. Many of the so-called data-driven decision-making processes and toolkits used in the field today are excellent overviews of what is required, but most fail on one major point - the assumptions they make about our collective knowledge, skills, and resources to get the job done.
With the passage of No Child Left Behind (NCLB), the problems these deficiencies unveil are quickly coming to the forefront of our attention. For example, all of a sudden a new language is thrust on us: "adequate yearly progress by subgroups." And, as is evident in state after state, these are not simple issues to understand and effectively deal with. They are made complex by the science of psychometrics, the limits of testing, and the politics of accountability.
The New Literacies Required for Data-Driven Decision Making
Consider for a moment the knowledge and skills needed by school administrators to properly respond to the federal initiative No Child Left Behind. They need to:
MAKING DECISIONS
but it is revealed as we look through binoculars, and we see even more detail through an eight-inch telescope, and so on. It is in this detailed view that we learn what this phenomenon really is, a vast area of dust and matter "lit up" by four hot young stars. My work in data-driven decision making reveals the same phenomenon - that when we drill down into the detailed data using more sophisticated analyses, we often uncover new and important information that can be persuasive enough to move us to action. Moreover, when we are presented with persuasive data that addresses a need or concern, we will act. If we need a car, the offer of zero percent financing is enough to move us to a purchase decision rather than having the old buggy fixed.
There is a growing national movement to hold schools and school districts accountable today. The ESEA No Child Left Behind Act of 2001 is adding to that pressure at both the state and local levels. The law (or rather, the state implementation of the federal definition of "adequate yearly progress") will reveal the compelling news - it will identify schools that do not make adequate yearly progress over three-year intervals and, as a result, will be sanctioned in some way. This law will certainly be persuasive, but will it tell us what needs to change even if we are compelled to do so? Returning to our automobile example, many of us passed up on this opportunity, even though we were presented persuasive information about total cost of ownership. Why? It is likely that many of us performed a cost-benefit analysis (Do I really need a new car? How long will mine last? What will it cost even at zero percent interest?) and decided that even such a great deal as this was not worth incurring an unnecessary expense.
For the automobile example, we are capable of performing the cost-benefit analysis because all of the needed information is available. Moreover, we are familiar with these cost-benefit analyses when making major purchase decisions by calculating the pros and cons.
However, what useful detailed data do we have available to calculate "adequate yearly progress" that can lead to meaningful corrective actions? The most useful data we have is a simple "roll-up" or summary report based on aggregate school averages, showing how one school did versus another. These data tell us little about which students did well over time, which did not, who is contributing to flat performance (the impact of mobility), what programs they participated in, and so on.
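The gap between a roll-up report and student-level data can be sketched in a few lines of Python. All student IDs and scores below are invented for illustration:

```python
# A minimal sketch of the difference between a "roll-up" report and
# a student-level drill-down. All names and scores are invented.

records = [
    {"student": "S1", "year": 2002, "score": 230},
    {"student": "S1", "year": 2003, "score": 250},
    {"student": "S2", "year": 2002, "score": 250},
    {"student": "S2", "year": 2003, "score": 235},
    {"student": "S3", "year": 2003, "score": 250},  # moved in mid-stream
]

# Stage I roll-up: one average per year. 2002 -> 240.0, 2003 -> 245.0,
# which looks like modest schoolwide improvement.
def school_average(year):
    scores = [r["score"] for r in records if r["year"] == year]
    return sum(scores) / len(scores)

# Student-level view: S1 gained 20 points, S2 lost 15, and the 2003 average
# is propped up by a newly arrived student - none of which the roll-up shows.
def gain(student):
    by_year = {r["year"]: r["score"] for r in records if r["student"] == student}
    return by_year[2003] - by_year[2002] if len(by_year) == 2 else None
```

The roll-up alone would suggest steady progress; only the per-student view reveals who is declining and how mobility shapes the average.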
Since the unit of analysis is now the school, from a simplistic point of view, a straightforward report of how the school did from year to year will suffice - at least from the federal perspective. But to know specifically what to change, we need the detailed or student-level data to do more complex, "drill-down" analyses. In my car example, just seeing the commercial will not move me to buy; only after careful analysis of the details, and a comparison of the various options, am I able to make a reasoned decision.
Persuasive data - data that moves us to take corrective action - come in two forms. Primarily, we need to see the big picture, but we also need to know the details about what should be changed - if anything. At that point, we are likely to define and decide what action(s) to take. And, by the way, sometimes a reasonable decision is a conscious one to do nothing.
Thus, at one level, we can see that learning about an issue from a simple summary report or seeing it at a basic level (learning about zero percent financing or seeing the Orion Nebula through binoculars) is the first level of decision making or learning about an issue. However, actions based on deeper learning require more complex analysis found in detailed/student-level data and often require more sophisticated tools and techniques to complete. Just learning that a school is deficient under NCLB analysis will not lead to improvement - specific corrective actions are needed. Deeper analyses of the detailed/student-level data are required to identify just what these corrective actions should be.
leaders and teachers are ones of causality. Yet this is a new and emerging field - one that we are just now beginning to explore. Chapter 4 demonstrates what it takes to perform this level of analysis and what we can know about causality given that we are very early in the development of these tools and techniques.

Thus, the purpose of this book is to explain, by example, what can be learned about student achievement at Stages I, II, and III with a focus on what is knowable at the causality stage and what it takes to get this work done.
SUMMARY
We make many decisions in our daily lives, some quick and easy, some tough and drawn out. We are learning that arriving at decisions about improving achievement is challenging work because the variables are not that clear, often missing or unmatched with those we want to compare, and the analyses needed to gain actionable information are hard and complicated. Fortunately, there are tools available to help, but this work requires a new set of literacies. Interestingly, even when this work is conducted properly, one is often left with a better sense of "informed intuition" about what to do, but not with an absolute, certain direction.
What we can say at this point is that the review of summary Stage I reports is not likely to significantly inform our intuition or guide decision making. However, it is a necessary first step to decide what to investigate more deeply. Interestingly, No Child Left Behind reports are at the Stage I level. To use another analogy, it's like going to the general practitioner, who performs a basic examination and then decides to send you to a specialist to investigate a particular symptom. Without this necessary first step, we don't know what specialist to go to, which can result in a waste of precious time and resources. Similarly, NCLB analyses need deeper follow-up to learn specifically what improvements are critical. To achieve the promise of NCLB, we must proceed to Stage II at least. It is in Stage II that we may find data that can truly guide decision making if performed properly and interpreted correctly.
Thus, Stage I reports may uncover persuasive trends that prompt us to reinvestigate the data for hidden truths that will guide intervention
NOTES
Chapter Two
often run. The problem is a function of today's technology coupled with what school districts can afford, along with the typical issues associated with messy school data.
School data can be (and often are!) messy and incomplete. For example, to run a mobility query for sixth grade, we need to know the accurate list of students in the school at the beginning of sixth grade (preferably at the end of fifth) and the accurate list at the time of the test administration (let's say in May). Since the purpose is to determine how students who spent the entire year in school performed, disaggregated from those in attendance for only part of the year, we need to create two groups: "Present All Year" and "Present Part of the Year."

The problem I find with most school data is that schools typically drop from their student information systems students who leave during the year, sometimes reassigning that student's ID to another student. When this happens, all electronic records of the first student are lost. Frankly, I am surprised to see how often this happens.
Even if you have all of these records saved, you will need to sort the data by date in the student registration (database) table. Thus, we will want to retrieve, on one list, all students enrolled from the first to the last day of school and, on the second list, any student who was enrolled for some lesser amount of time and the dates of that attendance. That can require a sophisticated set of commands for the query engine (of the database) depending on what system you are using. To complicate matters even further, at this point it is possible to find that the database field containing this attendance information for each child is not properly formatted as a date/time field - the wrong format would make the sort impossible.
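As a rough sketch of that two-group split, the Python below assumes hypothetical field names, ISO-formatted dates stored as text (which must be coerced to real dates before any sorting or comparison), and invented students; real student information systems are rarely this clean:

```python
from datetime import date

# Hypothetical registration rows; enroll/withdraw fields are stored as text,
# as they often are in practice, so they must be parsed before comparing.
rows = [
    {"id": "A", "enrolled": "2002-09-03", "withdrawn": ""},            # full year
    {"id": "B", "enrolled": "2003-01-15", "withdrawn": ""},            # moved in
    {"id": "C", "enrolled": "2002-09-03", "withdrawn": "2003-02-01"},  # moved out
]

FIRST_DAY = date(2002, 9, 3)   # start of the school year
TEST_DAY = date(2003, 5, 15)   # test administration (say, in May)

def parse(s):
    # Empty string means the student never withdrew.
    return date.fromisoformat(s) if s else None

def classify(row):
    start = parse(row["enrolled"])
    end = parse(row["withdrawn"])
    full_year = start <= FIRST_DAY and (end is None or end >= TEST_DAY)
    return "Present All Year" if full_year else "Present Part of the Year"

groups = {r["id"]: classify(r) for r in rows}
```

If the date fields are not stored in a parseable format, the `parse` step fails before any grouping can happen, which is exactly the formatting trap described above.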
To resolve some of these problems, several vendors of data warehousing systems create special database tables that resolve these issues for running a mobility query. But doing so requires a fair amount of data cleanup and can become expensive. Unless you do the cleanup or have it done for you, running the mobility query is going to be a nightmare.
Data-Driven Decision Making - Where Are We Today? State of the Technology
DATA INTEGRITY
There is help on the horizon. Many school MIS directors are realizing that they need to capture and retain student IDs and teacher IDs, and create uniform course IDs across schools. Most data warehousing systems, which are becoming more affordable, are including specially designed data tables (created by changing or transforming your raw data) to assist in these more complicated queries, such as mobility. And finally, there is the School Interoperability Framework (SIF)1 initiative that is working to standardize school data across disparate systems and enable more reliable data transfers from system to system.
For the short term, though, your more complicated queries will require patience and skill due to the inherent messiness of school data and the limits of the technology. Over time these problems will be resolved, but not for the immediate future.
Working with an affluent school district recently, where the staff are very data driven in their approach to school improvement, I was impressed with the analyses that they had been conducting. Their testing and evaluation director had been disaggregating several of the state mastery test data for the district's schools, with many of the schools themselves running further analyses. When I asked what tools they were using, almost all of them said "paper and pencil," "calculators," and a few said "Microsoft Excel." Regarding the latter, I asked how they had entered the data into Excel. Puzzled at the question, they said that they had entered it all by hand!
This characterizes the state of the art for most of the districts with whom I have worked or spoken over the past few years. No doubt, this labor-intensive approach seriously handicaps the extent to which educators can explore relationships between inputs (such as program enhancements, resources, instructional techniques, and materials) and outcomes. I will have more to say on the problems this situation creates in responding to accountability demands in chapter 6, but suffice it to say for now that working under these conditions is analogous to trying to diagnose a serious medical condition with mid-twentieth-century medicine.
run a query through to its logical conclusion or to the point where the data
for follow-up queries do not exist.
Open Architecture
The most useful data warehouse is the full open-architecture system because all of your data, regardless of source and type, can be loaded for use in querying, and new data can be easily added at any time with minimal or no extra cost. These systems have come down in cost, whereas five or more years ago even a basic system could cost upwards of $250,000 for a relatively small school district (around 10,000 students). That same system is now available for far less, with prices ranging from $4 to $10 per student depending on how much functionality you initially purchased.
These cost savings have been realized with more powerful, yet less expensive, databases, query engines, and web technologies. The advantage to this type of system is that it can hold all of your data, making dynamic querying only a function of user knowledge and skill.
Closed Architecture

The closed-architecture data warehouse takes selected data from your district and fits or maps it to the vendor's predesigned database structure. The advantage to this system is that the core database structure is fixed, no matter whose data are being loaded. Thus, to the end user, data from district A will look the same as data from district B. The remapping process can force users to employ the vendor's definitions for the data elements. Obviously, the data from both districts are secure from one another, but the underlying structures are identical, unlike some open-architecture systems, where the structures will differ and more fully reflect what the district has been familiar with.
At least one vendor has created the best of both worlds - a fixed data model that is so robust that virtually any school data can be mapped into the system. This example can be considered a hybrid - both fixed yet capable of handling all data - thus taking advantage of the benefits of both systems.
What is the difference? Let's say that your school uses a unique system of computing grade point average (GPA). In fact, you may use several forms of GPA - weighted, unweighted - and you may calculate this quarterly. In an open-architecture system, all of this data can easily be loaded. In the fixed system, there may not be a place for the quarterly unweighted GPA - it may be dropped, not loaded, or mapped to a field called "nonstandard" GPA.
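That mapping step can be sketched as follows; the field names and the catch-all "nonstandard" bucket are illustrative assumptions, not any vendor's actual schema:

```python
# Fixed vendor schema: only these GPA fields exist (hypothetical names).
VENDOR_FIELDS = {"gpa_weighted", "gpa_unweighted", "gpa_nonstandard"}

# District extract with a quarterly unweighted GPA the schema has no slot for.
district_record = {
    "gpa_weighted": 3.7,
    "gpa_unweighted": 3.4,
    "gpa_unweighted_q2": 3.5,  # quarterly value with no direct match
}

def map_to_vendor(record):
    mapped, leftovers = {}, {}
    for field, value in record.items():
        if field in VENDOR_FIELDS:
            mapped[field] = value
        else:
            leftovers[field] = value
    # The closed system's options: drop the leftovers, or lump them into a
    # catch-all "nonstandard" field, losing their original meaning.
    if leftovers:
        mapped["gpa_nonstandard"] = leftovers
    return mapped

result = map_to_vendor(district_record)
```

The quarterly value survives only inside the catch-all field; any query that expects a first-class "quarterly unweighted GPA" column can no longer find it.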
However, the advantage that the closed/fixed-architecture system has is the ability of the vendor to develop predefined queries that can run no matter what district is using the system, which cuts the cost of delivering these reports. In some of these systems, the vendor, not the district personnel, actually runs the query. For example, the district might call the vendor, ask them to run a query on mobility for a particular year and cohort of students, and receive the results in a day or two.
There is at least one major disadvantage to closed-architecture systems. They typically do not as easily (or affordably) accept all of your data as the open-architecture system - you would have to pay extra for that quarterly unweighted GPA to be loaded or to add your own school-based testing data. However, this is not a problem for the hybrid systems - systems that have all the advantages of both closed- and open-architecture designs. As the technology advances, more vendors will certainly adopt this design structure.
Data Cube
The third option, and a fairly popular one, is the system built around the data-cube design. The technical name for this version is the OLAP design; OLAP stands for online analytical processing. These specially designed systems take subsets of data and design them into hierarchical relationships. These designs are fixed, like the fixed-architecture system described above, but they too can be reengineered to accommodate more data over time (at additional cost). Once these hierarchical data relationships are designed, the user is given access via a web interface that is fairly easy to use and often allows for graphic data representations. Both of the previous data warehouse designs (open- and closed-architecture) also allow for graphic representations of data, but the data cube is particularly strong in this regard.
The disadvantage to the data-cube design is that it is, for all intents and purposes, a very fixed design, and querying ability is limited to the hierarchical relationships that the designer built into the system. If your ad hoc query requires comparing two data elements that are not logically linked in this hierarchical relationship, you are out of luck.
However, the advantages to these systems are: (1) ease of use - they are particularly easy to use, even for novice users; (2) they are web accessible; and (3) they are particularly strong in disaggregating data along the predefined hierarchical relationships designed into the system.
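Both the strength and the limit of the cube can be seen in a toy sketch (school, grade, and subgroup names are invented): a query that follows the prebuilt hierarchy is trivial, while a comparison outside it simply is not there to ask.

```python
# Toy "cube": scores pre-aggregated along a fixed hierarchy of
# school -> grade -> subgroup. All numbers are invented.
cube = {
    ("Lincoln", 4, "all"): 245,
    ("Lincoln", 4, "ELL"): 228,
    ("Lincoln", 5, "all"): 251,
    ("Adams", 4, "all"): 239,
}

def drill_down(school, grade, subgroup="all"):
    # Easy: the question follows the designed hierarchy, so the answer
    # is a simple lookup; anything outside it returns nothing.
    return cube.get((school, grade, subgroup))

# An ad hoc question such as "compare fourth-grade ELL scores against
# course enrollment" cannot be asked at all: course data was never
# designed into the cube's hierarchy.
```

This is the trade-off in miniature: disaggregation along the designed paths is nearly free, while any relationship the designer did not anticipate is out of reach.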
Selecting the system that is right for you ultimately depends on what you want to do with it. I advise school districts to make a list of the queries they want to run (a comprehensive list - see chapter 3 for a discussion on how this plays out) and then determine the utility vs. cost of a system by whether it can address the majority of these queries, what it will cost, and how difficult it is to use. You will also need to ask about the process of adding new data, how often you can refresh the existing data, and the associated costs for these updates.
As to ease of use, none of these systems are simple - ease of use here is a relative term. Nevertheless, some systems are, in fact, easier to use than others. To take advantage of the first system - open architecture - you will need a staff that has three skill sets:

(1) someone who knows their way around a computer (I use the example of someone who buys a new program, loads it, and gets it running without ever looking at the users' manual);
(2) someone who knows your data (again, by example, someone who knows that the names/codes for algebra courses might not be the same across schools or knows that the "standard" score is different from the "raw" score and how it is different); and
(3) someone who knows and can run basic statistics.
Many districts that have these systems assign full ad hoc query access
to a few advanced users who have most or all of these skills and who can
then create queries/reports and post them for other staff to view.
Finally, if you want to provide all of your users access to some of the data for basic data disaggregation, the data cube is the way to go. However, not all of your queries can be run with these systems (fixed-architecture and/or data-cube), regardless of who is using them (hybrid systems aside). Again, the final selection of a decision-support system and data warehouse should be based on a cost-benefit analysis of the questions you have and whether the system can adequately address those questions.
INDEPENDENT REVIEWS
RECOMMENDATION
[Figure: sample database tables - Testing, Grades, Professional Staff Development, and Demographics - linked by a common Teacher_ID field.]
populated the database with the students' names, their mastery scores, and what courses each student took, but had not linked every course to a department code. When the computer tried to pull up courses by department code, the system crashed. It is interesting to note that this issue of missing department codes is a common problem with school data - one of those problems associated with data messiness discussed earlier. You should discuss these problems with vendors to determine the extent to which they may be resolved. Vendors are beginning to solve these problems by engineering very specialized plumbing designed around the unique characteristics of school data.
place to determine which system is right for you is to start with the questions your school leaders ask, since they know best the important (and contextual) questions to improving student achievement in their school. These questions almost always require access to data elements selected from among all the data domains that are collected by a district. In general, when you try to fit your data into some predetermined design, the system will not work or will have to be re-engineered at increasing cost to accommodate all of those data.
Finally, data cubes limit the users' queries not only to the data that have been loaded, but to the hierarchical relationships built into the system. Cube systems are very useful, however, if you keep in mind what they were designed for. These systems make exploration of selected data elements and their relationship to other selected data elements extremely easy to accomplish. Thus, if you want to give users some control over an easy-to-use system, albeit limited in what queries it can run, then the data cube is a good option. Several vendors are now packaging data cubes with other report options to expand the usefulness of the overall system.
Once a decision-support system is selected, what can be done with it in the larger sense: that is, what are these systems' inherent strengths and limitations in total? As we will see in the next section, no matter what system you have, each one has certain limitations in terms of what it can do as measured against our expectations. Throughout this discussion a basic gap emerges between our desired reporting needs and the analyses needed to create them. Unfortunately, none of these systems are as good as they need to be for many of our analyses.
that - more in-depth analyses are essential to reveal the changes necessary to improve achievement.

So, as we enter the twenty-first century, educators are deluged with data and information as they search for ways to summarize it, make sense of it, and look for trends and aberrations to help plan interventions. It would be wonderful if we had a report as informative as the school-data weather report, but it does not exist as yet. Until then, we are left to reports such as those represented in figure 2.4 - roll-up reports of aggregate data. Subsequent chapters of this book will also describe useful school reports that take detailed data and chart it for interpretation.
example, a cost-benefit analysis of the reading recovery program would need to use root cause procedures, if possible, to try to isolate the reading recovery program as the key input variable. This analysis would likely require longitudinal techniques (because the analysis is done over time) as well as disaggregation techniques (since we need to know how this group did compared to the standard population). The more components of analysis we add, the more challenging the work becomes. Most of the questions asked require multiple strategies to answer.
To explore the challenge of performing a query, let's look at a few examples of typical queries that school leaders ask. Here are a few that I have worked on with school principals and central office administrators:
I have found that most educators' questions are searching for causality. These are questions that require the identification and isolation of some dependent variable, generally an achievement measure such as a test score or drop-out rate, and a series of independent variables - the most
likely factors that are causing the outcome. Of the example questions listed above, they could all fall into this general category, but questions #1, 2, 3, 5, 7, and 8 in the first list are clearly questions of causality at one level of sophistication or another.
One reason why this is such hard work is that it is often very difficult to assemble all the needed data into one database to do the analysis. You might be missing important data altogether (what if you did not capture all the students who attended each preschool - question #3?) or you may not have captured how long they attended each preschool, as it would be important to determine if time in attendance at the preschool mattered.
Now, assuming you had all or most of the data you needed, how would you go about doing this analysis? Questions of root cause require regression analysis, which is not a simple statistic to run and analyze. Even if you got this far, what are the results likely to tell you? When these analyses are run with a computer statistics program, a readout is generated of which input variables best predict the outcome for a given level of statistical probability. However, that level of statistical probability is not always convincing. Here we need to differentiate between statistical and practical significance. As you will see in chapter 4, statistically significant results are not always practically significant - at least not to a level that persuades you to take action.
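As a rough illustration of such a readout, the sketch below fits an ordinary least-squares line to invented data and reports the slope and R-squared; real root-cause work would use a statistical package, multiple predictors, and proper significance tests.

```python
def linreg(x, y):
    # Ordinary least squares for one predictor: slope, intercept, and R^2.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return slope, intercept, r2

# Invented data: hours spent in an intervention program vs. test-score gain.
hours = [0, 5, 10, 15, 20, 25]
gains = [1, 2, 2, 4, 4, 5]

slope, intercept, r2 = linreg(hours, gains)
# The fit is tight (high R^2), yet the slope says each extra hour predicts
# well under a point of gain - statistically detectable, but is it a
# practically significant reason to expand the program?
```

The readout answers "which inputs predict the outcome," but the decision to act still turns on whether the predicted effect is large enough to matter.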
To consider any finding practically significant - that is, one that leads you to take action toward change - you would want to know that this finding could be generalized beyond the "study" group of analyzed students. I have found that interventions are often warranted only when a convincing case is made after review of several cohorts of students (running the same analysis on each cohort). In general, a similar trend (or finding) across three cohorts, following each cohort over at least three years (in other words, a 3 x 3 matrix), will often yield convincing results. However, because of "data messiness" issues, you will not always be able to replicate these analyses over the full 3 x 3 matrix, but you should certainly try for as many years as is practical.
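The 3 x 3 replication idea can be sketched as a simple check over a grid of cohort-by-year results; the cohort labels and the pass criterion here are invented for illustration:

```python
# Invented per-cohort, per-year outcomes of "the same analysis":
# True = the finding replicated, None = data too messy to run that cell.
matrix = {
    ("cohort_2000", 1): True, ("cohort_2000", 2): True, ("cohort_2000", 3): True,
    ("cohort_2001", 1): True, ("cohort_2001", 2): None, ("cohort_2001", 3): True,
    ("cohort_2002", 1): True, ("cohort_2002", 2): True, ("cohort_2002", 3): True,
}

def convincing(results):
    runnable = [v for v in results.values() if v is not None]
    # Call the trend convincing only if every cell that could actually be
    # analyzed agrees; one messy (unrunnable) cell does not sink it.
    return len(runnable) > 0 and all(runnable)
```

The point of the loop is discipline: the same analysis, repeated cell by cell, rather than a single cohort's result taken on faith.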
What seem like simple questions are often not so simple. Take, for example, question #4 from the list above: "Is there a correlation between achievement levels on the eighth-grade reading mastery test and high school drop-out rates?" The complication here is created by how a dropout rate is computed. Do you include students who reenter school after a year or two? What about those who go on to finish with a GED? After all, they may have spent a good deal of time in your school. What about those students who move out of town? How does your guidance office or attendance office code these students into your student information system in order to differentiate them from dropouts and students who left the school for other reasons? Do you have the date/time fields coded correctly so that you can disaggregate the different groups by date/time periods?
Thus, to perform this analysis you will need to place each student into one of many groups; one group is composed of those students who graduated from your school, and the other students need to be placed into as many groups as needed for the categories discussed above. Once you have all of this information in one database, it needs to be properly ordered into a spreadsheet program by group in order to run the correlation among the groups. Then you need to interpret the correlation - not an easy task given the inherent weakness of correlation as a statistic. When you are done with all of this work, you will need to repeat this entire exercise once or twice more to look for trends across a 3 x 3 matrix!
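Once every student has been coded into the right group, the correlation itself is mechanical - the hard part is the coding. Below is a minimal sketch with invented scores, pairing each reading score with a dropout indicator (graduate = 0, dropout = 1); in a real analysis the GED completers and move-outs would sit in separate groups, excluded from this particular pairing:

```python
from math import sqrt

# Invented data: (eighth-grade reading mastery score, status code), where the
# status code comes from the carefully cleaned registration data.
students = [
    (210, 1), (225, 1), (240, 0), (255, 0), (260, 0), (235, 1), (270, 0),
]

def pearson(pairs):
    # Pearson correlation coefficient, written out from its definition.
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = sqrt(sum((y - my) ** 2 for _, y in pairs))
    return cov / (sx * sy)

r = pearson(students)
# A negative r says higher reading scores go with fewer dropouts - but
# correlation alone says nothing about cause, and the exercise must be
# repeated on further cohorts before anyone acts on it.
```

The five lines of arithmetic are the easy 5 percent of the work; the grouping decisions above them are the other 95.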
If that sounds like a huge task, it is, and it is why this is such hard, frustrating work for many busy and time-starved educators. A university scholar asked me recently, after hearing one of my presentations, "Isn't this just good, basic research?" The answer is - yes. And doing good, basic research is hard work.
Beyond the time and hard work needed to do this work, there is a set of discernible skills required for success. These are:
SUMMARY
You must be wondering at this point whether this is all worth the effort. The answer is that we simply can no longer afford not to do this work - the pressures of accountability and the need to find practical teaching strategies and interventions are driving the field. Although this is hard work, the advances that school districts are making through data-driven decision-making techniques indicate that we are finally on the verge of identifying those interventions and strategies that truly impact achievement. This accomplishment alone makes the journey worth your time.

It is also clear that this will remain hard work for some time - the promise of a super technology is still a long way off. Our current
NOTES
Chapter Three

A good place to begin this discussion is with your questions about student achievement. Thus, I would like you to pause for a moment and answer the following question. It would be helpful if you could write down your response so that you can refer to it throughout this discussion.

Question: If you could get an answer to any question(s) you have about student achievement in your school or school district today, what would you like to know?
OK - you have a question. Now, the first serious issue to consider is whether your question(s) is answerable given the condition of your data and the resources needed to get an adequate answer. Many outstanding questions are just not answerable. Others can be, given the proper approach and adequate resources. The purpose of this chapter is to help you make that determination.
A few years ago , we used to say that school administrators did not
know how to ask the right questions . Frankly, we were dead wrong ! My
l atest work indicates that great questions are typically asked , including
tough questions about school improvement. The real problem is that most
school leaders do not know how to deconstruct those questions into
36 Chapter Three
doable analyses , nor do they know how to run the analyses and interpret
the findings . And even if they did know how to deconstruct the questions ,
run , and interpret the analyses , they often do not have the time since this
is complex , time-consuming work. As a result, a good place to start is to
determine how hard your question(s) is to answer and what it would take
to get that answer.
To guide you I will present a framework for (1) thinking about data-driven decision making, and (2) categorizing your question(s) in terms of its answerability and the level of effort it will take to get that answer. As mentioned earlier, I refer to this as a three-stage model of data-driven decision making. Briefly, Stage I uses aggregate data to compare institutions (such as how your school performed compared to others). Stage II uses detailed data (the individual students' responses) to determine what specific (content) areas and students contributed to your school's better (or worse) performance. Stage III gets at prediction and, to the extent possible, root cause, answering the question "Why did you do better (or worse)?" As we go through this discussion, please think about the question(s) you wrote down and consider what it would take to answer it.
Connecticut Mastery Test, Third Generation, % Meeting State Goal: The state goal was established with the advice and assistance of a cross section of Connecticut educators. Students scoring at or above the goal are capable of successfully performing appropriate tasks with minimal teacher assistance.

Connecticut Mastery Test, 3rd Gen.   District   District   ERG       State
% Meeting State Goal                 2000-01    2001-02    2001-02   2001-02
Grade 4  Reading                     67         69         62        57.9
         Writing                     72         70         64        61.2
         Mathematics                 74         77         65        61.0
         All Three Tests             50.4       53.4       44.9      42.8
Grade 6  Reading                     68         72         65        63.6
         Writing                     64         70         62        60.0
         Mathematics                 66         76         63        61.0
         All Three Tests             46.7       54.0       45.8      45.4
Grade 8  Reading                     70         72         66        66.3
         Writing                     63         60         58        58.8
         Mathematics                 64         63         57        55.4
         All Three Tests             47.1       46.1       42.7      44.0
Participation Rate                   97.3       96.9       97.2      95.0
*Includes results based on an alternate form of the CAPT due to an administrative irregularity

Figure 3.2. Connecticut Academic Performance Test, Milford, CT, 2001-2002
data: Connecticut Mastery Test results in fourth, sixth, and eighth grades; the Connecticut Academic Performance Test (tenth-grade mastery); and the SAT results. Figure 3.1 shows a brief section of that strategic school profile.
We can see from the data presented that Milford, a district of about 8,000 students in Connecticut's "Educational Reference Group F" (primarily a wealth-based indicator used to group Connecticut districts from most wealthy "A" to least wealthy "I"), did well compared to its peer ERG districts in the state. They also made improvements in almost all achievement areas from 2000-2001 to 2001-2002.
The district was equally successful in grade ten, as shown in figure 3.2. This shows performance on the tenth grade Connecticut Academic Performance Test - an even more rigorous test than those used in fourth, sixth, and eighth grades.
Exploring the strategic school profiles further, we can see in figure 3.3 that Milford also performed better than its Educational Reference Group (similar districts) on the verbal SAT score, dropout rates, and percent of students pursuing higher education.
The overall picture for Milford looks good; they have made improvements over the past several years that have apparently resulted in improved achievement across the board, and whatever they are doing appears to be working better, overall, than their peer districts. However, we have no idea what specific interventions and groups of students may be contributing to their success from this Stage I review. Thus, if we do,
A Framework for Analyzing Your Questions
and math component provided as a scale score between 200 and 800, while the tenth grade CAPT has several layers (various subject areas broken down by objective), as do the Connecticut Mastery Tests in grades four, six, and eight. And several score types are provided: raw, scale, and standard. For the CAPT alone, there are separate literature, math, science, writing, and reading scores, each with some level of subcomponents. For the fourth, sixth, and eighth grades, each learning area (reading, writing, and math) has several subscores, with math having as many as forty separate objectives measured (reported only as standard or ordinal scores). Each state's mastery test has similar levels of complexity. Figure 3.4 only displays the top layer, but this can become very complex. The real picture can look more like what is shown in figure 3.5.
This is where decision-support and data-warehousing systems come in handy, as the system allows you to easily retrieve and analyze data from within this vast data set.
Let us now look more deeply at the second strategy or question I raised earlier: What groups of students may be contributing to Milford's success? The real issue here is how we can deconstruct this question into doable analyses; that is, what analyses should we run?
There are several options - some will be interesting but not particularly revealing, and others are truly interesting and revealing. Many educators might first ask what the relationship is between and among these variables - which is a question requiring correlation. We can also display the results of these analyses on a scattergram. Let's see what we can learn by running some of these analyses.
First, we need to get the required data that, fortunately in this case, resides in a database. Using Milford's data-warehousing system, I am going to select all of the relevant individual students' scores for the following longitudinal cohort:
Once the data are selected in the data warehouse, I can export them to a spreadsheet program for further analysis. I use Microsoft Excel and import the Excel results to the Statistical Package for the Social Sciences (SPSS) - although some of this could be done directly in Excel.
Figure 3.6. Correlation of Milford SAT 2002, CAPT 2001, CMT 8 1998, and CMT 1995 major learning areas
To begin, we can see that the correlations between the SAT and the other scores are:
Each correlation denoted with an asterisk (*) indicates that the relationship between these two variables is statistically significant (shown also by the significance level of the "2-tailed t-test"). We are also presented with the number of students in each cell. While this is all very interesting, what is it actually telling us? What can be inferred? What decisions can be made?
Correlation is a statistical technique that tells us whether two scores move in the same direction. In other words, the fact that the correlational value between the SAT-Math and MATHSCALE (tenth) is .829* tells us that there is a pretty good chance that if a student does well on the tenth grade test, s/he also did well on the SAT. The opposite could just as easily be true: if a student does poorly on one, they likely did poorly on the other. How strong is this relationship at a correlational value of .829? To answer this question, we would square the correlation to find the proportion of variance the two scores share. In this case, the correlation is .829, and .829 multiplied by .829 is .687, so the two scores share about 69 percent of their variance (rounded off).
The issue now is this: Has any of this analysis so far answered my basic question of "what groups of students may be contributing to this success?" No! All we know at this point is that the scores share roughly one-half to two-thirds of their variance (considering all correlations performed); that is, students' scores tend to move in the same direction as they progress from sixth to eighth to tenth to eleventh grade. So, while this might be interesting information, it reveals nothing to us about the key question we asked earlier.
Would displaying these results on a scattergram help? Let's see: figure 3.7 is a scattergram showing the SAT math score plotted against the tenth grade mastery math test score (where the correlation above was the strongest).
Again, this is interesting, but it does not reveal very much in answer to my question. Here we can see how the scores "travel" together in a certain directionality, but we do not yet have the specific information we are looking for.
Figure 3.7. Scattergram of SAT math scores plotted against tenth grade mastery math scores
"these same students" improve year to year (in other words , factor out
mobility) .
The first two definitions of success are easier to measure than the third be
cause we do not have to follow a specific cohort of students over time , we
simply need to know how each grade level does from year to year (fourth
in 2002 to fourth in 2003 , etc .) . To address the first two definitions , we
could simply use state reports , available often over the Internet, as dis
cussed earlier in this chapter under Stage 1. But the third definition is a lot
harder to attain and requires special tool s . It is also the answer school dis
tricts most want to know; that is , how students who have been in the dis
trict for several years perform over time . Let's focus on this problem be
cause it will help answer my key question .
To know which groups of students are contributing to success in our Milford, Connecticut, example, we would need to have data covering cohorts for several years to select matching students. Figure 3.8 shows how this would work.
Figure 3.8 displays the necessary data sets to explore the longitudinal growth of each cohort of students by Year of Graduation (YOG). (Note that there is a three-year difference between the tenth and eighth mastery tests even though they are given in eighth and tenth grades. This is because of the naming nomenclature used in Connecticut and because the eighth grade test is given in the fall, while the tenth is given in the spring - thus, a three-calendar-year difference in the naming conventions.) Earlier I spoke about the three skill levels needed for this work: computer savvy, knowing one's data, and statistical knowledge. This three-year difference between the eighth and tenth grade tests illustrates the need to "know" one's data.
To "know" which students contributed to Milford's success , we could
use the model displayed in figure 3 .8 and select specific students who were
at a lower level of proficiency in sixth grade to distinguish which ones im
proved to a higher level of proficiency in eighth grade , and so on . Since No
Child Left Behind will require that states set three proficiency levels , we
could easily explore which students were at the mid level in sixth grade to
see who improved to the higher level in eighth grade, and so on.
NOTE: I am unable to demonstrate the specific student results in Milford, as doing so might unintentionally disclose the actual students. The same rationale applies to the establishment of minimum group size disclosure under No Child Left Behind: to maintain student anonymity. But to demonstrate how a decision-support system could work for the Milford example, the following data (figure 3.9) are contrived.
Decision-support systems and data warehouses with ad hoc query capability can filter specific data elements in and out, which makes this query rather easy to run. In this case, we would set the filter to determine which students were at proficiency (level "2") in sixth grade (the mid level) and level "3" or higher in eighth grade (goal). This query result would display the names of students who met both of these conditions.
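Outside a warehouse tool, the same filter logic can be sketched in plain Python. The records and field names below are my own contrived illustrations, in the spirit of figure 3.9, not an actual query result.

```python
# Sketch of the ad hoc filter described above: students at proficiency
# level 2 in sixth-grade reading who reached level 3 or higher by eighth
# grade. Records and field names are contrived, as in figure 3.9.
students = [
    {"name": "Student A", "read6": 2, "read8": 4},
    {"name": "Student B", "read6": 2, "read8": 2},
    {"name": "Student C", "read6": 3, "read8": 5},
    {"name": "Student D", "read6": 2, "read8": 3},
]

# Keep only students meeting both filter conditions.
improved = [s["name"] for s in students
            if s["read6"] == 2 and s["read8"] >= 3]
print(improved)  # -> ['Student A', 'Student D']
```

In a real decision-support system this filter would be expressed through the ad hoc query interface (or as a SQL WHERE clause) rather than in code, but the logic is identical.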
Using this technique, several queries could be run looking at the range of data contained in an example such as those in figure 3.8. Figure 3.9 shows an example of what might be retrieved; again, this particular query result is fake data (see note above). Connecticut currently uses five levels of proficiency, with "4" being goal and "5" signifying advanced. The number 9 is used to denote a student who was absent for this particular subtest.
Figure 3.9. Contrived Data: Specific Students Called Up Using the Data Warehouse (6th grade reading standard score to 8th grade reading standard score; 5 = Advanced, 4 = Goal, 3 and 2 = lower levels of proficiency, 1 = Intervention)
We can see that of the eight hypothetical students who scored low in sixth grade (Reading Standard = "2"), all improved (and one was absent
from the eighth-grade test). The students who scored a "4" and "5" in eighth grade improved dramatically. Since all of these data would already be loaded into the data warehouse, we could continue to explore innumerable relationships such as this one. Using the ad hoc query capability, we could also find out a lot more information about these individual students - what classes they took, who the teachers were, whether they received support, and so on.
For this example, I established a fake data set for one reading achievement cohort, but there are literally hundreds of data elements I could have used, since reading, writing, and mathematics are broken down by objective and subobjective on these tests (which is true for most mastery tests used by states). Furthermore, I demonstrated this for only one cohort analysis, sixth to eighth grade. As figure 3.8 shows, there are many longitudinal relationships that could (and should) be run. Thus, if my data were real, I would have answered, in part, the first question (at least for reading): "Which students are contributing to success?"
My second question was "What specific interventions may be contributing to this success?" Given the results shown in figure 3.9, we could query the database further to see what programs these students participated in during the two-year period. I could pull up their classes, special programs, special characteristics, etc., which is all very easy to do once
you learn to use the ad hoc query tool. The assistant superintendent in Milford regularly runs queries such as this one and discusses the results with his administrative and teaching staff to determine what is working and what else they can do to improve achievement.
Due to the level of specificity of this analysis and discussion, Milford is making important progress; this is one reason, albeit not the only reason, they are doing well compared to similar districts. Thus, through Stage II analysis, we can learn which specific students are contributing to success and begin to understand why they are doing well.
But this analysis cannot tell us which interventions actually cause that improvement; it only serves to identify possible relationships. While this might not seem like a huge difference, it is, and this distinction is the subject of the next section.
made note of a "missing link" that he felt would lead to a better understanding of the "root cause" of the accident. During my revision of this chapter later that summer, the NASA focus shifted to the weakness of the leading edge tiles as the "root cause."
The best definition of root cause I have found is "a systematic process whereby the factors which contributed to an incident are identified."4 There are many definitions, most focused on the engineering fields, but I prefer this one as it comes from the health-care field, which has many similarities to education in terms of the nature of our data and administrative systems. Root cause analysis is, therefore, the search for those factors that contributed to - or caused - an incident or, in our case, the improvement of achievement.
The key here is the "systematic process" that helps to "identify" the cause. As we are about to see (in both this section and the next chapter, devoted solely to this issue), root-cause analysis is not easy work and the tools are still hard to use and expensive, but my early work in this area is starting to illustrate that when we take the time to run a "systematic process," we can begin to isolate those factors that contribute to achievement improvements (or problems).
At this point, please pause a moment and take another look at the question(s) you posed at the beginning of this chapter to determine what stage of exploration and analysis is needed to answer it.
While these questions do not directly ask for root cause, they are questions of causality because they seek to identify how one or more variables affect another. The directionality of that question is what distinguishes it from Stage II. We learned in the Stage II discussion that correlation only tells us whether two variables are moving in the same direction, not whether one predicts another. For many of our questions, just "knowing" that students who scored well on the eighth-grade mastery test also scored well on the tenth-grade mastery test may be enough. But what if you want to know which earlier test scores, from the myriad available (see figure 3.8 and its related discussion), can be used to identify students at risk of performing poorly in the future in order to design and implement interventions? In this case, prediction analysis is required. The following example will demonstrate the cost-benefit question of whether moving to the expense and complexity of
Figure 3.10. Contrived Data: Students Who Participated in Reading Recovery During the 1992-1993 School Year
not worth the added effort given all the inherent limitations of performing root-cause analysis. But, as we will see in the next chapter, when root-cause proof is needed, special tools and techniques can sometimes provide it.
Purists would argue this is not root-cause analysis - and they would be correct. Consequently, is drawing such a conclusion blasphemy? You will have to decide that for yourself but, given the level of work needed to obtain a more perfect analysis, I would be satisfied with the decision.
This Stage II analysis does not "prove" that Reading Recovery works or that participating in Reading Recovery predicts future success. What we can conclude from this analysis, performed over three independent cohorts, is that there appears to be a relationship between participating in Reading Recovery and later success in school. True, other factors can be at work here, and probably are. But exploration of the question at Stage II seems to indicate that the program should be retained. And, given this evidence, there is no cost benefit to going further.
This example used a 3 × 9 matrix to explore the problem. We used three cohorts of tenth graders (2000, 2001, and 2002) and looked back at their experience and performance when they were first graders - hence the 3 × 9 (three cohorts, each going back nine years). My experience indicates that when we explore trends over this length of time and observe commonalities, the evidence so obtained can be both plausible and persuasive even though we have not "proven" causality.
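That cross-cohort check can be sketched as a simple tabulation. All counts below are contrived for illustration, like the data in figure 3.10, not the districts' actual numbers.

```python
# Sketch: checking whether a pattern repeats across three independent
# cohorts, as in the 3 x 9 Reading Recovery example. All counts are
# contrived for illustration.
cohorts = {
    2000: {"at_goal_later": 14, "rr_participants": 18},
    2001: {"at_goal_later": 15, "rr_participants": 19},
    2002: {"at_goal_later": 13, "rr_participants": 17},
}

shares = {}
for year, c in sorted(cohorts.items()):
    shares[year] = c["at_goal_later"] / c["rr_participants"]
    print(f"{year}: {shares[year]:.0%} of Reading Recovery students later met goal")

# A similar share in every cohort is the kind of plausible, persuasive
# trend described above - suggestive, but not proof of causality.
```
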
A graphical representation of the three-stage model I have presented in this chapter is displayed in figure 3.11.
This chart shows the relationship between the complex tools and analysis techniques needed for tougher questions leading to root-cause analysis. It also shows that the more useful results are often obtained at the outer limits of this schema.
SUMMARY
We were wrong a few years ago when we said the problem with educators is that they do not know how to ask the right questions. In fact, they ask great questions. We now understand, however, that they have trouble deconstructing their questions into doable analyses, collecting and analyzing the needed data, and properly interpreting the results.
At the heart of the matter is the time required to do this work and the skills needed. This is complex work, demanding time to explore the relationships between and among data elements. The more important your key question is, the more variables you will want to consider.
Through this process, you are informing your intuition about what is going on, what is working, and what is not. Thus, you are guiding your decision making through the analysis of data. In this way, most of your important work can be accomplished using Stage II methods, assuming that the data are all collected and accessible in a decision-support system and data warehouse. Then, with enough time and skill, these questions can be explored.
Occasionally, though, you need more proof, or the variables are not as easy to "eyeball" as in the Reading Recovery case. For example, many districts would like to know which of the myriad of scores they have on a child going back to first grade are the best predictors of tenth grade performance. Dropout? SAT performance? College acceptance? And so on.
One superintendent asked his assistant to collaborate with me to identify what he termed the "key leading indicators" of fourth-grade mastery results, since his district had loaded a wide range of early elementary data into their data warehouse. Questions such as these cannot be addressed through either Stage I or Stage II techniques. In this case, we need to address the superintendent's question through root-cause analysis, and the key challenge then becomes: "What is really knowable when we do this work?" Chapter 4 explores this question in depth.
Please go back and take a look at the question(s) you asked at the beginning of this chapter and ask yourself what stage that question falls into. What data are needed (aggregate or detailed)? Are the data accessible in electronic format? What steps are needed to answer your question(s)? What analyses will be needed? Who has the skills and time to do this work? What is the cost benefit of doing the analysis; that is, how important is the question? And what do you expect to learn as a result? If you answer these questions before you start your achievement-related question, you will be more likely to succeed in the end.
NOTES
Imagine being able to answer your Stage III questions regarding program effectiveness and causality accurately and precisely. Consider the value of determining the causes of poor achievement identified from one cohort and then scientifically applying that knowledge to predict students in subsequent cohorts who are at risk, well before they reach that significant level of schooling.
To ground this discussion in your experience, please take a moment and revisit the question(s) you wrote down at the beginning of chapter 3 - was it a Stage III question? If you have not had a chance to do that exercise, take a moment now and answer the following question. (It would be helpful if you could write down your response so that you can refer to it throughout this discussion.)
Question: If you could answer any question you have about student achievement in your school or school district today, what would you like to know?
If your question fits into Stage III, this chapter will demonstrate new technologies that will allow you to more effectively answer these questions. While these technologies are not ready for everyday use by practitioners - that is, they are not user friendly enough for your everyday use - this chapter will demonstrate the power of these emerging technologies. To reveal this software's capabilities, the last case study of this book (discussed in chapter 5) uses this technology to address one of my student's questions about finding the best predictors of high school grade point averages (GPAs) from a host of earlier achievement measures, with the purpose of being able to intervene much earlier to better a student's performance in high school.
More broadly speaking, I am often asked by school administrators if I can help them identify the important data from the bulk. I always ask, "What do you mean by 'really important'?" To which most respond, "The data that tell me if our programs work!" I have been working on that problem for years because I, too, wonder whether we could filter out all the extraneous noise from what really matters. In one case, as I mentioned earlier, a colleague asked a related question (his district was the recipient of several grants); he wanted to know where to focus these resources so that they made the greatest impact. The common thread is that all of these questions demand root-cause analysis. That is, they seek answers to "Which of the data can predict outcomes?" And an even bigger question is, "What analysis is appropriate to attain this answer, one that would achieve findings that are reliable and valid?"
ROOT-CAUSE ANALYSIS
Background
My interest in writing this book came about when I heard some educators speak nonchalantly about conducting root-cause analysis, as if it's mainstream. I expect that, over the next several years, researchers and vendors will find ways to commercialize these procedures, bringing them to the field at large. But for now, my purpose in this chapter is to demonstrate (1) that this work is possible, (2) what it takes to do the work, and (3) that root-cause analysis as currently practiced by the field today is more fad than science.
presuming that what mattered for the cohort of 2000 also applies to the cohorts from 2001 and 2002.
This chapter is about going the next step: knowing with statistical precision that the set of variables found for the first cohort also applies to subsequent cohorts. This is what is referred to as rule-based decision making, where a rule that is determined as applicable to one cohort can then be applied to another. The analog here is medicine. If I present certain symptoms to my physician, he will be able to recognize my illness and prescribe a regimen because the symptoms meet the same criteria as ones that have been identified previously through specialized trial studies.
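To make rule-based decision making concrete, here is a minimal sketch under contrived data: a cut score is learned on one cohort, then the identical rule is applied to the next cohort to see whether it still holds. All scores and outcomes are invented for illustration.

```python
# Sketch: rule-based decision making across cohorts (contrived data).
# Learn a cut score on the 2000 cohort that separates students who later
# met goal from those who did not, then apply the same rule to 2001.
def best_cut(scores, met_goal):
    """Pick the cut score that maximizes accuracy on this cohort."""
    def accuracy(cut):
        return sum((s >= cut) == g for s, g in zip(scores, met_goal))
    return max(sorted(set(scores)), key=accuracy)

# 2000 cohort: an earlier test score and whether the student later met goal.
scores_2000 = [210, 225, 238, 244, 251, 263, 270, 284]
goal_2000   = [False, False, False, True, True, True, True, True]

rule = best_cut(scores_2000, goal_2000)  # the learned rule: score >= cut

# Apply the learned rule, unchanged, to the 2001 cohort.
scores_2001 = [215, 232, 246, 259, 275]
goal_2001   = [False, False, True, True, True]
hits = sum((s >= rule) == g for s, g in zip(scores_2001, goal_2001))
print(f"cut score {rule}; {hits}/{len(scores_2001)} correct on the next cohort")
# -> cut score 244; 5/5 correct on the next cohort
```

The AI tools discussed later in the chapter induce far richer rule-sets over many variables, but the principle is the same: a rule derived from one cohort is validated against the next before anyone acts on it.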
A key assumption in the federal No Child Left Behind law is that educators possess the tools and abilities to identify the leading "key indicators" that will make a difference in student achievement. Current data analysis techniques used in data-driven decision making, however, limit us to looking back in time: "How did this cohort of students perform, and what were their predictors of performance?" Educators are then left to make potentially erroneous assumptions that the predictors of a particular cohort will hold true for subsequent cohorts. While this may turn out to be the case, there has been no way to systematically prove it; in other words, to prove that a "finding of fact" or "root cause" identified for one cohort can be applied with confidence to subsequent cohorts. If such a system could be devised, the potential success of school interventions could be greatly enhanced and the nation would have a better chance of meeting NCLB's lofty goals.
New techniques in data mining and Artificial Intelligence (AI) now provide the tools to build a predictive model from a current or past cohort that can be applied to subsequent cohorts to predict their performance more accurately. These predictive models will allow for the identification of leading indicators to develop effective interventions for improving student achievement. To the extent that prediction analyses can be considered findings of fact (or the identification of root cause from the variables studied), these new techniques can provide crucial answers to the question of "what works in schools?"
Finding Root Causes of Achievement - Can We Know?
Over the past several years, I have been experimenting with tools and techniques that might prove successful in this work. While I know of a few tools that can get the job done, they are extremely expensive and out of reach for schools. Thus, I was left to work with the ones that are affordable - a key principle that I hold to in my work. (It does not make sense to work with tools that are unattainable for the field at large.) However, to my chagrin, none of those efforts proved very successful. What appeared to work on one cohort could not be reliably duplicated on another. And while one technology would work on certain data types, it would not work with others (because of a mix of variable types, such as combining scale scores with ordinal scores).
These early attempts did not work principally because I was using tools that limited me to looking back in time - i.e., how did this cohort perform, and what were the leading indicators of their achievement? But when I tried to apply that knowledge to a second cohort and a third, I could not find any persuasive (valid and reliable) trends across cohorts. Thus, I was unable to develop a systematic way to apply the knowledge learned from one cohort to another in the same school or school district. This is because traditional regression methods enable one to find which variables best predict an outcome for a particular cohort, but offer no way to systematically apply that knowledge to a subsequent cohort. Having reported on these attempts over the past several years at various conferences and in papers, I ended my work about a year ago in a fit of frustration. The problem is further compounded by a lack of educational literature on this topic.
Then I recently learned of new advancements in Artificial Intelligence and data mining that might lead to success. Most of these new tools are very expensive, well beyond the financial resources of educators. As noted above, one of the concepts I have held onto through all my work is that whatever we do, it must be affordable. While the new toolset I report on here is not a shrink-wrap (a few hundred dollar) application, it is a lot less expensive than anything else I have seen on the market. Of course, it requires that you already have a relational database or data warehouse and the skills necessary to run the software, which are necessities beyond most of us, but this will improve in time. The price for the data mining software alone (at the time I am writing this - late spring 2003) is around $22,000. That does not include the development of the data warehouse or the human resource time necessary to learn the toolset. This might sound
64 Chapter Four
expensive; however, this application is a lot less costly than the others I reviewed, literally a fraction of the cost of another well-known toolset in this space, and it works with educational data on the questions school leaders most want to know. The toolset is modular; that is, you need purchase only the algorithms you desire. And my early work with it indicates that educators might simply need the base system plus a couple of the twelve algorithms that comprise the total toolset. If this turns out to be true, then that $22,000 figure can be reduced substantially.
We are still a long way from taking this toolset, or ones like it, and weaving it into the everyday fabric of school decision-making applications, but we are now on the way. The purpose of the remainder of the chapter is to demonstrate what can be done with these technologies. Over the coming months and years, I expect to find ways to make this technology more accessible and useful. But, for now, this is what it takes to do root-cause analysis.
To explain what this new technology is capable of, the following pages report on a pilot study I conducted using this software. The study was performed across four Connecticut school districts, one urban and three suburban, where a new Artificial Intelligence tool was used on each district's data to develop a rule-set from one cohort's data that was then applied to accurately predict outcomes for subsequent cohorts. The advent of these technologies raises the potential of providing educators important and useful information about where to focus their intervention resources in the pursuit of improving student achievement.
(The term data mining as used here carries its classic definition, utilizing prediction analyses, not the parochial sense often found in the writings on data-driven decision making, where it describes a basic review of performance trends.)
New advancements in Artificial Intelligence software have made these technologies accessible to educational researchers because (1) the cost is coming down and (2) these programs can now more easily "connect" to existing school data warehouses. These tools are being used in the private sector to identify root causes of specified outcomes (dependent variables) with new speed, efficiency, and accuracy.
I used Megaputer's PolyAnalyst® computer program to identify leading indicators in the pilot study reported in this chapter.2 The winner of several awards in the AI field, PolyAnalyst® develops a rule-set, or predictive model, based on a particular cohort's data; that rule-set can then be applied to a subsequent cohort's data to accurately predict the same dependent variable. Prior to the advent of this technology, we had been able to determine predictors of achievement using classic regression methods, but we had not been able to transform that knowledge into a statistically reliable rule-set for application to another cohort's data to predict their outcome variable. This is a new and exciting development in Artificial Intelligence that could greatly assist school leaders in understanding what interventions work and what interventions are needed early on in a child's educational career to make a difference in his or her learning.
Writing on the topic of rule-based systems, Dhar and Stein note that "rule-based systems are programs that use preprogrammed knowledge to solve problems. . . . You can view much of problem solving as consisting of rules, from the common sense 'If it's warm, lower the temperature' to the technical 'If the patient appears to have pallor, then he must have an excess of bilirubin in his blood or be in shock; and if there's an excess of bilirubin in the blood, then administer drugs to lower it.' . . . Of all the situations you can think of, whether they involve planning, diagnosis, data interpretation, optimization, or social behavior, many can be expressed in terms of rules. It is not surprising, then, that for several decades rules have served as a fundamental knowledge representation scheme in Artificial Intelligence."3
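The if-then structure Dhar and Stein describe can be made concrete with a toy rule engine; the two rules below simply restate their own examples and are not drawn from any real system.

```python
# A toy sketch of the if-then structure Dhar and Stein describe.
# Each rule pairs a condition with an action; the engine fires every
# rule whose condition matches the current facts. The rules are
# invented illustrations, not part of any real system.
def run_rules(facts, rules):
    """Return the actions of every rule whose condition holds."""
    return [action for condition, action in rules if condition(facts)]

rules = [
    (lambda f: f.get("temperature_f", 0) > 75, "lower the temperature"),
    (lambda f: f.get("pallor") and f.get("bilirubin") == "high",
     "administer drugs to lower bilirubin"),
]

print(run_rules({"temperature_f": 80}, rules))
print(run_rules({"pallor": True, "bilirubin": "high"}, rules))
```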
While my pilot study did not follow the exact steps and processes of classic rule-based decision making as outlined by Dhar and Stein, the same concepts apply in that I am identifying a set of predictive indices that, taken together, form a rule-set or model that can be used to accurately predict the future achievement of subsequent cohorts.
[Figure 4.1: graphical representation of the pilot study.]
presenting them explicitly in the form of rules and algorithms. The system builds empirical models of an investigated object or phenomenon based on the raw data. The user does not have to provide the system with any assumptions about the form of the dependencies; PolyAnalyst discovers the hidden laws automatically, regardless of how complex they are.4
This pilot study was conducted on four Connecticut school districts' achievement data that were already loaded into a robust data warehouse containing a great deal of historical data on each student. A typical data warehouse, which forms the basis for extracting the data needed for these analyses, would house several hundred data variables on each student for each year that they are in school. Some of these data warehouses contain historical data dating back ten, even twenty, years. Mining these data warehouses becomes a virtually impossible task without the proper technologies. For this pilot study, tightly aligned state mastery test scores were used to determine the usefulness of the Find Laws data-mining algorithm in PolyAnalyst® to predict a selected dependent variable. A graphical representation of the study is shown in figure 4.1.
For each district in the pilot study, the baseline cohort from 2000 was selected as either tenth or eleventh graders (for selection of the dependent variable) and several earlier test scores were then used as potential predictors (independent variables). A rule-set was developed using the PolyAnalyst® Find Laws algorithm and then applied to one or more subsequent cohorts (2001 and 2002). Since I already knew the actual dependent variable from 2001 and 2002, I could test the accuracy of the Find Laws rule-set in predicting the dependent variable.
What does a rule-set look like? Here is a very simple rule-set developed
by PolyAnalyst:
The computer program has determined that the eighth-grade DRP (degrees of reading power) score and the sixth-grade objective-level score of the math subtest are the best predictors of the tenth-grade science mastery score. The computer identified these predictors from the many that I used as independent variables. The rule-set is in the form of an algebraic equation, although not fully intended to be one. It is represented as a symbolic language that is interpretable by the user so as to identify the predictor variables. You cannot take this formula and directly apply it to the subsequent cohort data set in a spreadsheet program; this must be done by the Megaputer PolyAnalyst® software itself. But this symbolic language is very useful in allowing us to see "inside the black box" that is normally hidden, to identify which variables matter most.
The rule-set above is a straightforward one. Here is a better example of what most of them look like, at least the ones I encountered during my pilot study:
Best significant rule found:
A word or two on the statistics reported here. Regression models provide many detailed and complicated statistics to help interpret the value of the outcome in the form of output/information. Since this is not a research paper, I have decided to provide only the critically important information to help interpret the results of the pilot study. Among the output provided, regression models are typically expressed in terms of the percent of variance explained by the prediction model, which in turn is expressed by the value R2. An R2 over .4 is generally acceptable as a worthy research outcome, with an R2 greater than .6 being very strong. Many of the R2 values (outcomes) of this pilot study were in the range of .5 to .6.
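For readers who want to see the arithmetic behind R2, here is a small sketch with invented scores; R2 is simply the share of the variance in the actual outcomes that the predictions account for.

```python
# A small sketch of what R-squared measures: the share of variance
# in the actual outcomes that the model's predictions explain.
# All scores are invented for illustration.
def r_squared(actual, predicted):
    mean = sum(actual) / len(actual)
    ss_total = sum((a - mean) ** 2 for a in actual)   # total variance
    ss_resid = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_resid / ss_total

actual = [50.0, 60.0, 70.0, 80.0]
perfect = [50.0, 60.0, 70.0, 80.0]
decent = [54.0, 58.0, 74.0, 76.0]

print(r_squared(actual, perfect))   # 1.0: all variance explained
print(r_squared(actual, decent))
```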
I also used two other statistical procedures to test the accuracy of the predicted variable (generated by the PolyAnalyst® rule-set) against the actual variable from the subsequent cohorts: correlation and the paired-samples t-test. Here I wanted to know how well the "predicted" outcome (or dependent variable) mirrored the actual outcome from the subsequent cohorts.
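Both checks can be computed by hand. The sketch below uses invented scores and, for brevity, stops at the t statistic rather than looking up its p-value.

```python
# A sketch of the two checks described above: the Pearson correlation
# between predicted and actual scores, and a paired-samples t
# statistic on their differences. All scores are invented.
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs)
                      * sum((y - my) ** 2 for y in ys))

def paired_t(actual, predicted):
    diffs = [a - p for a, p in zip(actual, predicted)]
    n = len(diffs)
    md = sum(diffs) / n
    var = sum((d - md) ** 2 for d in diffs) / (n - 1)
    return md / sqrt(var / n)

actual =    [52.0, 61.0, 70.0, 58.0, 66.0, 74.0]
predicted = [50.0, 63.0, 69.0, 60.0, 64.0, 75.0]

# Do predictions rise and fall with the actual scores, and is the
# mean difference between the two distinguishable from zero?
print(pearson_r(actual, predicted))
print(paired_t(actual, predicted))  # 0 here: the differences cancel
```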
The first way I did this was to run a correlation between the predicted and actual scores, expressed as a Pearson R correlation value. A Pearson R of .4 or greater is generally viewed as an acceptable positive correlation, but we really want to see these values higher than .4. Most of the
The results of the pilot study were encouraging, with the R2 of the prediction models ranging from .260 to .625, with the majority greater than .5.
[Tables 4.1 and 4.2 appear here, reporting results for Suburban Districts 1, 2, and 3 and the Urban District.]
For each district in the pilot study, several pieces of information are provided in tables 4.1 and 4.2. First, the R2 of the data-mining model developed by PolyAnalyst's Find Laws algorithm is shown. The R2 for all of the rule-sets is very respectable, except for the urban district "Lang/Read" resultant model.
One might conclude that the language arts area does not lend itself to this analysis format, that the content is not tightly coupled enough, but that is proven untrue given the very high R2 for suburban district 3, where the SAT verbal score was predicted from several of these same language arts independent variables.
The Pearson R correlation value is provided for each actual outcome vs. predicted outcome for the cohorts of 2001 and 2002. These values are also very respectable.
Finally, to determine how tightly the prediction rule-set was working, a paired-samples t-test was run on each matching set of data (actual vs. predicted dependent variable for 2001 and 2002, since these actual values were known). It was expected that the paired-samples t-test would yield no significant difference between the actual and predicted outcome scores. This was not always the case, as noted in tables 4.1 and 4.2. Upon review of the actual detailed data (student by student), I determined that the prediction model resulted in several outliers that, no doubt, are the reason for the significance in the paired-samples t-tests. The issue of why there are outliers and what other variables might be included in the prediction model rule-set are important follow-up questions.
The pilot study districts' data warehouses contain literally hundreds of variables that could have been included in the prediction rule-set. Which
Summary
1. For each year and content area of testing, as required by NCLB, what are the key leading indicators of success from year to year for proficiency in urban districts?
2. For each year and content area of testing, as required by NCLB, what are the key leading indicators of poor student achievement, and what interventions are most successful at helping to improve achievement in urban districts?
3. When early childhood data are loaded into a data warehouse, what key leading indicators of poor performance (that represent "close-the-gap" cohorts) emerge, and what interventions are most successful at thwarting poor performance by third grade in urban districts?
4. Will the use of additional independent variables (beyond those used in this pilot study) result in even more powerful rule-sets and prediction models in urban districts? Can more powerful prediction models reduce the outliers described in the pilot study? If so, what are those variables?
5. What level of granularity in the independent variables (total score, subtest score, objective-level score, item-analysis-level score) results in the most powerful prediction model rule-sets with the best cost-benefit ratio (powerful model vs. cost of collecting and analyzing more granular data)?
Finding Root Causes of Achievement-Can We Know? 75
6. Can a general prediction model that covers the span of education impacted by NCLB be developed nationally across urban districts?
7. Regardless of school district type and size, are there common key leading indicators that are universally important for the span of education covered by NCLB?
8. What interventions can be planned that utilize the results of these analyses, and what follow-up questions do school leaders have when presented these results in urban districts? What actions do school leaders take?
9. Once program participation is systematically coded and loaded into the data warehouse, can rule-sets be developed that show the effectiveness of those interventions in urban districts?
There are at least two important implications of the work discussed in this chapter for school leaders. First, it should be apparent that performing true root-cause analysis is challenging work requiring special skills and tools. There does not appear to be a quick and easy way to complete this analysis, and the currently available data-warehousing query tools are not designed for this level of analysis. While it may be possible to eyeball statistics when using the data warehouse (and I will admit to having done that now and then myself), really knowing which variables matter most (the key leading indicators) takes hard work, special skills, and very special software.
Second, the accountability demands on schools will only become more intense, driving the need to find what works and what does not so that we can focus our limited financial and human resources. While it should be apparent that there are some answers on the horizon to identifying the root cause of student achievement within the available data collected by schools, the reality is that we are not there yet. Artificial Intelligence software is being used in the private sector for this purpose, so one can understand why many business leaders expect educators to be able to do this work. Toward this end, popular processes, such as Six Sigma,9 which are highly specialized, statistically driven processes that seek to improve
NOTES
6. S. Carroll and D. Carroll, Statistics Made Simple for School Leaders: Data-Driven Decision Making (Lanham, Md.: ScarecrowEducation, 2002).
7. T. Creighton, Schools and Data: The Educator's Guide for Using Data to Improve Decision Making (Thousand Oaks, Calif.: Corwin Press, 2000).
8. J. Utts, Seeing through Statistics, 2d ed. (Florence, Ky.: Brooks Cole, 1999).
9. P. Pande, R. Neuman, and R. Cavanaugh, The Six Sigma Way Team Fieldbook: An Implementation Guide for Process Improvement Teams, 1st ed. (Hightstown, N.J.: McGraw-Hill Trade, 2000).
Chapter Five
Case Studies/Examples
79
useful outcome, but not an overextension of what the data were telling us about the problem under study. Thus, the following examples will demonstrate "what it takes to get this work done" and what is knowable in terms of actionable results.
In my University of Connecticut Executive Leadership Program, we include a full course on data-driven decision making. This is unique among leadership programs, and my students continue to tell me that they learn a great deal about how to do the work of data-driven decision making and what its limitations and real uses are. Thus, it remains an integral part of our university preparation for the superintendency. (As noted in my previous book, Using Data for Making Better Educational Decisions,3 I was formerly superintendent of schools in Avon, Connecticut, and Barrington, Rhode Island, where I tested many of the ideas that eventually led to my work at the University of Connecticut.)
PRINCIPLES OF DDDM
Most of the examples used in this chapter are ones my students worked on during that course. They were required to select a problem that presents itself in their work setting and then work through that problem using the principles of data-driven decision making. These principles include:
7. Are the data in the form of aggregate group scores or individual student scores?
8. What are the metrics of the data? Are they scale scores, raw scores, nominal scores (i.e., yes/no), percentiles? Are they student performance portfolios? Writing samples, etc.? Are the data comparable? If not, what transformations are needed?
9. What statistical procedures are needed to address the problem?
10. What do you expect to learn from the analysis? What actions do you expect to take as a result of this work? What are your options for analysis and exploration? What information can you find to help?
11. Who else might have the same question(s)? Can you collaborate to share understandings/findings?
12. How can state standards and frameworks guide this inquiry?
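The comparability question in principle 8 often comes down to putting scores from different tests on a common metric. One standard transformation is the z-score; the scores below are invented.

```python
# A minimal sketch of one common transformation behind principle 8:
# converting raw scores from different tests to z-scores so students'
# relative standing can be compared on a common metric. All scores
# are invented.
from statistics import mean, stdev

def z_scores(raw):
    m, s = mean(raw), stdev(raw)
    return [(x - m) / s for x in raw]

math_raw = [38.0, 45.0, 52.0, 59.0, 66.0]          # e.g., a raw-point test
reading_raw = [410.0, 450.0, 490.0, 530.0, 570.0]  # e.g., a scale score

# After transformation, both sets are in standard-deviation units
# centered at 0, so a student's relative standing is comparable.
print([round(z, 2) for z in z_scores(math_raw)])
print([round(z, 2) for z in z_scores(reading_raw)])
```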
If you have access to the Internet, please go to the website for Standard & Poor's School Evaluation Services, available at www.ses.standardandpoors.com.
Once you choose a state, you will be presented with a wide range of choices regarding what district or districts you want to view data from or what comparisons you want to see; you can also view Standard & Poor's "Observations" about a district's performance across a wide range of predetermined data points. There is also a "Quick Compare" function that allows you to compare four school districts' data against one another very easily.
By far, the Standard & Poor's website is the most comprehensive presentation of Stage I data used in the country. Currently, only two states have contracted with S&P to present these data, Michigan and Pennsylvania, although the system has been recently named a promising practice by the U.S. Department of Education and is slated for a national rollout.
The value of this system is that it provides easy access to a wide range of data for making comparisons across school districts. You can quickly determine how your district is performing compared to either those that S&P has chosen for your comparison group (based on a range of comparative variables) or those that you select. And because S&P has included far more than just student achievement data, this website provides the broadest possible external window into the operation of the schools.
The problem associated with Stage I data is that you cannot know the reasons for apparently good or poor performance. To address this issue with the S&P website, I had the opportunity to work with the Pennsylvania
• What data does the district have or need to address its goals and objectives, both in terms of measuring success in reaching goals and in making better decisions about future activities? Thus, what data are collected? In what format are data collected, where are they stored, and how are they made available?
• What data are currently used and how are they used?
• What additional data need to be collected? How do the district's data collection practices need to change in light of the requirements of the No Child Left Behind statute and regulations?
• What infrastructure is in place to support the collection and use of data?
• What additional systems might be used to make data more accessible to the various audiences (school board, district administrative personnel, teachers, parents, and media) that need access to selected data from time to time?
• What processes should the district implement to develop a culture that values and understands data as a vehicle for systemic improvement?
The district had access to a wide range of Stage I data available from the New York State Department of Education's website and various paper reports. It also had a great deal of detailed data on student achievement, but those data were housed all around the district in various formats and databases, a condition found in many districts across the country. Thus the district, knowing that Stage I data were not detailed enough to drive instructional improvement and not knowing if it had the right data for its improvement initiative, conducted the data audit.
As I began my work in Corning-Painted Post, I observed a powerful example of how Stage I data can be used persuasively. At a convocation exercise of the entire district staff to kick off the initiative, their superintendent, Dr. Donald Trombley, wanted to demonstrate to the staff the impact of poor student achievement. To do this, he asked ten students to assist him that day. As he discussed various performance outcomes, such as percent attendance, graduation rate, mastery rate on state tests, and so on,
• Interviews revealed that the staff did not feel it needed any "new" data; they already had all the data points needed for making the kind of improvements they know are necessary. The key issue that emerged was how to provide access to the data in a timely way to those who need it.
• We learned that the district already collects a wide range of data, from student demographics to various forms of local, state, and national assessments, to program participation, staffing, and financial data. Thus, like so many districts with which I have worked, the problem shifts from data collection to data access. The problem facing Corning-Painted Post is that these data are in various formats and held in disparate databases. As a result, a major recommendation was to acquire a data warehousing system.
these data are protected under the Family Educational Rights and Privacy Act (FERPA), the federal law that governs confidentiality of student data records. Whether data are in a paper or electronic file, they are still student records and need to be protected. Thus, whatever policies and procedures the district already has in place to protect all of these data in their current format will not change when data are provided to staff in other ways. (Several years ago, we thought that access to student data would become a volatile issue in school districts. This has not been the case, due, in part, to the care that districts have taken with respect to the release of data.)
A safe rule to follow is to provide access only to those individuals responsible for student information and to have policies in place governing how those data should be used (e.g., not releasing specific information on one child to the parent of another child; keeping public reports at the aggregation level to protect individual student data from release). By following basic rules and policies, many districts throughout the country have successfully navigated these thorny issues while providing teachers and staff access to the data they need.
The second security issue has to do with the technology. No electronic data storage system is completely hacker-proof; even the federal government from time to time has its systems hacked (broken into). Any good data warehouse/information system will be protected by various firewalls and/or Internet security procedures/technology. Without additional cost, these systems provide the same level of security as online banking systems. While not perfect, they are very good.
• In addition to providing professional development to teachers and administration on the use and interpretation of various data, as required in all districts, written policies regarding the fair use of data should be developed at the administrative and board levels. A policy should be developed concerning teacher evaluation, specifying that student-testing data should not be used as prima facie evidence of teacher effectiveness. There are just too many threats to the validity and reliability of these data to make them suitable for use in this context. This policy should state that student-testing data should be reviewed and used by staff in their planning (once these data are more easily accessible and in the formats needed) and can be used in the supervisory process. This should be an important part of a data fair-use policy for staff supervision and evaluation. Once again, these are not unique issues for Corning-Painted Post, but ones I find in all the districts with which I work.
STAGE II EXAMPLES OF
DATA-DRIVEN DECISION MAKING
One of the most interesting issues that I have worked on recently was in Simsbury, Connecticut. The district and community were concerned over whether the high school's grading system and culture was negatively impacting students' chances of gaining acceptance to their first-choice
[Chart: a sample student's GPA plotted on the SHS 20-point GPA scale and compared with national GPA and the school's leveled course grades.]
national GPA (in this case an "A"). Also presented is the student's GPA set against the SHS 20-point GPA scale and compared with the school's leveled courses/grades. Thus, on one chart we can see how Mary Middle's GPA compares with the national self-reported student GPA as reported by the College Board (the most reliable information available). As students apply to colleges, the school sends out that student's chart along with the official transcript.
I thought this was an intriguing idea because I knew that imposing an "easier" grading standard would not work; staff needed time to explore their grading practices against curriculum standards and comparative practices in other schools. Yet I also felt that students were truly disadvantaged by Simsbury's grading standards; Bob's initial idea, if developed fully, could provide an interim solution for the school.
I asked Bob where he got the idea to do this and he replied, "It was an earlier version of this table which provided the 'aha!' moment for me in July 2000: as soon as I saw it, I realized that I could translate SHS grades and deciles to national equivalents if the College Board would give me the underlying plots. They did, and I did." Cool! The basic reference
document that Bob used was one of the College Board's Research Notes on High School Grading Practices.5
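The kind of translation Bob describes, reading a local GPA against national reference points, can be sketched as a simple table lookup. The cut points below are invented; the real mapping came from the College Board's underlying plots.

```python
# A rough sketch of translating a student's standing on a local
# 20-point GPA scale into an approximate national letter-grade
# equivalent. The cut points are invented for illustration; the
# real mapping came from the College Board's underlying plots.
from bisect import bisect_right

# Hypothetical cut points: minimum SHS GPA for each national grade.
cut_points = [(10.0, "C"), (13.0, "B-"), (15.0, "B+"), (17.0, "A-"), (19.0, "A")]

def national_equivalent(shs_gpa):
    thresholds = [c for c, _ in cut_points]
    idx = bisect_right(thresholds, shs_gpa) - 1
    return cut_points[idx][1] if idx >= 0 else "below C"

print(national_equivalent(16.2))
print(national_equivalent(19.5))
```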
I have been impressed with the professionalism with which the district staff tackled this tough data-driven issue. By working together, community members, teachers, administrators, and the board, the issue will be resolved over time. In fact, the latest results I have seen from the school indicate that they are already well on their way. This example is unsurpassed: a staff, administration, board, and community came together to successfully address one of the toughest issues in education, high school grading standards, using the principles of data-driven decision making to identify the problem and help design improvement strategies. A status report of the district's progress on this issue has been posted on the Internet at www.simsbury.k12.ct.us/shs/shswhitepaper060403r.pdf.
Colleen had just accepted the position of Simsbury High School principal but was not to begin her duties there until summer 2002. During that spring, she served as principal of another large, comprehensive high school, and she wanted to work on a project that would (1) prep her for her upcoming challenge in Simsbury and (2) address a problem at her current school. Thus, as part of her work with me, Colleen performed a mini-study to broaden her understanding of the following problem statement: "Determine if the differences in levels of achievement, as measured by GPAs (weighted and nonweighted), class rank, and SATs, is statistically significant between subgroups of students stratified by gender and ethnic/racial classifications."
This problem is an important one, common among school leaders attempting to understand the achievement gap and the necessary interventions for improvement. But the process of constructing a meaningful set of questions, and then properly conducting the necessary analyses, is hard work. The problem statement was broken down into researchable analyses, a necessary step in deconstructing the bigger question into a set of steps that can be analyzed, as follows:
• If a gap exists, does it exist in similar patterns for all stated indicators of achievement?
Then we decided to break this down even further by asking the following questions:
• How does GPA compare with SAT scores: (1) overall, (2) math course(s) GPA with math SAT, and (3) English GPA with verbal SAT?
• What effect do the (course) level designations for freshman year have upon the four-year sequence of classes and subsequent achievement of students?
As you might imagine, the first set of questions was tough to analyze, while the second set was even more difficult. The first set directly addresses whether there is a problem, and the second set attempts to uncover deeper understandings about what might be going on that contributes to poorer (or better) performance later on in high school. These are all great questions, worthy of the required time and effort to find the answers.
All the data needed for the study resided in the school's student information system except SAT scores. Thus, after the required data were exported from the student information system into a spreadsheet, the SAT scores were entered by hand from hard copies available from the guidance department. While this took a lot of secretarial time, it was critically important to get all of these data into one place for proper analysis. To ensure that she had accurate data, Colleen carefully scrutinized the records of twenty-five randomly selected students.
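Colleen's spot-check is itself a small, repeatable procedure; here is a sketch with invented records.

```python
# A small sketch of the spot-check described above: draw a random
# sample of student records from the merged spreadsheet and list
# them for hand-verification against the guidance-office copies.
# The student identifiers are invented.
import random

students = [f"student_{i:03d}" for i in range(1, 401)]  # 400 merged records

random.seed(25)                       # fixed seed so the audit is repeatable
audit_sample = random.sample(students, k=25)

print(len(audit_sample))              # 25 records to verify by hand
print(len(set(audit_sample)))         # 25: sampling is without replacement
```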
I worked with Colleen on the actual statistics to address these questions, as these became quite complex, beyond the scope of this book. But it makes the point that when a top-notch school leader wants to address an important issue, such as achievement-gap analysis, it often entails sophisticated analyses to address their questions.
I recognize that statistics is sometimes a frightful subject, but I am always amazed at how complex and important school leaders' questions are and what it really takes to answer those questions. In my classes, I ask students to pick a problem that is important to them in their current work setting. Since Colleen was working in an urban school at the time, the achievement gap was a pressing issue. And she was moving to a
while these findings were certainly interesting, they led to the asking of other questions related to student achievement and the interrelationships with other factors in school, to understand more completely the complexity of learning and the "spheres of influence" that affect student learning.
As always happens doing this work, initial questions and their analyses yield answers that only beget further questions. Broad understanding of the achievement-gap issue in this school would require, at the least, running these same analyses over additional cohorts, but Colleen was hampered in her work by not having these data in a data warehouse, because a great deal of her time was spent on preparing the data for analysis. As she reflected on all of this, she wrote the following summary:
standardized within a school? If yes, what does that tell us; if not, what
use are grades?
2. Why did the girls outperform the boys on the SAT mathematics? I
would like to see a histogram of the patterns of distribution of grades;
what might this tell me? What were the enrollments (by gender) in each
level of study for junior math classes? The CAPT (Connecticut Academic
Performance Test) results for this class had boys outperforming girls;
what changed between these two tests? It would again be worthwhile to
examine the patterns of distribution. Did the band system of calculating
CAPT goal distort the outcome? [Note by author: the bands that Colleen
refers to are the same grouping designations now mandated by No Child
Left Behind, where a student's achievement score is placed into 1, 2, or 3
performance designations: basically goal, proficient, or needs improvement.]
3. What implications for teaching and learning might result from a
discussion of the average GPA in each department with the departmental
directors? How can data prompt decision making? How do data contribute
to informed intuition? Data that are organized in meaningful ways provide
us with information with which we are more likely to make decisions that
will enhance learning opportunities for students.
4. As I contemplate my role as principal, it is critical to understand to
what extent data may be used for analysis if no data warehouse is
available. I was pleasantly surprised that even with this limited
spreadsheet I was able to extract useful information. Given that I most
likely will not have access to a data warehouse for next year, I would
like to explore how far I can go in my analysis with a reasonable amount
of human-resource energy to develop these comprehensive spreadsheets.
5. Tracking continues to persist in high schools. Students are permitted
to take more rigorous classes each year, but while policy allows this
shift, the school culture does not. I would be interested in tracing
patterns of students who move from one level of study to another over
four years, including those who move to a less rigorous level as well as
those who opt for a more rigorous one.
CMT (Connecticut Mastery Test) scores, placement in level of study for the
freshman year, and other pertinent information. With all the research in the
literature about the negative effects of tracking, it is my hypothesis that
tracking minority students into the lower levels of rigor of study during
freshman year closes doors for access to more rigorous levels during
subsequent years of study.
Conclusion
While I have been afforded scores of hours of training in total quality
management, my approach to continuous improvement has been hindered without
the appropriate tools for systematic data collection and analysis. With
the advent of the new technology for data warehousing, each administrator
must become competent in the skills of planning for, improving, and
controlling the quality of the institution's processes and outcomes. This
is an exciting time to prepare for the superintendency.
Earlier in this book, I asked whether we could know if specially designed
elementary programs had any real impact on student achievement in
The first two questions are sufficiently complex to analyze, while the
third is a question of root cause. Glen acknowledged the difficulty in
addressing the third question when he wrote,
This may be a very difficult question to answer because Early Success is only
one of the many initiatives designed to raise academic and reading achievement
that have been initiated during the last three years. We have also added
summer school, after-school tutoring, a homework club, paraprofessional
support in the intermediate grades, and CMT coaching, and we have substantially
improved the elementary reading curriculum by changing from a whole-class
model to one that focuses on guided reading instruction in small groups.
As a result, Glen and I decided not to attempt this question; rather, we
determined to intuit the overall value that the program might contribute to
student achievement as a result of doing the first two analyses. In addition,
we resolved to further focus those questions, which resulted in the following:
• To what extent does the Early Success program help the weakest second-grade
readers become proficient readers?
• Do the students receiving Early Success catch up to their peers?
Once we had focused our questions, Glen began determining what data
were needed and which data were actually available, and started entering
them into a spreadsheet with a structure that allowed for the required
analysis. Glen's study is an important one because it reflects the typical
questions that principals ask about program effectiveness, questions that
require a good deal of analysis to answer. Yet Glen did not have a data
warehouse, nor was he a statistician by trade. As you will see in the
following pages, Glen uncovered some surprising findings after he
scrutinized the data. In his own words, here is what he did and what he found.
Success grade 2, as well as all CMT reading and writing test scores in
grades three through five, and end-of-year reading tests.
DRP scores for the past two years had been entered into an Excel
spreadsheet, but other information was available only on state CMT summary
reports or in students' individual files. Developing the database was
time consuming, but it certainly took less than forty hours to collect and
input the data for 273 students in the four cohorts. A task of this size
could easily be assigned to an administrative assistant, elementary school
secretary, or talented paraprofessional. Although data warehousing experts
have recommended not trying to create databases retroactively, the task is
a very valuable one that can be done in house at a small or medium-sized
elementary school over a reasonably short period of time.
Data Screening
One of the most time-consuming tasks related to the database was
screening and cleaning the data. This process included both selecting and
deselecting students to include in the database and converting Degrees of
Reading Power (DRP) scores so they could be properly compared over time.
Our district has a highly mobile population. Of the 273 original students
in the four classes, only 198 of them took the DRP in grades two, three, and
four (or the DRP in both grades two and three for the current third graders).
Students who hadn't taken the DRP for three consecutive years, or two in the
case of third graders, were removed from the database to be analyzed.
Similarly, it was also decided to remove most students receiving special
education services from the database. Prior to the 2000 test administration,
the State of Connecticut Department of Education (SDE) allowed greater
numbers of special education students to be exempted from the testing.
Although the number of exemptions at the school has been historically low
(well below 10 percent), it was decided to remove four students in the class
of 2008 from the database so the class makeup would be similar to the two
preceding classes.
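The screening steps described here can be expressed as a small routine. The sketch below is only illustrative; the column names, scores, and student IDs are hypothetical, not the school's actual spreadsheet:

```python
# Illustrative sketch of the screening described above: keep only students
# with DRP scores in all required grades, then drop exempted students.
# Column names and values are hypothetical.
import pandas as pd

def screen_cohort(df, required_cols=("drp_gr2", "drp_gr3", "drp_gr4"),
                  drop_ids=()):
    """Return rows with all required DRP scores, minus the listed student IDs."""
    kept = df.dropna(subset=list(required_cols))      # three consecutive DRP scores
    return kept[~kept["student_id"].isin(drop_ids)]   # remove exempted students

# Tiny worked example: one student lacks a grade-three score, one is exempted.
roster = pd.DataFrame({
    "student_id": [1, 2, 3],
    "drp_gr2": [38, 41, 35],
    "drp_gr3": [45, None, 40],
    "drp_gr4": [51, 50, 47],
})
clean = screen_cohort(roster, drop_ids=[3])
```

Only student 1 survives both screens here; on a real cohort the same two lines replace hours of hand checking.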
Touchstone Applied Science Associates (TASA), the developers of the DRP,
report scores based on comprehension levels (p scores): the Independent,
Instructional, and Frustration levels. The DRP scale is an absolute-value,
equal-interval scale, and as such, scores can be converted from one
comprehension level to another for comparison purposes. Scores in grades
two through four are reported at the p = .70 level while fifth- and
sixth-grade scores are reported at the p = .75 level. Converting DRP scores
100 Chapter Five
Figure 5.2. Grade-to-grade percentage improvement in DRP scores, class of 2009 (bar chart: percentage improvement on the vertical axis, grades 2 to 4 on the horizontal axis, with and without Early Success)
Data Analysis
The analysis of the DRP data included several measures: percentage change
in DRP scores; percentage of students above goal and below remedial level;
comparison of students who received Early Success to their class as a whole
and to their peers who didn't receive the intervention; and finally,
correlations and analysis of variance between Early Success participation
and DRP scores.
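Each of these measures can be computed in a few lines. The sketch below uses fabricated scores and placeholder goal and remedial cut points, not the actual Connecticut values, and computes the one-way ANOVA F statistic by hand so the example stays self-contained:

```python
# Illustrative sketch of the four measures named above, on fabricated data.
# GOAL and REMEDIAL are placeholder cut scores, not Connecticut's values.
import pandas as pd

GOAL, REMEDIAL = 56, 38

df = pd.DataFrame({
    "early_success": [1, 1, 0, 0, 0],
    "drp_gr2": [30, 32, 48, 52, 55],
    "drp_gr4": [50, 54, 58, 60, 62],
})

# 1. Percentage change in DRP score, grade two to grade four
df["pct_change"] = (df["drp_gr4"] - df["drp_gr2"]) / df["drp_gr2"] * 100

# 2. Percentage of students above goal and below remedial level in grade four
above_goal = (df["drp_gr4"] > GOAL).mean() * 100
below_remedial = (df["drp_gr4"] < REMEDIAL).mean() * 100

# 3. Early Success mean as a share of the whole-class mean (cf. figures 5.4, 5.5)
es = df.loc[df["early_success"] == 1, "drp_gr4"]
rest = df.loc[df["early_success"] == 0, "drp_gr4"]
es_share = es.mean() / df["drp_gr4"].mean() * 100

# 4. Correlation, and a one-way ANOVA F statistic computed by hand
r = df["early_success"].corr(df["drp_gr4"])
grand = df["drp_gr4"].mean()
ss_between = len(es) * (es.mean() - grand) ** 2 + len(rest) * (rest.mean() - grand) ** 2
ss_within = ((es - es.mean()) ** 2).sum() + ((rest - rest.mean()) ** 2).sum()
f_stat = ss_between / (ss_within / (len(df) - 2))
```

With two groups this hand-rolled F statistic matches what a statistics package would report for a one-way ANOVA, which keeps the sketch free of extra dependencies.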
Figure 5.3. Second to third grade improvement in DRP scores, class of 2010 (bar chart: percentage improvement on the vertical axis, grades 2 to 3 on the horizontal axis, Early Success group)
Figure 5.4. Mean DRP scores of Early Success students and all students, class of 2009. Gr. 2: ES = 57.05% of All; Gr. 3: ES = 78.36% of All; Gr. 4: ES = 86.42% of All
from grade two to three. The students who didn't receive Early Success
would have had to improve by over 30 DRP points to achieve the same
percentage increase.
Not only have the DRP scores of Early Success students risen dramatically
from grade two to grade three to grade four, their scores are getting closer
to those of all students in their grade.
In other words, the students receiving Early Success in grade two are
catching up to their grade level as a whole and to their peers who were
stronger second-grade readers. Figures 5.4 and 5.5 show the degree to
which the Early Success students are gaining ground in reading vis-a-vis
their grade level as a whole. In both years the mean of the DRP test
scores in grade two for Early Success students was less than 60 percent
of the mean score for the entire grade cohort. By grade three, the mean
of the Early Success students was above 74 percent of the mean of all
students for both the 2009 and 2010 cohorts. The Early Success students
in the current fourth-grade class of 2009 have reached 86 percent of the
class as a whole.
Figure 5.5. Mean DRP scores of Early Success students and all students, class of 2010. Gr. 2: ES = 53.59% of All; Gr. 3: ES = 74.87% of All
Figure 5.6. Percentage of students above DRP goal and below DRP remedial level in grades 2, 3, and 4 for the 2009 and 2010 classes (paired charts comparing Early Success students with all students)
From grade two to grade four, as shown in figure 5.8, the students who
didn't receive the Early Success program in the class of 2008 actually made
better gains than the students who had the benefit of the program the
following year. Figure 5.8 shows that the DRP scores of the "Early
Success-like" students in the class of 2008 improved by 224.30 percent while
the same scores for actual Early Success students in the class of 2009
improved by 211.95 percent. Fifty percent of the "Early Success-like"
students in the class of 2008 achieved goal on the fourth-grade DRP while
only 31.81 percent of the 2009 Early Success students reached this level.
The numbers of the Early Success groups falling below the remedial level in
grade four were very close.
Figure 5.8. Improvement in DRP from grade 2 to 4 (bar chart by year of graduation: Early Success, all students, and no Early Success)
One would certainly expect the gains of the students who received the
remedial program to far surpass those of the control groups who didn't
receive the intervention. Of course, other factors come into play. For one
thing, the class of 2008 had the highest fourth-grade DRP scores in the
history of the school. The students chosen for the control groups, who seem
similar to the Early Success students in grade two, may not be good matches.
We may simply have too few cohorts of students to make any generalizations
regarding the efficacy of the Early Success program. We may need another
year or two of concrete data to answer these questions adequately.
This analysis certainly supports taking a closer look at the components of
the Early Success program: staff, supervision, training, curriculum, and the
lesson-delivery model. It also makes sense to discuss the results of this
analysis with the Early Success staff in an effort to improve the program,
and to look into other remedial reading programs.
As a final note, this study only looked into changes in the annual DRP
scores. Other important measures of reading achievement, such as classroom
grades, running records, guided reading level, and Developmental Reading
Assessments (DRA), were not taken into account because they fell beyond
the scope of this analysis. Those other outcome variables would need to be
considered before any final determination regarding the Early Success
program could be made.
takes time and effort unless there is a data warehouse in place. Even
then, the data need to have been loaded into the data warehouse (which
sometimes hasn't been done because some of these data are not considered
important enough to load until an administrator asks a question about
achievement). Fourth, the analysis work must be completed, which takes
time and skill. Fifth, the initial data analysis typically results in
answers that only beget other questions for follow-up study. And finally,
completing this work more often than not results in findings that are less
than definitive, findings that may only inform your intuition about what
should be done.
Viktor then assembled the needed data, which included Cognitive Abilities
Test (CogAT) and Connecticut Mastery Test data for the two groups:
Multiage students (students who remained in the same Multiage classroom
for two years, as first and second graders) and Traditional students
(students who attended separate first- and second-grade classrooms). He
looked at three cohorts of students over a two-year period, thus achieving
a 3 × 2 cohort analysis model as follows:
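The 3 × 2 design, three cohorts crossed with the two classroom types, can be laid out as a pivot grid. The cohort labels and scores below are fabricated for illustration, not Viktor's actual data:

```python
# Hypothetical sketch of the 3 x 2 cohort layout: mean outcome score for
# each of three cohorts crossed with the two program types. All labels
# and values are illustrative.
import pandas as pd

students = pd.DataFrame({
    "cohort":  ["A", "A", "B", "B", "C", "C"],
    "program": ["Multiage", "Traditional"] * 3,
    "score":   [61, 58, 63, 60, 59, 62],
})

# Rows = cohorts, columns = program type: the 3 x 2 analysis grid
grid = students.pivot_table(index="cohort", columns="program",
                            values="score", aggfunc="mean")
```

Each cell of `grid` holds a cohort-by-program mean, which is the comparison the 3 × 2 model calls for.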
had access to a data warehouse, most of this work would have been
unnecessary.
The previous examples discussed in this chapter have been either Stage I
or Stage II analyses. In chapter 4, I discussed the application of Artificial
Intelligence software to the challenge of doing root-cause analysis. I will
close this chapter with the application of this technology by one of my
students to an issue his school was addressing.
Here we can see that essentially six variables emerged as important
predictors of GPA and that these variables could be applied by the software
to subsequent cohorts to predict GPA, as discussed in chapter 4: PSAT
writing; CAPT reading, math, and science; Connecticut Mastery Test (CMT)
eighth-grade math; and CMT sixth-grade reading.
To explore this issue more deeply we used the "Find Laws" algorithm
from Megaputer's PolyAnalyst program, conducting a number of analyses
similar to those described in chapter 4. Here we wanted to know what
variables could best be used to predict PSAT verbal scores. The software
yielded the following rule-set:
*captread -
1 4 672
.
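The rule-set is reproduced only fragmentarily above, and "Find Laws" is a commercial algorithm; as a stand-in, the sketch below fits an ordinary least-squares rule predicting PSAT verbal from CAPT reading, one of the flagged predictors. The scores are fabricated, and this is not the software's actual output:

```python
# Illustrative stand-in for a prediction rule: ordinary least squares
# predicting PSAT verbal from CAPT reading. All scores are fabricated.
import numpy as np

capt_read = np.array([210.0, 230.0, 250.0, 270.0, 290.0])
psat_verbal = np.array([38.0, 42.0, 47.0, 51.0, 56.0])

# Least-squares fit: psat_verbal ~ b0 + b1 * capt_read
X = np.column_stack([np.ones_like(capt_read), capt_read])
b0, b1 = np.linalg.lstsq(X, psat_verbal, rcond=None)[0]

# Apply the fitted rule to a new student's CAPT reading score
predicted = b0 + b1 * 240.0
```

Applied to a fresh cohort, a rule like this is what lets the earlier predictors forecast an outcome before it is observed.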
the school does not have a data warehouse, so extracting the essential data
for several additional cohorts was beyond the scope of David's work, and
he has now moved on to another position. When the district installs a data
warehouse, further analyses would be possible for someone who possesses
the same knowledge of statistics as David.
SUMMARY
The examples I present here should be a wake-up call for policy makers.
They should show them just what is needed to carry out this work as they
enact tighter controls on schools, requiring deeper and more sophisticated
analyses of student performance.
NOTES
INTRODUCTION
Clearly, in the eyes of both the Congress and the Bush administration, the
American educational system has failed to meet the needs of all students.
To have come to this conclusion, they must believe that most past reforms
have failed. Even the Clinton administration was an advocate of test-driven
accountability, as the roots of what became No Child Left Behind (NCLB)
began in the late nineties. The result is NCLB, a strong and potentially
illegal federal intrusion into state and local school governance. NCLB, at
the minimum, is an expression of dissatisfaction with past educational
reform attempts on the part of both the political "elites" and the public
in general. Whether for good or bad, NCLB is forcing the educational
establishment to change.
The strategies of change enacted by NCLB, largely relying on the use
of standardized tests and the measurement of schools, not students, were
the only statistically defensible tactics open to the Feds in attempting
to achieve this level of school-based accountability. At this juncture, a
great many of the states lack the capacity to implement all of the
components of the act, even if they desired to do so. Thus, it is an open
question whether NCLB, in its current form, will survive the next
reauthorization.
Regardless, the bipartisan support that this act garnered is a clear
expression of dissatisfaction with the public school system by Congress and
the administration. And while it is probably too late to stave off the
growing movement of charter schools, vouchers, and some of the sanctions
imposed by NCLB on habitually failing public schools, such as providing
116 Chapter Six
• We have been through A Nation at Risk, Goals 2000, and now No Child
Left Behind: what is different, and will it work? What impact will the
limits of testing have on this agenda? What have been some recent abuses?
• How can tests be used positively, and how can the limits of testing be
mitigated in the decision-making improvement process?
• What are the flaws in year-to-year, grade-to-grade level comparisons as
required by NCLB, as opposed to following a cohort of the same students
over time? What value can doing year-to-year assessment have?
• Given the hard work and sophisticated technology that DDDM requires,
what are the implications for our expectation that accountability-driven
reform will succeed?
• What are the national, state, and local policy implications?
some meaningful change in at least the state's urban districts. Connecticut's
problems are not unique, and we can see across the nation an attempt to
address poor achievement through a program of federally mandated process
controls and sanctions.
There are many who argue that the "failure" of public education is
contrived, based on the inappropriate use of test scores to evaluate
schools. In fact, there are those who would blame educators themselves
as the cause of this degradation of public confidence, alleging that we
have failed to be honest about testing by pointing out its limitations for
such school evaluation purposes.1 At the very least, test scores need to be
part of an overall assessment process that uses a host of measures to be
valid (APA2). Part of the problem has been a lack of consensus among
educational insiders as to which reforms will work and which cannot. For
example, there is ongoing debate about whether we should have a stronger
standards-based approach and whether we should test teachers. We debate
"more is less" while Carnegie units remain entrenched in the national high
school psyche, and so on. We can see this ongoing debate played out from
A Nation at Risk to H.R. 1804, Goals 2000: The Educate America Act,3 to
the meeting of the nation's governors in Palisades, New York, in 1996.4
Moreover, the limitations of testing can confuse or mask deeper problems.
For example, Hannaway argues that one essential dilemma may be with the
schools' control structures, which are masked by the already cited problems
with performance assessments. She concludes that,
There are also more esoteric arguments about the problems associated
with understanding what test data tell us. For example, English6 argues
that most data (test scores) are not transformational; that is, we cannot
learn enough from them to design meaningful decisions given the inherent
complexity of schools.
school' is it now, anyway? What should be the curriculum and character
of public schools for the emerging 'new' America? Which of the many
solutions being proposed should be adopted?"17
It may be that the very point of NCLB is to serve as a political lever to
force change, any change, from the current one-size-fits-all, monopolistic
system of American public education. Is Greenberg18 right? Is this the
"train wreck" public education is headed toward? After all, even if one
accepts the arguments for using testing in the manner mandated by the law,
the conclusion that not all children can succeed under the current system
is inescapable. We know that poor achievement is closely related to
poverty, changing family structures, and parents' educational level,
factors well beyond a school's control (see Biddle,19 Grissmer et al.,20
and Rothman21). Thus, how can all of the nation's schools succeed in
leaving no child behind when so many external factors are outside their
direct control? They probably cannot, and alternative options will likely
come onto the scene in greater numbers.
If tests are going to be used for school improvement, there is a more
appropriate way to proceed. NCREL,25 drawing on the work of Hills,26 notes
that any serious (program) evaluation effort must match the purpose of the
evaluation to the type of assessments needed for that purpose. They argue
that to assess teaching and children's progress appropriately, one needs to
use formal assessments (appropriate criterion-referenced achievement
tests) along with informal assessments (teacher-made tests and procedures,
observation, and analyses of work samples).
Popham27 argues that standardized achievement tests are misnamed; they
do not measure what we think they measure, namely, what students have
learned in school. Rather, they are designed to measure differences across
students. There are rather strict development rules for creating these
tests to ensure test validity (does the test measure what it is intended
to measure?) and test reliability (does the test produce the same results
over multiple administrations?) (see Dietel28 and the National Forum on
Assessment29).
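These two properties can be made concrete. Test-retest reliability, for instance, is commonly estimated as the correlation between scores from two administrations of the same test; the sketch below uses fabricated scores:

```python
# Illustrative sketch: test-retest reliability estimated as the Pearson
# correlation between two administrations of the same test. Scores are
# fabricated for illustration.
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sum((x - mx) ** 2 for x in xs) ** 0.5 *
                  sum((y - my) ** 2 for y in ys) ** 0.5)

first_admin = [48, 52, 61, 70, 75]
second_admin = [50, 51, 63, 69, 77]
reliability = pearson_r(first_admin, second_admin)  # near 1.0 = consistent results
```

A coefficient near 1.0 indicates the test ranks students consistently across administrations; validity, by contrast, cannot be computed from the scores alone and must be argued from what the items actually measure.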
WILL IT WORK?
We need to ask: will this work? Will NCLB have any chance of succeeding
at its fundamental goal of improving the achievement of underperforming
children? Popham argues that it could if states were to utilize
instructionally sensitive tests; that is, tests that are able to "assess
students' mastery of a modest number of genuinely significant skills in
reading and mathematics."43 He notes that states have unfortunately
selected instructionally insensitive tests that are "incapable of detecting
improved student learning, even if learning is present."
What is an example of an instructionally insensitive test? Popham states
it is "any of the nationally standardized achievement tests that, in order to
Simply identifying which schools are failing and which students are
contributing to that condition will not be very useful. We need to drill
down to search for relationships among performance and input data to
help expose which specific areas of learning require intervention, for
which students, and which programs are most likely to succeed at
intervention. These steps require activity at Stages II and III, as
discussed in earlier chapters. And because of the many problems associated
with standardized tests, more specific and appropriate measures of student
performance will be needed to help design meaningful interventions.
Short, Short, and Brinson48 present a useful schema for collecting a wide
range of information that would be needed to make sound decisions, well
beyond test scores. They also review the statistical functions needed to
make these decisions, raising in our minds the challenge of professional
development.
Bringing about change is not easy work. It requires capacity building:
investing in the people who do the work to improve their knowledge and
skill sets. It also requires providing the structural supports to help them
get the work done. The proliferation of decision-support technologies, as
discussed in chapter 2, is an example of structural support. Providing
training on the proper use and interpretation of test scores, for example,
is capacity building.
But school improvement is even more complicated than this, as meaningful
change requires effective local leadership that works on a foundation of
systems thinking and the recognition that systems are totally interrelated
with one another, often yielding unintended outcomes and requiring
flexibility as the enterprise moves toward a common goal.49 Supporting
this point, Garmston and Wellman50 note that school improvement requires
an understanding of systems and their interrelatedness. They observe that
school improvement is a complex task, requiring multiple sources of
information, and, even then, the outcomes are not guaranteed.
Program evaluation is also not a precise science. Hernandez51 explains
that one problem with evaluation is that most strategies "fail to account
for essential information apart from the outcomes, which could later allow
the agencies/programs to interpret and utilize the resulting information."
He argues that when there is alignment among these essential data with
Welcome to the Test-Driven Accountability Era 129
RECOMMENDATIONS
• The federal government should revise the NCLB act to require that each
state set as its baseline achievement level some common national
percentile ranking. Doing so would make for comparisons that are more
reasonable across states and school districts. While NAEP does achieve
the goal of determining how well children do as compared to other nations
and across states, NAEP results are not being used in the same way as
NCLB testing. Thus, to level the playing field across states, but also
to allow states the freedom to continue to select their accountability
SUMMARY
The past several years have brought about major changes in how schools
are governed and managed. Control has systematically shifted from local
educators and school boards to the states and political elites in
Washington. Because education has largely been perceived as a failure over
the last quarter century or more, Congress enacted the most intrusive
test-driven accountability measure feasible to embarrass schools into
improvement and, failing that, to close or reconstitute them, allowing for
broader use of charter schools and giving students the right to attend
more successful schools within a district.
Whether one supports the use of standardized testing for school
accountability seems to be moot. And while it is almost certain now that
some changes will be enacted in the law, what is most interesting and
revealing is that very few in the nation argue against the fundamental
target and goal of the law. Thus, some form of test-driven accountability
and federal control is likely to continue. And this may be the true goal of
NCLB: to break the establishment's stronghold of control on school reform,
which has only led to deadlock, at least as viewed by the political elite.
Thus, we are left with one of two choices: (1) accept the challenge the
law presents and attempt to make systemic changes aimed at improving
the achievement of underperforming students, or (2) ignore the law, which
would more likely result in continued federal and state intrusion and
control. For those who choose to accept the challenge, there are a host of
capacity-building tasks ahead. And, like so many other organizational
problems, little can be accomplished without effective leadership, a topic
to be addressed in the final chapter.
NOTES
138 Chapter Seven
things and rely on tangible information for many of these tasks, the
intangibles may often concern the critical issues that define success or
failure.
ergy and capacity. There is no equation for human energy. Leaders'
The Softer Side of Leadership 141
ration is contrary to people's need for connection and sense of efficacy.
Leaders, especially the great ones, are true to their beliefs and values.
They stand for noble causes, and they "fight the good fight" with integrity
even if failure is probable. Many urban educators are fighting the good
fight today, given the demands of No Child Left Behind and the achievement
gap and its fundamental cause, poverty. Serving and reaching ideals
presupposes that leaders recognize and commit to the ideals that are true
to them. Inspiring leadership does not come from processes or data. It is
born of noble goals. You cannot mobilize support and commitment based on
tactics and numbers. It takes passion and value-driven leadership to
inspire leadership in others.
In schools, another primary relationship leaders have is with students.
It is understood that children are at the center of education and schooling.
But losing sight of children and the sanctity of childhood can result as we
"target" service, pursue measurable standards, integrate technology, and
differentiate our strategies. Children have been dehumanized in some
organizations into customers, clients, stakeholders, target audiences, and
AYP numbers. They are categorized into ethnic groups, ability groups,
quartiles, subgroups, remediation groups, and a phalanx of special education
• Indecisiveness
• Meekness
• Fearfulness
• Avoidance of problems with people, finances, or substantive direction
• Sappy emotionalism
• Weakness
• Illogical or mushy thinking
• Shrinking in the face of crises
In life, it seems, paradox is our companion. The same is true of the
terms "soft" and "hard" when it comes to leadership. It is harder to be
soft, and it is easier to be hard. Being uncompromising, tough, and
directive is the easiest form of management because you strictly follow
regulations and do not expose yourself as a human being. You simply play a
role and focus on consistent processes and superordinate controls and
regulations.
Making people toe the mark is easy, particularly if you have the ability
to impose retribution and punishment. Either by errors of omission (not
knowing better) or errors of commission (purposeful misuse), data-driven
decision making can too often be used destructively. Driving and pushing
people is not a difficult act: it may be frustrating, but it is not
difficult. And for short bursts of time, people may be moved in a
particular direction by this approach and show increases in performance.
"Hard" leaders can move people, but they may not motivate them.
Working from fear drives out the love people have for their work and for
the organization. Fearful people may be moved to do something in the
face of power or control, but they will not commit and be motivated to go
beyond the defined request. They don't add their creativity or
innovativeness to the solving of problems. "Tough" leaders who operate
from a "you don't have to like it, you just have to do it" framework leave
the organization sapped of pride and passion.
So-called "soft" leaders, who work from understanding, compassion,
and commitment to stewardship, are concerned with wholeness and integrity.
When things are disharmonious, a leader does not break the spirit
or damage the talent in the system. Schools run on talent, and talent should
not be intimidated out of people. Creative people will turn their talents to
beating the system, not enhancing it. That is why creating a compassionate
and caring environment is so important.
The intangibles, or soft side of leadership, are wrapped in our relationships
with ourselves and with others. Facing issues with pride and passion
comes from within us, as individuals, and from the groups we work with.
Destroying a relationship has repercussions throughout the organization.
146 Chapter Seven
Tough leaders may get immediate action but cause the organization to
wither and die in the long term because cancerous conduct depletes motivation
and will. Short-term movement is a big price to pay at the expense
of long-term growth and learning.
Soft leadership is something else. It requires the biggest risk a leader
can take: exposing oneself as a real person in an authentic way, complete
with one's vulnerabilities and unique humanness. Softness involves
being subtle, sensitive, flexible, open, approachable, and perceptive.
It does not mean foisting emotions on people, being smarmy,
or breaching the normal barriers of propriety we all maintain. But it
does mean:
From this perspective, soft leadership can produce two bottom lines: results
in terms of desirable outcomes and productivity, and strong connections
and ties between people who long to belong to an enterprise in which
they can use their talent to serve and to act as stewards of the common
good. Some athletes are said to have soft hands. They can catch the hard
throws or they can make shots that require finesse and sensitivity. Leaders
need softness, too, to deal with the conflict and the complex challenges
they face, as well as to create productive relationships with people.
Conflict is not a bad thing, and neither is confusion, which often is conflict's
companion. In fact, conflict is inevitable in life because life is not a
place of equilibrium. In searching for homeostasis, conflict happens,
change occurs, and energy is expended.
Leaders must sometimes create conflict if they are to act as stewards meeting
their obligations. Conflict comes from a variety of sources and is created
when:
Conflict and change can be painful. Sometimes pain has little to do with
physical hurt. Many times it is the emotional pain of cutting programs that
we developed but that haven't produced results. Or it is confronting people
we like and respect with data that are negative, or signaling that their behavior
is incongruent with values and obligations. Pain comes from not
having the skills to meet new challenges or adapt to change.
When people don't want to do things that contribute to a successful
and positive environment, leaders must be a source of feedback and the
gatekeepers of credibility and integrity. That feedback may hurt and
cause concern. At times, both leaders and followers must face difficult
choices. As stewards, professionals carry great weight on their shoulders
to do no harm and to address issues. Stewards do not belittle
or damage potential and integrity. They are builders, not destroyers,
and must "be" and "act" with integrity, passion, and commitment to
principles.
In other situations, some individuals may head on a course that is contrary
to principles and values, or take positions of ego-driven self-interest
and not consider the common good. Leaders then must confront the issues
and help people return to projects, issues, and approaches that are in
harmony with the school's values. Compassion means helping people,
even if confrontation is necessary. The issue is how the confrontation is
approached.
Gandhi believed that all leaders must know their attachments; otherwise
they will succumb to them in subtle ways and compromise their principles.
Attachments are the "relationships, possessions, privileges, and other
components of our life we do not want to give up." Knowing yourself
means knowing your attachments.
Leaders can be attached professionally to competence, status, power,
and acceptance. On a personal level, they can be attached to status,
money, geography, or image. Some of these attachments are conscious
ones, and others may be subconscious. Discovering them, however, is essential;
otherwise they can cloud judgment and decisions.
Attachments to power, privilege, and possessions can make integrity
to principles difficult if we do not identify them. These attachments can
create subtle confrontations within us. Do we risk jeopardizing an attachment
to do the right thing? For example, many leaders are attached to
competence: being perceived by others as capable and assured. Competence
is demonstrated by success, and success leads to security. That
is why some leaders and executives cannot admit a mistake or indicate
a shortfall in reaching goals. Failure is anathema to people attached to
always being right and having their self-image connected to recognized
success and status.
Competence, success, and security are positive things in life. The issue
is attachment. For example, if the data indicate that goals are not being
reached and a leader's self-esteem or job security might be threatened,
will he or she speak out and candidly report the data while attached
to competence and security? There are examples in all sectors of life
where data were stretched, false pictures were painted, issues were buried,
or scapegoats were identified to protect a person's image.
People in danger of losing their power and authority will sometimes
compromise principles and fall into self-protection, secrecy, or ducking.
Falling short offers the opportunity, not only for growth, but also for
creative problem solving and deepened commitment.
APPROACHES TO LEADERSHIP
Distance is also created when we value numbers more than people. Leaders
who are "distant" lack passion and a sense of commitment to the people
they lead, and treating people as interchangeable parts does not create respect
and commitment. Distance between leaders and followers dooms an
organization to failure.
People long for connection, not separation. They long for connection
between their work and purpose, between people and cooperation, and between
leaders and their aspirations. They do not yearn for leaders as vestiges
of distant power or leaders more interested in material things than in
people and principle. Essentially, people want leaders who are human and
approachable and who do not hide behind the façade of a title.
Motivation comes from efficacy and connection, both of which can pull
an organization to higher levels of performance. Nurturing and closeness
are keys to high performance. Seeing the wonder in human beings and
providing an environment in which they can learn and contribute is a
leader's obligation.
In our penchant to manage, we sometimes forget the corollaries to
producing success and the major metric indicators. The paradox is that
to produce tangible numbers, leaders must rely on the intangibles of
imagination, creativity, spontaneity, synchronicity, serendipity, and joyful
fun. High-performing, successful organizations have people with
these qualities. They are not dour and humorless places that operate
long-term by highly regimented behavior. There is little distance in these
places between people and their passion and commitment.
did another because it did not feel "right" or just did not "fit" our intuitive
sense. Leaders must use all their ways of knowing in order to make both
kinds of decisions in a chaotic and nonrational world. They adjust, "go
with the flow," take "half a loaf," and wait until "the time is right." Metrics
can provide an analysis, but insight comes from coupling data with our
other cognitive, intuitive, or affective ways of understanding our world.
Leaders are players, not in a manipulative or devious way, but in a positive
manner. They look forward, not backward. They understand that although
chaos in the world exists, they have choices and can respond. Because
they may have created the dilemma, they accept responsibility and
feel an obligation to find a solution. They have a sense of efficacy and are
"response-able." They use their data, their relationships, and their intuition
to make conscious choices and to respond. Leaders are mindful and aware,
whereas victims wear the shawl of innocence and engage in blaming.
Smart, talented people who produced and analyzed numbers efficiently
led Enron and other failed corporations. Leaders need heart to pursue noble
goals courageously in the face of daunting odds, and they need heart
to risk, persevere, and challenge even if rational logic dictates otherwise.
Overcoming odds, in the face of criticism from the majority, takes courage
and peace of mind.
Second, heart addresses critical qualities of healthy relationships. It is
not a "take no prisoners" approach or about seeking retribution. Healthy
relationships require loyalty, patience, understanding, compassion, and
forgiveness. They are rooted in love. Leaders demonstrate love for the
people with whom they work by helping them reach their full potential in
contributing to a cause larger than their own self-interest. And leaders do
this by exhibiting patience as their staff learns the new literacies required
in this test-driven accountability era.
Finally, leadership engages the human spirit. It requires the spirit of creativity,
imagination, and wonder. It requires moral purpose and alignment in
pursuing noble goals and virtue. It lives in wisdom's house and nurtures the
innate goodness of people and life. It rejoices in the wonder of the universe
and life itself and uplifts our souls in the face of events. It challenges us to
learn and celebrate life no matter our circumstances. It calls us to be happy
and to find our calling without regard to our attachments or the expectations
of others. It calls us to live our lives vigorously and courageously.
• Synergy - Is the whole greater than the sum of the parts? Leadership develops
organizations that exceed the individual potential of the people
within them. People are not parts. Collective action should supersede the
individual ability of each person. Some organizations perform below
their potential because leaders were hardheaded or wrongheaded in nurturing
an individual's potential.
• Talent - Leadership increases the organizational and individual capacity
to create, to respond, to develop, to build, and to succeed. Some leaders
do not build the skills, knowledge, or relationships required for success.
• Creativity - Do people feel free to find solutions creatively and innovatively?
Are they intimidated by failure? Do they rigidly stick to plans
and fear improvising? Do they act with integrity regarding values and
mission? Are they victims or players?
• Success - Do people feel a sense of commitment and obligation? Do
they ensure that the organization is not simply effective at mindless and
irrelevant things? Do they ensure that the values and moral obligations
of the profession are maintained? Are they courageous and safe in presenting
dissenting opinion and thinking outside of the box?