The Effectiveness of International Environmental Regimes: Comparing and Contrasting Findings from Quantitative Research
Arild Underdal
University of Oslo
and
Oran R. Young
University of California, Santa Barbara
This article uses quantitative methods to deepen and broaden our under-
standing of the factors that determine the effectiveness of international
regimes. To do so, we compare and contrast the findings resulting from
two major projects: the Oslo-Seattle Project and the International
Regimes Database Project. The evidence from these projects sheds consid-
erable light on the determinants of regime effectiveness in the environ-
mental realm. Clearly, regimes do make a difference. By combining
models and data from the two projects, we are able to move beyond this
general proposition to explore the significance of a number of individual
determinants of effectiveness, including the distribution of power, the
roles of pushers and laggards, the effects of decision rules, the depth and
density of regime rules, and the extent of knowledge of the relevant prob-
lem. We show how important insights emerge not only from the use of sta-
tistical procedures to separate the effects of individual variables but also
from the application of alternative techniques, such as Qualitative Com-
parative Analysis (QCA), designed to identify combinations of factors that
operate together to determine the effectiveness of regimes. We use our
results to identify a number of opportunities for additional research fea-
turing quantitative analyses of regime effectiveness. Our goal is not to dis-
place traditional qualitative methods in this field of study. Rather, we seek
to sharpen a set of quantitative tools that can be joined together with the
extensive body of qualitative studies of environmental regimes to
1
The authors thank Ivar Torgersen for assistance in data formatting, and the Department of Political Science,
University of Oslo, for financial support. A preliminary version of this study was presented at the ISA 2009 Annual
Convention. Useful and encouraging comments from the panel discussant, Peter M. Haas, and other participants
are gratefully acknowledged. This version has benefited substantially from comments and suggestions by three
anonymous reviewers. The data presented by the authors are provided in replication files at
http://www.sv.uio.no/isv/english/people/aca/stvau1/index.html.
Breitmeier, Helmut, Arild Underdal, and Oran R. Young. (2011) The Effectiveness of International Environmental Regimes:
Comparing and Contrasting Findings from Quantitative Research. International Studies Review, doi: 10.1111/j.1468-2486.2011.01045.x
2011 International Studies Association
580 Effectiveness of Environmental Regimes
In this article, we compare and contrast the findings of ERE and AIER in the
interests of documenting what we know about the effectiveness of regimes,
exploring differences in the literature regarding the determinants of effective-
ness, and developing suggestions for next steps in research in this field. The next
section briefly describes the approaches ERE and AIER adopt together with the
procedures they use to generate results. The following section comments on
areas in which the two studies yield common conclusions. The third section pro-
vides a more detailed analysis of areas in which the conclusions of ERE and
AIER (appear to) differ and sometimes even conflict with one another. The final
section draws on the preceding analysis to make recommendations for next steps
in quantitative research on the effectiveness of international regimes.
produced the IRD, a large database that is accessible electronically and available
for use by all interested researchers.
Whereas ERE focuses almost entirely on the question of effectiveness, AIER
includes data on a wide range of themes relating to regimes more generally. Still,
both projects treat effectiveness as a critical dependent variable, and they con-
ceptualize this variable in a manner that is broadly comparable. ERE tracks effec-
tiveness both in behavioral terms and in functional or problem-solving terms
(Miles et al. 2002:4–7). It asks coders to specify changes in behavior relative to a
hypothetical state of affairs absent the regime but otherwise the same as the real
world (the no-regime counterfactual). ERE also asks coders to rate the perfor-
mance of a regime on a continuum running from the no-regime counterfactual
to an outcome defined as the collective optimum. The no-regime counterfactual
takes on the character of a worst-case outcome against which actual achievements
typically appear in a favorable light. The collective optimum is reached ‘‘when
no further increase in benefits to one party can be obtained without leaving one
or more prospective partners worse off’’ (Underdal 2002a: 9). The collective
optimum sets a high standard against which actual achievements must be
assessed.7 AIER, by contrast, includes data on a broader set of consequences,
encompassing information about effects framed as outputs, outcomes, and
impacts as well as about effects outside the issue area of the regime in question.
It has no direct analog of the concept of the collective optimum. The emphasis
in both projects falls on the effects of regimes in fulfilling stated and unstated
goals and in solving the problems that led to their creation. The data protocol
for each project separates out the issue of causality, asking distinct questions not
7
This standard is also demanding in terms of operationalization (see Young 2001; Hovi, Sprinz, and
Underdal 2003; Mitchell 2008).
Helmut Breitmeier, Arild Underdal and Oran R. Young 583
only about the fulfillment of goals and the solution of problems but also about
the causal force of the regime in bringing about these results.
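The two reference points described above suggest a simple relative score. The sketch below is one illustrative way to express it, assuming that the actual outcome, the no-regime counterfactual, and the collective optimum can all be placed on a common numeric scale. The function name and the example numbers are hypothetical; ERE itself relies on coder ratings along this continuum rather than on a computed ratio.

```python
def relative_effectiveness(actual, no_regime, optimum):
    """Position of the actual outcome between the no-regime
    counterfactual (score 0.0) and the collective optimum (score 1.0)."""
    span = optimum - no_regime
    if span == 0:
        raise ValueError("counterfactual and optimum must differ")
    return (actual - no_regime) / span


# Hypothetical numbers: 100 units of pollution without the regime,
# 40 units at the collective optimum, 85 units actually observed.
score = relative_effectiveness(actual=85, no_regime=100, optimum=40)
print(round(score, 2))  # 0.25
```

In this made-up case the regime clearly moves the system away from the counterfactual, yet it covers only a quarter of the distance to the collective optimum.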
At this juncture, ERE and AIER move in different directions. ERE proceeds to
construct a model of regime effectiveness and to derive some hypotheses from
this model that can be ‘‘tested’’ using data included in the project’s data set
(Miles et al. 2002:37 and 460–462). The ERE ‘‘core model’’ specifies that two
complex variables called ‘‘problem malignancy’’ and ‘‘problem-solving capacity’’
account for the variance in ‘‘regime effectiveness’’ either directly or through
their impact on an intervening variable designated ‘‘level of collaboration.’’ It
then proceeds to derive a series of hypotheses of the following sort: There is an
inverse relationship between level of malignancy of the problem and the success
of the regime in terms of problem solving. Several of the variables in the ERE
core model are highly aggregated. Problem-solving capacity, for example, sub-
sumes information about the institutional setting, the distribution of power, and
the skill and energy of key players. Despite limitations imposed by its small uni-
verse of cases, ERE makes some effort to disaggregate these composite variables.
Analyzing International Environmental Regimes: From Case Study to Database (AIER)
adopts a different strategy. This project focuses on a range of important issues in
ongoing debates about international regimes and creates the IRD to allow ana-
lysts to test claims regarding these issues in quantitative terms (Breitmeier et al.
2007: 49–55). To illustrate, some analysts argue that international regimes can-
not be effective because they lack enforcement mechanisms needed to induce
subjects to comply with their requirements. AIER examines this proposition from
a number of angles. It raises questions, for example, about what have become
known as the enforcement and management models of compliance (Chayes and
Chayes 1995; see also Downs, Rocke, and Barsoom 1996; Victor, Raustiala, and
Skolnikoff 1998). Similarly, commentators often assert that international regimes
cannot be effective because they rely on decision rules requiring consensus or
even unanimous consent. AIER therefore uses its database to ask questions about
the relative effectiveness of regimes employing different decision rules. The IRD
also provides data usable to explore ideas about regime effectiveness that go
beyond the sphere of regulation. It asks questions, for instance, about the roles
regimes play in generating knowledge both about the nature of the problem and
about the feasibility of different solutions.
These strategies lead ERE and AIER researchers to structure their arguments
differently. Because ERE and AIER define their dependent variables in a manner
that is reasonably similar, however, we can compare and contrast the results they
produce, noting areas where these results are compatible and exploring areas
where the results appear to differ. The differences between the projects with
regard to research strategies constitute an advantage in some areas, since they
allow analysts to engage in quantitative assessments of regime effectiveness that
make use of two distinct modes of reasoning.
Common Findings
Both studies provide strong support for the proposition that regimes do matter,
though the contributions they make vary considerably on the basis of a variety of
conditions involving the nature of the problem, the character of the regime
itself, and attributes of the broader setting in which it operates.8 The two pro-
jects support the general conclusion that many regimes have a strong or at least
a moderate causal effect in producing observed outcomes and impacts. A more
detailed assessment of the ERE and AIER data sets demonstrates that this role
8
These findings, we believe, should lay to rest at least the more extreme assertions of Strange (1983) and
Mearsheimer (1994/1995).
of affairs that would have existed in its absence. It is possible from this perspec-
tive for a regime to move a system some distance from the no-regime counterfac-
tual yet receive a relatively low score on effectiveness because the outcome leaves
a lot to be desired with regard to movement toward the collective optimum.
Such situations cannot occur in the findings reported in AIER. The IRD mea-
sures effectiveness without reference to some notion of a collective optimum; it
compares the state of goal attainment and problem solving that existed at the
beginning and the end of a specific time period. A judgment about the causal
role of a regime in accounting for observed changes with regard to goal attain-
ment or problem solving is made only in another step following these measure-
ments. It follows that a regime that gets a score of 4 or 5 in terms of both
problem solving and causal influence in AIER’s ranking system can end up with
a lower score in the ERE ranking system. Once this difference is factored into
our assessment, it becomes apparent that the general conclusions of the two
studies regarding effectiveness are broadly compatible. Regimes do matter—
sometimes significantly—but they ordinarily operate in circumstances where a
number of interactive forces give rise to conditions of complex causality.
TABLE 2. The Samples as Described in the Two Databases—Descriptive Statistics for Selected
Variables
IRD ERE
(Notes. Some of the scales used in the International Regimes Database [IRD] and Environmental Regime Effectiveness:
Confronting Theory with Evidence [ERE] differ. To facilitate comparison, we have, except for the two capacity
components, translated the values originally assigned into scores on a standardized scale ranging from .00 [all cases
assigned lowest value in the codebook] to 1.00 [all cases assigned highest value].)
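The standardization described in the notes can be sketched as follows. The notes fix only the endpoints of the scale, so the linear (min-max) rescaling used here is an assumption, and the function names and codings are hypothetical rather than taken from either codebook.

```python
def rescale(value, lowest, highest):
    """Map a codebook value onto the 0-1 range used for comparison."""
    return (value - lowest) / (highest - lowest)


def standardized_mean(values, lowest, highest):
    """Mean of the rescaled values: 0.0 if every case has the lowest
    codebook value, 1.0 if every case has the highest."""
    rescaled = [rescale(v, lowest, highest) for v in values]
    return sum(rescaled) / len(rescaled)


# Hypothetical codings on a 1-5 scale and on a 0-3 scale.
five_point = [2, 4, 5, 3, 4]
four_point = [0, 2, 3, 1]
print(round(standardized_mean(five_point, 1, 5), 2))  # 0.65
print(round(standardized_mean(four_point, 0, 3), 2))  # 0.5
```

Once both samples are expressed on the same 0-1 range, their means can be compared directly even though the original scales differ in length.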
9
One interesting concept developed to facilitate comparison across problem-regime complexes is that of
‘‘regime effort units’’ (see Mitchell 2004).
10
In the international regimes literature, pushers are actors that become advocates or leaders in the formation
and implementation of regimes.
advantageous position than ERE does.11 Combining these observations does lead
to the conclusion that AIER offers a somewhat ‘‘brighter’’ picture of regime per-
formance and task environments than ERE does, though this finding seems to
be attributable in part to differences in the selection of cases.12
Multivariate Analysis
More interesting than a simple comparison of descriptive statistics is the question
of whether the two data sets point to the same determinants of ‘‘success’’ and
the same causes of ‘‘failure.’’ To answer this question, we need estimates of the
effects of changes in one or more independent (and possibly also intervening)
variables on regime effectiveness. Since we are dealing with small or at best mod-
erate samples of cases, we need a research design that enables us to make effi-
cient use of scarce data. To meet this requirement, we combine three different
methodological techniques. One is partial correlation in which the statistical
effect of a certain independent variable is measured stepwise by controlling for
each of the other independent variables included in the model one by one (for
example, through trivariate analysis). From the resultant computer runs, we
report average partial correlations as well as the lowest and highest scores. Next,
to be able to estimate effects of two or more variables simultaneously, we run
binary logistic regressions. Here, we dichotomize each dependent variable and
keep the number of values on all independent variables low (at most 3) in order
to reduce the problem of empty cells. The first technique allows us to take
advantage of information contained in nuances in the coding of each variable
and to maximize the number of cases included. Logistic regression allows us to
measure the effect of two or more variables simultaneously, although at the cost
of sacrificing potentially important nuances in the original coding and in many
instances reducing the number of cases included. We use both of these
approaches primarily to separate the influence of individual variables. An equally
important challenge is to identify combinations of factors that are necessary or suf-
ficient to produce particular outcomes. For this purpose, we turn to a third tech-
nique of analysis, Ragin’s (1987, 2000) qualitative comparative analysis or QCA.13
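The trivariate procedure described above can be sketched with the standard first-order partial correlation formula, which needs only the three pairwise Pearson correlations. This is a minimal illustration, not the authors' code: the variable names and the eight codings are hypothetical, and significance testing is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def trivariate_summary(data, predictor, outcome):
    """Average, lowest and highest partial correlation of predictor and
    outcome, controlling for each other variable one at a time."""
    controls = [k for k in data if k not in (predictor, outcome)]
    scores = [partial_corr(data[predictor], data[outcome], data[c])
              for c in controls]
    return sum(scores) / len(scores), min(scores), max(scores)


# Hypothetical codings for eight regimes (not taken from ERE or the IRD).
data = {
    "malignancy":    [3, 2, 3, 1, 1, 2, 3, 1],
    "uncertainty":   [2, 3, 3, 1, 2, 1, 2, 1],
    "power":         [1, 2, 1, 3, 2, 3, 1, 3],
    "effectiveness": [1, 2, 1, 4, 3, 3, 2, 4],
}
avg, low, high = trivariate_summary(data, "malignancy", "effectiveness")
print(f"average={avg:.2f} lowest={low:.2f} highest={high:.2f}")
```

Reporting the average together with the lowest and highest score shows at a glance whether an association survives regardless of which control variable is introduced.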
All these approaches have important limitations, especially when applied to
small-N data sets. Nevertheless, we report results from the use of each of these
techniques because we believe that even findings based on a small number of
observations may provide interesting clues for interpreting observations and
identifying hypotheses for further research. By combining the three techniques,
we are able to get more out of the available data than we could by relying
entirely on a single approach. Our analysis supports results that come out as
clear and consistent across the two data sets using different methodological tools
as well as alternative specifications of causal models. Results that differ signifi-
cantly from one data set to the other or turn out to be highly sensitive to the
specification of causal models call for more sophisticated analysis. Such differ-
ences may also indicate that more hard work is required to improve the validity
and reliability of data sets available for regime analysis.
The Oslo-Seattle team conceived of variance in regime effectiveness as a
function of two basic determinants—the nature of the problem and the capacity of
regimes to solve or alleviate problems of the relevant types (Underdal 2002a).
They argued that there are at least two factors that can make a governance
11
ERE distinguished between power in the ‘‘basic game’’ (the system of activities to be governed) and power in
the ‘‘negotiation game.’’ The analysis in the book refers mainly to the former. The difference between ERE and
AIER scores would have been less had we focused on the negotiation game.
12
Separate analysis shows that the pattern is consistent across different ‘‘generations’’ of regimes.
13
Manuals and software available at http://www.u.arizona.edu/~cragin/fsQCA.
problem hard to solve. The problem may be intellectually complex and demand-
ing or for some other reason not well understood. It may also generate a politi-
cally ‘‘malignant’’ configuration of interests. ERE treats problem-solving capacity
as a function of three main components—the institutional setting, the distribu-
tion of power, and the supply of informal leadership. This project’s core model
also includes one intervening variable—level of collaboration, a variable describ-
ing types of functions subject to collective (or ‘‘centralized’’) control. The ERE
data set also includes information on some other regime properties and interme-
diate achievements that qualify as intervening variables.
We now confront ERE’s core model with data from the two projects. We pro-
ceed in five main steps. First, we report trivariate partial correlations between
each of the main independent variables and two variables describing outcomes.
As a second step, we extend this analysis by exploring the impact of two sets of
variables that may be considered intervening. A third step involves a switch to
binary logistic regression, reporting results of four (in the case of ERE) or five to
six (in the case of AIER) alternative models for each outcome variable. Our
fourth step shows configurations of factors associated with high and low effective-
ness. Finally, we turn to change within regimes, asking to what extent increasing
or declining effectiveness can be accounted for by the same set of variables. The
fact that the two data sets focus on somewhat different concepts of regime per-
formance and, to some extent, on different independent and intervening vari-
ables calls for caution in comparing results.
With one remarkable exception, the overall impression arising from Tables 3
and 4 is one of similar effects. In both tables, regimes achieve lower performance
scores when faced with problems that are poorly understood or characterized by
severe political malignancy. Malignancy appears to be a somewhat more severe
obstacle in the analysis of AIER data than in the ERE data set. Further scrutiny
reveals that the impact of malignancy may also be contingent upon other factors.
In ERE, Underdal (2002b:443) found that uncertainty and malignancy interact.
A solid knowledge base serves to mute the effect of malignancy; high uncertainty
and high malignancy interact synergistically and emerge as a ‘‘lethal’’ combina-
tion with regard to problem solving. In AIER data, a similar but weaker interac-
tion effect is found for the contribution regimes make to problem change but
not for their impacts on compliance. In most runs, regimes using majority voting
do somewhat better than those requiring unanimity or consensus. This effect is
somewhat stronger in the AIER data set.
Independent Variables
Behavioral change
Average -.36 -.11 .01 .20 .40
Range -.33* -.38* -.02 -.09 -.11 .07 .18 .23 .38* .42*
Problem-solving
Average -.31 -.12 .08 .32 .21
Range -.31 -.32 -.04 -.19 .00 .14 .27 .35* .19 .23
(Notes. This table shows partial correlations between two measures of regime effectiveness and five independent
variables when controlled for each of the other independent variables, in trivariate runs. [Institutional capacity
is an index and is not used here as a control variable.] *p<.05, **p<.01, ***p<.001. N = 29–35. For definition of
variables, see Appendix.)
Independent Variables
Intervening Variables
Behavioral change
Average .46 .13 .59
Range .45* .47** .10 .16 .59*** .60***
Problem-solving
Average .39 .02 .27
Range .33 .45* -.04 .08 .25 .29
TABLE 6. Analyzing International Environmental Regimes: From Case Study to Database—The Impact
of Intervening Variables
Intervening Variables
Table 6 indicates that regimes where rules qualify as ‘‘deep’’ or ‘‘dense’’ tend
to produce significantly higher scores on both problem change and compli-
ance.16 As indicated by the narrow range of scores and high levels of statistical
significance, this finding is robust. Interestingly, compliance is at best marginally
higher—and regime contributions to problem change significantly lower—where
rules are legally binding. A possible explanation of this result may be that regime
members become more reluctant to undertake legally binding commitments as
rules and regulations become more demanding (Downs et al. 1996). This inter-
pretation would lead us to expect a negative correlation with depth ⁄ density of
rules. But this expectation receives only weak support. Enforcement and manage-
ment approaches do about equally well on both outcome dimensions. Regimes
that contribute to improving the knowledge base tend to do better than regula-
tory arrangements, but the strength of that impact is weaker than in the Oslo-
Seattle data set.
To estimate the effects of two or more variables simultaneously, we turn next
to the use of logistic regression. For each of the outcome variables in the two
studies, we present results for four (in the case of ERE) and five to six (in the
case of AIER) alternative specifications of causal models. What is labeled Model
1 in the tables focuses on the four independent variables included in the ERE
‘‘core model,’’ adapted so that the institutional capacity variable is replaced with
decision rule in use. We then move on to add, stepwise, the variables that we
have introduced above as ‘‘intervening.’’ Since we are already stretching our data
16
While rules may become deeper or denser as a regime matures, there are also variations among regimes in
these terms from the outset.
(Notes. Logistic regression, binary. *p<.10, **p<.05, ***p<.01. Standard errors in parentheses.)
(Notes. Logistic regression, binary. *p<.10, **p<.05, ***p<.01. Level of collaboration not included due to empty-cells
problems. Standard errors in parentheses.)
sets to their limits, these models will include at most three of the independent
variables from the ERE core model. In our analysis of the smaller data set, we
replace one or more of these variables with an aggregate index. In our analysis
of the AIER, we proceed by eliminating the type-of-problem variable that seems
least important. Since we have identified the impact of decision rules and the
distribution of power as critical issues, we keep these two variables in all models
using IRD data.17
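The empty-cells concern that motivates these restrictions can be checked before any model is fitted. When one level of a categorical predictor contains only successes or only failures, the maximum-likelihood estimate for that level is not finite. The sketch below dichotomizes an outcome and lists the empty cells of its cross-tabulation with a three-level predictor; the cutoff, the variable names, and the codings are hypothetical.

```python
from collections import Counter
from itertools import product


def dichotomize(values, cutoff):
    """Code a score as 1 ("high") if it reaches the cutoff, else 0."""
    return [1 if v >= cutoff else 0 for v in values]


def empty_cells(outcome, predictor, levels):
    """Return the (predictor level, outcome) cells with no cases."""
    counts = Counter(zip(predictor, outcome))
    return [cell for cell in product(levels, (0, 1)) if counts[cell] == 0]


# Hypothetical codings for ten regimes (not drawn from either data set).
effectiveness = [1, 2, 4, 5, 3, 4, 2, 5, 1, 4]   # 1-5 scale
decision_rule = [0, 0, 1, 2, 1, 2, 0, 2, 1, 1]   # three levels: 0, 1, 2

high = dichotomize(effectiveness, cutoff=4)
print(empty_cells(high, decision_rule, levels=(0, 1, 2)))  # [(0, 1), (2, 0)]
```

In this invented sample no regime with decision rule 0 scores high and none with decision rule 2 scores low, so a logistic model including this predictor could not be estimated reliably without recoding or dropping the variable.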
Tables 7 and 8 present results derived from the ERE data set. Broadly, the
regression analysis corroborates the conclusions derived from the partial correla-
tion analysis. A distribution of power favoring pushers emerges as the most
important driver of behavioral change, while institutional capacity appears to be
the key to effectiveness measured as problem-solving. Majority voting seems to
17
We have, where feasible, run similar models with ERE data. Results are consistent with the findings summa-
rized below.
although most results tilt in favor of the former. The two problem features
(uncertainty and malignancy) seem to interact synergistically, and the combination
emerges as a significant obstacle to effectiveness (as in ERE).
Tables 9 and 10 do reveal one remarkable contrast with the findings reported
from the partial correlation analysis. A distribution of power in favor of pushers
now seems to have a weak positive effect on compliance and is insignificant for
regime contributions to problem change. Ordinal regression analysis shows that
the divergence between the two data sets in this regard is confined largely to the
scores of neutrals and intermediates relative to those of pushers. There is one
minor surprise as well. While majority voting is positively associated with compli-
ance, results for regime contribution to alleviating the problem vary consider-
ably, with more negative coefficients than positive.
So far, we have sought to separate effects of individual variables and determine
how much of the observed variance in outcomes we can account for with differ-
ent models. We now shift gears to search for combinations of factors associated
with particular outcomes (for example, high compliance). For this purpose, we
use the ‘‘crisp set’’ version of Ragin’s QCA method. This approach is particularly
useful in identifying factors or combinations of factors that are sufficient to bring
about a particular outcome. In reporting results from this analysis, we make no
claim that our findings provide general or foolproof recipes for success. Because
the crisp set version of QCA requires dichotomous variables, our results are sen-
sitive to the cutoff points used in distinguishing ‘‘high’’ and ‘‘low’’ scores as well
as to specifications of the models examined. Some also are based on a very small
number of observations. Still, we regard this type of analysis as important; it can
guide us toward the identification of causal pathways that are sufficiently promis-
ing to warrant further examination by researchers and serious attention on the
part of practitioners.
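The first stage of a crisp-set analysis, building a truth table from dichotomized cases, can be sketched as follows. This is a simplified illustration and not the fsQCA software: it stops before Boolean minimization, and the `min_cases` and `min_consistency` thresholds are hypothetical parameters that only loosely mirror the idea of minimal requirements. The condition names and cases are invented.

```python
from collections import defaultdict


def truth_table(cases, conditions, outcome, min_cases=3, min_consistency=1.0):
    """Group dichotomized cases by configuration and flag configurations
    that are consistently linked to the outcome."""
    rows = defaultdict(lambda: [0, 0])   # config -> [n cases, n with outcome]
    for case in cases:
        config = tuple(case[c] for c in conditions)
        rows[config][0] += 1
        rows[config][1] += case[outcome]
    table = {}
    for config, (n, hits) in rows.items():
        consistency = hits / n
        sufficient = n >= min_cases and consistency >= min_consistency
        table[config] = (n, consistency, sufficient)
    return table


def label(config, conditions):
    """Upper case for presence, lower case for absence, '*' for 'and'."""
    return "*".join(c.upper() if v else c.lower()
                    for c, v in zip(conditions, config))


# Hypothetical dichotomized cases: knowledge, majority voting, pushers.
conditions = ["know", "maj", "push"]
cases = [
    {"know": 1, "maj": 1, "push": 1, "high": 1},
    {"know": 1, "maj": 1, "push": 1, "high": 1},
    {"know": 1, "maj": 1, "push": 1, "high": 1},
    {"know": 1, "maj": 0, "push": 1, "high": 1},
    {"know": 1, "maj": 0, "push": 1, "high": 0},
    {"know": 0, "maj": 0, "push": 0, "high": 0},
    {"know": 0, "maj": 0, "push": 0, "high": 0},
    {"know": 0, "maj": 1, "push": 0, "high": 0},
]
table = truth_table(cases, conditions, "high")
for config, (n, consistency, sufficient) in table.items():
    print(label(config, conditions), n, round(consistency, 2), sufficient)
```

Changing the cutoff used to dichotomize a variable moves cases between rows of this table, which is why results of this kind are sensitive to the cutoff points chosen.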
We distinguish between pathways leading to high scores on measures of effec-
tiveness and pathways associated with low scores. Recognizing that different types
of problems may call for different ‘‘cures,’’ we also distinguish between problems
diagnosed as malignant and those coded as non-malignant. For each of the two
data sets, we start with the adapted version of the ERE core model and move on
to add, stepwise, the intervening variables examined in the statistical analysis.
Beginning with the ERE database, Table 11 shows clearly that there is more
than one pathway to effectiveness. As expected, we find more pathways leading
to behavioral change than to the more demanding goal of problem-solving,
while the reverse is true for low scores. One pathway to high effectiveness does
stand out from this analysis as a focus of attention. It includes a solid knowledge
base, majority voting or high institutional capacity more broadly defined,
and—for malignant problems—a distribution of power in favor of pushers. The
most significant pattern associated with low effectiveness is less clear. But if we
include runs with lower minimal requirements, we can conclude that a combina-
tion of high uncertainty and malignancy with a power advantage in favor of lag-
gards and ⁄ or a demanding decision rule (or low institutional capacity) would be
a fairly safe bet.
If we were to identify one particularly important key to effectiveness, Table 11
points to knowledge. A solid knowledge base is common to all but one of the
pathways leading to high effectiveness, and a weak knowledge base occurs in six
of the nine pathways associated with low effectiveness. A good understanding of
the problem is not sufficient to guarantee success. But what makes knowledge
uniquely important are the facts that only one of the high effectiveness pathways
we have identified works without it and that it is important in dealing with malig-
nant as well as non-malignant problems.
Results from the analysis of the AIER data set provide an interesting combina-
tion of support for and suggestions for revisions, extensions, and refinements of
the conclusions emerging from ERE. The most unambiguous case of conver-
gence concerns the importance of a good understanding of the problem to be
solved. In Table 12, a solid knowledge base appears in all pathways leading to
high compliance and positive problem change. Had we accepted lower minimal
numbers of observations (3,3), we would have been able to identify pathways to
high compliance also for the first three models, and a good understanding of
the problem would have appeared in all of them. The two data sets thus con-
verge on a crisp and clear message: although not a sufficient condition in itself,
a solid knowledge base is an important ingredient in most recipes for success in
creating international regimes. AIER’s analysis using IRD data also reinforces
what emerged as a more muted observation in the analysis of ERE data: knowl-
edge seems more important on the ‘‘positive’’ side than on the ‘‘negative’’ side
(there are many pathways to failure that do not involve high uncertainty).
The most important difference seems to be that the analysis using IRD data
yields more mixed results for two of ERE’s ‘‘capacity’’ variables: power and deci-
sion rules. In Table 12, both variables are somehow implicated in all of the path-
ways where they could be relevant, but rarely in a straightforward additive sense.
This should not come as a big surprise. ERE’s collective-action framework treats
problem-solving capacity primarily as a matter of aggregating divergent prefer-
ences into collective decisions. An analysis of ERE data has already indicated that
this formula does not work well for non-malignant problems (Underdal 2002b).
Our analysis of AIER’s use of IRD data supports this conclusion. With a higher
proportion of non-malignant cases in the AIER data set, we would expect less
prominent roles for power and decision rules.
TABLE 11. Environmental Regime Effectiveness: Confronting Theory with Evidence (ERE)—Pathways to High and Low Effectiveness
(Notes. Ragin QCA Crisp Set Solutions. Minimal requirements when all cases are included are 3 ‘‘right’’, 3 ‘‘false’’ observations. For the two subsets, the corresponding requirements are 2 and 2.
Since this requirement would reduce the small subset of non-malignant cases to <10 valid observations, we report, in brackets, results obtained with 1 and 1. Model ERE-QCA1 is the [adapted]
ERE core model, including Problem understanding [know ⁄ unc], Malignancy [mal ⁄ ben], decision rule in use [maj ⁄ cons], and the distribution of power [push ⁄ lagg]. ERE-QCA2 combines the two
problem features [into benknow ⁄ malunc] and replaces decision rules with the broader concept of Institutional capacity [hicap ⁄ locap]. Upper case letters mean presence of a certain condition,
lower case means absence, and – means that no pathway is found that meets the minimal requirements. * means ‘‘and’’, + means ‘‘or’’).
TABLE 12. Analyzing International Environmental Regimes: From Case Study to Database (AIER)—Pathways to Compliance and Problem Change
MAN
IRD-QCA4 # KNOW*deep*BIND*MAN # #
N (range) 24–34 13–45 28–35 37–47
(Notes. Ragin QCA Crisp Set Solutions. Minimal requirements when all cases are included: 5 ‘‘right’’, 5 ‘‘false’’ observations. For the two subsets, minimal requirements are 2 and 2. Model IRD-
QCA1 is the [adapted] environmental regime effectiveness: confronting theory with evidence [ERE] core model [see Table 11]. IRD-QCA2 adds Rule depth and density [deep ⁄ shal], while model
IRD-QCA3 adds also Management approach [man ⁄ nonman]. In IRD-QCA4, we have left out the two ‘‘capacity’’ variables of the ERE core model [power and decision rules] and added three
AIER variables that we have treated as intervening [Rule depth and density, Rule binding [bind ⁄ nonbind], and Management approach]. Upper case indicates presence of a certain condition,
lower case means absence, – indicates that no pathway is found that meets minimal requirements, and # indicates that there are too few cases left for the analysis. * means ‘‘and’’, + means ‘‘or’’).
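The notation explained in the notes can be read mechanically. The sketch below checks whether a single case satisfies a pathway expression, treating upper case as presence, lower case as absence, `*` as "and", and `+` as "or". The helper names and the example cases are hypothetical, and the two-term expression is invented for illustration; only the expression KNOW*deep*BIND*MAN appears in the table itself.

```python
def term_holds(term, case):
    """A term such as 'KNOW*deep*BIND' holds when every upper-case
    condition is present and every lower-case condition is absent."""
    for name in term.split("*"):
        name = name.strip()
        present = bool(case[name.lower()])
        if name.isupper() != present:
            return False
    return True


def pathway_holds(expression, case):
    """'+' separates alternative terms; any one of them suffices."""
    return any(term_holds(t, case) for t in expression.split("+"))


# Hypothetical cases coded 1 (present) or 0 (absent) on each condition.
case_a = {"know": 1, "deep": 0, "bind": 1, "man": 1}
case_b = {"know": 1, "deep": 1, "bind": 1, "man": 1}

print(pathway_holds("KNOW*deep*BIND*MAN", case_a))     # True
print(pathway_holds("KNOW*deep*BIND*MAN", case_b))     # False
print(pathway_holds("KNOW*MAN + know*DEEP", case_b))   # True
```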
ERE IRD
(Notes. Figures to the left in each cell show the proportion [in %] of outcomes that are correctly predicted, while fig-
ures in brackets show the proportion of predictions that fit outcomes. Predictions are based on the environmental
regime effectiveness: confronting theory with evidence [ERE] core model ‘‘reinforced’’ by the intervening variable
that emerged as the most important in Tables 4 and 6 [‘level of collaboration’ in the case of ERE, ‘rules deep ⁄
dense’ in the case of International Regimes Database (IRD)]. Our ‘‘predictions’’ assume that any change in aggre-
gate score for this model will lead to a similar change in regime effectiveness. This is arguably an overly sensitive
indicator, prone to predict more change than actually occurs. – indicates empty category.)
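The two figures reported in each cell can be computed as follows. This is a minimal sketch under stated assumptions: changes are reduced to three directions (up, same, down), and the function names and input values are hypothetical illustrations, not ERE or IRD data.

```python
def direction(change):
    """Reduce a change in score to 'up', 'down', or 'same'."""
    if change > 0:
        return "up"
    if change < 0:
        return "down"
    return "same"


def prediction_fit(predicted, observed, category):
    """Return two percentages for one category of change.

    First value: share of observed outcomes in `category` that were predicted.
    Second value: share of predictions of `category` that fit the outcome.
    None marks an empty category.
    """
    pairs = list(zip(predicted, observed))
    hits = sum(1 for p, o in pairs if p == category and o == category)
    n_observed = sum(1 for _, o in pairs if o == category)
    n_predicted = sum(1 for p, _ in pairs if p == category)
    share_of_outcomes = 100 * hits / n_observed if n_observed else None
    share_of_predictions = 100 * hits / n_predicted if n_predicted else None
    return share_of_outcomes, share_of_predictions


# Hypothetical period-to-period changes (illustration only).
model_score_changes = [1, 2, 0, -1, 1, 0]
effectiveness_changes = [1, 0, 0, 0, 2, -1]

predicted = [direction(c) for c in model_score_changes]
observed = [direction(c) for c in effectiveness_changes]

for category in ("up", "same", "down"):
    print(category, prediction_fit(predicted, observed, category))
```

With these made-up inputs the "down" category scores 0% on both measures: the one predicted decline does not occur, and the one observed decline is not predicted.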
This poses an important question: what would a valid model for non-malignant
problems look like? AIER took some initial steps toward answering this question,
in part by adding the social-practice perspective and in part by including a wider
range of independent and intervening variables. In Table 12, we present one
model (IRD-QCA4) in which power and decision rules are replaced by three
regime properties—the depth and density of rules, the extent to which rules are
legally binding, and the overall approach to compliance (enforcement ⁄ manage-
ment). The results are encouraging, with a somewhat better overall fit than
obtained for the ERE core model. More specifically, shallow rules are found in
all negative pathways possible, while results for pathways leading to positive out-
comes are mixed. A plausible interpretation—corroborated by ordinal regression
analysis—is that avoiding a very low score on the depth ⁄ density variable is more
important than achieving a top score. The binding rules variable appears in all
pathways where it could appear and a positive value is associated with a positive
outcome, but only for non-malignant problems that are well understood.
Again, results regarding the management-versus-enforcement competition are
inconclusive.
So far, we have not distinguished between variance across regimes and variance
over time within regimes and regime components. But regimes are dynamic insti-
tutions that change continually (Young 2010). The ERE and AIER data sets offer
limited opportunities for time series analysis. But both do record major transi-
tions or ‘‘watersheds.’’ We can use these distinctions to examine in a preliminary
way how well the models we have explored above account for intra-regime
changes in effectiveness scores.
In Table 13, we report results for the adapted ERE core model ‘‘reinforced’’
by the intervening variable that emerged as the most important in each data
set.18 The most striking observation is that this model does fairly well in pre-
dicting intra-regime increases in effectiveness from one period to the next
(with a partial exception for increases in the most demanding standard of
problem-solving) but fails completely in predicting decline. Not only does none
of the predicted instances of decline materialize; the model also misses all
instances of decline that we observe! Interestingly, in the analysis using IRD
data, about two-thirds of the ‘‘errors’’ are overly optimistic predictions; for the
Oslo-Seattle database, 75–80% of the ‘‘errors’’ are on the pessimistic side.
18
This model corresponds to ERE-2 and IRD-2 ⁄ IRD-7 in the logistic regression analysis.
Discussion
How should we interpret all these findings? In response to this question, we see
five observations that are worthy of attention.
First, we have found that AIER offers a somewhat ‘‘brighter’’ characterization
than ERE of regime performance as well as of task environments. Different
samples of cases may account for most of this difference, but coding rules
and practices also play a role.
Second, and more important, our analysis of the findings of the two projects
yields conclusions that are largely similar or at least compatible regarding both
conditions for ‘‘success’’ and causes of ‘‘failure.’’ In multivariate analyses, to be
more precise, the two projects yield basically similar results for similar variables,
notably the two main problem features and at least one of the capacity compo-
nents (decision rules). In both data sets, we also find a considerable amount of
evidence indicating that programmatic activities (such as building a base of con-
sensual knowledge and joint management of functions like monitoring and
assessment) as well as the inclusion of certain regime properties (such as deep
and dense rules) can become important tools for enhancing regime effectiveness
over time (Breitmeier 2008: 87–89, 114–117). As noted above, a ‘‘reinforced’’ ver-
sion of the ERE core model does fairly well using both data sets in accounting
for intra-regime increases in effectiveness scores but fails to account for declines.
Third, we have made progress in resolving what appeared at first to be a major
divergence concerning the role of power. The key to resolving this issue is to
think of the role of power as contingent on the presence or absence of certain
other factors. ERE concluded that the positive impact of pusher power is con-
fined largely to malignant problems and, though less clearly, to effectiveness
defined as behavioral change (Underdal 2002b: 449–451; 464). We find some
support for the former conclusion also in AIER’s results. The bivariate correla-
tion between a power distribution in favor of pushers and regime contribution
to compliance is 0.53 when interest incompatibility is at its highest and −0.14
when interests are largely convergent. The corresponding coefficients for regime
contribution to problem change are 0.09 and −0.51*** respectively.19 Moreover,
our analysis of AIER’s data set indicates that a power distribution in favor of
pushers may enhance compliance but makes little difference regarding effective-
ness defined as contribution to problem change. In another study based on the
Oslo-Seattle database, Underdal (2008: 191) found that the impact of decision
rules seems to depend on the distribution of power. More specifically, moving
from consensus to qualified majority procedures appears to improve regime
effectiveness primarily (perhaps only) where power is skewed in favor of pushers.
A similar but weaker pattern is found in the AIER data set as far as regime con-
tribution to problem change is concerned. Majority voting is not likely to lead to
significant change unless pushers can form a winning coalition. What a less
demanding decision rule can do is to help pushers translate a favorable configu-
ration of power into formal regulatory actions. Our analysis also indicates that
power and decision rules sometimes serve as functional equivalents. Both strin-
gent decision rules and a power distribution in favor of laggards, for example,
can contribute substantially to low effectiveness scores. With a high score on one
of these factors, the other is likely to be redundant or have only a marginal
impact.
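A contingent effect of this kind can be checked by computing the bivariate correlation separately within each subgroup of cases. The sketch below assumes Pearson's r (the text does not say which coefficient was used) and uses hypothetical field names and made-up scores, not AIER or ERE data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either variable is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_group(rows, group_key, x_key, y_key):
    """Compute the x-y correlation separately for each value of group_key."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row)
    return {
        group: pearson_r([r[x_key] for r in members], [r[y_key] for r in members])
        for group, members in groups.items()
    }


# Hypothetical cases (illustration only).
rows = [
    {"incompatibility": "high", "pusher_power": 0, "compliance": 1},
    {"incompatibility": "high", "pusher_power": 1, "compliance": 2},
    {"incompatibility": "high", "pusher_power": 2, "compliance": 3},
    {"incompatibility": "low", "pusher_power": 0, "compliance": 2},
    {"incompatibility": "low", "pusher_power": 1, "compliance": 2},
    {"incompatibility": "low", "pusher_power": 2, "compliance": 1},
]

by_group = correlation_by_group(rows, "incompatibility", "pusher_power", "compliance")
for group, r in by_group.items():
    print(group, round(r, 2))
```

A correlation that changes sign or size across subgroups, as in this made-up example, is the pattern that signals a conditional rather than a uniform effect.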
Fourth, whatever the merits of these specific observations, the basic message
emerging from our analysis is straightforward and clear. In measuring and
explaining variance in the effectiveness of international regimes, we need more
19
However, if we use incentives to defect as an indicator of problem malignancy, the pattern is reversed. ***
indicates p<.001.
20
A core project of the International Human Dimensions Programme on Global Environmental Change, IDGEC
ran from 1998 through 2007. See Young et al. (2008).
21
See the ideas presented in Axelrod (1997).
22
For some initial steps in this direction, see Young (2010).
23
Such configurations may well shift from one stage to another (Stokke 2010).
range of cases available for empirical testing. The development of the ERE data
set and the IRD constitutes an important step forward in this realm. But more is
needed.
What is the best way to move forward? One option would be to update case
studies that have been coded already. Adding new regimes to the existing data
sets is another option. The UNEP Register of Treaties includes a selection of 272
international environmental treaties and related instruments (UNEP 2005). But
the real number of bi- and multilateral environmental agreements is much lar-
ger. To take full advantage of the lessons learned since the AIER and ERE data
sets were created, however, we would have to expand the range of cases in three
directions. First, we would want to study effects at the level of individual (mem-
ber) states. Several other projects operate at this level of analysis. These studies
typically focus on a single regime (for example, the regime addressing Long-
Range Transboundary Air Pollution in Europe) or regime component (for exam-
ple, LRTAP’s 1985 Helsinki Protocol) (Helm and Sprinz 1999; Mitchell 2004;
Ringquist and Kostadinova 2005). Data in this format provide opportunities for
exploring the influence of domestic factors on state behavior as well as the
impact of regimes on the capacity and behavior of member states. Second, useful
new insights may be gained by adding regimes operating in different institu-
tional settings. Thus, coding EU environmental directives and regulations would
produce a data set that could help us assess whether the special character of the
EU as a supra-national and highly legalized political system enhances compliance
and problem-solving. Similarly, we believe that the study of regime effectiveness
could benefit from including transnational governance systems, in part for what
they accomplish on their own and partly for their contributions to the effective-
ness of intergovernmental institutions (Hall and Biersteker 2003; Pattberg 2007;
Delmas and Young 2009). Third, as Simmons demonstrates convincingly, policy
domains differ in ways that may limit the scope of the validity of mainstream
regime theory more than its pioneers anticipated (Simmons 2009). Models
framed in terms of collective-action theory have a fairly strong record in domains
such as environmental governance and international trade regulations, but they
seem much less useful for understanding human rights regimes. Comparing and
contrasting cases from different policy domains may help us to refine, differenti-
ate, and perhaps also integrate models and propositions emerging from various
subfields into more general theories of regime performance.
A strategy for improving our understanding of these complex and dynamic sys-
tems will have to include theoretical as well as methodological components. One
important step could be to couple models of policy diffusion with models of
cooperation. Policies can ‘‘co-evolve’’ through individual learning and adaptation
as well as through joint decisions and deliberate coordination, and these pro-
cesses often interact. Moreover, students of cooperation may learn from students
of diffusion who have combined rationalist and constructivist concepts and mod-
els to analyze ‘‘norm cascades’’ and other nonlinear developments (Finnemore
and Sikkink 1998). AIER took an important first step in that direction by supple-
menting the collective-action paradigm with a social-practice framework. Much
remains to be done, however, before we are able to take full advantage of the
complementary strengths of these and other approaches.
To understand complex systems, we also need to understand how different
mechanisms and factors co-produce outcomes. As we have seen, QCA, even in its
crisp set form, is a useful tool for identifying configurations of factors that lead
to a certain outcome. ‘‘Fuzzy set’’ QCA may prove even more useful. However,
the challenge is not primarily a matter of technical methods. More important
are substantive questions about conditional effects and other types of interplay.24
One lesson to be drawn from the analysis reported in this article is that trying to
specify precisely the conditions under which a particular causal effect obtains
can be a good place to start.
Finally, to study dynamics, we need time series data at the level of individual
actors (for example, member states). AIER and ERE both provide information
relating to regime development. But in both data sets, the format is too crude to
get a good grasp on process dynamics.
24
Interaction between or among distinct regimes is one type of interplay that has recently been studied in
several projects. See, for example, Underdal and Young (2004), Oberthür and Gehring (2006), and Oberthür and
Schram Stokke (2011).
Concluding Remarks
ERE and AIER
have significant limitations that we have sought to acknowledge explicitly in the
course of our work. But, taken together, the two projects offer considerable
encouragement regarding the contributions of quantitative analyses to under-
standing the effectiveness of international regimes. This is especially so when
quantitative analyses, which are particularly useful in developing measures of
association among variables, and qualitative analyses, which can help to explore
the causal mechanisms underlying these relationships, are employed in tandem.
The evidence we have analyzed from the ERE and AIER projects not only provides
strong support for the proposition that regimes matter, but it also allows us
to begin to identify specific determinants of regime effectiveness that operate
either individually or, more often, in combination with one another. Like any
good scientific effort, this work also identifies new questions that call for addi-
tional research. There is no implication here that quantitative studies of regime
effectiveness will displace the more familiar qualitative studies that constitute the
mainstream of regime analysis. Rather, we advocate the use of a mixed strategy,
deploying a toolkit that contains both qualitative and quantitative methods that
can help us to make progress in understanding the factors leading to success in
efforts to solve major problems through the creation of international regimes
(Underdal and Young 2004).
References
Axelrod, Robert. (1997) The Complexity of Cooperation: Agent-Based Models of Competition and Collabora-
tion. Princeton, NJ: Princeton University Press.
Bachrach, Peter, and Morton S. Baratz. (1962) Two Faces of Power. American Political Science
Review 56: 947–952.
Baldwin, David. (1980) Interdependence and Power: A Conceptual Analysis. International Organiza-
tion 34: 471–506.
Breitmeier, Helmut. (2008) The Legitimacy of International Regimes. Farnham, UK: Ashgate.
Breitmeier, Helmut, Marc A. Levy, Oran R. Young, and Michael Zürn. (1996a) The International
Regimes Database as a Tool for Studying International Cooperation. IIASA Working Paper WP-96-160.
Laxenburg, AT: International Institute for Applied Systems Analysis.
Breitmeier, Helmut, Marc A. Levy, Oran R. Young, and Michael Zürn. (1996b) International
Regimes Database (IRD): Data Protocol. IIASA Working Paper WP-96-154. Laxenburg, AT: Interna-
tional Institute for Applied Systems Analysis.
Breitmeier, Helmut, Oran R. Young, and Michael Zürn. (2006) Analyzing International Environ-
mental Regimes: From Case Study to Database. Cambridge, MA: The MIT Press.
Breitmeier, Helmut, Oran R. Young, and Michael Zürn. (2007) The International Regimes Data-
base: Architecture, Key Findings, and Implications for the Study of Environmental Regimes. In
Politik und Umwelt, PVS Sonderheft 39 ⁄ 2007, edited by K. Jacob, F. Biermann, P.-O. Busch, and
P. H. Feindt. Wiesbaden: VS-Verlag, 41–59.
Cash, David W., W. Neil Adger, Fikret Berkes, Po Garden, Louis Lebel, Per Olsson, Lowell
Pritchard, and Oran R. Young. (2006) Scale and Cross-Scale Dynamics: Governance and
Information in a Multilevel World. Ecology and Society 11 (2): art. 8.
Chayes, Abram, and Antonia H. Chayes. (1995) The New Sovereignty: Compliance with International
Regulatory Agreements. Cambridge, MA: Harvard University Press.
Delmas, Magali, and Oran R. Young, Eds. (2009) Governance for the Environment: New Perspectives.
Cambridge: Cambridge University Press.
Downs, George W., David M. Rocke, and Peter N. Barsoom. (1996) Is the Good News About
Compliance Good News About Cooperation? International Organization 50: 379–406.
Efinger, Manfred, Peter Mayer, and Gudrun Schwarzer. (1993) Integrating and Con-
textualizing Hypotheses: Alternative Paths to Better Explanations of Regime Formation. In
Regime Theory and International Relations, edited by V. Rittberger. Oxford: Oxford University
Press.
Finnemore, Martha, and Kathryn Sikkink. (1998) International Norm Dynamics and Political
Change. International Organization 52: 887–917.
Haas, Peter M. (1992) Epistemic Communities and International Policy Coordination. International
Organization 46: 1–35.
Haas, Peter M., Robert O. Keohane, and Marc A. Levy, Eds. (1993) Institutions for the Earth:
Sources of Effective International Environmental Protection. Cambridge, MA: The MIT Press.
Hall, Rodney B., and Thomas J. Biersteker. (2003) The Emergence of Private Authority in the
International System. In The Emergence of Private Authority in Global Governance, edited by R. B.
Hall, and T. J. Biersteker. Cambridge, UK: Cambridge University Press.
Hasenclever, Andreas, Peter Mayer, and Volker Rittberger. (1997) Theories of International
Regimes. Cambridge, UK: Cambridge University Press.
Helm, Carsten, and Detlef Sprinz. (1999) Measuring the Effectiveness of International Environmental
Regimes. PIK Report No. 52. Potsdam: Potsdam Institute for Climate Impact Research.
Hovi, Jon, Detlef F. Sprinz, and Arild Underdal. (2003) The Oslo Potsdam Solution to
Measuring Regime Effectiveness. Global Environmental Politics 3: 74–96.
Keohane, Robert O., and Joseph S. Nye. (1977) Power and Interdependence: World Politics in Transition.
Boston: Little Brown.
Mearsheimer, John J. (1994 ⁄ 1995) The False Promise of International Institutions. International
Security 19: 5–49.
Miles, Edward L., Arild Underdal, Steinar Andresen, Jørgen Wettestad, Jon B. Skjærseth,
and Elaine M. Carlin. (2002) Environmental Regime Effectiveness: Confronting Theory with Evidence.
Cambridge, MA: The MIT Press.
Mitchell, Ronald B. (2004) Quantitative Analysis in International Environmental Politics:
Toward a Theory of Relative Effectiveness. In Regime Consequences: Methodological Challenges and
Research Strategies, edited by A. Underdal, and O. R. Young. Dordrecht: Kluwer Academic
Publishers.
Mitchell, Ronald B. (2008) Evaluating the Performance of International Institutions: What to
Evaluate and How to Evaluate It. In Institutions and Environmental Change: Principal Findings,
Applications, and Research Frontiers, edited by O. R. Young, L. A. King, and H. Schroeder.
Cambridge, MA: The MIT Press, pp. 79–114.
Nye, Joseph S. (2003) The Paradox of American Power: Why the World’s Only Superpower Can’t Go It Alone.
New York: Oxford University Press.
Oberthür, Sebastian, and Thomas Gehring, Eds. (2006) Institutional Interaction: How to Prevent
Conflicts and Enhance Synergies between International and European Environmental Institutions.
Cambridge, MA: The MIT Press.
Oberthür, Sebastian, and Olav Schram Stokke, Eds. (2011) Managing Institutional Complexity:
Regime Interplay and Global Environmental Change. Cambridge, MA: The MIT Press.
Ostrom, Elinor. (1990) Governing the Commons: The Evolution of Institutions for Collective Action.
Cambridge: Cambridge University Press.
Parson, Edward A. (2003) Protecting the Ozone Layer: Science and Strategy. New York: Oxford University
Press.
Pattberg, Philipp. (2007) Private Institutions and Global Governance: The New Politics of Environmental
Sustainability. Cheltenham, UK and Northampton, MA: Edward Elgar Publishing Ltd.
Peters, B. Guy. (2005) Institutional Theory in Political Science: The New Institutionalism. London: Sage
Publications.
Ragin, Charles C. (1987) The Comparative Method. Berkeley: University of California Press.
Ragin, Charles C. (2000) Fuzzy-Set Social Science. Chicago: University of Chicago Press.
Ringquist, Evan J., and Tatiana Kostadinova. (2005) Evaluating the Effectiveness of International
Environmental Agreements: The Case of the 1985 Helsinki Protocol. American Journal of Political
Science 49: 86–102.
Rittberger, Volker, and Michael Zürn. (1990) Towards Regulated Anarchy in East-West
Relations: Causes and Consequences of East-West Regimes. In International Regimes in East-West
Politics, edited by V. Rittberger. London: Pinter.
Simmons, Beth A. (2009) Mobilizing for Human Rights: International Law in Domestic Politics.
Cambridge: Cambridge University Press.
Stokke, Olav S. (2010) A Disaggregate Approach to International Regime Effectiveness: The Case of Barents
Sea Fisheries. Oslo: University of Oslo, Department of Political Science (Dr. Philos. Dissertation).
Strange, Susan. (1983) Cave! Hic Dragones: A Critique of Regime Analysis. In International Regimes,
edited by S.D. Krasner. Ithaca, NY: Cornell University Press.
Tsebelis, George. (2002) Veto Players: How Political Institutions Work. New York: Russell Sage
Foundation.
UN Environment Program. (2005) Register of Treaties and Other Agreements in the Field of the Environ-
ment. Nairobi: UNEP.
Underdal, Arild. (2002a) One Question, Two Answers. In Environmental Regime Effectiveness:
Confronting Theory with Evidence, edited by Edward L. Miles, Arild Underdal, Steinar Andresen,
Jørgen Wettestad, Jon B. Skjærseth, and Elaine M. Carlin. Cambridge, MA:
The MIT Press.
Underdal, Arild. (2002b) Conclusions: Patterns of Regime Effectiveness. In Environmental Regime
Effectiveness: Confronting Theory with Evidence, edited by E.L. Miles. Cambridge, MA: The MIT
Press.
Underdal, Arild. (2008) Determining the Causal Significance of Institutions: Accomplishments
and Challenges. In Institutions and Environmental Change: Principal Findings, Applications, and
Research Frontiers, edited by O.R. Young, L.A. King, and H. Schroeder. Cambridge, MA: The
MIT Press.
Underdal, Arild, and Oran R. Young, Eds. (2004) Regime Consequences: Methodological Challenges and
Research Strategies. Dordrecht, NL: Kluwer Academic Publishers.
Victor, David G., Kal Raustiala, and Eugene B. Skolnikoff, Eds. (1998) The Implementation and
Effectiveness of International Environmental Commitments: Theory and Practice. Cambridge, MA: The
MIT Press.
Young, Oran R. (1991) Political Leadership and Regime Formation: On the Development of
Institutions in International Society. International Organization 45: 281–308.
Young, Oran R., Ed. (1999) The Effectiveness of International Environmental Regimes: Causal Connections
and Behavioral Mechanisms. Cambridge, MA: The MIT Press.
Young, Oran R. (2001) Inferences and Indices: Evaluating the Effectiveness of International
Environmental Regimes. Global Environmental Politics 1: 99–121.
Young, Oran R. (2002a) Are Institutions Intervening Variables or Basic Causal Forces?: Causal
Clusters vs. Causal Chains in International Society. In Millennium Reflections on International Stud-
ies, edited by M. Brecher, and F. Harvey. Ann Arbor: University of Michigan Press.
Young, Oran R. (2002b) The Institutional Dimensions of Environmental Change: Fit, Interplay, and Scale.
Cambridge, MA: The MIT Press.
Young, Oran R. (2008) Building Regimes for Socioecological Systems: Institutional Diagnostics. In
Institutions and Environmental Change: Principal Findings, Applications, and Research Frontiers, edited
by O.R. Young, L.A. King, and H. Schroeder. Cambridge, MA: The MIT Press.
Young, Oran R. (2010) Institutional Dynamics: Emergent Patterns in International Environmental Gover-
nance. Cambridge, MA: The MIT Press.
Young, Oran R., and Gail Osherenko. (1993) Testing Theories of Regime Formation: Findings
from a Large Collaborative Research Project. In Regime Theory and International Relations, edited
by Volker Rittberger. Oxford: Oxford University Press, pp. 223–251.
Young, Oran R., Leslie A. King, and Heike Schroeder, Eds. (2008) Institutions and Environmen-
tal Change: Principal Findings, Applications, and Research Frontiers. Cambridge, MA: The MIT
Press.
Distribution of Power
RF19, variable 102C (POWER_SETTING_SYMMETRY), combined with back-
ground information in Part I. Scores assigned specifically for the purposes of this
analysis.
Compliance
RC5, variable 303A (CONFORMITY ALL_MEMBERS x CONFORMITY_CAUSAL)
Problem Change
RC11, variable 304A (PROBLEM_CHANGE x PROBLEM_CHANGE_CAUSAL)
Institutional Capacity
Var40a (Decision Rule in Use)
+ Var43 (Fast Track Options)
+ Var44 (Role of Secretariat)
+ Var45 (Role of Conference Presidents and Committee Chairs)
Where a transformation produces ‘‘too many’’ values on the new variable
(leading to severe empty-cells problems), we have merged values. For the QCA
analysis, ‘‘intermediate’’ values are left out of ‘‘high’’ and ‘‘low’’ categories.
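The construction of the Institutional Capacity variable can be sketched as below. This is an illustration only: the component scores, the cut points used to merge values, and all function names are hypothetical assumptions, since the actual recoding rules are not given here.

```python
def capacity_index(decision_rule, fast_track, secretariat, chairs):
    """Additive index: sum of the four component scores (Var40a + Var43 + Var44 + Var45)."""
    return decision_rule + fast_track + secretariat + chairs


def merge_values(score, low_max, high_min):
    """Collapse a many-valued index into three categories.

    low_max and high_min are hypothetical cut points chosen for illustration.
    """
    if score <= low_max:
        return "low"
    if score >= high_min:
        return "high"
    return "intermediate"


def crisp_value(category):
    """Crisp-set coding for QCA: True for high, False for low.

    Intermediate cases return None, meaning they are left out of the analysis.
    """
    return {"high": True, "low": False}.get(category)


# Hypothetical component scores for three regimes (illustration only).
regimes = {
    "regime_a": (3, 1, 2, 1),
    "regime_b": (2, 1, 1, 1),
    "regime_c": (1, 0, 1, 0),
}

for name, components in regimes.items():
    score = capacity_index(*components)
    category = merge_values(score, low_max=3, high_min=6)
    print(name, score, category, crisp_value(category))
```

Here regime_b falls in the intermediate band and is therefore excluded from the crisp-set analysis, while it would still be retained in analyses that use the merged ordinal variable.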
25
The data files used for the analysis and details about recoding and transformation of variables will be made
available on a website.