Introduction: evaluation in resource and environmental
planning.
by Gunton, Thomas I.; Rutherford, M.B.; Williams, Peter W.; Day, J.C.
The final question in evaluation is determining the criteria to use
to assess program performance. A common approach for developing
evaluative criteria is to use the explicit goals and objectives for the
program to assess program performance. The major problem with this
approach is that goals and objectives for environmental programs are
often too vague or incomplete to provide a clear standard for assessing
performance (Bellamy et al. 1999, Gunton and Joseph 2006). Even if the
goals and objectives are clear and comprehensive, assessing a program
relative to its goals and objectives assumes that the goals and
objectives adequately reflect the public interest. Programs may have
important unintended consequences relevant to the public interest that
may not be expressed in a stated goal of the program. Using only stated
goals and ignoring unintended consequences would result in a deficient
evaluation. Using explicit goals and objectives also does not indicate
whether the program is the most effective or efficient way of achieving
the objectives because the program is not being assessed relative to
alternative options.
A second approach is to compare performance of the program being
evaluated to similar programs in other jurisdictions by completing a
cross-sectional analysis. For example, an increasingly common way of
evaluating environmental performance of jurisdictions is to compare
environmental indicators such as greenhouse gas emissions per capita by
jurisdiction (Gunton et al. 2005, Esty et al. 2006). The assumption is
that the relative performance of a jurisdiction measures the
effectiveness of the jurisdiction's plans and programs. For
example, lower per capita greenhouse gas emissions relative to other
jurisdictions are taken to indicate the effectiveness of greenhouse gas
emission control strategies. The advantage of this approach is that the data for
comparison are more readily available than other program evaluation data
such as impacts of programs on the state of the environment. However,
this approach does not indicate whether the performance is good or
bad--all jurisdictions may be poor performers--and does not normally
distinguish between outcomes due to the program versus other factors
such as geography and climate.
A third common approach is to construct time series for outcome
indicators to determine if trends are improving or deteriorating. This
approach is used in most environmental monitoring systems, such as the
Canadian government's National Indicator Initiative. The assumption
is that if trends are improving, the existing program is effective. The
problem with this approach is that it does not distinguish between
changes due to the program and those due to other factors (Gunton and
Joseph 2006). Further, trend line analysis does not indicate whether the
performance is good or bad in absolute terms, just whether it is
changing.
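The gap between an improving trend and acceptable performance can be illustrated with a small calculation. The Python below is a minimal sketch only: the pollutant readings, the acceptable level, and the function name are hypothetical and are not taken from any monitoring system cited here.

```python
def trend_slope(years, values):
    """Ordinary least-squares slope of values against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    denominator = sum((x - mean_x) ** 2 for x in years)
    return numerator / denominator


# Hypothetical indicator: yearly pollutant concentration (lower is better).
years = [2000, 2001, 2002, 2003, 2004]
concentration = [30.0, 28.0, 27.0, 25.0, 24.0]
STANDARD = 15.0  # hypothetical acceptable level, an assumption for illustration

slope = trend_slope(years, concentration)
improving = slope < 0                             # the trend is heading the right way
meets_standard = concentration[-1] <= STANDARD    # but the latest reading is still too high
```

In this invented series the trend is improving while the most recent reading remains well above the assumed standard, and nothing in the calculation says whether the program or some other factor caused the decline.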
Another evaluative criterion is best practice standards based on
theory and/or the performance of other jurisdictions. Best practice
standards are commonly used in process evaluations to assess program
management and planning. For example, best practice analysis has been
successfully used to assess the quality of environmental sustainability
planning in various countries by the OECD, the United Nations and
non-governmental researchers (Gunton and Joseph 2006, IISD 2004). The
underlying assumption in these evaluations is that better processes lead
to better outcomes. Although this approach is useful in identifying
strengths and weaknesses of planning systems, it relies on best practice
criteria that are often not empirically verified.
A final criterion for evaluation is a comprehensive benefit-cost
analysis that assesses program benefits relative to costs to determine
if the program is in the public interest. Although benefit-cost
analysis addresses many of the problems with other methods of evaluation,
it also suffers from major weaknesses. The most significant challenge in
benefit-cost analysis is monetizing intangibles such as pollution and
ecological values that are central to resource and environmental planning.
Benefit-cost analysis also requires identifying impacts attributed to the
program. As discussed above, distinguishing between impacts due to the
program and those due to other factors is difficult. Nonetheless,
benefit-cost analysis is a legally required component of environmental evaluation
in many jurisdictions such as the United States. Cost-effectiveness
analysis, which measures outputs per unit of input, is less demanding in
terms of monetizing intangibles, but is difficult to apply to resource
and environmental plans where outputs are difficult to quantify.
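The arithmetic behind these two measures can be sketched briefly. The Python below is an illustration under stated assumptions: the discount rate, the yearly benefit and cost figures, the output quantity, and the function names are all hypothetical and are not drawn from any study cited here.

```python
def present_value(flows, rate):
    """Discount a list of yearly amounts (year 0 first) to present value."""
    return sum(amount / (1 + rate) ** year for year, amount in enumerate(flows))


def benefit_cost_ratio(benefits, costs, rate):
    """Discounted benefits divided by discounted costs; above 1 means benefits exceed costs."""
    return present_value(benefits, rate) / present_value(costs, rate)


def cost_effectiveness(output_units, costs, rate):
    """Physical output per discounted unit of cost; the output itself is not monetized."""
    return output_units / present_value(costs, rate)


# Hypothetical program: costs fall early, monetized benefits arrive later.
benefits = [0.0, 40.0, 60.0, 80.0]
costs = [100.0, 20.0, 20.0, 20.0]
rate = 0.05  # assumed discount rate

net_present_value = present_value(benefits, rate) - present_value(costs, rate)
bcr = benefit_cost_ratio(benefits, costs, rate)
output_per_dollar = cost_effectiveness(500.0, costs, rate)  # e.g. 500 hypothetical tonnes abated
```

The benefit figures are the hard part in practice: each entry in `benefits` presupposes that intangibles have already been monetized and attributed to the program, whereas `cost_effectiveness` needs only a quantified output.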
Common Evaluation Errors
Evaluative Criteria Errors
A common problem in evaluation is using inappropriate evaluative
criteria. A recent evaluation of the effectiveness of environmental
regulations in the U.S. found that the criteria normally used to assess
performance such as number of permits issued, enforcement actions, and
inspections do not indicate the effectiveness of the programs in meeting
environmental objectives (NAPA 2001). Evaluations of the success of land
use planning in British Columbia have used the implementation of
recommendations to increase protected areas as a measure of success (Day
et al. 2003). While this is important, the underlying objectives of
increasing protected areas, such as reducing the number of endangered
species, also need to be assessed. Evaluations of alternative dispute resolution
processes sometimes assess effectiveness by using the single criterion
of whether an agreement was reached, which ignores the relative quality
of the agreement, and excludes other important benefits such as improved
stakeholder relations even when an agreement was not reached (Gunton and
Day 2003). These examples show that care must be taken to develop
evaluative criteria that are comprehensive and reflect the underlying
objectives of the plan. Otherwise the use of inappropriate evaluative
criteria will lead to unjustified conclusions regarding program
performance.
Causation Assumption Errors
Another common error in evaluation is assuming that a correlation
between plan implementation and outcomes reflects a causal link. For
example, evaluation of regional land use planning in British Columbia is
based on monitoring time series trends for key environmental indicators
(Joseph et al. 2007). The assumption is that the trends accurately
assess the impact of the plan. The problem is that there are many
confounding factors that affect environmental trends such as weather
patterns, natural cycles, and human activity that make it difficult if
not impossible to identify impacts of the plan. Also, the impacts of the
plan occur over a long time horizon and may not be detected until many
years later. The challenge is to compare what would have happened in the
absence of the plan with what happened with the plan, holding all other
variables constant over a long enough time horizon to assess impacts
resulting from the plan. Another example of causation assumption error
is a recent evaluation of the effectiveness of mediation processes that
concluded that mediators have little positive impact (Leach et al.
2002). The study compared cases with mediators to cases without
mediators and assumed that all other factors were constant. One apparent
problem is that the cases that opted for independent mediators may have
been more challenging cases. Therefore, differences in outcomes may have
been due to factors other than the presence of a mediator.
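One way to approximate the missing counterfactual is to subtract the change observed in a comparable area without the plan from the change observed in the plan area. The Python below sketches this difference-in-differences calculation. It is not the method used in the studies cited above, the indicator readings are hypothetical, and the estimate is only meaningful if both areas would otherwise have followed the same trend.

```python
def mean(values):
    return sum(values) / len(values)


def difference_in_differences(plan_before, plan_after, comparison_before, comparison_after):
    """Change in the plan area minus the change in a comparison area."""
    plan_change = mean(plan_after) - mean(plan_before)
    comparison_change = mean(comparison_after) - mean(comparison_before)
    return plan_change - comparison_change


# Hypothetical readings of an environmental indicator (higher is better).
plan_before = [50.0, 52.0, 51.0]
plan_after = [58.0, 60.0, 59.0]
comparison_before = [48.0, 49.0, 50.0]
comparison_after = [54.0, 55.0, 56.0]

# Reading the plan area's trend alone credits the plan with the whole change.
naive_trend = mean(plan_after) - mean(plan_before)

# Netting out the comparison area's change removes the shared influences.
did_estimate = difference_in_differences(
    plan_before, plan_after, comparison_before, comparison_after
)
```

In these invented figures most of the improvement also appears in the comparison area, so only the remainder can plausibly be attributed to the plan.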
Selection Bias Error
Many evaluations suffer from using a biased sample of cases for
evaluation. In social policy, participants in a program may be selected
based on attributes that increase the likelihood of success, instead of
being randomly selected. A positive impact on recipients relative to
non-recipients may be due to these other attributes, not to the program.
In planning, evaluation of performance of dispute resolution techniques
such as consensus-based negotiation may appear artificially high because
negotiation tends to be used in cases where a conflict assessment has
indicated the likelihood of success. Selection bias therefore can
significantly skew results.
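The size of this distortion can be shown with a constructed example. The Python below is a hypothetical sketch: the "tractability" scores, the screening threshold, and the assumption that negotiation has no effect of its own are all invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical disputes, each with a chance of resolution ("tractability")
# from 0% to 99%. By construction the technique adds nothing to that chance.
tractability_pct = list(range(100))

# Assumed screening rule: a conflict assessment routes only the more
# tractable disputes (60% or better) into negotiation.
THRESHOLD_PCT = 60
negotiated = [p / 100 for p in tractability_pct if p >= THRESHOLD_PCT]
not_negotiated = [p / 100 for p in tractability_pct if p < THRESHOLD_PCT]

# Expected success rates of the two groups, and the gap between them.
apparent_effect = mean(negotiated) - mean(not_negotiated)
true_effect = 0.0  # the technique was given no effect in this construction
```

The negotiated group shows a much higher expected success rate even though the technique contributes nothing here; the entire gap comes from how cases were selected.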
Content Scope Errors
As discussed earlier, evaluation can focus on several different
dimensions of plan performance ranging from implementation effectiveness
to outcome efficiency. A common error is to complete an evaluation of
only a few of the dimensions of the plan and then draw conclusions on
the plan's effectiveness based on the limited assessment. For example, it
is common to evaluate plans by assessing whether the recommendations are
implemented. While assessing implementation of recommendations is a
necessary component of evaluation, it is not sufficient. The question of
whether the implemented recommendations are meeting plan objectives
efficiently is also critical to the evaluation. For example, evaluations
of acid rain reduction strategies have extolled the success of policies
in reducing emissions but have not adequately assessed whether the
acidity levels of the environment have returned to acceptable levels or
whether the reductions are being achieved in the most cost-effective
manner (OECD 2004).
Timing Scope Errors
Another common error in evaluation is undertaking the evaluation
only once. Impacts of plans and programs occur over many years and
premature evaluation or single point evaluations can miss many of the
impacts.
Feasible Options Error
COPYRIGHT 2006 Wilfrid Laurier University. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.