Assessing global computable general equilibrium model
validity using agricultural price volatility.
by Valenzuela, Ernesto; Hertel, Thomas W.; Keeney, Roman; Reimer,
Jeffrey J.
Despite their widespread use in policy analysis, computable general
equilibrium (CGE) models are sometimes criticized for having uncertain
empirical foundations and for being insufficiently validated (Jorgenson
1984; Kehoe, Polo, and Sancho 1995). The problem of endowing large CGE
models with numerical parameter values is formidable, and numerous
choices also have to be made about model structure. In many cases the
trustworthiness of a model may be based largely on the assertions of the
modeler. As CGE models become more widely used, it is essential to have
a formal means of assessing their empirical validity.
This article presents a methodology for validating CGE models on a
sector-by-sector basis. The approach developed here can help gauge
the accuracy of a model's results, enable comparison with
competing CGE models, and--most importantly--inform the
development of improved specifications. Emphasis is placed on techniques
for validating and improving models as opposed to arguing for a
particular CGE model.
The validation approach is inspired by Kydland and Prescott's
widely received work on dynamic competitive-equilibrium growth
modeling. In their 1982 article, they develop a methodology for
model calibration that involves mapping out a model's responses to
historical technological shocks and then comparing them to the variance
of national output. Hertel, Reimer, and Valenzuela (2005) show how this
can help in the calibration of a commodity stockholding model for a
static, short-run global CGE framework.
Our approach also relates to earlier work by Tyers and Anderson
(1992) and Vanzetti (1998), who model uncertainty in world food markets
by sampling from a distribution of random supply shocks. Like them, we
focus on agricultural commodities since their weather-induced supply
variation translates into a series of natural historical experiments. We
incorporate this variation into a CGE model as technology shocks at the
individual sector level. The model can then be validated against the
observed variance of national commodity prices.
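The logic of this validation exercise can be illustrated with a minimal sketch. It is not the GTAP model: it assumes a single market in which supply is fixed within the year, so that a percentage supply shock translates into a percentage price change through a constant demand elasticity. The function names, the elasticity, and the volatility figures are all hypothetical placeholders chosen only for illustration.

```python
import random
import statistics

def simulate_price_changes(shock_sd_pct, demand_elasticity, n_draws=10000, seed=0):
    """Draw random supply shocks (percent) and return implied percent price changes.

    Toy assumption: the market clears on demand alone, so
    price change = supply shock / demand elasticity (elasticity is negative).
    """
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, shock_sd_pct) for _ in range(n_draws)]
    return [shock / demand_elasticity for shock in shocks]

def volatility_ratio(simulated_changes, observed_sd_pct):
    """Simulated over observed standard deviation of price changes."""
    return statistics.stdev(simulated_changes) / observed_sd_pct

# Hypothetical inputs, not estimates from the article.
changes = simulate_price_changes(shock_sd_pct=8.0, demand_elasticity=-0.4)
ratio = volatility_ratio(changes, observed_sd_pct=15.0)
print(f"simulated sd = {statistics.stdev(changes):.1f}%, ratio = {ratio:.2f}")
```

In this sketch a ratio above one would indicate that the toy model overstates price volatility relative to the observed figure, and a ratio below one that it understates it.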
Validating the model against agricultural commodity price changes
also coincides with the current focus of many global CGE modeling
efforts. A key question is the potential impact of rich-country
agricultural support and protection policies on incomes and poverty in
developing countries. Agricultural policy impacts are transmitted to
developing countries through world markets--specifically, through
commodity price changes. It follows that a model's ability to
replicate observed price changes should be of central concern to
validation efforts. To keep the analysis as clear as possible, we
focus on a single commodity--wheat.
The CGE model that we seek to validate is the GTAP (Global Trade
Analysis Project) model (Hertel 1997). This model is widely used by
international agencies and governments to evaluate trade policy
scenarios, and thus is a good candidate for validation. In comparing
actual versus simulated price variation, we find that this model
performs quite well for some countries. However, our most interesting
findings relate to the pattern by which the model fails to replicate
observed behavior in other markets. It tends to overstate price
volatility in the major net importing markets, while understating price
volatility in major exporting regions.
This is a striking result that arises from the tendency for
countries to insulate domestic markets from world prices. The standard
GTAP model assumes perfect price transmission and thus overlooks the
ensemble of policies and institutions that often serve to stabilize
domestic markets and destabilize world markets. Examples include
policies such as variable import levies and institutions such as
state-trading enterprises and commodity agreements.
To account for the incomplete transmission of world prices, we
modify the standard GTAP model to introduce active market insulation by
importers. In particular, we estimate and incorporate price transmission
elasticities into the model (Bredahl, Meyers, and Collins 1979). Once
this modification is undertaken, the model is again evaluated relative
to the same metric--predicted versus observed price volatility. The
richer formulation improves model performance but also suggests that a truly
satisfactory reconciliation of observed and predicted outcomes can only
come through explicit modeling of the key policies in individual
markets. The validation method developed here provides a meaningful way
of documenting how such modifications would improve model performance.
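The idea of a price transmission elasticity can be illustrated with a short sketch. The estimation procedure used for the model is not described in this section, so the code below simply assumes the elasticity is measured as the slope from regressing domestic log price changes on world log price changes. The price series and function names are hypothetical and constructed only to show the two polar cases.

```python
import math

def log_changes(prices):
    """Year-on-year log changes of a price series."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]

def transmission_elasticity(world_prices, domestic_prices):
    """OLS slope of domestic log price changes on world log price changes."""
    x = log_changes(world_prices)
    y = log_changes(domestic_prices)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x

# Hypothetical world price index (not real data).
world = [100.0, 120.0, 90.0, 110.0, 105.0]

# Full transmission: a constant ad valorem wedge leaves percent changes intact.
open_market = [1.2 * p for p in world]

# Insulated market: built so that only half of each percent change passes through.
insulated_market = [10.0 * p ** 0.5 for p in world]

print(round(transmission_elasticity(world, open_market), 3))
print(round(transmission_elasticity(world, insulated_market), 3))
```

By construction, the first series yields an elasticity of one (perfect transmission, as in the standard model), while the second yields 0.5, so domestic prices move only half as much as world prices in percentage terms.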
The remainder of the article is organized as follows. The next
section reviews the practice of model validation and its application to
large-scale CGE models. The third section describes the main
characteristics of the model being tested, and outlines the methodology
employed in the validation exercise, namely, the use of stochastic
simulations focusing on annual variability in supply. The following
section presents the results, which center on a comparison of predicted
and observed price volatility. Finally, the article introduces a simple
approach to incorporating incomplete price transmission between border
and domestic prices, as implied by historical evidence.
Background on Model Validation
Gass (1983) provides the starting point for discussion of the
validation of simulation models. He stresses the need for credibility in
policy-related simulations, but suggests that such models can never be
truly validated. However, by subjecting a simulation model to
invalidation tests we can become more confident that the model is not
invalid, thereby improving its credibility.
Gass argues that the central concern of policy models should be
replicative validity, as opposed, for example, to a singular focus on a
model's underlying theoretical assumptions. Replicative validity
essentially means that a model's simulated outcomes match
historical outcomes over some appropriately chosen period of time. This
process facilitates: (a) understanding of the model by potential users,
(b) exposition of its strengths and weaknesses, (c) an assessment of the
model's limitations in a predictive capacity, and (d) information
on the proper level of confidence to attach to results. McCarl (1984)
adds that validation can point the way for adaptations that produce
better predictions in an area where a model was previously limited.
While the operations research literature continues to devote
considerable attention to the validation of simulation models (reviewed
in Kleijnen 1999), there are few cases of CGE models being tested
against the historical record. Kehoe, Polo, and Sancho (1995) offer one
exception. They validate a CGE model of the Spanish economy in terms of
its predictions of the impacts of tax reform, by attempting to control
their single-region CGE model for behavior it could not be expected to
reproduce (e.g., the impact of a drought in the base year). Their
experiment deals with shocks to a single, national economy, making the
process of isolating events, and exogenously introducing their impacts
into the model, considerably more straightforward than for a global
model.
We rarely have the kind of natural experiment that is needed to
validate a large-scale partial or general equilibrium global model. For
instance, in the case of multilateral trade liberalization, the policy
changes are usually very modest, and are phased in over a long period of
time--particularly when compared to the other short-term factors
perturbing the world economy, such as wars, currency crises, and trade
embargoes.
Gehlhar (1997) encounters such difficulties when validating a
global trade model using policy shocks. He uses a backcasting simulation
to evaluate the validity of GTAP model results versus observed outcomes
concerning East Asian economic growth in the 1980s. He finds that the
model performs adequately with respect to the direction of change in
trade shares, but is otherwise weak in terms of predictive power. He
then alters the model, separating labor inputs into skilled and
unskilled components, and increases the trade elasticities by 20% from
their base values. These alterations significantly improve the
validation results in the particular case of East Asian growth.
Fox (2004) follows Kehoe, Polo, and Sancho's lead in
developing summary goodness-of-fit measures to assess the North American
Free Trade Agreement predictions of Brown, Deardorff, and Stern (1992),
using the Michigan Model of Production and Trade. After implementing shocks
to capital and labor endowments and allowing for international capital
mobility, he finds that the model does a good job of capturing the
qualitative pattern of trade changes. However, it fails to simulate the
large magnitude of trade changes in certain sectors. He suggests that
this may be due to the low magnitude of the elasticities used in the
model, and the Constant Elasticity of Substitution representation of
trade.
Liu, Arndt, and Hertel (2004) formalize the approach of Gehlhar
(1997) by developing an approximate likelihood function to assess the
quality of model performance over the (backcasting) period of 1986-92.
They use this framework to test the widely maintained hypothesis known
as the "rule of two," whereby the import/import substitution
elasticities are twice as large as the import/domestic elasticities for
comparable goods.
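Stated compactly, and using symbols that are assumed here rather than taken from the article, let \sigma_D denote the elasticity of substitution between imports and domestic goods and \sigma_M the elasticity of substitution among imports from different sources. The "rule of two" is then the restriction:

```latex
\sigma_M = 2\,\sigma_D
```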
COPYRIGHT 2007 American Agricultural Economics Association. Reproduced
with permission of the copyright holder. Further reproduction or
distribution is prohibited without permission.