ABSTRACT. This paper investigates the forecasting accuracy of four
different hedonic approaches when vacant urban land prices are
predicted in local markets. The investigated hedonic approaches are: 1)
ordinary least squares estimation, 2) robust MM-estimation, 3)
structural time series estimation and 4) robust local regression.
Post-sample predictive testing indicated that more accurate predictions
are obtained if the unorthodox methods of this paper are used instead of
conventional least squares estimation. In particular, predictive
unbiasedness can be improved significantly by using the unconventional
hedonic methods of the study. The paper also studied the structure of
urban land prices. The most important attribute variables in explaining
land prices were permitted building volume, house price index, northing
and easting. The influence of the parcel size variable and of the
different indicator variables on land prices was much weaker.
KEYWORDS: Land price; Hedonic model; Prediction; Robustness;
Flexibility
SUMMARY
The paper examines how accurately four different hedonic methods
forecast the prices of vacant land parcels in local urban markets. The
hedonic methods examined are: 1) the least squares method, 2)
MM-estimation, 3) structural time series estimation and 4) local
regression analysis. Post-sample predictive testing showed that more
accurate forecasts are obtained with the unconventional methods
described in this paper than with the usual least squares method.
Applying the unconventional hedonic methods of the study can
considerably improve the unbiasedness of the forecasts. The paper also
examined the structure of urban land prices. Among the attribute
variables, the most important in explaining land prices were permitted
building size, the house price index and parcel location. The parcel
size variable and the various indicator variables had a much weaker
influence on land prices.
1. INTRODUCTION
Hedonic methods are often advocated in complex land valuation
assignments in order to minimise the systematic valuation error
objectively and to produce, validly and reliably, the quality
adjustments that the differentiated nature of separate land parcels
makes necessary. However, the use of hedonic models is plagued by some
fundamental problems that pose serious threats to their empirical
adequacy. These fundamental dilemmas are: (1) the temporal
variability of land prices, (2) the spatial variability of land prices,
(3) the model specification dilemma and (4) outlying and influential
observations.
When investigating the temporal dimension of land prices, it is
important to understand that the behaviour of land prices is generally
nonstationary. This is a typical characteristic of many economic time
series, and it means that the data-generating process that produces the
observables is itself transient in time. The effect of time is also
multidimensional: often we can legitimately separate the price trend,
the price cycle, seasonal variation and random variation from each
other. Traditionally, when modelling temporal land price movements, the
effect of time has been reduced to the variation of a cost-of-living
index or a house price index, which is then used as an explanatory
variable in a hedonic regression. The indicator variable technique
(i.e. the use of yearly time dummy variables) has also been a very
popular approach when analysing the temporal dimension of land prices.
These approaches are problematic mainly because, in practice, they
estimate the influence of time only rather inaccurately. Structural
time series models, on the other hand, usually provide a more accurate
description of temporal movements.
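
To illustrate the idea of a structural time series model, the following
minimal Python sketch filters a price series with the local level model,
the simplest member of that model class. It is not the specification
used in this paper: the function name, the variance values and the
price series are hypothetical and chosen only for illustration.

```python
def local_level_filter(y, obs_var, level_var, init_var=1e6):
    """Kalman filter for the local level model (illustrative sketch):
        y[t]  = mu[t] + eps[t],    eps ~ N(0, obs_var)
        mu[t] = mu[t-1] + eta[t],  eta ~ N(0, level_var)
    Returns the filtered estimates of the unobserved level mu[t]."""
    level = y[0]      # diffuse-style start: first observation, huge variance
    var = init_var
    filtered = []
    for obs in y:
        # Prediction: the level follows a random walk, so uncertainty grows.
        var_pred = var + level_var
        # Update: weigh the new observation by the Kalman gain.
        gain = var_pred / (var_pred + obs_var)
        level = level + gain * (obs - level)
        var = (1.0 - gain) * var_pred
        filtered.append(level)
    return filtered


# Hypothetical log price series: flat for ten periods, then rising steadily.
log_prices = [4.0] * 10 + [4.0 + 0.05 * k for k in range(1, 11)]

# Assumed variances; in practice they would be estimated from the data.
levels = local_level_filter(log_prices, obs_var=0.01, level_var=0.01)
```

Unlike a yearly dummy variable, the filtered level is updated at every
observation, so it follows the rising part of the series with only a
short lag. A larger ratio of `level_var` to `obs_var` makes the level
react faster to new observations.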
The spatial variation of land prices can be divided into spatial
heterogeneity and spatial dependence. Spatial heterogeneity implies that
functional forms and parameters vary with location and are not
homogeneous throughout the data set, whereas spatial dependence implies
that the variation is a function of distance. The spatial dependence
problem can usually be solved by including location or some distance
variables in a hedonic regression as explanatory variables. The
spatial heterogeneity problem is usually harder: one natural
solution would be to narrow the analysis to reasonably small
submarkets, which homogenises the data. In practice, however, this
is typically not feasible because too few observations remain
for hedonic modelling. Adaptive modelling techniques, such
as local regression, usually provide a better solution because they
adapt to location and thus address spatial heterogeneity explicitly.
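
The spatial adaptation property can be sketched in one dimension. The
Python example below compares a global least squares line with a
locally weighted linear fit (tricube weights, nearest-neighbour
bandwidth). It is a minimal sketch, not the robust local regression
estimated in this paper; the function names, the span value and the
price gradient data are hypothetical.

```python
def ols_predict(x, y, x0):
    """Global least squares line fitted to all observations, evaluated at x0."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return my + (sxy / sxx) * (x0 - mx)


def local_linear_predict(x, y, x0, span=0.3):
    """Locally weighted linear fit at x0.
    span is the share of observations that forms the neighbourhood."""
    k = max(2, int(round(span * len(x))))
    dists = sorted(abs(xi - x0) for xi in x)
    h = dists[k - 1]          # bandwidth: distance to the k-th nearest point
    if h == 0.0:
        h = 1e-12
    # Tricube weights: close observations count most, distant ones not at all.
    w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Hypothetical price gradient: prices fall with distance from the centre
# up to 5 km and are flat beyond that, so the slope varies with location.
distance_km = [0.5 * i for i in range(21)]
price = [100.0 - 10.0 * d if d <= 5.0 else 50.0 for d in distance_km]

global_at_edge = ols_predict(distance_km, price, 10.0)
local_at_edge = local_linear_predict(distance_km, price, 10.0)
```

In this constructed example the local fit recovers the flat price level
at the urban fringe, whereas the global line extrapolates the inner-city
slope and underpredicts there. The span controls how local the fit is
and would normally be chosen from the data.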
The model specification dilemma can be solved in three different
ways: (1) parametrically, (2) semiparametrically and (3)
nonparametrically. Parametric modelling is the classical approach in the
hedonic modelling of land prices; it is theory-laden because
pre-specified functional forms are used in the analysis. Nonparametric
techniques, on the other hand, are data-driven, very flexible tools, and
semiparametric techniques combine features of the parametric and
nonparametric approaches. The exact research problem determines which
approach should be used. Generally, nonparametric methods are useful
when associations between variables are complex (i.e. highly nonlinear)
and theoretically unknown. Parametric models apply well to a less
complex setting where there is valid prior knowledge about
the model's functional form. Irrespective of the chosen approach, the
model specification dilemma involves the choice of the hedonic
model's functional form, the selection of relevant study variables
and an error distribution assumption. It should also be noted that the
result depends on the chosen scale, which is, however, often implicit.
Parametric models, which represent the data modelling culture (Breiman,
2001), have formed the conventional dogma of hedonic pricing methods in
land price studies, where prespecified global models are estimated by
means of ordinary least squares or some modification thereof. The
benefits of parametric approaches undeniably include simplicity,
interpretability, parsimony and a comprehensive statistical theory. The
fundamental obstacle to the general use of parametric
models, however, is their inflexibility, i.e. their inability to learn
the genuine structure of the hedonic relationship from the evidence in
decision-making settings where theoretically unknown nonlinearity is
expected. This is the typical case when the effects of variables
representing location and time are considered (McMillen and Thorsnes,
2003). The conventional result is that even the best parametric model
tends to impose restrictions that substantially reduce the explanatory
and predictive power of the hedonic equation (Pace, 1993 and 1995;
Anglin and Gencay, 1996; inter alia). Unless the theory-laden
parametric model coincides with the data-generating process, profound
misspecification errors may result, posing serious threats to the
model's empirical validity.
Semiparametric and nonparametric approaches are representative of the
algorithmic modelling culture (Breiman, 2001), which emphasises
learning the complex structure from the available facts and adapting
to the features underlying the data. Semiparametric estimators are, more
precisely, an intermediate strategy between theory-laden and data-driven
estimators. Their learning ability is restricted, i.e. semiparametric
estimators can approximate functions only within some prespecified
classes. Their practical relevance lies mainly in balancing the dual
goals of low specification error and high efficiency (Pace, 1995; Anglin
and Gencay, 1996) and in enhancing the interpretability of results.
Nonparametric estimators are by their nature highly flexible and thus
capable of approximating very general classes of functions (e.g. smooth
functions, square integrable functions). They do not require any
restrictive, unwarranted prespecification of the functional form of the
mean response function (nor any specific error distribution assumption).
This makes nonparametric estimators powerful data-driven tools,
albeit highly sensitive to undersmoothing (overfitting)
if local estimation is implemented carelessly.
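
The sensitivity to the degree of smoothing can be illustrated with a
small Python sketch that selects the bandwidth of a kernel regression
by leave-one-out cross-validation. The estimator (Nadaraya-Watson with
a Gaussian kernel), the candidate bandwidths and the artificial data
are assumptions made for this illustration only; they are not taken
from the paper.

```python
import math


def nw_predict(x, y, x0, h, skip=None):
    """Nadaraya-Watson kernel estimate at x0 with Gaussian bandwidth h.
    If skip is an index, that observation is left out of the fit."""
    num = 0.0
    den = 0.0
    for j, (xj, yj) in enumerate(zip(x, y)):
        if j == skip:
            continue
        w = math.exp(-0.5 * ((xj - x0) / h) ** 2)
        num += w * yj
        den += w
    return num / den


def loo_cv_score(x, y, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    errors = [(y[i] - nw_predict(x, y, x[i], h, skip=i)) ** 2
              for i in range(len(x))]
    return sum(errors) / len(errors)


# Artificial data: a smooth curve plus a deterministic alternating
# perturbation that stands in for observation noise.
x = [0.1 * i for i in range(61)]
y = [math.sin(xi) + 0.3 * (-1) ** i for i, xi in enumerate(x)]

# Hypothetical candidates: very small (undersmoothing, chases the noise),
# moderate, and very large (oversmoothing, flattens the curve).
candidates = [0.05, 0.3, 3.0]
scores = {h: loo_cv_score(x, y, h) for h in candidates}
best_h = min(scores, key=scores.get)
```

Because each observation is predicted from the others, a bandwidth that
merely reproduces the noise is penalised, as is one that smooths away
the underlying curve. In this example the moderate bandwidth has the
lowest cross-validation score.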
Outlying and influential observations are very common in land
value studies. They may be genuine, faultless values generated under
atypical conditions, or they may contain errors of different kinds
(such as recording and measurement errors, or observations drawn from
the wrong population). Traditional hedonic modelling techniques,
especially ordinary least squares, are sensitive to outlying
observations; even a single outlier can drastically change the results
and mislead the inferences. In fact, a single sufficiently deviating
data point can cause the least squares estimator to break down and
generate results that are utterly unreliable and uninformative. Robust
methods such as MM-estimation, by contrast, are not sensitive to
outliers or influential observations and can therefore tolerate a
certain amount of bad observations without the estimator breaking down
and producing completely useless results.
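
The breakdown of least squares can be shown with a few lines of Python.
Because MM-estimation is too involved for a short example, the sketch
below uses the Theil-Sen estimator (the median of all pairwise slopes)
as a simple stand-in for a robust method; it is not the MM-estimator
used in this paper. The data are hypothetical, with one grossly
misrecorded price.

```python
from statistics import median


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def theil_sen_slope(x, y):
    """Theil-Sen slope: the median of the slopes through all pairs of points."""
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n)
              if x[j] != x[i]]
    return median(slopes)


# Hypothetical data: price rises by exactly 2 per unit of building volume.
volume = [float(v) for v in range(1, 11)]
price = [2.0 * v for v in volume]

# A single recording error: the last price is entered as 200 instead of 20.
price_bad = price[:-1] + [200.0]

ols_clean = ols_slope(volume, price)          # 2.0 on the clean data
ols_bad = ols_slope(volume, price_bad)        # pulled far above 2
robust_bad = theil_sen_slope(volume, price_bad)  # remains 2.0
```

Only 9 of the 45 pairwise slopes involve the faulty observation, so the
median is unaffected, while the least squares slope is dominated by the
single outlier.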
2. THE RESEARCH PROBLEM
COPYRIGHT 2008 Vilnius Gediminas Technical University. Reproduced with
permission of the copyright holder. Further reproduction or distribution
is prohibited without permission.