Measurement error in recall surveys and the
relationship between household size and food demand.
by Gibson, John; Kim, Bonggeun
Several indicators suggest a more diligent interviewer performance,
with greater probing in Round 2 of the survey. The share of households
requiring re-interviews, due to incomplete and/or inconsistent
questionnaires, fell from 40% in Round 1 to 28% in Round 2 (table 4).
The average proportion of households reporting zero expenditure on an
item fell from 48% to 43%, while the proportion reporting zero
own-production also fell. (16) While these falls could be due to
seasonality, the zero response rates would normally go in opposite
directions for purchases and own-production, as producer-households
exhaust their stocks and switch to market purchases. Thus, it seems
plausible that the data in Round 2 of the CSES reflect a more probing
interview style, so variation across the survey rounds in the estimated
food Engel curve may indicate something about measurement error effects
in food demand models.
Results
The results of estimating equations (17) and (18) are reported in
table 5 for Indonesia and table 6 for Cambodia. The regression model in
each case is based on the specification used by Deaton and Paxson for
Thailand, which is the closest country in their sample to the countries
studied here. In addition to (log) PCE and (log) household size, the
variables include eleven demographic ratios, the fraction of adults in
each household working in agricultural employment, agricultural
self-employment, and non-agricultural employment, and dummy variables
for farm households and for each province (sector rather than province
in Cambodia).
The two equations are estimated by OLS and Instrumental Variables
(IV), which are two of the four estimation methods used by Deaton and
Paxson. The justification for using IV is that random measurement errors
in ln(x/n) might bias the [gamma] coefficient because of the
correlation between ln(x/n) and ln n. The instrument used by Deaton and
Paxson is household income, excluding imputed items that are common with
expenditures. This variable is not available for the annual SUSENAS
survey, so wage and salary income is used instead. Only 60% of the urban
sample has wage and salary earnings, so the OLS equation is run
twice--once on all households in the sample for urban Java and once on
just those with earnings. While the point estimates change between these
two samples, the pattern of results is qualitatively the same.
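
The argument for IV can be illustrated with a small simulation. The sketch below is not the authors' estimation code: the parameter values, the variable names, and the data-generating process are all hypothetical, and the food share is assumed to be measured without error so that only the random error in ln(x/n) is at work. It compares OLS with a just-identified IV estimator that uses log income as the instrument for observed log PCE.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 20_000

# Hypothetical "true" parameters for the simulated Engel curve.
BETA_TRUE, GAMMA_TRUE = -0.10, -0.005

ln_n = rng.normal(1.5, 0.5, n_obs)  # log household size
# True log PCE is negatively correlated with household size.
ln_pce_true = 8.0 - 0.5 * (ln_n - 1.5) + rng.normal(0, 0.6, n_obs)
w_f = (1.4 + BETA_TRUE * ln_pce_true + GAMMA_TRUE * ln_n
       + rng.normal(0, 0.05, n_obs))

# Observed log PCE carries random (classical) measurement error.
ln_pce_obs = ln_pce_true + rng.normal(0, 0.4, n_obs)
# Instrument: log income, related to true PCE but not to the reporting error.
ln_income = ln_pce_true + rng.normal(0, 0.5, n_obs)

const = np.ones(n_obs)
X = np.column_stack([const, ln_pce_obs, ln_n])  # regressors
Z = np.column_stack([const, ln_income, ln_n])   # instruments (just identified)

ols = np.linalg.lstsq(X, w_f, rcond=None)[0]    # OLS coefficients
iv = np.linalg.solve(Z.T @ X, Z.T @ w_f)        # IV: (Z'X)^{-1} Z'y

print("OLS beta, gamma:", ols[1], ols[2])
print("IV  beta, gamma:", iv[1], iv[2])
```

In this setup the OLS coefficient on ln(x/n) is attenuated toward zero, and because ln(x/n) and ln n are correlated, the OLS coefficient on ln n is also pushed away from its true value. The IV estimates should lie close to the assumed true values.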
Indonesia
Questionnaire design has a significant effect on the estimated
relationship between household size and food demand. When the more
detailed Module questionnaire is used to measure expenditures, the
negative effect of household size on the food budget share (at constant
PCE) is significantly smaller for all samples and all estimators in
table 5. This method effect is shown by the coefficient on the
interacted dummy variable term, [ln n x D] being positive (ranging from
0.010 to 0.016) and statistically significant in all columns. In other
words, when the questionnaire uses a longer list of foods for collecting
recalled expenditures, the negative effect of household size on the food
budget share is less apparent.
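
The pooled specification implied by this description can be written out as follows. This is a sketch rather than a reproduction of equation (17), which is not shown in this section. The labels \alpha, \beta, \delta and the error term u are assumed, and D = 1 for households answering the Module questionnaire is inferred from the sign of the interaction coefficient discussed above.

```latex
w_f = \alpha + \beta \ln(x/n) + \gamma \ln n
      + \delta \, [\ln n \times D] + \text{controls} + u,
\qquad
\frac{\partial w_f}{\partial \ln n} =
\begin{cases}
  \gamma          & \text{Core } (D = 0), \\
  \gamma + \delta & \text{Module } (D = 1).
\end{cases}
```

Under this reading, a positive \delta combined with a negative \gamma means the household-size effect is closer to zero in the Module sample.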
Similarly, the difference between the Core and Module samples in
the estimated elasticity of per capita food demand with respect to
household size, [gamma]/[[bar.w].sub.f], is statistically significant in
most cases. The other apparent questionnaire effect is that the Engel
estimates of economies of scale are about one quarter larger when
household expenditures are measured with the shorter questionnaire, and
this difference is always statistically significant.
Comparing the columns on the right of table 5 with those on the
left suggests that using the more general model (equation (18)) makes
little difference to the results. Thus, even when all coefficients are
allowed to vary between the Core and Module samples, it is usually only
the interaction between household size and the dummy variable for the
method effect that attracts a significant coefficient. This result is
consistent with the pattern that would be expected if expenditure
reporting errors are correlated with household size. Similarly, the use
of IV estimation does not alter the basic pattern. Even though the IV
estimates of equation (18) are significantly different from the OLS
estimates, the gap between the short questionnaire and long
questionnaire estimates of the Engel elasticity of household scale is
almost identical to the gap with the OLS estimates. (17) Specifically,
according to the IV estimates, [sigma] = 0.51 with the short recall and
[sigma] = 0.42 with the long recall, giving a gap of 0.09, which is close
to the gap of 0.11 when OLS is used to estimate equation (18).
The results are largely similar in the rural sector, where only OLS
estimates are reported because the small fraction of households with
wage income limits the use of IV (with standard errors in parentheses):
$$
w_f = \underset{(0.004)}{-0.100}\,\ln(x/n)
      \;\underset{(0.003)}{-\,0.019}\,\ln n
      \;+\; \underset{(0.003)}{0.014}\,[\ln n \times D]
      + \text{controls},
\qquad R^2 = 0.17,\quad N = 31{,}023.
$$
The elasticity of per capita food demand with respect to household
size in the Core questionnaire is -0.028, but with the longer Module it
is a less negative (and hence less puzzling) -0.008. Similarly, the
Engel estimate of economies of scale is 0.19 with the Core but only 0.05
with the more detailed Module. Both of these differences in the
coefficients between the subsamples are statistically significant.
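
The two Engel estimates quoted for the rural sector can be recovered from the reported coefficients. The sketch below assumes that the Engel estimate of economies of scale is the ratio of the coefficient on ln n to the coefficient on ln(x/n), and that D = 1 denotes the Module questionnaire. Neither definition is restated in this section, but together they reproduce the published values. The variable names are illustrative only.

```python
# Rural coefficients as reported in the equation above.
beta = -0.100        # coefficient on ln(x/n)
gamma_core = -0.019  # coefficient on ln n (Core, D = 0)
delta = 0.014        # coefficient on [ln n x D]

# Household-size effect in the Module sample (assumes D = 1 for Module).
gamma_module = gamma_core + delta

# Assumed Engel-method ratio: sigma = gamma / beta.
sigma_core = gamma_core / beta
sigma_module = gamma_module / beta

print(f"{sigma_core:.2f} {sigma_module:.2f}")  # prints "0.19 0.05"
```

The elasticity of per capita food demand with respect to household size would further require the mean food share in each subsample, which is not reported in this section, so it is not computed here.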
However, in contrast to the urban sector, the results do change
when the more general model (equation (18)) is used. The interaction
between questionnaire type and log per capita expenditures is
statistically significant, while the interaction with household size is
not:
$$
w_f = \underset{(0.005)}{-0.094}\,\ln(x/n)
      \;\underset{(0.008)}{-\,0.018}\,[\ln(x/n) \times D]
      \;\underset{(0.003)}{-\,0.016}\,\ln n
      \;+\; \underset{(0.005)}{0.004}\,[\ln n \times D]
      + \text{controls},
\qquad R^2 = 0.17.
$$
The apparent correlation between the questionnaire effect and per
capita expenditures is consistent with measurement error that is
correlated with the true value of expenditures, as shown in the Monte
Carlo experiments. However, the zero restrictions on the interaction
terms needed to nest equation (17) within equation (18) are not rejected
([F.sub.17,1873] = 1.28). Thus the evidence of errors being correlated
with true expenditures rather than with household size comes from a
model that is itself rejected in favor of a simpler equation that
suggests a correlation between household size and the measurement
errors.
Cambodia
Variation in survey implementation in the Cambodian survey has an
even stronger effect on the food Engel curve than the questionnaire
effect in Indonesia. In Round 2, the puzzling negative
relationship between ln n and [w.sub.f] almost disappears (table 6). The
difference in the estimated coefficient on ln n between survey rounds varies from 0.031 to 0.053,
depending on estimation method and whether the fully interacted model
(equation (18)) is used. (18) In other words, within the Cambodian
survey, the difference in the effect on the food share of a unit
increase in the logarithm of household size is greater than many of the
between country differences reported by Deaton and Paxson. Because
nothing other than interviewer practice seems to differ between the two
groups of households in Round 1 and Round 2, measurement error emerges
as a plausible cause.
In terms of the Engel estimates of economies of scale, there appear
to be significant scale economies available in Round 1, with [sigma]
ranging from 0.37 to 0.40. This range is very close to the estimate
reported for Pakistan by Lanjouw and Ravallion (1995). In contrast, in
Round 2 of the survey the Engel estimates of scale economies are only
from 0.04 to 0.08, and always statistically insignificant.
Conclusions
This article has attempted to examine measurement error in food
expenditure data collected by the recall method in developing countries.
This is an inherently difficult task because there is no gold
standard against which the household survey estimates can be compared to
reveal the nature of the measurement error. Nevertheless, we provide
evidence for measurement errors from the difference in results that
occurs within a given setting when there is variation in either
household survey design or implementation. These errors may be related
to both household size and survey design, because these factors affect
whether survey respondents answer by episodic enumeration or by an
estimation strategy.
One interpretation of the empirical evidence reported here, which
is consistent with the Monte Carlo results in table 1, is that food
expenditures collected with a less detailed recall questionnaire have
measurement errors that are correlated with household size. In the
absence of prompting, either from a more detailed questionnaire or from
interviewers, a respondent in a recall survey is likely to forget
expenditures. As household size increases, it becomes increasingly hard
for the respondent to accurately recall all food expenditures, because
the number of transactions to remember grows with the number of
residents in the household. Hence the measurement errors may be
correlated with household size.
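
The mechanism described above can be illustrated with a short simulation. This is a sketch under stated assumptions, not a replication of the Monte Carlo experiments in table 1. The data-generating process, the `FORGET_ELASTICITY` parameter, and the helper `engel_gamma` are hypothetical. The true food share is constructed to depend only on per capita expenditure, so the true coefficient on ln n is zero. Under the "short questionnaire", reported food spending is scaled down by a factor that grows with household size.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs = 20_000
FORGET_ELASTICITY = 0.05  # hypothetical: reported food scales with n ** -0.05

n = rng.integers(1, 11, n_obs).astype(float)  # household size 1..10
ln_n = np.log(n)
ln_pce = 8.0 - 0.3 * ln_n + rng.normal(0, 0.6, n_obs)
# True food share depends on PCE only, so the true gamma is zero.
w_true = np.clip(1.4 - 0.10 * ln_pce + rng.normal(0, 0.05, n_obs), 0.05, 0.95)

x_true = np.exp(ln_pce) * n   # total household expenditure
food_true = w_true * x_true
nonfood = x_true - food_true


def engel_gamma(food, nonfood, n):
    """OLS coefficient on ln n in w_f = a + b*ln(x/n) + g*ln n."""
    x = food + nonfood        # reported total expenditure
    X = np.column_stack([np.ones(len(n)), np.log(x / n), np.log(n)])
    return np.linalg.lstsq(X, food / x, rcond=None)[0][2]


# Long questionnaire: food is reported without forgetting.
gamma_long = engel_gamma(food_true, nonfood, n)
# Short questionnaire: forgetting increases with household size.
gamma_short = engel_gamma(food_true * n ** -FORGET_ELASTICITY, nonfood, n)

print("gamma (long): ", gamma_long)
print("gamma (short):", gamma_short)
```

With these assumed values, the estimate from the short questionnaire should be clearly negative while the long-questionnaire estimate stays near zero. This matches the direction of the Core versus Module and Round 1 versus Round 2 contrasts reported above.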
COPYRIGHT 2007 American Agricultural Economics Association.