Measurement error in recall surveys and the
relationship between household size and food demand.
by Gibson, John; Kim, Bonggeun
Empirical research in agricultural and development economics
increasingly uses data from household surveys. (1) There is a growing
realization that "measurement error is an ever-present, generally
significant, but usually neglected, feature of survey based income and
expenditure data" (Chesher and Schluter 2002, p. 377). It is
difficult to directly study these measurement errors because the true
value of expenditures is rarely known. Comparisons with other estimates,
such as household consumption in the National Accounts, are also fraught
with difficulty. In the absence of contrary evidence, assumptions about
errors being uncorrelated white noise continue to be made for reasons of
convenience (Bound, Brown, and Mathiowetz 2001).
In this article, we provide suggestive evidence of measurement
errors in food expenditures and budget shares being correlated with
household size in recall surveys. These surveys, where one respondent
gives a verbal report on the entire household's expenditure on a
number of items over some previous period, are used especially in
developing countries. It appears that, in the absence of prompting from
a more detailed recall list, a respondent in a recall survey is likely
to forget food expenditures, especially in larger households where
there are more purchases to remember. While respondents may also forget
nonfood purchases, the understatement for food may be greater because
food is purchased more frequently.
These measurement errors affect the estimated relationship between
household size and food demand, which is important for understanding
economies of scale within households. One common, although frequently
criticized, method of measuring scale economies is based on what is
sometimes called Engel's second law, the assertion that the food
share is an inverse indicator of welfare across households of different
sizes and compositions (Lanjouw and Ravallion 1995). This method may
mistake correlated errors in food expenditure data for genuine scale
economies.
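The mechanism described above can be illustrated with a minimal
simulation sketch. It is not the Monte Carlo design used later in this
article; the function name size_coefficient, the functional forms, and
every parameter value (the Engel slope of -0.1, the forgetting rate,
the household size range) are assumptions chosen only for illustration.
The true food share is constructed to be independent of household size,
and food expenditure is then understated by a fraction that grows with
log household size.

```python
import math
import random


def size_coefficient(forget_rate, n_households=5000, seed=0):
    """OLS coefficient on log household size in a food share regression.

    All parameter values are illustrative assumptions, not estimates
    from any survey discussed in the article.
    """
    rng = random.Random(seed)
    shares, log_pces, log_sizes = [], [], []
    for _ in range(n_households):
        size = rng.randint(1, 10)          # household size, 1 to 10
        log_pce = rng.gauss(0.0, 0.5)      # log per capita expenditure
        # True share falls with PCE but does NOT depend on household size.
        true_share = 0.6 - 0.1 * log_pce + rng.gauss(0.0, 0.02)
        total = math.exp(log_pce) * size
        food = true_share * total
        nonfood = total - food
        # Assumed recall error: the forgotten fraction of food spending
        # grows with log household size; nonfood is reported correctly.
        reported_food = food * (1.0 - forget_rate * math.log(size))
        reported_total = reported_food + nonfood
        shares.append(reported_food / reported_total)
        log_pces.append(math.log(reported_total / size))
        log_sizes.append(math.log(size))

    def centered_cross(a, b):
        """Sum of cross-products of deviations from the means."""
        mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
        return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))

    # Closed-form OLS for share = a + b*log_pce + c*log_size; return c.
    s_xx = centered_cross(log_pces, log_pces)
    s_zz = centered_cross(log_sizes, log_sizes)
    s_xz = centered_cross(log_pces, log_sizes)
    s_yx = centered_cross(shares, log_pces)
    s_yz = centered_cross(shares, log_sizes)
    return (s_yz * s_xx - s_yx * s_xz) / (s_xx * s_zz - s_xz ** 2)


if __name__ == "__main__":
    print("no recall error:   ", round(size_coefficient(0.0), 4))
    print("size-related error:", round(size_coefficient(0.05), 4))
```

Under these assumptions the estimated coefficient on log household size
is close to zero when there is no recall error and turns negative once
the size-related understatement is switched on, even though no scale
economies were built into the simulated data.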
Further motivation for studying these errors comes from Deaton and
Paxson (1998), who report the puzzling result that at constant per
capita expenditure (PCE), the budget share for food falls as household
size rises, especially in poorer countries. This pattern has been
confirmed by Gardes and Starzec (2000), Perali (2001), Abdulai (2003),
and Gan and Vernon (2003). Theory predicts the opposite pattern. Larger
households should have higher food demand because, at constant PCE,
resources released by the sharing of public goods can be spent on both
public and private goods, giving a positive income effect. Substitution
effects favor public goods, which are effectively cheaper in larger
households, but the income effect should be bigger for food, whose
(absolute) own-price elasticity is likely to be lower than the income
elasticity, especially in poorer countries. Deaton and Paxson list
several possible explanations for their puzzle, including measurement
error, but none are considered convincing.
Measurement error may warrant more attention because the design of
the surveys used by Deaton and Paxson varies systematically across the
income distribution. The countries with the least puzzling results
(France and Britain) use diary surveys where each adult in the household
keeps a daily record of expenditure for two weeks. Surveys in the three
poorest countries, with the most puzzling results, ask respondents to
remember household food expenditure over the previous week (Thailand and
South Africa), month (South Africa) or year (Pakistan). These surveys
use broad commodity detail (i.e., short questionnaires) with only
twenty-six to thirty-eight food items specified (fifty-seven to
seventy-four items in total). In Taiwan and the United States, where the
results are not as puzzling, a mixture of diary and recall methods is
used. (2) This cross-country variation may contribute to the puzzling
effect of household size on food demand in poorer countries. But it is
hard to isolate the role of measurement error because factors associated
with other explanations also differ across countries. To overcome this
problem, we focus on variation in household survey design and
implementation within countries, which holds other factors constant. Two
recent surveys from Cambodia and Indonesia provide this variation.
Other recent studies also follow this approach. For example,
Attanasio, Battistin, and Ichimura (2004) show quite different
inequality trends between the diary and recall samples of the U.S.
Consumer Expenditure Survey. Ahmed, Brzozowski, and Crossley (2005)
compare diaries and recall applied to the same households in the
Canadian Food Expenditure Survey. By assuming that the diaries measure
"true" food consumption, they find measurement errors in the
recalled expenditures that are correlated with true values. There is
less correlation with household size, perhaps because their survey asks
a single question about food spending over the past month. Respondents
asked this question may not actually try to add up all of their
spending (an approach referred to below as episodic enumeration) and
may instead use an estimation strategy. While episodic enumeration
should be harder for a respondent from a larger household because of the
greater number of transactions to remember, forming an estimate based
on assumptions about average spending may not be. Thus, the results
reported here may not apply to single-question food recalls used in some
surveys in developed countries (Browning, Crossley, and Weber 2003).
The next section of the article reviews literature on household
survey design. Two examples where errors in food expenditure data may
affect results are then described. Analytical and Monte Carlo results
relating to measurement error in food share equations are then developed
and an econometric testing procedure is outlined. Finally, evidence from
the household surveys is described and compared with the results from
the Monte Carlo experiments. This comparison suggests that food
expenditure estimates from less detailed recall surveys have measurement
errors that are correlated with household size.
Previous Literature
Existing evidence suggests that the measurement of both food and
total expenditures is sensitive to survey design. Three design
variations are considered in the literature: recording in diaries versus
respondent recall in an interview, longer (more detailed) versus shorter
(less detailed) recall questionnaires, and different periods over which
expenditures are meant to be recalled.
In an experiment in Latvia, one half of the households were given a
diary for recording expenditures and in a subsequent period they were
given a recall survey, while the other half had the recall first and
then the diary. (3) Reported food expenditures were about 46% higher
with the diary, regardless of whether the diary was used first or second
(Scott and Okrasa 1998). Another split-sample experiment in urban Papua
New Guinea found (geometric) mean food expenditures to be 26% higher and
the food budget share six percentage points higher with the diary
(Gibson 2002). Moreover, the difference in food shares between the two
questionnaires appeared to be correlated with household size.
A recall experiment in El Salvador gave a long questionnaire
(seventy-five food items, twenty-five nonfoods) to one quarter of the
sample, with others given a short questionnaire (eighteen foods, six
nonfoods) covering the same items more broadly. Average per capita
consumption was 31% higher with the long questionnaire (Jolliffe 2001).
A similar experiment, which is repeated every three years in Indonesia,
gives one sample a questionnaire with twenty-three broad categories and
another one with 320 detailed categories that nest within the broad
ones. Average consumption is between 12% and 20% lower with the short
questionnaire and the difference between questionnaires appears to be
correlated with the level of expenditures (Pradhan 2001).
An experiment in Ghana varied recall periods, with reported
spending on a group of frequently purchased items falling by 2.9% for
every day added to the recall period, with the recall error leveling off
at about 20% after two weeks (Scott and Amenuvegbe 1991). The Indian
National Sample Survey (NSS) experimented with using a "last
week" versus a "last month" recall and found that for the
all-food aggregate the estimates based on weekly recall were 21% higher
(NSSO 2003).
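The experiments above report their gaps in different directions: some
say how much higher the more detailed design is, while the Indonesian
comparison says how much lower the short questionnaire is. The sketch
below puts the "higher" figures on the same "shortfall" basis. The
helper name shortfall_from_uplift is hypothetical, and the conversion
applies only to the reported sample averages (a geometric mean in the
Papua New Guinea case), not to individual households.

```python
def shortfall_from_uplift(uplift_pct):
    """Percent by which the lower estimate falls short of the higher one,
    given that the higher estimate exceeds the lower by uplift_pct percent.
    """
    return 100.0 * (1.0 - 1.0 / (1.0 + uplift_pct / 100.0))


# Figures quoted in the text: percent by which the higher estimate
# exceeds the lower one in each experiment.
reported_uplifts = {
    "Latvia, diary vs recall (food)": 46.0,
    "Papua New Guinea, diary vs recall (food)": 26.0,
    "El Salvador, long vs short questionnaire (consumption)": 31.0,
    "India, weekly vs monthly recall (all food)": 21.0,
}

if __name__ == "__main__":
    for label, uplift in reported_uplifts.items():
        gap = shortfall_from_uplift(uplift)
        print(f"{label}: lower estimate is {gap:.1f}% below the higher one")
```

On this basis, for example, the 46% diary uplift in Latvia corresponds
to recall estimates roughly 31.5% below the diary estimates, which is
directly comparable with the 12% to 20% shortfall reported for the
short Indonesian questionnaire.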
These examples, in which two survey designs used in the same setting
yield widely different expenditure estimates, indicate measurement
error because the estimates from the two designs cannot both be right.
It is tempting to go further and suggest that some designs are more
accurate than others, but such beliefs remain unproven because it is
hard to obtain actual expenditures, which are needed if survey
estimates are to be validated. For example, the NSS experiments
attempted to form a gold standard by having enumerators visit households
every day and giving respondents volumetric containers for measuring
food consumption. The monthly recall for the all-food aggregate was only
83% of this standard compared with 93% for the weekly recall. But the
gold standard may not have been completely accurate because, for some
foods, fewer than two-thirds of respondents used the measuring containers
and many respondents did not use the daily diary supplied to them (NSSO
2003).
COPYRIGHT 2007 American Agricultural Economics Association. Reproduced
with permission of the copyright holder. Further reproduction or
distribution is prohibited without permission.