Measurement error in recall surveys and the relationship between household size and food demand.


by Gibson, John; Kim, Bonggeun

which is identical to equation (5) because $\gamma = \beta\sigma$. (9) According to equation (6), if $x^0$ is the outlay of a one-person household, an n-person household of the same composition needs total outlay of $x^0 n^{1-\sigma}$ to have the same food share (and the same welfare level, by assumption).

Theoretical objections to this method have been raised at least since Nicholson (1976), and the aim here is not to add to them. Instead, the aim is to see how correlated measurement error affects these Engel estimates. Because the scale economy parameter, $\sigma$, is just the ratio of $\hat{\gamma}$ to $\hat{\beta}$, any measurement error that biases $\hat{\gamma}$ will affect Engel estimates of scale economies. For example, Lanjouw and Ravallion estimate $\sigma$ to be 0.4, so if ten individuals in Pakistan formed a ten-person household, their per capita food spending could go down by 60% and they would still have the same level of welfare ($10^{0.6} = 3.98$). These large scale economy estimates imply improbable reductions in food spending per head for consumers in a poor country (Deaton 1997). But if the estimates of $\sigma$ are sensitive to measurement error, the Engel method will be shown to be not only theoretically unfounded but also empirically fragile.
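The arithmetic behind this example can be checked with a short sketch. The function names `equivalent_outlay` and `per_capita_change` are illustrative and do not come from the article; the only inputs taken from the text are the scale parameter of 0.4 and the ten-person household.

```python
def equivalent_outlay(x0: float, n: int, sigma: float) -> float:
    """Total outlay an n-person household needs to match the food share of a
    one-person household with outlay x0, using x0 * n**(1 - sigma)."""
    return x0 * n ** (1.0 - sigma)


def per_capita_change(n: int, sigma: float) -> float:
    """Proportional change in per capita outlay that leaves welfare unchanged."""
    return equivalent_outlay(1.0, n, sigma) / n - 1.0


if __name__ == "__main__":
    total = equivalent_outlay(1.0, 10, 0.4)
    print(f"total outlay relative to one person: {total:.2f}")
    print(f"per capita change: {per_capita_change(10, 0.4):.1%}")
```

With a scale parameter of 0.4 the ten-person household needs about 3.98 times the one-person outlay, which is roughly 60% less per head.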

Measurement Error and the Testing Procedure

Suppose that survey data on household expenditure is subject to reporting error of the form:

(7) $\tilde{x}_i = x_i + m_i + v^x_i$

where $\tilde{x}_i$ is the survey response, $x_i$ is the true value of expenditure of the ith household, $m_i$ is a method effect, due perhaps to the use of a less detailed recall questionnaire rather than a more detailed one, and $v^x_i$ is a pure random error. As discussed above, the method effect in the measurement error, $m_i$, may be negatively correlated with household size, $n_i$. Thus it is also assumed to be negatively correlated with household expenditure, $x_i$, since $x_i$ is positively correlated with $n_i$. Hence, the method effect can be expressed as:

(8) $m_i = \pi x_i + v^m_i$

where $v^m_i$ is a random deviation for the ith household from the average method effect. Combining the two equations gives:

(9) $\tilde{x}_i = \lambda_x x_i + v_i$

where $v_i$ ($\equiv v^m_i + v^x_i$) is a pure random error and $\lambda_x$ ($\equiv 1 + \pi$) represents a potential correlation between the true values and the method effect in the measurement error. Note that $\lambda_x$ equals one plus the slope in the regression of the method effect on the true value. Classical measurement error is the special case of equation (9) where $\lambda_x = 1$. But with correlated errors, $\pi < 0$ and (as long as measured expenditures are still positively correlated with true values) the measurement error follows a mean-reverting pattern ($0 < \lambda_x < 1$). Thus, the expected value of measured expenditures, $E(\tilde{x})$, is the population mean of true expenditures scaled down by $\lambda_x$, and this understatement is consistent with the literature summarized above (e.g., Jolliffe 2001).
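A small simulation can make the mean-reverting pattern concrete. The sketch below is not from the article: the lognormal distribution of true expenditure, the error standard deviations, and the function name `simulate_slope` are assumptions chosen for illustration. It builds reported expenditure from equations (7) and (8) and returns the slope of reported on true expenditure, which should be close to one plus the method-effect slope, along with the ratio of mean reported to mean true expenditure.

```python
import random


def simulate_slope(pi: float, n_obs: int = 50_000, seed: int = 0) -> tuple[float, float]:
    """Return (slope of reported on true expenditure, mean reported / mean true)."""
    rng = random.Random(seed)
    true_x, reported_x = [], []
    for _ in range(n_obs):
        x = rng.lognormvariate(7.0, 0.5)        # true expenditure (assumed distribution)
        m = pi * x + rng.gauss(0.0, 50.0)       # method effect, as in equation (8)
        x_tilde = x + m + rng.gauss(0.0, 50.0)  # survey response, as in equation (7)
        true_x.append(x)
        reported_x.append(x_tilde)
    mean_x = sum(true_x) / n_obs
    mean_r = sum(reported_x) / n_obs
    cov = sum((a - mean_x) * (b - mean_r) for a, b in zip(true_x, reported_x)) / n_obs
    var = sum((a - mean_x) ** 2 for a in true_x) / n_obs
    return cov / var, mean_r / mean_x


if __name__ == "__main__":
    slope, ratio = simulate_slope(pi=-0.2)
    print(f"estimated slope: {slope:.3f}, ratio of means: {ratio:.3f}")
```

Setting `pi=0.0` gives the classical case, where both the slope and the ratio of means should be close to one.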

To see the implications of nonclassical (i.e., $\lambda_x \neq 1$) measurement errors for regression parameters, consider the following simplified version of the linear regression model used by Deaton and Paxson (the demographic composition and control variables are ignored):

(10) $w_{f,i} = \alpha + \beta \ln (x/n)_i + \gamma \ln n_i + u_i$.

The survey data on $\ln (x/n)_i$ are subject to reporting error because of the reporting error in $x_i$:

(11) $\ln (\tilde{x}/n)_i = \ln (x/n)_i + v_i$

where the measurement error $v_i$ can be correlated with $\ln (x/n)_i$. In addition, the food share is error-ridden (unless there is the same proportionate error in food and nonfood expenditures):

(12) $\tilde{w}_{f,i} = w_{f,i} + v^w_i$

and the measurement error $v^w_i$ can be correlated with $w_{f,i}$.

With the error-ridden variables, the regression model becomes:

(13) $\tilde{w}_{f,i} = \alpha + \beta \ln (\tilde{x}/n)_i + \gamma \ln n_i + (u_i + v^w_i - \beta v_i)$.

The slope coefficients in the population regression are

(14) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and

(15) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where $\sigma^2_{\ln(\tilde{x}/n)}$ is the population variance of $\ln(\tilde{x}/n)$, $\sigma^2_{\ln n}$ is the population variance of $\ln n$, $\sigma_{\ln(\tilde{x}/n), \ln n}$ is the population covariance of $\ln(\tilde{x}/n)$ and $\ln n$, which is supposed to be negative, $\rho$ is the population correlation of $\ln(\tilde{x}/n)$ and $\ln n$, with $-1 < \rho < 0$, and $\beta$ is supposed to be negative according to Engel's (first) law. When the underreporting of nonfood expenditures is less than that of food expenditures, $v^w_i$ will be negatively correlated with $w_{f,i}$. It is also expected that $v_i$ will be positively correlated with $\ln (x/n)_i$, since measurement error in total expenditure, $x_i$, is assumed to be negatively correlated with log household size, $\ln n_i$, which is negatively correlated with log PCE, $\ln (x/n)_i$. Under these assumptions, the first two terms of the bias in $\hat{\gamma}$ will be negative and the other two terms of the bias will be positive. Thus, $\hat{\gamma}$ could be biased downward and even negative depending on the magnitude of these terms.

The two negative terms ([MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and $-\beta \sigma_{\ln n, v} \sigma^2_{\ln(\tilde{x}/n)}$) are larger in magnitude than the two corresponding positive terms ([MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] $\sigma_{\ln(\tilde{x}/n), \ln n}$ and $\beta \sigma_{\ln(\tilde{x}/n), v} \sigma_{\ln(\tilde{x}/n), \ln n}$) under reasonable assumptions (compare the ratio of the first to the third term and the second to the fourth term). Thus, $\hat{\gamma}$ is more likely to be negative when the degree of underreporting of total expenditures increases or when the underreporting of food expenditures becomes larger relative to that of nonfood expenditures.
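For reference, the four bias terms can be written out with the textbook formula for a least squares slope with two regressors. This is a sketch derived from equations (10) to (12), not a reproduction of the article's own expression. It assumes the disturbance $u_i$ is uncorrelated with both observed regressors, and the covariance symbols involving $v^w$ are notation introduced here for illustration.

```latex
% Composite error after substituting (11) and (12) into (10):
%   e_i = u_i + v^w_i - \beta v_i
\operatorname{plim}\hat{\gamma}
  = \gamma + \frac{
      \sigma_{\ln n, v^w}\,\sigma^2_{\ln(\tilde{x}/n)}
      - \beta\,\sigma_{\ln n, v}\,\sigma^2_{\ln(\tilde{x}/n)}
      - \sigma_{\ln(\tilde{x}/n), v^w}\,\sigma_{\ln(\tilde{x}/n), \ln n}
      + \beta\,\sigma_{\ln(\tilde{x}/n), v}\,\sigma_{\ln(\tilde{x}/n), \ln n}
    }{
      \sigma^2_{\ln(\tilde{x}/n)}\,\sigma^2_{\ln n}\,(1-\rho^2)
    }
```

With $\beta < 0$, $\sigma_{\ln n, v} < 0$, $\sigma_{\ln(\tilde{x}/n), v} > 0$ and $\sigma_{\ln(\tilde{x}/n), \ln n} < 0$, the second numerator term is negative and the fourth is positive, which matches the signs described in the text.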

The implications change when the stronger assumption of classical measurement error is used. When measurement errors are pure random errors (and hence uncorrelated with true values), the household size coefficient, $\gamma$, in the population regression is:

(16) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Thus, $\hat{\gamma}$ will be upwardly biased and will be positive when $\gamma > 0$. The important implication of the result in equation (16) is that classical measurement error could not account for the Deaton and Paxson puzzle. It is only some form of correlated error that could cause $\hat{\gamma}$ to be biased downwards, so if measurement error is a cause of the puzzle it is in a form that differs from the standard white noise assumptions that are typically used in the literature.
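The direction of the classical-error bias can be checked numerically. The sketch below is an illustration, not the authors' procedure: it simulates equation (10) with independent errors added as in equations (11) and (12), then compares the least squares estimate of the household size coefficient with the value implied by the standard two-regressor formula when the errors are uncorrelated with everything else. The household size distribution, the log PCE process, the intercept, the value of the log PCE coefficient, the error variances, and the function name `classical_error_check` are all assumptions.

```python
import math
import random


def _cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)


def classical_error_check(sigma_v=0.2, n_obs=100_000, seed=1):
    """Return (OLS estimate of gamma, value predicted under classical error)."""
    beta, gamma = -0.10, -0.007  # beta is assumed; gamma is the value used in the text
    rng = random.Random(seed)
    x1, x2, y = [], [], []
    for _ in range(n_obs):
        ln_n = math.log(rng.randint(1, 10))              # assumed household sizes 1..10
        ln_pce = 7.0 - 0.3 * ln_n + rng.gauss(0.0, 0.5)  # log PCE falls with size
        w = 1.0 + beta * ln_pce + gamma * ln_n + rng.gauss(0.0, 0.03)
        x1.append(ln_pce + rng.gauss(0.0, sigma_v))      # equation (11), independent v
        x2.append(ln_n)
        y.append(w + rng.gauss(0.0, 0.02))               # equation (12), independent v^w
    s11, s22, s12 = _cov(x1, x1), _cov(x2, x2), _cov(x1, x2)
    det = s11 * s22 - s12 ** 2
    gamma_hat = (_cov(x2, y) * s11 - _cov(x1, y) * s12) / det
    # With pure random errors only the term in the error variance survives.
    predicted = gamma + beta * sigma_v ** 2 * s12 / det
    return gamma_hat, predicted


if __name__ == "__main__":
    estimate, predicted = classical_error_check()
    print(f"estimate: {estimate:.4f}, predicted: {predicted:.4f}")
```

Because the log PCE coefficient and the covariance between log PCE and log household size are both negative here, the predicted value lies above the true coefficient, in line with the upward bias described in the text.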

To supplement the analytical results, Monte Carlo experiments were carried out on equation (10). The experiments are based on a value of $\hat{\gamma} = -0.007$, which is similar to the value found by Deaton and Paxson in surveys from the United States and France (this value also ensures that the Engel scale elasticity is positive, but small). The aim of the experiments was to see whether plausible values of measurement error could bias $\hat{\gamma}$ downwards toward the values found in surveys from poor countries, $-0.09 \le \hat{\gamma} \le -0.05$.

To implement the experiments, total expenditure, $x$, was partitioned into food expenditures, $x_f = x \times w_f$, and nonfood expenditures, $x_{nf} = x - x_f$. In the first set of experiments, a proportionate error was added to true food expenditures, so that the observed variable was $\ln \tilde{x}_f = \ln x_f + v$. In the first case, the measurement error was independent of any of the variables in the model: $v \sim N(0, \sigma^2_v)$, with three values of $\sigma_v$ used: 0.1, 0.2, and 0.3. In the second case, errors were correlated with true values, $v = \phi \ln x_f + \epsilon$, where $\epsilon \sim N(0, \sigma^2_\epsilon)$ and $E(\epsilon, x_f) = 0$. In the third case, errors were correlated with household size, $v = \lambda \ln n + \epsilon$, where $\epsilon \sim N(0, \sigma^2_\epsilon)$ and $E(\epsilon, n) = 0$. The values used for $\phi$ and $\lambda$ were -0.3, -0.2, and -0.1.

The error-ridden total expenditure and food share variables were reconstructed as $\tilde{x} = \tilde{x}_f + x_{nf}$ and $\tilde{w}_f = \tilde{x}_f/\tilde{x}$. In other experiments, the errors in measuring food expenditures were mirrored by a similar set of errors in nonfood, recognizing that it may also be difficult to report expenditures on things like transportation and entertainment accurately in larger households.
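The first set of experiments can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the text does not give the data-generating process, so the household size distribution, the log PCE process, the intercept `ALPHA`, the log PCE coefficient `BETA`, the standard deviation `sigma_eps`, the clipping of the food share, and the sample sizes are all hypothetical. In the second case the log of food expenditure is centred on its sample mean so that the error has mean close to zero, which is also an assumption.

```python
import math
import random

ALPHA, BETA, GAMMA = 1.0, -0.10, -0.007  # GAMMA from the text; ALPHA and BETA assumed


def _cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)


def ols_gamma(y, x1, x2):
    """Coefficient on x2 in a regression of y on a constant, x1 and x2."""
    s11, s22, s12 = _cov(x1, x1), _cov(x2, x2), _cov(x1, x2)
    return (_cov(x2, y) * s11 - _cov(x1, y) * s12) / (s11 * s22 - s12 ** 2)


def one_replication(case, scale, sigma_eps, n_obs, rng):
    """Simulate one sample with error in food spending and return gamma-hat."""
    sizes, x_food, x_nonfood = [], [], []
    for _ in range(n_obs):
        n = rng.randint(1, 10)                                   # assumed sizes 1..10
        ln_pce = 7.0 - 0.3 * math.log(n) + rng.gauss(0.0, 0.5)   # assumed process
        w_f = ALPHA + BETA * ln_pce + GAMMA * math.log(n) + rng.gauss(0.0, 0.03)
        w_f = min(max(w_f, 0.01), 0.99)                          # keep share in (0, 1)
        x = n * math.exp(ln_pce)
        sizes.append(n)
        x_food.append(x * w_f)
        x_nonfood.append(x - x * w_f)
    mean_ln_xf = sum(math.log(f) for f in x_food) / n_obs
    w_obs, ln_pce_obs, ln_n = [], [], []
    for n, xf, xnf in zip(sizes, x_food, x_nonfood):
        if case == "classical":          # v independent of everything; scale is sigma_v
            v = rng.gauss(0.0, scale)
        elif case == "true_value":       # v = phi * (centred ln x_f) + eps
            v = scale * (math.log(xf) - mean_ln_xf) + rng.gauss(0.0, sigma_eps)
        elif case == "household_size":   # v = lambda * ln n + eps
            v = scale * math.log(n) + rng.gauss(0.0, sigma_eps)
        else:
            raise ValueError(f"unknown case: {case}")
        xf_obs = xf * math.exp(v)        # ln of observed food = ln x_f + v
        x_obs = xf_obs + xnf             # reconstructed total expenditure
        w_obs.append(xf_obs / x_obs)     # reconstructed food share
        ln_pce_obs.append(math.log(x_obs / n))
        ln_n.append(math.log(n))
    return ols_gamma(w_obs, ln_pce_obs, ln_n)


def mean_gamma_hat(case, scale, sigma_eps=0.1, n_obs=2000, reps=50, seed=0):
    """Average gamma-hat over Monte Carlo replications."""
    rng = random.Random(seed)
    draws = [one_replication(case, scale, sigma_eps, n_obs, rng) for _ in range(reps)]
    return sum(draws) / reps


if __name__ == "__main__":
    for case, scale in [("classical", 0.3), ("true_value", -0.3), ("household_size", -0.3)]:
        print(case, round(mean_gamma_hat(case, scale), 4))
```

Under these assumed settings one would expect the classical case to push the average estimate above `GAMMA` and the two correlated cases to push it below, but the magnitudes depend heavily on the hypothetical data-generating process and should not be read as the article's results.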


COPYRIGHT 2007 American Agricultural Economics Association Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.
Copyright 2007, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.
NOTE: All illustrations and photos have been removed from this article.

