The central difficulty in estimating the parameters of equation [2]
is that [S.sub.i] is unobserved; location of purchase is not in the data. My
solution to this problem is to parameterize the S function and then
incorporate these parameters into equation [2]. Instead of a
deterministic indicator function governing the decision to smuggle, the
parameterization yields the probability, conditional on the observables,
that individual i purchases cigarettes in a border locality.
Specifically, I assume the probability an individual smuggles is
decreasing in the cost of smuggling and increasing in the marginal gains
from smuggling.
I model the smuggling cost of obtaining cigarettes in a lower-price
locality as [delta]ln(D) - [phi], where D is the distance to the closest
lower-price border state. The other cost parameter is [phi], which
indexes the fixed cost individual i would incur by purchasing in the
home state regardless of his location with respect to the lower-price
border.
Note that I assume all smugglers make the same number of trips,
which is akin to assuming smuggling costs are independent of the number
of cigarettes purchased. Thus, conditional on the consumer's
location, smuggling costs are fixed and vary only with the distance to a
lower-price border. The data corroborate this assumption by strongly
rejecting any correlation between distance and consumption absent any
price difference across localities.
I assume the savings from purchasing in a lower-price jurisdiction
is proportional to the difference in log home and log border state
prices. Assuming the probability one smuggles can be approximated using
a linear probability model, (18) the smuggling equation is
[4] P([S.sub.i] = 1) = [phi] + [alpha](ln([P.sub.h]) -
ln([P.sub.b])) - [delta]ln([D.sub.i]) [equivalent to] [rho].
Using the law of iterated expectations, equation [2] becomes
[5] [[beta].sub.0] + [[beta].sub.1](ln([P.sub.h])(1 - P([S.sub.i] =
1)) + ln([P.sub.b])P([S.sub.i] = 1)) + [gamma] [X.sub.i] =
[[beta].sub.0] + [[beta].sub.1] ln([P.sub.h]) - [[beta].sub.1]
(ln([P.sub.h]) - ln([P.sub.b])) [rho] + [gamma] [X.sub.i].
Equation [5] represents a regression of log cigarette consumption
on expected price given log distance, difference in log price, and
[phi]. If [rho] equals zero such that the consumer purchases at home with
certainty, then only the home price matters. Conversely, if [rho] is one
and the consumer smuggles with certainty, then only the border price
matters.
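As a minimal sketch of equations [4] and [5], the Python below computes the linear smuggling probability and the resulting expected log price. The function names and all parameter values are hypothetical, chosen only for illustration; they are not estimates from this study.

```python
import math


def smuggling_probability(p_home, p_border, distance, phi, alpha, delta):
    """Equation [4]: linear probability that a consumer buys across the border."""
    log_price_diff = math.log(p_home) - math.log(p_border)
    return phi + alpha * log_price_diff - delta * math.log(distance)


def expected_log_price(p_home, p_border, rho):
    """Price term of equation [5]: weight rho on the border price, 1 - rho on the home price."""
    return (1.0 - rho) * math.log(p_home) + rho * math.log(p_border)


# Hypothetical prices (dollars per pack), distance (miles), and parameters.
P_HOME, P_BORDER = 4.00, 3.00
rho = smuggling_probability(P_HOME, P_BORDER, distance=10.0,
                            phi=0.2, alpha=0.5, delta=0.05)

# Polar cases discussed in the text.
home_only = expected_log_price(P_HOME, P_BORDER, rho=0.0)    # equals ln(P_HOME)
border_only = expected_log_price(P_HOME, P_BORDER, rho=1.0)  # equals ln(P_BORDER)
```

Because equation [4] is a linear probability model, nothing in this sketch keeps rho inside the unit interval; that is a property of the approximation rather than of the code.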
In previous studies using consumption data, Lewit et al. (1981) and
Lewit and Coate (1982) assume full smuggling in a 20-mile band, which
implies [rho] = 1 if individuals live within 20 miles of the border and
[rho] = 0 if they do not. Similarly, by using an average price within 25
miles for all consumers, Chaloupka (1991) implicitly sets [rho] = 1/2
for those within 25 miles of a border and assumes [rho] = 0 for the rest
of the sample. My approach provides a less arbitrary and more reasonable
account of casual smuggling than previous models as it allows
the probability of smuggling (i.e., the weights on home and border
state prices) to vary over the entire population based on differences in
smuggling incentives.
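The contrast among these weighting rules can be sketched in Python as follows. The function names, the treatment of the band boundary as inclusive, and the parameter values passed to the linear rule are assumptions made for illustration only.

```python
import math


def rho_band(distance, band_miles=20.0):
    """Full smuggling inside a fixed band, none outside (the 20-mile band rule)."""
    return 1.0 if distance <= band_miles else 0.0


def rho_average_price(distance, band_miles=25.0):
    """Averaging home and border prices inside the band implies rho = 1/2."""
    return 0.5 if distance <= band_miles else 0.0


def rho_linear(distance, log_price_diff, phi, alpha, delta):
    """Equation [4]: rho varies continuously with distance and the price gap."""
    return phi + alpha * log_price_diff - delta * math.log(distance)


# Compare the three rules at a few hypothetical distances (miles).
for miles in (5.0, 15.0, 30.0, 60.0):
    print(miles,
          rho_band(miles),
          rho_average_price(miles),
          round(rho_linear(miles, 0.3, phi=0.2, alpha=0.5, delta=0.05), 3))
```

The two band rules jump discontinuously at an arbitrary cutoff and ignore the size of the price gap, whereas the linear rule declines smoothly with log distance and rises with the price difference.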
Substituting equation [4] into equation [5] yields the reduced form
demand equation used throughout this study: (19)
[6] [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
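Because the expression is not reproduced in this copy, the following is a reconstruction obtained by substituting equation [4] into the right-hand side of equation [5]. The left-hand side \ln Q_i (log consumption) and the assignment of the labels \Pi_0 through \Pi_4 to particular terms are assumptions, the latter chosen to agree with the sign predictions stated in the text; the published equation may be arranged differently or include an error term.

```latex
\begin{align*}
\ln Q_i
  &= \beta_0 + \beta_1 \ln P_h
     - \beta_1 (\ln P_h - \ln P_b)
       \bigl[\phi + \alpha(\ln P_h - \ln P_b) - \delta \ln D_i\bigr]
     + \gamma X_i \\
  &= \beta_0 + \beta_1 \ln P_h
     - \beta_1 \phi\, (\ln P_h - \ln P_b)
     - \beta_1 \alpha\, (\ln P_h - \ln P_b)^2
     + \beta_1 \delta\, (\ln P_h - \ln P_b) \ln D_i
     + \gamma X_i \\
  &\equiv \Pi_0 + \Pi_1 \ln P_h
     + \Pi_2 (\ln P_h - \ln P_b)
     + \Pi_3 (\ln P_h - \ln P_b)^2
     + \Pi_4 (\ln P_h - \ln P_b) \ln D_i
     + \gamma X_i ,
\end{align*}
% Assumed mapping: \Pi_0 = \beta_0, \Pi_1 = \beta_1, \Pi_2 = -\beta_1\phi,
% \Pi_3 = -\beta_1\alpha, \Pi_4 = \beta_1\delta.
```

Under this mapping, a negative \beta_1 together with positive \phi, \alpha, and \delta gives \Pi_1 < 0, \Pi_2 > 0, \Pi_3 > 0, and \Pi_4 < 0.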
One concern with the reduced form demand function given by equation
[6] is the log distance measure. (20) This is a potential problem
because one might expect the impact of distance on demand to go to zero
as distance approaches infinity. The log distance term implies that, as
distance becomes arbitrarily large, log demand decreases to negative
infinity. While such a critique could be levied against any log-log
model, it is important to note that using log distance is a simplifying
assumption, (21) and that equation [6] represents a parametric approximation
to the true demand function. To address this problem when calculating
the home state price elasticities, I constrain the home state price
elasticity to be weakly smaller in absolute value than the full price
elasticity. In effect, this restricts cross-state purchases to be zero
when the cross-border price differential is low and/or the consumer
lives far from the border. (22)
As the model is constructed, the expectation is that [delta], [phi], and
[alpha] are all positive because the probability of smuggling should be
decreasing in distance from a lower-price border, increasing in the price
difference, and increasing in the fixed cost parameter. It is natural to
expect [[beta].sub.1] to be negative, which implies [[PI].sub.1] < 0,
[[PI].sub.2] > 0, [[PI].sub.3] > 0, and [[PI].sub.4] < 0.
The expected signs of [[PI].sub.1] through [[PI].sub.4] illustrate
the predictions of the model for the responsiveness of consumption to
the home state price. Conditional on distance, an increase in the price
difference should render consumption less sensitive to the home state
price. Conversely, an increase in distance to a lower-price border
should make demand more responsive to the home state price as the cost
of obtaining a given amount of savings has risen.
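These comparative statics can be checked with a short sketch. Differentiating the price terms of equation [5] with respect to ln([P.sub.h]), with [rho] given by equation [4] and the border price and distance held fixed, gives [[beta].sub.1](1 - [phi] - 2[alpha](ln([P.sub.h]) - ln([P.sub.b])) + [delta]ln([D.sub.i])). The function name and parameter values below are hypothetical, and the sketch ignores the constraint on elasticities described earlier.

```python
import math


def home_price_elasticity(beta1, phi, alpha, delta, log_price_diff, distance):
    """Derivative of log demand with respect to the log home state price.

    Obtained from equations [4] and [5], holding the border price and
    distance fixed. All arguments are illustrative inputs.
    """
    return beta1 * (1.0 - phi - 2.0 * alpha * log_price_diff
                    + delta * math.log(distance))


# Hypothetical parameter values with the expected signs.
params = dict(beta1=-0.5, phi=0.2, alpha=0.5, delta=0.05)

base = home_price_elasticity(log_price_diff=0.1, distance=10.0, **params)
bigger_gap = home_price_elasticity(log_price_diff=0.3, distance=10.0, **params)
farther = home_price_elasticity(log_price_diff=0.1, distance=50.0, **params)
```

With these inputs a larger price gap moves the elasticity toward zero, while a greater distance makes it more negative, matching the two predictions in the paragraph above.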
ESTIMATION STRATEGY
I estimate demand functions on the intensive margin (Q = number of
cigarettes smoked per day by smokers), extensive margin (Q = smoking
participation rate), and full margin (Q = number of cigarettes smoked
per day, including non-smokers). I employ state-MSA fixed effects in
all regressions, so only within-MSA, across-time variation in prices,
distance, and price differences is used to identify the parameters of
the demand function. It is important to use fixed effects in such
regressions because individuals may differ across MSAs and across states
in their preferences for smoking, conditional on price. For example,
people might be less averse to smoking in a tobacco producing state such
as Kentucky than in a high anti-smoking sentiment state like
Massachusetts. The fact that Massachusetts is a high cigarette tax state
and Kentucky is a low cigarette tax state is likely a function of these
same preferences. Without fixed effects, demand regressions attribute
some of the preference-related smoking differences across states or MSAs
to price differences, causing an upwards omitted variables bias in the
coefficient on price. (23)
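The role of the fixed effects can be illustrated with a small constructed example in Python. The data below are hypothetical and noise-free, built so that areas with a weaker taste for smoking also have higher prices; the sketch compares a pooled slope with the slope after the within (area-demeaning) transformation and does not use the study's data or full specification.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    xd, yd = x - x.mean(), y - y.mean()
    return float(xd @ yd / (xd @ xd))


def demean_by_group(values, groups):
    """Subtract each group's mean: the within (fixed effects) transformation."""
    out = values.astype(float).copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= out[mask].mean()
    return out


# Hypothetical panel: 4 areas, each observed in 3 periods.
groups = np.repeat(np.arange(4), 3)
periods = np.tile(np.arange(3), 4)
taste = -1.0 * groups                # unobserved area-level taste for smoking
log_price = groups + 0.1 * periods   # high-price areas are the low-taste areas
TRUE_SLOPE = -0.5
log_q = taste + TRUE_SLOPE * log_price

pooled = ols_slope(log_price, log_q)
within = ols_slope(demean_by_group(log_price, groups),
                   demean_by_group(log_q, groups))
```

In this constructed case the pooled slope attributes the taste differences to price and overstates the magnitude of the price response, while the within estimator recovers the slope used to generate the data.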
Because I am interested in estimating demand functions, the price
changes that occur in the data need to be independent of the
unobservables in the quantity demanded equation, conditional on the
observable variables included in the model. Keeler, Hu, Barnett,
Manning, and Sung (1996) present evidence that such independence may not
hold; they find cigarette producers price discriminate by state based on
numerous demographic and state legal factors. If prices are a function
of the demographic composition of the state and if these demographic
factors play a role in preferences for cigarettes, price changes will be
endogenous to cigarette demand. It is unlikely I will be able to control
for all factors that jointly affect demand and price discrimination.
Thus, using state average prices in the demand regressions is likely to
lead to biased parameter estimates on the price variables. In order to
account for this endogeneity, I instrument all price variables with tax
variables. (24) Further, if price differences across MSAs in different
states are correlated with distance between the MSAs, there will be
measurement error in the price differences as I am using differences in
average state prices. Instrumenting the price difference with the tax
difference should overcome any biases associated with such measurement
error. Note that taxes are thus a valid instrument for prices only if state
excise taxes are not set in response to the distance between MSAs across
states or in response to differing home state price elasticities. (25)
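The logic of instrumenting price with tax can be sketched with a single-instrument ratio (Wald) estimator in Python. This is a deliberately simplified stand-in, not the multi-variable instrumental variables specification described here, and the data are hypothetical: the unobserved demand shifter raises both price and consumption but is constructed to be uncorrelated with the tax.

```python
import numpy as np


def cov(a, b):
    """Sample covariance (population normalization; it cancels in the ratios)."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))


# Hypothetical data, for illustration only.
tax = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0])        # instrument
shock = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])   # unobserved demand shifter
log_price = tax + shock         # price responds to both the tax and the shock
TRUE_SLOPE = -0.5
log_q = TRUE_SLOPE * log_price + shock

# OLS is contaminated by the shock; the IV ratio uses only tax-driven variation.
ols = cov(log_price, log_q) / cov(log_price, log_price)
iv = cov(tax, log_q) / cov(tax, log_price)
```

The instrument works in this example only because the tax is uncorrelated with the demand shifter by construction, which mirrors the validity condition stated above.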
While much of the data are collected at the individual level, the
independent variables of interest vary at the state-MSA level. Thus,
for each of the 12 tobacco supplements, I collapse the data into
MSA-specific means using the non-response weights included in the
survey data. This aggregation is justified by interpreting the consumer
in the model presented in the fourth section as the representative or
"average" consumer in a given MSA. (26) The aggregated data
set contains 2,904 observations at the state-MSA level. I also weight
all regressions by the number of observations that constitute each MSA
mean and estimate heteroskedasticity-robust standard errors.
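A minimal sketch of the aggregation step follows. The record layout, the function name, and the example values are hypothetical; the sketch computes a weighted mean for each MSA and keeps the number of underlying observations, which is the quantity later used as the regression weight.

```python
from collections import defaultdict


def collapse_to_weighted_means(records):
    """Collapse (msa, value, weight) records to (weighted mean, count) per MSA."""
    weighted_sums = defaultdict(float)
    weight_totals = defaultdict(float)
    counts = defaultdict(int)
    for msa, value, weight in records:
        weighted_sums[msa] += value * weight
        weight_totals[msa] += weight
        counts[msa] += 1
    return {msa: (weighted_sums[msa] / weight_totals[msa], counts[msa])
            for msa in weighted_sums}


# Hypothetical individual-level records: (MSA, cigarettes per day, survey weight).
records = [("A", 10.0, 1.0), ("A", 20.0, 3.0), ("B", 5.0, 2.0)]
collapsed = collapse_to_weighted_means(records)
```

Here MSA "A" has a weighted mean of 17.5 from two observations and MSA "B" has a mean of 5.0 from one, so "A" would receive twice the weight of "B" in a regression weighted by cell size.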
The demographic variables used in the regressions that follow are
the state-MSA mean values of age, sex, weekly wage, marital status, race
(with white as the excluded category), education (with no high school
diploma as the omitted category), and labor force status (with not in
the labor force as the omitted category). Means of all variables by year
are presented in Table 5.
As Table 5 illustrates, there is a large decrease in the amount
smoked by smokers and a modest decrease in the percentage of smokers
over the time span of this analysis. These trends could be due to the
price increases that occur over this period, but there are undoubtedly
also secular trends stemming from aggregate changes in views and
preferences with respect to smoking. Including a linear year trend in
the demand models is thus appropriate. I present estimates both
including and excluding the year trend for all specifications. (27)
COPYRIGHT 2008 National Tax Association. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.