Would you choose your preferred option? Comparing choice and recoded ranking experiments.


by Alejandro Caparros, Jose L. Oviedo, and Pablo Campos

The most widely used elicitation formats in conjoint analysis (CA) applied to environmental valuation have been rating, ranking, and choice. As economists tend to prefer ordinal measures of preferences rather than cardinal measures, especially due to the more obvious interpretation in terms of random utility (Roe, Boyle, and Teisl 1996; Holmes and Boyle 2001), we focus on ranking and choice experiments. The differences found between these formats in previous studies are not particularly surprising because different statistical techniques are used, and violations of transitivity after the first rank have been observed both in experiments and in field applications (see Foster and Mourato (2002) and Bateman et al. (2007)). However, even if data obtained through a ranking exercise are recoded as a choice experiment by assuming that the option ranked first would be the option chosen, and analyzed using statistical techniques employed in choice experiments, differences in response have been found to persist between the two formats (see Boyle et al. (2001) and the remaining literature summarized in table 1). That is, according to previous research, saying that an option is your preferred option is not the same as saying that you would choose the option. The explanation provided in Boyle et al. (2001) is that the cognitive process differs when respondents are asked to state their preferred option rather than the option that they would choose. Holmes and Boyle (2001) state that "This surprising result suggests that different cognitive processes were used in seemingly identical tasks (i.e., choose the most preferred profile from a set)." Turning the argument upside down, one can argue that a method that yields different results with such similar tasks (choose one versus state your preferred option) has a fundamental problem.

As we feel that, if true, this would have profound implications not only for economic valuation through CA but also for economic theory, we decided to investigate whether this difference remains when some of the shortcomings of existing comparisons are removed. Although the details are discussed below, in our experiment we essentially ensured that the choice and the ranking surveys were identical in all relevant features: same experimental design, same number of alternatives, and same questionnaire.

The implications of this comparison are also relevant for CA practitioners. If the differences persist, we should probably recommend choice experiments because they are closer to real-market decisions (Adamowicz, Louviere, and Williams 1994). If the differences disappear, the use of ranking could be recommended because we can obtain the same results as when using choice, and we may be able to use the information provided by the subsequent ranks to develop an additional measurement.

With this in mind, we decided to use a pairwise comparison, plus status quo, to test differences between choice and recoded ranking experiments. We chose this design because it has not been used for comparing the two formats in previous studies in spite of being the most common format used for environmental valuation applications (Adamowicz, Louviere, and Williams 1994; Blamey et al. 2002; among others). In addition, this format can be seen as a benchmark case to test the theoretical concerns discussed above (convergent validity of a choice and a ranking recoded as a choice when both are analyzed as a choice), because it has the minimum number of alternatives necessary to provide a meaningful ranking.

Using a split-sample design we presented a choice experiment to one half of the sample and a ranking to the other half. Although a ranking can be performed using a simpler design (Louviere 1988, p. 100), we chose the experimental design to be identical in both cases. In other words, we used for both subsamples the experimental design that we would need to use in a choice, because this design can also be used in a ranking.

Our results, the opposite of those obtained in Boyle et al. (2001) and the remaining literature summarized in table 1, show that the choice experiment and the ranking recoded as a choice provide statistically similar parameter vectors (same structural models). The same cognitive process was apparently used for seemingly identical tasks (i.e., choose the most preferred profile from a set, or state the profile you would choose from a set). Aggregated and per attribute welfare measures comparison tests also show statistically identical results in almost all cases. This holds for parametric tests as well as for bootstrapping tests. We completed our analysis by trying to detect "learning" and "fatigue" effects with a subsample analysis. We also used follow-up questions to study the effects associated with the information provided, the difficulty of the valuation task, the number of sets of alternatives, and the response effort. Only the respondent's reported difficulty with the valuation task turned out to be relevant.

Literature Review

Mackenzie (1993) made the first comparison of CA formats applied to environmental valuation. However, he performed a rating experiment and then simulated choices and rankings from the ratings (see also Anderson and Bettencourt 1993). In Roe, Boyle, and Teisl (1996) and Stevens, Barrett, and Willis (1997), ratings were also recoded to rankings and choices. Thus, the results of these studies are not relevant for our comparison purposes, because a choice experiment and a ranking recoded as a choice were not compared. The same holds for studies that analyze full ranks, because the differences reported can be explained by the different statistical techniques employed, and by inconsistency in the second and subsequent ranks (Chapman and Staelin 1982; Hausman and Ruud 1987; Ben-Akiva, Morikawa, and Shiroishi 1991; Foster and Mourato 2002; Siikamaki and Layton 2007). However, these explanations should not apply when only the first rank is analyzed as a choice.

Table 1 summarizes the main features of previous environmental valuation comparisons and cross-validity tests between independent samples of choice and rankings recoded to choice formats (i.e., we analyze only studies that focus on the first rank). Although results demonstrate differences between these formats, all the studies featured at least one of the shortcomings discussed below.

The experimental designs of the choice and recoded ranking experiments differ in some of the studies in table 1, because rankings can employ simpler experimental designs (e.g., Mogas and Riera (2001); see table 1). The number of alternatives offered for ranking is in some cases larger than the number of alternatives provided to choose from (e.g., Morrison and Boyle (2001); see table 1). In comparison studies where this occurs, it is hard to discern whether the differences are caused by the process of stating preferences or by the different experimental designs inducing different results. Furthermore, when respondents face a high number of alternatives to rank, they may reduce the precision of their valuation process, or they may simply assign ranks randomly. This also affects the first rank. All the studies in table 1 provided four alternatives to rank.

The inclusion of a status quo alternative in all sets of alternatives is also relevant. As in the contingent valuation method, a reference level must exist to obtain adequate welfare measures (Roe, Boyle, and Teisl 1996). Most of the studies in table 1 did not include the status quo in all sets of alternatives. Experimental designs in Boyle et al. (2001) and Holmes and Boyle (2001) were random in attributes, implying that the whole status quo appeared only in some of the sets of alternatives. In Mogas and Riera (2001), the choice experiment presented two alternatives plus status quo and the ranking presented four alternatives without status quo.

As table 1 shows, convergent validity was generally not obtained in parameters; only Morrison and Boyle (2001) found convergence, and only when exclusively including respondents who stated that the valuation task was easy. As to welfare measures, results generally indicated that they are statistically different (table 1).

Methodology

The CA exercise presented in this article was applied to the valuation, by public visitors, of a reforestation program with cork oak trees in Alcornocales Natural Park (ANP). ANP is a protected Mediterranean forest of 1,677 km² located in the south of Spain and it is covered by extensive woodlands where the main species is cork oak. Public visitors value its recreational environmental services highly (Campos, Caparros, and Oviedo 2007). The ANP forests currently face aging cork oak trees, due to natural mortality accentuated by diseases, and lack of natural regeneration due to overgrazing. Unchecked, this process will eventually result in the gradual replacement of the cork oak forest with shrublands. The failure of natural regeneration and private reforestation programs has led the regional administration to implement a policy providing subsidies to landowners that reforest their lands. This policy is currently being applied within the framework of the European Union Common Agricultural Policy. We decided to investigate whether social preferences, expressed through willingness to pay (WTP), are in alignment with conserving and increasing cork oak forest extent in ANP. An analysis of policy implications of the results of these experiments can be found in Caparros et al. (2007).

Survey Logistics and Experimental Design

The survey was a CA exercise in which respondents had to complete either eight choices or eight rankings per questionnaire. Each format was answered by 450 individuals. The interviews, conducted from June 2002 to May 2003, were face-to-face with ANP public visitors, who were given an informative booklet with basic information about ANP and the implications of the different reforestation options.

Previously, two focus groups were used to identify the main attributes of a reforestation program for the general public, and to evaluate the extent to which the information presented in the survey was understood. A preliminary design for the choice/ranking sets was tested as well. We used the focus group information to create a pretest (1) whose main objective was to obtain the vector of monetary values to be offered in the main survey. An open-ended WTP question was used to obtain a value for a whole reforestation program, followed by six open-ended WTP questions corresponding to the six different attributes selected using the focus group (the five used in the final version plus one attribute not included in the final version, "number of birds protected"). The pretest was presented to 115 ANP visitors.

Given the information obtained in the focus group and the pretest, the attributes presented in table 2 were chosen for the analysis. Figure 1 shows an example of a choice and a ranking set.

Given these attributes and their levels, we chose sixteen treatments from the universe of 1,024 possible combinations ($4^4 \times 2^2$) of attributes, forming a main effects design for attributes. Then, we placed the sixteen treatments in pairwise combinations in order to obtain a full set of pairwise comparisons among treatments, yielding 120 choice sets ($\binom{16}{2}$). This full set enables us to take into account all interactions between treatments and is more appropriate for comparison purposes. Thus, our design considers main effects for attributes and all effects between treatments.
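The counts in this design can be checked with a short enumeration. The sketch below is illustrative only: the treatment labels 1 to 16 are placeholders (the actual main effects treatments are not reproduced here), and the order of the attribute levels in `levels` is an assumption, since only the product $4^4 \times 2^2$ is given.

```python
from itertools import combinations, product

# Four attributes with 4 levels and two with 2 levels (order is an assumption).
levels = [4, 4, 4, 4, 2, 2]
full_factorial = list(product(*(range(n) for n in levels)))

# Placeholder labels for the sixteen treatments of the main effects design.
treatments = list(range(1, 17))

# Every unordered pair of treatments forms one choice set (status quo added later).
choice_sets = list(combinations(treatments, 2))

# Number of choice sets in which each treatment appears.
appearances = {t: sum(t in pair for pair in choice_sets) for t in treatments}

print(len(full_factorial))        # 1024
print(len(choice_sets))           # 120
print(set(appearances.values()))  # {15}
```

Each treatment appears in fifteen sets, once against every other treatment, which is the balance property used later when comparing how often each treatment is chosen or ranked first.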

Statistical Models

We analyzed two data sets: the information provided by the choice experiment (model C) and the information provided by the contingent ranking recoded using only the first rank (model RC). (2) In this manner, we focus on whether people respond differently to the "one you would choose" question and to the "your most preferred" question. For the regression analysis we use the nested logit (NL) model (3) in the main text (reported here), while a supplemental appendix (Caparros, Oviedo, and Campos 2008) presents the random parameter logit model (RPL) and additional NL models.

In the NL we set a reforestation (REF) branch for the two reforestation alternatives and a no reforestation (NREF) branch for the status quo. The latter is known as a degenerate branch (Louviere, Hensher, and Swait 2000, pp. 153-54). A detailed explanation of the NL model can be found in McFadden (1981), and the particular case of the model with one degenerate branch is discussed in Hunt (2000).

We assume a linear-in-parameters utility function that originates from an additively separable linear utility model with a systematic ($V_{ij}$) and a random component ($\varepsilon_{ij}$):

$$U_{ij} = \sum_{k=1}^{K} \beta'_k X_{kj} + \varepsilon_{ij} = V_{ij}(X_{kj}) + \varepsilon_{ij}$$

where $\beta_k$ represents the regression coefficient for the attribute $k$; $X_{kj}$ the value of the attribute $k$ for each possible alternative $j$ in the choice set; and $\varepsilon_{ij}$ the random errors. The notations of the attributes included in the regression (table 2) are respectively BIO, TEC, REC, EMP, SUR, and BID. We do not include an alternative specific constant (ASC) for reforestation alternatives in the NL models in the main text because the ASCs are not significant. The supplemental appendix (Caparros, Oviedo, and Campos 2008) reports the results of the models including ASCs. Thus, the vectors $\beta_k$ and $X_{kj}$ in the NL models reported in the article are

$$\beta'_k = (\beta_{BIO}, \beta_{TEC}, \beta_{REC}, \beta_{EMP}, \beta_{SUR}, \beta_{BID})$$

$$X'_{kj} = (x_{BIOj}, x_{TECj}, x_{RECj}, x_{EMPj}, x_{SURj}, x_{BIDj}).$$

The probability of choosing alternative j in a category r (REF or NREF) is represented as (Blamey et al. 2002)

$$P_{jr} = P(j \mid r)\,P(r) = \frac{\exp(V_{ijr}/\alpha_r)}{\exp(I_r)} \cdot \frac{\exp(\alpha_r I_r)}{\sum_{k=1}^{R} \exp(\alpha_k I_k)} \qquad (1)$$

where

$$I_r = \log\left[\sum_{i=1}^{J_r} \exp(V_{ir}/\alpha_r)\right].$$

$I_r$ represents the inclusive value, which is a measure of the expected maximum utility from the alternatives associated with the $r$th class of alternatives (in the sum defining $I_r$, the index $i$ runs over the $J_r$ alternatives in class $r$); and $\alpha_r$ is the parameter of the inclusive value $I_r$ (Blamey et al. 2002). For the degenerate branch, the inclusive value parameter is fixed to 1 (Louviere, Hensher, and Swait 2000, p. 154).
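As an illustration of equation (1) for the structure used here (a REF branch with the reforestation alternatives and a degenerate NREF branch for the status quo), the following minimal sketch computes the choice probabilities. The function name, argument names, and numerical utilities are hypothetical and are not estimates from the article.

```python
import math

def nested_logit_probs(v_ref, v_nref, alpha_ref):
    """Choice probabilities for a two-branch NL with a degenerate branch.

    v_ref: systematic utilities of the alternatives in the REF branch.
    v_nref: systematic utility of the status quo (degenerate NREF branch).
    alpha_ref: inclusive value parameter of the REF branch.
    """
    # Inclusive value of the REF branch: I_r = log sum_i exp(V_ir / alpha_r).
    i_ref = math.log(sum(math.exp(v / alpha_ref) for v in v_ref))
    # Degenerate branch: alpha is fixed to 1, so I_r equals the status quo utility.
    i_nref = v_nref
    # Denominator of P(r): sum over branches of exp(alpha_k * I_k).
    denom = math.exp(alpha_ref * i_ref) + math.exp(i_nref)
    p_branch_ref = math.exp(alpha_ref * i_ref) / denom
    p_nref = math.exp(i_nref) / denom
    # P(j | r) = exp(V_jr / alpha_r) / exp(I_r), then multiply by P(r).
    p_ref = [math.exp(v / alpha_ref) / math.exp(i_ref) * p_branch_ref
             for v in v_ref]
    return p_ref, p_nref

# Hypothetical utilities for two reforestation alternatives and the status quo.
p_ref, p_nref = nested_logit_probs([0.4, 0.1], 0.0, alpha_ref=0.7)
print(p_ref, p_nref)
```

When `alpha_ref` equals 1 the expression collapses to the multinomial logit probabilities, which gives a quick sanity check on the implementation.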

The quantitative attributes (BIO, EMP, SUR, and BID) were coded introducing their own values and not as categorical variables (BIO was not coded as categorical because the planted species are not specified, except for the always present cork oak). The attribute REC was dummy-coded. The attribute TEC was effect-coded (1 for natural regeneration, -1 for artificial plantation, and 0 for the status quo) to differentiate the effect of choosing any of the two possible techniques from the status quo.

For welfare measures, we calculated a point estimate of the mean WTP for a marginal increase in the level of an attribute (mWTP) by dividing the $\beta$ associated with the attribute ($\beta_k$) by the $\beta$ associated with the payment vehicle ($\beta_{BID}$), with negative sign. We also generated an empirical distribution of this mWTP for each attribute through the Krinsky and Robb (1986) bootstrapping technique with 1,000 replacements. In this case, the mean of the empirical distribution is the mean of the mWTP for increasing the level of the attribute. Both techniques were also applied for two cases of Hicksian surplus (HS) (Choi and Moon 1997). We selected the cases of maximum and minimum possible HS (HSMAX and HSMIN), given the highest and the lowest levels of the attributes for the reforestation alternatives. HSMAX considers four species, natural regeneration, two recreational areas, eighty employees, and 140% of present extent of forest surface. HSMIN considers one species, artificial plantation, no recreational areas, twenty employees, and 90% of present extent of forest surface.

[FIGURE 1 OMITTED]

In the case of the point estimate, we used the Wald procedure (Greene 2007, p. E38-2) for calculating the variance for the mWTP and for the HS. Invoking Cramer's theorem we constructed the 95% confidence interval. In the case of the bootstrapping, we obtained the standard deviation from the empirical distribution and the 95% confidence interval through the percentile approach (Efron and Tibshirani 1993).
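The simulated mWTP distribution and its percentile interval can be sketched as follows. This is a simplified illustration, assuming that the procedure amounts to drawing coefficient pairs from a normal distribution centered on the estimates with the estimated covariance; it uses only two coefficients, and the function name, the coefficient values, and the covariance matrix are hypothetical rather than taken from the article.

```python
import math
import random

def krinsky_robb_mwtp(beta_k, beta_bid, cov, draws=1000, seed=0):
    """Simulate mWTP = -beta_k / beta_bid from a bivariate normal.

    cov is the 2x2 covariance matrix [[var_k, cov_kb], [cov_kb, var_bid]].
    Returns the mean and an approximate 95% percentile interval.
    """
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance matrix, written out by hand.
    l11 = math.sqrt(cov[0][0])
    l21 = cov[0][1] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    sims = []
    for _ in range(draws):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        b_k = beta_k + l11 * z1
        b_bid = beta_bid + l21 * z1 + l22 * z2
        sims.append(-b_k / b_bid)
    sims.sort()
    mean = sum(sims) / draws
    # Percentile approach: read the bounds off the sorted draws.
    lower = sims[int(0.025 * draws)]
    upper = sims[int(0.975 * draws) - 1]
    return mean, lower, upper

# Hypothetical estimates: the point estimate of mWTP is -0.5 / -0.02 = 25.
mean, lower, upper = krinsky_robb_mwtp(
    0.5, -0.02, [[0.0025, 0.0], [0.0, 0.000004]]
)
print(mean, lower, upper)
```

Because mWTP is a ratio of coefficients, the simulated distribution is generally asymmetric, which is one reason a percentile interval can differ from the symmetric interval built on the point estimate.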

In the supplemental appendix (Caparros, Oviedo, and Campos 2008), NL and RPL models including socioeconomic variables are presented (including an ASC where appropriate). The findings reported here in this article remain essentially unchanged.

Tests

A Likelihood Ratio test was used to establish whether the parameter vectors are statistically similar, that is, whether the valuation tasks derive from the same cognitive process. We followed Swait and Louviere's (1993) proposal, also applied in Blamey et al. (2002) and Holmes and Boyle (2001). With this test we are able to distinguish whether differences between parameter vectors are due to differences in taste parameters ($\beta_k$) or due to differences in scale parameters ($\lambda$). The scale parameter is generally unknown and set equal to 1 ($\lambda = 1$), but between two separate data sets it is possible to compute the relative scale parameter.

Swait and Louviere (1993) propose a two-stage test to check the hypothesis $H_1$: $\lambda^{C}\beta^{C} = \lambda^{RC}\beta^{RC}$. First, we test $H_A$: $\beta^{C} = \beta^{RC}$, setting the relative scale parameter as $\lambda^{RC}/\lambda^{C}$. If $H_A$ is rejected, then $H_1$ is also rejected. If $H_A$ is not rejected, then we test $H_B$: $\lambda^{C} = \lambda^{RC}$. If $H_B$ is not rejected, then we cannot reject $H_1$.
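The decision logic of the two-stage test can be sketched from four log-likelihood values: the two separately estimated models, the pooled model with a free relative scale, and the pooled model with equal scales. The sketch below is an assumption-laden illustration: the function and argument names are hypothetical, the log-likelihoods are invented, and the degrees of freedom for the first stage are computed as the number of restrictions (parameters in the two separate models minus parameters in the pooled model with relative scale), which may differ from the convention used in the tables of the article.

```python
# 5% critical values of the chi-square distribution (1 and 6 degrees of freedom).
CHI2_CRIT_5PCT = {1: 3.841, 6: 12.592}

def swait_louviere(ll_c, ll_rc, ll_pooled_scaled, ll_pooled_equal, n_params):
    """Two-stage Likelihood Ratio test on taste and scale parameters.

    ll_c, ll_rc: log-likelihoods of the separately estimated C and RC models.
    ll_pooled_scaled: pooled model with a free relative scale parameter.
    ll_pooled_equal: pooled model with equal scale parameters.
    n_params: number of parameters in each separate model.
    """
    # Stage 1 (H_A): equal taste parameters, relative scale left free.
    lr_a = -2.0 * (ll_pooled_scaled - (ll_c + ll_rc))
    df_a = 2 * n_params - (n_params + 1)  # number of restrictions
    reject_a = lr_a > CHI2_CRIT_5PCT[df_a]
    if reject_a:
        return {"lr_a": lr_a, "reject_HA": True, "reject_H1": True}
    # Stage 2 (H_B): equal scale parameters, given equal taste parameters.
    lr_b = -2.0 * (ll_pooled_equal - ll_pooled_scaled)
    reject_b = lr_b > CHI2_CRIT_5PCT[1]
    return {"lr_a": lr_a, "lr_b": lr_b, "reject_HA": False,
            "reject_HB": reject_b, "reject_H1": reject_b}

# Invented log-likelihoods, with seven parameters per model as an assumption.
result = swait_louviere(-3000.0, -3010.0, -6013.0, -6013.5, n_params=7)
print(result)
```

With these invented values neither stage rejects, which mirrors the pattern of a comparison where the structural models are judged to be the same.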

To complete the comparison of the parameters, we used a simulation to check if parameters of the RC model can recover the information of the C model and vice versa. This was performed using the parameters of the attributes ($\beta_k$) plus an error component, assigned to each individual $i$, randomly drawn from the estimated variance of the error distribution of each attribute $k$ ($\varepsilon_{ik}$). This gives one parameter for the attribute $k$ for each individual $i$ ($\beta_{ik} = \beta_k + \varepsilon_{ik}$). We then calculated the percentage of correct predictions of the choice of an alternative obtained with the C model and compared it with the percentage of correct predictions obtained with the C model that uses its actual $X_{kj}$ and the $\beta_{ik}$ and $\varepsilon_{ik}$ from the RC model (the same was carried out for the RC model using the $\beta_{ik}$ and $\varepsilon_{ik}$ from the C model).

We also tested for the equality of mWTP obtained for each attribute, $H_2$: $(\mathrm{mWTP}_k)^{C} = (\mathrm{mWTP}_k)^{RC}$, and for the two cases of HS previously mentioned, $H_3$: $(\mathrm{HS})^{C} = (\mathrm{HS})^{RC}$. Three tests were carried out: the nonoverlapping confidence interval test, (4) the t-test, and the complete combinatorial test (Poe, Giraud, and Loomis 2005).
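The idea behind the complete combinatorial test can be sketched as follows: given two simulated distributions of a welfare measure, form every pairwise difference and report the share that falls at or below zero. This is a minimal sketch of that idea rather than the authors' implementation; the function name and the toy data are hypothetical, and the sorting shortcut is simply a way to avoid building all pairs explicitly.

```python
from bisect import bisect_left

def complete_combinatorial(x, y):
    """Share of all pairwise differences x_i - y_j that are <= 0."""
    y_sorted = sorted(y)
    n_y = len(y_sorted)
    # For each x_i, count the y_j that are greater than or equal to x_i.
    count = sum(n_y - bisect_left(y_sorted, xi) for xi in x)
    return count / (len(x) * n_y)

# Toy distributions standing in for simulated mWTP values from two models.
share = complete_combinatorial([1, 2, 3], [2, 3, 4])
# A two-sided version doubles the smaller tail share.
two_sided = 2 * min(share, 1 - share)
print(share, two_sided)
```

For the toy data, eight of the nine differences are at or below zero, so the share is 8/9.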

Results

Refusals to answer the face-to-face survey were low and very similar, representing 6% of total attempts in both exercises. As we obtained eight observations per respondent and each survey format was completed by 450 respondents, the number of observations obtained with each survey was 3,600. After removing invalid responses, we have 3,600 useable observations for the choice experiment (there were no invalid responses in this experiment) and 3,594 for the contingent ranking.

Comparison

In table 3 we compare the most important socioeconomic characteristics of the choice and ranking subsamples. We also include two characteristics related to the attitude of respondents when answering the questionnaire (attitude and understanding). In all cases we cannot reject the null hypothesis that the characteristics of choice and ranking respondents are the same.

As we have used an experimental design where each possible treatment is compared with the remaining treatments the same number of times, we can compare the number of times one treatment was chosen or ranked first without worrying about the treatments it was compared with. Through a chi-square test we examine whether the proportion of respondents who chose or ranked first a given treatment is statistically different between choice and ranking surveys. The results show that in fifteen out of the seventeen possible cases we cannot reject the null hypothesis that the percentage of times a treatment is chosen/ranked first is statistically similar (at the 5% level). The supplemental appendix (Caparros, Oviedo, and Campos 2008) reports the detailed results of these tests. That is, respondents select the same alternative when they have to choose and when they have to rank an alternative first.
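A comparison of this kind can be illustrated with a Pearson chi-square statistic on a 2x2 table (survey format by whether the treatment was selected). The sketch below is only an assumption about the form of the test; the counts are hypothetical and the exact test specification used in the article is reported in its appendix.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table.

    Rows: survey format (choice, ranking).
    Columns: treatment selected, treatment not selected.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom

# 5% critical value of the chi-square distribution with 1 degree of freedom.
CRIT_5PCT_DF1 = 3.841

# Hypothetical counts: times a treatment was selected or not in each format.
stat = chi_square_2x2(60, 390, 52, 398)
print(stat, stat > CRIT_5PCT_DF1)
```

A statistic below the critical value means the null hypothesis of equal proportions is not rejected at the 5% level.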

Table 4 shows the regression results of the C and RC models; as can be seen, there are no significant differences between them. All parameters have the expected sign and are significant at the 1% level. In both cases, BIO has the largest value of the part-worth utility (beta parameter), followed by TEC in the C model and by REC in the RC model.

The Likelihood Ratio test (table 4) is consistent with the hypothesis that C and RC models derive from the same cognitive process. Neither $H_A$ nor $H_B$ is rejected and, consequently, $H_1$ is not rejected. The structural models are the same for the "one you would choose" question and for the "your most preferred" question. This is the most important result of our analysis, because no previous comparison has found this seemingly obvious result.

The percentages of correct predictions of the C and RC models are high in both cases (67% and 66%, respectively). When using parameters of the RC model to predict the choices in the C model, we obtain a similar percentage of correct predictions (67%). Similarly, we obtain the same percentage of correct predictions when we simulate the first rank of the RC model using the parameters of the C model (66%). Thus, the predictive power of one model is recovered with the parameters of the alternative one.

Table 5 shows the mean and confidence intervals (95%) of parametric and bootstrapped estimates of mWTP and HS from C and RC models and the p-values of the equality tests. For parametric and bootstrapping estimations, confidence intervals overlap in all cases (at the 5% level) and only EMP and HSMIN diverge in the t-test. The complete combinatorial test shows that EMP and HSMIN are statistically different again and also BIO and HSMAX, but the latter ones only at the 10% level. (5)

If we look at the efficiency of estimations, the C model offers the smallest relative errors in most cases, providing more efficient estimations of welfare measures. This, together with the fact that the C model yields lower estimations, (6) can be seen as an argument in favor of using choice experiments instead of ranking. Nevertheless, this argument is rather weak because the differences are not significant.

If we compare the efficiency of welfare measure estimations within each model rather than between models, the mWTP associated with the attribute REC offers the largest relative errors in all cases. This could be due to disagreement among respondents about the appropriateness of increasing recreational areas in ANP. On the other hand, both the BIO attribute and HSMAX have the smallest relative errors in both models. In this sense, the C and RC models also converge in the efficiency of welfare measures.

Testing Effects

To detect the existence of effects influencing results that could lead to differences between a choice and a recoded ranking, we compare subsamples that isolate respondents possibly affected by those effects. These results are only discussed qualitatively; statistical details can be found in the supplemental appendix (Caparros, Oviedo, and Campos 2008).

The first analysis uses subsamples formed with the first four sets of alternatives answered by each respondent (out of the eight presented in each questionnaire) to check for a "learning" effect. The opposite, a "fatigue" effect, is tested using the last four sets of alternatives answered by each respondent. The models of these subsamples do not add anything new to the findings of the base models, suggesting that these effects are not present. The Likelihood Ratio tests show no significant differences, as is also the case for the comparison between welfare measures. Hanley, Wright, and Koop (2002) also find no evidence of "learning" and "fatigue" in choice experiments.

On the other hand, the surveys included four follow-up statements, made after the valuation exercise, which enabled us to check for the presence of four potential effects. The respondents were asked to rate the following statements from 1 (totally disagree) to 5 (totally agree): (a) "I correctly understood the information provided in the previous choices/rankings;" (b) "I had difficulties in stating my answers in the previous choices/rankings;" (c) "The number of choices/rankings that I faced has been excessive;" and (d) "I thought more about my answers of the first four choices/rankings than about the last four choices/rankings." The effects tested will be called, respectively, "information," "difficulty," "sets of alternatives," and "response effort" effects. The comparison of the scores to the follow-ups shows that we cannot reject the hypothesis of statistically similar scores (Caparros, Oviedo, and Campos 2008).

Using the scores, we created subsamples for the choice and the recoded ranking data corresponding to each follow-up. Then, we compared the subsamples made from each follow-up to test if the results of the comparison were different from the results of the comparison made with the full samples.

The regressions made with the subsamples corresponding to each follow-up show that all attributes are significant at the 1% level, as in the base models. The sole exception is that in the recoded ranking model that tries to capture the "information" effect, the attribute REC is significant at the 10% and not at the 1% level (Caparros, Oviedo, and Campos 2008).

On the other hand, the Likelihood Ratio test for the models hypothetically affected by the "difficulty" effect indicates that the scale parameter is statistically different, because $H_B$ is rejected (Caparros, Oviedo, and Campos 2008). The value of the relative scale parameter between these models ($\lambda^{RC}/\lambda^{C}$) is 0.835, implying that the recoded ranking subsample has a lower scale parameter and consequently a higher error variance. Thus, the first ranking implies a more difficult cognitive process than the choice for those who found the task difficult. The comparison between welfare measures in these models shows that there is no evidence of significant differences because the scale parameter (the factor causing significant differences in the Likelihood Ratio test) is cancelled in the calculations for welfare measures (Blamey et al. 2002, pp. 174-75). For the remaining models made from the follow-ups, the results of the comparison between parameters and between welfare measures are similar to those obtained in the comparison of the full samples.
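The two steps of this argument can be written out explicitly. The worked equations below are a sketch under the standard logit assumption that the error variance is inversely proportional to the squared scale parameter; the rounded value 1.43 is computed here from the reported 0.835 and does not appear in the original results.

```latex
% Error variance under a logit-type error with scale parameter \lambda
\operatorname{Var}(\varepsilon) = \frac{\pi^2}{6\lambda^2}
\quad\Longrightarrow\quad
\frac{\operatorname{Var}(\varepsilon^{RC})}{\operatorname{Var}(\varepsilon^{C})}
= \left(\frac{\lambda^{C}}{\lambda^{RC}}\right)^{2}
= \frac{1}{0.835^{2}} \approx 1.43

% The scale parameter cancels in the marginal WTP ratio
\mathrm{mWTP}_k = -\frac{\lambda\,\beta_k}{\lambda\,\beta_{BID}} = -\frac{\beta_k}{\beta_{BID}}
```

Under this assumption, the recoded ranking subsample has an error variance roughly 43% larger than the choice subsample, while the welfare measures are unaffected because the scale enters the numerator and the denominator equally.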

Conclusions

Although previous literature has shown that a ranking exercise recoded and analyzed as a choice using only the preferred option is different from a choice task, our results provide the first case suggesting that, when differences are eliminated from the design of the experiment, there is no difference in the cognitive process. We also found that response rates and follow-up analysis did not show significant differences between formats, except in the case of the subsample of respondents that found the task difficult. For these subsamples the difference does not reside in the taste parameters but in the scale parameters, and the relative scale parameter shows that ranking has a higher error variance. None of the other effects studied seem to have any significant impact on the estimations.

Concerning welfare measures, results also show that they are not statistically distinguishable (per attribute as well as for aggregated welfare) in most cases. This conclusion holds both for parametric and bootstrapping tests. However, most of the estimations are more efficient and lower in the choice experiment. This could be used as an argument in favor of this format. Nevertheless, this argument is rather weak because the differences from a recoded ranking are not statistically significant.

Overall, our results suggest that doing a ranking experiment, but designing the survey as if it were a choice, may be a safe practice even if the researcher wants to focus only on the first rank/choice and analyze it using choice-based methods. The question of whether it is convenient to use the subsequent ranks in the analysis has been studied extensively elsewhere (Foster and Mourato 2002; Bateman et al. 2007) and goes beyond the scope of this article. Nevertheless, the most important take-home message is that people appear rational enough, and appear to take the task seriously enough, to ensure that they choose their preferred option.

[Received October 2006; accepted December 2007.]

References

Adamowicz, W., J. Louviere, and M. Williams. 1994. "Combining Revealed and Stated Preference Methods for Valuing Environmental Amenities." Journal of Environmental Economics and Management 26(3):271-92.

Anderson, J.L., and S.U. Bettencourt. 1993. "A Conjoint Approach to Model Product Preferences: The New England Market for Fresh and Frozen Salmon." Marine Resource Economics 8(1):31-49.

Bateman, I., B. Day, G. Loomes, and R. Sugden. 2007. "Can Ranking Techniques Elicit Robust Values?" Journal of Risk and Uncertainty 34(1):49-66.

Beggs, S., S. Cardell, and J. Hausman. 1981. "Assessing the Potential Demand for Electric Cars." Journal of Econometrics 17(1):1-19.

Ben-Akiva, M., T. Morikawa, and F. Shiroishi. 1991. "Analysis of the Reliability of Preference Rank Data." Journal of Business Research 23(3):253-68.

Blamey, R.K., J.W. Bennett, J.J. Louviere, M.D. Morrison, and J.C. Rolfe. 2002. "Attribute Causality in Environmental Choice Modelling." Environmental and Resource Economics 23(2):167-86.

Boyle, K.J., T.P. Holmes, M.F. Teisl, and B. Roe. 2001. "A Comparison of Conjoint Analysis Response Formats." American Journal of Agricultural Economics 83(2):441-54.

Campos, P., A. Caparros, and J.L. Oviedo. 2007. "Comparing Payment-Vehicle Effects in Contingent Valuation Studies for Recreational Use in Two Spanish Protected Forests." Journal of Leisure Research 39(1):60-85.

Caparros, A., J.L. Oviedo, and P. Campos. 2008. "AJAE Appendix: Would You Choose Your Preferred Option? Comparing Choice and Recoded Ranking Experiments." Unpublished manuscript. Available at http://agecon.lib.umn.edu/.

Caparros, A., E. Cerda, P. Ovando, and P. Campos. 2007. "Carbon Sequestration with Reforestations and Biodiversity-Scenic Values." FEEM Working Paper 28.2007, Milan.

Chapman, R.G., and R. Staelin. 1982. "Exploring Rank Ordered Choice Set Data Within the Stochastic Utility Model." Journal of Marketing Research 19(3):288-301.

Choi, K., and C. Moon. 1997. "Generalized Extreme Value Model and Additively Separable Generator Function." Journal of Econometrics 76(1-2):129-40.

Efron, B., and R.J. Tibshirani. 1993. An Introduction to the Bootstrap. New York: Chapman & Hall.

Foster, V., and S. Mourato. 2002. "Testing for Consistency in Contingent Ranking Experiments." Journal of Environmental Economics and Management 44(2):309-28.

Greene, W. 2007. Limdep Version 9.0. Econometric Modeling Guide Volume 2. New York: Econometric Software.

Hanley, N., R. Wright, and G. Koop. 2002. "Modelling Recreation Demand Using Choice Experiments: Climbing in Scotland." Environmental and Resource Economics 22(3):449-66.

Hausman, J.A., and P.A. Ruud. 1987. "Specifying and Testing Econometric Models for Rank-Ordered Data." Journal of Econometrics 34(1-2):83-104.

Herriges, J., and C. Kling. 1996. "Testing the Consistency of Nested Logit Models with Utility Maximization." Economics Letters 50(1):33-39.

Holmes, T.P., and K.J. Boyle. 2001. "Cross Validation of Conjoint Ranking and Choice Data: An Application to Timber Harvesting Preferences." Paper presented at EAERE 11th Annual Conference, Southampton UK, 28-30 June.

Hunt, G.L. 2000. "Alternative Nested Logit Model Structures and the Special Case of Partial Degeneracy." Journal of Regional Science 40(1):89-113.

Krinsky, I., and A.L. Robb. 1986. "On Approximating the Statistical Properties of Elasticities." Review of Economics and Statistics 68(4):715-19.

Louviere, J.J. 1988. "Conjoint Analysis Modeling of Stated Preferences. A Review of Theory, Methods, Recent Developments and External Validity." Journal of Transport Economics and Policy 22:93-119.

Louviere, J.J., D.A. Hensher, and J.D. Swait. 2000. Stated Choice Methods. Analysis and Application. Cambridge: Cambridge University Press.

Mackenzie, J. 1993. "A Comparison of Contingent Preference Models." American Journal of Agricultural Economics 75(3):593-603.

McFadden, D. 1981. "Econometric Models of Probabilistic Choice." In C. Manski and D. McFadden, eds. Structural Analysis of Discrete Data with Econometric Applications. Cambridge, MA.: MIT Press, pp. 198-272.

Mogas, J., and P. Riera. 2001. "Comparacion de la Ordenacion Contingente y del Experimento de Eleccion en la Valoracion de las Funciones no Privadas de los Bosques." Economia Agraria y Recursos Naturales 1(2):125-47.

Morrison, M.D., and K.J. Boyle. 2001. "Comparative Reliability of Rank and Choice Data in Stated Preference Models." Paper presented at EAERE 11th Annual Conference, Southampton UK, 28-30 June.

Poe, G.L., K.L. Giraud, and J.B. Loomis. 2005. "Computational Methods for Measuring the Difference of Empirical Distributions." American Journal of Agricultural Economics 87(2):353-65.

Roe, B., K.J. Boyle, and M.F. Teisl. 1996. "Using Conjoint Analysis to Derive Estimates of Compensating Variation." Journal of Environmental Economics and Management 31(2):145-59.

Siikamaki, J., and D.F. Layton. 2007. "Discrete Choice Survey Experiments: A Comparison Using Flexible Methods." Journal of Environmental Economics and Management 53(1):122-39.

Stevens, T.H., C. Barret, and C.E. Willis. 1997. "Conjoint Analysis of Groundwater Protection Programs." Agricultural and Resource Economics Review 26(2):229-36.

Swait, J., and J.J. Louviere. 1993. "The Role of the Scale Parameter in the Estimation and Comparison of Multinomial Logit Models." Journal of Marketing Research 30(3):305-14.

(1) In addition to the focus group and the pretest, interviews with experts from the INIA (National Institute of Alimentary and Agrarian Technology Research) and with the ANP Director were held.

(2) From now on, we use C and RC to refer to the variables and measures corresponding to the choice and to the recoded ranking model, respectively.

(3) The Independence of Irrelevant Alternatives (IIA) assumption was violated when using the conditional logit.

(4) For the nonoverlapping confidence interval test, we report as the p-value the lowest significance level α at which the (1 - α) confidence intervals of the two measures do not overlap.

(5) We also estimated a rank-ordered logit using the respondents' full ranking, following the method proposed in Beggs, Cardell, and Hausman (1981). The welfare measures of this model are statistically different from those of the C model.

(6) Lower WTP values tend to be favored in applications because conservative estimates are usually preferred.
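
The criterion in footnote (4) can be illustrated with a short Python sketch. It assumes that the parametric intervals in table 5 are symmetric normal intervals, so that a standard error can be backed out from each 95% interval. That assumption and the helper names `se_from_ci`, `nonoverlap_p`, and `ttest_p` are illustrative and are not taken from the article. Under this assumption, the BIO row yields values close to the 0.350 and 0.191 reported in table 5.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.975)  # two-sided 95% critical value, about 1.96


def se_from_ci(lower, upper, z=Z95):
    """Back out a standard error from a symmetric 95% interval (assumption)."""
    return (upper - lower) / (2.0 * z)


def nonoverlap_p(mean_a, se_a, mean_b, se_b):
    """Lowest alpha at which the two (1 - alpha) intervals stop overlapping.

    The intervals mean +/- z * se just touch when
    z = |mean_a - mean_b| / (se_a + se_b).
    """
    z = abs(mean_a - mean_b) / (se_a + se_b)
    return 2.0 * (1.0 - STD_NORMAL.cdf(z))


def ttest_p(mean_a, se_a, mean_b, se_b):
    """Two-sided p-value for equal means (normal approximation, independent samples)."""
    z = abs(mean_a - mean_b) / (se_a ** 2 + se_b ** 2) ** 0.5
    return 2.0 * (1.0 - STD_NORMAL.cdf(z))


# BIO row of the parametric panel in table 5: mean and 95% interval per model.
se_c = se_from_ci(14.21, 22.21)    # choice model (C)
se_rc = se_from_ci(17.20, 28.44)   # recoded ranking model (RC)
p_no = nonoverlap_p(18.21, se_c, 22.82, se_rc)
p_t = ttest_p(18.21, se_c, 22.82, se_rc)
print(f"nonoverlapping p = {p_no:.3f}, t-test p = {p_t:.3f}")
```

Because se_a + se_b is never smaller than the square root of se_a^2 + se_b^2, the nonoverlapping criterion always gives a p-value at least as large as the t-test, which is why it is the more conservative of the two columns.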

Alejandro Caparros, Jose L. Oviedo, and Pablo Campos are associate research professor (Investigador Cientifico), postdoctoral researcher, and full research professor (Profesor de Investigacion), respectively, in the Institute for Public Goods and Policies (IPP), Spanish Council for Scientific Research (CSIC).

Alejandro Caparros and Jose L. Oviedo share the first authorship of the article. We thank three anonymous referees and especially Stephen Swallow (journal co-editor) and Lynn Huntsinger for their helpful comments and suggestions. We would also like to thank participants at the following conferences: WCERE 2006 (Japan), TIES 2006 (Sweden), AERNA 2006 (Spain), and IX EEA (Spain). The usual disclaimer applies. We gratefully acknowledge funding provided by the European Commission (project MEDMONT-QLRT-1999-31031), the Consejeria de Medio Ambiente (Junta de Andalucia), and the National Institute of Alimentary and Agrarian Technology Research (INIA).

Table 1. Previous Comparisons Between Independent Samples of Choice and Recoded Rankings Applied to Environmental Valuation

Authors                     Comparison                                    Experimental design
Boyle et al. (2001)         Rating, ranking, choice and recoded ranking   Random (not all included status quo)
Mogas and Riera (2001)      Ranking, choice and recoded ranking           Different (status quo not included in ranking)
Holmes and Boyle (2001)     Ranking, choice and recoded ranking           Random (not all included status quo)
Morrison and Boyle (2001)   Ranking, choice and recoded ranking           Not available

Authors                     Alternatives                            Sample
Boyle et al. (2001)         Four for all                            Rating: 287; Ranking: 214; Choice: 278
Mogas and Riera (2001)      Three for choice and four for ranking   Ranking: 626 (a); Choice: 1,140
Holmes and Boyle (2001)     Four for all                            Ranking: 212; Choice: 278
Morrison and Boyle (2001)   Three for choice and four for ranking   Ranking: 268 (b); Choice: 297

Results (Choice Versus Recoded Ranking)

Authors                     Parameters                                                               Welfare measures
Boyle et al. (2001)         Statistically significant differences (scale parameter not considered)   No comparison test
Mogas and Riera (2001)      No comparison test                                                       Statistically significant differences
Holmes and Boyle (2001)     Statistically significant differences                                    No comparison test
Morrison and Boyle (2001)   Statistically significant differences                                    Statistically significant differences

(a) The total number of observations for the ranking exercise was 626 and for the choice experiment 4,576. (b) The total number of observations for the ranking exercise was 1,905 and for the choice experiment 2,068. (c) In this case, a sub-sample analysis found convergent validity parameters when including exclusively respondents who stated that the valuation task was easy.

Table 2. Attributes of the Experiment and Levels

Attributes                                                    Levels
Biodiversity (a) (BIO)                                        1; 2; 3; 4
Technique used (TEC)                                          Natural regeneration; artificial plantation
Number of new recreational areas (REC)                        0; 2
Additional employees (equivalent permanent employees) (EMP)   20; 40; 60; 80
Forest surface area conserved (SUR)                           90% of present extent (10% reduction); 100% of present extent (same surface); 120% of present extent (20% increase); 140% of present extent (40% increase)
Increase in taxes for this year (BID)                         6 €; 12 €; 24 €; 48 €

Note: the status quo levels were: no trees, no technique, no additional recreational areas, no employees, 80% of the current forest surface conserved (20% reduction) and no additional taxes. (a) Number of native tree species used, always including cork oaks.

Table 3. Socioeconomic and Attitudinal Characteristics of the Subsamples

                                                                                     Choice sample       Ranking sample
Variables                                                                            Mean          N     Mean          N     t-statistic (a)
Age                                                                                  33 (9)        449   34 (10)       446   -0.085
Family income (€ per month)                                                          1,676 (745)   434   1,615 (790)   427   0.055
Trip cost per person (€ per day)                                                     19 (20)       450   19 (22)       450   -0.024
Gender (1 = female; 0 = male)                                                        0.32 (0.47)   450   0.36 (0.48)   450   -0.053
Education (1 = college degree; 0 = otherwise)                                        0.45 (0.3)    450   0.39 (0.2)    450   0.089
Cadiz (1 = respondent from Cadiz province; 0 = otherwise) (a)                        0.78 (0.42)   450   0.78 (0.42)   450   0.000
Reasons for the visit (1 = active tourism; 0 = otherwise)                            0.31 (0.46)   450   0.31 (0.46)   450   -0.003
Substitutive (1 = respondent knows substitute for the visited area; 0 = otherwise)   0.56 (0.50)   450   0.58 (0.49)   450   -0.029
Attitude (1 = poor; 0 = good) (b)                                                    0.11 (0.31)   446   0.10 (0.31)   444   0.004
Understanding (1 = poor; 0 = good)                                                   0.04 (0.20)   450   0.05 (0.23)   447   -0.038

Standard errors are shown in brackets. N is the number of observations. (a) Critical t-statistic at the 5% level = 1.965. (b) Information provided by the interviewers.

Table 4. Choice and Recoded Ranking Nested Logit Models

Attribute parameters    Choice model            Recoded ranking model
BIO                     0.4543 *** (0.0281)     0.4197 *** (0.0255)
TEC                     0.4371 *** (0.0401)     0.3070 *** (0.0345)
REC                     0.3909 *** (0.0677)     0.4071 *** (0.0624)
EMP                     0.0155 *** (0.0014)     0.0167 *** (0.0013)
SUR                     0.0224 *** (0.0017)     0.0192 *** (0.0016)
BID                     -0.0249 *** (0.0028)    -0.0184 *** (0.0023)
IV (α_REF) (a)          1.4385 *** (0.0752)     1.3050 *** (0.0644)
N                       3,600                   3,594
LogL(β)                 -2,616.876              -2,656.531
LogL(0)                 -4,906.096              -4,891.350
ρ²                      0.467                   0.457

Likelihood ratio tests (b), C vs. RC
H_A: β^C = β^RC                              χ² = 8.428
H_B: λ^C = λ^RC                              χ² = 0.746
Reject H_1: β^C = β^RC and λ^C = λ^RC?       No

C is the choice model, RC is the recoded ranking model. Standard errors are shown in brackets. N is the number of observations. IV (α_REF) is the inclusive value parameter of the REF branch. Asterisks denote significance at the 1% level. (a) Although IV (α_REF) > 1, the Herriges and Kling (1996) condition for local utility maximization is fulfilled. (b) For the hypothesis H_A, the χ² statistic for 8 degrees of freedom at the 5% level is 15.507. For the hypothesis H_B, the χ² statistic for 1 degree of freedom at the 5% level is 3.841.

Table 5. Welfare Measures from Choice and Recoded Ranking Nested Logit Models

Parametric

Attributes  C: mean [95% CI]               RC: mean [95% CI]              Nonoverlapping p-value   t-test p-value
BIO         18.21 *** [14.21, 22.21]       22.82 *** [17.20, 28.44]       0.350                    0.191
TEC         17.52 *** [13.55, 21.50]       16.69 *** [11.90, 21.49]       0.853                    0.794
REC         15.67 *** [9.51, 21.83]        22.14 *** [14.04, 30.24]       0.374                    0.212
EMP         0.62 *** [0.46, 0.78]          0.91 *** [0.67, 1.14]          0.153                    0.044 **
SUR         0.90 *** [0.69, 1.10]          1.05 *** [0.77, 1.33]          0.555                    0.400
HSMIN       22.09 *** [16.18, 28.03]       34.83 *** [25.57, 43.87]       0.101                    0.023 **
HSMAX       209.63 *** [169.87, 249.40]    265.91 *** [206.97, 323.85]    0.265                    0.122

Bootstrapping

Attributes  C: mean [95% CI]               RC: mean [95% CI]              Nonoverlapping p-value   t-test p-value   Complete combinatorial p-value
BIO         18.50 *** [14.88, 22.29]       23.31 *** [18.35, 30.18]       0.340                    0.218            0.090 *
TEC         17.79 *** [14.20, 22.51]       17.05 *** [12.81, 22.88]       0.866                    0.826            0.399
REC         15.74 *** [9.87, 22.41]        22.37 *** [15.01, 31.84]       0.381                    0.235            0.105
EMP         0.63 *** [0.49, 0.81]          0.92 *** [0.72, 1.20]          0.163                    0.067 *          0.021 **
SUR         0.91 *** [0.72, 1.17]          1.07 *** [0.82, 1.45]          0.555                    0.424            0.201
HSMIN       22.44 *** [17.26, 29.73]       35.45 *** [27.31, 47.02]       0.094 *                  0.036 **         0.010 ***
HSMAX       212.72 *** [177.95, 264.96]    270.84 *** [219.84, 351.60]    0.271                    0.161            0.059 *

Note: Table reports parametric and bootstrapping measures and the corresponding tests of the equality of means. C is the choice model. RC is the ranking recoded as choice model. Lower and upper bounds of the 95% confidence interval are shown in brackets. Single, double, and triple asterisks denote significance at the 10%, 5%, and 1% level, respectively.
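
As a reading aid for tables 4 and 5, the Python sketch below recomputes two kinds of numbers from the published values. It assumes the usual ratio formula for marginal willingness to pay, WTP_k = -beta_k / beta_BID, which the table 5 means appear to be consistent with up to rounding of the coefficients. It also evaluates chi-square tail probabilities for the two likelihood ratio statistics of table 4, using closed forms that hold only for 1 or an even number of degrees of freedom. The function names `marginal_wtp` and `chi2_sf` are hypothetical.

```python
import math

# Point estimates copied from table 4 (C = choice, RC = recoded ranking).
COEF = {
    "C": {"BIO": 0.4543, "TEC": 0.4371, "REC": 0.3909,
          "EMP": 0.0155, "SUR": 0.0224, "BID": -0.0249},
    "RC": {"BIO": 0.4197, "TEC": 0.3070, "REC": 0.4071,
           "EMP": 0.0167, "SUR": 0.0192, "BID": -0.0184},
}


def marginal_wtp(coef):
    """Assumed formula: WTP_k = -beta_k / beta_BID, in euros per unit of attribute k."""
    return {name: -beta / coef["BID"]
            for name, beta in coef.items() if name != "BID"}


def chi2_sf(x, df):
    """Upper-tail chi-square probability for df = 1 or an even df."""
    if df == 1:
        # P(Z^2 > x) for a standard normal Z.
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        half = x / 2.0
        series = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * series
    raise ValueError("this sketch only handles df = 1 or an even df")


wtp = {model: marginal_wtp(coef) for model, coef in COEF.items()}
for name in ("BIO", "TEC", "REC", "EMP", "SUR"):
    print(f"{name}: C = {wtp['C'][name]:.2f}, RC = {wtp['RC'][name]:.2f}")

# Likelihood ratio statistics and degrees of freedom as reported in table 4.
p_ha = chi2_sf(8.428, 8)   # H_A: equal taste parameters
p_hb = chi2_sf(0.746, 1)   # H_B: equal scale parameters
print(f"p(H_A) = {p_ha:.3f}, p(H_B) = {p_hb:.3f}")
```

Both tail probabilities come out well above 0.05, in line with the table's conclusion that neither hypothesis is rejected. The confidence intervals in table 5 cannot be reproduced this way, since they would also require the covariance matrix of the estimates, which is not reported here.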


COPYRIGHT 2008 American Agricultural Economics Association Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.


