Experiments and quasi-experiments: methods for evaluating marketing options; hospitality managers could achieve greater success


True experiments with random assignment to treatments are sometimes impossible or impractical. In this case, one can conduct a quasi-experiment, a procedure that has the following characteristics: (1) at least one manipulated treatment group and one comparison group, (15) (2) at least one outcome measure, and (3) nonrandom assignment of subjects to treatments. For example, a restaurant-chain executive may want to test the effects on sales of a proposed renovation of the chain's restaurants. Randomly assigning units within the chain to renovation and nonrenovation treatment groups would not be practical because renovating enough restaurants to make such random assignment meaningful would be too costly. In such a case, one could use a quasi-experimental design to test the redesign options. (16) For example, the executive could (1) identify a pair of units that are well matched on relevant characteristics such as customer demographics and sales, (2) renovate one unit in the pair, and (3) compare the sales achieved by each unit. To the extent that the matched pair is similar on the characteristics that are most likely to affect the outcome variables, this quasi-experimental design provides a reasonable basis for conclusions about the effects of the renovation without the costs of renovating many units.

Although there are no theoretical limits to the size and complexity of experiments and quasi-experiments, practical considerations such as cost and the availability of suitable subjects generally restrict such studies to only a few conditions. One prominent experimental-marketing researcher reports that the average experiment he does has about three levels per variable studied. (17) Direct-mail experiments often involve as many as 12 different treatments, but direct-mail experiments are typically less expensive than others, so this represents the high end of practicable experiment size. Thus, experiments and quasi-experiments are primarily useful in selecting from among a relatively narrow range of options. (18)

Issues in Designing Experiments and Interpreting Their Results

The important issues that arise when designing marketing experiments and interpreting their results involve three types of validity: statistical-inference validity, internal validity, and external validity. (19) These three types of validity refer to the causal conclusions derived from an experiment. (20) Such a conclusion has statistical-inference validity if the experimenter can rule out chance as an explanation for the absence or existence of differences between treatment groups. Such a conclusion has internal validity if the experimenter can rule out nonchance causes other than the intended treatments as a source of differences between treatment groups. Finally, such a conclusion has external validity if it can be generalized beyond the experimental sample and context. The requirements for each of these types of validity are briefly discussed below, along with the implications for marketers who are designing an experiment or interpreting its results.

Statistical-inference Validity

A marketer has established statistical-inference validity if he or she has ensured that chance or randomness does not explain the observed differences among the treatment groups. Statistical-inference validity is threatened by random (or chance) variations in the outcome variable of an experiment. Such variation can lead to one of two fundamental errors when interpreting the experiment's results. First, chance can increase differences among treatment groups and lead experimenters to conclude that the treatments had an effect when they really did not. This is known as "Type-1 error." Second, chance can decrease differences among treatment groups and lead experimenters to conclude that the treatments had no effect when they really did. This is known as "Type-2 error." Marketers can reduce these two threats to statistical-inference validity by selecting appropriate acceptable alpha levels, obtaining sufficient sample sizes, and reducing within-treatment-group variability. Each of these methods of reducing statistical error is described below.

Appropriate acceptable alpha levels. The alpha level for a study is the probability of making a Type-1 error. The actual alpha is reported on the output of statistical-analysis programs, and is sometimes referred to as the "p value." Marketers decide what probability of making a Type-1 error is acceptable and conclude that observed differences between treatments reflect real (non-chance) effects only when appropriate statistical tests indicate that the probability of making a Type-1 error is tolerable. The conventionally accepted alpha level is p ≤ .05, meaning that the experimenter is willing to take no more than a 5 percent chance of accepting an observed effect as real when it is not. The reason for accepting some nonzero probability of making a Type-1 error is that lowering this probability increases the probability of making a Type-2 error. Thus, marketers must weigh the relative consequences of making a Type-1 or a Type-2 error when deciding on the acceptable alpha level for a study.

Keep in mind also that the probability of making a Type-1 error increases with the number of comparisons being made between treatment groups. Assuming that an experiment's treatments have no real effect, the probability of making at least one Type-1 error is 5 percent when making one comparison between two treatments, but it may be as high as 50 percent when using the same alpha level to make 10 different comparisons among multiple treatments. Thus, marketers making separate comparisons among multiple treatment groups may want to select a more-stringent acceptable alpha level than would those making a single comparison, or else choose a statistical analysis that helps control the error rate.
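The growth of the Type-1 error rate with the number of comparisons can be checked with a few lines of arithmetic. The following Python sketch assumes that the comparisons are statistically independent, in which case the chance of at least one false positive across m comparisons is 1 - (1 - alpha)^m; the simpler product m x alpha is an upper bound on that chance, and it is that bound that yields 50 percent for 10 comparisons. The function names are illustrative, and the Bonferroni adjustment shown is one common way (not the only way) of choosing a more-stringent alpha level.

```python
def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Chance of at least one Type-1 error across independent comparisons,
    assuming no treatment has any real effect."""
    return 1.0 - (1.0 - alpha) ** comparisons


def bonferroni_alpha(alpha: float, comparisons: int) -> float:
    """Per-comparison alpha that keeps the overall Type-1 error rate
    at or below the desired alpha (Bonferroni adjustment)."""
    return alpha / comparisons


if __name__ == "__main__":
    for m in (1, 5, 10):
        exact = familywise_error_rate(0.05, m)
        bound = min(1.0, 0.05 * m)  # simple upper bound on the same quantity
        print(f"{m:2d} comparisons: error rate {exact:.3f} (upper bound {bound:.2f})")

    adjusted = bonferroni_alpha(0.05, 10)
    print(f"adjusted per-comparison alpha for 10 comparisons: {adjusted:.3f}")
```

Under the independence assumption, ten comparisons at the .05 level carry roughly a 40 percent chance of at least one Type-1 error; testing each comparison at the adjusted level brings the overall chance back under 5 percent.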

Sufficient sample sizes. Large samples are less susceptible to the vagaries of chance than are small samples. Thus, marketers can reduce the probability of making a Type-2 error (using a given alpha level) by increasing the sample size. However, this does not mean that marketers always need to use large samples. If real treatment effects are large or alpha levels are high, then Type-2 errors could be rare even with small samples. Since large samples are expensive to obtain, marketers should make sure they are needed before using them.
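The relationship among sample size, effect size, alpha level, and Type-2 error can be illustrated with a rough calculation. The following Python sketch uses a normal approximation for a two-sided comparison of the means of two equal-sized treatment groups with a common standard deviation. The function name and the dollar figures (a $2 or $10 difference in average check size, a $5 standard deviation) are hypothetical values chosen for illustration only.

```python
from statistics import NormalDist


def power_two_groups(n_per_group: int, effect: float, sd: float,
                     alpha: float = 0.05) -> float:
    """Approximate probability of detecting a real difference of `effect`
    between two group means (that is, 1 minus the Type-2 error probability).

    Normal approximation; ignores the negligible chance of a significant
    result in the wrong direction.
    """
    z = NormalDist()
    standard_error = sd * (2.0 / n_per_group) ** 0.5
    z_critical = z.inv_cdf(1.0 - alpha / 2.0)
    return z.cdf(effect / standard_error - z_critical)


if __name__ == "__main__":
    # Small effect: power climbs slowly as the sample grows.
    for n in (20, 50, 100):
        print(f"effect $2, n={n:3d} per group: power {power_two_groups(n, 2.0, 5.0):.2f}")
    # Large effect: even a small sample rarely misses it.
    print(f"effect $10, n= 10 per group: power {power_two_groups(10, 10.0, 5.0):.2f}")
```

With these assumed numbers, a small effect needs on the order of 100 subjects per group to be detected reliably, whereas a large effect is detected almost every time with only 10 per group.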

To save money, marketers should determine the sample size needed to keep the probabilities of Type-1 and Type-2 errors at desired levels. These calculations can be done by hand or on one of several available computer programs. (21) To calculate sample size, marketers must specify (1) the desired probability of Type-2 errors, (2) the real size of treatment effects, (3) the acceptable alpha level, and (4) the within-treatment variability in the outcome measure. Since the size of the real treatment effect is generally unknown (that is why an experiment is being conducted in the first place), marketers should decide what the smallest practically meaningful effect would be and use that as the effect size.
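As a minimal sketch of such a calculation, the following Python function takes the four inputs listed above and returns the number of subjects needed in each of two treatment groups. It relies on the standard normal-approximation formula n = 2 x ((z_alpha + z_beta) x sd / effect)^2 for a two-sided comparison of two means; this is one textbook approach and not necessarily the one used by the programs the article cites. The function name and the example values (a $2 smallest meaningful difference in check size, a $5 within-treatment standard deviation) are assumptions for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect: float, sd: float,
                          alpha: float = 0.05, beta: float = 0.20) -> int:
    """Subjects needed per group to detect `effect` with a two-sided test.

    effect: smallest practically meaningful difference between group means
    sd:     within-treatment standard deviation of the outcome measure
    alpha:  acceptable probability of a Type-1 error
    beta:   desired probability of a Type-2 error
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(1.0 - beta)
    n = 2.0 * ((z_alpha + z_beta) * sd / effect) ** 2
    return math.ceil(n)  # round up so the target error rates are met


if __name__ == "__main__":
    print(sample_size_per_group(effect=2.0, sd=5.0))             # baseline case
    print(sample_size_per_group(effect=1.0, sd=5.0))             # smaller effect
    print(sample_size_per_group(effect=2.0, sd=5.0, beta=0.10))  # stricter beta
```

Because the effect size enters the formula squared, halving the smallest meaningful effect roughly quadruples the required sample, which is why the choice of that value deserves care.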

Variability within treatments. While researchers hope for variability in the outcome measure between treatment groups (that is, the effect of the treatment), variability within treatment groups increases the chances of a Type-2 error. Variability within treatments is a measure of the consistency of subjects' responses to the treatment. Ideally, there is low variability within treatments, which indicates that subjects responded to the treatments in the same way. If there is high variability, then observed differences between treatments may be due to chance and not the manipulated treatment. So, one way marketers can reduce the probability of making a Type-2 error is to reduce differences in the outcome variable among the subjects within each treatment group. This can be accomplished by increasing the similarity of the subjects to one another and by increasing the uniformity of the conditions under which data are collected. For example, the restaurant-menu experiment described previously would have less variability in check size if it used only evening dining parties comprising one male and one female as subjects than if it included lunch and evening dining parties of all compositions. Of course, increasing the similarity of subjects and the uniformity of conditions can compromise the generalizability of the results, so one must take this potential shortcoming into account. More will be said about generalizability in a subsequent section.
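The effect of within-treatment variability can be seen by comparing two hypothetical data sets that show the same $3 difference in average check size between an old menu and a new menu. The following Python sketch computes the pooled within-group standard deviation and a two-sample t statistic for each. All of the check sizes are invented for illustration, and the function names are assumptions; the pooled-variance t statistic is a standard choice, not one the article specifies.

```python
from statistics import mean, variance


def pooled_sd(group_a, group_b):
    """Pooled within-group standard deviation of two samples."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return pooled_var ** 0.5


def t_statistic(group_a, group_b):
    """Two-sample t statistic (equal variances assumed) for mean(b) - mean(a)."""
    na, nb = len(group_a), len(group_b)
    standard_error = pooled_sd(group_a, group_b) * (1 / na + 1 / nb) ** 0.5
    return (mean(group_b) - mean(group_a)) / standard_error


# Hypothetical check sizes in dollars. Similar parties (for example, evening
# couples only) spend similar amounts; mixed parties vary widely.
similar_old = [40, 41, 39, 40, 42, 38]
similar_new = [43, 44, 42, 43, 45, 41]
mixed_old = [20, 60, 35, 45, 25, 55]
mixed_new = [23, 63, 38, 48, 28, 58]

if __name__ == "__main__":
    for label, old, new in (("similar", similar_old, similar_new),
                            ("mixed", mixed_old, mixed_new)):
        print(f"{label:8s} difference {mean(new) - mean(old):.1f}, "
              f"pooled sd {pooled_sd(old, new):.2f}, t {t_statistic(old, new):.2f}")
```

The mean difference is identical in both data sets, but the test statistic is far larger for the homogeneous subjects, so the same real effect is much less likely to be missed.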

Internal Validity

Internal validity is the strength with which one can conclude that the manipulated treatment caused the observed changes in the outcome measure. High internal validity occurs when all alternative explanations for the observed treatment effect have been ruled out. Confounded treatments are the threat to internal validity. Confounding occurs when the treatment groups differ prior to the treatments or when the treatments differ in more ways than intended. For example, an experiment in which men get one treatment and women get another confounds the treatment with the sex of the subject. In this case, the researcher cannot tell whether any difference between the treatment groups in the outcome variable was caused by the treatments or by the subjects' sex. Similarly, an experiment in which the experimenter must interact with the subject after personally delivering the treatment may confound the treatment with other experimenter actions. Psychological research has found that experimenters who knew what treatment subjects received and who subsequently interacted with the subjects often unintentionally behaved differently to those in the various treatment groups. (22) Confounding of this kind means that researchers cannot tell whether any differences between the treatment groups in the outcome variable were caused by the treatments or by the experimenter's actions. Such post-treatment confounding can be eliminated by keeping experimenters blind to the subject's treatment group. Pre-treatment confounding can be eliminated through random assignment of subjects to treatments and (barring random assignment) can be reduced through the matching of samples and the use of other quasi-experimental designs. Each of these latter means of promoting internal validity is discussed below.
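Random assignment itself is mechanically simple. The following Python sketch shuffles a list of subjects and deals them out to treatments in turn, so that neither the researcher nor the subjects influence who receives which treatment and the group sizes differ by at most one. The function name, the subject labels, and the treatment names are hypothetical, and the fixed seed is used only to make the illustration repeatable.

```python
import random


def randomly_assign(subjects, treatments, seed=None):
    """Shuffle subjects, then deal them to treatments in rotation.

    Returns a dict mapping each treatment to its list of subjects.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for position, subject in enumerate(shuffled):
        treatment = treatments[position % len(treatments)]
        groups[treatment].append(subject)
    return groups


if __name__ == "__main__":
    parties = [f"party_{i:02d}" for i in range(1, 13)]  # hypothetical subjects
    assignment = randomly_assign(parties, ["old_menu", "new_menu"], seed=2003)
    for treatment, members in assignment.items():
        print(treatment, members)
```

Because chance alone decides group membership, characteristics such as the subjects' sex are expected to be balanced across treatments on average, which is what removes pre-treatment confounding.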

COPYRIGHT 2003 Cornell University Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.

Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

NOTE: All illustrations and photos have been removed from this article.

