Experiments and quasi-experiments: methods for evaluating marketing options
Hospitality managers could achieve greater success with marketing
initiatives by using experiments or quasi-experiments to test those
initiatives.
by Ann Lynn and Michael Lynn
Internal validity is the strength with which one can conclude that
the manipulated treatment caused the observed changes in the outcome
measure. High internal validity occurs when all alternative explanations
for the observed treatment effect have been ruled out. Confounded
treatments are the main threat to internal validity. Confounding occurs when
the treatment groups differ prior to the treatments or when the
treatments differ in more ways than intended. For example, an experiment
in which men get one treatment and women get another confounds the
treatment with the sex of the subject. In this case, the researcher
cannot tell whether any difference between the treatment groups in the
outcome variable was caused by the treatments or by the subjects'
sex. Similarly, an experiment in which the experimenter must interact
with the subject after personally delivering the treatment may confound
the treatment with other experimenter actions. Psychological research
has found that experimenters who knew what treatment subjects received
and who subsequently interacted with the subjects often unintentionally
behaved differently toward those in the various treatment groups. (22)
Confounding of this kind means that researchers cannot tell whether any
differences between the treatment groups in the outcome variable were
caused by the treatments or by the experimenter's actions. Such
post-treatment confounding can be eliminated by keeping experimenters
blind to the subject's treatment group. Pre-treatment confounding
can be eliminated through random assignment of subjects to treatments
and (barring random assignment) can be reduced through the matching of
samples and the use of other quasi-experimental designs. Each of these
latter means of promoting internal validity is discussed below.
Random assignment. Assigning subjects to treatments so that each
subject has an equal chance of getting in each treatment group provides
the greatest assurance that treatment groups are similar prior to the
implementation of the treatments. As long as sample sizes are large,
this random assignment distributes the subjects' characteristics
evenly across the different treatment groups. (23) The larger the sample
being randomly assigned, the greater the similarity between the
resulting groups, but samples of 20 to 30 subjects per treatment group
are often sufficient to consider the different treatment groups as
equivalent. (24)
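The assignment procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subject identifiers, the treatment labels ("ad_A", "ad_B"), and the function name assign_randomly are hypothetical and do not come from the article. The sketch shuffles the subject list and deals subjects out in turn, so group sizes are equal when the sample divides evenly among the treatments.

```python
import random

def assign_randomly(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)          # a seed makes the assignment reproducible
    shuffled = list(subjects)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for position, subject in enumerate(shuffled):
        treatment = treatments[position % len(treatments)]
        groups[treatment].append(subject)
    return groups

# Hypothetical sample of 60 subjects split between two hypothetical ads
subjects = [f"guest_{i:02d}" for i in range(60)]
groups = assign_randomly(subjects, ["ad_A", "ad_B"], seed=42)
for treatment, members in groups.items():
    print(treatment, len(members))
```

With 60 subjects and two treatments, each group receives 30 subjects, which falls within the range of 20 to 30 per group mentioned above.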
Random assignment of individual consumers to treatments is easy
when experiments are conducted in a laboratory or are conducted via post
or e-mail. In those cases, the experimenter has control over which
subjects get which treatment. In addition, many magazines and
television-cable companies now have the ability to deliver distinct
content to various (essentially random) subsets of their customers. This
allows marketers to expose different people to different ads even though
they are reading the same magazine or watching the same television show.
Those people can then be contacted and asked to provide information used
to compare the effectiveness of the different ads.
In some cases, random assignment of individuals to different
treatments is not possible. For example, a restaurateur could not
randomly assign individual dining parties in a field experiment that
compares the effects on sales of playing two different styles of music
over the sound system. In such cases, however, it is possible to use
different units of analysis and to conduct true experiments by randomly
assigning those units to treatments. A restaurateur could, for example,
randomly determine which of two different styles of music is played
each day for two months and could then compare the average daily sales
under each style of music. In this case, any differences between days in
the number and type of customers or other characteristics will be evenly
distributed across the two treatment groups and any subsequent
difference between the treatment groups in average daily sales can be
safely attributed to the different styles of music. In general,
researchers can assign many different units (e.g., individual consumers,
multi-person dining parties, days, units of a restaurant chain) to
treatment groups, but should make sure that those units are what are
described by the outcome measures. (25)
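The restaurant music example, in which days rather than dining parties are the unit of analysis, might be sketched as follows. Everything specific here is an assumption made for illustration: the style names, the function names, and the four sales figures are placeholders rather than data from the article. The sketch also uses a balanced schedule (the same number of days per style, in random order), which is one reasonable reading of "randomly determine" and not the only one.

```python
import random
from statistics import mean

def schedule_music(num_days, styles, seed=None):
    """Build a balanced schedule of styles and put the days in random order."""
    rng = random.Random(seed)
    schedule = [styles[day % len(styles)] for day in range(num_days)]
    rng.shuffle(schedule)
    return schedule

def average_sales_by_style(schedule, daily_sales):
    """Average the daily sales figures separately for each music style."""
    sales_by_style = {}
    for style, sales in zip(schedule, daily_sales):
        sales_by_style.setdefault(style, []).append(sales)
    return {style: mean(values) for style, values in sales_by_style.items()}

# Two months of days, each randomly assigned one of two hypothetical styles
schedule = schedule_music(60, ["classical", "pop"], seed=7)

# Placeholder figures only, to show the comparison of daily averages
example_schedule = ["classical", "pop", "classical", "pop"]
example_sales = [100, 200, 300, 400]
averages = average_sales_by_style(example_schedule, example_sales)
print(averages)
```

Because the outcome measure is average daily sales, the day is the unit that gets assigned, in keeping with the advice that the assigned units should be the ones the outcome measure describes.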
If random assignment of individuals or other units of analysis is
not practical, marketers can use a quasi-experimental design. To do this
the marketer must try to anticipate all the variables that might affect
the outcome variable and find naturally occurring units matched on those
variables. Unfortunately, it is nearly impossible to anticipate all the
relevant variables and find units that are perfectly matched thereon.
Even if matched pairs could be found, it is possible that factors
outside the experimenter's control could change one of the units
during the course of the study and thereby create a new confound. For
example, a competitor of one of two matched restaurants in a
quasi-experiment could suddenly close, boost the other restaurant's
sales, and confound the experiment. The internal validity of this simple
quasi-experimental design falls far short of that for a true experiment
with random assignment. There are a variety of more-complex
quasi-experimental designs that help address different threats to
internal validity, and marketers interested in conducting a
quasi-experiment should consult experts about the options available.
However, the internal validity of quasi-experiments is never as great as
that of true experiments, so whenever practical, random assignment is
the preferred method of assigning subjects to treatments.
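For readers who want to see what matching units on anticipated variables could look like in practice, here is a minimal sketch. It is not a method taken from the article: the restaurant names, the two matching variables (seats and average check), their values, and the nearest-neighbor distance rule are all assumptions chosen for illustration. The sketch standardizes each variable and picks the candidate unit closest to the target.

```python
from statistics import mean, pstdev

def standardize(units, keys):
    """Convert each matching variable to z-scores so no variable dominates."""
    scaled = {name: {} for name in units}
    for key in keys:
        values = [units[name][key] for name in units]
        center, spread = mean(values), pstdev(values)
        for name in units:
            # If a variable never varies, it cannot distinguish units
            scaled[name][key] = (units[name][key] - center) / spread if spread else 0.0
    return scaled

def closest_match(target, candidates, units, keys):
    """Return the candidate whose standardized profile is nearest the target."""
    scaled = standardize(units, keys)

    def distance(a, b):
        return sum((scaled[a][k] - scaled[b][k]) ** 2 for k in keys) ** 0.5

    return min(candidates, key=lambda candidate: distance(target, candidate))

# Hypothetical restaurants with placeholder characteristics
restaurants = {
    "unit_A": {"seats": 120, "avg_check": 18.0},
    "unit_B": {"seats": 118, "avg_check": 18.5},
    "unit_C": {"seats": 60, "avg_check": 32.0},
    "unit_D": {"seats": 200, "avg_check": 12.0},
}
match = closest_match(
    "unit_A", ["unit_B", "unit_C", "unit_D"], restaurants, ["seats", "avg_check"]
)
print(match)
```

A match of this kind only balances the variables that were measured. It does nothing about unanticipated variables or about events that later change one unit, which is the limitation described above.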
External Validity
External validity is the extent to which an experiment's
results apply or generalize to the real marketing environment of
interest. External validity is threatened by differences between the
real-world and experimental samples, treatments, measured behavior, or
contexts. A common example of such a threat can be found in test
marketing of new entree items by, for instance, McDonald's (e.g.,
the McRib sandwich). Restaurants (not only McDonald's) often label
such items as special offers that are available "for a limited time
only." The problem with that approach stems from the fact that the
availability of the items will not remain limited if they are judged a
success and permanently added to the menu. In other words, the
experimental conditions differ from those to which the marketer wants to
generalize the experimental results. This difference is important
because limited availability increases demand for products. (26) Test
markets that describe items as special offers available for a limited
time generally inflate the demand for those items and do not provide
good estimates of the demand that those items would generate as
permanent additions to the menu.
The way to ensure external validity is to make the features of the
experiment similar to the features of the situation to which the
experimental results will be generalized. Marketers should draw a sample
that is representative of the actual consumers of the product or
service, deliver the treatments to subjects in the same way and in the
same context that they will be delivered in the marketplace, and measure
the same outcome variable that managers want to affect in the
marketplace. However, it is expensive and difficult (if not impossible)
to make experiments similar in all respects to real-world situations of
interest. Thus, marketers must often conduct experiments that differ in
some ways from the situations to which they want to generalize the
experimental results. For example, marketers often settle for
nonrepresentative samples or measure attitudinal outcome variables when
it is consumers' behavior that they ultimately want to affect. How
much these differences affect the generalizability of the results
depends on the specifics of the case. Some things to keep in mind when
evaluating the generalizability of results across samples and measures
are discussed below.
People of different ages, sexes, and ethnicities, as well as people
from different regions of the country or world, differ in terms of
tastes, value priorities, and other factors that may affect their
responses to marketing communications and offers. As a result, it is
dangerous to draw conclusions about one group of people based on data
about a different group of people. However, generalizing findings across
groups of people can be reasonable when there are only small differences
between the groups or when the differences that exist are unlikely to
affect responses to the treatment. For example, researchers have found
only small demographic and psychographic differences between the users
of different brands within consumer-product and -service categories.
(27) This suggests that marketers can run experiments on their own
customers and safely generalize the results to all users of the product
category. In addition, researchers have found that differences between
African-American and Caucasian consumers do not affect their
responsiveness to point-of-purchase displays or price discounts. (28)
This suggests that marketers can generalize findings about the effects
of these tactics among one ethnic group to the other. The important
thing to keep in mind is that differences between two groups of people
do not necessarily make it inappropriate to generalize results from one
group to the other. Only when those differences affect responsiveness to
the experimental treatments is generalizability called into question.
COPYRIGHT 2003 Cornell University. Reproduced with permission of the
copyright holder. Further reproduction or distribution is prohibited
without permission.