More Resources

Area variance estimators for simulation using folded standardized time series.


1. Introduction

One of the most important problems in simulation output analysis is the estimation of the mean [mu] of a steady-state (stationary) simulation-generated process {[Y.sub.i] : i = 1,2,...}. For instance, we may be interested in determining the steady-state mean transit time in a job shop or the long-run expected profit per period arising from a certain inventory policy. If the simulation is indeed operating in steady state, then the estimation of [mu] is not itself a particularly difficult problem--simply use the sample mean of the simulation outputs, [[bar.Y].sub.n] [equivalent to] (1/n) [[SIGMA].sub.i=1.sup.n] [Y.sub.i], as the Point estimator of [mu], where n is the sample size.

Because any serious statistical analysis should also include a measure of the variability of the sample mean, point estimation of the mean is usually not enough. One of the most commonly used measures of this variability is the variance parameter [[sigma].sup.2], which is defined as the sum of covariances of the process {[Y.sub.i] : i = 1,2,...} at all lags, and which can often be written in the intuitively appealing form [[sigma].sup.2] = [lim.sub.n[right arrow][infinity]] n Var ([[bar.Y].sub.n]. With knowledge of such a measure in hand, we could provide, among other benefits, confidence intervals for [mu]--typically of the form

[[bar.Y].sub.n][+ or -]t[square root of ([[[^.[sigma]].sup.2]/n)],

where t is a quantile from the appropriate pivot distribution and [[^.[sigma].sup.2] is an estimator for [[sigma].sub.2].

Unfortunately, the problem of estimating the variance of the sample mean is not so straightforward. The trouble is caused by discrete-event simulation data almost always being serially correlated, non-normal and contaminated by initialization bias, e.g., consecutive waiting times in a queueing system starting in an empty and idle state. These characteristics invalidate traditional statistical analysis methods, which usually rely on the assumption of independent and identically distributed (i.i.d.) normal observations. With these problems in mind, this article is concerned with providing underlying theory for estimating the variance parameter [[sigma].sup.2] of a stationary simulation-generated process -that is, the output from a simulation in steady-state operation so that any effects arising from initialization bias have been largely eliminated, for example, by using an appropriate procedure to identify and delete the warm-up period (Lada et al, 2006).

Over the years, a number of methodologies for estimating [[sigma].sup.2] have been proposed in the literature (see Law (2007)), e.g., the techniques referred to as independent replications (IR), non-overlapping batch means (NBM), overlapping batch means (OBM), standardized time series (STS), spectral analysis and regenerative analysis. IR works quite well when the output process starts in steady state, but it can have severe drawbacks in the presence of an initial transient phase (see Alexopoulos and Goldsman (2004)), which is usually the situation in large-scale practical applications. NBM--conceptually the simplest of these methodologies--divides the data {[Y.sub.i] : i = 1, 2,..., n} into non-overlapping batches, and uses the sample variance of the sample means derived from the batches (i.e., the batch means) as a foundation to estimate [[sigma].sup.2]. OBM (Meketon and Schmeiser, 1984), on the other hand, effectively reuses the data by forming overlapping batches, and then invokes an appropriately scaled sample variance of the resulting sample means derived from the batches to estimate [[sigma].sup.2]. The result is an OBM variance estimator having about the same bias as, but significantly lower variance than, the benchmark NBM estimator employing the same batch size and total sample size. STS (Schruben, 1983) uses a functional central limit theorem to standardize a stationary time series, such as output from a steady-state discrete-event simulation, into a process that converges in distribution to a limiting Brownian bridge process as the batch size or total sample size becomes large. Known properties of the Brownian bridge process are then used to obtain estimators for [[sigma].sup.2], e.g., the basic area and Cramer-von Mises (CvM) STS estimators.

Alexopoulos et al. (2007b) and Alexopoulos et al. (2007c) show that similar to OBM, overlapping batched versions of the area and CvM STS estimators have the same bias as, but substantially lower variance than, their non-overlapping counterparts. Aktaran-Kalayci et al. (2007) form minimum-variance linear combinations of overlapping variance estimators of the same type (for example, an area estimator with a certain weighting function) but with different batch sizes for different constituent estimators. Additional variance-reducing tricks in which STS reuses data are given by Goldsman et al. (2007), who form linear combinations of two different types of variance estimators (specifically, the area and CvM estimators), and Foley and Goldsman (1999), who orthonormalize various area estimators.

A recurring theme that emerges in the development of new estimators for [[sigma].sup.2] is the effective reuse of data. In the current article, we study the consequences of a new "folding" operation on the original STS process (and its limiting Brownian bridge process). The folding operation produces multiple STS processes, which in turn will ultimately allow us to use the original data to produce multiple area estimators for [[sigma].sup.2] that use a common batch size--estimators that are often asymptotically independent as the sample size grows. These folded estimators will lead to combined estimators having smaller variance than existing estimators not based on the folding operation.

The article is organized as follows. Section 2 gives some background material on STS. In Section 3, we introduce the notion of folding a Brownian bridge, and we show that each application of folding yields a new Brownian bridge process. We also derive useful expressions for these folded processes in terms of the original Brownian bridge and in terms of the original underlying Brownian motion. Section 4 is concerned with derivations of the expected values, variances and covariances of certain functionals related to the area under a folded Brownian bridge. In Section 5, we finally show how to apply these results to the problem of estimating the variance parameter of a steady-state simulation output process. The idea is to start with a single STS; form folded versions of that STS (which converge to the corresponding folded versions of a Brownian bridge process); calculate an estimator for [[sigma].sup.2] from each folded STS; and then combine the estimators into one low-variance estimator. We also show how to use batching to further improve the estimators; and we illustrate the performance of the resulting variance estimators when they are applied to a first-order autoregressive process with autoregressive parameter 0.9 and to the queue-waiting-time process for an M/M/1 queue with server utilization 0.8. We find that the new estimators indeed reduce estimator variance at little cost in bias. Section 6 presents conclusions and suggestions for future research, while the technical details of some of the proofs are relegated to the online Appendix. Antonini et al. (2007) give an abridged version of some of the theoretical and experimental results that are fully detailed in this article.

2. Background

This section lays out preliminaries on the STS methodology. We begin with some standard assumptions that we shall invoke whenever needed in the following discussions. These assumptions ensure that the proposed variance estimators work properly on a wide variety of stationary stochastic processes.

Assumptions A

1. The process {[Y.sub.i] : i = 1, 2,...} is stationary and satisfies the following functional central limit theorem. For n = 1, 2,..., the process

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (1)

satisfies [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], where: [mu] is the steady-state mean; [[sigma].sup.2] is the variance parameter; [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] denotes the greatest integer function; W is a standard Brownian motion process on [0, 1]; and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] denotes convergence in distribution on D[0, 1], the space of functions on [0, 1] that are right-continuous with left-hand limits, as n [right arrow] [infinity]. (See also Billingsley (1968) and Glynn and Iglehart (1990).)

2. The series of covariances at all lags [[SIGMA].sub.i=-[infinity].sup.[infinity]], [R.sub.i] = [[sigma].sup.2] [member of] (0, [infinity]), where [R.sub.i] [equivalent to] cov([Y.sub.1], [Y.sub.1+i]), i = 0, [+ or -]1, [+ or -]2,. ....

3. The series [[SIGMA].sub.i=1).sup.[infinity]] [i.sup.2] [absolute value of [R.sub.i]] < [infinity].

4. The function f(*), defined on [0, 1], is twice continuously differentiable. Furthermore, f(t) satisfies the normalizing condition [[integral].sub.0.sup.1] [[integral].sub.0.sup.1] f(S)f(t) [min{s, t} - st] ds dt = 1.

As elaborated in Glynn and Iglehart (1990) and in Remark 1 of Aktaran-Kalayci et al. (2007), Assumptions A. 1 to A.3 are mild conditions that hold for a variety of processes encountered in practice. On the other hand, these assumptions appear to be violated in computer and telecommunications systems, where some target processes have distributions and autocorrelation functions with heavy tails (Crovella and Lipsky, 1997). Assumption A.4 gives conditions on the normalized weight function f(*) that will be used in our estimators for [[sigma].sup.2].

Of fundamental importance to the rest of the paper is the standardized time series of the underlying stochastic process {[Y.sub.i] : i = 1,2,...}. It is the STS that will form the basis of most of the estimators studied herein.

Page 1 2 3 4 5 Next »
COPYRIGHT 2009 Institute of Industrial Engineers, Inc. (IIE) Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.

Copyright 2009 Gale, Cengage Learning. All rights reserved. Gale Group is a Thomson Corporation Company.

NOTE: All illustrations and photos have been removed from this article.


Marketplace

Learn how to distribute a press release

Try our new online printing. theupsstore.com/print
Today on Entrepreneur

Sign Up for the Latest in:
Online Business
Franchise News
Starting a Business
Sales & Marketing
Growing a Business

E-mail*

Zip Code*