Managing Web server performance with AutoTune agents.
by Diao, Yixin; Hellerstein, Joseph L.; Parekh, Sujay; Bigus, Joseph P.
The increasing complexity of computing systems and applications
demands a correspondingly larger human effort for system configuration
and performance management. This manual effort can be time-consuming and
error-prone, and requires highly skilled personnel, making it costly.
Autonomic computing (1) uses the analogy of the human autonomic nervous
system to suggest the use of a higher level of automation and
self-management capability in computing systems.
The complexity and importance of developing autonomic computing
systems have attracted research efforts of a theoretical as well as an
applied nature. In particular, control theory is emerging as a promising
cornerstone to provide a rigorous mathematical foundation for designing
and analyzing autonomic controllers, which can reduce the work of system
administrators and provide guaranteed control performance. Some
applications of control theory to computing systems include flow and
congestion control, (2-5) differentiated caching and Web service, (6,7)
multimedia streaming, (8) Web server performance, (9) e-mail server
control, (10,11) and distributed resource allocation on the grid. (12)
These applications all achieve a degree of autonomic behavior by
supplying algorithms that automatically control some aspect of a
computing system's operation. However, a common theme in all this work is
that applying control techniques requires significant modeling and
design work, which is typically a manual process conducted by the system
designer. In the spirit of reducing all manual intervention, we propose
to automate this phase of the deployment of control systems as well.
In this paper, we describe an agent-based autonomic feedback
control system that uses high-level inputs from the human system
administrators to not only control a computing system, but also to
automatically design a controller suitable for that system. We
illustrate this in the context of an Apache ** Web server. Using the
Agent Building and Learning Environment (ABLE), (13) AutoTune (14)
agents are built for (1) modeling the behavior of an Apache Web server,
(2) designing the feedback control law, and (3) in on-line operation,
adjusting the server parameters in response to workload variations.
These agents cooperate at different phases of the life cycle of the
autonomic control system to achieve the function of automatically
controlling the Web server.
The remainder of the paper is organized as follows. The next
section describes the background of server self-tuning in the context of
Apache Web servers. The section "Server self-tuning with AutoTune
agents" introduces ABLE-based AutoTune agents and details the
architecture and algorithms used to automate server tuning. The
experimental results are described in the section "Experimental
assessment," comparing the performance of the Apache server
controlled by the proposed AutoTune controller and a heuristic manual
controller, particularly when significant workload variations exist.
Finally, our conclusions are presented.
Apache Web server and performance tuning
The Apache (15) Web server is the most popular Web server in use
today, (16) making resource management for such a server an important
problem. Version 1.3.x of the server on UNIX ** is structured as a
master process and a pool of worker processes. The master process
monitors the health of the worker processes and manages their creation
and destruction. The worker processes are responsible for communicating
with Web clients and generating responses, and one worker process can
handle at most one connection at a time. The number of worker processes
is limited by the parameter MaxClients, thereby throttling the Web
server's throughput. Worker processes cycle through three states:
idle, waiting, and busy. A worker is in an idle state if no Transmission
Control Protocol (TCP) connection from the client has been made to it.
Once a TCP connection is accepted, the worker process is either
waiting for a HyperText Transfer Protocol (HTTP) request from the
client or busy processing the client request. With persistent
connections in HTTP/1.1, (17) the established TCP connection
remains open between consecutive HTTP requests (which eliminates the
overhead of setting up one connection for each request, as in HTTP/1.0).
A persistent connection can be terminated either by the client, or by
the master process if the waiting time of a worker process exceeds the
maximum allowed time specified by the parameter KeepAlive.
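The worker life cycle described above can be sketched as a small state machine. This is a minimal illustration, not Apache's implementation: the class and method names (Worker, WorkerPool, tick, and so on) are hypothetical, the pool is fixed at MaxClients workers rather than grown and shrunk by a master process, and time is advanced by explicit calls.

```python
from enum import Enum


class WorkerState(Enum):
    IDLE = "idle"        # no TCP connection has been made to the worker
    WAITING = "waiting"  # connection open, waiting for an HTTP request
    BUSY = "busy"        # processing a client request


class Worker:
    """One worker process; it handles at most one connection at a time."""

    def __init__(self, keep_alive):
        self.keep_alive = keep_alive  # maximum allowed waiting time, seconds
        self.state = WorkerState.IDLE
        self.waited = 0.0

    def accept_connection(self):
        if self.state is not WorkerState.IDLE:
            raise RuntimeError("worker already holds a connection")
        self.state = WorkerState.WAITING
        self.waited = 0.0

    def receive_request(self):
        if self.state is not WorkerState.WAITING:
            raise RuntimeError("no open connection waiting for a request")
        self.state = WorkerState.BUSY

    def finish_response(self):
        if self.state is not WorkerState.BUSY:
            raise RuntimeError("no request in progress")
        # Persistent connection: stay connected and wait for the next request.
        self.state = WorkerState.WAITING
        self.waited = 0.0

    def tick(self, seconds):
        """Advance time; drop the connection once the wait exceeds keep_alive."""
        if self.state is WorkerState.WAITING:
            self.waited += seconds
            if self.waited > self.keep_alive:
                self.state = WorkerState.IDLE


class WorkerPool:
    """Simplified pool: exactly max_clients workers exist from the start."""

    def __init__(self, max_clients, keep_alive):
        self.workers = [Worker(keep_alive) for _ in range(max_clients)]

    def accept_connection(self):
        """Hand the connection to an idle worker, or return None if all are taken."""
        for worker in self.workers:
            if worker.state is WorkerState.IDLE:
                worker.accept_connection()
                return worker
        return None
```

In this sketch a smaller keep_alive returns waiting workers to the idle state sooner, and max_clients caps how many connections can be held at once, which is the throttling effect described in the text.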
The performance of the Apache Web server can be measured by
different metrics, such as end-user response times or utilization of
various resources on the server. Selection of appropriate performance
metrics depends not only on management objectives but also on metric
availability. From the point of view of guaranteeing quality of
service, bounding the end-user response times is desired. However,
end-user response time is a client-side metric, and additional
instrumentation such as a probing station needs to be added. This would
increase server load and raise other issues, including accuracy and
recency of the available measurements. (Refer to Reference 18 for a
discussion of end-user response time and control strategies for
managing it.) In this paper, we quantify the server performance using
two server-side metrics, CPU utilization and memory utilization, which
are easy to measure on the server and are also associated with business
needs.
System administrators typically maintain a certain utilization
level on the server, high enough to efficiently utilize system resources
but not so high that it causes thrashing and failures as a result of
over-utilization. Good end-user response times are ensured by reserving
sufficient capacity to handle workload surges. However, the system
utilization cannot be directly set in an Apache server. Instead, the
administrators must operate indirectly by adjusting certain tuning
parameters, among which MaxClients and KeepAlive are commonly used. A
higher MaxClients value allows the Apache server to process more client
requests, and increases both CPU and memory utilizations. Also,
decreasing the value of KeepAlive potentially allows worker processes to
be more active, which directly results in higher CPU utilization and
indirectly increases memory utilization, since more clients can connect
to the server. (9)
In principle, the desired and feasible CPU and memory utilizations
can be achieved by properly selecting the tuning parameters MaxClients
and KeepAlive, but in practice, it is time-consuming, error-prone, and
skills-intensive to adjust these parameters manually. Moreover, this
tuning work has to be repeated as the workload changes or the server is
reconfigured with more CPU or memory. A change of Web site
contents may also affect the CPU and memory usage per request and thus
require different MaxClients and KeepAlive settings. We illustrate
the drawbacks of manual tuning using modified versions of our agents and
the testbed, both of which are described later. In essence, our
modifications to the Apache server allow us to change the MaxClients and
KeepAlive values without restarting the server, and the agents provide
the graphical user interface (GUI) for manually setting the values.
Suppose the administrator wants CPU utilization at 0.5 and memory
utilization at 0.6. Manually tuning the Apache server is a
trial-and-error process and can be quite time-consuming. Due to the
interrelationships between the tuning parameters and performance
metrics, it may not be easy to find the proper KeepAlive and MaxClients
settings.
In Figure 1, the values for the tuning parameters KeepAlive and
MaxClients are shown in the top two Inspector windows, and the bottom
two Inspector windows show the corresponding effects on the performance
metrics CPU and memory utilization. The system is running in our testbed
and is subjected to a synthetic workload, both of which are described in
the section "Apache testbed and workload generator." The
y-axis shows the measured values, and the x-axis indicates the time,
which is measured by control intervals (i.e., sampling intervals). The
control interval is five seconds; every five seconds the tuning
parameters (if they are changed) are sent to the Apache server, and the
CPU and memory values of the Apache server are also provided to the
Apache adaptor for the system administrator to check the control
results.
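The per-interval exchange described above can be sketched as a simple sampling loop. This is only an illustration under stated assumptions: FakeAdaptor, set_parameters, read_utilization, and run_session are hypothetical names, not the interface of the Apache adaptor or of ABLE, and the readings returned by the stand-in are placeholders rather than measurements.

```python
import time

CONTROL_INTERVAL_S = 5.0  # control (sampling) interval stated in the text


class FakeAdaptor:
    """Stand-in for the Apache adaptor (hypothetical interface)."""

    def __init__(self):
        self.sent = []  # parameter pairs actually pushed to the server

    def set_parameters(self, keep_alive, max_clients):
        self.sent.append((keep_alive, max_clients))

    def read_utilization(self):
        return 0.0, 0.0  # placeholder (cpu, mem); not measured data


def run_session(adaptor, schedule, n_intervals, sleep=time.sleep):
    """Run n_intervals control intervals.

    schedule maps an interval index to the (keep_alive, max_clients) pair the
    administrator enters at that interval. Parameters are sent only when they
    differ from the current ones; utilization is read every interval.
    """
    history = []
    current = None
    for k in range(n_intervals):
        requested = schedule.get(k, current)
        if requested != current:
            adaptor.set_parameters(*requested)
            current = requested
        cpu, mem = adaptor.read_utilization()
        history.append((k, current, cpu, mem))
        sleep(CONTROL_INTERVAL_S)
    return history
```

Passing a no-op sleep (for example, sleep=lambda s: None) lets the loop be exercised without waiting five seconds per interval.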
[FIGURE 1 OMITTED]
As Figure 1 indicates, arbitrarily selecting KeepAlive = 2 and
MaxClients = 100 will not yield CPU and memory utilization close to the
desired values. The tuning parameters and the performance metrics are
interrelated; for instance, increasing MaxClients to 150 causes
increases in both CPU and memory utilizations. These controls can be
tuned manually with the following heuristic. If we increase MaxClients,
both CPU and memory
utilization will increase. If we increase KeepAlive, CPU utilization
will decrease. Thus, we can use MaxClients to adjust utilization until
the memory is at the desired level, and then get the desired CPU
utilization by adjusting KeepAlive. Using this heuristic to achieve
CPU = 0.5 and MEM = 0.6, we find after several tries the values
MaxClients = 400 and KeepAlive = 10, which drive CPU and
memory close to the desired values.
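The two-step heuristic can be sketched in code against a toy server model. Everything numeric here is an assumption made for illustration: toy_memory and toy_cpu are invented formulas whose coefficients were picked so that the toy has a solution near the settings quoted above, they are not fitted to the testbed, and the toy ignores the indirect effect of KeepAlive on memory. The bisection search stands in for the administrator's trial and error and is not the AutoTune controller.

```python
def toy_memory(max_clients):
    """Invented model: memory utilization grows with MaxClients."""
    return min(1.0, 0.0015 * max_clients)


def toy_cpu(max_clients, keep_alive):
    """Invented model: CPU grows with MaxClients and falls as KeepAlive grows."""
    return min(1.0, 0.002 * max_clients / (1.0 + 0.06 * keep_alive))


def tune_knob(measure, target, lo, hi, increasing, tol=0.01, max_tries=30):
    """Bisect an integer knob in [lo, hi] until measure(knob) is within tol of target.

    increasing says whether the metric rises (True) or falls (False) as the
    knob is raised. Returns (knob, tries); the last value tried is returned
    if no setting within tol is found.
    """
    knob, tries = lo, 0
    while lo <= hi and tries < max_tries:
        knob = (lo + hi) // 2
        value = measure(knob)
        tries += 1
        if abs(value - target) <= tol:
            break
        too_low = value < target
        if too_low == increasing:
            lo = knob + 1  # a larger knob moves the metric toward the target
        else:
            hi = knob - 1  # a smaller knob moves the metric toward the target
    return knob, tries


def manual_heuristic(cpu_target, mem_target):
    # Step 1: adjust MaxClients until memory is at the desired level.
    max_clients, _ = tune_knob(toy_memory, mem_target, 1, 1024, increasing=True)
    # Step 2: with MaxClients fixed, adjust KeepAlive to reach the desired CPU.
    keep_alive, _ = tune_knob(
        lambda ka: toy_cpu(max_clients, ka), cpu_target, 1, 60, increasing=False
    )
    return max_clients, keep_alive


if __name__ == "__main__":
    mc, ka = manual_heuristic(cpu_target=0.5, mem_target=0.6)
    print(mc, ka, toy_memory(mc), toy_cpu(mc, ka))
```

Each call to measure corresponds to one manual try followed by waiting for the utilization to settle, which is why even a well-organized manual search is slow on a live server.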
COPYRIGHT 2003 All Rights
Reserved. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.
NOTE: All illustrations and photos have been removed from this article.