Security in an autonomic computing environment
by David M. Chess, Charles C. Palmer, and Steve R. White
As computing systems have become more complex, more interconnected,
and more tightly woven into the fabric of our lives, the resources
involved in managing and administering them have grown at a steadily
increasing rate. As the costs of system hardware and software have
leveled off or decreased, the costs of the human resources devoted to
system administration have continued to grow, and therefore constitute a
steadily larger fraction of information technology (IT) costs. The
autonomic computing initiative is aimed at addressing these increasing
costs by producing computing systems that require less human effort to
administer; systems that, like the biological systems that keep our
hearts beating and our body chemistry balanced, can take care of routine
and even exceptional functions without human intervention. (1)
Like any other significant computing system, autonomic systems need
to be secure. Building secure autonomic systems is a challenge for a
number of reasons. Many autonomic systems will use new techniques and
new architectures whose security implications are not yet well
understood. If autonomic systems are to benefit from reduced human
administration costs, they cannot rely on humans to notice the anomalous
behavior caused by security compromises. Because many autonomic
systems are expected to deal with a constantly changing set of other
systems as suppliers, customers, and partners, they need flexible new
methods for reliably establishing trust, detecting attacks and
compromise, and recovering from security incidents. Because some
autonomic systems deal with personal information about individuals, they
need to be able to represent and demonstrably obey privacy policies
required by national laws and business ethics.
Successful autonomic systems will need to be self-configuring,
self-optimizing, self-protecting, and self-healing. Although security
concerns are most obvious in protecting the system from attack and in
recovering from the effects of attacks, security will be key in all
other aspects as well. Systems must be secure in every configuration
into which they might put themselves, and in every state into which they
might optimize themselves. Systems must be robust against attempts to
provide them with false or misleading information that might lead them
to configure or optimize themselves insecurely, to enter into an
unjustified trust relationship, or to fail to protect adequately against
a malicious attack.
Autonomic computing will not reinvent computer science ex nihilo,
and the security of autonomic systems will not be an entirely new kind
of security. All the traditional issues familiar to computer security
researchers will arise in autonomic systems, some in more complex and
urgent forms. And just as widespread program sharing and ubiquitous
network connectivity took computer viruses and worms from a theoretical
possibility to an annoying oddity and then to a major security concern,
we should expect that the new computing environments made possible by
autonomic computing will give rise to unique security threats of their
own.
At least as significantly, the new abilities offered by autonomic
computing will also include ways to make our systems more secure and our
private data better protected. Building and administering secure
computing systems is well known to be a difficult task; properly
configuring a complex system to conform to security policies specified
at a high level is extremely challenging even for skilled practitioners,
and even the best-administered computing systems generally conform only
approximately to their putative security policies. (2) By automating the
process of configuring, optimizing, and protecting systems according to
explicitly stated security policies, autonomic systems offer us the
opportunity to do better.
In the next section of this paper, we provide a very brief overview
of some of the architectural features that will be important in the
design of autonomic computing systems, with an emphasis on the aspects
of that architecture that relate to security. In subsequent sections, we
survey a number of old and new security issues and opportunities as they
apply to autonomic systems, and we describe two existing systems in
which some of these issues have begun to emerge. Along the way, we will
note both the challenges and the opportunities in providing security for
autonomic computing systems.
Architectural features of autonomic computing
Autonomic computing will have implications for computing systems at
all scales, from single devices to the worldwide networked economy. At
small scales, we anticipate that the units of autonomic computing,
generally referred to as "autonomic elements," will be
comparatively simple and of fixed function, performing the same
activities in concert with the same set of other elements for long
periods of time. At higher levels, however, we expect that many
autonomic elements will function in a very dynamic environment, in which
only the element's essential mission and governing policies will
remain constant. The details of how they carry out their mission and
what other elements they interact with may change every day, or even
every second.
We anticipate that one very common architecture for an autonomic
element will involve two parts: a functional unit that performs whatever
basic function the element provides (such as storage, database
functions, Web services, and so on), and a management unit that oversees
the operation of the functional unit, ensures that it has the resources
that it needs to perform its function, configures and reconfigures it to
adapt to changing conditions, carries out negotiations with other
autonomic elements, and so on. (3) Figure 1 shows a simple conceptual
diagram of an autonomic element consisting of a functional unit and a
management unit.
[FIGURE 1 OMITTED]
The thin arrows connecting the management unit to the world outside
the autonomic element represent the management unit's dealings with
other autonomic elements (and potentially with other external
resources). The thick arrows connecting the functional unit to the
outside world represent the channels by which the element acquires the
resources that it needs to carry out its basic function, and by which it
delivers the results of that function to other elements. The arrows
between the management unit and the functional unit represent the
sensors and effectors by which the management unit monitors and controls
the functional unit, and the arrows between the management unit and the
loops around the function arrows represent the access control that the
management unit exercises over these functional channels. No autonomic
element or other entity can provide a resource to this element, or
obtain any service from it, without the permission of the management
unit, as negotiated and obtained through the management channels.
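
The gating role of the management unit described above can be illustrated with a minimal sketch. It assumes a very simple trust rule (a fixed set of trusted peers), and every name in it (ManagementUnit, AutonomicElement, negotiate, request_service) is hypothetical, chosen for illustration rather than taken from any autonomic computing implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ManagementUnit:
    """Decides which peers may use the element's functional channels."""
    trusted_peers: set = field(default_factory=set)  # assumed policy input
    granted: set = field(default_factory=set)        # peers holding permission

    def negotiate(self, peer: str) -> bool:
        """Management channel: grant permission only to trusted peers."""
        if peer in self.trusted_peers:
            self.granted.add(peer)
            return True
        return False

    def permits(self, peer: str) -> bool:
        """Access-control check applied to every functional request."""
        return peer in self.granted


class AutonomicElement:
    """A functional unit wrapped by a management unit (hypothetical)."""

    def __init__(self, manager: ManagementUnit, function):
        self.manager = manager
        self._function = function  # the element's basic function

    def request_service(self, peer: str, payload):
        """Functional channel: refused unless the manager granted access."""
        if not self.manager.permits(peer):
            raise PermissionError(
                f"{peer} has no permission from the management unit"
            )
        return self._function(payload)


# Usage: "billing" negotiates first, then uses the functional channel.
element = AutonomicElement(ManagementUnit(trusted_peers={"billing"}), str.upper)
element.manager.negotiate("billing")
result = element.request_service("billing", "ok")
```

The point of the design is that the functional channel never decides for itself: every request passes through a check owned by the management unit, so permission can only be obtained through the management channel.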
To make the decisions needed to oversee the functional unit properly,
and to achieve the flexibility that self-management requires, many
management units will carry with them, or otherwise have access to, two
kinds of information: policies that govern and constrain their behavior
at a comparatively high level, and task and state representations that
functionally describe their current mission, strategy, and status at a
lower level. Unlike conventional computing
systems, which behave as they do simply because they are explicitly
programmed that way, the management unit of an autonomic element will
often have a wide range of possible strategies available to it in
fulfilling the policies that govern it, and an explicit representation
of the current state of its efforts to carry out those policies.
Some of the policies that govern an autonomic element will be
security policies. An element's security policies may include
descriptions of what level of protection needs to be applied to the
various information resources that the element contains or controls,
rules that determine how much trust the element places in other elements
with which it communicates, what cryptographic protocols the element
should use in various situations, under what circumstances the element
should apply or accept security-related patches or other updates to its
own software, and so on. Other policies will control the strategies that
an element uses to recover when one of its suppliers fails to provide an
expected resource, and will determine which of its commitments receive
the highest priority when not all can be fully met. These policies will
either be
directly specified by a human, implicitly specified (as by a human
accepting a default), or derived from higher-level policies by the rules
of the appropriate policy calculus.
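
One way to picture the three origins of a policy is a record that carries its provenance alongside its value. The sketch below is illustrative only: the Policy and Origin types are hypothetical, and the single derivation rule (a protection level implying a cipher) is an invented stand-in for whatever policy calculus a real system would use.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    EXPLICIT = "specified directly by a human"
    DEFAULT = "implicitly specified by accepting a default"
    DERIVED = "derived from a higher-level policy"


@dataclass(frozen=True)
class Policy:
    name: str
    value: str
    origin: Origin


# Hypothetical derivation rule, invented for illustration: a required
# protection level implies a minimum cipher for the element to use.
CIPHER_FOR_LEVEL = {"high": "AES-256", "medium": "AES-128"}


def derive_cipher_policy(protection: Policy) -> Policy:
    """Derive a lower-level cipher policy from a protection-level policy."""
    return Policy("cipher", CIPHER_FOR_LEVEL[protection.value], Origin.DERIVED)


# Usage: a human states the protection level; the cipher policy follows.
protection = Policy("protection_level", "high", Origin.EXPLICIT)
cipher = derive_cipher_policy(protection)
```

Recording the origin with each policy lets an element, or an auditor, trace a low-level setting back to the human decision or default that produced it.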
Some of the task and state representations that a management unit
holds to describe the current status and activities of the element will
also be relevant to the element's security. A management unit may,
for instance, have:
* A representation of the other elements upon which it currently
depends, and how much it trusts each of them
* A representation of the current life-cycle state of the software
that the element is running and whether or not there are any security
updates available for it
* A list of contact information for one or more other autonomic
elements or human administrators who should be notified when certain
suspicious circumstances are observed
* Agreements with one or more other autonomic elements to provide
it with security-relevant information, such as log-file analyses or
secure time-stamping
* A list of previously vetted resource suppliers, used to quickly
verify the digital signatures on the resources they provide
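
The representations listed above might be held together in a single state record. The following is a minimal sketch under stated assumptions: the SecurityState type and all of its field names are hypothetical, trust is assumed to be a number between 0 and 1, and a vetted supplier is assumed to be remembered by the SHA-256 fingerprint of its public key. The fingerprint lookup only confirms that a key is the one recorded at vetting time; it is not itself a signature verification.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(public_key: bytes) -> str:
    """SHA-256 fingerprint of a supplier's public key bytes."""
    return hashlib.sha256(public_key).hexdigest()


@dataclass
class SecurityState:
    """Security-relevant state of a management unit (hypothetical)."""
    trust: dict = field(default_factory=dict)             # element -> trust in [0, 1]
    security_update_pending: bool = False                 # life-cycle state of own software
    contacts: list = field(default_factory=list)          # whom to notify on suspicion
    info_agreements: dict = field(default_factory=dict)   # provider -> service supplied
    vetted_suppliers: dict = field(default_factory=dict)  # supplier -> key fingerprint

    def vet(self, supplier: str, public_key: bytes) -> None:
        """Record a supplier's key once it has been vetted."""
        self.vetted_suppliers[supplier] = fingerprint(public_key)

    def key_is_vetted(self, supplier: str, public_key: bytes) -> bool:
        """True if the key matches the one recorded when the supplier was vetted."""
        return self.vetted_suppliers.get(supplier) == fingerprint(public_key)


# Usage: vet a supplier once, then check its key quickly on later deliveries.
state = SecurityState(contacts=["admin-on-call"])
state.trust["storage-a"] = 0.9
state.vet("storage-a", b"example-public-key-bytes")
known = state.key_is_vetted("storage-a", b"example-public-key-bytes")
```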
COPYRIGHT 2003 All Rights
Reserved. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.