A system model for dynamically reconfigurable
software.
by Whisnant, Keith^Kalbarczyk, Zbigniew T.^Iyer, Ravishankar
K.
Reconfigurability provides the foundation upon which autonomic
systems can adapt to their changing environments, which is useful for
dynamically optimizing system functionality based upon the observed
execution profile or for recovering from errors and failures without
human intervention. Unfortunately, reconfiguring distributed systems is
difficult in practice, since the processes are often interdependent.
Arbitrarily changing the behavior of one process through reconfiguration
may render other processes unusable. By formally expressing the
dependencies among processes, both static and dynamic reconfigurations
can be analyzed to determine whether they are compatible with the
existing configuration.
This paper describes a model for formally capturing the structure
and run-time behavior of a distributed system. The structure is defined
by a set of elements containing the state variables in the system. The
runtime behavior is defined by threads that execute atomic actions
called operations. Operations invoke code blocks to bring about state
changes in the system, but these state changes are limited to a single
element and thread. The variables accessed during the execution of an
operation constitute the signature of the operation. The collective set
of operation signatures in the system can be used to derive dataflow
dependencies between operations and between elements.
Reconfiguring the system is modeled by changing the code blocks
that are invoked by an operation. Given the current configuration of the
system, a proposed reconfiguration can be subjected to an off-line test
to determine whether any of the dataflow dependencies of the system
change as a result of the reconfiguration. If there are changes to the
dataflow dependencies, the user is warned that the proposed
reconfiguration may not be safe. Ultimately, the user decides whether or
not to apply a reconfiguration. (1)
We have employed the proposed system model in developing a
software-implemented fault-tolerant (SIFT) environment based around
ARMORs (Adaptive Reconfigurable Mobile Objects of Reliability) to
provide error detection and recovery for distributed applications. To
ensure that the SIFT environment is resilient to failures, dedicated
fault-tolerant mechanisms were added by reconfiguring the ARMOR
processes (2) that make up the SIFT environment. The system model and
associated criteria for safe reconfigurations were applied to verify
that the additional fault-tolerant mechanisms did not impact the SIFT
functionality in unexpected ways.
System model
An abstract model that captures the structure and run-time
execution of the system is used to analyze the effects of
reconfiguration on the dependencies among the various components in the
system. A system is specified by a triple (C, V, T):
* C is the set of code blocks in the system. A code block performs
a computation triggered by events called operations.
* V is the set of state variables (3) in the system. Variables are
only accessed through executing code blocks.
* T is the set of threads in the system. Threads execute by
sequentially invoking code blocks in the system through operations,
which bring about state changes by manipulating the system variables in
V.
Elements. State variables in the system are partitioned into
components called elements. Code blocks are placed in the elements
containing the state variables manipulated by the code blocks. Given a
predicate Access(c, v) that is true if and only if code block c [member
of] C reads or writes variable v [member of] V, an element can be
defined formally as follows.
Definition 1 (Element): An element is a pair (C, V), where V is a
set of variables and C is a set of code blocks that do not access
variables other than those in V:
element e [equivalent to] (C, V), C [subset or equal to] C
[conjunction]([for all]c [member of] C, V [subset or equal to] V, v
[member of] V : Access(c, v) [??] v [member of] V)
Elements partition the system so that each variable is found in one
and only one element, and the code blocks within an element cannot
access state variables residing in other elements.
Operations. Code blocks in the system are indirectly invoked
through operations. One operation can invoke several code blocks as
defined by a function BindCode: P [right arrow] [2.sup.c], where P is
the alphabet of operations and [2.sup.c] is the power set over all code
blocks (i.e., BindCode maps an operation to a set of code blocks). The
notation "p [right arrow] c" is also used to denote that
operation p is bound to code block c via the BindCode function. An
operation executes by being delivered to all code blocks bound to the
operation via the BindCode function.
Definition 2 (Operation Delivery): An operation p [member of] P is
said to be delivered to a code block c [member of] c, denoted Deliver(p,
c), by executing code block c. The delivery completes when code block c
completes executing.
Because each code block exists in one and only one element,
delivery can be thought of as occurring with respect to elements as
well. An operation executes by being delivered to all bound code blocks
(elements) in the system. Execution completes only after all deliveries
for p complete.
Definition 3 (Operation Execution): An operation p [member of] P is
said to have executed, denoted Execute(p), after it has been delivered
to all code blocks bound to p:
Execute(p) [equivalent to] [for all]c [member of] BindCode(p) :
Deliver(p, c)
Threads. The run-time behavior of the system is characterized by a
set of threads, each of which serially executes a sequence of
operations. Given P, the set of all operation sequences is given by
[P.sup.*]. A sequence can be specified by its constituent operations as
P = <[p.sub.0], [p.sub.1], [p.sub.2], ...>. We denote p to be in
sequence P by p [member of] P, and that p precedes q in sequence P by p
[<.sub.P] q.
A thread T [member of] T is specified by a triple T = (P, V, F):
* P [member of] [P.sup.*] is a sequence of operations organized as
a stack, also referred to as T.ops. Threads execute by sequentially
executing the operations on the stack, beginning with the head
operation. To emphasize that operations are executed within a thread,
the notations used in Definitions 2 and 3 are extended to T.Deliver(p,
c) and T.Execute(p).
* V [member of] [V.sub.thd] is a subset of private thread variables
that are only accessible through thread T. [V.sub.thd] is disjoint from
the state variables V. Although two threads can contain private
variables with the same names, each thread maintains its own independent
value. The set of private variables for thread T is also referred to as
T.vars.
* F is a data structure called a frame stack, which is used to
preserve the contents of private thread variables across nested
operation invocations, essentially providing scope to the variables in
the thread. A nested operation invocation occurs when an operation
pushes a sequence of operations onto T.ops.
The pseudocode for the execution of a thread is given in Figure 1.
The while loop executes until the operation stack is exhausted. The head
operation is popped from the stack and stored in variable p. Line 5
executes the operation. Note that because the statement T.Execute(p)
does not complete until all deliveries for p complete (i.e., until all
code blocks bound to p have executed), the operations within a thread
are executed in a strictly sequential order.
Figure 1 Pseudocode for thread execution
1 procedure ExecuteThread(T):
2 while T.ops[not equal to]0 do
3 p := head(T.ops)
4 T.ops := tail(T.ops)
5 T.Execute(p)
6 end do
Delivery actions. The code blocks that execute during an operation
delivery cannot make arbitrary state changes to the system. Their
effects are confined to the current thread and the element to which the
operation is delivered. Specifically, a single delivery of operation p
in thread T to code block c (i.e., T.Deliver(p, c)) can perform any of
the following actions:
1. State variables of element e (the element to which c belongs)
can be read. c.readVars denotes the set of state variables read by c.
2. State variables of element e can be written by code block c,
denoted by c.writeVars.
3. Private variables within thread T can be read by code block c,
denoted by c.readThdVars.
4. Private variables within thread T can be written by code block
c, denoted by c.writeThdVars.
5. The code block can atomically push new operations onto the
operation stack of T. Let c.pushOps refer to the set of operation
sequences that can be pushed while c executes. Only one of these
sequences, however, is pushed when c executes (allows the code block to
push different sequences depending upon run-time conditions).
6. The code block can create new threads whose states are
initialized from the private variables in the parent thread T and the
state variables in e.
These six items constitute a signature of code block c, and the
signature provides a succinct, black-box description of how the code
block manipulates element state and thread state during run time.
Programmers specify the signature as meta-data for each code block that
they develop. The collective set of signatures for all code blocks in
the system are used to derive dataflow dependencies among operations
given the current BindCode mapping function.
COPYRIGHT 2003 All Rights
Reserved. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.
Copyright 2003, Gale Group. All rights
reserved. Gale Group is a Thomson Corporation Company.
NOTE: All illustrations and photos have been removed from this article.