More Resources

A system model for dynamically reconfigurable software.


by Whisnant, Keith^Kalbarczyk, Zbigniew T.^Iyer, Ravishankar K.
IBM Systems Journal • March, 2003 •

Reconfigurability provides the foundation upon which autonomic systems can adapt to their changing environments, which is useful for dynamically optimizing system functionality based upon the observed execution profile or for recovering from errors and failures without human intervention. Unfortunately, reconfiguring distributed systems is difficult in practice, since the processes are often interdependent. Arbitrarily changing the behavior of one process through reconfiguration may render other processes unusable. By formally expressing the dependencies among processes, both static and dynamic reconfigurations can be analyzed to determine whether they are compatible with the existing configuration.

This paper describes a model for formally capturing the structure and run-time behavior of a distributed system. The structure is defined by a set of elements containing the state variables in the system. The runtime behavior is defined by threads that execute atomic actions called operations. Operations invoke code blocks to bring about state changes in the system, but these state changes are limited to a single element and thread. The variables accessed during the execution of an operation constitute the signature of the operation. The collective set of operation signatures in the system can be used to derive dataflow dependencies between operations and between elements.

Reconfiguring the system is modeled by changing the code blocks that are invoked by an operation. Given the current configuration of the system, a proposed reconfiguration can be subjected to an off-line test to determine whether any of the dataflow dependencies of the system change as a result of the reconfiguration. If there are changes to the dataflow dependencies, the user is warned that the proposed reconfiguration may not be safe. Ultimately, the user decides whether or not to apply a reconfiguration. (1)

We have employed the proposed system model in developing a software-implemented fault-tolerant (SIFT) environment based around ARMORs (Adaptive Reconfigurable Mobile Objects of Reliability) to provide error detection and recovery for distributed applications. To ensure that the SIFT environment is resilient to failures, dedicated fault-tolerant mechanisms were added by reconfiguring the ARMOR processes (2) that make up the SIFT environment. The system model and associated criteria for safe reconfigurations were applied to verify that the additional fault-tolerant mechanisms did not impact the SIFT functionality in unexpected ways.

System model

An abstract model that captures the structure and run-time execution of the system is used to analyze the effects of reconfiguration on the dependencies among the various components in the system. A system is specified by a triple (C, V, T):

* C is the set of code blocks in the system. A code block performs a computation triggered by events called operations.

* V is the set of state variables (3) in the system. Variables are only accessed through executing code blocks.

* T is the set of threads in the system. Threads execute by sequentially invoking code blocks in the system through operations, which bring about state changes by manipulating the system variables in V.

Elements. State variables in the system are partitioned into components called elements. Code blocks are placed in the elements containing the state variables manipulated by the code blocks. Given a predicate Access(c, v) that is true if and only if code block c [member of] C reads or writes variable v [member of] V, an element can be defined formally as follows.

Definition 1 (Element): An element is a pair (C, V), where V is a set of variables and C is a set of code blocks that do not access variables other than those in V:

element e [equivalent to] (C, V), C [subset or equal to] C [conjunction]([for all]c [member of] C, V [subset or equal to] V, v [member of] V : Access(c, v) [??] v [member of] V)

Elements partition the system so that each variable is found in one and only one element, and the code blocks within an element cannot access state variables residing in other elements.

Operations. Code blocks in the system are indirectly invoked through operations. One operation can invoke several code blocks as defined by a function BindCode: P [right arrow] [2.sup.c], where P is the alphabet of operations and [2.sup.c] is the power set over all code blocks (i.e., BindCode maps an operation to a set of code blocks). The notation "p [right arrow] c" is also used to denote that operation p is bound to code block c via the BindCode function. An operation executes by being delivered to all code blocks bound to the operation via the BindCode function.

Definition 2 (Operation Delivery): An operation p [member of] P is said to be delivered to a code block c [member of] c, denoted Deliver(p, c), by executing code block c. The delivery completes when code block c completes executing.

Because each code block exists in one and only one element, delivery can be thought of as occurring with respect to elements as well. An operation executes by being delivered to all bound code blocks (elements) in the system. Execution completes only after all deliveries for p complete.

Definition 3 (Operation Execution): An operation p [member of] P is said to have executed, denoted Execute(p), after it has been delivered to all code blocks bound to p:

Execute(p) [equivalent to] [for all]c [member of] BindCode(p) : Deliver(p, c)

Threads. The run-time behavior of the system is characterized by a set of threads, each of which serially executes a sequence of operations. Given P, the set of all operation sequences is given by [P.sup.*]. A sequence can be specified by its constituent operations as P = <[p.sub.0], [p.sub.1], [p.sub.2], ...>. We denote p to be in sequence P by p [member of] P, and that p precedes q in sequence P by p [<.sub.P] q.

A thread T [member of] T is specified by a triple T = (P, V, F):

* P [member of] [P.sup.*] is a sequence of operations organized as a stack, also referred to as T.ops. Threads execute by sequentially executing the operations on the stack, beginning with the head operation. To emphasize that operations are executed within a thread, the notations used in Definitions 2 and 3 are extended to T.Deliver(p, c) and T.Execute(p).

* V [member of] [V.sub.thd] is a subset of private thread variables that are only accessible through thread T. [V.sub.thd] is disjoint from the state variables V. Although two threads can contain private variables with the same names, each thread maintains its own independent value. The set of private variables for thread T is also referred to as T.vars.

* F is a data structure called a frame stack, which is used to preserve the contents of private thread variables across nested operation invocations, essentially providing scope to the variables in the thread. A nested operation invocation occurs when an operation pushes a sequence of operations onto T.ops.

The pseudocode for the execution of a thread is given in Figure 1. The while loop executes until the operation stack is exhausted. The head operation is popped from the stack and stored in variable p. Line 5 executes the operation. Note that because the statement T.Execute(p) does not complete until all deliveries for p complete (i.e., until all code blocks bound to p have executed), the operations within a thread are executed in a strictly sequential order. Figure 1 Pseudocode for thread execution 1 procedure ExecuteThread(T): 2 while T.ops[not equal to]0 do 3 p := head(T.ops) 4 T.ops := tail(T.ops) 5 T.Execute(p) 6 end do

Delivery actions. The code blocks that execute during an operation delivery cannot make arbitrary state changes to the system. Their effects are confined to the current thread and the element to which the operation is delivered. Specifically, a single delivery of operation p in thread T to code block c (i.e., T.Deliver(p, c)) can perform any of the following actions:

1. State variables of element e (the element to which c belongs) can be read. c.readVars denotes the set of state variables read by c.

2. State variables of element e can be written by code block c, denoted by c.writeVars.

3. Private variables within thread T can be read by code block c, denoted by c.readThdVars.

4. Private variables within thread T can be written by code block c, denoted by c.writeThdVars.

5. The code block can atomically push new operations onto the operation stack of T. Let c.pushOps refer to the set of operation sequences that can be pushed while c executes. Only one of these sequences, however, is pushed when c executes (allows the code block to push different sequences depending upon run-time conditions).

6. The code block can create new threads whose states are initialized from the private variables in the parent thread T and the state variables in e.

These six items constitute a signature of code block c, and the signature provides a succinct, black-box description of how the code block manipulates element state and thread state during run time. Programmers specify the signature as meta-data for each code block that they develop. The collective set of signatures for all code blocks in the system are used to derive dataflow dependencies among operations given the current BindCode mapping function.


1  2  3  4  5  6  
COPYRIGHT 2003 All Rights Reserved. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.
NOTE: All illustrations and photos have been removed from this article.


Browse by Journal Name:
Today on Entrepreneur
Related Video

e-Business & Technology
Franchise News
Business Book Sampler
Starting a Business
Sales & Marketing
Growing a Business
E-mail*:
Zip Code*: