The dawning of the autonomic computing era.


by Ganek, Alan G.; Corbi, Thomas A.
IBM Systems Journal • March 2003

On March 8, 2001, Paul Horn, IBM Senior Vice President and Director of Research, presented the theme and importance of autonomic computing to the National Academy of Engineering at Harvard University. His message was:

The information technology industry loves to prove the impossible possible. We obliterate barriers and set records with astonishing regularity. But now we face a problem springing from the very core of our success--and too few of us are focused on solving it. More than any other I/T problem, this one--if it remains unsolved--will actually prevent us from moving to the next era of computing. The obstacle is complexity ... Dealing with it is the single most important challenge facing the I/T industry. (1)

One month later, Irving Wladawsky-Berger, Vice President of Strategy and Technology for the IBM Server Group, introduced the Server Group's autonomic computing project (then named eLiza * (2)), with the goal of providing self-managing systems to address those concerns. Thus began IBM's commitment to deliver "autonomic computing"--a new companywide and, it is to be hoped, industry-wide, initiative targeted at coping with the rapidly growing complexity of operating, managing, and integrating computing systems.

We do not see a change in Moore's law (3) that would slow development as the main obstacle to further progress in the information technology (IT) industry. Rather, it is the IT industry's exploitation of the technologies in accordance with Moore's law that has led to the verge of a complexity crisis. Software developers have fully exploited a four- to six-orders-of-magnitude increase in computational power--producing ever more sophisticated software applications and environments. There has been exponential growth in the number and variety of systems and components. The value of database technology and the Internet has fueled significant growth in storage subsystems to hold petabytes (4) of structured and unstructured information. Networks have interconnected the distributed, heterogeneous systems of the IT industry. Our information society creates unpredictable and highly variable workloads on those networked systems. And today, those increasingly valuable, complex systems require more and more skilled IT professionals to install, configure, operate, tune, and maintain them.

IBM is using the phrase "autonomic computing" (5) to represent the vision of how IBM, the rest of the IT industry, academia, and the national laboratories can address this new challenge. By choosing the word "autonomic," IBM is making an analogy with the autonomic nervous system. The autonomic nervous system frees our conscious brain from the burden of having to deal with vital but lower-level functions. Autonomic computing will free system administrators from many of today's routine management and operational tasks. Corporations will be able to devote more of their IT skills toward fulfilling the needs of their core businesses, instead of having to spend an increasing amount of time dealing with the complexity of computing systems.

Need for autonomic computing

As Frederick P. Brooks, Jr., one of the architects of the IBM System/360 *, observed, "Complexity is the business we are in, and complexity is what limits us." (6) The computer industry has spent decades creating systems of marvelous and ever-increasing complexity. But today, complexity itself is the problem.

The spiraling cost of managing the increasing complexity of computing systems is becoming a significant inhibitor that threatens to undermine the future growth and societal benefits of information technology. Simply stated, managing complex systems has grown too costly and prone to error. Administering a myriad of system management details is too labor-intensive. People under such pressure make mistakes, increasing the potential for system outages with a concurrent impact on business. Testing and tuning complex systems are also becoming more difficult. Consider:

* It is now estimated that one-third to one-half of a company's total IT budget is spent preventing or recovering from crashes. (7)

* Nick Tabellion, CTO of Fujitsu Softek, said: "The commonly used number is: For every dollar to purchase storage, you spend $9 to have someone manage it." (8)

* Aberdeen Group studies show that administrative cost can account for 60 to 75 percent of the overall cost of database ownership (this includes administrative tools, installation, upgrade and deployment, training, administrator salaries, and service and support from database suppliers). (9)

* When you examine data on the root cause of computer system outages, you find that about 40 percent are caused by operator error, (10) and the reason is not that operators are poorly trained or lack the right capabilities. Rather, it is that the complexities of today's computer systems are too difficult to understand, and IT operators and managers are under pressure to make decisions about problems in seconds. (11)

* A Yankee Group report (12) estimated that downtime caused by security incidents cost as much as $4,500,000 per hour for brokerages and $2,600,000 for banking firms.

* David J. Clancy, chief of the Computational Sciences Division at the NASA Ames Research Center, underscored the problem of increasing systems complexity: "Forty percent of the group's software work is devoted to test," he said, and added, "As the range of behavior of a system grows, the test problem grows exponentially." (13)

* A recent Meta Group study looked at the impact of downtime by industry sector as shown in Figure 1.

[FIGURE 1 OMITTED]

Although estimated, cost data such as shown in Figure 1 are indicative of the economic impact of system failures and downtime. According to a recent IT resource survey by the Merit Project of Computer Associates International, 1867 respondents grouped the most common causes of outages into four areas of data center operations: systems, networks, database, and applications. (14) Most frequently cited outages included:

* For systems: operational error, user error, third-party software error, internally developed software problem, inadequate change control, lack of automated processes

* For networks: performance overload, peak load problems, insufficient bandwidth

* For database: out of disk space, log file full, performance overload

* For applications: application error, inadequate change control, operational error, nonautomated application exceptions

Well-engineered autonomic functions targeted at improving and automating systems operations, installation, dependency management, and performance management can address many causes of these "most frequent" outages and reduce outages and downtime.
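To make the idea concrete, the following is a minimal sketch, not taken from the article, of how an automated policy might react to a few of the outage causes listed above (out of disk space, log file full, performance overload). The metric names, thresholds, and action names are all hypothetical and chosen only for illustration; a real system would use site-specific measurements and remedies.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Rule:
    """One hypothetical policy: when a metric reaches a limit, name an action."""
    metric: str
    limit: float
    action: str


# Illustrative thresholds only (assumptions, not values from the article).
RULES: Sequence[Rule] = (
    Rule(metric="disk_used_pct", limit=90.0, action="extend_tablespace"),
    Rule(metric="log_used_pct", limit=80.0, action="archive_and_truncate_log"),
    Rule(metric="cpu_load_pct", limit=95.0, action="add_capacity"),
)


def plan_actions(metrics: Dict[str, float],
                 rules: Sequence[Rule] = RULES) -> List[str]:
    """Return, in rule order, the actions whose metric is at or above its limit.

    Metrics that were not reported are treated as 0.0, so they trigger nothing.
    """
    return [r.action for r in rules if metrics.get(r.metric, 0.0) >= r.limit]


if __name__ == "__main__":
    sample = {"disk_used_pct": 93.5, "log_used_pct": 41.0, "cpu_load_pct": 97.0}
    print(plan_actions(sample))
```

The sketch only decides which action to take; carrying out the action without an operator in the loop is the harder part that the article argues systems must learn to do for themselves.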

A confluence of marketplace forces is driving the industry toward autonomic computing. Complex heterogeneous infrastructures composed of dozens of applications, hundreds of system components, and thousands of tuning parameters are a reality. New business models depend on the IT infrastructure being available 24 hours a day, 7 days a week. In the face of an economic downturn, there is an increasing management focus on "return on investment" and operational cost controls--while staffing costs exceed the costs of technology. To compound matters further, there continues to be a scarcity of highly skilled IT professionals to install, configure, optimize, and maintain these complex, heterogeneous systems.

To respond, system design objectives must shift from the "pure" price/performance requirements to issues of robustness and manageability in the total-cost-of-ownership equation. As a profession, we must strive to simplify and automate the management of systems. Today's systems must evolve to become much more self-managing, that is: self-configuring, self-healing, self-optimizing, and self-protecting.

Irving Wladawsky-Berger outlined the solution at the Kennedy Consulting Summit in November 2001: "There is only one answer: The technology needs to manage itself. Now, I don't mean any far out AI project; what I mean is that we need to develop the right software, the right architecture, the right mechanisms ... So that instead of the technology behaving in its usual pedantic way and requiring a human being to do everything for it, it starts behaving more like the 'intelligent' computer we all expect it to be, and starts taking care of its own needs. If it doesn't feel well, it does something. If someone is attacking it, the system recognizes it and deals with the attack. If it needs more computing power, it just goes and gets it, and it doesn't keep looking for human beings to step in." (15)

What is autonomic computing?


COPYRIGHT 2003 All Rights Reserved. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.
NOTE: All illustrations and photos have been removed from this article.

