
The software architecture of a SAN storage control system.


by Glider, J.S.; Fuente, C.F.; Scales, W.J.
IBM Systems Journal • July 2003

Storage controllers have traditionally enabled mainframe computers to access disk drives and other storage devices. (1) To support expensive enterprise-level mainframes built for high performance and reliability, storage controllers were designed to move data in and out of mainframe memory as quickly as possible, with as little impact on mainframe resources as possible. Consequently, storage controllers were carefully crafted from custom-designed processing and communication components, and optimized to match the performance and reliability requirements of the mainframe.

In recent years, several trends in the information technology used in large commercial enterprises have affected the requirements that are placed on storage controllers. UNIX ** and Windows ** servers have gained significant market share in the enterprise. The requirements placed on storage controllers in a UNIX or Windows environment are less exacting in terms of response time. In addition, UNIX and Windows systems require fewer protocols and connectivity options. Enterprise systems have evolved from a single operating system environment to a heterogeneous open systems environment in which multiple operating systems must connect to storage devices from multiple vendors.

Storage area networks (SANs) have gained wide acceptance. Interoperability issues between components from different vendors connected by a SAN fabric have received attention and have mostly been resolved, but the problem of managing the data stored on a variety of devices from different vendors is still a major challenge to the industry. At the same time, various components for building storage systems have become commoditized and are available as inexpensive off-the-shelf items: high-performance processors (Pentium **-based or similar), communication components such as Fibre Channel switches and adapters, and RAID (2) (redundant array of independent disks) controllers.

In 1996, IBM embarked on a program that eventually led to the IBM TotalStorage * Enterprise Storage Server * (ESS). The ESS core components include such standard components as the PowerPC *-based pSeries * platform running AIX * (a UNIX operating system built by IBM) and the RAID adapter. ESS also includes custom-designed components such as nonvolatile memory, adapters for host connectivity through SCSI (Small Computer System Interface) buses, and adapters for Fibre Channel, ESCON * (Enterprise Systems Connection) and FICON * (Fiber Connection) fabrics. An ESS provides high-end storage control features such as very large unified caches, support for zSeries * FICON and ESCON attachment as well as open systems SCSI attachment, high availability through the use of RAID-5 arrays, failover pairs of access paths and fault-tolerant power supplies, and advanced storage functions such as point-in-time copy and peer-to-peer remote copy. An ESS controller, containing two access paths to data, can have varying amounts of back-end storage, front-end connections to hosts, and disk cache, thereby achieving a degree of scalability.

A project to build an enterprise-level storage control system, also referred to as a "storage virtualization engine," was initiated at the IBM Almaden Research Center in the second half of 1999. One of its goals was to build such a system almost exclusively from off-the-shelf standard parts. Like any enterprise-level storage control system, it had to deliver high performance and availability, comparable to the highly optimized storage controllers of previous generations. It also had to address a major challenge for the heterogeneous open systems environment, namely to reduce the complexity of managing storage on block devices. The importance of dealing with the complexity of managing storage networks is brought to light by the total-cost-of-ownership (TCO) metric applied to storage networks. A Gartner report (4) indicates that storage acquisition costs are only about 20 percent of the TCO. Most of the remaining costs are related, in one way or another, to managing the storage system.

Thus, the SAN storage control project targets one area of complexity through block aggregation, also known as block virtualization. (5) Block virtualization is an organizational approach to the SAN in which storage on the SAN is managed by aggregating it into a common pool, and by allocating storage to hosts from that common pool. Its chief benefits are efficient and flexible usage of storage capacity, centralized (and simplified) storage management, and a platform for advanced storage functions.
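As a rough illustration of block aggregation, the sketch below pools fixed-size extents from several back-end devices and hands them out to hosts as virtual disks. The names (`StoragePool`, `add_backend`, `create_vdisk`), the extent size, and the device sizes are assumptions made for this example only; they are not taken from the system described in this paper.

```python
from dataclasses import dataclass, field

EXTENT_MB = 16  # hypothetical extent size, chosen only for this example


@dataclass
class StoragePool:
    """Common pool of free extents aggregated from many back-end devices."""
    free: list = field(default_factory=list)    # (backend, extent_index) pairs
    vdisks: dict = field(default_factory=dict)  # vdisk name -> list of extents

    def add_backend(self, name: str, size_mb: int) -> None:
        # Carve the device into whole extents and add them to the pool.
        for i in range(size_mb // EXTENT_MB):
            self.free.append((name, i))

    def free_mb(self) -> int:
        return len(self.free) * EXTENT_MB

    def create_vdisk(self, name: str, size_mb: int) -> list:
        # Round the request up to a whole number of extents.
        needed = -(-size_mb // EXTENT_MB)
        if needed > len(self.free):
            raise ValueError("pool has insufficient free capacity")
        extents, self.free = self.free[:needed], self.free[needed:]
        self.vdisks[name] = extents
        return extents


pool = StoragePool()
pool.add_backend("raid_a", 64)  # contributes 4 extents
pool.add_backend("raid_b", 32)  # contributes 2 extents
disk = pool.create_vdisk("host1_vol", 80)  # needs 5 extents, spans both devices
```

The host sees one 80 MB volume even though neither back-end device alone is presented to it; the pool decides where the extents come from.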

A Pentium-based server was chosen for the processing platform, in preference to a UNIX server, because of lower cost. However, the bandwidth and memory of a typical Pentium-based server are significantly lower than those of a typical UNIX server. Therefore, instead of a monolithic architecture of two nodes (for high availability) where each node has very high bandwidth and memory, the design is based on a cluster of lower-performance Pentium-based servers, an arrangement that also offers high availability (the cluster has at least two nodes).

The idea of building a storage control system based on a scalable cluster of such servers is a compelling one. A storage control system consisting only of a pair of servers would be comparable in its utility to a midrange storage controller. However, a scalable cluster of servers could not only support a wide range of configurations, but also enable the managing of all these configurations in almost the same way. The value of a scalable storage control system would go well beyond building a storage controller with less cost and effort. It would drastically simplify the management of enterprise storage by providing a single point of management, aggregated storage pools in which storage can easily be allocated to different hosts, scalability in growing the system by adding storage or storage control nodes, and a platform for implementing advanced functions such as fast-write cache, point-in-time copy, transparent data migration, and remote copy.

In contrast, current enterprise data centers are often organized as many islands, each island containing its own application servers and storage, where free space from one island cannot be used in another island. Compare this with a common storage pool from which all requests for storage, from various hosts, are allocated. Storage management tasks (such as allocating storage to hosts, scheduling remote copies, point-in-time copies, and backups, and commissioning and decommissioning storage) are simplified when using a single set of tools and when all storage resources are pooled together.
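The difference between islands and a common pool can be shown with a small worked example. The island names and capacities below are invented for illustration; the point is only that a request larger than any single island's free space fails under the island model but succeeds when the same free space is pooled.

```python
def allocate_from_islands(islands: dict, island: str, request_gb: int) -> bool:
    """Each island can satisfy a request only from its own free space."""
    if islands[island] >= request_gb:
        islands[island] -= request_gb
        return True
    return False


def allocate_from_pool(pool_free_gb: int, request_gb: int):
    """A common pool satisfies a request if the total free space suffices."""
    if pool_free_gb >= request_gb:
        return True, pool_free_gb - request_gb
    return False, pool_free_gb


# Hypothetical free capacities (GB) on three islands.
islands = {"finance": 40, "web": 30, "backup": 50}
request_gb = 60

# The "finance" island has only 40 GB free, so the request fails there.
island_ok = allocate_from_islands(dict(islands), "finance", request_gb)

# Pooled, the same 120 GB of free space satisfies the request.
pool_ok, pool_left = allocate_from_pool(sum(islands.values()), request_gb)
```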

The design of the virtualization engine follows an "in-band" approach, which means that all I/O requests, as well as all management and configuration requests, are sent to it and are serviced by it. This approach migrates intelligence from individual devices to the network, and its first implementation is appliance-based (which means that the virtualization software runs on stand-alone units), although other variations, such as incorporating the virtualization application into a storage network switch, are possible.
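To make the in-band idea concrete, the following toy sketch places a single engine object on the path of both configuration requests and host I/O, translating each virtual block address into a back-end device and block. The class `InBandEngine`, its methods, and the extent size are hypothetical and greatly simplified; they are not the design of the engine described in this paper.

```python
EXTENT_BLOCKS = 1024  # hypothetical extent size in blocks


class InBandEngine:
    """Toy engine that sits in the data path between hosts and back-end storage."""

    def __init__(self):
        self.maps = {}      # vdisk name -> list of (backend, start_block), one per extent
        self.backends = {}  # backend name -> {physical block: data}

    def configure_vdisk(self, vdisk, extents):
        # Management request: handled by the same engine that services I/O.
        self.maps[vdisk] = list(extents)
        for backend, _ in extents:
            self.backends.setdefault(backend, {})

    def translate(self, vdisk, vblock):
        # Map a virtual block to (backend, physical block) via the extent table.
        backend, start = self.maps[vdisk][vblock // EXTENT_BLOCKS]
        return backend, start + vblock % EXTENT_BLOCKS

    def write(self, vdisk, vblock, data):
        # Host I/O request: translated, then forwarded to the back end.
        backend, pblock = self.translate(vdisk, vblock)
        self.backends[backend][pblock] = data

    def read(self, vdisk, vblock):
        backend, pblock = self.translate(vdisk, vblock)
        return self.backends[backend].get(pblock)


engine = InBandEngine()
engine.configure_vdisk("vol0", [("raid_a", 0), ("raid_b", 4096)])
engine.write("vol0", 1030, b"payload")      # block 1030 falls in the second extent
location = engine.translate("vol0", 1030)   # where the block actually lives
```

Because every request passes through the engine, the host never needs to know which back-end device holds a given block.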

There have been other efforts in the industry to build scalable virtualized storage. The Petal research project from Digital Equipment Corporation (6) and the DSM ** product from LeftHand Networks, Inc. (7) both incorporate clusters of storage servers, each server privately attached to its own back-end storage. Our virtualization engine prototype differs from these designs in that the back-end storage is shared by all the servers in the cluster. VERITAS Software Corporation (8) markets Foundation Suite **, a clustered volume manager that provides virtualization and storage management. This design has the virtualization application running on hosts, thus requiring that the software be installed on all hosts and that all hosts run the same operating system. Compaq (now part of Hewlett-Packard Company) uses the Versastor ** technology, (9) which provides a virtualization solution based on an out-of-band manager appliance controlling multiple in-band virtualization agents running on specialized Fibre Channel host bus adapters or other processing elements in the data path. This more complex structure amounts to a two-level hierarchical architecture in which a single manager appliance controls a set of slave host-resident or switch-resident agents.

The rest of the paper is structured as follows. In the next section, we present an overview of the virtualization engine, which includes the hardware configuration, a discussion of the main challenges facing the software designers, and an overview of the software architecture. In the following four sections we describe the major software infrastructure components: the cluster operating environment, the distributed I/O facilities, the buffer management component, and the hierarchical object pools. In the last section we describe the experience gained from implementing the virtualization engine and present our conclusions.

Overview of the virtualization engine


COPYRIGHT 2003 All Rights Reserved. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.
NOTE: All illustrations and photos have been removed from this article.

