Beyond backup toward storage management.
by Kaczmarski, M.; Jiang, T.; Pease, D.A.
The proliferation of distributed computing and Internet usage,
together with continually falling storage prices, greater disk
capacities, and tremendous data growth, challenges storage administrators
to provide nonintrusive backup and proper recovery of data.
Enterprise computer system data protection is subject to operational
demands that are driven by varying business requirements and continual
advancements in storage technology. A number of factors lead to inherent
complexity in the seemingly mundane task of recovering data.
Not all data are the same. Information that supports important
business processes may be distributed across multiple applications,
databases, file systems, and hosts--intermixed with data that are easily
recreated and clearly less important to the enterprise. Data elements
that share the same file system or host containers have varying levels
of importance, depending upon the applications they support or the rate
at which they are changed. The management complexity in dealing with
this environment often leads to inefficient backup practices because all
data are treated at the level required for the most important elements.
Differentiated data requirements need to be recognized and managed in an
automated way to control network and media resources and the
administrative expense involved in backup processing.
Disaster recovery can involve many dimensions, ranging from simple
user errors that cause the loss of word-processing files or
spreadsheets, to hard drive failures that impact entire file systems or
databases, to tragic losses of buildings and assets that include
large-scale information technology infrastructure and storage
subsystems. Backup management is usually tuned to one of these possible
disaster situations at the expense of efficient recovery should one of
the others occur. Automated management of backup storage helps administrators
move from operational monitoring to developing strategies and practices
for handling each of these situations.
New storage devices and technology come and go, creating a struggle
in migrating massive amounts of data from one device type to another
while maintaining application availability. Failure to keep up with
advances in storage technology can expose an enterprise to long-term
support problems should its existing devices fail. These exposures
directly affect the ability of the organization to provide proper data
protection.
These factors provide the motivation to go beyond backup processing
to a more comprehensive storage management paradigm--one that controls
costs and automates common tasks by providing a means to map the
underlying storage to the requirements of a business.
The IBM Workstation Data Save Facility (WDSF) was developed in the
late 1980s at the IBM Almaden Research Center to meet customer
requirements for distributed network backup. The product underwent
significant redevelopment and became the ADSTAR Distributed Storage
Manager (ADSM) in 1993. It was later renamed the Tivoli Storage Manager
(TSM). The need for network backup emerged from distributed
client/server computing with the proliferation of personal computers and
workstations. The goal was to centralize the protection of distributed
data in an environment where information assets were no longer
restricted to controlled mainframe computer environments. Backing up
individual computers to locally attached devices was, and still is,
costly and error-prone and often did not meet requirements for disaster
recovery. With TSM, clients can back up their data to central servers.
The servers store the data on a variety of media and track the location
of the data for later retrieval.
Tivoli Storage Manager is a client/server application that provides
backup and recovery operations, archive and retrieve operations,
hierarchical storage management (HSM), and disaster recovery planning
across heterogeneous client hosts and centralized storage management
servers. Support has been made available for over 15 client platforms, 7
server platforms, and over 400 different storage devices as illustrated
in Figure 1. Specialized clients, represented as green cylinders in the
figure, supply backup and restore or archive and retrieve support for
specific applications such as DB2 (Database 2), Lotus Domino,
Microsoft Exchange, and SAP R/3. A client application programming
interface (API) is also provided for those customers or business partners
who wish to store data directly in TSM and retrieve it. Data are
transferred between the clients and the TSM server over the
communications network or across a storage area network. A Web-based
administrative interface and a coordinated distribution of shared
management policy provide a common control point for multiple storage
management server instances.
[FIGURE 1 OMITTED]
Advanced design points were established for TSM in an environment
where many network backup applications evolved from simple single-host
backup utilities. The primary influences were the need to deal with
relatively slow network speeds, scalability in handling a large number
of clients and platforms, and the desire to manage data with policy
constructs borrowed from systems-managed storage (SMS) of mainframe
computers. This paper describes these design points with a survey of
functions and features that demonstrate storage management capabilities
in TSM. Today, these capabilities provide management for active data as
well as backup copies.
Minimizing network traffic: Progressive incremental backup
The rate of growth in the amount of data stored in computer systems
has traditionally outpaced growth in network bandwidth. The use of
traditional communication lines for backup processing suggests that
indiscriminate backup loads can clog or disable communication networks.
Control is needed to determine when backup processing takes place and to
ensure that backup communications traffic is minimized when it does
occur.
The goal behind the progressive incremental backup of TSM is that
once backed up, unchanged data should never have to be resent to (or
backed up again on) the server. Most backup methodologies for open systems
have been developed to optimize data placement for tape media reuse and not
to minimize data transfer and optimize scalability in a client/server
environment.
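The idea of never resending unchanged data can be illustrated with a minimal sketch. It assumes that change is detected by comparing file size and modification time against an inventory kept by the server. The names FileState, select_files_to_send, and progressive_backup are hypothetical and are not taken from TSM, and handling of deleted files is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileState:
    """Attributes compared to detect change (an assumption for this sketch)."""
    size: int
    mtime: float


def select_files_to_send(client_files, server_inventory):
    """Return paths that are new or whose state differs from the server's record."""
    return sorted(
        path
        for path, state in client_files.items()
        if server_inventory.get(path) != state
    )


def progressive_backup(client_files, server_inventory):
    """Send only new or changed files and record them in the server inventory."""
    sent = select_files_to_send(client_files, server_inventory)
    for path in sent:
        server_inventory[path] = client_files[path]
    return sent


inventory = {}

# First backup: nothing is known to the server, so every file is sent.
day0 = {"a.txt": FileState(100, 1.0), "b.txt": FileState(200, 1.0)}
first = progressive_backup(day0, inventory)

# Later backup: only b.txt has changed, so only b.txt crosses the network.
day1 = {"a.txt": FileState(100, 1.0), "b.txt": FileState(250, 2.0)}
second = progressive_backup(day1, inventory)
```

In this sketch the server inventory plays the role that a periodic full backup plays in traditional schemes: it is the reference point for deciding what is already protected, so no full resend is needed.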
The use of tape as a backup medium requires that periodic
consolidation of data be performed. Tape differs from disk in that once
a tape is initialized (or "labeled"), data can only be
appended to the tape until it is full, after which time it must be
reinitialized before it can be reused. Tape consolidation is required
because files change at differing rates; subsequent backup operations
that only copy changed files will store the new copies on additional
tapes. On existing tapes this leaves logical "holes" occupied
by file copies that are no longer current. Over time, these operations
fragment the regions of useful data on the original tape volumes and
spread backup data across many tapes, requiring more time and mount
activity to perform a restore operation and requiring more media for
storing backup data. Traditional incremental or differential backup
methods achieve consolidation by periodically performing a new full
backup of all of the data to a fresh tape. This method frees the
original tapes so that they can be reused but has the side effect of
resending all data, changed or not, across the network. It not only
wastes network bandwidth, processing cycles, and tapes, but also
leads to having to manage more data.
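The fragmentation described above can be made concrete with a small model. This is a sketch under simplifying assumptions: each backup run writes to its own append-only tape, file sizes are given in megabytes, and the function names are invented for illustration.

```python
def append_backup(tapes, current_location, changed_files, tape_id):
    """Write copies of changed files to a new tape; older copies become holes."""
    tapes[tape_id] = dict(changed_files)  # file name -> size in MB
    for name in changed_files:
        current_location[name] = tape_id  # the newest copy now lives here


def valid_fraction(tapes, current_location, tape_id):
    """Fraction of the data written to a tape that is still the current copy."""
    contents = tapes[tape_id]
    total = sum(contents.values())
    valid = sum(
        size
        for name, size in contents.items()
        if current_location[name] == tape_id
    )
    return valid / total if total else 0.0


tapes, current = {}, {}
append_backup(tapes, current, {"a": 40, "b": 40, "c": 20}, "tape0")  # initial copy
append_backup(tapes, current, {"a": 40}, "tape1")  # only file a changed
append_backup(tapes, current, {"b": 40}, "tape2")  # only file b changed

# Tapes that a full restore would have to mount.
tapes_to_mount = sorted(set(current.values()))
```

After the three runs, only file c on tape0 is still current, so most of that tape is occupied by holes, yet a full restore must still mount all three tapes.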
Figure 2 illustrates the most common consolidation methods in use
in comparison with the progressive (or incremental forever) methodology
of TSM. Increasing points in time are displayed on the left as times
T0 through T5. Each column in the figure represents a
different backup technique, with use of tape for backup storage depicted
as square tape cartridges. The dark areas on the cartridges represent
used portions of tape, whereas lighter regions represent unused tape or
regions of tape that are no longer valid because the file copies in
these areas are no longer needed.
[FIGURE 2 OMITTED]
Incremental backup processing is shown in the first column. The
technique involves periodic full backup operations that copy all data
(at times T0 and T3), interspersed with
"incremental" backup operations that only copy data that have
changed since the last full or incremental backup operation (times
T1, T2, T4, and T5). Although relatively
efficient for backup processing, the technique can require the highest
number of tape mount operations when restoring data. A full restore
operation needed shortly after time T2 but before time T3,
for example, would have to restore the data from tapes created at times
T0, T1, and T2.
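The restore chain for the incremental scheme can be sketched as follows. The schedule list mirrors times T0 through T5 from the text, with list index t standing for time Tt. The function name is hypothetical, and the sketch assumes one tape per backup operation.

```python
def tapes_for_incremental_restore(schedule, restore_time):
    """Return the time indexes of all tapes needed for a full restore.

    A restore needs the most recent full backup at or before restore_time
    plus every incremental backup taken after it, up to restore_time.
    """
    last_full = max(
        t for t in range(restore_time + 1) if schedule[t] == "full"
    )
    return list(range(last_full, restore_time + 1))


# Full backups at T0 and T3, incremental backups at T1, T2, T4, and T5.
schedule = ["full", "incr", "incr", "full", "incr", "incr"]

after_t2 = tapes_for_incremental_restore(schedule, 2)
after_t3 = tapes_for_incremental_restore(schedule, 3)
```

For a restore just after T2 the result is the tapes from T0, T1, and T2, matching the example in the text. The chain grows by one tape with each incremental backup until the next full backup resets it.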
Differential backup processing, column 2 in the figure, is similar
to incremental backup processing. Periodic full backup operations (times
T0 and T4) are interspersed with "differential"
backups (times T1, T2, and T3), which copy all data
that have changed since the last full backup operation. Restore
operations are more efficient than with incremental processing because
fewer volumes need to be mounted, but differential backup still requires
unchanged data to be repetitively copied (or backed up) from the client
to the server. A full restore operation after time T2 in the
differential model would need data from tapes created at times T0
and T2 (since the tape at time T2 also contains the data
copied to the tape created at time T1).
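The trade-off in the differential scheme, fewer tapes per restore but repeated copying, can be sketched in the same style. The schedule covers times T0 through T4 as given in the text. The function names, the file names, and the changed_at data are invented for illustration.

```python
def tapes_for_differential_restore(schedule, restore_time):
    """Return the time indexes of the tapes needed for a full restore.

    Only the last full backup and the latest differential are required,
    because each differential holds every change since the last full backup.
    """
    last_full = max(
        t for t in range(restore_time + 1) if schedule[t] == "full"
    )
    if restore_time == last_full:
        return [last_full]
    return [last_full, restore_time]


def files_copied(schedule, changed_at, all_files):
    """Return the set of files copied to the server at each point in time.

    changed_at[t] is the set of files modified between time t-1 and time t.
    """
    copied = []
    since_full = set()
    for t, kind in enumerate(schedule):
        if kind == "full":
            since_full = set()
            copied.append(set(all_files))
        else:
            since_full |= changed_at[t]
            copied.append(set(since_full))
    return copied


# Full backups at T0 and T4, differential backups at T1, T2, and T3.
schedule = ["full", "diff", "diff", "diff", "full"]
changed_at = [set(), {"a"}, {"b"}, {"c"}, set()]  # hypothetical change pattern

after_t2 = tapes_for_differential_restore(schedule, 2)
copied = files_copied(schedule, changed_at, {"a", "b", "c"})
```

With this change pattern, file a is copied again at T2 and T3 even though it last changed before T1, which is the repeated transfer of unchanged data that the text describes.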
COPYRIGHT 2003 All Rights
Reserved. Reproduced with permission of the copyright holder. Further reproduction or distribution is prohibited without permission.