IBM Spectrum Protect Introduction To Data Protection Solutions
8.1.16
IBM
Note:
Before you use this information and the product it supports, read the information in “Notices” on page
47.
Edition notice
This edition applies to version 8, release 1, modification 16 of IBM Spectrum® Protect (product numbers 5725-W98,
5725-W99, 5725-X15), and to all subsequent releases and modifications until otherwise indicated in new editions.
© Copyright International Business Machines Corporation 1993, 2022.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with
IBM Corp.
About this publication
This publication provides an overview of IBM Spectrum Protect concepts and data protection solutions
that use best practices for IBM Spectrum Protect. A feature comparison chart helps you select the best
solution for your organization's needs.
Publications
The IBM Spectrum Protect product family includes IBM Spectrum Protect Plus, IBM Spectrum Protect
for Virtual Environments, IBM Spectrum Protect for Databases, and several other storage management
products from IBM®.
To view IBM product documentation, see IBM Documentation.
Server
Client systems send data to the server to be stored as backups or archived data. The server includes an
inventory, which is a repository of information about client data.
The inventory includes the following components:
Database
The server database stores information about each file, logical volume, or database that the server
backs up, archives, or migrates. The server database also contains information about the
policy and schedules for data protection services.
Recovery log
Records of database transactions are kept in this log. The recovery log helps to maintain data
consistency in the database.
Client software
For IBM Spectrum Protect to protect client data, the appropriate software must be installed on the
client system and the client must be registered with the server.
Client nodes
A client node is equivalent to a computer, virtual machine, or application, such as a backup-archive
client that is installed on a workstation for file system backups. Each client node must be registered
with the server. Multiple nodes can be registered on a single computer.
IBM Spectrum Protect provides the following types of data protection services:
Back up and restore services
You run a backup process to create a copy of a data object that can be used for recovery if the original
data object is lost. A data object can be a file, a directory, or a user-defined data object, such as a
database.
IBM Spectrum Protect uses policies to control how the server stores and manages data objects on various
types of storage devices and media. You associate a client with a policy domain that contains one active
policy set. When a client backs up, archives, or migrates a file, the file is bound to a management
class in the active policy set of the policy domain. The management class and the backup and archive
copy groups specify where files are stored and how they are managed. If you set up server storage in a
hierarchy, you can migrate files to different storage pools.
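The following Python sketch is an illustration only, not product code. It models the relationship that is described above: a policy domain has one active policy set, a file is bound to a management class in that set, and the backup or archive copy group of the class determines where the file is stored. The class names, the default_class field, and the pool names that you pass in are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CopyGroup:
    destination: str  # name of the storage pool that receives the data


@dataclass
class ManagementClass:
    name: str
    backup_copy_group: CopyGroup
    archive_copy_group: CopyGroup


@dataclass
class PolicySet:
    name: str
    default_class: str  # assumption: class used when none is requested
    classes: Dict[str, ManagementClass] = field(default_factory=dict)


@dataclass
class PolicyDomain:
    name: str
    active_policy_set: PolicySet  # exactly one active policy set per domain


def destination_for(domain: PolicyDomain, operation: str,
                    class_name: Optional[str] = None) -> str:
    """Return the storage pool for a file that is bound to a management class."""
    if operation not in ("backup", "archive"):
        raise ValueError("operation must be 'backup' or 'archive'")
    policy_set = domain.active_policy_set
    mgmt_class = policy_set.classes[class_name or policy_set.default_class]
    if operation == "backup":
        return mgmt_class.backup_copy_group.destination
    return mgmt_class.archive_copy_group.destination
```

In this model, a client that is associated with a domain never names a storage pool directly. The pool follows from the management class to which the file is bound.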
Security management
IBM Spectrum Protect includes security features for registration of administrators and users. After
administrators are registered, they must be granted authority by being assigned one or more
administrative privilege classes. An administrator with system privilege can perform any server function.
Administrators with policy, storage, operator, or node privileges can perform subsets of server functions.
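The privilege model can be sketched in Python as follows. This is a simplified illustration, not the server's authorization logic. The function names and their assignment to privilege classes in ILLUSTRATIVE_FUNCTIONS are invented for the example; only the rule that system privilege allows every function, and that the other classes allow subsets, comes from the text.

```python
# Hypothetical mapping of privilege classes to server functions.
# The function names are placeholders, not real server commands.
ILLUSTRATIVE_FUNCTIONS = {
    "policy": {"define_policy_domain"},
    "storage": {"define_storage_pool"},
    "operator": {"cancel_session"},
    "node": {"restore_node_data"},
}


def is_authorized(privilege_classes, function):
    """Return True if an administrator with these classes may run the function."""
    if "system" in privilege_classes:
        return True  # system privilege covers any server function
    return any(
        function in ILLUSTRATIVE_FUNCTIONS.get(privilege, set())
        for privilege in privilege_classes
    )
```

An administrator can hold more than one privilege class, so the check passes if any one of the assigned classes allows the function.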
The server can be accessed by using the following methods, each controlled with a password:
• Administrator access to manage the server
• Client access to nodes to store and retrieve data
IBM Spectrum Protect also includes features that can help to ensure security when clients connect to the
server, such as the following closed registration method:
Closed registration
Closed registration is the default method for client registration to the server. For this type of
registration, an administrator registers all clients. The administrator can implement the following
settings:
• Assign the node to any policy domain
• Determine whether compression is used, is not used, or is left to the user's choice
• Control whether the user can delete backed up files or archived files
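As a rough illustration of closed registration, the following Python sketch models a server that accepts node registrations only from an administrator and records the three settings that are listed above. The class names, field names, and the Compression values are assumptions for the example and do not represent the product's command syntax.

```python
from dataclasses import dataclass
from enum import Enum


class Compression(Enum):
    YES = "yes"        # the node always compresses data
    NO = "no"          # the node never compresses data
    CLIENT = "client"  # the user chooses


@dataclass(frozen=True)
class NodeRegistration:
    node_name: str
    policy_domain: str
    compression: Compression
    can_delete_backups: bool
    can_delete_archives: bool


class Server:
    def __init__(self, administrators):
        self._administrators = set(administrators)
        self.nodes = {}

    def register_node(self, requested_by, registration):
        """Register a node; with closed registration only administrators may do so."""
        if requested_by not in self._administrators:
            raise PermissionError("closed registration: an administrator must register the node")
        self.nodes[registration.node_name] = registration
        return registration
```

Because the administrator supplies every setting at registration time, a user cannot select a policy domain or grant themselves the ability to delete backed-up data.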
You can add more protection for your data and passwords by using Secure Sockets Layer (SSL). SSL
is the standard technology that you use to create encrypted sessions for servers and clients, and
provides a secure channel to communicate over open communication paths. With SSL, the identity of
the server is verified by using digital certificates. If you authenticate with a Lightweight Directory Access
You use the following interfaces to interact with IBM Spectrum Protect:
Operations Center
The Operations Center provides web and mobile access to status information about the IBM
Spectrum Protect environment. You can use the Operations Center to complete monitoring and
certain administration tasks, for example:
• You can monitor multiple servers and clients.
• You can monitor the transaction activity for specific components in the data path, such as the server
database, the recovery log, storage devices, and storage pools.
Command-line interface
You can use a command-line interface to run administration tasks for servers. You can access
the command-line interface through either the IBM Spectrum Protect administrative client or the
Operations Center.
The administrator defines the storage objects in the logical layer of the server, as illustrated in Figure 5 on
page 13.
Disk devices
You can store client data on disk devices with the following types of volumes:
• Directories in directory-container storage pools
• Random-access volumes of device type DISK
• Sequential-access volumes of device type FILE
IBM Spectrum Protect offers the following features when you use directory-container storage pools for
data storage:
• You can apply data deduplication and disk caching techniques to maximize data storage usage.
• You can retrieve data from disk much faster than you can retrieve data from tape storage.
Copy storage pools provide a means of recovering from disasters or media failures. For example, when a
client cannot retrieve a file because the file is damaged in the primary storage pool, the client can restore
the data from the copy storage pool.
You can move the volumes of copy storage pools offsite and still have the server track the volumes.
Moving these volumes offsite provides a means of recovering from an onsite disaster. A copy storage pool
can use sequential-access storage only, such as a tape device class or FILE device class.
Depending on your system configuration, you can create protection schedules to simultaneously copy
the directory-container storage pool data to onsite or offsite container-copy storage pools to meet your
requirements:
• If replication is enabled, you can create one offsite container-copy storage pool. The offsite copy can be used to
provide extra protection in a replicated environment.
• If replication is not enabled, you can create one onsite and one offsite container-copy storage pool.
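The two rules above can be expressed as a small check. The following Python sketch is illustrative only; the function names and the "onsite" and "offsite" labels are assumptions for the example.

```python
def allowed_container_copy_pools(replication_enabled):
    """Return the container-copy pool locations that the rules permit."""
    if replication_enabled:
        return ("offsite",)          # one offsite pool only
    return ("onsite", "offsite")     # one onsite and one offsite pool


def can_add_pool(existing_locations, location, replication_enabled):
    """Return True if one more container-copy pool at this location is permitted."""
    allowed = allowed_container_copy_pools(replication_enabled)
    # At most one pool per permitted location.
    return location in allowed and location not in existing_locations
```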
Depending on the resources and requirements of your site, the ability to copy directory-container storage
pools to tape has the following benefits:
• You avoid maintaining another server and more disk storage space.
• Data is copied to storage pools that are defined on the server. Performance is not dependent on, or
affected by, the network connection between servers.
• You can satisfy regulatory and business requirements for offsite tape copies.
Active-data pools can use any type of sequential-access storage. However, the benefits of an active-data
pool depend on the device type that is associated with the pool. For example, active-data pools that are
associated with a FILE device class are ideal for fast client restore operations for the following
reasons:
• FILE volumes do not have to be physically mounted.
• Client sessions that are restoring from FILE volumes in an active-data pool can access the volumes
concurrently, which improves restore performance.
[Figure: data stored in sequential-access volumes (tape device classes, STGTYPE=DEVCLASS) and in containers]
With retention storage pools, you can optimize the processes for storing retained data at an offsite
location. You can separate long-term data from data that you want to keep on a short-term basis.
In a LAN configuration, one or more tape libraries are associated with a single IBM Spectrum Protect
server. In this type of configuration, client data, electronic mail, terminal connection, application
program, and device control information must all be handled by the same network. Device control
information and client backup and restore data flow across the LAN.
Data backup operations over a SAN
Figure 12 on page 20 shows the data path for IBM Spectrum Protect backup operations over a SAN.
Storage management
You manage the devices and media that are used to store client data through the IBM Spectrum Protect
server. The server integrates storage management with the policies that you define for managing client
data in the following areas:
Types of devices for server storage
With IBM Spectrum Protect, you can use directly attached devices and network-attached devices
for server storage. IBM Spectrum Protect represents physical storage devices and media with
administrator-defined storage objects.
Data migration through the storage hierarchy
For primary storage pools other than directory-container storage pools, you can organize the storage
pools into one or more hierarchical structures. This storage hierarchy provides flexibility in a number
of ways. For example, you can set a policy to back up data to disks for faster backup operations. The
IBM Spectrum Protect server can then automatically migrate data from disk to tape.
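The following Python sketch illustrates the idea of a storage hierarchy in which data lands on disk first and is later moved to the next pool. It is a simplified model, not the server's migration algorithm. The high and low percentage thresholds, the oldest-first order, and the pool names in the usage note are assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class StoragePool:
    name: str
    capacity_mb: int
    next_pool: Optional["StoragePool"] = None  # next level in the hierarchy
    files: List[Tuple[str, int]] = field(default_factory=list)  # (name, size in MB)

    def used_mb(self) -> int:
        return sum(size for _, size in self.files)


def migrate(pool: StoragePool, high_pct: int = 90, low_pct: int = 70) -> List[str]:
    """Move files to the next pool once usage reaches high_pct, until it is at or below low_pct."""
    moved: List[str] = []
    if pool.next_pool is None:
        return moved  # last pool in the hierarchy, nothing to migrate to
    if pool.used_mb() * 100 < pool.capacity_mb * high_pct:
        return moved  # below the assumed high threshold
    while pool.files and pool.used_mb() * 100 > pool.capacity_mb * low_pct:
        item = pool.files.pop(0)  # assumption: oldest file is migrated first
        pool.next_pool.files.append(item)
        moved.append(item[0])
    return moved
```

For example, a 100 MB disk pool that holds files of 40 MB, 30 MB, and 25 MB is 95% full. With the default thresholds, the sketch moves the 40 MB file to the next pool and stops at 55% usage.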
Data deduplication
When the IBM Spectrum Protect server receives data from a client, the server identifies duplicate data
extents and stores unique instances of the data extents in a directory-container storage pool. The
data deduplication technique improves storage utilization and eliminates the need for a dedicated data
deduplication appliance.
If the same byte pattern occurs many times, data deduplication greatly reduces the amount of data that
must be stored or transferred. In addition to whole files, IBM Spectrum Protect can also deduplicate parts
of files that are common with parts of other files.
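To show why repeated byte patterns reduce the amount of stored data, the following Python sketch splits each object into extents, stores each unique extent once, and keeps an ordered list of extent identifiers for each object. It is a minimal sketch under stated assumptions: fixed-size extents and SHA-256 digests are chosen for simplicity and are not a description of how the product identifies extents.

```python
import hashlib


def deduplicate(objects, extent_size=4):
    """Store each unique extent once; return (extent store, per-object recipes)."""
    store = {}    # digest -> extent bytes, stored only once
    recipes = {}  # object name -> ordered list of digests
    for name, data in objects.items():
        digests = []
        for offset in range(0, len(data), extent_size):
            extent = data[offset:offset + extent_size]
            digest = hashlib.sha256(extent).hexdigest()
            store.setdefault(digest, extent)  # keep the first copy only
            digests.append(digest)
        recipes[name] = digests
    return store, recipes


def restore(name, store, recipes):
    """Rebuild an object from its list of extent digests."""
    return b"".join(store[digest] for digest in recipes[name])
```

Two objects that share parts of their content, but are not identical as whole files, still share extents in the store. This corresponds to the statement that parts of files can be deduplicated against parts of other files.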
IBM Spectrum Protect provides the following types of data deduplication:
Server-side data deduplication
The server identifies duplicate data extents and moves the data to a directory-container storage pool.
The server-side process uses inline data deduplication, where data is deduplicated at the same time
that the data is written to a directory-container storage pool. Deduplicated data can also be stored in
other types of storage pools. Inline data deduplication on the server provides the following benefits:
• Eliminates the need for reclamation
• Reduces the space that is occupied by the stored data
Client-side data deduplication
With this method, processing is distributed between the server and the client during a backup
process. The client and the server identify and remove duplicate data to save storage space on the
server. In client-side data deduplication, only compressed, deduplicated data is sent to the server.
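The division of work in client-side data deduplication can be sketched as follows. In this illustrative Python model, the client computes extent digests, asks the server which extents it does not yet hold, and sends only those extents in compressed form. The class and method names, the fixed extent size, SHA-256, and zlib compression are assumptions for the example and not the product's protocol.

```python
import hashlib
import zlib


class DedupServer:
    def __init__(self):
        self.extents = {}  # digest -> extent bytes

    def missing(self, digests):
        """Return the digests that the server does not hold yet."""
        return {digest for digest in digests if digest not in self.extents}

    def receive(self, digest, payload):
        """Store an extent that the client sent in compressed form."""
        self.extents[digest] = zlib.decompress(payload)


def client_backup(data, server, extent_size=4):
    """Send only new extents, compressed; return the number of extents sent."""
    extents = {}
    for offset in range(0, len(data), extent_size):
        extent = data[offset:offset + extent_size]
        extents[hashlib.sha256(extent).hexdigest()] = extent
    sent = 0
    for digest in server.missing(extents):
        server.receive(digest, zlib.compress(extents[digest]))
        sent += 1
    return sent
```

A second backup that shares extents with an earlier one sends fewer extents, which reduces both network traffic and server storage.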
When client data is replicated, data that is not on the target replication server is copied to the
target replication server. When replicated data exceeds the retention limit, the target replication server
automatically removes the data from the source replication server. To maximize data protection, you
synchronize the local server and the remote server; for example, Site B replicates data from Site A and
Site A replicates data from Site B. As part of replication processing, client data that was deleted from the
source replication server is also deleted from the target replication server.
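Two of the behaviors that are described above, copying data that is missing on the target and removing data that was deleted from the source, can be sketched in Python. This is a simplified model that keys objects by name and ignores retention limits and policies. The function name and the dictionary representation are assumptions for the example.

```python
def replicate(source, target):
    """Bring target in line with source; return (copied names, deleted names)."""
    copied = sorted(name for name in source if name not in target)
    deleted = sorted(name for name in target if name not in source)
    for name in copied:
        target[name] = source[name]  # data that is not on the target is copied
    for name in deleted:
        del target[name]             # data deleted from the source is deleted on the target
    return copied, deleted
```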
IBM Spectrum Protect provides the following replication functions:
• You can define policies for the target replication server in the following ways:
– Identical policies on the source replication server and target replication server
– Different policies on the source replication server and target replication server to meet different
business requirements.
If a disaster occurs and the source replication server is not available, clients can recover data from the
target replication server. If the source replication server cannot be recovered, you can direct clients to
store data on the target replication server. When an outage occurs, the clients that are backed up to
the source replication server can automatically fail over to restore their data from the target replication
server.
• You can use replication processing to recover damaged files from storage pools. You must replicate
the client data to the target replication server before the file damage occurs. Subsequent replication
processes detect damaged files on the source replication server and replace the files with undamaged
files from the target replication server.
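The recovery of damaged files through replication can be illustrated with the following Python sketch. It assumes that damage is detected by comparing each file against a recorded SHA-256 checksum, which is a choice made for the example and not a statement about how the server detects damage. The function names and the catalog structure are also hypothetical.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).hexdigest()


def repair_from_target(source, target, catalog):
    """Replace damaged source files with undamaged replicas; return repaired names."""
    repaired = []
    for name, expected in catalog.items():
        if name in source and checksum(source[name]) == expected:
            continue  # the source copy is intact
        replica = target.get(name)
        # Only an undamaged replica that was replicated earlier can be used.
        if replica is not None and checksum(replica) == expected:
            source[name] = replica
            repaired.append(name)
    return repaired
```

A file that was never replicated before it was damaged has no replica on the target, so the sketch leaves it unrepaired. This mirrors the requirement to replicate client data before the damage occurs.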
Database backups
You use database backups to recover your system following database damage. Also, database backup
operations must be used to prevent Db2 from running out of archive log space. Database backup
operations are not part of node replication. A database backup can be full, incremental, or snapshot.
To provide for disaster recovery, a copy of the database backups must be stored offsite. To restore the
database, you must have the backup volumes for the database. You can restore the database from backup
volumes by either a point-in-time restore or a most current restore operation.
Point-in-time restore
Use point-in-time restore operations for situations such as disaster recovery or to remove the effects
of errors that can cause inconsistencies in the database. Restore operations for the database that use
snapshot backups are a form of point-in-time restore operation. The point-in-time restore operation
includes the following actions:
• Removes and re-creates the active log directory and archive log directory that are specified in the
dsmserv.opt file.
• Restores the database image from backup volumes to the database directories that are recorded in
a database backup or to new directories.
Automatic failover for data recovery occurs if the source replication server is unavailable because of a
disaster or a system outage. During normal operations, when the client accesses a source replication
server, the client receives connection information for the target replication server. The client node stores
the failover connection information in the client options file.
During client restore operations, the server automatically changes clients from the source replication
server to the target replication server and back again. Only one server per node can be used for failover
protection at any time. When a new client operation is started, the client attempts to connect to the
source replication server. The client resumes operations on the source server if the source replication
server is available.
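The connection logic that is described above can be sketched in Python. This is an illustration only: the ClientOptions class stands in for the client options file, the is_reachable callback stands in for a real connection attempt, and the host names in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ClientOptions:
    source_address: str
    # Learned from the source server during normal operations.
    failover_address: Optional[str] = None


def choose_server(options: ClientOptions, is_reachable: Callable[[str], bool]) -> str:
    """Pick the server for a new client operation: source first, then failover target."""
    if is_reachable(options.source_address):
        return options.source_address  # resume on the source when it is available
    if options.failover_address is not None and is_reachable(options.failover_address):
        return options.failover_address
    raise ConnectionError("neither the source nor the target replication server is reachable")
```

For example, with options that name source.example.com and target.example.com, the sketch returns the source address while the source is reachable and the target address during an outage. Because the check runs at the start of each new operation, the client returns to the source server after it is available again.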
To use automatic failover for replicated client nodes, the source replication server, the target replication
server, and the client must be at the V7.1 level or later. If any of the servers are at an earlier level,
automatic failover is disabled and you must rely on a manual failover process.
Highlights
                    Single-site disk   Multisite disk            Multisite appliance       Tape
Cost                $                  $$$                       $$$$                      $$
Protection level    One data copy      Two or more data copies   Two or more data copies   Two or more data copies
Disaster recovery   None               Active server             Standby server            Offsite copies
Key benefits
Leading-edge data reduction
Appliance-based replication
What to do next
Review available documentation for the solutions in Chapter 9, “Roadmap for implementing a data
protection solution,” on page 43.
Related reference
Disk-based implementation of a data protection solution for a single site
This disk-based implementation of a data protection solution with IBM Spectrum Protect uses inline data
deduplication and provides protection for data on a single site.
Disk-based implementation of a data protection solution for multiple sites
This disk-based implementation of a data protection solution with IBM Spectrum Protect uses inline data
deduplication and replication at two sites.
Appliance-based implementation of a data protection solution for multiple sites
This implementation of a multi-site IBM Spectrum Protect data protection solution uses appliance-based
data deduplication and replication. A standby server is configured at a second site to recover data if the
primary server is unavailable.
Tape-based implementation of a data protection solution
This implementation of a data protection solution with IBM Spectrum Protect uses one or more tape
storage devices to back up data. Tape backup provides low-cost scalability that is optimized for long-term
retention.
Tape solution
For steps that describe how to plan for, implement, monitor, and operate a tape device solution, see Tape
solution.
Overview
The IBM Spectrum Protect family of products includes the following major accessibility features:
• Keyboard-only operation
• Operations that use a screen reader
The IBM Spectrum Protect family of products uses the latest W3C Standard, WAI-ARIA 1.0
(www.w3.org/TR/wai-aria/), to ensure compliance with US Section 508 and Web Content Accessibility
Guidelines (WCAG) 2.0 (www.w3.org/TR/WCAG20/). To take advantage of accessibility features, use the
latest release of your screen reader and the latest web browser that is supported by the product.
The product documentation in IBM Documentation is enabled for accessibility.
Keyboard navigation
This product uses standard navigation keys.
Interface information
User interfaces do not have content that flashes 2 - 55 times per second.
Web user interfaces rely on cascading style sheets to render content properly and to provide a usable
experience. The application provides an equivalent way for low-vision users to use system display
settings, including high-contrast mode. You can control font size by using the device or web browser
settings.
Web user interfaces include WAI-ARIA navigational landmarks that you can use to quickly navigate to
functional areas in the application.
Vendor software
The IBM Spectrum Protect product family includes certain vendor software that is not covered under the
IBM license agreement. IBM makes no representation about the accessibility features of these products.
Contact the vendor for accessibility information about its products.
TTY service
800-IBM-3383 (800-426-3383)
(within North America)
For more information about the commitment that IBM has to accessibility, see IBM Accessibility
(www.ibm.com/able).
For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual
Property Department in your country or send inquiries, in writing, to:
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs
in any form without payment to IBM, for the purposes of developing, using, marketing or distributing
application programs conforming to the application programming interface for the operating platform
for which the sample programs are written. These examples have not been thoroughly tested under
all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these
programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be
liable for any damages arising out of your use of the sample programs.
Each copy or any portion of these sample programs or any derivative work must include a copyright
notice as follows: © (your company name) (year). Portions of this code are derived from IBM Corp. Sample
Programs. © Copyright IBM Corp. _enter the year or years_.
Trademarks
IBM, the IBM logo, and ibm.com® are trademarks or registered trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at
"Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.
Adobe is a registered trademark of Adobe Systems Incorporated in the United States, and/or other
countries.
Linear Tape-Open, LTO, and Ultrium are trademarks of HP, IBM Corp. and Quantum in the U.S. and other
countries.
Intel and Itanium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the
United States and other countries.
The registered trademark Linux® is used pursuant to a sublicense from the Linux Foundation, the exclusive
licensee of Linus Torvalds, owner of the mark on a worldwide basis.
Microsoft, Windows, and Windows NT are trademarks of Microsoft Corporation in the United States, other
countries, or both.
Java™ and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or
its affiliates.
Red Hat®, OpenShift®, Ansible®, and Ceph® are trademarks or registered trademarks of Red Hat, Inc. or its
subsidiaries in the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
VMware, VMware vCenter Server, and VMware vSphere are registered trademarks or trademarks of
VMware, Inc. or its subsidiaries in the United States and/or other jurisdictions.
Glossary
A glossary is available with terms and definitions for the IBM Spectrum Protect family of products.
See the IBM Spectrum Protect glossary.