Education Services
Commvault® Engineer
Student Guide
Copyright
Information in this document, including URL and other website references, represents the current view of Commvault
Systems, Inc. as of the date of publication and is subject to change without notice to you.
Descriptions or references to third party products, services or websites are provided only as a convenience to you and
should not be considered an endorsement by Commvault. Commvault makes no representations or warranties, express
or implied, as to any third-party products, services or websites.
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos,
people, places, and events depicted herein are fictitious.
Complying with all applicable copyright laws is the responsibility of the user. This document is intended for distribution to
and use only by Commvault customers. Use or distribution of this document by any other persons is prohibited without
the express written permission of Commvault. Without limiting the rights under copyright, no part of this document may
be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic,
mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of
Commvault Systems, Inc.
Commvault may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering
subject matter in this document. Except as expressly provided in any written license agreement from Commvault, this
document does not give you any license to Commvault’s intellectual property.
©1999-2020 Commvault Systems, Inc. All rights reserved. Commvault, Commvault and logo, the "C hexagon” logo,
Commvault Systems, Solving Forward, SIM, Singular Information Management, Commvault HyperScale, ScaleProtect,
Commvault OnePass, Commvault Galaxy, Unified Data Management, QiNetix, Quick Recovery, QR, CommNet, GridStor,
Vault Tracker, InnerVault, Quick Snap, QSnap, IntelliSnap, Recovery Director, CommServe, CommCell, APSS,
Commvault Edge, Commvault GO, Commvault Advantage, Commvault Complete, Commvault Activate, Commvault
Orchestrate, and CommValue are trademarks or registered trademarks of Commvault Systems, Inc. All other third party
brands, products, service names, trademarks, or registered service marks are the property of and used to identify the
products or services of their respective owners. All specifications are subject to change without notice.
Confidentiality
The descriptive materials and related information in the document contain information that is confidential and proprietary
to Commvault. This information is submitted with the express understanding that it will be held in strict confidence and
will not be disclosed, duplicated or used, in whole or in part, for any purpose other than evaluation purposes. All right,
title and intellectual property rights in and to the document is owned by Commvault. No rights are granted to you other
than a license to use the document for your personal use and information. You may not make a copy or derivative work of
this document. You may not sell, resell, sublicense, rent, loan or lease the document to another party, transfer or assign
your rights to use the document or otherwise exploit or use the Manual for any purpose other than for your personal use
and reference. The document is provided "AS IS" without a warranty of any kind and the information provided herein is
subject to change without notice.
Contents
Advanced Infrastructure Design
  Introduction
    Advanced Infrastructure Design Course Overview
    Education Advantage
    Class Resources
    CVLab On Demand Lab Environment
    Commvault® On-Demand Learning
    Commvault® Education Career Path
    Education Services V11 Certification
    Course Overview
  CommCell® Environment Design
    CommCell® Structure Planning
    CommServe® Server Design
    CommServe® Availability
    MediaAgent Scaling
  Indexing
    Indexing Overview
    V2 Indexing Overview
    Index Process for Data Protection Jobs
    Index Database Backup Operations
    Index Checkpoint and Backup Process
COMMVAULT® ENGINEER
INTRODUCTION
Education Advantage
The Commvault® Education Advantage product training portal contains a set of powerful tools to enable Commvault
customers and partners to better educate themselves on the use of the Commvault software suite. The portal includes:
Class Resources
Course manuals and activity guides are available for download for Instructor-Led Training (ILT) and Virtual Instructor-Led
Training (vILT) courses. It is recommended to download these documents the day prior to attending class to ensure the
latest document versions are being used.
Self-paced eLearning courses can be launched directly from the EA page. If an eLearning course is part of an ILT or vILT
course, it is a required prerequisite and should be viewed prior to attending class.
If an ILT or vILT class uses the Commvault® Virtual Lab environment, a button to launch the lab becomes available on the first day of class.
Commvault® certification exams can be launched directly from the EA page. If you are automatically registered for an
exam as part of an ILT or vILT course, it will be available on the final day of class. There is no time limit on when the
exams need to be taken, but it is recommended to take them as soon as you feel you are ready.
CVLab On Demand Lab Environment
CVLab time can be purchased either as standalone on-demand lab time or to extend the lab time for a training course attended. To maintain your lab progress from the training course, extended CVLab time must be purchased within 48 hours of the class end time. Whether purchasing on-demand or extending, CVLab connect time is sold in four-hour blocks in any quantity. Access is available for 90 days from the point of purchase and is priced at one Training Unit per four-hour block.
Commvault® On-Demand Learning
Commvault On-Demand Learning is a convenient, flexible, and cost-effective training solution that gives you the tools to keep a step ahead of your company's digital transformation initiatives, benefiting both you and your company.
Commvault's Certification Program offers Professional-level, Engineer-level, and Master-level certifications. This Program
provides certification based on a career path, and enables advancement based on an individual’s previous experience
and desired area of focus. It also distinguishes higher-level certifications, such as Engineer and Master, from lower-level certifications as verified proof of expertise.
Key Points
• Certification is integrated with and managed through Commvault's online registration in the
Education Advantage Customer Portal.
• Cost of certification registration is included in the associated training course.
• Practice assessments are available at ea.commvault.com.
• The Commvault Certified Professional Exam Prep course is also available.
• Students may take the online certification exam(s) any time after completing the course.
• Although it is recommended to attend training prior to attempting an exam, it is not required.
• CommCell Administration – user and group security, configuring administrative tasks, conducting
data protection and recovery operations, and CommCell monitoring.
• Storage Administration – deduplication configuration, disk library settings, tape library settings,
media management handling, and snapshot administration.
• CommCell Implementation – CommServe® server design, MediaAgent design and placement,
indexing settings, client and agent deployment, and CommCell maintenance.
Certification status as a Commvault Certified Professional requires passing the Commvault® Certified Professional Exam.
Commvault® Engineer Exam – this exam validates expertise in deploying medium and enterprise-level CommCell® environments, with a focus on storage design, virtual environment protection, and application data protection strategies.
Certification status as a Commvault Certified Engineer requires certification as a Commvault Certified Professional and
passing the Advanced Infrastructure Design exam.
Certification status as a Commvault Certified Master requires certification as both a Commvault Certified Professional and
Certified Engineer, and successful completion of Master certification requirements. These Master certification
requirements include attending a Master class and passing the Master Certification exam.
Course Overview
Single CommCell environment
• Advantages: Provides central management and allows data to easily be restored across all sites.
• Considerations: If the central site hosting the CommServe server goes offline, all data management activities will be disrupted.
Multi-CommCell environment
• Advantages: Provides full autonomy and resiliency, and allows each IT group to independently manage their environment.
• Considerations: Cross-site restore operations are more complicated if each site is its own CommCell structure.
Based on the size of an environment, the CommServe server must be scaled appropriately. For current scalability
guidelines, refer to the Commvault Online Documentation section, ‘Hardware Specifications for the CommServe.’
• For CommServe server high availability, the following options are available:
o The CommServe server can be clustered – This is recommended for larger environments where high availability is critical.
o The CommServe server can be virtualized – This is suitable for small to mid-size environments.
• It is ABSOLUTELY CRITICAL that the CommServe database is properly protected. By default, every
day at 10 AM, a CommServe DR backup job is conducted. This operation can be completely
customized and set to run multiple times a day if required.
• All activity is conducted through the CommServe server. Therefore, it is important that
communication between the CommServe server and all CommCell® components is maintained.
During auxiliary copy jobs, the JobMgr initiates the job and spawns the AuxCopyMgr process on the CommServe server.
This process is responsible for sending chunk information to the source MediaAgent and recording chunk updates from
the destination MediaAgent. In Commvault V11, much of this workload is offloaded to on-demand services running on the MediaAgents. This offload is enabled using the 'use scalable resource allocation' setting in the auxiliary copy configuration.
During data protection and auxiliary copy jobs, the CommServe server has a substantial responsibility. Consider this when
planning the resources for the CommServe server, especially in larger environments where hundreds of jobs will be
running in parallel.
CommServe® DR Backup
By default, every day at 10:00 AM, the CommServe DR backup process is executed. This process first dumps the
CommServe SQL database to a local folder path. An export process then copies the folder contents to a user defined
drive letter or UNC path. A backup phase subsequently backs up the DR Metadata and any user defined log files to a
location based on the storage policy associated with the backup phase of the DR process. All processes, schedules, and export/backup locations are customizable in the DR Backup Settings applet in the Control Panel.
Additionally, a copy of the DR backup can be uploaded to Commvault® Cloud Services, which guarantees that an off-site copy exists and is accessible during recovery if a disaster were to occur.
Database Dump
During the dump phase, the system stores the dump files in the local \CommServDR staging folder.
If available space is low, the location of the dump can be modified using the 'ERStagingDirectory' setting in the CommServe Additional Settings tab.
Export
The Export process copies the contents of the \CommServDR folder to the user defined export location. A drive letter or
UNC path can be defined. The export location should NOT be on the local CommServe® server. If a standby CommServe
server is available, define the export location to a share on the standby server.
By default, five metadata backups are retained in the export location. It is recommended to have enough disk space to maintain one week's worth of DR exports and to adjust the number of exports to the DR backup schedule frequency.
Backup
The Backup process is used to back up the DR Metadata to protected storage. This is accomplished by associating the
backup phase with a storage policy. A default DR storage policy is automatically created when the first library is configured
in the CommCell environment. Although the backup phase can be associated with a regular storage policy, it is
recommended to use a dedicated DR storage policy to protect the DR Metadata.
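The end-to-end sequence of the three DR backup phases can be summarized with the following conceptual sketch. It is illustrative Python pseudologic only, not Commvault code; the helper functions, file names, and storage policy name are assumptions made for the example.

import shutil
from pathlib import Path

def dump_commserve_database(staging_dir: Path) -> list:
    # Stand-in for the dump phase: the CommServe SQL database is dumped
    # into a local staging folder (the \CommServDR folder).
    staging_dir.mkdir(parents=True, exist_ok=True)
    dump_file = staging_dir / "commserv_dr.dmp"
    dump_file.write_bytes(b"placeholder metadata dump")
    return [dump_file]

def backup_to_storage_policy(files: list, storage_policy: str) -> None:
    # Stand-in for the backup phase: the DR metadata is written to protected
    # storage through the storage policy associated with the backup phase.
    print(f"Backing up {len(files)} file(s) via storage policy '{storage_policy}'")

def run_dr_backup(staging_dir: Path, export_target: Path, storage_policy: str) -> None:
    # Phase 1 - Dump: SQL database dumped locally.
    dump_files = dump_commserve_database(staging_dir)
    # Phase 2 - Export: staging contents copied to a drive letter or UNC path,
    # ideally a share on a standby CommServe server.
    export_target.mkdir(parents=True, exist_ok=True)
    for f in dump_files:
        shutil.copy2(f, export_target / f.name)
    # Phase 3 - Backup: DR metadata backed up to protected storage.
    backup_to_storage_policy(dump_files, storage_policy)

run_dr_backup(Path("CommServDR"), Path("dr_export"), "DR-StoragePolicy")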
DR Storage Policy
When the first library in a CommCell environment is configured, a CommServe Disaster Recovery storage policy is
automatically created. The Backup phase of the DR backup process is automatically associated with this storage policy. If
the first library configured is a disk library and a tape library is subsequently added, a storage policy secondary copy is
created and associated with the tape library.
There are several critical points regarding the DR storage policy and backup phase configurations:
• Although the Backup phase can be associated with any storage policy in the CommCell® environment, it is recommended to use a dedicated DR storage policy. Using a dedicated policy isolates DR Metadata on its own set of media, making it potentially easier to locate and catalog in a disaster situation.
• The most common reason the Backup phase is associated with regular data protection storage
policies is to reduce the number of tapes being sent off-site. If the backup phase is associated with
a regular storage policy, consider the following key points:
o Make sure the 'Erase Data' feature is disabled in the storage policy. If this is not done, the DR
Metadata will not be recoverable using the Media Explorer utility.
o When the storage policy secondary copy is created, ensure the DR Metadata is included in
the Associations tab of the policy copy.
o Make sure you are properly running and storing media reports. This is especially important
when sending large numbers of tapes off-site. If you don't know which tape the metadata is
on, you will have to catalog every tape until you locate the correct media which is storing
the DR Metadata.
The free cloud service requires a Commvault Cloud Services account, which is created using the following URL:
http://cloud.commvault.com
To configure DR Backups to the Commvault® cloud
Backup Frequency
By default, the DR backup runs once a day at 10:00 AM. The time the backup runs can be modified, and the DR backup
can be scheduled to run multiple times a day or saved as a script to be executed on demand.
Consider the following key points regarding the scheduling time and frequency of DR backups:
• If tapes are being sent off-site daily prior to 10:00 AM, the default DR backup time is not adequate. Alter the default schedule so the backup can complete and DR tapes can be exported from the library prior to media being sent off-site.
• The DR Metadata is essential to recover protected data. If backups are conducted at night and
auxiliary copies are run during the day, consider setting up a second schedule after auxiliary copies
complete.
• For mission critical jobs, consider saving a DR backup job as a script. The script can then be executed via an alert upon successful completion of the job.
Locations
Multiple copies of the DR backup can be maintained in its raw (export) form using scripts. Multiple copies of the backup
phase are created within the DR storage policy by creating secondary copies, or by creating a data backup storage policy
and including the metadata in the secondary copy’s Association tab.
• On-site and off-site standby CommServe® servers should have an export copy of the metadata.
• Wherever protected data is located, a copy of the DR Metadata should also be included.
• Whenever protected data is sent off-site a copy of the DR Metadata should be included.
• Since DR Metadata does not consume a lot of space, longer retention is recommended.
Retention
By default, the export phase maintains five copies of the metadata. A general recommendation is to maintain a week's worth of metadata exports if disk space is available. This means if the DR backup is scheduled to run two times per day, then 14 metadata backups should be maintained.
For the metadata backup phase, the default storage policy retention is 60 days and 60 cycles. A general best practice is to retain the metadata based on the longest-retained data. If data is being sent off-site on tape for ten years, a copy of the DR database should be included with that data.
Metadata Security
Securing the location where the DR Metadata is copied to is critical since all security and encryption keys are maintained
in the CommServe database. If the metadata is copied to removable drives or network locations, best practices
recommend using disk-based encryption.
CommServe® Availability
High availability for the CommServe® server is essential to allow normal CommCell® operations to run. If the CommServe
server goes offline, data protection and recovery jobs are affected.
• Meeting backup windows – During data protection jobs, if the CommServe server is not reachable, the client continues backing up data to a MediaAgent, by default for 20 minutes. The 'Network Retries' setting determines the maximum time interval and number of attempts to contact the CommServe system; the default is 40 retries at 30-second intervals.
• Restores – The CommServe server must be available to browse and recover data within a
CommCell environment.
• Deduplication database consistency – In the event of a CommServe failure, all Deduplication
Databases (DDBs) within a CommCell environment will be in an inconsistent state. When the
CommServe metadata is restored, all DDBs must be brought back to a consistent state. This process brings the DDBs back to the state they were in at the point-in-time of the CommServe database restore point. This could result in losing some backup data if backups completed after the most recent CommServe DR backup.
• Archive stub recalls – When using Commvault archiving, stub recalls require the CommServe
server to be available. The HSM recall service redirects all item retrieval requests to the CommServe server, which then determines which MediaAgent and media contain the data.
• It is critical that both the production and standby CommServe servers are patched to the same
level. After applying updates to the production CommServe server, ensure the same updates are
applied to the standby CommServe server.
• Multiple standby CommServe servers can be used. For example, an on-site standby and an off-site
DR CommServe server. Use post script processes to copy the raw DR Metadata to additional
CommServe servers.
• A standby CommServe server can be a multi-function system. The most common multi-function
system would be installing the CommServe software on a MediaAgent.
• If a virtual environment is present, consider using a virtual standby CommServe server. This avoids
problems associated with multi-function standby CommServe servers and eliminates the need to
invest in additional hardware. Ensure the virtual environment is properly scaled to handle the extra
load that may result when activating the virtual standby CommServe server.
Virtualization
Some customers with virtual environments are choosing to virtualize the production CommServe server. A virtualized CommServe server has the advantage of using the hypervisor's high availability functionality (when multiple hypervisors are configured in a cluster), which reduces costs since separate CommServe hardware is not required. Although this method can be beneficial, it should be properly planned and implemented.
If the virtual environment is not properly scaled, the CommServe server could become a bottleneck when conducting data
protection jobs. In larger environments where jobs run throughout the business day, the CommServe server activity may
have a negative performance impact on production servers.
When virtualizing the CommServe server, it is still critical to run the CommServe DR backup. In the event of a disaster, the
CommServe server may still have to be reconstructed on a physical server. Do not rely on the availability of a virtual
environment in the case of a disaster. Follow normal Commvault software best practices in protecting the CommServe
metadata.
Clustering
The CommServe® server can be deployed in a clustered configuration. This provides high availability for environments
where CommCell operations run 24/7. Clustering the CommServe server is a good solution in large environments where
performance and availability are critical.
Note that a clustered CommServe server is not a DR solution, therefore a standby CommServe server must be planned
for at a DR site.
Another benefit of a clustered CommServe server arises when using Commvault OnePass® archiving. Archiving operations are configured to create stub files, which allow end users to initiate recall operations. For an end-user recall to complete successfully, the CommServe server must be available. Clustering the CommServe server ensures that recalls can be accomplished.
CommServe Failover
CommServe failover provides methods for log shipping CommServe database data to a pre-configured standby
CommServe server.
For more information, refer to the Commvault Online Documentation sections, 'Setup a Standby CommServe Host for
Failover' and 'Testing Disaster Readiness'.
MediaAgent Scaling
MediaAgents are the multifunction workhorses of a Commvault® software environment. They facilitate the transfer of data from source to destination, host the deduplication database and metadata indexes, and run analytics engines.
For MediaAgent resource requirements and guidelines, refer to the Commvault Online Documentation.
• Data Mover – moves data during data protection, data recovery, auxiliary copy, and content
indexing jobs.
• Deduplication Database (DDB) – hosts one or more deduplication databases on high speed solid
state or PCI storage.
• Metadata indexes – hosts both V1 and V2 indexes on high speed dedicated disks.
• Analytics – runs various analytics engines including data analytics, log monitoring, web analytics,
and the Exchange index for the new Exchange Mailbox agent.
Since all data moving to/from protected storage must move through a MediaAgent, resource provisioning for MediaAgent
hosts (e.g., CPU, memory, and bandwidth) must be adequate for both the volume and the concurrency of data movement
you expect it to handle.
When the MediaAgent component is co-located on the same host as the client agent, the exchange of data is contained within the host. This is called a SAN MediaAgent configuration, sometimes referred to as LAN-free backup, and has the advantage of keeping data off potentially slower TCP/IP networks by using higher performance local transport (e.g., Fibre Channel or SCSI). On the other hand, a MediaAgent component located on a host by itself can provide dedicated resources and facilitate the exchange of data over longer distances using TCP/IP (e.g., LAN or WAN).
Deduplication Database
The Deduplication Database (DDB) maintains all signature records for a deduplication engine. During data protection
operations, signatures are generated on data blocks and sent to the DDB to determine if data blocks are duplicate or
unique. During data aging operations, the DDB is used to decrement signature counters for blocks from aged jobs and subsequently to prune signature and block records when the counter reaches zero. For these reasons, it is critical that the DDB is located on high performance, locally attached solid state or PCI storage technology.
Metadata Indexes
Commvault® software uses a distributed indexing structure that provides for enterprise level scalability and automated
index management. This works by using the CommServe® database to only retain job-based metadata such as chunk
information, which keeps the database relatively small. Detailed index information, such as details of protected objects is
kept on the MediaAgent. The index location can maintain both V1 and V2 indexes. Ensure the index location is on high
speed dedicated disks.
Analytics
One or more analytics engines can be installed on a MediaAgent. The following provides a high-level overview of the
commonly used analytics engines:
• Data analytics – provides a view into unstructured data within an environment. Some capabilities
include:
o Identifying old files and emails
o Identifying multiple copies of large files
o Removing unauthorized file types
• Log monitoring – identifies and monitors any logs on client systems. The monitoring process is
used to identify specific log entries and set filters based on criteria defined within a monitoring
policy.
• Exchange index engine – maintains V2 metadata indexing information for the new Exchange
Mailbox Agent. It is recommended when using the Exchange index server that no other analytic
engines are installed on the MediaAgent hosting the index.
You need to protect a smaller remote site and want to keep a local copy of data for quick restore. However, you
are concerned about hardware costs for a MediaAgent.
Solution: Virtualize the remote site MediaAgent and keep a shorter retention for the local copy, producing a
smaller footprint. Then replicate the data using DASH Copy to the main data center physical MediaAgent where it
can be kept for a longer retention.
Indexing
Indexing Overview
Commvault® software uses a distributed indexing structure that provides for enterprise-level scalability and automated
index management. This works by using the CommServe® database to only retain job-based metadata such as chunk
information, which keeps the database relatively small. Detailed index information, such as details of protected objects is
kept on the MediaAgent managing the job. When using Commvault deduplication, block and metadata indexing are
maintained within volume folders in the disk library.
Job summary data maintained in the CommServe database keeps track of all data chunks being written to media. As each
chunk completes, it is logged in the CommServe database. This information also tracks the media used to store the
chunks.
Commvault® Version 11 introduces the new V2 indexing model, which has significant benefits over its predecessor.
MediaAgents can host both V1 and V2 indexes in the index directory. The primary differences between these two indexing models, relative to index directory sizing, are as follows:
• V1 indexes are pruned from the directory based on the days and the index cleanup percentage
settings in the MediaAgent catalog tab.
• V2 indexes are persistent and are not pruned from the index directory unless the backup set associated with the V2 index is deleted.
Detailed index information for jobs is maintained in the MediaAgent's index directory. This information contains:
• Each object
• Which chunk the data is in
• The chunk offset defining the exact location of the data within the chunk
The index files are stored in the index directory and after the data is protected to media, an archive index operation is
conducted to write the index to media. This method automatically protects the index. The archived index is also used if the
index directory is not available, when restoring the data at alternate locations, or if the indexes have been pruned from the
index directory location.
One major distinction between Commvault® software and other backup products is that Commvault uses a distributed self-
protecting index structure. The modular nature of the indexes allows the small index files to automatically be copied to
media at the conclusion of data protection jobs.
Indexing Operations
The following steps provide a high-level overview of indexing operations during data protection and recovery operations.
1. A browse or find operation is initiated. Restore by job operations do not use the index directory.
2. The index file is accessed or retrieved:
a. If the index is in the index directory, it is accessed and the operation continues.
b. If the index is not in the index directory, it is automatically retrieved from media. If the media is not in the library, the system prompts you to place the media in the library.
During a browse operation, if it is known that the media is not in the library, use the 'List Media' button to determine which
media is required for the browse operation.
It is important to note that the 'Time in Days' and 'Index Cleanup Percent' settings use OR logic to determine how long
indexes will be maintained in the index directory. If either one of these criteria are met, index files are pruned from the
directory. When files are pruned from the index, they are deleted based on access time; deleting the least frequently
accessed files first. This means that older index files that have been more recently accessed may be kept in the directory
while newer index files that have not been accessed are deleted.
Indexing Service
The Indexing Service process on the MediaAgent is responsible for cleaning up the index directory location. This service
runs every 24 hours. Any indexes older than 15 days are pruned from the index directory. If the directory location is above
the 90% space threshold, additional index files are pruned.
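The pruning behavior described above can be sketched as follows. This is illustrative Python pseudologic, not the MediaAgent's actual implementation; the function name and parameters are assumptions, and the 15-day and 90-percent defaults are the values mentioned in the text.

import shutil
import time
from pathlib import Path

def prune_v1_indexes(index_dir: Path, max_age_days: int = 15, cleanup_percent: int = 90) -> None:
    # Files are pruned when EITHER the age limit OR the space threshold is
    # exceeded (OR logic), least recently accessed files first.
    now = time.time()

    # Rule 1: prune index files whose last access time is older than 'max_age_days'.
    for f in list(index_dir.rglob("*")):
        if f.is_file() and (now - f.stat().st_atime) > max_age_days * 86400:
            f.unlink()

    # Rule 2: if the volume holding the index directory is still above the space
    # threshold, keep deleting the least recently accessed files until it is not.
    def percent_used() -> float:
        usage = shutil.disk_usage(index_dir)
        return 100.0 * usage.used / usage.total

    while percent_used() > cleanup_percent:
        remaining = sorted((f for f in index_dir.rglob("*") if f.is_file()),
                           key=lambda f: f.stat().st_atime)
        if not remaining:
            break
        remaining[0].unlink()  # oldest access time goes first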
V2 Indexing Overview
Commvault® Version 11 introduces next-generation indexing, called Indexing V2. It provides improved performance and resiliency, while shrinking the size of index files in the index directory and in storage.
V2 indexing works by using a persistent index database maintained at the backup set level. During subclient data
protection jobs, log files are generated with all protected objects and placed into the index database.
During data protection jobs, log files are generated with records of protected objects. The maximum size of a log is 10,000
objects or a complete chunk. Once a log is filled or a new chunk is started, a new log file is created, and the closed log is
written to the index database. Because index logs are written to the database while the job is still running, the indexing operations run independently of the actual job, allowing a job to complete even if log operations are still committing information to the database.
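The log rollover rule (10,000 objects or a chunk boundary) and the asynchronous commit to the index database can be modeled with the following sketch. It is conceptual Python only; the class and attribute names are assumptions, not Commvault process names.

from dataclasses import dataclass, field
from typing import Optional

MAX_OBJECTS_PER_LOG = 10_000  # a log closes at 10,000 objects or at a chunk boundary

@dataclass
class IndexLog:
    chunk_id: int
    records: list = field(default_factory=list)

class V2IndexingSketch:
    def __init__(self) -> None:
        self.open_log: Optional[IndexLog] = None
        self.index_database: list = []  # stands in for the backup-set index database

    def record_object(self, chunk_id: int, obj: str) -> None:
        # Start a new log when none is open, when a new chunk starts,
        # or when the current log reaches its object limit.
        if (self.open_log is None
                or self.open_log.chunk_id != chunk_id
                or len(self.open_log.records) >= MAX_OBJECTS_PER_LOG):
            self._commit_open_log()
            self.open_log = IndexLog(chunk_id)
        self.open_log.records.append(obj)

    def _commit_open_log(self) -> None:
        # Closed logs are written to the index database while the job is still
        # running, so indexing proceeds independently of the job itself.
        if self.open_log is not None:
            self.index_database.append(self.open_log)
            self.open_log = None

    def end_of_job(self) -> list:
        # At the end of the job, only the logs (not the whole index database)
        # are copied to storage along with the job data.
        self._commit_open_log()
        return list(self.index_database)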
At the end of each job, the log files are written to storage along with the job. This is an important distinction from traditional
indexing, which copies the entire index to storage. By copying just logs to storage, indexes require significantly less space
in storage, which is a benefit when protecting large file servers. Since the index database is not copied to storage at the end of each job, a special IndexBackup subclient is used to protect index databases.
The index databases are protected with system created subclients, which are displayed under the Index Servers
computers group in the CommCell® browser. An index server instance is created for each storage policy. An index backup
operation is scheduled to run every twenty-four hours. During the backup operation, index databases are checked to
determine if they qualify for protection. The two primary criteria to determine if a database qualifies for protection are one
million changes or 7 days since the last backup.
1. Expand Client Computer Groups | The storage policy pseudo-client | Big Data Apps | classicIndexInstance | Right-click the default subclient.
2. Expand Policies | Schedule Policies | Right-click the System Created for IndexBackup subclients schedule policy | Edit.
3. The description field confirms that this is the schedule policy used for index backups.
4. By default, the index backups are scheduled to run three times a day, but this can be modified as needed.
During the index backup operation, each qualifying database goes through three steps:
• A database checkpoint
• Database compaction
• A backup of the database to the storage policy associated with the index server subclient
Database Checkpoint
Checkpoints are used to indicate a point-in-time in which a database was backed up. Once the database is protected to
storage, any logs that are older than the checkpoint can be deleted from the index directory location.
Database Compaction
During data aging operations, deleted jobs are marked in the database as unrecoverable, but objects associated with the
job remain in the database. The compaction operation deletes all aged objects and compacts the database.
Database Backup
Once the checkpoint and compaction occur, the database is backed up to the primary copy location of the storage policy.
Three copies of the database are kept in storage and normal storage policy retention rules are ignored.
During the index backup process, the database is frozen and Browse or Find operations cannot be run against the
database. Each database that qualifies for backup is protected sequentially, minimizing the freeze time. Data protection
jobs are not affected by the index backup.
If needed, file system agents can be upgraded to V2 indexing by using a Workflow. The Workflow is available for
download from the Commvault® Store.
Note that currently if the client has completed backup jobs, only the file system agent can be upgraded.
Upgrade Requirements
• The client and its agents must have a valid license applied.
• The BackupSet must be scheduled for backups.
• The client cannot be de-configured.
• The Virtual Server Agent is upgraded only if:
o The CommServe server is V11 SP13 or above.
o The VSA client does not have any completed or running backup jobs.
Any client that does not meet the requirements is skipped during the upgrade process.
First Steps:
It is important to prepare for the upgrade. The following steps must be completed before running the script.
• The Workflow can be executed against clients or client computer groups. Take note of the client or
client computer group names that require an upgrade.
• Note the agents installed on the clients. The Workflow can be executed only against clients with the same agent set. If the clients have different agents installed, run the Workflow multiple times.
• Note how many clients you want to upgrade in parallel. It can be from 1 to 20.
• Download the 'Upgrade to Indexing V2' Workflow from Commvault® Store.
• Run a full or synthetic full backup for the clients.
• Ensure that no jobs are running for the client.
1. Click Workflows.
3. Select whether the workflow is executed against specific clients or specific client computer groups.
STORAGE DESIGN
• If using DAS or SAN, format mount paths using a 64KB block size.
• If using DAS or SAN, try to create multiple mount paths. For instance, if there are 10 mount paths and a maintenance job, such as a defrag, is running on one of them, that mount path can be set to read-only, leaving 90% of the disk library available for backup jobs.
• Set mount path usage to Spill and Fill, even if using only one mount path. If additional mount paths are added later, the streams will spill as expected (see the sketch following this list).
• Share the disk library if required.
• From the CommCell® console, validate the mount path speed and document for future reference.
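The difference between the Spill and Fill and Fill and Spill mount path usage settings can be illustrated with a short sketch. This is conceptual Python, not Commvault code; the function names and data structures are assumptions.

from itertools import cycle

def spill_and_fill(streams: list, mount_paths: list) -> dict:
    # Spill and Fill: write streams are distributed round-robin across all
    # writable mount paths, balancing the load across the library.
    path_cycle = cycle(mount_paths)
    return {stream: next(path_cycle) for stream in streams}

def fill_and_spill(streams: list, mount_paths: list, capacity_per_path: int) -> dict:
    # Fill and Spill: streams fill the first mount path to its capacity
    # before spilling over to the next one.
    assignments = {}
    for i, stream in enumerate(streams):
        assignments[stream] = mount_paths[min(i // capacity_per_path, len(mount_paths) - 1)]
    return assignments

streams = [f"stream{i}" for i in range(1, 7)]
paths = ["mountpath1", "mountpath2", "mountpath3"]
print(spill_and_fill(streams, paths))      # two streams land on each mount path
print(fill_and_spill(streams, paths, 4))   # the first mount path fills up first

With Spill and Fill, even a single mount path behaves as recommended in the bullet above; additional mount paths simply join the rotation when they are added later.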
• Dedicated – disk libraries are created by first adding a disk library entity to the MediaAgent using
either the right-click All Tasks menu or the Control Panel's Expert Storage Configuration tool. One or
more mount paths can be created/added to the library. Mount Paths are configured as Shared Disk
Devices. The Shared Disk Device in a dedicated disk library has only one Primary Sharing Folder.
• Shared – disk libraries are libraries with more than one Primary Sharing Folder configured on a
Shared Disk Device. This enables other MediaAgents access to the same shared volume resource. A
shared disk library can then be created and the 'Shared Disk Devices' added to the library. One path
to the shared folder can be direct while the others are Common Internet File System (CIFS) shared
directory paths. CIFS protocol is used to manage multiple MediaAgent access to the same directory.
For UNIX hosted MediaAgents, Network File System (NFS) protocol can be used. NFS shared disks
appear to the MediaAgent as local drives.
• Replicated – disk libraries are configured like a shared disk library with the exception that the
Shared Disk Device has a replicated data path defined to a volume accessible via another
MediaAgent. Replicated folders are read-only and replication can be configured for use with third
party replication hardware.
These are the three methods by which disk library data paths can be configured.
When using SAN storage, each building block should use a dedicated MediaAgent, DDB and disk library. Although the
backend disk storage in the SAN can reside on the same disk array, it should be configured in the Commvault® software as two separate libraries, where logical unit numbers (LUNs) are presented as mount paths in dedicated libraries for specific MediaAgents.
SAN storage provides fast and efficient movement of data, but if the building block MediaAgent fails, data cannot be restored. When using SAN storage, either the MediaAgent can be rebuilt or the disk library can be re-zoned to a different
MediaAgent. If the disk library is rezoned, it must be reconfigured in the Commvault® software to the MediaAgent that has
access to the LUN.
• When configuring the Data Server feature, there are three types of connections to
storage/MediaAgent:
• Data Server IP - A MediaAgent presents local storage to other MediaAgents through the IP network
as an NFS volume.
• Data Server SAN - A Linux MediaAgent acts as a proxy to present storage to other MediaAgents
using Fibre Channel connections.
• Data Server iSCSI - A Linux MediaAgent acts as a proxy to present storage to other MediaAgents
using iSCSI connections.
A tape library is a removable media library in which media can be added, removed, and moved between libraries. The term removable media refers to the various types of removable media supported by Commvault® software, including tape and USB disk drives, which can be moved between MediaAgents for data protection and recovery operations.
• Configure the tape library cleaning method to use. Software cleaning (Commvault) or hardware
cleaning (library) can be used, but not both. A choice must be made.
• Share the tape library if required.
• Create a barcode pattern for cleaning tapes and assign it to the Cleaning Media group.
• If using multiple scratch media groups, create scratch groups and barcode patterns to use.
• Validate drive speed (from the CommCell console) and document for future reference.
Tape libraries are divided into the following components:
• Library – is the logical representation of a library within a CommCell® environment. A library can be dedicated to a
MediaAgent or shared between multiple MediaAgents. Sharing of removable media libraries can be static or
dynamic depending on the library type and the network connection method between the MediaAgents and the
library.
• Master drive pool – is a physical representation of drives of the same technology within a library. An example of master drive pools would be a tape library with different drive types, such as LTO4 and LTO5 drives, within the same library.
• Drive pool – is used to logically divide drives within a library. The drives can then be assigned to protect different jobs.
• Scratch pool – is defined to manage scratch media, also referred to as spare media, which can then be assigned to different data protection jobs.
o Custom scratch pools – can be defined, and media can be assigned to each pool.
o Custom barcode patterns – can be defined to automatically assign specific media to different scratch pools, or media can manually be moved between scratch pools in the library.
GridStor® Technology
Storage policies are used to define one or more paths data takes from source to destination. When a MediaAgent and a client agent are installed on the same server, a 'LAN-free' or 'preferred' path can be used to back up data directly to storage. Network-based clients can back up through a MediaAgent using a 'default path', a 'failover' path, or 'round-robin' load balancing paths.
When configuring storage policy copy data paths, by default, the first data path defined becomes the 'Default Data Path.' If
multiple data paths are defined, the 'Default Data Path' is the first one to be used. This path can be modified later.
Additional data paths can be configured to use one of two modes (illustrated in the sketch below):
• Failover
• Round-Robin
This Commvault® software feature is called GridStor® technology. For more information about GridStor® features, refer to the Commvault® Online Documentation.
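The behavior of the default, failover, and round-robin data path options can be sketched conceptually as follows. This is illustrative Python only; the dictionaries and function name are assumptions, and the real selection logic in the product is more involved.

import random
from typing import Optional

def select_data_path(data_paths: list, mode: str = "failover") -> Optional[dict]:
    # Each entry represents a configured data path, e.g.
    # {"mediaagent": "MA1", "library": "DiskLib1", "online": True}.
    # The first configured path acts as the default data path.
    available = [p for p in data_paths if p["online"]]
    if not available:
        return None
    if mode == "round-robin":
        # Round-Robin: jobs are load balanced across all available paths.
        return random.choice(available)
    # Failover: use the default data path while it is available,
    # otherwise fail over to the next configured path.
    return available[0]

paths = [
    {"mediaagent": "MA1", "library": "DiskLib1", "online": False},
    {"mediaagent": "MA2", "library": "DiskLib1", "online": True},
]
print(select_data_path(paths, mode="failover"))  # MA1 is offline, so MA2 is used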
The following settings can be configured at the data path level:
• Hardware compression
• Hardware encryption
• Chunk size
• Block size
Hardware Compression
For data paths defined to write to tape libraries, the 'Hardware Compression' option is enabled by default. If a tape drive
supports hardware compression, then this option is enabled in the General tab of the Data Path Properties.
Hardware Encryption
For tape drives that support hardware encryption, Commvault® software manages configuration settings and keys. Keys
are stored in the CommServe® database and can optionally be placed on the media to allow recovery of data if the
CommServe database is not available at time of recovery. The data path option 'Via Media Password' places the keys on
the media. The 'No Access' option only stores the keys in the CommServe database.
If the 'Via Media Password' option is chosen, it is essential that a Media Password be configured, or the encrypted data
can be recovered without entering any password during the recovery process. A global Media Password can be set in the
'System Settings' applet in the Control Panel. Optionally, a storage policy-level password can be set in the Advanced tab of the Storage Policy Properties.
Chunk Size
Chunk size defines the size of the data chunks that are written to media; each chunk also acts as a checkpoint in a job. The default size for disk is 4GB. The default size for tape is 8GB for index-based operations or 16GB for non-indexed database backups.
The data path 'Chunk Size' setting can override the default settings. A higher chunk size results in a more efficient data
movement process. In highly reliable networks, increasing chunk size can improve performance. However, for unreliable
networks, any failed chunks must be rewritten, so a larger chunk size could have a negative effect on performance.
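The default chunk sizes listed above can be summarized in a small helper. This is a simple illustration of the stated defaults, not a Commvault API; the function name is an assumption.

def default_chunk_size_gb(media_type: str, indexed: bool = True) -> int:
    # Defaults described above: 4GB for disk; for tape, 8GB for index-based
    # operations and 16GB for non-indexed database backups. The data path
    # 'Chunk Size' setting can override these defaults.
    if media_type == "disk":
        return 4
    if media_type == "tape":
        return 8 if indexed else 16
    raise ValueError(f"unknown media type: {media_type}")

print(default_chunk_size_gb("tape", indexed=False))  # 16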
Block Size
The default block size Commvault® software uses to move and write data to media is 64KB. This setting can range from 32KB to 2048KB. Like chunk size, a higher block size can increase performance. However, block size is hardware
dependent. Before modifying this setting, ensure all hardware being used at your production and DR sites support the
higher block size. If you are not sure, don't change this value.
When writing to tape media, changing the block size only becomes effective when Commvault software rewrites the OML
header on the tape. This is done when new media is added to the library, or existing media is recycled into a scratch pool.
Media with existing jobs continue to use the block size established by its OML setting.
When writing to disk, it is important to match the block size data path setting to the formatted block size of the disk.
Matching block sizes can greatly improve disk performance. The default block sizes operating systems use to format disks are usually much smaller than the default setting in the Commvault software.
It is strongly recommended to format disks to the block size being used in Commvault software.
Consult with your hardware vendor’s documentation and operating system settings to properly format
disks.
Cloud
What is Cloud?
Commvault® is a leader in the protection, management, and migration of cloud infrastructure. Whether it is a public cloud environment (cloud provider), a private cloud infrastructure (on-premises), or a hybrid cloud made of both cloud and on-premises resources, Commvault® software offers tools to handle ever-growing cloud environments, including:
• Application Agents
• Virtual Server Agents
• Application-Aware features
• Workflows
Before deciding which options to use, it is first important to collect information about the environment to protect, as well as
understanding the differences between cloud offerings. This can significantly impact the features available to use.
What is a Cloud?
Several cloud offerings and technologies can be used when building a cloud infrastructure. They are classified in the
following major categories, which basically defines the responsibility boundaries between the customer and the cloud
provider:
• Private cloud (or on-premises) - a cloud infrastructure hosted on-premises where the customer is
responsible for managing the entire stack (hardware and software).
• Infrastructure-as-a-Service (or IaaS) - A public cloud environment hosted by a cloud provider
allowing a customer to run virtual machines. The cloud vendor is responsible for managing the hardware (physical servers, storage, and networking), while the customer is responsible for creating and maintaining virtual machines. This includes maintaining the operating system, applications, and data.
• Platform-as-a-Service (or PaaS) - As the name suggests, the cloud vendor provides a platform that
typically includes the hardware, the operating system, the database engine, a programming
language execution environment, as well as web servers. The customer is not responsible for maintaining any virtual servers and can focus on using the framework to develop applications using databases. The customer is therefore responsible for maintaining the applications and the data. Good
examples of PaaS are Microsoft® Azure Database services and Amazon Relational Database
Services (RDS).
• Software-as-a-Service (or SaaS) - A cloud-based application for which the cloud provider is
responsible in its entirety. This includes the application itself, which is offered 'on-demand' to the
customer. A good example of SaaS is Microsoft® Office 365.
Responsibility boundaries by cloud offering
1. The main data center VMs, physical servers, and applications are backed up to a local deduplicated
library.
2. A predefined schedule (e.g., every 30 minutes) copies the backup data to a deduplicated cloud library using the Commvault® DASH Copy feature.
3. If the data center is lost in a disaster, the data recovery is initiated from the cloud library.
4. The virtual machines are recovered and converted into cloud provider VMs. For instance, VMware virtual machines protected in the data center could be recovered and converted into Microsoft® Azure VMs.
5. If needed, the file system of physical servers is restored in cloud provider VMs.
6. Applications are restored either in VMs, Platform-as-a-Service (PaaS) or Software-as-a-Service (SaaS)
instances.
7. Applications are brought online and users can connect.
Disaster Recovery to Cloud Workflow
Using advanced features such as Commvault deduplication can greatly reduce the bandwidth requirements of backing up
to cloud storage. However, in a disaster situation where a significant amount of data must be restored, bandwidth can
become a serious bottleneck.
Data transfers are achieved using secured channels (HTTPS) and are optionally encrypted to further secure the data sent
to the cloud.
For more information, refer to the Commvault Online Documentation, 'Configuring Micro Pruning on Cloud Storage' section.
The list of supported cloud providers for Commvault® software has grown over the years, reaching 30 providers as of Service Pack 14. For a complete list of supported providers, please refer to the Commvault Online Documentation.
A MediaAgent must be defined to act as a gateway and to send the data to the cloud. If the library is used for secondary copies of data stored in a local library, it is recommended whenever possible to use the MediaAgent hosting the primary copy to avoid unnecessary traffic. If the MediaAgent requires a proxy to reach the cloud, it can be defined during the cloud library creation process by using the Advanced tab.
7. Provide the cloud storage connection credentials from the list or click create if they were not already configured.
In several cases, this approach is less costly than maintaining a complete disaster recovery infrastructure in a secondary site. Cloud storage can be leveraged to host a copy of the backup data, ready to be restored if needed. Furthermore, the Commvault® Live Sync feature can be used to recover the backup data automatically, significantly reducing recovery time objectives (RTO).
Using Live Sync to recover backup data automatically improves the recovery time objective (RTO) of systems but incurs larger costs as cloud resource usage is increased. In this situation, the 'Disaster Recovery' workflow is used:
• The main data center VMs, physical servers, and applications are backed up to a local deduplicated
library.
• As soon as a backup completes, the data is copied to a deduplicated cloud library using the Commvault® DASH Copy feature.
• As soon as the copy to the cloud library completes, a recovery process is automatically initiated.
• The virtual machines are recovered and converted into cloud provider VMs. For instance, VMware virtual machines protected in the data center could be recovered and converted into Microsoft® Azure VMs.
• If needed, the file system of physical servers is restored in cloud provider VMs.
• Applications are restored either in VMs, Platform-as-a-Service (PaaS) or Software-as-a-Service (SaaS)
instances.
• If a disaster occurs, applications are brought online and users can connect.
Deduplication
• The Global Deduplication Policy – defines the rules for the Deduplication Engine. These rules
include:
o Deduplication Store location and configuration settings
o The Deduplication Database (DDB) location and configuration settings
• A Data Management Storage Policy – is configured as a traditional storage policy and also manages subclient associations and retention. Storage policy copies defined within the Data Management policy are associated with Global Deduplication storage policies. This association of the Data Management Storage Policy copy to a Global Deduplication Policy determines in which Deduplication Store the protected data resides.
• Deduplication Database (DDB) – is the database that maintains records of all signatures for data
blocks in the Deduplication Store.
• Deduplication Store – contains the protected storage using Commvault deduplication. The store
is a disk library which contains non-duplicate blocks, along with block indexing information, job
metadata, and job indexes.
• Client – is the production client where data is being protected. The client has a file system and/or
an application agent installed. The agent contains the functionality to conduct deduplication
operations, such as creating data blocks and generating signatures.
• MediaAgent – coordinates signature lookups in the DDB and writes data to protected storage. The signature lookup operation is performed using the DDB hosted on the MediaAgent.
In the unlikely event that the DDB becomes corrupt, the system automatically recovers the DDB from the most recent
backup. Once the DDB backup has been restored, a reconstruct process occurs which will ‘crawl’ job data since the last
DDB backup point. This brings the restored DDB to the most up-to-date state. Keep in mind that the more frequently DDB
backups are conducted, the shorter the ‘crawl’ period lasts to completely restore the DDB. Note that during this entire
recovery process, jobs that require the DDB must not be running.
When using a transactional DDB, the system checks the integrity of the database and the 'DiskDB' logs and attempts to bring the database to an online, consistent state. If this process succeeds, it only takes a few minutes to bring the database online. If the process is not successful, such as when the entire disk is lost, the process is automatically switched
to full reconstruct mode.
• Delta Reconstruction – When using transactional deduplication, in the event of an unclean DDB shutdown due to a MediaAgent reboot or system crash, the 'DiskDB' logs can be used to bring the DDB to a consistent state.
• Partial Database Reconstruction – If the DDB is lost or corrupt, a backup copy of the database is
restored and the database is reconstructed using chunk metadata.
• Full Database Reconstruction – If the DDB is lost and no backup copy is available, the entire
database is reconstructed from chunk metadata.
The remaining 16KB will be hashed in its entirety. In other words, Commvault® deduplication will not add more data to the
deduplication buffer. The result is if the object containing the three deduplication blocks never changes, all three blocks
will always deduplicate against themselves.
The minimum fallback size to deduplicate the trailing block of an object is 4096 bytes (4 KB). Any trailing block smaller than 4096 bytes is protected but is not deduplicated.
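To make the block-splitting arithmetic concrete, the following short Python sketch (illustrative only, not Commvault code) models how an object could be divided into deduplication blocks, assuming the default 128 KB block size and the 4 KB trailing-block minimum described above.

BLOCK_SIZE = 128 * 1024      # assumed default deduplication block size
MIN_TRAILING = 4096          # minimum trailing block size eligible for deduplication

def split_into_dedup_blocks(object_size_bytes):
    """Return a list of (block_size, deduplicated) tuples for a single object."""
    blocks = []
    remaining = object_size_bytes
    while remaining >= BLOCK_SIZE:
        blocks.append((BLOCK_SIZE, True))          # full block: signature is generated
        remaining -= BLOCK_SIZE
    if remaining:
        # the trailing block is hashed in its entirety if it meets the 4 KB fallback size;
        # a smaller trailing block is still protected but is not deduplicated
        blocks.append((remaining, remaining >= MIN_TRAILING))
    return blocks

# a 272 KB object yields two 128 KB blocks plus a 16 KB trailing block, all deduplicated
for size, dedup in split_into_dedup_blocks(272 * 1024):
    print(size, dedup)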
When using Commvault compression during backups instead of application compression, the application agent can be
configured to detect the database backup and generate a signature on uncompressed data. After the signature has been
generated, the block is then compressed, which leads to improved deduplication ratios. By default, Commvault® software
always compresses prior to signature generation. Note that an additional setting can be added to the database client to
generate the signature prior to compression.
Log files are constantly changing with new information added and old information truncated. Since the state of the data is
constantly changing, deduplication will provide no space saving benefits. During log backup jobs, the application agent
detects the log backup and no signatures are generated. This saves CPU and memory resources on the production
system and speeds up backups by eliminating signature lookups in the DDB.
Source-Side Deduplication
Source-side deduplication, also referred to as 'client-side deduplication,' occurs when signatures are generated on
deduplication blocks by the client and the signature is sent to a MediaAgent hosting the DDB. The MediaAgent looks up
the signature within the DDB. If the signature is unique, a message is sent back to the client to transmit the block to the
MediaAgent, which then writes it to the disk library. The signature is logged in the DDB to signify the deduplication block is
now in storage.
If the signature already exists in the DDB then the block already exists in the disk library. The MediaAgent communicates
back to the client agent to discard the block and only send metadata information.
Target-Side Deduplication
Target-side deduplication requires all data to be transmitted to the MediaAgent. Signatures are generated on the client or
on the MediaAgent. The MediaAgent checks each signature in the DDB. If the signature does not exist, it is registered in
the database and the deduplication block is written to the disk library.
If the signature does exist in the DDB, then the block already exists in the library. The deduplication block is discarded and
only metadata associated with the block is written to disk.
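Both modes rely on the same lookup decision. The following minimal Python sketch (an illustration under assumed names, not Commvault internals) shows the logic of a signature lookup against a deduplication database; the hash algorithm and the dictionary standing in for the DDB are assumptions.

import hashlib

def process_block(block: bytes, ddb: dict, source_side: bool):
    """Return what is transmitted or written for one deduplication block."""
    signature = hashlib.sha512(block).hexdigest()   # hash algorithm is illustrative
    if signature in ddb:
        # duplicate: only metadata is written; with source-side deduplication the client
        # is told to discard the block so it never crosses the network
        ddb[signature] += 1
        return "metadata only"
    # unique: register the signature and write the block to the disk library
    ddb[signature] = 1
    return "block sent by client" if source_side else "block written by MediaAgent"

ddb = {}
print(process_block(b"A" * 131072, ddb, source_side=True))   # unique block
print(process_block(b"A" * 131072, ddb, source_side=True))   # duplicate block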
Commvault® software can be configured to perform deduplication either on the client or on the MediaAgent, but which is best? This depends on several environmental variables including network bandwidth, client performance, and MediaAgent performance.
1. Signature is generated at the source - For primary data protection jobs using client-side
deduplication, the source location is the client. For auxiliary DASH copy jobs, the source MediaAgent
generates signatures.
2. The signature is sent to its respective deduplication database. The database compares the signature to determine whether the block is duplicate or unique.
3. The defined storage policy data path is used to protect data – regardless of which database the
signature is compared in, the data path remains consistent throughout the job. If GridStor ® Round-
Robin has been enabled for the storage policy primary copy, jobs will load balance across
MediaAgents.
When enabling Commvault deduplication for a primary copy, the ‘Enable DASH Full’ option is selected
by default.
• DASH Copies are auxiliary copy operations, so they can be scheduled to run at optimal time periods when network bandwidth is readily available. Traditional replication would replicate data blocks as they arrive at the source.
• Not all data on the source disk needs to be copied to the target disk. Using the subclient
associations of the secondary copy, only the data required to be copied would be selected.
Traditional replication would require all data on the source to be replicated to the destination.
• Different retention values can be set for each copy. Traditional replication would use the same retention settings for both the source and target.
• DASH Copy is more resilient in that if the source disk data becomes corrupt the target is still aware
of all data blocks existing on the disk. This means after the source disk is repopulated with data
blocks, duplicate blocks will not be sent to the target, only changed blocks. Traditional replication
would require the entire replication process to start over if the source data became corrupt.
Disk optimized DASH Copy will extract signatures from chunk metadata during the auxiliary copy process which reduces
the load on the source disks and the MediaAgent since blocks do not need to be read back to the MediaAgent and
signatures generated on the blocks.
Network optimized DASH Copy reads all blocks required for the auxiliary copy job back to the MediaAgent, which
generates signatures on each block.
To schedule an auxiliary copy job as a DASH Copy, first go to the Secondary Copy Properties Deduplication tab and, from
the Advanced subtab, select the ‘Enable DASH Copy’ check box and ensure that 'Disk Optimized' is also checked.
1. No additional block data that generates the same signature will reference a block in an incomplete
chunk.
2. Once the chunk and signatures are committed, any signatures that match ones from the committed
chunk can immediately start deduplicating against the blocks within the chunk.
Another way to look at this is that Commvault® software deduplicates on chunk boundaries. If multiple identical signatures appear in the same chunk, each signature is registered in the DDB and the blocks are written multiple times. Once the chunk is committed, duplicate signatures only increase the record counter on the first occurrence of the signature. All the other duplicate signatures registered in the DDB remain until the job is aged and pruned from storage.
It is also important to note that the chunk data is written as part of the job. Once the chunk is committed, SFiles that make
up the chunk are no longer bound to the job since other jobs can reference blocks within the SFile.
DASH Copy process for disk and network optimized auxiliary copy jobs
Optimize for high latency network is an optional setting which will first check the local MediaAgent disk cache. If the
signature is not found in the local cache, the process assumes the block is unique and sends both the block and the
signature to the destination MediaAgent.
Check this option to create a small cache used for initial lookups on the source MediaAgent before querying the destination MediaAgent.
Pruning is the process of physically deleting data from disk storage. During normal data aging operations, all chunks
related to an aged job are marked as aged and pruned from disk. With Commvault deduplication, data blocks within
SFILES can be referenced by multiple jobs. If the entire SFILE was pruned, jobs referencing blocks within the SFILE
would not be recoverable. Commvault software uses a different mechanism when performing pruning operations for
deduplicated storage.
The aging and pruning process for deduplicated data is made up of several steps. When the data aging operation runs, it
appears in the Job Controller and may run for several minutes. This aging process logically marks data as aged. Behind
the scenes on the MediaAgent, the pruning process runs, which can take considerably more time depending on the
performance characteristics of the MediaAgent and DDB, as well as how many records need to be deleted.
Pruning Methods
Commvault® software supports the following pruning methods:
• Drill Holes – For disk libraries and MediaAgent operating systems that support the Sparse file
attribute, data blocks are pruned from within the SFILE. This frees up space at the block level
(default 128 KB) but over time can lead to disk fragmentation.
• SFILE truncation – If all trailing blocks in an SFILE are marked to be pruned, the End of File (EOF)
marker is reset reclaiming disk space.
• SFILE deletion – If all blocks in an SFILE are marked to be pruned, the SFILE is deleted.
• Store pruning – If all jobs within a store are aged and the DDB is sealed and a new DDB is created,
all data within the sealed store folders is deleted. This pruning method is a last resort measure and
requires sealing the DDB, which is strongly NOT recommended. This process should only be done
with Commvault Support and Development assistance.
5. Signatures no longer referenced are moved into the zero reference table.
6. Signatures for blocks no longer being referenced are updated in the chunk metadata information.
Blocks are then deleted using the drill holes, truncation or chunk file deletion method.
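As a rough illustration of how the pruning methods listed above relate to block references, the sketch below (hypothetical logic, not the actual pruning engine) picks a method for a single SFILE based on which of its blocks are still referenced.

def choose_pruning_method(blocks, sparse_supported=True):
    """blocks: list of booleans, True meaning the block is still referenced by a job."""
    if not any(blocks):
        return "SFILE deletion"                 # no block referenced: delete the file
    # count unreferenced blocks at the end of the file
    trailing_unreferenced = 0
    for referenced in reversed(blocks):
        if referenced:
            break
        trailing_unreferenced += 1
    if trailing_unreferenced and all(blocks[:len(blocks) - trailing_unreferenced]):
        return "SFILE truncation"               # only trailing blocks aged: reset the EOF marker
    if sparse_supported:
        return "drill holes"                    # punch out aged blocks inside the file
    return "retain until more blocks age"       # fall back until truncation or deletion applies

print(choose_pruning_method([True, True, False, False]))   # SFILE truncation
print(choose_pruning_method([True, False, True, False]))   # drill holes
print(choose_pruning_method([False, False, False]))        # SFILE deletion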
To avoid that initial transfer over the WAN, Commvault® software offers a procedure called DDB Seeding. This procedure
transfers the initial baseline backup between two sites using available removable storage such as tapes, USB drives or
an iSCSI appliance.
Use DDB Seeding when remote office sites are separated from the data center across a WAN and data needs to be either
backed up remotely or replicated periodically to a central data center site. Once the initial baseline is established, all
subsequent backups and auxiliary copy operations consume less network bandwidth because only the changes are
transferred.
Note that this procedure is used to transfer only the initial baseline backup between two sites. It cannot be used for subsequent backups. DDB Seeding is typically used for:
• The initial backup of a large remote client or a large remote site with several clients.
• The initial auxiliary (DASH) copy between the main data center and the secondary data center.
Commvault® software also offers a workflow that automates most of those steps. For more information about the workflow,
consult the Commvault Online Documentation.
9. Run an auxiliary copy for the target library copy. This will copy the data from the removable storage
to the target disk library.
10. Once completed, validate that the data is accessible from the target disk library.
11. Modify the storage policy target library copy to use the primary copy as a source for auxiliary copy.
12. Delete the removable storage copy from the storage policy.
From this point on, traditional DASH copies will be used to transfer the data between the two sites. But since the baseline
exists in the target library, only blocks that have changed will be sent over the WAN.
For example, client backups are executed every hour and the DR backup is scheduled for 10:00 a.m. If the CommServe® server crashes at 1:00 p.m. and is restored, it uses the 10:00 a.m. DR backup. However, since some client backups ran between 10:00 a.m. and 1:00 p.m., the deduplication database contains block entries created after 10:00 a.m. These orphaned block entries are not known to the CommServe® server database.
Note that after a CommServe® server database restore, the deduplication databases may be in maintenance mode, which
requires resynchronization. But the resync process will work only if the CommServe server database is restored from a
DR backup that is less than five days old. If the DR backup used is older than five days, the deduplication databases must
be sealed, leading to a re-baseline of the deduplication store.
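The five-day rule can be expressed as a simple check. This sketch is only an illustration of the decision described above; the function name and dates are hypothetical.

from datetime import datetime, timedelta

RESYNC_WINDOW = timedelta(days=5)

def ddb_recovery_action(dr_backup_time: datetime, restore_time: datetime) -> str:
    """Decide whether the DDBs can be resynchronized or must be sealed."""
    age = restore_time - dr_backup_time
    if age < RESYNC_WINDOW:
        return "resynchronize deduplication databases"
    return "seal deduplication databases (re-baseline of the store)"

restore = datetime(2020, 2, 10, 13, 0)
print(ddb_recovery_action(datetime(2020, 2, 10, 10, 0), restore))  # resync
print(ddb_recovery_action(datetime(2020, 2, 3, 10, 0), restore))   # seal / re-baseline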
In a traditional backup environment, scalability is achieved by scaling up to increase resources. For instance, if a lack of
resources is detected for a media server, memory or processors can be added. If the server has used all of its resources,
then it must be replaced. Depending on the controller-based technology used (i.e., SAN, DAS, NAS), options are available
to add disks or an additional shelf of disks when storage space is low. But if the unit is already saturated, it must be
replaced by a larger one. This situation can involve high costs, significant planning, and migration efforts.
Using Commvault® HyperScale™ technology mitigates costly endeavors by providing on-premises scale-out backup and
recovery that delivers "cloud-like” scale and flexibility.
Commvault HyperScale™ technology allows you to start small and grow as needed, significantly reducing costs in the long run. For instance, deploying a block of three nodes provides 80 TB of available space as a storage target. When space is low, another block of nodes can be added to the pool. This new set of nodes expands the existing pool and is used automatically. Scaling an environment becomes a simple task with no need for reconfiguration.
Data is spread across all nodes using erasure coding, which provides resiliency. A disk can fail, or a node can go offline, without affecting environment operations or losing any data.
Infrastructure Models
The Commvault HyperScale™ environment can be implemented using the following two models:
• Commvault HyperScale™ Appliance – Support is provided by Commvault® not only for the software but also for the operating system, the firmware, and part replacement.
• Reference Architecture – The difference with the Commvault HyperScale™ appliance is that the support for firmware and part replacement is provided by the hardware vendor. Commvault is still responsible for the software and the operating system (RedHat® Linux) support.
The Reference Architecture can scale to hundreds of terabytes and is therefore suited to protect large organizations and
data centers.
For more information on the supported servers and vendors, consult the Commvault® online documentation. The number of validated servers constantly grows with each service pack.
• CommServe server
• MediaAgents
• Deduplication Database Partitions
CommServe® Server
In a Commvault HyperScale™ environment, the CommServe® server is required to control all operations, with nodes running on a RedHat® Linux operating system. Since the CommServe® server is a Windows-only component, it cannot be installed directly on a node. Therefore, a Linux virtualization platform clustered across all nodes using GlusterFS (a Linux clustered file system) is leveraged to run the CommServe® server as a virtual machine. If anything happens to the active node running the CommServe® server, it fails over to the next node of the block.
MediaAgents
Each node within a block acts as a MediaAgent, a data mover that ingests data received from servers and sends the data
back to the servers during restore operations. Data is spread to disks across all nodes of the block. Catalogs of protected
objects are stored in the index directory, which is present on each node. The streams received from servers are load
balanced across all MediaAgents that are part of the storage pool.
Note that if there is a need to achieve LAN free backups or to create a tape copy, an additional controller can be added to
connect to the storage or tape library.
When adding an additional three-node block to expand the storage pool, one database partition from each of the initial nodes is automatically moved to one of the three additional nodes. This results in a single partition per node. If the
storage pool is expanded again with another block, these new nodes will be part of the storage pool to increase the
storage capacity but will not host any database partitions. However, these additional nodes could host a deduplication
database partition for another storage pool, such as one using a cloud storage target. This is ideal to offer an offsite copy
of the data.
network can be used for DNS resolution purposes. All the switching, routing, and VLANs are not part of the reference architecture or the Commvault HyperScale™ appliance and must be provided and configured by the customer.
The network configuration also relies heavily on DNS resolution, both forward and reverse. Entries must be created in the DNS server for each node. Alternatively, hosts files on each node can be configured, but this increases the chance of human error and misconfiguration; using DNS resolution is recommended instead.
When configuring the Commvault HyperScale™ node, the interface for this 10 GB network is represented as eno3 at the Linux operating system level. If you run a Linux ifconfig command, it returns the eno3 interface configuration.
A representation of the backup network
Storage Network
The second 10 GB interface is used for the storage network (backend network). This isolated network is used for
communication of the clustered file system (GlusterFS) acting as a storage target to write backup data. This network can
use any arbitrary VLAN and does not require any routing towards other networks, nor a network providing DNS resolution.
All communications are handled internally by the Commvault HyperScale™ technology.
When configuring the Commvault HyperScale™ node, the interface used for the storage network is identified as eno4. If
you run a Linux ifconfig command, it will return that eno4 interface configuration.
An important requirement for this mechanism to work is that this network must be routed to communicate with the data
protection network. For instance, if network connectivity is lost on one of the nodes on the data protection network, the
software uses the iRMC network to automatically shut down the node, avoiding any inconsistencies (split brain). A
representation of the iRMC network
protection to isolate data transfer from the production network. If the data protection network has no DNS services
accessible, it makes the client backup configuration a lot harder. In this case, a 1GB optional network can be configured to
access the production network DNS services. The CommServe® server uses that interface to query network services.
The data transfer between clients and MediaAgents still travels on the data protection network.
At the node's operating system level, the management network interface is represented as eno2. Running an ifconfig command gives configuration information for that network.
Storage Architecture
Commvault HyperScale™ technology relies on a resilient storage system using erasure coding. The data is therefore scattered across multiple disks and nodes. When using the Commvault HyperScale™ reference architecture, nodes can have six, twelve, or twenty-four disk drives per node. The Commvault HyperScale™ appliance (HS1300) uses four disk drives per node. Depending on the configuration, one or more disks can be lost without losing data.
First, a choice must be made regarding the parity scheme to use. Commvault HyperScale™ technology offers two options: 4,2 parity and 8,4 parity. Encoded files are scattered across different disks and nodes. With the 4,2 model, the data is always available as long as four of the six segments are available.
The storage is logically divided into subvolumes. Each subvolume is made up of two physical disks per node from three
different nodes, for a total of six. As many logical subvolumes as needed are created until all disks are consumed.
Commvault HyperScale™ logical division of storage
Files encoded by Commvault HyperScale™ erasure coding are written to storage by following a simple rule: the six segments of a file must be written to the same subvolume, one segment per disk. This rule ensures that the segments of a file do not all end up on the same node or, even worse, on the same disk. The segments of the next file can be written to another subvolume or even the same one, but the segments of a single file are never split across multiple subvolumes.
File segments being written to a storage subvolume
Therefore, using 4,2 parity means that as long as four segments of a file are still available, the data is valid. Up to two disks could fail, or even a complete node, without impacting operations. Using 8,4 parity means that if 8 of the 12 segments of the file are available, the data can be read. If a disk or a node fails, it is important to address the issue as soon as possible to avoid reaching the failure threshold, beyond which data would be lost.
Resiliency when using 4,2 parity
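The resiliency arithmetic follows directly from the parity scheme. The short sketch below (illustrative only) checks whether data remains readable for a given number of failed segments under the 4,2 and 8,4 schemes.

def is_readable(scheme, failed_segments):
    """scheme: (data_segments, parity_segments), e.g. (4, 2) or (8, 4)."""
    data, parity = scheme
    total = data + parity
    # the file stays readable as long as at least 'data' segments survive
    return (total - failed_segments) >= data

print(is_readable((4, 2), failed_segments=2))   # True  - two disks (or one node) lost
print(is_readable((4, 2), failed_segments=3))   # False - beyond the tolerance
print(is_readable((8, 4), failed_segments=4))   # True  - 8 of 12 segments remain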
Storage Policies
1. Plan
2. Build
3. Assess & Modify
The following highlights the key elements of each phase:
• The Planning Phase – focuses on gathering all information to properly determine the minimum
number of storage policies required. Careful planning in this step makes it easier to build or modify
policies and subclients. The objective is to determine the basic structure required to meet
protection objectives. Modifications can later be made to meet additional requirements.
• There are three design methods that can be used during the plan phase:
o Basic Planning Methodology which focuses on generic guidelines to building storage policies
and subclients.
• The Build Phase – focuses on configuring storage policies, policy copies, and subclients. Proper
implementation in this phase is based on proper planning and documentation from the design
phase.
• The Modification Phase – focuses on key points for meeting backup/recovery windows, media
management requirements and environmental/procedural changes to modify, remove, or add any
additional storage policy or subclient components.
It is important to note that the ‘Design-Build-Modify’ approach is a cyclical process since an
environment is always changing. Not only is this important for data growth and procedural changes,
but it also allows you to modify your CommCell environment and protection strategies based on
emerging technologies. This provides greater speed and flexibility for managing protected data as
our industry continues to change at a rapid pace.
Consider these four basic rules for approaching storage policy design:
1. Keep it simple
2. Meet protection requirements
3. Meet media management requirements
4. Meet recovery requirements
Sometimes a 'Pie in the Sky' vision of protecting data can be brought back down to reality through a little education and by associating costs with the business requirements. Although you understand the capabilities and limitations of your storage,
the non-technical people may not. Provide basic guidance and education so they better understand what you and the
Commvault® software suite can do. You may not have the power to make the final decisions, but you do have the power to
influence the decision process.
• Protection Windows
• Recovery Time Objectives (RTO)
• Recovery Point Objectives (RPO)
When designing a CommCell environment, focus should always be placed on how data will be recovered. Does an entire
server need to be recovered or only certain critical data on the server require recovery? What other systems are required
for the data to be accessible by users? What is the business function that the data relies on? What is the associated cost
with that system being down for long periods of time? The following sections will address RTO and RPO and methods for
improving recovery performance.
Data Locations
In a distributed CommCell® architecture where different physical locations are using local storage, different storage
policies should be used. This avoids the potential of improper data path configurations within the policy copy resulting in
data being unintentionally moved over WAN connections. This also provides the ability to delegate control of local policies
to administrators at that location without potentially providing them full control to all policies.
Data Paths
For simplicity of managing a CommCell® environment, different libraries as well as location of the libraries may require
separate storage policies. This allows for easier policy management, security configurations, and media management.
Consider the following when determining storage policy strategies for libraries and data paths:
• When using Commvault® deduplication, for performance and scalability reasons different policies
should be used for each MediaAgent data path. This allows the deduplication database to be locally
accessible by each MediaAgent providing better throughput, higher scalability, and more streams to
be run concurrently.
• If a shared disk (not using Commvault deduplication) or shared tape library is being used where
multiple Client / MediaAgents have LAN free (Preferred) paths to storage, a single storage policy can
be used. Add each path in the Data Path Properties tab of the Primary Copy. Each Client /
MediaAgent will use the LAN Free path to write to the shared library. This allows for simplified
storage policy management and the consolidation of data to tape media during auxiliary copy
operations.
• If a shared disk (not using Commvault deduplication) or tape library is protecting LAN based client
data where multiple MediaAgents can see the library, each data path can be added to the primary
copy. GridStor® Round-Robin or failover can be implemented to provide data path availability and load balancing for data
protection jobs.
• Keep it simple. Unless specific content within an application or file system requires special retention
requirements, don't over design subclients.
• Consider using default retention policies providing several levels of protection. Provide the options
to the data owners and allow them to choose. Also, stipulate that if they do not make a choice, then
a primary default retention will be used. State a deadline in which they must provide their retention
requirements. It is important to note that this is a basic recommendation and you should always
follow policies based on company and compliance guidelines.
Consider defining retention rules for the following:
Disaster Recovery requirements should be based on the number of Cycles of data that should be retained. This should
also include how many copies (on-site / off-site) for each cycle.
Data Recovery requirements should be based on how far back in time (days) that data may be required for recovery.
Data Preservation/Compliance should be based on the frequency of point-in-time copies (Monthly, Quarterly, Yearly) and
how long the copies should be kept for (Days).
Data Isolation
A storage policy creates logical boundaries for protected data. Data associated with and managed by a storage policy is
bound to that policy. Protected data can be moved between copies within the same storage policy, but the data cannot be
moved from one storage policy to another. This data isolation can be crucial when considering the management of data by
different departments, by different data types, different retention needs, or different storage locations.
Compliance
Compliance requirements often dictate the long-term preservation of specific business data. There are multiple features built into Commvault® software that provide business data isolation and long-term storage for compliance data.
Reference Copy and legal hold provide methods to extract data from standard data protection jobs and associate the data
with storage policies configured to meet compliance retention requirements. When using these features, it is
recommended to configure separate storage policies to manage compliance data in isolation.
Legal Hold Storage Policies can also be used with Content Director for records management policies. This allows content
searches to be scheduled and results of the searches can be automatically copied into a designated Legal Hold Policy.
To use a legal hold storage policy, simply create a storage policy with the required legal hold retention. Then, enable it as
a legal hold policy, and the compliance officers and legal team members will be able to use it from the Compliance Search
portal.
Erase Data
Erase data is a powerful tool that allows end users or Commvault® administrators to granularly mark objects as
unrecoverable within the CommCell® environment. For object level archiving such as files and email messages, if an end
user deleted a stub, the corresponding object in Commvault protected storage can be marked as unrecoverable.
Administrators can also browse or search for data through the CommCell® console and mark the data as unrecoverable.
It is technically not possible to erase specific data from within a job. The way 'Erase Data' works is by logically marking the data unrecoverable. If a Browse or Find operation is conducted, the data does not appear. For this feature to be effective, any media managed by a storage policy with the 'Erase Data' option enabled cannot be recovered through Media Explorer, Restore by Job, or by cataloging media.
It is important to note that enabling or disabling this feature cannot be applied retroactively to media that has already been written to. If this option is enabled, then all media managed by the policy can be recovered only through the CommCell console. If it is not enabled, then all data managed by the policy can also be recovered through Media Explorer, Restore by Job, or by cataloging media.
If this feature is going to be used, it is recommended to use dedicated storage policies for all data that may require the
'Erase Data' option to be applied. Disable this feature for data that is known to not require this option.
To configure and use a Global Secondary Copy, the Global Secondary Copy Policy first needs to be created. Then, in
every storage policy for which you want to use it, a secondary copy associated to the Global Secondary Copy Policy must
be created.
Security
If specific users or groups need rights to manage a storage policy, it is recommended to use different policies for each
group. Each group can be granted management capabilities to their own storage policies.
Media Password
The Media Password is used when recovering data through Media Explorer or by Cataloging media. When using
hardware encryption or Commvault copy based encryption with the 'Direct Media Access' option set to 'Via Media
Password,' a media password is essential. By default, the password is set for the entire CommCell environment in the
System applet located in the Control Panel. Storage policy level media passwords can be set to override the CommCell
password settings. For a higher level of security or if a department requires specific passwords, use the 'Policy level'
password setting which is configured in the Advanced tab of the Storage Policy Properties.
RETENTION
Retention Overview
A data retention strategy is important for managing storage in your CommCell® environment. With Commvault® software,
you can define retention for multiple copies of data with each copy having different retention requirements. Additionally,
retention may be required at the object-level and not just the data protection operation. Commvault software makes this
strategy straightforward to implement by using storage policy copies, subclient object-level retention, and Exchange
configuration retention policies.
• Job based retention – Configured at the storage policy copy level, job schedule level, or manually by
selecting jobs or media to retain, and applying different retention.
• Subclient object based retention – Configured at the subclient level, it applies retention-based on
the deletion point of an object. Object-based retention is based on the retention setting in the
subclient properties plus the storage policy copy retention settings.
• Configuration policies – Currently used for Exchange mailbox protection. These policies include
archive, retention, cleanup, and journaling. Configuration policies provide the ability to define
complete retention and destruction policies, including the capability of deleting messages from the
production Exchange environment.
Retention Basics
Commvault® software provides extensive retention control for protected data. For basic retention requirements, follow the
general guidelines and best practices for retention configuration.
Disk storage:
o Leave the Cycles retention set at the default of two.
o Use the Days retention to govern retention policies for each copy.
o Never use extended retention rules when using Commvault deduplication.
Tape storage:
o Set the Cycles retention based on the number of complete sets of tape copies you want to retain. For example, if you want 30 days of data stored off-site, which includes at least four full backups and all dependent jobs (incremental or differential), for complete recovery from any tape set, set the Cycles retention to four.
o Set the Days retention based on standard retention requirements.
Days
A day is a 24-hour time-period defined by the start time of the job. Each 24-hour time period is complete whether a backup
runs or not. This way, a day is considered a constant.
Cycles
A cycle is defined as all backup jobs required to restore a system to a specific point-in-time. Traditionally, cycles are
defined as a complete full backup, all dependent incremental backups, differential backups, or log backups; up to, but not
including the subsequent full backup. A cycle is referenced as Active or Complete, which means that as soon as a full
backup completes successfully it starts a new cycle which is the active cycle. The previous active cycle is marked as a
complete cycle.
An active cycle is marked complete only if a new full backup finishes successfully. If a scheduled full backup does not
complete successfully, the active cycle remains active until such time that a full backup does complete. On the other hand,
a new active cycle begins and the previous active cycle is marked complete when a full backup completes successfully
regardless of scheduling. In this way, a cycle can be thought of as a variable value based on the successful completion or
failure of a full backup. This also helps to break away from the traditional thought of a cycle being a week long, or even a
specified period of time.
When setting retention in the policy copy, base it on the primary reason data is being protected. If it is for disaster
recovery, ensure the proper number of cycles are set to guarantee a minimum number of backup sets for full backup
restore. If you are retaining data for data recovery, then set the days to the required length of time determined by retention
policies. If the data recovery policy is for three months, 12 cycles and 90 days or 1 cycle and 90 days will still meet the
retention requirements.
With the release of Commvault Version 11 software, the default retention for a storage policy primary
copy is 15 days and 2 cycles. A secondary copy's default retention is 30 days and 4 cycles.
• Both Days and Cycles criteria must be met for aging to occur
• Data is aged in complete cycles
• Days criteria is not dependent on jobs running on a given day
Rule 1: Both CYCLES and DAYS criteria must be met
Commvault® software uses AND logic to ensure that both retention parameters are satisfied. Another way of looking at this is that the longer of the two values, cycles or days, within a policy copy always determines how long data is retained.
Example: Retention for a storage policy copy is set to 3 days and 2 cycles. This is not a typical example, but it's used to
logically prove the statement that both days and cycles criteria must be met for data to age. By Monday 3 full backups
have been performed. If Friday's full backup is aged, there would be 2 full backups left meeting our criteria of 2 cycles.
However, the days criteria calls for 3 days, and if the Friday full backup was aged, only 2 days would be counted. The
Friday full backup would therefore age on Tuesday.
Monday at 12 PM the data aging operation runs and determines no data can be marked aged
Tuesday at 12 PM the data aging operation runs and determines the Friday full backup can be marked aged
Backup data is managed within a storage policy copy as a cycle or a set of backups. This includes the full backup which
designates the beginning of a cycle and all incrementals or differentials backups. When data aging is performed and
retention criteria allows for data to be aged, the entire cycle is marked as aged. This process ensures that jobs will not
become orphaned resulting in dependent jobs (incremental or differential) existing without the associated full backup.
Example: This is another retention example used to prove the rule. Retention is configured for 7 days and 2 cycles. Full
backups are being performed on Fridays and Mondays, and incremental backups on all other days. On Saturday the
cycles criteria of 2 has been met since there are 3 full backups. If a cycle is removed there would be 2 left, a complete
cycle (Monday – Thursday) and the full backup on Friday night. However, since we prune entire cycles we would have to
age the Friday full backup and the incremental backups from Saturday and Sunday. This results in only 5 days, which does not meet our days retention requirement of 7. So on Monday, when the data aging operation runs (default 12 PM daily), there will be 7 days and 2 cycles, which allows the first cycle to be aged.
Retention has been defined for 7 Days and 2 Cycles. When the data aging operation runs on Saturday, the
cycles criteria has been met but not the days criteria
Retention has been defined for 7 Days and 2 Cycles. When the data aging operation runs on Monday both
Cycles and Days criteria have been met and the first cycle will be marked as aged
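The two rules can be combined into a single aging check. The following sketch is a simplified illustration of the logic described above, not Commvault's aging engine; the function and the dates used are hypothetical.

from datetime import datetime, timedelta

def can_age_oldest_cycle(cycle_start_times, now, retention_days, retention_cycles):
    """cycle_start_times: start times of the cycles on the copy, oldest first."""
    # Rule 2: data is aged in whole cycles, so removing the oldest cycle must still
    # leave at least the required number of cycles
    if len(cycle_start_times) - 1 < retention_cycles:
        return False
    # Rule 1: the days criteria must also be met; after aging, retained data would
    # reach back only to the start of the next oldest cycle
    days_remaining = (now - cycle_start_times[1]) / timedelta(days=1)
    return days_remaining >= retention_days

# mirrors the example above: fulls on Friday and Monday, retention of 7 days and 2 cycles
cycles = [datetime(2020, 2, 7), datetime(2020, 2, 10), datetime(2020, 2, 14)]
print(can_age_oldest_cycle(cycles, datetime(2020, 2, 15, 12, 0), 7, 2))  # False: only about 5 days would remain
cycles.append(datetime(2020, 2, 17))
print(can_age_oldest_cycle(cycles, datetime(2020, 2, 17, 12, 0), 7, 2))  # True: 7 days and 2+ cycles remain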
A day is measured as a 24-hour time period from the start time of a data protection job. Days are considered constants: regardless of whether a backup is performed or completes successfully, the time period is always counted. If a backup fails, backups are not scheduled, or the power goes out, a day still counts towards retention. This is why it is critical to measure retention in both cycles and days. If retention were managed by days alone and no backups were run for a few weeks, all backup data could age off, leaving no backups.
Example: Defining retention in both days and cycles is very important. For example, during a Friday night backup power is
lost in the building. Power is restored on Sunday resulting in two days elapsing and counting towards retention. Note that
since the Friday full backup failed, the cycle continues into the next scheduled full (following Friday).
A failure of a full backup on Friday due to a power outage results in a cycle continuing until a valid full is
completed
Spool Copy
Right-click the primary storage policy copy | Click Properties | Retention tab
The Spool Copy option is used for fast disk read/write access and its multi-streaming capabilities – when there is limited
capacity available on the disks. A spool copy is not a retention copy. Data is spooled to disk and then copied to a
secondary copy. Once the data is successfully copied to the secondary copy, the data on disk is pruned, immediately
freeing up space for new backups.
Extended Retention
Right-click the desired storage policy copy | Click Properties | Retention tab
Standard retention allows you to define the length of time based on cycles and days that you want to retain data.
Extended retention allows you to define specific retention in days that you want to keep full backups for. It allows you to
extend the basic retention by assigning specific retention to full backups based on criteria configured in the extended
retention settings. Basically, it allows you to set up a grandfather-father-son tape rotation scheme.
Extended retention rules are not designed to be used with disk storage and will have significant negative effects on aging
and pruning of deduplicated data.
Example: You want to retain backups for 4 cycles and 28 days. You also want to retain a monthly full backup for three
months, a quarterly full backup for a year, and a yearly full backup infinitely.
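The example can be sketched as a simple classification of full backups. This is only an illustration of the grandfather-father-son idea; the choice of which calendar months count as quarterly or yearly fulls is an assumption, not a product setting.

from datetime import date

def extended_retention_days(full_date: date, is_first_full_of_month: bool):
    """Return the extended retention (in days, or 'infinite') applied to a full backup."""
    if not is_first_full_of_month:
        return None                              # only the standard 28 days / 4 cycles apply
    if full_date.month == 1:
        return "infinite"                        # yearly full kept forever (assumed month)
    if full_date.month in (4, 7, 10):
        return 365                               # quarterly full kept for a year (assumed months)
    return 90                                    # monthly full kept for three months

print(extended_retention_days(date(2020, 1, 3), True))    # infinite (yearly)
print(extended_retention_days(date(2020, 4, 3), True))    # 365 (quarterly)
print(extended_retention_days(date(2020, 2, 7), True))    # 90 (monthly)
print(extended_retention_days(date(2020, 2, 14), False))  # None (standard retention only)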
It is NOT recommended to set zero cycles for a policy copy unless another copy has been configured
with at least one cycle defined.
• Synthetic full item carry forward – this method does not directly prune items that have
exceeded retention. Instead, upon deletion of an item either by the user or the agent, items are
carried forward with each synthetic full backup until its 'days' retention is exceeded. Once the
synthetic full ages based on storage policy copy retention, the item no longer exists. This method is
used for file system agents using V1 indexing and is configured in the Subclient Properties.
• Index masking – this method marks the item as unrecoverable by masking the item in the index.
This method requires V2 indexing. This method is implemented for file system agents using V2
indexing in the Subclient Retention tab and for Exchange Mailbox agent using Configuration
policies.
• Defensible deletion – some items, specifically email messages, must be destroyed when they are
deleted from the production mail server. Item based retention can provide defensible deletion of
items.
• Efficient media usage – Consider the benefit of managing one year of off-site data on
considerably fewer tapes. Typically, when data is sent off-site on tapes, the same stale data exists
each time a set of tapes is exported. If data is sent off-site weekly on tape, 52 versions of the same
stale item exists.
Example: Using item-based retention when secondary tape copies are created, only the items contained within the most
recent synthetic full backup are copied to tape. If the retention is set to 365 days, then each tape set contains all items
within the past year. This means with a standard off-site tape rotation of 30 days, 365 days of data exists on each set.
When an image file is generated, all objects that exist at the time of the scan phase of the backup job are logged in the
image file. This information includes date/time stamp and journal counter information, which is used to select the proper
version of the object when the synthetic full runs. If an object is deleted prior to the image file being generated, it is not
included in the image file and is not backed up in the next synthetic full operation. The concept of synthetic full backups
and deleted objects not being carried over in the next synthetic full is the key aspect of how object based retention works.
Synthetic full concept diagram
The synthetic full carry forward method is used for V1 file system subclients using subclient retention
rules.
The synthetic full backups themselves are retained based on the storage policy copy retention rules. So, if the storage
policy copy has a retention of 30 days and 4 cycles, then a synthetic full remains in storage until the job exceeds retention.
In this instance, the object is carried forward for 90 days and the last synthetic full that copies the object over is retained for 30 days; the object therefore remains in storage for 120 days from the time of deletion – 90 days of subclient retention plus 30 days of storage policy copy retention.
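The arithmetic of this example reduces to a simple sum (the numbers come from the text above; the function itself is illustrative only).

def total_retention_after_deletion(subclient_retention_days, copy_retention_days):
    # the deleted item is carried forward by synthetic fulls for the subclient retention,
    # and the last synthetic full containing it is then retained by the copy retention
    return subclient_retention_days + copy_retention_days

print(total_retention_after_deletion(90, 30))   # 120 days from the time of deletion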
How long a deleted object is potentially retained in a secondary copy depends on the copy type. If the secondary copy is a
synchronous copy then the deleted object will always be retained for the retention defined in the secondary copy since all
synthetic full backups will be copied to the secondary copy. Selective copies however, allow the selection of full backups
at a time interval. If synthetic full backups are run daily and a selective copy is set to select the month end full, then any
items that are not present in the month end synthetic full will not be copied to the selective copy. To ensure all items are
preserved in a secondary copy, it is recommended to use synchronous copies and not selective copies.
By default, a cleanup process runs every 24 hours. This process checks the Retention Policy's 'Retain for' setting for
messages or the subclient retention for files and marks all items exceeding retention as invisible. It is important to note
that if the 'Retain for' setting or the subclient retention is changed (i.e., decreasing the number of days), the next aging process immediately honors the new retention value.
If Exchange Mailbox agent data is copied to secondary copy locations, the days setting defined in the Retention Policy is
not honored. Instead, standard storage policy copy retention determines how long the messages are retained. In other
words, the primary copy manages all items at a granular level and secondary copies manage the retention at the job level.
From a compliance standpoint, this is an important distinction and should be taken into consideration when defining data
retention and destruction policies.
If the V2 index is lost and restored to a previous point-in-time, it is possible that previously masked items are set to visible again. The next time the aging process runs, these items are re-masked, making them unrecoverable.
From a compliance standpoint, defensible deletion of items is crucial. There is the possibility that email messages or files
copied to secondary storage such as tape media, could potentially be recovered using the Media Explorer tool. To ensure
that this cannot occur, enable the 'Erase Data' checkbox for any storage policies managing Exchange Mailbox agent data.
Note that the 'Erase Data' option is enabled by default for all data management storage policies.
Subclient Retention
Right-click Subclient | properties | Advanced | Retention tab
Subclient retention should only be used for users' data. When using synthetic full backups, subclient retention can be applied to both backup and archive operations.
These settings only apply to files (or stubs) that are deleted from the system.
• Blocks the use of traditional full backups, only synthetic full backups are allowed.
• Enables the use of archiver or backup retention options.
Example: You enable 'Backup Retention' on the subclient Retention tab and set the 'After deletion keep items for <period
of time>' option time value to 1 month. The 1 month (30 day) count starts from the last time the deleted file appeared in a
data protection job's scan. Appearance in a data protection job scan means the file is considered to be "in image." An "in
image" file always has a copy in protected storage. A synthetic full backup job keeps the deleted file "in image" for the
specified time. Once the backup retention time has passed, storage policy retention is applied. The deleted file appears
last in the most recently completed synthetic full backup job. Storage policy copy retention then retains that job for its
cycle and days retention criteria. Synthetic full backup jobs must be run to enable aging and pruning of data from media.
Retention is cycle and time-based. Files or stubbed files are extended on media by both the archiver and backup retention
time based on their file modification time. Once this retention has been exceeded, the storage policy copy retention 'Days
and Cycles' criteria are applied. Synthetic full backups must be run to allow aging and pruning of data from media.
Note: A stub file supports non-browse recovery operations (i.e., stub recalls) and acts as a place holder to persist the
associated file on media through synthetic full backups. Stub files have the same modification time as the associated file.
Deleting a stub is equivalent to deleting the file.
The 'Retention of deleted files on media' is time-based only using the deleted file's modification time (MTIME). Based on
the MTIME, the deleted file is retained on media for the 'Archiver Retention' time plus the 'Backup Retention' time. So, if
'Archiver Retention' was set to 2 years and 'Backup Retention' set to 1 month, the total retention time on media for deleted
files would be 2 years and 1 month from the deleted file's last modification time.
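The calculation can be sketched as follows (values from the example above; the helper itself is hypothetical).

from datetime import datetime, timedelta

def retained_until(mtime: datetime, archiver_days: int, backup_days: int) -> datetime:
    """A deleted file is retained on media for archiver plus backup retention from its MTIME."""
    return mtime + timedelta(days=archiver_days + backup_days)

# 2 years of archiver retention plus 1 month of backup retention, counted from the MTIME
print(retained_until(datetime(2020, 2, 1), archiver_days=2 * 365, backup_days=30))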
Note: If 'Archiver Retention' is set to 'Extend Retention' indefinitely (default), 'Backup Retention' is un-selectable. To select both options, you need to set the 'Archiver Retention' option to 'Extend Retention' for <a period of time>.
Retaining previous file versions essentially applies the same retention clock basis (file modification time) used for the
current version to all versions qualified by the criteria.
VIRTUALIZATION
Virtualization Primer
Virtualization has become the standard for data center consolidation, whether on-premises or in the cloud. As the number of virtual machines and the physical hosts they run on grows, a comprehensive protection strategy is required to ensure
proper protection. Commvault® software provides several protection methods for virtual environments on premises and in
the cloud. These methods provide a comprehensive enterprise hybrid protection strategy.
There are four primary methods Commvault® software can use to protect virtual environments:
Depending on the hypervisor application being used and the virtual machine's operating system, different features and
capabilities are available. The VSA interfaces with the hypervisor's APIs and provides capabilities inherent to the
application. As hypervisor capabilities improve, the Commvault VSA agent is enhanced to take advantage of new
capabilities.
Agent-Based Protection
Agent-based protection uses Commvault agents installed directly in the virtual machine. When an agent is installed in the
VM, it appears in the CommCell® console just like a regular client and the functionality is the same as an agent installed
on a physical host.
The main advantage with this configuration is that all the features available with Commvault agents are used to protect
data on the VM. For applications, using a Commvault agent provides complete application awareness of all data
protection operations including streaming log backups, granular item-based protection, archiving and content indexing.
The process for protecting virtual machines is similar to performing snapshots with the VSA agent directly interfacing with
the hosting hypervisor application. The VSA first quiesces the virtual machine and then the IntelliSnap feature uses vendor APIs to perform a hardware snapshot of the Datastore. The Datastore is then mounted on an ESX proxy and all VMs are
registered. Finally, the VMs are backed up and indexes are generated for granular level recovery. The snapshots can also
be maintained for live browse and recovery. The backup copies are used for longer term retention and granular browse
and recovery.
Transport Modes
The VMware® VADP framework provides three transport modes to protect virtual machines: SAN, HotAdd, and NBD.
SAN Mode
Virtual machines are backed up through the VSA and to the MediaAgent. If the VSA is installed on a proxy server
configured as a MediaAgent with direct access to storage, LAN-Free backups can be performed. For best performance,
Commvault recommends that the VSA have a dedicated HBA to access the VMDK files. If an iSCSI SAN is used, we
recommend a dedicated Network Interface Card on the VSA for access to the SAN.
HotAdd Mode
HotAdd mode uses a virtual VSA in the VMware environment. This requires all data to be processed and moved through
the VSA proxy on the ESX server. HotAdd mode has the advantage of not requiring a physical VSA proxy and does not
require direct SAN access to storage. It works by 'hot adding' virtual disks to the VSA proxy and backing up the disks and
configuration files to protected storage.
A common method of using HotAdd mode is to use Commvault® deduplication with client-side deduplication, DASH Full
and incremental forever protection strategy. Using Change Block Tracking (CBT), only changed blocks within the virtual
disk have signatures generated and only unique block data are protected.
This mode is also useful when there is no physical connectivity between the physical VSA proxy and the Datastore
storage preventing the use of SAN transport mode. Some examples of such scenarios are when using NFS Datastores or
using ESX hosts local disk storage to host Datastores.
NBD Mode
NBD mode uses a VSA proxy installed on a physical host. The VSA connects to VMware and snapshots will be moved
from the ESX server over the network to the VSA proxy. This method requires adequate network resources. NBD mode is
the simplest method to protect virtual machines.
The following steps illustrate the process of backing up VMware® virtual machines:
1. The Virtual Server Agent communicates with the hypervisor instance to locate the virtual machines defined in the subclient that require protection.
2. Once a virtual machine is located, the hypervisor prepares the virtual machine for the snapshot
process.
3. The virtual machine is placed in a quiescent state. For Windows ® VMs, VSS is engaged to quiesce
disks.
4. The hypervisor then conducts a software snapshot of the virtual machine.
5. The virtual machine metadata is extracted.
6. The backup process then backs up all virtual disk files and VM configuration files.
7. Once the disks are backed up, indexes can optionally be generated for granular recovery.
8. Finally, the hypervisor deletes the snapshots.
Stream allocation when there are more VMs than data readers
Stream allocation when there are more data readers than VMs
DataStore Distribution
If VMs within a subclient exist across multiple Datastores, the coordinator assigns VMs to proxies, one VM per Datastore
until the maximum stream count is reached. Each VM is assigned to a different data mover proxy, balancing stream loads
across proxies based on proxy resources. This distributes the load across multiple Datastores, which improves backup performance and maintains a healthy Datastore state. In addition to the subclient Data Readers setting, a hard limit can be
set for the maximum number of concurrent VMs that can be protected within a single Datastore using the
nVolumeActivityLimit additional setting.
VSA Proxies
Commvault® software uses VSA proxies to facilitate the movement of virtual machine data during backup and recovery
operations. The VSA proxies are identified in the instance properties. For Microsoft Hyper-V, each VSA proxy is
designated to protect virtual machines hosted on the physical Hyper-V server. For VMware, the VSA proxies are used as a pooled resource. This means that depending on resource availability, different proxies may be used to back up VSA subclients each time a job runs. This method of backing up virtual machines provides higher scalability and resiliency.
1. Number of proxies available to back up a VM – The fewer proxies available, the higher in the queue the VM is. This is also dependent on the transport mode. If the transport mode is set to Auto (default), SAN has the highest priority, followed by HotAdd and then NBD mode. If a specific transport mode is defined in the subclient, only proxies that can protect the VM can be used – this could affect the available number of proxies, which could result in a higher queue priority.
2. Number of virtual disks – VMs with more virtual disks are higher in the queue.
3. Size of virtual machine – Larger VMs are higher in the queue.
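The queuing criteria above can be modeled as a sort with three keys. The sketch below is a conceptual illustration only; the field names and sample VMs are assumptions.

def queue_order(vms):
    """vms: list of dicts with 'name', 'usable_proxies', 'disks', 'size_gb'."""
    # fewer usable proxies first, then more virtual disks, then larger VMs
    return sorted(vms, key=lambda vm: (vm["usable_proxies"], -vm["disks"], -vm["size_gb"]))

vms = [
    {"name": "vm-sql", "usable_proxies": 1, "disks": 4, "size_gb": 800},
    {"name": "vm-web", "usable_proxies": 3, "disks": 1, "size_gb": 60},
    {"name": "vm-app", "usable_proxies": 3, "disks": 2, "size_gb": 200},
]
for vm in queue_order(vms):
    print(vm["name"])        # vm-sql, vm-app, vm-web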
Throttling can be hard set on a per proxy basis using the following registry keys:
2. Click Advanced.
• Cluster Shared Volume (CSV) owner – VMs are protected based on the VSA proxy that owns the
cluster.
• Cluster – If CSV owner is not in the proxy list, VMs are dispatched to any node in the cluster.
• Host – When the hypervisor host is a VSA proxy and in the proxy list, the host VSA proxy is used.
• Any Proxy – If the hypervisor host is not a proxy or not in the list, VMs are distributed to any
available proxy.
If the VSA coordinator proxy goes offline, VSA backup jobs managed by the coordinator are placed in a pending state. The
next proxy in the list assumes the role of the active coordinator proxy and jobs will return to a running state. Any VMs that
were in process of being protected are re-queued and restarted.
The data readers setting in the subclient is the primary governor that determines the maximum number of VMs or virtual disks that can be protected at a given time. The load distribution attempts to balance VM backups across disk volumes. However, if the VMs requiring protection reside on only a few volumes and the data readers setting is too high, problems can occur.
When a VM backup runs, a software snapshot is taken where block changes are cached, the frozen disk data is read from
the volume, and normal I/O still occurs on the volume. With these three actions occurring simultaneously, if there are too
many snap and backup operations occurring, significant performance degradation can occur. This can also cause major
issues during snapshot cleanup operations.
As a general rule of thumb, each disk volume should have two concurrent snap and backup operations as a starting point.
This number may vary greatly based on whether the disks are SAN or NAS, the size of the disks, and the performance
characteristics. Consider the significant performance difference between spinning and solid state disks. Two data readers per disk volume is a starting point; adjust this number until backup windows are being met. Also,
consider mapping subclients to specific disk volumes and adjusting the data readers based on the performance
characteristics of the disks.
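As a rough starting point, the rule of thumb above can be expressed as a one-line calculation (a sketch of the guidance, not a product formula).

def initial_data_readers(datastore_count, readers_per_volume=2):
    """Starting point: two concurrent snap-and-backup operations per disk volume, then tune."""
    return datastore_count * readers_per_volume

print(initial_data_readers(3))   # a subclient spanning 3 Datastores would start at 6 readers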
It is possible to disable the skipping of page and swap files by creating the bSkipPageFileExtent additional setting on
the VSA proxy and by setting its value to 0 (zero).
If your subclient's content is defined using auto-discovery rules, it is recommended to define VM filters at the backup set level to ensure that no subclient backs up the filtered VMs.
Example: A database server requires protection. For shorter recovery points and more granular backup and recovery
functionality, a database agent can be used to protect application database and log files. For system drives, the virtual
server agent can be used for quick backup and recovery. Disks containing the database and logs should be filtered from
the VSA subclient. The VSA will protect system drives, and the application database agent will be used to protect the database daily and its log files every 15 minutes. This solution provides shorter recovery points by conducting frequent log backups, provides application aware backups and restores, and protects system drives using the virtual server agent.
The default subclient content tab contains a backslash entry, similar to the Windows® File System agent, to signify that the subclient is a catch-all. Any VMs not protected in other subclients are automatically protected by the default subclient. It is recommended that the default subclient contents are not changed, that activity is not disabled, and that the default subclient is regularly scheduled to back up, even if there are no VMs in the subclient.
To avoid protecting VMs that do not need to be backed up, use the backup set level filters and add all VMs that don't
require protection. Complying with these best practices ensures that if a VM is added in the virtualization environment,
even if the Commvault® system administrator is unaware of the VM, it gets protected by the default subclient.
VM Content Tab
Right-click the desired subclient | Click Properties | Content tab
VSA subclient contents are defined using the Browse or Add buttons. Browse provides a vCenter-like tree structure where resources can be selected at different levels, including Cluster or Datastore. For most environments, it is recommended to
select subclient contents at the cluster level. For smaller environments, or for optimal performance, defining subclient
contents at the Datastore level can be used to distribute the backup load across multiple Datastores.
The Add option is used to define discovery rules for VM content definition. Multiple rules can be nested such as all
Windows® VMs in a specific Datastore.
Discovery Rules
Right-click the desired subclient | Click Properties | Content tab | Add
You can refine the selection of virtual machines for subclient content by defining rules that identify specific virtual
machines based on their properties. These rules are used in conjunction with other discovery rules that identify virtual
machines based on operating system, server, and storage location.
• DataStore – Enter the DataStore name or a pattern. You can also click ... to open the Browse dialog box.
• Guest OS – Enter the exact name of the operating system or a pattern to identify an operating system group (for example, Win* to identify any virtual machine that has a version of the Windows® operating system).
• Guest DNS Hostname – Enter a hostname or a pattern to identify a hostname or domain (for example, myhost.mycompany.com to identify a specific host or *mycompany.com to identify all hosts on that domain).
• Power State – Select the power on status of virtual machines to be included in the subclient content. You can select one of the following options:
o Powered On - to identify VMs that are powered on
o Powered Off - to identify VMs that are powered off
o Other - to identify VMs with a different power on status, such as Suspended
• Notes – Enter a pattern to identify virtual machines based on notes text contained in vCenter annotations for the VM summary (for example, Test* to identify VMs with a note that begins with "Test").
• Custom Attribute – Enter a pattern to identify virtual machines based on custom attributes in vCenter annotations for the VM summary. You can enter search values for the names and values of custom attributes. For example:
o Name Contains *resize* to identify VMs where the name of a custom attribute contains the word "resize."
o Value Contains *128* to identify VMs where the value of a custom attribute contains the number "128."
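The wildcard patterns in these rules can be understood with a small matching sketch. This is a conceptual illustration only; the dictionary keys and matching behavior are assumptions and not the product's discovery engine:

import fnmatch

def vm_matches(vm, rules):
    # Case-insensitive wildcard matching in the spirit of the rules above
    # (Guest OS 'Win*', Guest DNS '*mycompany.com', and so on).
    return all(fnmatch.fnmatch(str(vm.get(key, "")).lower(), pattern.lower())
               for key, pattern in rules.items())

vm = {"guest_os": "Windows Server 2019", "guest_dns": "db01.mycompany.com"}
rules = {"guest_os": "Win*", "guest_dns": "*mycompany.com"}
print(vm_matches(vm, rules))   # True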
The VMware transport mode is configured in the General tab of the subclient. The default setting is Auto which will attempt
to use SAN or HotAdd mode and fall back to NBD mode if other modes are not available. To configure a specific transport
mode with no fall back, select the desired mode from the drop-down box.
Data Readers
Right-click the desired subclient | Click Properties | Advanced Options tab
The data readers setting in the advanced tab of the subclient properties is used to determine the number of streams used
for the subclient backup. This value must be set to meet backup windows while avoiding overloading DataStore, network,
and proxy resources.
Subclient Proxies
Right-click the desired subclient | Click Properties | Advanced Options tab
Proxies are defined in the VSA instance but can be overridden at the subclient level. This is useful when specific subclient
VM contents are not accessible from all VSA proxies. Proxies can be added, removed, and moved up or down to set proxy
priority.
Subclient or backup set filters can be used to filter virtual machines or virtual machine disks for both Hyper-V and VMware.
If auto-discovery rules are used to define content, it is recommended to apply filters at the backup set level to ensure that no subclient protects the filtered VMs.
Backup Options
Right-click the desired subclient | Click Properties | Backup Options tab
There are several subclient options that are specific to the VMware® and Hyper-V® VSA subclient.
• Quiesce guest file system and applications – Configured in the Quiesce Options tab, this is used to enable (default) or disable the use of VSS to quiesce disks and VSS aware applications for Windows® virtual machines.
• Application aware backup for item-based recovery – Configured in the Quiesce Options tab,
this is available only when using the IntelliSnap feature and is used to conduct application aware
snapshots of virtualized Microsoft SQL and Exchange servers.
• Perform Datastore free space check (VMWare only) – Configured in the Quiesce Options tab,
this sets a minimum free space (default 10%) for the Datastore to ensure there is enough free
space to conduct and manage software snapshots during the VM data protection process.
Virtual machine owners can be assigned automatically during virtual machine discovery, based on privileges and roles
defined in vCenter that indicate rights to virtual machines. When this feature is enabled, users and user groups who have
appropriate capabilities in vCenter and are also defined in the CommCell® console are automatically assigned as VM
owners in the client computer properties for the virtual machine.
This feature enables administrators and end users to access virtual machine data without requiring that they be assigned
as VM owners manually. Depending on the permissions and role a user has in vCenter, they can view virtual machine
data or recover VM data. Any user with Remove VM, VM Power On, and VM Power Off capabilities for a virtual machine is
assigned as an owner of that VM during VM discovery.
Owner IDs are only assigned during discovery for a streaming or IntelliSnap backup and are not modified by backup copy
or auxiliary copy operations.
Single sign on must be enabled on the vCenter and required vCenter capabilities must be configured for users and
groups.
Users or user groups defined in vCenter must also be defined in the CommCell interface, either through a local user
definition or a Name Server user definition (such as an Active Directory user or group).
• Live File Recovery – allows Commvault software to break open a backup or snapshot copy of a
virtual machine and recover individual files. This feature provides extended support for various file
system types. Use this feature to reduce backup times without sacrificing the capability to recover
individual files.
• Live Recovery for Virtual Machines – provides the ability to start a virtual machine almost instantaneously while recovering it in the background. This greatly improves the effective RTO, since you do not have to wait for the full recovery operation to complete before accessing the virtual machine.
• Live Mount – allows virtual machines to be powered on directly from the backup copy without having to restore them or commit any changes. This allows access to the virtual machine for validation purposes, testing, or application level recovery via the provided mining tools.
• Live Sync – takes changed blocks from the standard VSA backup copy and applies those blocks to a warm standby VM at an alternate location, thereby providing VM level replication. Live Sync can be used to create and maintain warm recovery sites for virtual machines running critical business applications.
Not all VSA features are supported on all hypervisors. For more information on supported features for your hypervisor,
refer to the Commvault Online documentation.
Live Mount
Expand Client Computer Groups | VSA instance | Right-click the desired VM | All Tasks | Live Mount
The Live Mount feature enables you to run a virtual machine directly from a stored backup. You can use this feature to validate that backups are usable for a disaster recovery scenario, to validate the content of the backup, for testing purposes, or to access data from the virtual machine directly instead of restoring guest files.
Virtual machines that are live mounted are intended for short term usage and should not be used for production; changes
to live mounted VMs or their data are not retained when the virtual machine expires. The VM expiration period is set
through a Virtual Machine policy.
When a live mount is initiated, an ESX server is selected to host the virtual machine, based on the criteria set in the live
mount virtual machine policy. The backup is exposed to the ESX server as a temporary Datastore. The configuration file
for the live mounted VM is updated to reflect the name of the new VM, disks are redirected to the Datastore, and network
connections are cleared and reconfigured to the network selected for the live mounted VM. When this reconfiguration is
complete, the VM is powered on.
Situation: You are about to apply updates to a critical system and are concerned about the impacts on the
system.
Solution: Use Live Mount to power on the same system from the backups. Isolate it on its own network to avoid
duplicate hostname and IP address. Install and validate the update.
Live File Recovery provides expanded file system support, including ext4, and enables live browse of backup data without
requiring granular metadata collection during backups. This option supports restores of files and folders from backups of
Windows VMs and of UNIX VMs that use ext2, ext3, ext4, XFS, JFS, or Btrfs file systems.
Live File Recovery can also be used to reduce backup times. This is a trade-off; using this feature reduces backup time
but increases the time required to browse files and folders. It is only supported for backups to disk storage targets.
To recover files or folders from a backup, you can enable backup data to be mounted as a temporary NFS Datastore that
can be used to browse and restore file and folders. The process is similar to an ISO file that you right-click and mount on a
Windows computer. The operating system virtually mounts the ISO file and cracks it open to display the content. In the
case of Live File Recovery, the Windows MediaAgent locates the virtual machine's blocks in the disk library. These blocks
are presented to the Windows operating system through a virtual mount driver. The VM file system is then cracked open
and the content is displayed in the console.
For Linux virtual machines, the file system cannot be mounted by the Windows MediaAgent. It requires a virtual Linux MediaAgent on which the File Recovery Enabler for Linux (FREL) component is installed.
For Service Pack 6 and earlier, a Linux VMware template containing the MediaAgent and FREL (downloadable from
Commvault cloud) needs to be deployed. Refer to the Commvault Online Documentation VMWare section.
Since Service Pack 7, simply deploy a Linux VM and install the MediaAgent code. If the system requirements are in place,
the FREL component is automatically installed with the MediaAgent software.
Enabling or disabling the Live File Recovery method is controlled by the 'Collect File Details' backup option of a subclient. If it is checked, traditional file recovery is used. If unchecked, Live File Recovery is used.
The default for a new backup or schedule is to use Live File Recovery.
If 'Collect File Details' is enabled but you still want to use Live File Recovery, an additional setting must be configured on the VSA proxy (refer to the Commvault Online Documentation for the setting name).
Performing a Live File Recovery is achieved through the usual guest files and folders recovery screens. The difference is
in the system mechanics.
Live VM Recovery
Right-click the desired subclient or backup set | Click All Tasks | Browse and Restore | Virtual Server tab
The Live Recovery feature enables virtual machines (VMs) to be recovered and powered on from a backup without waiting
for a full restore of the VM. This feature can be used to recover a VM that has failed and needs to be placed back in
production quickly, or to validate that a backup can be used in a disaster recovery scenario.
Basically, the disk library is presented to the virtualization environment and the VM is powered on directly from the disk library. While it runs, the VM is moved back into the production Datastore using a Storage vMotion operation. All these tasks are accomplished automatically by Commvault® software.
Live Sync
The Live Sync feature enables incremental replication from a backup of a virtual machine (source VM) to a synchronized
copy of the virtual machine (destination VM).
The Live Sync operation opens the destination VM and applies changes from the source VM backups since the last sync
point. It is important to understand that since it is achieved from the backups, Live Sync is not a real-time synchronization.
The Live Sync feature can initiate replication automatically after backups or on a scheduled basis (for example, daily or
once a week), without requiring any additional action from users. Using backup data for replications minimizes the impact
on the production workload by avoiding the need to read the source VM again for replication. Additionally, in cases where
corruption on the source VM is replicated to the destination VM, users can still recover a point-in-time version of the
source VM from older backups.
If no new backups have been run since the last Live Sync, the scheduled Live Sync does not run.
When using Live Sync, it is recommended to use an incremental forever strategy. Run a first full backup, which gets replicated to the destination. Then run only incremental backups, so that the smallest possible set of changes is applied to the destination. Periodically, such as once a week, run a synthetic DASH full backup to consolidate backups into a new full backup without impacting the replication. If you run a traditional full backup, the entire machine must be replicated to the destination.
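The scheduling pattern described above can be sketched as a simple decision function. The weekday chosen for the synthetic full and the dates are illustrative assumptions; real schedules are built in the CommCell console:

from datetime import date, timedelta

def backup_type_for(day, first_full_date, synthetic_full_weekday=6):
    # One initial full, daily incrementals, and a weekly synthetic DASH full (6 = Sunday).
    if day == first_full_date:
        return "FULL"
    if day.weekday() == synthetic_full_weekday:
        return "SYNTHETIC DASH FULL"   # consolidates without re-replicating the entire VM
    return "INCREMENTAL"

start = date(2020, 2, 3)               # a Monday
for offset in range(8):
    d = start + timedelta(days=offset)
    print(d, backup_type_for(d, first_full_date=start))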
Before you configure Live Sync, configure the vCenter client in the CommCell® console. If the destination uses a different
vCenter server, it must also be defined as a vCenter client. Then run the initial VM backups. The VM must be backed up
once and can then be added to a Live Sync schedule.
By default, Live Sync replicates from backups in the primary copy of a storage policy. It is possible to modify this behavior
to restore from a secondary copy. This can be useful when the VM is backed up to a disk library that is replicated to a
remote site where the replicated machine resides.
When Live Sync is configured to use an auxiliary copy or backup copy, the Live Sync operation uses the copy as the
source rather than the primary backup. If the 'After Backup Job Completes' option is selected in the schedule, Live Sync
automatically waits until the data is ready on the secondary copy before running the Live Sync job.
The Live Sync Monitor tool is used to monitor and control live sync replication. In addition to the replication status of VMs,
replication can be enabled/disabled and VM failover/failback can be initiated.
From the Live Sync Monitor, the failover of a virtual machine can be initiated either as a planned failover, for testing purposes for instance, or as an unplanned failover, such as in a disaster situation. Once a VM has been failed over, a failback operation can be executed. In a failback, the VM at the failover location is backed up and synchronized back to the primary site.
• Test Boot VM – Powers on the replicated VM. It is useful to test and ensure that the VM is usable in case of a disaster. The destination VM is not modified to avoid any conflicts with the production VM.
• Planned Failover – The planned failover is useful to test the complete failover scenario or to
conduct maintenance on the primary site. A planned failover achieves the following tasks:
1. Powers off the source VMs.
2. Performs an incremental backup of the source VMs.
3. Runs Live Sync to synchronize the destination VMs with the latest changes.
4. Disables Live Sync.
5. Powers on the destination VMs with the appropriate network connections and IP addresses.
• Unplanned Failover – The unplanned failover is used in a real disaster scenario where the primary site is unavailable. In this scenario, the unplanned failover ignores the primary site and achieves the following tasks (a sketch of both failover sequences follows this list):
1. Disables Live Sync.
2. Powers on the destination VMs with the appropriate network connections and IP addresses.
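The two sequences can be summarized side by side. This is a descriptive sketch of the ordering described above, not an automation script, and the step wording is paraphrased:

PLANNED_FAILOVER = [
    "Power off the source VMs",
    "Run an incremental backup of the source VMs",
    "Run Live Sync to apply the latest changes to the destination VMs",
    "Disable Live Sync",
    "Power on the destination VMs with the mapped networks and IP addresses",
]

UNPLANNED_FAILOVER = [
    # The primary site is assumed unreachable, so no final backup or sync is attempted.
    "Disable Live Sync",
    "Power on the destination VMs with the mapped networks and IP addresses",
]

def describe(sequence):
    # Print the orchestration order for review.
    for number, step in enumerate(sequence, start=1):
        print(f"{number}. {step}")

describe(PLANNED_FAILOVER)
describe(UNPLANNED_FAILOVER)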
Crash Consistent
Crash Consistent backups are based on point-in-time software snapshots and backup operations of a virtual machine that allow the VM to be restored to the point at which it was snapped. When the snapshot occurs, all blocks on the virtual disks are frozen for a consistent point-in-time view. The application is not aware that this process is occurring.
There are several issues when performing crash consistent snapshot and backup operations. The first issue is that if an
application is running on the virtual machine, it is not aware the snapshot is being taken. VSA communicates with the
hosting hypervisor to initiate snapshots at the VM level and there is no communication with the application. Any I/O
processes being conducted by the application will continue without any knowledge that the snap has been performed.
This can cause issues during restore operations as the application data will be restored to the exact point where the
software snapshot was conducted.
Example: a database application is conducting maintenance to defragment and reorganize data within its files. In the middle of this process, the software snapshot occurs. When the VM is restored, it is returned to the state it was in during that maintenance operation.
Another issue in this case would be larger than normal snapshots as all the block changes are cached to keep the
production virtual disk in a consistent state. This will cause a longer than normal cleanup process when the snapshot is
released and may cause storage space issues on the production volume.
Application Consistent
With Application Consistent protection, the application itself is aware that it is being snapped. This awareness allows for
the data to be protected and restored in a consistent and usable state. Application aware protection works by
communicating with the application to quiesce data or by using scripts to properly quiesce the data. Application consistent
protection is not critical for file data but is critical for application databases.
• Commvault® agents
• Application Aware VSA Backup
• Application Consistent VSA Backup
• Scripting Database Shutdowns
Commvault Agents
An agent installed in the VM communicates directly with the application running in the VM. The agent communicates with the application to properly quiesce databases, and a streaming backup of application data is then conducted. If the application data is on an RDM volume, the application agent can be used with the IntelliSnap feature to quiesce the data and snap the volume. A proxy host can be used to back up the data, avoiding load on the VM or hypervisor. Using application agents in the VM also provides database and log backup operations and a simplified restore method using the standard browse and recovery options in the CommCell® console. Commvault agents in the guest VM are recommended for mission-critical, high I/O applications.
The main advantage with this configuration is that all the features available with Commvault agents are used to protect
data on the VM. For applications, using a Commvault agent provides complete application awareness of all data
protection operations including streaming log backups, granular item-based protection, archiving and content indexing.
Application aware VSA backups insert an application plugin into the VM during a VSA backup that uses the IntelliSnap® feature. When a VM backup runs, the plugin quiesces the application using a VSS snapshot. The VSA coordinator then communicates with the hypervisor to conduct a VM snapshot. If IntelliSnap is used, a hardware snapshot is taken on the Datastore, and then the software snapshot and VSS snap are released.
[Support matrix fragment: application aware backups are listed for Nutanix AHV (streaming), Nutanix AHV (IntelliSnap), and OpenStack (streaming); some of the listed capabilities are available only with a Windows proxy and others only with a Linux proxy.]
To enable application aware VSA backups, a user account with administrative privileges for the application must be used.
This account can be entered at the instance or subclient level. When the VSA backup runs, the system detects if any
supported agents are installed in the VM and automatically installs the application plugin. After the backup completes, the
plugin remains in the VM for subsequent backup operations. Application data recovery is conducted using the agent in the
CommCell® console, providing full agent level recovery options.
When an application aware backup job runs, the following steps occur:
1. It validates that the MediaAgent software is installed on the VSA proxy server.
2. It validates that the Snap Copy is created for the storage policy.
3. It discovers if a supported application is installed in the VM.
4. It pushes the application plugin.
5. It protects the application.
Database Dumps
In many organizations, DBAs continue to rely on database dumps for their backups. Although this is not the most efficient method of protecting databases and is not truly a backup, it does result in a consistent-state dump of a production database. If the dump files are backed up, application aware restores can be conducted. This requires someone with knowledge of the application in order to restore the database to an online state.
INTELLISNAP® TECHNOLOGY
Hardware based snapshot technology provides the ability to use optimized hardware and disk appliances to snap data on
disk arrays providing quick recovery by reverting or mounting the snapshots. This protection method significantly reduces
protection and recovery times while requiring minimal additional disk storage to maintain snaps. Since minimal storage is required to hold snapshots, snapshots can be conducted frequently to provide multiple recovery points and minimize potential data loss. Snapshot technology can also be used to snap and replicate data to additional disk storage using minimal
bandwidth, providing physical data separation and a complete disaster recovery solution.
Technology is rapidly evolving, and more capabilities are being added to snap hardware with every new generation.
However, hardware-based snapshot technologies without an enterprise data protection software to manage the snaps
have several disadvantages. IntelliSnap® Technology overcomes these limitations by providing a single interface to conduct, manage, revert, and back up snapshots.
The following lists the key highlights for the IntelliSnap feature:
• Granular recovery - Snapshots can be mounted for Live Browse and indexed during backup
operations for granular recovery of objects within the snap. Whether using live browse or a restore
from a backup, the method to restore the data is consistent. Using the proper iDataAgent you can
browse the snapped data and select objects for recovery. This process is especially useful when
multiple databases or virtual machines are in the same snap and a full revert cannot be done. In
this case, just the objects required for recovery can be selected and restored.
• Clone support – Commvault software supports clone, mirror and vault capabilities for certain
hardware vendors and is adding support for additional vendors as its software continues to evolve.
• Simplified management – Multiple hardware vendors supported by the IntelliSnap feature can all
be managed through the Commvault interface. Little additional training is involved since the same
subclient and storage policy strategies used for backing up data are extended when using
snapshots. Just a few additional settings are configured to enable snapshots within the CommCell®
environment.
The IntelliSnap feature is rapidly evolving to incorporate increased capabilities as well as expanded
hardware support. Check Commvault documentation for a current list of supported features,
applications and vendors.
Copy on Write
The copy on write method uses snapshots to gather reference markers for blocks on the snapped volume. A ‘copy on write
(COW)’ cache is created which caches the original blocks when the blocks are overwritten. This requires a readwrite-write
operation to complete. When a block update of a snapped volume is required, the original block is read from the source
volume. Next the original block is written to the cache location. Once the original block has been cached, the new block is
committed to the production volume overwriting the original block. This method has the advantage of keeping production
blocks contingent in the volume which provides faster read access. The disadvantage is the read-write-write processes
increases I/O load on the disks.
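The read-write-write sequence can be illustrated with a toy copy-on-write cache. This is a conceptual sketch with made-up block values, not array firmware logic:

def cow_write(production, cow_cache, block_id, new_data):
    # Read the original block, preserve it in the COW cache the first time it
    # changes after the snap, then overwrite the production block.
    if block_id not in cow_cache:
        original = production[block_id]   # read the original block
        cow_cache[block_id] = original    # write it to the snapshot cache
    production[block_id] = new_data       # write the new block to the production volume

production = {0: "A", 1: "B", 2: "C"}     # simplified production volume
cow_cache = {}                            # represents the point-in-time snapshot
cow_write(production, cow_cache, 1, "B2")
print(production)                         # {0: 'A', 1: 'B2', 2: 'C'}
print(cow_cache)                          # {1: 'B'} - enough to reconstruct the snapshot view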
• Fast hardware snapshots result in shortened VM quiesce times and faster software snapshot
deletes. This is ideal for high transaction virtual machines.
• Live browse feature allows administrators to seamlessly mount and browse contents of virtual
machines for file and folder based recovery.
• Revert operations can be conducted in the event of DataStore corruption. For NetApp arrays,
individual virtual machine reverts can also be conducted.
• Hardware snapshots can be mounted to an ESX proxy server for streaming backup operations
eliminating the data movement load on production ESX hosts.
The IntelliSnap for VSA snap and backup process uses the following steps:
1. VSA communicates with the hypervisor to locate VMs and initiate snap operations.
2. The hypervisor quiesces virtual machines listed in the subclient contents.
3. Hypervisor initiates software snapshots.
4. The IntelliSnap feature uses MediaAgent processes to initiate a hardware snapshot of the volume.
5. Once the snapshot is complete, the VSA proxy communicates with the hypervisor to remove the
software snapshots.
6. VMs are mounted to a hypervisor proxy for backup operations.
File System
File system block level backups are used to protect large volumes where the number of objects in the volume makes it impractical to conduct traditional index-based backups, which require a scan, backup, and index phase to complete.
Exchange Database
Exchange database block level backups are used to conduct database mining for mailbox data without requiring a staging
area for the database. Since the block level backup appears to Commvault software as a snapshot, it can be mounted and
mined directly from the Content Store.
3. The primary snap copy of the storage policy manages the snap.
4. A backup copy operation is used to copy the snapshot to protected storage.
5. The VSS snapshot is released
IntelliSnap® Configuration
Array Configuration
Hardware arrays are configured from the Array Management applet which can be accessed from the Control Panel or from
the Manage Array button in the subclient. All configured arrays will be displayed in the Array Management window.
Multiple arrays can be configured, each with their specific credentials. For some arrays, a Snap Configuration tab is
available to further customize the array options.
A primary snap copy can be added to any Storage Policy by right-clicking the policy and selecting All Tasks, then Create New Snapshot Copy. The copy can be given a name, a data path location to maintain indexing data can be defined, and retention settings can be configured.
Retention can be configured to maintain a specific number of snapshots, retain by days, or retain by cycles. Note that if the days or cycles criteria is going to be used, it is critical to have a complete understanding of how the days and cycles criteria operate.
Snapshot Retention
Just like traditional protection methods, storage policies are used to manage the retention of snapshots. There are three
methods by which retention can be configured for snapshot data:
• Number of snapshots retention
• Days retention
• Cycles retention
It is important to note that although snap operations can be scheduled as full, incremental or differential, a snapshot will
always be the same. The type of backup is in fact applied to the subsequent snap backup copy job, which copies the
content of the snapshot to Commvault® storage. For instance, if an incremental job was selected, only changes since the
last snap backup copy job are sent to the Commvault library.
Days Retention
The days retention rule determines how many days of snapshots are retained. Careful planning should be done before
configuring the number of days for snap retention to ensure there is adequate disk cache space. This factor is determined
by the number of snaps performed and the incremental block change rates. Performing hourly snapshots with a high
incremental change rate and a two day retention may require more cache space than performing daily snapshots with low
change rates and a seven day retention.
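A back-of-the-envelope calculation makes the comparison above concrete. The volume size and change rates are illustrative assumptions; actual cache consumption depends on the array and on how much the changed blocks overlap between snapshots:

def cache_estimate_gb(volume_gb, change_rate_per_snap, snaps_per_day, retention_days):
    # Changed data preserved per snapshot, multiplied by the number of retained snapshots.
    return volume_gb * change_rate_per_snap * snaps_per_day * retention_days

# Hourly snapshots, 2% change per snapshot, two-day retention:
print(cache_estimate_gb(1000, 0.02, 24, 2))   # 960.0 GB
# Daily snapshots, 1% change per snapshot, seven-day retention:
print(cache_estimate_gb(1000, 0.01, 1, 7))    # 70.0 GB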
Cycles Retention
Cycles can also be used to manage snapshots. When using this option, it is important to ensure backup copies are properly running to protect all full and incremental jobs. When using cycles to define snapshot retention, the basic retention rules of cycles apply just as if a backup operation had been conducted. This means that if the cycles criteria is set to two, then a third full snapshot needs to run before the first full snap and its associated incremental or differential snaps are released from disk.
Subclient Configuration
To protect production data using IntelliSnap technology, the client must be enabled for the IntelliSnap feature, a subclient must be configured defining the content to be snapped, and the IntelliSnap feature must be enabled for that subclient.
To enable the IntelliSnap feature for the client: select the client properties, click the Advanced button and check the Enable
IntelliSnap option.
Once the IntelliSnap feature has been enabled for the client, the IntelliSnap tab is used to enable snapshot operations.
Enabling the IntelliSnap check box designates the contents of the subclient to be snapped when schedules for the
subclient are executed. The snap engine must be selected from the drop-down box. Use the Manage Array button to
configure a new array, if one has not already been configured. A specific proxy can be designated for backup copy
operations. This proxy must have the appropriate software and hardware configurations to conduct the backup copies.
Refer to Commvault's documentation for specific hardware and software requirements for the array and application data
that is being snapped.
Once IntelliSnap operations have been configured for the subclient, ensure the subclient is associated with a snap
enabled Storage Policy.
When defining content for the subclient, ensure that only data sitting on the array volume is defined, since no snapshot
can be conducted on data outside of the array.
PERFORMANCE
Performance Overview
Commvault® software is a high-performance solution for protecting all data in any environment within defined protection
windows. The software also provides many settings to improve performance. Before considering tuning Commvault
software, it is important to understand capabilities and limitations of all hardware and software deployed within an
environment.
There is no such thing as a static data center. Network infrastructures are constantly changing: new servers are added and mission-critical business systems are moving to hybrid cloud or public cloud infrastructures. Before considering Commvault tunables, it is first important to understand your environment, including the capabilities and limitations of the infrastructure; specifically, the ability to transfer large amounts of data over production or backup networks.
When making modifications to an environment, changes that may positively impact one aspect of the environment can
negatively affect another aspect. This is also true of Commvault settings. For example, enabling multiplexing when writing to tape drives can improve backup speeds. However, it may have a negative impact on restores if dissimilar data
types are multiplexed to the same tape. Another example is using Commvault deduplication and setting a high number of
data streams. Since client side deduplication is being used, there will be a low impact to the network. But if the
deduplication database needs to be sealed, the next set of backup operations may result in oversaturating the network
while re-baselining blocks in storage.
Performance Benchmarks
Benchmarks can be divided into two kinds, component and system. Component benchmarks measure the performance of
specific parts of a process, such as the network, tape or hard disk drive, while system benchmarks typically measure the
performance of the entire process end-to-end.
Establishing a benchmark focuses your performance tuning and quantifies the effects of your efforts. Building a
benchmark is made up of the following 5 steps:
For example, a backup job over a network to a tape library takes two hours to complete. You think it should take a lot less, so you spend time, effort, and money to improve your network and tape drives and to parallelize the movement of data. The job now takes 1.8 hours to complete. You gained a 10% improvement.
Looking at the job in more detail, we find that the scan phase of the job is taking 1.5 hours and the rest is the actual data movement. Switching the scan method reduces the scan phase time to 12 minutes. The job now takes 0.4 hours. You gained a 78% improvement.
Knowing what phases a job goes through and how much each phase impacts the overall performance can help you focus
your time, effort, and money on the real problems.
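The arithmetic behind this kind of analysis is simple enough to script. The phase durations below are illustrative, not a benchmark:

def improvement_pct(before_hours, after_hours):
    # Percentage improvement between two end-to-end job durations.
    return (before_hours - after_hours) / before_hours * 100.0

phases = {"scan": 1.5, "data movement": 0.3}   # illustrative phase durations in hours
print(max(phases, key=phases.get))             # 'scan' - the phase worth tuning first
print(round(improvement_pct(2.0, 1.8), 1))     # 10.0 - the hardware-only improvement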
Each hardware component is going to have a theoretical performance limit and a practical one. Attempting to get
improvement beyond these limits without changing the resources involved is a waste of time. Consider using newer vs.
older technologies, such as tape drives.
Large data movements are usually done during non-production hours for two reasons – one, they can degrade production
work, and two, production work can degrade the movement of data. You want to minimize competition for resources to get
a fair benchmark of what performance is achievable. In those cases, where competition cannot be eliminated, you must
accept the impact to performance or invest in more resources.
Periodic Test
A single measurement is not a benchmark. Tape devices have burst speeds that are not sustainable over the long run.
Networks have various degrees of bandwidth availability over a period of time. A single snapshot check of bandwidth will
not give you a realistic expectation. Do periodic testing over the actual usage of a resource to determine its average
performance. Try to level out the peaks and valleys - or at least try to identify what causes these variations.
Multiple measurements scattered over a day can also help in establishing if an unexpected external process is impacting
the environment. For example, if you have a database server that backs up slowly at night, but when you sample during the day it achieves the expected performance, you can suspect an external process is impacting the backup, such as a database administrator dumping the database and copying it to another server at the same time.
Write it down
The hardest lessons are the ones you must learn twice. Once you’ve established your acceptable and/or expected
performance levels for each resource and end-to-end, write them down and use them as the baseline for comparing future
performance.
Environment Considerations
Before modifying Commvault® software settings to improve performance, consider environmental capabilities and
limitations. Ensure the environment is optimized to the best of your team’s abilities. Commvault software can move data at
high rates of speed, but it will ultimately be limited by bottlenecks on servers and network devices.
TCP/IP
TCP/IP is the most common network transmission protocol. Factors that can degrade TCP/IP performance are:
• Latency - Packet retransmissions over distance take longer and negatively impact overall
throughput for a transmission path.
• Concurrency - TCP/IP was intended to provide multiple users with a shared transmission media. For
a single user, it is an extremely inefficient means to move data.
• Line Quality - Transmission packet sizes are negotiated between sender/receiver based on line
quality. A poor line connection can degrade a single link’s performance.
• Duplex setting - Automatic detection of connection speed and duplex setting can result in a half-
duplex connection. Full duplex is needed for best performance.
• Switches - Each switch in the data path is a potential performance degrader if not properly
configured.
• Firewalls – Firewalls are the first line of defense against hackers, malware, and viruses. There are hardware firewall appliances and software firewalls, such as operating system firewalls. Firewalls can have minor to moderate impacts on transfer performance.
SCSI/RAID
SCSI is the most common device protocol used and provides the highest direct connection speed. An individual SCSI
drive’s speed is determined by spindle speed, access time, latency, and buffer. Overall SCSI throughput is also
dependent on how many devices are on the controller and in what type of configuration. The limitation of SCSI is the
distance between devices and the number of devices per controller.
• RAID arrays extend the single addressable capacity and random access performance of a set of
disks. The fundamental difference between reading and writing under RAID is this: when you write
data in a redundant environment, you must access every place where that data is stored; when you
read the data back, you only need to read the minimum amount of data necessary to retrieve the
actual data--the redundant information does not need to be accessed on a read. Basically – writes
are slower than reads.
• RAID 0 (striping) or RAID 1 (mirror) or RAID 1+0 with narrow striping are the fastest configurations
when it comes to sequential write performance. Wider striping is better for concurrent use. A RAID 5
configured array can have poor write performance. The tradeoff in slower write performance is
redundancy should a disk fail.
Fine tuning a RAID controller for sequential read/write may be counterproductive to concurrent
read/write. If backup/archive performance is an issue, a compromise must be arranged.
iSCSI/Fibre Channel
iSCSI or Fibre Channel Protocol (FCP) is essentially serial SCSI with increased distance and device support. SCSI commands and data are assembled into packets and transmitted to devices, where the SCSI command is reassembled and executed. Both protocols are more efficient than TCP/IP, and FCP has slightly better statistics than iSCSI for moving data. Performance tuning usually comes down to setting the correct Host Bus Adapter configuration (as recommended by the vendor for sequential I/O) or resolving a hardware mismatch. Best performance is achieved when the hardware involved is from the same vendor. Given that the configuration and hardware are optimal, then for both iSCSI and FCP, performance is inhibited only by available server CPU resources.
Disk I/O
Performing I/O to disks is a slow process because disks are physical devices that require time to move the heads to the
correct position on the disk before reading or writing. This re-positioning of the head is exacerbated by having many files
or having fragmented files. You can significantly improve read performance of the source data by de-fragmenting the data
on a regular basis.
Anti-Virus
Anti-virus products protect a system by periodically scanning file systems and ensuring that every file accessed or opened by any process running on the system is a legitimate file (and not a virus). When a backup runs and protects every file on the system, this anti-virus validation can significantly decrease backup performance. The anti-virus might also access and lock Commvault files, such as log files. On all systems on which Commvault software is installed, it is recommended to add exclusions to the anti-virus software for Commvault® software folders, so that Commvault related processes do not trigger the anti-virus validation process.
Stream Management
Data Streams are used to move data from source to destination. The source can be production data or Commvault
protected data. A destination stream will always move to Commvault protected storage. Understanding the data stream
concept will allow a CommCell® environment to be optimally configured to meet protection and recovery windows.
Stream settings are configured in various places within the CommCell® console including the storage policy, MediaAgent,
subclient, and library. The system always uses the lowest setting. If a MediaAgent is configured to receive as many as 100
streams and one storage policy is writing through the MediaAgent and is configured to use 50 streams, then only 50
streams will be sent through the MediaAgent.
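Because the lowest setting always wins, the effective stream count for a job can be reasoned about as a simple minimum. This is a simplification (some settings apply per job while others are aggregate limits per MediaAgent, storage policy, or library), and the parameter names are illustrative:

def effective_streams(media_agent_max, storage_policy_device_streams,
                      subclient_data_readers, library_writers):
    # The lowest stream setting in the data path governs the job.
    return min(media_agent_max, storage_policy_device_streams,
               subclient_data_readers, library_writers)

print(effective_streams(media_agent_max=100,
                        storage_policy_device_streams=50,
                        subclient_data_readers=10,
                        library_writers=20))   # 10 - the subclient data readers cap this job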
During a data protection job, streams originate at the source file or application that is being protected. One or more read
operations is used to read the source data. The number of read operations is determined by the number of subclients and
within each subclient, the number of data readers or data streams, depending on which agent is managing the data. Once
the data is read from the source it is processed by the agent and then sent to the MediaAgent as job streams. The
MediaAgent then processes the data, arranges the data into chunks and writes the data to storage as device streams.
The data is written to storage based on the number of writers, for a disk library, or devices (tape drives) for a tape library.
• Subclients – Subclients are independent jobs, meaning each subclient will have one or more streams associated with each job.
• Multi-stream subclients – Most subclients can be multi-streamed. For subclients that do not support multiple streams, multiple subclients are used to multi-stream data protection jobs. Data readers are configured in the General tab of the subclient. Data streams are configured in the Storage Device tab for MS-SQL and Oracle subclients.
• Non-subclient based agents – Agents such as the new Exchange Mailbox agent manage streams at the object level. For Exchange, each mailbox is protected as a single stream. The default subclient data readers setting is still used as the primary stream governor for the maximum number of concurrent objects that can be protected.
• Job streams – Job streams are active network streams moving from the source (client or MediaAgent) to the destination (MediaAgent). The Job Controller shows the total number of job streams currently in use at the bottom of the window, along with the job stream 'high watermark' for the CommCell environment. Add the 'Number of Readers in Use' field in the Job Controller to view the number of streams being used for each active job.
• Device streams – Configured in the Storage Policy properties. This determines how many concurrent write operations are performed to a library and should be set equal to the number of drives or writers in the library to maximize throughput. Multiplexing is used to consolidate multiple job streams into single device streams.
• Drives – For a removable media library writing data sequentially to devices, there is one device stream per drive.
• Writers – For a disk library where random read/write operations can be performed, the number of writers should be set to allow the maximum throughput without creating bottlenecks in the network, MediaAgents, or disks.
If you are currently meeting protection windows, then there is no need to modify anything. Improving windows from six to
four hours when your window is eight hours just creates more work and a more complex environment. The following
recommendations are intended to improve performance when protection windows are NOT being met.
o Disk Library (not using Commvault deduplication) – ensure the library can handle a higher number of write operations. Increase the number of mount path writers so the total number of writers across all mount paths equals the number of device streams.
o Disk Library (with Commvault deduplication) – if not using Client Side Deduplication, enable it. Each deduplication database can manage 50 or more concurrent streams. If using Client Side Deduplication, after the initial full is complete, most data processing is done locally on each client. This means minimal bandwidth, MediaAgent, and disk resources are required for data protection operations.
• Tape Library – If tape write speeds are slow, enable multiplexing. Note: enabling multiplexing can
have a positive effect on data protection jobs but may have a negative effect on restore and
auxiliary copy performance.
• Commvault Deduplication:
o Ensure the deduplication database is on high speed disks. Use the SIDB2 utility tool to simulate database performance before implementing. Check Commvault® documentation for guidance on using this tool.
o For primary backups, use Client Side Deduplication and DASH Full backups.
o For Secondary copies use DASH Copy backups to a destination disk target enabled for
deduplication.
o Increase block size to improve performance. Note: block size is hardware dependent. Before changing the block size, ensure all NICs, HBAs, switches, routers, MediaAgent operating systems, and storage devices at your primary and alternate sites (including DR sites) support the block size setting.
General recommendations:
• Ensure all data is properly being filtered. Use the job history for the client to obtain a list of all objects being protected. View the failed items in the log to determine if files are being skipped because they are open or if they existed at the time of scan but not at the time of backup. This is common with temp files. Filters should be set to eliminate failed objects as much as possible.
• For file systems and application with granular object access (Exchange, Domino, SharePoint)
consider using data archiving. This will move older and infrequently accessed data to protected
storage which will reduce backup and recovery windows.
Database applications
• For large databases that are being dumped by application administrators, consider using Commvault database agents to provide multi-streamed backups and restores.
• When using Commvault database agents for instances with multiple databases consider creating
multiple subclients to manage databases.
• For large databases, consider increasing the number of data streams used to back up the database. Note: for multi-streamed subclient backups of SQL and Sybase databases, the streams cannot be multiplexed. During auxiliary copy operations to tape, if the streams are combined onto a tape, they must be pre-staged to a secondary disk target before they can be restored.
• For MS-SQL databases using file/folder groups, separate subclients can be configured to manage
databases and file/folder groups.
o Determine which virtual machines DO NOT require protection and do not back them up.
o Consider consolidating the number of storage policies. Each policy copy will manage its own
set of media. The more policies and policy copies you have, the more tapes you will need to
manage data for all copies.
o If most jobs on a tape have aged but a few jobs have not, the tape will not recycle. Use the
media refresh option to copy un-aged jobs to new media so the tape can recycle.
• Considerations for Disk Usage:
o Use Commvault deduplication.
• Recovery Time Objective (RTO) is the time to recover a business system after a disruption or
disaster.
• Recovery Point Objective (RPO) is the time interval in which recovery points are created. Each recovery point is created by a backup, snapshot, or replication interval. The RPO corresponds to the acceptable amount of data loss a business system can tolerate (see the sketch after this list).
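A quick check of whether a given backup or log schedule satisfies an RPO is just a comparison of intervals. The values below are illustrative:

def meets_rpo(backup_interval_hours, rpo_hours):
    # Worst-case data loss equals the interval between recovery points,
    # so the interval must not exceed the RPO.
    return backup_interval_hours <= rpo_hours

print(meets_rpo(backup_interval_hours=24, rpo_hours=4))    # False - daily backups miss a 4-hour RPO
print(meets_rpo(backup_interval_hours=0.25, rpo_hours=4))  # True - 15-minute log backups meet it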
RTO and RPO should not have a single requirement. Different types of disruptions or disasters will have an impact on time
to recover and data loss that may occur. Consider these four basic levels of disaster and how they may affect recovery
objectives:
• Disruption of a business system: This can affect a single system, such as a database or email server, where end users can still function but will not have access to the system. High availability solutions such as clustering, virtualization, data mirroring, and data replication should be considered. For critical business systems, a disruption should have very short RTO and RPO requirements defined.
• Limited site disaster: This may affect the datacenter, routers, switches, or other components, which can have a larger effect on end users' ability to perform their jobs. Consider the classic air
conditioner leak that may force power to be cut or systems shut down. Users can still have access
to facilities but their ability to access business systems may be down for longer periods of time. In
this case the RTO may be defined higher, but RPO should still be relatively low.
• Site Disaster: This will force the shutdown of the entire building. End users can work from home or
take the day off. It can be quite difficult to define accurate RTO and RPO requirements for this scenario, since the disaster may be a result of circumstances beyond your control. Consider a gas pipe
leak which forces power to be cut from the building for safety reasons. Power being restored to the
building is out of your hands. This is a strong reason to have an active DR facility. In this case the
RTO and RPO would be based on the readiness and availability of equipment at the DR facility and
the frequency in which data is sent there.
• Regional Disaster: Major regional disasters can have a large impact on a business's ability to continue. This scenario not only affects the IT department's ability to restore services but can also have an impact on the users' ability to access those services. A DR facility is a requirement for a regional disaster, and it should be located at a proper distance based on the perceived risk of the types of disaster that may occur. The bigger picture here is business continuity, as it falls more on management to ensure the continuation of the business.
Within the Commvault® software suite, there are several methods by which RTO and RPO can be improved. The following section explains some of the ways you can configure your CommCell environment to improve recovery windows.
o Consider prioritizing data for RPO requirements and define the data as a separate subclient
and assign separate schedules. For example, a critical database with frequent changes can
be configured in a separate subclient and scheduled to run transaction logs every fifteen
minutes. To provide short off-site RPO windows consider running synchronous copies with
the automatic schedule enabled.
o Consider using hardware snapshots with the Commvault IntelliSnap feature to manage and
backup snapshots.