NetBackup10 AdminGuide Cassandra
Administrator's Guide
Release 10.0
NetBackup™ for Cassandra Administrator's Guide
Last updated: 2022-02-27
Legal Notice
Copyright © 2022 Veritas Technologies LLC. All rights reserved.
Veritas, the Veritas Logo, and NetBackup are trademarks or registered trademarks of Veritas
Technologies LLC or its affiliates in the U.S. and other countries. Other names may be
trademarks of their respective owners.
This product may contain third-party software for which Veritas is required to provide attribution
to the third party (“Third-party Programs”). Some of the Third-party Programs are available
under open source or free software licenses. The License Agreement accompanying the
Software does not alter any rights or obligations you may have under those open source or
free software licenses. Refer to the Third-party Legal Notices document accompanying this
Veritas product or available at:
https://www.veritas.com/about/legal/license-agreements
The product described in this document is distributed under licenses restricting its use, copying,
distribution, and decompilation/reverse engineering. No part of this document may be
reproduced in any form by any means without prior written authorization of Veritas Technologies
LLC and its licensors, if any.
The Licensed Software and Documentation are deemed to be commercial computer software
as defined in FAR 12.212 and subject to restricted rights as defined in FAR Section 52.227-19
"Commercial Computer Software - Restricted Rights" and DFARS 227.7202, et seq.
"Commercial Computer Software and Commercial Computer Software Documentation," as
applicable, and any successor regulations, whether delivered by Veritas as on premises or
hosted services. Any use, modification, reproduction release, performance, display or disclosure
of the Licensed Software and Documentation by the U.S. Government shall be solely in
accordance with the terms of this Agreement.
http://www.veritas.com
Technical Support
Technical Support maintains support centers globally. All support services will be delivered
in accordance with your support agreement and the then-current enterprise technical support
policies. For information about our support offerings and how to contact Technical Support,
visit our website:
https://www.veritas.com/support
You can manage your Veritas account information at the following URL:
https://my.veritas.com
If you have questions regarding an existing support agreement, please email the support
agreement administration team for your region as follows:
Japan [email protected]
Documentation
Make sure that you have the current version of the documentation. Each document displays
the date of the last update on page 2. The latest documentation is available on the Veritas
website:
https://sort.veritas.com/documents
Documentation feedback
Your feedback is important to us. Suggest improvements or report errors or omissions to the
documentation. Include the document title, document version, chapter title, and section title
of the text on which you are reporting. Send feedback to:
You can also see documentation information or ask a question on the Veritas community site:
http://www.veritas.com/community/
https://sort.veritas.com/data/support/SORT_Data_Sheet.pdf
Chapter 1
Introduction
This chapter includes the following topics:
■ Protecting Cassandra
Protecting Cassandra data using NetBackup
The following table describes the purpose of different components of the Cassandra
backup and recovery solution.
Table 1-1
Components Purpose
NetBackup primary server All the jobs are executed from the NetBackup
primary server.
■ Efficient reconciliation
Data for the same keys from different nodes
is transferred to the same node among the
backup nodes.
Reconciliation happens in parallel within
each data staging server, without any
inter-node communication.
■ Record synthesis
While iterating over the records, columns
of the same key from different SSTables
are merged.
■ Semantic deduplication
Stale and duplicate records (replicas) are
identified and removed.
■ The data is backed up in parallel streams wherein the data nodes stream data
blocks simultaneously to multiple data staging servers and from there to multiple
backup hosts. The job processing is accelerated due to multiple backup hosts
and parallel streams. The data staging servers help in optimizing the data being
backed up thus achieving data deduplication.
■ The communication between the Cassandra cluster and NetBackup is enabled
using the Cassandra backup and recovery component that gets deployed on
the data staging servers and the Cassandra cluster.
■ For NetBackup communication, you need to configure a BigData policy and add
the related backup hosts.
■ You can configure a NetBackup media server, client, or primary server as a
backup host. Also, based on Cassandra data size, you can add or remove
backup hosts and data staging servers. You can scale up your environment
easily by adding more backup hosts.
■ The communication between the Cassandra cluster, data staging servers, and
backup hosts happens over SSH.
■ The NetBackup Parallel Streaming Framework enables a thin client-based,
agentless backup wherein the backup-restore operations are performed on the
backup hosts. The NetBackup thin client binary (Cassandra backup and recovery
component) is automatically pushed to the Cassandra cluster during the
Protecting Cassandra
At a high level, you need:
■ NetBackup primary server
■ NetBackup media server
■ A backup host, which can be a NetBackup primary server, a NetBackup media server, or a
NetBackup client.
Refer to the NetBackup compatibility list for the supported primary and media server
configurations. A backup host that is a NetBackup media server or a NetBackup
client for Cassandra is supported only on RHEL. NetBackup Appliance,
NetBackup Flex Appliance, and NetBackup FlexScale are supported as a NetBackup
primary server, media server, or as a client that can act as a backup host.
Follow these high-level steps to protect a Cassandra cluster:
1. Verify the prerequisites for Cassandra protection.
2. Run tpconfig on the NetBackup primary server.
3. Create the cassandra.conf file with configuration details on the primary server.
4. Add the required paths and hosts to the Allowed list.
NetBackup for Cassandra terminologies
Terminology Definition
Cassandra Backup Recovery component  The NetBackup thin client that is deployed on the data
staging servers and the Cassandra cluster to aid in backup and restore operations.
Data staging servers  NetBackup requires a set of servers for backup of the Cassandra cluster, in
addition to the NetBackup primary server and backup hosts. The number of these servers is typically
5% of the total number of servers in the Cassandra cluster. These servers deduplicate the data from
the Cassandra cluster during backup and optimize the backup process. They are also used
as staging servers for the data to be backed up and restored.
Parallel streams  The NetBackup parallel streaming framework allows data blocks from multiple nodes to
be backed up using multiple backup hosts simultaneously.
Backup host  The backup host acts as a proxy client. All the backup and the restore operations are
executed through the backup host.
You can configure media servers, clients, or a primary server as a backup host.
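As a rough sizing aid for the data staging servers entry above, the 5% guideline can be turned into a quick estimate. This is a minimal sketch, not a NetBackup tool: the function name, the choice to round up, and the minimum of one server are assumptions made for illustration.

```python
def estimate_dss_nodes(cassandra_nodes: int, percent: int = 5) -> int:
    """Estimate how many data staging servers a Cassandra cluster may need.

    Applies the "typically 5%" guideline. Rounding up and the floor of one
    server are assumptions, not documented NetBackup behavior.
    """
    if cassandra_nodes < 1:
        raise ValueError("cassandra_nodes must be at least 1")
    # Integer ceiling of cassandra_nodes * percent / 100, avoiding float rounding.
    estimate = (cassandra_nodes * percent + 99) // 100
    return max(1, estimate)


print(estimate_dss_nodes(40))  # 5% of 40 nodes
```

Treat the result as a starting point only; actual sizing depends on data volume and the hardware of the staging servers.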
3 All the servers/nodes in the Cassandra cluster must support one non-root host
user ID that NetBackup can use to connect to all the nodes using SSH.
This host user ID and its password must be specified in the tpconfig
command while configuring the Cassandra cluster.
4 Similarly, one non-root host user ID must be supported on all the nodes of the
DSS cluster. This host user ID and its password must be specified in the
tpconfig command while configuring the DSS cluster.
./tpconfig -add -application_type cassandra
-application_server <DSS cassandra cluster name>
-application_server_user_id <DSS app user id> -password <DSS app password>
-host_user_id <DSS host user> -host_password <DSS host password>
-host_RSA_key <DSS host rsa key> -requiredport 80
8 On the Backup Selections tab, enter the following parameters and their values
as shown:
■ Application_Type=cassandra
The parameter values are case-sensitive.
■ Backup_Host=IP_address or hostname
You can specify multiple backup hosts.
■ Add the keyword /ALL_KEYSPACES
Note: The file name cassandra.conf must be entirely in lower case. The
file is a JSON file; you can edit it manually at any time and save it at the same
location. Verify the JSON format with a JSON formatter or validator to avoid
formatting errors when NetBackup reads this file.
This file can have entries for multiple Cassandra clusters. All the Cassandra clusters
must be listed in this file, whether they are being backed up or used as the target of
an alternate restore.
Sample cassandra.conf file:
Configuring Cassandra Backup and Recovery solution 14
Setting up cassandra.conf file on the primary server
{
"productionCluster": {
"multi_72": {
"nodes": [
"10.221.104.71",
"10.221.104.72",
"10.221.104.73",
"10.221.104.74",
"10.221.104.77"
],
"prodClusternodekeyHashes": {
"10.221.104.71": "7b69ed1bbe095b2c5fcd34c26806793f8740ebcb24e0c7bbd9a9bbae9e848923",
"10.221.104.72": "a41dfc6a7b33f5fa02d7226e871a900666cd65beeca148a77d0aabe9ed33e7ff",
"10.221.104.73": "1a41c78e68effd51e6eaf8cde265421cb81475bf8365938be146a271f444ce35",
"10.221.104.74": "ebec0750d15ea1f0dfca993e8425d0106ef5aa0bf6e30d5bfa6a3aad84313bbd",
"10.221.104.77": "ba8f8b33a46bc88780288d87b5cb32116773a3929c2f4cf33bd324e9516c5fdb"
},
"dataCenterName": "datacenter1",
"nodeDownThresholdPercentage": 25,
"dssClusterName": "dss_multi_72"
},
"multi_82": {
"nodes": [
"10.221.104.171",
"10.221.104.172"
],
"prodClusternodekeyHashes": {
"10.221.104.171": "8a69ed1bbe095b2c5fcd34c26806793f8740ebcb24e0c7bbd9a9bbae9e848964",
"10.221.104.172": "b21dfc6a7b33f5fa02d7226e871a900666cd65beeca148a77d0aabe9ed33e7ab"
},
"dataCenterName": "datacenterwest",
"nodeDownThresholdPercentage": 20,
"dssClusterName": "dss_multi_82"
}
},
"dssCluster": {
"dss_multi_72": {
"dssClusterInfo": {
"cbrNode": "10.221.104.75",
"nodes": [
"10.221.104.75",
"10.221.104.76"
],
"dssClusternodekeyHashes": {
"10.221.104.75": "14d0288c869d7021a2c855124c4ee5367e3cb6ede8ffc4d74a883ff655ba0c57",
"10.221.104.76": "ebd134c712ba8c2f8a75ba3c2ce1baf80bbbe199ed50476e2c36f8e84adce294"
}
},
"settings": {
"jobCleanupTimeoutSec": 3600,
"dssMinRam": "90909",
"dssMinStoragePerBkupNode": "10485",
"concurrentCompactions": "8",
"sstableloaderMemsize": "4096M",
"concurrentTransfers": "2",
"scriptHome": "/tmp/.backups",
"workingDir": "/home",
"dssDist": "/tmp/cbrpack",
"cph": "1",
"optThreshold": "32",
"securityMode": "userProvided",
"verbose": "5",
"maxLogSize": "1",
"maxStreamsPerBackupHost": "10"
}
},
"dss_multi_82": {
"dssClusterInfo": {
"cbrNode": "10.221.104.175",
"nodes": [
"10.221.104.175",
"10.221.104.176"
],
"dssClusternodekeyHashes": {
"10.221.104.175": "28d0288c869d7021a2c855124c4ee5367e3cb6ede8ffc4d74a883ff655ba0c21",
"10.221.104.176": "a8d134c712ba8c2f8a75ba3c2ce1baf80bbbe199ed50476e2c36f8e84adce214"
}
},
"settings": {
"jobCleanupTimeoutSec": 28800,
"dssMinRam": "90909",
"dssMinStoragePerBkupNode": "10485",
"concurrentCompactions": "8",
"sstableloaderMemsize": "4096M",
"concurrentTransfers": "2",
"scriptHome": "/tmp/.backups",
"workingDir": "/home",
"dssDist": "/tmp/cbrpack",
"cph": "1",
"optThreshold": "32",
"securityMode": "userProvided",
"verbose": "5",
"maxLogSize": "1",
"maxStreamsPerBackupHost": "10"
}
}
}
}
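Before NetBackup reads the file, it can help to check it locally. The following is a minimal sketch of such a check, assuming the layout of the sample above. The function name is hypothetical, and the consistency rules (every listed node has a key hash, every dssClusterName refers to an entry under dssCluster) are inferred from the sample rather than taken from a documented NetBackup validation.

```python
import json
from pathlib import Path


def check_cassandra_conf(path):
    """Return a list of problems found in a cassandra.conf file (empty if none)."""
    conf_path = Path(path)
    problems = []

    # The guide requires the file name to be all lower case.
    if conf_path.name != "cassandra.conf":
        problems.append(f"file name is {conf_path.name!r}, expected 'cassandra.conf'")

    # Catch JSON formatting errors before NetBackup does.
    try:
        conf = json.loads(conf_path.read_text())
    except json.JSONDecodeError as err:
        problems.append(f"invalid JSON: {err}")
        return problems
    if not isinstance(conf, dict):
        problems.append("top-level JSON value must be an object")
        return problems

    production = conf.get("productionCluster", {})
    dss = conf.get("dssCluster", {})

    for name, cluster in production.items():
        # Assumed rule: each production node needs an entry in prodClusternodekeyHashes.
        hashes = cluster.get("prodClusternodekeyHashes", {})
        for node in cluster.get("nodes", []):
            if node not in hashes:
                problems.append(f"{name}: no key hash for node {node}")
        # Assumed rule: dssClusterName must name an entry under dssCluster.
        if cluster.get("dssClusterName") not in dss:
            problems.append(f"{name}: dssClusterName not found under dssCluster")

    for name, cluster in dss.items():
        info = cluster.get("dssClusterInfo", {})
        # Assumed rule: each DSS node needs an entry in dssClusternodekeyHashes.
        hashes = info.get("dssClusternodekeyHashes", {})
        for node in info.get("nodes", []):
            if node not in hashes:
                problems.append(f"{name}: no key hash for DSS node {node}")

    return problems
```

Calling check_cassandra_conf with the path to your file returns an empty list when none of these checks fail. It does not confirm that the hashes or settings values are correct for your environment.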
Enter the RSA key of the CBR node. To obtain the RSA key, log in to the CBR node
with the host credentials you plan to use with the data staging servers and run the
following command:
cat /etc/ssh/ssh_host_rsa_key.pub | awk '{print $2}' | base64 -d | sha256sum | awk '{print $1}'
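To make the steps of that pipeline explicit, here is a minimal Python sketch that computes the same value from the text of a public key file: it takes the second whitespace-separated field, base64-decodes it, and returns the SHA-256 hex digest. The function name is hypothetical, and the sketch is an illustration of the shell command, not a NetBackup utility.

```python
import base64
import hashlib


def rsa_key_hash(public_key_line: str) -> str:
    """Mirror the shell pipeline: field 2 -> base64 decode -> SHA-256 hex digest."""
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key type> <base64 key> [comment]'")
    key_blob = base64.b64decode(fields[1])
    return hashlib.sha256(key_blob).hexdigest()
```

For example, passing the contents of /etc/ssh/ssh_host_rsa_key.pub read on the node yields the 64-character hexadecimal string that goes into the key hash entries of cassandra.conf.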
Table 2-1
Key Description
prodClusternodekeyHashes  Lists all the nodes in the nodes key, each with the SHA-256 hash of its
public RSA key.
dssClusternodekeyHashes  Lists all the nodes in the nodes key under dssClusterInfo, each with the
SHA-256 hash of its public RSA key.
■ Restore combinations
■ Ensure that enough free space is available on all the data staging servers in the DSS
cluster to run a restore operation. Free space must be more than two times the size of
the largest object being recovered.
Note: You can query the catalogs to find the object details before running a
recovery. If enough space is not available on the DSS cluster, NetBackup fails
the recovery job.
■ Make sure that the Cassandra service is up and running on all the data staging
servers.
■ Ensure that enough free space is available on the target Cassandra cluster.
■ The target Cassandra cluster must be fully functional, with access control configured for
the appropriate users.
■ For NetBackup 10.0, backup and restore are supported through the CLI. Policy creation,
backup submission, and job monitoring are supported through the Java GUI.
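The free-space prerequisite in the list above can be checked on a data staging server before you start a recovery. The following is a minimal sketch; the function names are hypothetical, the path to check depends on your DSS configuration, and the size of the largest object must come from your own catalog query.

```python
import shutil


def has_enough_restore_space(free_bytes: int, largest_object_bytes: int) -> bool:
    """True when free space is more than twice the largest object to recover."""
    return free_bytes > 2 * largest_object_bytes


def check_staging_path(path: str, largest_object_bytes: int) -> bool:
    """Apply the rule to the file system that holds the given path."""
    free_bytes = shutil.disk_usage(path).free
    return has_enough_restore_space(free_bytes, largest_object_bytes)
```

Run the check on every data staging server in the DSS cluster, since the prerequisite applies to each of them.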
Recovery to the original Cassandra cluster, keyspace, and column family
■ To recover Cassandra data back to the original location, ensure that the original
Cassandra cluster is up and running and that all the nodes are connected.
■ Identify the images that need to be recovered.
■ Ensure that all the images of one backup operation are selected.
■ Run the bpclimagelist command on the NetBackup primary server to
get a list of images for a particular Cassandra cluster.
The output shows a list of backup images for the given Cassandra cluster.
■ Whenever you upgrade Cassandra or make any schema change, initiate a full
backup before any incremental backup job.
■ Choose the images from the bpclimagelist output such that all the images
of one full backup are selected. From the list of images to restore, use the
earliest timestamp as the start time and the latest timestamp as the end time.
Performing backups and restores of Cassandra 22
Pre-requisites for Cassandra Restore
■ To check the contents of the images you selected, run the following
command on the NetBackup primary server.
The output shows a list of backed-up files in the backup images that are
selected as per the start and end time.
■ When you can see the keyspaces and column families that you want to restore,
run the restore command on the NetBackup primary server.
■ You must specify a rename file to the bprestore command for Cassandra
restores. Create a file with the following contents as the rename file and pass
the path of this file to the bprestore command.
• Rename file:
{
"recoveryOptions" : "BIGDATA_CASSANDRA"
}
■ You must also provide the restore selections in a restore selection file. To restore
the entire cluster, specify the restore selection as follows.
■ To perform the restore operation, run the bprestore command on
the NetBackup primary server with the following options:
-S Master_Server_Name
-C <Cassandra cluster name> (Client_Name specified during Backup)
-D <Restore host name>
-s mm/dd/yyyy hh:mm
-e mm/dd/yyyy hh:mm (Date Time range)
-t 44 (For Bigdata Policy Type -t 44)
-f <Restore selection file>
-R <Rename file>
-cassandra_restore
■ Restore selections for a granular restore operation specify the keyspace and
column family.
Restore selection
{
"restoreSelections" : {
"<keyspace name>" : ["<column family name>"]
}
}
■ Provide the credentials of the target DSS cluster using the tpconfig
command.
Note: This command is the same as above with the DSS cluster names in
it.
■ If your target Cassandra cluster is different from the backup source, add the
Cassandra configuration details to the cassandra.conf file on the primary
server.
For example:
bprestore -S emidas105.vxindia.veritas.com
-C Test_Cluster72 -D emidas105.vxindia.veritas.com
-s 03/09/2021 17:17 -e 03/09/2021 17:17 -t 44
-L /input/cassandra_progress.log -f /input/cassandra_filelist_cluster
-R /input/cassandra_rename_cluster -cassandra_restore
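The preparation steps for a granular restore can be scripted. The sketch below writes the rename file and a restore selection file, derives the start and end times from the timestamps of the selected images, and assembles the bprestore arguments using only the options shown in this chapter. All function names and file names are hypothetical, the image timestamps are assumed to be already parsed into datetime values, and the sketch only builds the argument list without running the command.

```python
import json
from datetime import datetime
from pathlib import Path


def restore_window(image_times):
    """Return (start, end) as mm/dd/yyyy hh:mm: earliest and latest image time."""
    fmt = "%m/%d/%Y %H:%M"
    return min(image_times).strftime(fmt), max(image_times).strftime(fmt)


def write_restore_inputs(directory, selections):
    """Write the rename file and a granular restore selection file.

    selections maps a keyspace name to a list of column family names.
    The file names used here are illustrative only.
    """
    directory = Path(directory)
    rename_file = directory / "cassandra_rename"
    selection_file = directory / "cassandra_filelist"
    rename_file.write_text(
        json.dumps({"recoveryOptions": "BIGDATA_CASSANDRA"}, indent=2))
    selection_file.write_text(
        json.dumps({"restoreSelections": selections}, indent=2))
    return rename_file, selection_file


def build_bprestore_args(primary, cluster, restore_host, image_times,
                         selection_file, rename_file, progress_log):
    """Assemble the bprestore argument list; date and time are separate tokens."""
    start, end = restore_window(image_times)
    return [
        "bprestore", "-S", primary, "-C", cluster, "-D", restore_host,
        "-s", *start.split(), "-e", *end.split(),
        "-t", "44",  # BigData policy type
        "-L", str(progress_log),
        "-f", str(selection_file),
        "-R", str(rename_file),
        "-cassandra_restore",
    ]
```

Building a list rather than a single string keeps each option and value as its own token, which matches how the shell splits the example command and avoids quoting problems if the list is later passed to a process launcher.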
Note: Number of jobs = minimum ((backup hosts * streams per backup host),
number of DSS nodes)
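The formula in the note above can be written as a one-line function. This is a minimal sketch with a hypothetical function name; the example values assume one backup host together with the maxStreamsPerBackupHost value and the two DSS nodes from the sample cassandra.conf file.

```python
def job_count(backup_hosts: int, streams_per_backup_host: int, dss_nodes: int) -> int:
    """Number of jobs = min(backup hosts * streams per backup host, DSS nodes)."""
    return min(backup_hosts * streams_per_backup_host, dss_nodes)


# One backup host with 10 streams and 2 DSS nodes: the DSS node count is the limit.
print(job_count(backup_hosts=1, streams_per_backup_host=10, dss_nodes=2))
```

With these values the number of DSS nodes is the smaller term, so adding backup hosts alone would not increase the number of jobs.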
Restore combinations
The following are supported restore combinations.
"keyspace": {
"name": "ks_oldname",
"newName": "ks_newname"
},
"columnFamilies": [],
"strategy": {}
"keyspace": {
"name": "old_ks_name",
"newName": "new_ks_name"
},
"columnFamilies": [],
"strategy": {}
},
"columnFamilies": [
{
"name": "cf_name",
"newName": "cf_newname"
}
],
"strategy": {}
Note: Whenever NetBackup restores a keyspace, it restores with durable writes set to true. If you want to change
this attribute, you can change it in Cassandra, after the restore is complete.
Chapter 4
Troubleshooting
This chapter includes the following topics:
■ Common errors
Common errors
Table 4-2
Error Description
Error 3237: backup fails  Ensure that all of the RSA configuration and
host configuration entries in the cassandra.conf file
are correct.