
Safety First Using Clickhouse Backup For ClickHouse Backup and Restore 2023 10 25

Uploaded by

Giorgi
The Altinity

Safety Cat

Safety First
Using clickhouse-backup for
ClickHouse® Backup and
Restore
Eugene Klimov
Robert Hodges
https://altinity.com

© 2023 Altinity, Inc. 1


A brief message from our sponsor…

Robert Hodges: Database geek with 30+ years on DBMS. Kubernaut since 2018. Day job: Altinity CEO
Eugene Klimov: clickhouse-backup maintainer with 20+ years in software. Day job: Cloud Engineer

ClickHouse support and services including Altinity.Cloud

Authors of Altinity Kubernetes Operator for ClickHouse, Altinity clickhouse-backup and other open source projects
Why do we back up databases?


Backups solve a number of important problems
● Catastrophic failures that delete all data
● Accidental deletion of a database or table
● Debugging problems using production data
● Upgrade testing prior to schema or version change
● Loading schema and configuration for new installations



Welcome to ClickHouse, a real-time analytic database
● Understands SQL
● Runs on bare metal to cloud
● Shared nothing architecture
● Stores data in columns
● Parallel and vectorized execution
● Scales to many petabytes
● Is open source (Apache 2.0)

It’s the core engine for low-latency analytics

[Diagram: ClickHouse connected to event streams, ELT, object storage, dashboards, interactive graphics, and APIs]


What do we need to protect in ClickHouse?

On the ClickHouse server:
● Config files: /etc/clickhouse-server
● Certs and keys
● Schema: /var/lib/clickhouse/metadata
● Data: /var/lib/clickhouse/data
● RBAC metadata: /var/lib/clickhouse/access
● Logs

Outside the server:
● Object storage
● ZooKeeper

Common backup/restore options for ClickHouse

Tool | Description | Configs | Schema | Data | RBAC
Replication | Use ReplicatedMergeTree | 🗙 | 🗙 | ✅ | 🗙
ClickHouse Copier | Works with ZooKeeper to copy cluster data | 🗙 | 🗙 | ✅ | 🗙
Altinity clickhouse-backup project | Standalone backup utility for all ClickHouse versions | ✅ | ✅ | ✅ | ✅
ClickHouse BACKUP & RESTORE | Built-in SQL operations in ClickHouse (recent versions) | 🗙 | ✅ | ✅ | 🗙


Introducing clickhouse-backup


The clickhouse-backup utility at a glance
Language: Golang
GitHub Project: https://github.com/Altinity/clickhouse-backup
GitHub Stars: 1040
License: Apache 2.0
Original Author: Alex Akulov
Maintainer: Eugene Klimov
Distributions:
● RPM - aarch64, x86_64
● Mac OS X Tarball - amd64
● Linux Tarball - amd64, arm64
● Debian - amd64, arm64
● Docker - amd64, arm64


Step 1: Install clickhouse-backup on ClickHouse host
# Grab the latest release from GitHub.
wget https://github.com/Altinity/clickhouse-backup/releases/download/v2.4.2/clickhouse-backup-linux-amd64.tar.gz
# Unpack.
tar -xf clickhouse-backup-linux-amd64.tar.gz
# Install.
sudo install -o root -g root -m 0755 \
build/linux/amd64/clickhouse-backup /usr/local/bin
# Try it out.
/usr/local/bin/clickhouse-backup -v



Step 2: Prepare config.yml file
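# Create the config directory and generate a default config file.
sudo mkdir -p /etc/clickhouse-backup
sudo chown clickhouse /etc/clickhouse-backup
# Run the redirect inside the sudo shell so the clickhouse user writes the file.
sudo -u clickhouse sh -c \
'clickhouse-backup default-config > /etc/clickhouse-backup/config.yml'
sudo -u clickhouse vi /etc/clickhouse-backup/config.yml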

Fill in values in sections:


● general:
● clickhouse:
● s3:


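As a rough illustration of those three sections, the sketch below writes a sample config to a scratch file for review. Every value is a placeholder (the bucket name, region, and credentials are hypothetical), and the key names should be checked against the output of clickhouse-backup default-config for your version before use.

```shell
# Sketch only: write a sample config to a scratch file, not to /etc.
# Every value below is a placeholder, not a recommendation.
CONFIG_FILE="${CONFIG_FILE:-$(mktemp)}"
cat > "$CONFIG_FILE" <<'EOF'
general:
  remote_storage: s3          # which remote storage section to use
clickhouse:
  host: localhost
  port: 9000
  username: default
  password: ""
s3:
  bucket: my-backup-bucket    # hypothetical bucket name
  region: us-east-1
  access_key: REPLACE_ME
  secret_key: REPLACE_ME
  path: backup
EOF
echo "Sample config written to $CONFIG_FILE"
```

Once reviewed, the file would be copied to /etc/clickhouse-backup/config.yml and made readable by the clickhouse user.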

Step 3: Now let’s learn how to create a backup
1. clickhouse-backup create mybackup
   ClickHouse Server -> Backup in /var/lib/clickhouse/backup
2. clickhouse-backup upload mybackup
   Backup -> Remote Storage

Layout of /var/lib/clickhouse/backup/mybackup/: metadata/, metadata.json, shadow/


Step 4: And how to restore it
3. clickhouse-backup download mybackup
   Remote Storage -> Backup in /var/lib/clickhouse/backup
4. clickhouse-backup restore mybackup
   Backup -> ClickHouse Server

Layout of /var/lib/clickhouse/backup/mybackup/: metadata/, metadata.json, shadow/

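The four numbered steps can be wrapped in two small shell functions. This is a minimal sketch, not part of the tool: the function names and the DRY_RUN switch are made up for illustration, and only the create, upload, download, and restore commands come from the slides. With DRY_RUN left at its default of 1 the commands are printed rather than executed.

```shell
# Sketch: the four numbered steps as two small functions.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Steps 1 and 2: create a local backup, then upload it.
backup_and_upload() {
  run clickhouse-backup create "$1" &&
    run clickhouse-backup upload "$1"
}

# Steps 3 and 4: download the backup, then restore it.
download_and_restore() {
  run clickhouse-backup download "$1" &&
    run clickhouse-backup restore "$1"
}

backup_and_upload mybackup
```

Chaining with && stops the sequence at the first failing step, so a failed create is never followed by an upload.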

Backing up and restoring with clickhouse-backup

DEMO TIME!



Examples of backup commands
# Back up everything locally.
sudo -u clickhouse clickhouse-backup create mybackup \
--rbac --configs

# Back up a single table locally.
sudo -u clickhouse clickhouse-backup create \
mybackup_table_local -t default.ex2

# Back up and upload a database to remote backup storage.
sudo -u clickhouse clickhouse-backup create_remote \
mybackup_database_remote -t 'default.*'


Quick primer on hard links

$ echo "hello, world" > foo
$ ln foo bar
$ ls --inode foo bar
4206300 bar 4206300 foo
$ cat bar
hello, world
$ rm foo
$ ls --inode bar
4206300 bar

[Diagram: /home/mylogin/foo and /home/mylogin/bar both point to one inode holding "hello, world"]

Tip: Cross device and remote hard links are not possible. Hard links only work within a single file system.


How does the backup command work under the covers?
1. Make hard links: ALTER TABLE default.ex2 FREEZE WITH NAME 'df02...';
   ClickHouse links /var/lib/clickhouse/data/default.ex2/<part>
   into /var/lib/clickhouse/shadow/df02.../data/default.ex2/<part>
2. Save hard links to backup:
   /var/lib/clickhouse/backup/mybackup/data/default/ex2/default/<part>
3. Remove links: ALTER TABLE default.ex2 UNFREEZE WITH NAME 'df02...'

All three paths are hard links to the same underlying files.
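The freeze, save, and unfreeze sequence can be imitated with plain directories to show why no data is copied. This is a simulation under stated assumptions: no ClickHouse is involved, the scratch directory names only mimic the real layout, and part_1 and data.bin are made-up names.

```shell
# Simulation with plain directories; no ClickHouse involved.
# Directory names mimic the layout above but live under a scratch root.
root="$(mktemp -d)"
mkdir -p "$root/data/ex2/part_1" "$root/shadow/df02/ex2/part_1" \
         "$root/backup/mybackup/ex2/part_1"
echo "hello, world" > "$root/data/ex2/part_1/data.bin"

# 1. FREEZE: hard-link the part file into the shadow directory.
ln "$root/data/ex2/part_1/data.bin" "$root/shadow/df02/ex2/part_1/data.bin"
# 2. Save the hard links into the backup directory.
ln "$root/shadow/df02/ex2/part_1/data.bin" "$root/backup/mybackup/ex2/part_1/data.bin"
# 3. UNFREEZE: remove the shadow links.
rm -r "$root/shadow/df02"

# The live file and the backup file share one inode, so nothing was copied.
live_inode="$(ls -i "$root/data/ex2/part_1/data.bin" | awk '{print $1}')"
backup_inode="$(ls -i "$root/backup/mybackup/ex2/part_1/data.bin" | awk '{print $1}')"
echo "live inode: $live_inode, backup inode: $backup_inode"

# Even if the live part is later removed, the backup still holds the data.
rm "$root/data/ex2/part_1/data.bin"
cat "$root/backup/mybackup/ex2/part_1/data.bin"
```

Because every step is a link or an unlink, the cost of a local backup is independent of the size of the part files, as long as everything stays on one file system.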
Examples of restore commands
# Restore all data from already downloaded backup.
sudo -u clickhouse clickhouse-backup restore mybackup

# Restore a single table from local backup.
sudo -u clickhouse clickhouse-backup restore \
mybackup -t default.ex2

# Download and restore a single database.
sudo -u clickhouse clickhouse-backup restore_remote \
mybackup -t 'default.*'


So how does restore work?
1. Download backup and write files:
   /var/lib/clickhouse/backup/mybackup/data/default/ex2/default/<part>
2. Create the tables and other schema objects: CREATE TABLE default.ex2 ...
3. Create links to part:
   /var/lib/clickhouse/data/default/ex2/detached/<part>
4. ALTER TABLE default.ex2 ATTACH PART
   ClickHouse moves the links to /var/lib/clickhouse/data/default/ex2/<part>

The backup, detached, and attached paths are hard links to the same underlying files.

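The link and attach steps of a restore can be imitated the same way. Again this is only a simulation with scratch directories and made-up file names; in a real restore ClickHouse performs the attach, which is approximated here by a plain mv.

```shell
# Simulation of restore steps 3 and 4 with plain directories.
root="$(mktemp -d)"
mkdir -p "$root/backup/mybackup/ex2/part_1" "$root/data/ex2/detached/part_1"
echo "hello, world" > "$root/backup/mybackup/ex2/part_1/data.bin"

# 3. Create links to the part inside the table's detached directory.
ln "$root/backup/mybackup/ex2/part_1/data.bin" \
   "$root/data/ex2/detached/part_1/data.bin"

# 4. ATTACH PART: the part moves from detached/ into the table directory.
mv "$root/data/ex2/detached/part_1" "$root/data/ex2/part_1"

# The restored file and the backup file still share one inode.
restored_inode="$(ls -i "$root/data/ex2/part_1/data.bin" | awk '{print $1}')"
backup_inode="$(ls -i "$root/backup/mybackup/ex2/part_1/data.bin" | awk '{print $1}')"
echo "restored inode: $restored_inode, backup inode: $backup_inode"
```

The local backup stays intact after the attach, so the same backup can be restored again or deleted separately.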

More restore commands
# Restore everything: schema, data, users, config files.
sudo -u clickhouse clickhouse-backup \
restore mybackup --rbac --configs

# Restore only configuration files.
sudo -u clickhouse clickhouse-backup restore \
mybackup --configs-only

# Restore only RBAC metadata.
sudo -u clickhouse clickhouse-backup restore \
mybackup --rbac-only

Tip: Server restart required for these commands


Managing backups
# Listing your backups.
sudo -u clickhouse clickhouse-backup list
sudo -u clickhouse clickhouse-backup list local
sudo -u clickhouse clickhouse-backup list remote

# Deleting backups.
sudo -u clickhouse clickhouse-backup delete local mybackup
sudo -u clickhouse clickhouse-backup delete remote mybackup



Creating an incremental backup
# Create a full backup to get things started.
sudo -u clickhouse clickhouse-backup create_remote \
full_backup -t 'default.*'
sudo -u clickhouse clickhouse-backup delete \
local full_backup

# Now create an incremental backup.
sudo -u clickhouse clickhouse-backup \
create_remote --diff-from-remote=full_backup \
incremental_backup1 -t 'default.*'
sudo -u clickhouse clickhouse-backup delete \
local incremental_backup1

Only the differences with remote backup are uploaded!


Restoring from an incremental backup
# Restore test1 from the latest incremental backup.
sudo -u clickhouse clickhouse-backup \
restore_remote incremental_backup1 \
-t 'default.test1'

Command traverses all backups to find data


Advanced Topics



Managing backup storage
How to clean up orphan data in /var/lib/clickhouse/shadow:
sudo -u clickhouse clickhouse-backup clean
How to clean up a broken remote backup (missing or bad metadata.json file):
sudo -u clickhouse clickhouse-backup clean_remote_broken
How to keep backups from accumulating using automatic retention:
general:
  allow_empty_backups: false
  backups_to_keep_local: 1
  backups_to_keep_remote: 1

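To make the retention settings concrete, the sketch below works out which backups a "keep the newest N" policy would remove. It only illustrates the idea: the backup names are hypothetical and chosen so that they sort by date, and the tool's own retention logic may take more into account.

```shell
# Sketch: which backups would a "keep the newest N" policy remove?
# Backup names are hypothetical and chosen so they sort by date.
keep=1
to_delete="$(printf '%s\n' backup_2023-10-23 backup_2023-10-24 backup_2023-10-25 |
  sort -r | tail -n +"$((keep + 1))")"
echo "would delete:"
echo "$to_delete"
```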

Tips for managing remote storage
Handling remote storage in sharded clusters:
● Use macros (e.g. {shard}) in the `path` setting of the remote storage section so that
backup retention on one shard does not accidentally delete backups from other shards
How to enable parallel upload and download for object storage in config.yml:
general:
  download_concurrency: 3
  upload_concurrency: 3


More on incremental backups
● The smallest unit used to calculate an increment is the data part name
● Increments will grow if you frequently use OPTIMIZE … FINAL or ALTER
TABLE … UPDATE / DELETE
○ They create many new data parts for existing data
● The increment is calculated only in the upload stage
● The create command always creates a full backup
○ Parts that are present in the base backup are marked as required in metadata/db/table.json
● During download, required parts are downloaded from the base remote backup to
local disk
○ clickhouse-backup creates hard links in the backup_name/shadow folder to make the backup complete

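The part-name comparison described above can be sketched with two sorted lists. This is an illustration, not the tool's implementation: the part names are made-up examples in ClickHouse's usual naming style, and the second list imagines a merge of two parts plus one new insert.

```shell
# Sketch: which parts would an increment upload?
# Part names are made-up examples, not taken from a real table.
work="$(mktemp -d)"
printf '%s\n' all_1_1_0 all_2_2_0 all_3_3_0 | sort > "$work/base_parts.txt"
# After a merge of parts 2 and 3 and one new insert:
printf '%s\n' all_1_1_0 all_2_3_1 all_4_4_0 | sort > "$work/new_parts.txt"

# Parts already in the base backup: marked as required, not uploaded again.
required="$(comm -12 "$work/base_parts.txt" "$work/new_parts.txt")"
# Parts found only in the new backup: these form the increment.
increment="$(comm -13 "$work/base_parts.txt" "$work/new_parts.txt")"
echo "required from base:"
echo "$required"
echo "uploaded in increment:"
echo "$increment"
```

The merged part has a new name, so it counts as new data even though its rows were already backed up. That is why frequent merges and mutations make increments grow.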

Enabling the REST API for backups
It’s easy! clickhouse-backup can work as a daemon with a REST API:

clickhouse-backup server

Check out the api: section in config.yml. Tips:

● enable_metrics: true - /metrics endpoint in Prometheus format
● enable_pprof: true - /debug/pprof endpoints for memory heap and CPU profiling
● create_integration_tables: true - creates the system.backup_list and
system.backup_actions tables


An example REST request
$ curl http://localhost:7171/backup/list | jq
{
  "name": "my_backup",
  "created": "2023-10-25 02:48:25",
  "size": 828848,
  "location": "remote",
  "required": "",
  "desc": "tar, regular"
}

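Other API calls follow a similar URL pattern. The helper below is a hedged sketch: api_url, api_post, and DRY_RUN are made-up names, and the /backup/upload/<name> path is an assumption based on the project README that should be verified against your version. With DRY_RUN at its default of 1 nothing is sent.

```shell
# Sketch: build and (optionally) send a POST to the backup API.
# The endpoint path pattern is an assumption; check the project README.
API_BASE="${API_BASE:-http://localhost:7171}"

# Build the URL for an action on a named backup, e.g. upload or download.
api_url() {
  echo "$API_BASE/backup/$1/$2"
}

# Print the request in dry-run mode, otherwise send it with curl.
api_post() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would POST $1"
  else
    curl -s -X POST "$1"
  fi
}

api_post "$(api_url upload mybackup)"
```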

Working with REST API via SQL
Check out the api: section in config.yml.
create_integration_tables: true will create the system.backup_list and
system.backup_actions tables.

INSERT INTO system.backup_actions(command)
VALUES('create_remote backup_name'),('delete local backup_name');

SELECT * FROM system.backup_actions;

SELECT * FROM system.backup_list;


More things to learn about with clickhouse-backup
● Backup and restore on a sharded cluster
○ See Examples.md#how-to-make-backup--restore-sharded-cluster
● Converting MergeTree to ReplicatedMergeTree
○ See Examples.md#how-to-convert-mergetree-to-replicatedmergetree
● Using shell scripts (list, upload, download, delete) to integrate any remote
storage type with remote_storage: custom
○ See https://github.com/Altinity/clickhouse-backup/tree/master/test/integration/


And a final tip for health and happiness…

Test your backups before you need them!


Wrap-up



Roadmap
● Better support for incremental backups
● Backing up MergeTree on S3 object storage (now in beta)
● Add support for embedded BACKUP/RESTORE incremental backups

Current detailed backlog:

https://github.com/Altinity/clickhouse-backup/milestones


Help us to make the clickhouse-backup project better!!!

https://github.com/Altinity/clickhouse-backup
Try it out!
Tell your friends!
Log issues!
Send us pull requests!


Summary
● Backups solve problems from disaster recovery to making test copies
● clickhouse-backup is well tested and rich in features:
○ Full server backups including schema, data, RBAC metadata, and config files
○ Incremental backups
○ Many remote storage options
○ Retention
○ Server API
● clickhouse-backup uses hard linking tricks to back up and restore MergeTree
● Future releases will handle backup of S3-backed MergeTree (in beta)

Talk sample code: https://github.com/Altinity/clickhouse-sql-examples/tree/main/clickhouse-backup


The Altinity
Safety Cat

Thank you!
Eugene Klimov - Robert Hodges
https://altinity.com

Altinity.Cloud
Altinity Stable Builds
Altinity Kubernetes Operator for ClickHouse