10. Storage - Part 2
Architecture - Infrastructure Building Blocks and Concepts
Storage – Part 2 (chapter 10)
Object storage
• Object storage stores and retrieves data using REST API calls over HTTP,
is served by a web server, and is designed to be highly scalable
• All large public cloud providers offer object storage services
AWS has S3
Azure has Blob Storage as part of storage accounts
GCP has Cloud Storage
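The REST semantics of an object store can be sketched with a small in-memory stand-in; the class and method names below are invented for illustration, and real services add authentication, metadata, and versioning on top of the same HTTP verbs:

```python
# Minimal in-memory stand-in for an object store's REST semantics.
# Real services (S3, Azure Blob Storage, GCP Cloud Storage) layer
# authentication, metadata, and versioning on top of the same verbs.
class ObjectStore:
    def __init__(self):
        self._buckets = {}

    def put(self, bucket, key, data):       # HTTP PUT /bucket/key
        self._buckets.setdefault(bucket, {})[key] = bytes(data)
        return 200

    def get(self, bucket, key):             # HTTP GET /bucket/key
        try:
            return 200, self._buckets[bucket][key]
        except KeyError:
            return 404, None

    def delete(self, bucket, key):          # HTTP DELETE /bucket/key
        self._buckets.get(bucket, {}).pop(key, None)
        return 204

store = ObjectStore()
store.put("backups", "db/2024-01-01.dump", b"\x00\x01")
status, data = store.get("backups", "db/2024-01-01.dump")
```

Objects are addressed by bucket and key rather than by file path or block address, which is what makes the model easy to scale horizontally.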
• Software-defined storage (SDS) virtualizes all physical storage into one large shared storage pool
Data can be stored in a variety of storage systems while being presented and managed
as one storage pool to the servers consuming the storage
• From the shared storage pool, software provides data services like:
Deduplication
Compression
Caching
Snapshotting
Cloning
Replication
Tiering
Software Defined Storage
• Example:
A newly deployed database server can invoke an SDS policy that mounts storage
configured to have its data striped across a number of disks, creates a daily snapshot,
and has data stored on tier 1 disks
• APIs can be used to provision storage pools and set the availability,
security and performance levels of the virtualized storage
• Using APIs, storage consumers can monitor and manage their own storage
consumption
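As a sketch of such an API call, the following builds a JSON provisioning request matching the database-server example above (striping, daily snapshots, tier 1 disks); all field names are hypothetical, since every SDS product defines its own API schema:

```python
import json

# Hypothetical SDS policy request for the database-server example:
# a striped volume with a daily snapshot on tier 1 disks.
# All field names are invented; real SDS APIs define their own schema.
def build_storage_policy(name, size_gb, stripe_width, snapshot_schedule, tier):
    return json.dumps({
        "policy": name,
        "capacity_gb": size_gb,
        "layout": {"type": "stripe", "width": stripe_width},
        "snapshots": {"schedule": snapshot_schedule},
        "tier": tier,
    })

request_body = build_storage_policy("db-server", 500, 8, "daily", 1)
```

A request body like this would typically be POSTed to the SDS management endpoint, letting storage consumers provision and monitor their own capacity.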
Storage availability
Redundancy and data replication
• To increase availability in a SAN, components like HBAs and switches can be installed
redundantly
• Using multiple paths between HBAs and SAN switches, failover can be initiated
automatically when a failure occurs
• Multiple storage systems can be used. Using replication, changed disk blocks from the
primary storage system are continuously sent to the secondary storage system, where
they are stored as well
Redundancy and data replication
• Synchronous replication:
Each write to the active storage system and the replication to the passive storage
system must be completed before the write is confirmed to the operating system
Ensures data on both storage systems is synchronized at all times and data is never lost
When the physical cable length between the two storage systems is more than 100 km,
latency gets too high, slowing down applications that have to wait for the write
on the secondary storage system to finish
Risk: if the connection between the two storage systems fails, writes can never be
completed, as the data cannot be replicated. This effectively leads to downtime of
the primary storage system
Redundancy and data replication
• Asynchronous replication:
After data has been written to the primary storage system, the write is immediately
confirmed to the operating system, without having to wait for the secondary storage
system to finish its write as well
Asynchronous replication does not have the latency impact that synchronous
replication has
Disadvantage: potential data loss when the primary storage system fails before the data
has been written to the secondary storage system
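The difference between the two write paths can be sketched as follows; the in-memory lists stand in for the two storage systems, and the simulated failure shows the asynchronous data-loss window:

```python
# Sketch of synchronous vs. asynchronous replication write paths.
# Two in-memory lists stand in for the primary and secondary arrays.
primary, secondary = [], []

def write_sync(block):
    primary.append(block)
    secondary.append(block)      # must complete before we confirm
    return "ack"                 # confirmed only after BOTH writes

def write_async(block, replicate=True):
    primary.append(block)
    ack = "ack"                  # confirmed right after the primary write
    if replicate:                # replication happens later, in background;
        secondary.append(block)  # a primary failure here loses the block
    return ack

write_sync("A")
write_async("B", replicate=False)   # simulate failure before replication
```

After the simulated failure, block "B" exists only on the primary system, which is exactly the data-loss risk of asynchronous replication.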
Backup and recovery
• Backups are copies of data, used to restore data to a previous state in case
of data loss, data corruption or a disaster recovery situation
• Backups are always a last resort, only used if everything else fails, to save
your organization in case of a disaster
• A well-designed system should have options to repair incorrect data from
within the system or by using systems management tools (like database
tools)
Backup and recovery
• Backups should not be used to view the status of information from the
past
It should be possible to retrieve these statuses from the system itself
No data should ever be deleted in a typical production system
Older data could be archived to a secondary system or database
Backup and recovery
• 3-2-1 rule:
Keep three copies of your data
on two different media types
with one copy stored at a separate location
Backup and recovery
• Test the restore procedure at least once a year to ensure restores work as
planned
Include rebuilding systems on new hardware
Have restore procedures tested by a third party, or at least by people that have not
performed a restore before
In case of a real disaster, we cannot assume that the usual systems managers are
available to restore the data
• Full backup
A complete copy of all data
Full backups are only created at relatively large intervals (like a week or a month)
Creating them takes much time, disk or tape space, and bandwidth
Restoring a full backup takes the least amount of time
• Incremental backup
Save only newly created or changed data since the last backup, regardless of whether
that was a full or an incremental backup
Restoring an incremental backup can take a long time
Especially when the last full backup is many incremental backups ago
• Differential backup
Save only newly created or changed data since the last full backup
Restoring a differential backup is quite efficient, as it implies restoring a full backup and
only the most recent differential backup
Backup schemes
• Backup data retention time is the amount of time in which a given set of
data will remain available for restore
• Defines how long backups are kept and at which interval
• In practice, a Grandfather-Father-Son (GFS) based schedule is often used:
Each day a backup is made
After a week, there are seven backups, of which the oldest backup is renamed to a
weekly backup
After the second week, the same is done and the daily backups of the week before are
deleted
Now there are eight backups: six daily and two weekly
Every four weeks, the weekly backup is renamed as a monthly backup and the weekly
backups are reused
The daily backups are the son, the weekly backups are the father, and the monthly
backups are the grandfather
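A toy simulation of a GFS-style rotation, assuming retention of 7 daily, 4 weekly, and 12 monthly backups; the exact promotion rules vary per organization:

```python
# Toy Grandfather-Father-Son rotation: after each daily backup run,
# which earlier backups are still retained? A backup made on day d is
# kept as a daily (son) while young, then survives only if it was
# promoted to a weekly (father) or monthly (grandfather) backup.
def gfs_retained(day, dailies=7, weeklies=4, monthlies=12):
    retained = set()
    for d in range(1, day + 1):
        age = day - d
        if age < dailies:
            retained.add(("son", d))            # still a daily backup
        elif d % 7 == 0 and age < weeklies * 7:
            retained.add(("father", d))         # promoted to weekly
        elif d % 28 == 0 and age < monthlies * 28:
            retained.add(("grandfather", d))    # promoted to monthly
    return retained

len(gfs_retained(14))   # backups retained after two weeks
```

After two weeks the simulation retains eight backups, matching the count in the schedule above.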
Archiving
• Data must be kept in such a way that it is guaranteed to remain readable
after a long time, taking into account the:
Digital format (like a Microsoft Word file or a JPG file)
Physical format (like a DVD or a magnetic tape)
Storage environment (temperature, humidity)
• Transfer data that is to be kept for a long time to the latest storage media
standard every 10 years
Storage performance
Disk performance
• Seek time is the time it takes for the head to get to the right track
Average seek times:
3 ms for high-end disks
9 ms for low-end disks
IOPS
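A disk's maximum random IOPS can be roughly estimated from its average seek time plus its average rotational latency (the time for half a revolution):

```python
# Rough random-IOPS estimate for a single disk: each I/O costs the
# average seek time plus half a rotation (average rotational latency).
def disk_iops(seek_ms, rpm):
    rotational_latency_ms = 0.5 * 60_000 / rpm   # half a revolution, in ms
    return 1000 / (seek_ms + rotational_latency_ms)

disk_iops(3, 15000)   # high-end disk: 3 ms seek + 2 ms latency = 200 IOPS
disk_iops(9, 7200)    # low-end disk: roughly 76 IOPS
```

The example figures use the seek times given above with typical spindle speeds (15,000 and 7,200 RPM) as assumptions.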
• In RAID sets, multiple disks are used to form one virtual disk (LUN)
• Writing data to multiple disks introduces some delay, known as the RAID
penalty
• Penalties for various RAID configurations are:
RAID 0: no penalty
RAID 1: penalty of 2
RAID 10: penalty of 2
RAID 5: penalty of 4
RAID 6: penalty of 6
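These penalties reduce the effective write IOPS of a RAID set; a common back-of-the-envelope formula assumes reads are unaffected while each logical write costs `penalty` backend I/Os:

```python
# Effective (functional) IOPS of a RAID set: reads pass through
# unchanged, writes cost 'penalty' backend I/Os each
# (e.g. RAID 5: 2 reads + 2 writes per logical write = penalty 4).
def functional_iops(raw_iops, write_fraction, penalty):
    read_fraction = 1 - write_fraction
    return raw_iops * read_fraction + raw_iops * write_fraction / penalty

# Example (assumed workload): 8 disks of 200 IOPS each in RAID 5,
# with 30% writes -> roughly 1240 effective IOPS.
functional_iops(8 * 200, 0.30, 4)
```

The 8-disk, 200-IOPS, 30%-write workload is an assumption chosen for illustration; the formula itself is the standard RAID write-penalty calculation.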
Interface throughput
• The type and amount of cache needed depends on what applications need
A web server, for instance, will mostly benefit from read cache, whereas most
databases are better off with write cache
Storage tiering
• The more tiers are used, the more effort it takes to manage the tiers
• Automated tiering usually checks for file access times, file creation date, and
file ownership, and automatically moves data to the storage medium that fits
best
• Storage tiering is especially important when storing data in the public cloud,
as it has different storage costs for each tier
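A minimal sketch of an automated tiering rule based on last access time; the thresholds and tier names are invented for illustration:

```python
# Toy automated-tiering rule: place data on a tier according to the
# number of days since it was last accessed. Thresholds and tier
# names are assumptions chosen for illustration.
def pick_tier(last_access_days):
    if last_access_days < 7:
        return "tier1-ssd"       # hot data: fast, expensive storage
    if last_access_days < 90:
        return "tier2-sas"       # warm data: mid-range disks
    return "tier3-archive"       # cold data: cheap archival storage

pick_tier(2), pick_tier(30), pick_tier(400)
```

In a public cloud the same idea is exposed as lifecycle rules that move objects between storage classes with different per-GB prices.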
Load optimization