
Data Replication: Cameron Worrell

1) Data replication copies data from a production site to a remote disaster recovery site either continuously or at set intervals to provide a complete copy of data for recovery purposes in the event of an outage or disruption. 2) There are two primary types of replication - host to host, where replication software runs on source and target hosts, and disk to disk, where replication runs on external storage arrays. 3) Key considerations for designing a replication solution include recovery objectives, environment specifications, the amount and change rate of data to be replicated, and connectivity between sites. Understanding these factors is essential for properly sizing and selecting an appropriate replication technology and configuration.
Copyright
© Attribution Non-Commercial (BY-NC)

Introduction
Businesses depend more than ever on information and the applications that provide it. Specific back-office applications have become so critical that an outage of even a few hours can jeopardize a company's livelihood. Line-of-business applications warrant additional measures to ensure their availability remains consistent, regardless of disruptive events. This requirement is often addressed through the deployment of data replication technologies. Data replication is a process that copies data to a remote location either continuously or at defined intervals. This provides a complete copy of production data at a remote location for Disaster Recovery (DR) purposes. This remote location is typically a secondary data center or DR provider site. This white paper discusses several strategies and technologies that are used for data replication, as well as some of the considerations that should be made prior to implementing them.

Types of Replication
The two primary types of replication this paper focuses on are Host to Host and Disk to Disk. The key difference between these types is where the replication process runs. Each of these replication types has advantages and disadvantages, which are discussed in this section. The type that is deployed depends on business requirements for recovery and the source environment to be replicated. Replication technologies copy data to a remote location selectively rather than in bulk. After a complete data set has been replicated to the target, only changed data is replicated, which helps to reduce bandwidth requirements. The initial copy of data to the remote storage is often referred to as seeding. After the data has been seeded, replication can function in synchronous or asynchronous mode.
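The idea of seeding followed by changed-data replication can be illustrated with a minimal sketch. It assumes fixed-size blocks compared by SHA-256 digest; actual products track changes in their own ways (for example, by intercepting writes), so the block size and function names here are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical tiny block size so the example is easy to follow


def block_digests(data: bytes) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def changed_blocks(source: bytes, target_digests: list[str]) -> dict[int, bytes]:
    """Return {block index: block bytes} for blocks that differ from the target."""
    changes = {}
    for index, digest in enumerate(block_digests(source)):
        if index >= len(target_digests) or digest != target_digests[index]:
            start = index * BLOCK_SIZE
            changes[index] = source[start:start + BLOCK_SIZE]
    return changes


# Seeding: the first pass sends every block because the target is empty.
source = b"AAAABBBBCCCC"
seed = changed_blocks(source, [])

# Later pass: only the modified middle block needs to cross the wire.
target_digests = block_digests(source)
source = b"AAAAXXXXCCCC"
delta = changed_blocks(source, target_digests)
```

In this sketch the seeding pass transfers all three blocks, while the later pass transfers one, which is where the bandwidth saving comes from.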

Synchronous Replication Mode


In synchronous mode, disk writes are replicated to the target disk within the same transaction as writes to the source disk. All disk writes must occur and be acknowledged on both the source and target disks before a host can move on to the next disk write. Due to this behavior, application performance requirements must be carefully considered when deploying synchronous replication. The distance between the source and target sites typically plays a role in determining whether synchronous replication can be implemented, because every write must wait for the round trip to the target. On the positive side, synchronous replication continuously provides a real-time copy of replicated data, allowing for a complete recovery at all times.
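The effect of distance on synchronous writes can be estimated with a rough sketch. It assumes a signal travels through optical fiber at about 200,000 km/s (roughly 5 microseconds per kilometer one way) and ignores equipment and protocol overhead, so real latencies will be higher. The function names and the local write latency figure are illustrative assumptions, not values from this paper.

```python
FIBER_KM_PER_MS = 200.0  # assumed signal speed in fiber: ~200,000 km/s


def round_trip_ms(distance_km: float) -> float:
    """Minimum network round trip added to each synchronous write."""
    return 2 * distance_km / FIBER_KM_PER_MS


def max_serial_writes_per_sec(distance_km: float, local_write_ms: float) -> float:
    """Upper bound on back-to-back writes when each waits for the remote ack."""
    return 1000.0 / (local_write_ms + round_trip_ms(distance_km))


# Compare a metro link, a regional link and a long-haul link.
for km in (10, 100, 1000):
    print(km, "km:", round_trip_ms(km), "ms round trip,",
          round(max_serial_writes_per_sec(km, local_write_ms=0.5)), "writes/sec")
```

Even under these optimistic assumptions, the ceiling on serialized writes drops quickly as the sites move apart, which is why distance is usually evaluated early when synchronous mode is under consideration.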

Asynchronous Replication Mode


In asynchronous mode, data is replicated to the target storage without requiring an acknowledgment before additional writes can occur. This improves performance but introduces more risk because the remote copy is not always current while the data is being transmitted across the wire to the remote location. Asynchronous replication can also be implemented as point-in-time snapshots, where the data is completely copied to the target storage at predetermined intervals.
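The difference between the two modes can be shown with a toy model. This is only a sketch: the class and method names are hypothetical, the "link" is simulated by an explicit drain call, and no real replication product is being modeled.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of a source volume replicating writes to a target volume."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.source = []
        self.target = []
        self.in_flight = deque()  # writes acknowledged but not yet on the target

    def write(self, record: str) -> None:
        self.source.append(record)
        if self.synchronous:
            # Synchronous: the write is not complete until the target has it.
            self.target.append(record)
        else:
            # Asynchronous: acknowledge at once and ship the write later.
            self.in_flight.append(record)

    def drain(self, count: int) -> None:
        """Deliver up to `count` queued writes to the target."""
        for _ in range(min(count, len(self.in_flight))):
            self.target.append(self.in_flight.popleft())

    def unreplicated(self) -> int:
        """Writes that would be lost if the source failed right now."""
        return len(self.source) - len(self.target)


sync_vol = ReplicatedVolume(synchronous=True)
async_vol = ReplicatedVolume(synchronous=False)
for record in ("w1", "w2", "w3"):
    sync_vol.write(record)
    async_vol.write(record)
async_vol.drain(1)  # the simulated link has delivered only one write so far
```

At this point the synchronous volume has nothing outstanding, while the asynchronous volume still has two writes in flight. Those in-flight writes represent the data that would be lost if the source site failed at that moment.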

Host to Host Replication


Host to Host Replication is also referred to as processor-based replication. In this type, the replication process runs on the source and target systems. This is accomplished through an agent that runs on each system to track data changes and replicate them to the remote host over an IP connection. Because the agent shares the host with other workloads, the replication process might introduce contention with other applications running on the source system. This type of replication can be performed at the OS level or at the application level. Host to Host is the most commonly implemented replication because it is a software-only solution. The diagram below shows Host to Host Replication at a high level.
Diagram: Host to Host Replication

[Diagram: in Production, the Source Server runs a Replication Process against the Source Data; Replicated Data travels over a TCP/IP Connection to Recovery, where the Target Server runs a Replication Process that writes the Target Data.]

Host to Host Replication utilizes resources on the source and target servers, which can impact performance. It also requires that a remote system is up and running at all times for replication to occur. A significant benefit of Host to Host Replication is that it is storage agnostic: it can be deployed regardless of the type of storage the systems are utilizing, such as internal, external, SAN, NAS, and so on.

Disk to Disk Replication


A second type of replication is Disk to Disk, where the replication process runs on an external storage device such as a SAN or NAS. This type of replication is normally implemented on disk arrays from vendors such as EMC, Hitachi, IBM, HP, and so on. Each vendor provides a software application that is proprietary to its storage arrays. Although the replication software is unique to each vendor, all of these products are designed to copy only the necessary data to a secondary storage device. Because most disk arrays utilize Fibre Channel connections, a storage router is required to extend the connection over a WAN link. The next diagram shows Disk to Disk replication at a high level.
Diagram: Disk to Disk Replication

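[Diagram: in Production, the Source Server is attached to a Storage Array that runs the Replication Process against the Source Data; Replicated Data passes through a Storage Router, across a WAN Circuit, to a Storage Router in Recovery, where the Storage Array runs the Replication Process and holds the Target Data for the Target Server.]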

Disk to Disk replication utilizes resources on the external storage hardware and is transparent to the host. Because the replication process is run on the external storage, a dedicated target host is not required. An emerging trend in the replication market utilizes third-party hardware to perform the replication. This entails the addition of an appliance into the SAN architecture that has mirrored access to the storage links. This appliance replicates write activity to the remote appliance and storage array. This approach can be utilized to replicate dissimilar storage arrays or reduce cost by allowing for the implementation of lower-cost storage at the target site. The following table lists some of the advantages and disadvantages of the replication types and modes.

Replication Technologies: Advantages and Disadvantages

Host to Host
  Advantages: Software-only solution; Storage agnostic
  Disadvantages: Uses system resources (CPU, memory, etc.); Needs to be deployed to each system; Dedicated target systems required

Disk to Disk
  Advantages: Provides single access point for replicated data; Transparent to systems being replicated; Better performance can be achieved due to HW optimization; Dedicated target systems may not be required
  Disadvantages: Additional HW and SW investments often required; Software often proprietary to disk vendor; May be incompatible between different storage and vendors

Synchronous
  Advantages: Can provide real-time data copy at remote location
  Disadvantages: Performance can be adversely impacted; Higher bandwidth requirements; May not be cost-effective

Asynchronous
  Advantages: Can provide near real-time data copy at remote location; More cost-effective
  Disadvantages: Some data loss may occur

Design Considerations
When replication has been selected as the recovery approach, several key metrics must be understood to size the replication solution. These are as follows:
1. Recovery objectives
2. Environment specifications
3. Amount of data to be replicated
4. Change rate of data to be replicated
A formal analysis should be conducted upfront to assess these metrics and other appropriate objectives. The following sections discuss each of these areas in more detail.

Recovery Objectives
The target objectives for recovery should come from the business owners in the form of a Business Impact Analysis (BIA) or equivalent data source. The BIA helps to quantify risk levels such as acceptable downtime parameters, as well as financial impact for business functions. The findings also include two metrics: Recovery Point Objective (RPO) and Recovery Time Objective (RTO). RPO is the maximum amount of lost data that a business can sustain because of an outage. RPO is measured in units of time. RTO is the maximum allowable time between an outage and the resumption of business. RPO and RTO play a key role in the design process.
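As a small illustration of how the two metrics are measured, the sketch below compares the timeline of an incident or test exercise against an RPO and an RTO. The function name, the timestamps and the objective values are hypothetical and chosen only to make the arithmetic visible.

```python
from datetime import datetime, timedelta


def meets_objectives(last_replicated: datetime, outage: datetime,
                     resumed: datetime, rpo: timedelta, rto: timedelta) -> dict:
    """Compare an incident (real or a test exercise) against RPO and RTO."""
    data_loss_window = outage - last_replicated  # data written but not replicated
    downtime = resumed - outage                  # outage until business resumes
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": downtime <= rto,
    }


# Hypothetical exercise: last replica at 09:40, outage at 10:00, back at 13:30.
result = meets_objectives(
    last_replicated=datetime(2024, 1, 1, 9, 40),
    outage=datetime(2024, 1, 1, 10, 0),
    resumed=datetime(2024, 1, 1, 13, 30),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=4),
)
```

Here the data loss window is 20 minutes and the downtime is 3.5 hours, so both of the assumed objectives would be met.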

Environment Specifications
Whether the existing storage environment is being utilized or new equipment is being purchased, the specifications of the hardware and software will be a large factor in what replication capabilities exist. For Host to Host replication, platform and OS play a large role in determining which software is appropriate. For Disk to Disk replication, the make and model of the array impacts what type of replication solution can be deployed. Based on the business's recovery objectives, upgrades or migration might be necessary for the production storage environment. Multiple copies of the data are recommended to protect against data corruption that propagates to the target storage. This should be accounted for upfront when storage capacity is allocated at the target location. Always consider keeping a gold copy, a known-good copy of the data that can be used for recovery if the replicated copy is unusable. Another environmental consideration is connectivity from a storage and network perspective. Data replication technologies might require only a basic IP connection with sufficient bandwidth or might require complex storage extension components. Bandwidth requirements can range from a T-1 (1.544 Mbps) to an OC-48 (approximately 2.5 Gbps). Determining the replication bandwidth requirement is critical to a successful design and is discussed further in the following section.

Data Amount and Change Rate


The quantity and change rate of the data impact both the sizing of the source/target storage environments and the bandwidth and connectivity between them. The amount should be determined based on the total critical data to be replicated. This should not include less critical data that can be restored via tape or re-created. After the amount of data is determined, the change rate of the data must be identified. The data change rate should be captured as both a daily total and a peak. One method to get a rough estimate of daily change rate is by referencing daily backup log files. If incremental or differential backups are being performed daily, a comparison against a full backup can provide the daily change rate. With the RPO, data amount and data change rate, the bandwidth requirement is a simple calculation. After the data has been seeded at the target site, only changes are replicated. An example of this calculation is provided here.

Sample Bandwidth Calculation
Metrics
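RPO: 1 hour
Amount of Data: 1,000 Gigabytes (1 TB)
Daily Change Rate: 0.05% (500 MB)

Calculated Requirements
Daily replication volume: 1,000 GB * 0.0005 = 0.5 GB = 500 Megabytes = 4,000 Megabits = 4,096,000 Kilobits
Max potential of replicated data in one RPO interval: 500 MB (4,096,000 Kilobits) in 1 hour, assuming the worst case in which the entire day's changes occur within a single hour
Bandwidth Required: 4,096,000 Kilobits / 3,600 sec = approximately 1,138 Kilobits/sec, which fits within a DS-1 (1.544 Mbps) circuit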
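The same arithmetic can be written as a short function so that other change volumes and RPO values can be tried. This is a sketch that follows the sample's unit convention (1 megabit counted as 1,024 kilobits); circuits are normally rated in decimal units, so the result is an approximation. The function name and the DS-1 constant are assumptions introduced for illustration.

```python
def required_kbps(change_mb: float, rpo_seconds: float) -> float:
    """Bandwidth needed to replicate change_mb megabytes within the RPO window.

    Follows the sample's convention: 1 MB = 8 megabits, 1 megabit = 1024 kilobits.
    """
    kilobits = change_mb * 8 * 1024
    return kilobits / rpo_seconds


DS1_KBPS = 1544  # nominal DS-1 (T-1) line rate in kbit/s

# Worst case from the sample: all 500 MB of daily change lands in one RPO hour.
needed = required_kbps(change_mb=500, rpo_seconds=3600)
print(f"{needed:.0f} kbit/s needed; fits in DS-1: {needed <= DS1_KBPS}")
```

Halving the RPO to 30 minutes doubles the requirement under the same worst-case assumption, which would no longer fit on a single DS-1.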

The preceding formula is useful for projecting estimated requirements but assumes a linear change rate. In reality, data write activity can vary greatly depending on the application and many other factors. Although this method provides a good indicator, it is recommended that a separate analysis is performed to obtain the change rate at a more granular level over several days and during peak utilization. This can provide an accurate utilization trend and expose any spikes in disk writes that need to be accounted for in the capacity planning phase of the design process. Many tools are available to capture these data points. Storage vendors typically offer a service to perform the data collection and analysis. The result of the assessment is utilized to determine hardware and network requirements for replication. The following sample chart shows change rate graphically over a period of time.

Sample Data Change Rate Analysis Chart
[Chart: Disk Write Activity in Megabytes / Sec over time, plotted against the Bandwidth Available for Replication; a Disk Write Spike rises above the available bandwidth.]

In reviewing the chart, it is apparent that the change rate exceeds the available bandwidth capacity for a period of time. Unless the bandwidth is increased, this translates to an increase in RPO until activity slows down and the replication process can catch up. Obtaining this data upfront allows for proper capacity planning from a bandwidth and hardware perspective. Data compression can significantly reduce bandwidth requirements, but the amount of compression that can be achieved depends on the type of data. For planning purposes, a compression ratio of 2:1 or less should be assumed if a formal test is not possible.
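The catch-up behavior described above can be explored with a simple simulation. This is a sketch under stated assumptions: write activity is sampled once per second, the link drains a fixed amount per second, and compression is modeled as a constant ratio. The trace values and the function name are hypothetical.

```python
def replication_backlog(write_mb_per_sec, link_mb_per_sec, compression_ratio=1.0):
    """Return the unreplicated backlog (MB) after each one-second sample.

    write_mb_per_sec: disk write rate per second, e.g. from a collected trace.
    link_mb_per_sec: bandwidth available for replication.
    compression_ratio: 2.0 means data shrinks to half its size on the wire.
    """
    backlog = 0.0
    history = []
    for written in write_mb_per_sec:
        backlog += written / compression_ratio         # new data to send
        backlog = max(0.0, backlog - link_mb_per_sec)  # what the link drains
        history.append(backlog)
    return history


# Hypothetical trace: steady writes with a three-second spike.
trace = [1, 1, 6, 6, 6, 1, 1, 1, 1, 1]
uncompressed = replication_backlog(trace, link_mb_per_sec=2)
compressed = replication_backlog(trace, link_mb_per_sec=2, compression_ratio=2.0)
```

In this trace the uncompressed backlog peaks at 12 MB and has not cleared by the end of the sample, while the 2:1 case peaks at 3 MB and clears within two seconds of the spike ending. Dividing the backlog by the link rate gives a rough sense of how far the replica lags behind production.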

Recovery and Testing Considerations


When replication is configured and functional, it must be maintained and tested. The replicated environment is an extension of production and must be considered in day-to-day operations, change management, upgrades and so on. Recovery of replicated environments still requires that appropriate plans and procedures are in place. Steps are typically required to prepare the remote data copy and applications for use. If there are interdependencies between applications, the data must be consistent to a point in time. Bringing systems up to a consistent point in time may involve transaction re-creation or reprocessing. Network addressing and routing changes are also required to bring the systems online and make them available. When the production environment is operating from the replicas, normal backups should resume. Also, when the event or exercise has been completed, the replication and recovery process needs to be reversed to resume production at the home site.

Summary
Data replication technologies provide an effective method for recovering mission-critical data with little or no data loss. Replication can be deployed at the host or disk level and can operate in synchronous or asynchronous modes. Careful analysis is recommended upfront when replication is being explored by your business. The business's recovery requirements (RPO/RTO) must be understood because they profoundly impact the design and implementation of the replication environment. Several key metrics must be quantified upfront to ensure the replication design meets recovery requirements. These metrics should be derived from a formal analysis conducted internally or by a third party. Replicated environments must be integrated into production and tested regularly. If implemented properly, data replication technologies can provide seamless recovery of information and systems if a full-scale disaster or a simple unplanned outage occurs.
