Data Replication: Cameron Worrell
Introduction
Businesses depend more than ever on information and the applications that provide it. Specific back-office applications have become so critical that an outage of even a few hours can jeopardize a company's livelihood. Line-of-business applications warrant additional measures to ensure that their availability remains consistent, regardless of disruptive events. This requirement is often addressed through the deployment of data replication technologies. Data replication is a process that copies data to a remote location either continuously or at defined intervals. This provides a complete copy of production data at a remote location for Disaster Recovery (DR) purposes. This remote location is typically a secondary data center or DR provider site. This white paper discusses several strategies and technologies that are used for data replication, as well as some of the considerations that should be made prior to implementing them.
Types of Replication
The two primary types of replication this paper focuses on are Host to Host and Disk to Disk. The key difference between these types is where the replication process runs: on the servers themselves or on the external storage hardware. Each of these replication types has advantages and disadvantages, which are discussed in this section. The type that is deployed depends on business requirements for recovery and the source environment to be replicated. Replication technologies leverage sophisticated functions to intelligently copy data to a remote location. After a complete data set has been replicated to the target, only changed data is replicated, which helps to reduce bandwidth requirements. The initial copy of data to the remote storage is often referred to as seeding. After the data has been seeded, replication can function in synchronous or asynchronous modes.
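The seed-then-replicate-changes pattern described above can be illustrated with a minimal sketch. It assumes a fixed-size volume divided into fixed-size blocks and detects changes by comparing SHA-256 digests; the block size and all function names are hypothetical, and real replication products typically track writes as they occur rather than rescanning the data.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size in bytes


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split a volume image into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Digest of every block, used here to describe the target's state."""
    return [hashlib.sha256(b).digest() for b in split_blocks(data, block_size)]


def changed_blocks(source: bytes, target_digests: list,
                   block_size: int = BLOCK_SIZE) -> dict:
    """Return {block_index: block_bytes} for blocks that differ from the target."""
    changes = {}
    for index, block in enumerate(split_blocks(source, block_size)):
        digest = hashlib.sha256(block).digest()
        if index >= len(target_digests) or target_digests[index] != digest:
            changes[index] = block
    return changes


def apply_changes(target: bytearray, changes: dict,
                  block_size: int = BLOCK_SIZE) -> None:
    """Write the changed blocks into the target copy in place."""
    for index, block in changes.items():
        start = index * block_size
        target[start:start + len(block)] = block


# Seeding: the first pass copies the complete data set.
source = bytearray(b"a" * BLOCK_SIZE * 4)
target = bytearray(source)

# Later, one block changes at the source; only that block is sent.
source[BLOCK_SIZE * 2] = ord("b")
delta = changed_blocks(bytes(source), block_digests(bytes(target)))
apply_changes(target, delta)
```

After seeding, the amount of data sent per cycle is proportional to the number of changed blocks rather than to the size of the volume, which is why the change rate, not the total data amount, drives ongoing bandwidth requirements.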
Figure: Host to Host replication (Production: source server, replication process, source data; TCP/IP connection carrying replicated data; Recovery: target server, replication process, target data)
Host to Host replication utilizes resources on the source and target servers, which can impact performance. It also requires that a remote system be up and running at all times for replication to occur. A significant benefit of Host to Host replication is that it is storage agnostic; it can be deployed regardless of the type of storage the systems are utilizing, such as internal, external, SAN, NAS, and so on.
Figure: Disk to Disk replication (Production: source server, storage array, storage router, source data; WAN circuit carrying replicated data; Recovery: target server, storage array, storage router, replication process, target data)
Disk to Disk replication utilizes resources on the external storage hardware and is transparent to the host. Because the replication process is run on the external storage, a dedicated target host is not required. An emerging trend in the replication market utilizes third-party hardware to perform the replication. This entails the addition of an appliance into the SAN architecture that has mirrored access to the storage links. This appliance replicates write activity to the remote appliance and storage array. This approach can be utilized to replicate dissimilar storage arrays or reduce cost by allowing for the implementation of lower-cost storage at the target site. The following table lists some of the advantages and disadvantages of the replication types and modes.
Replication Technologies Advantages and Disadvantages
Host to Host
Advantages: Software-only solution; storage agnostic
Disadvantages: Uses system resources (CPU, memory, etc.); needs to be deployed to each system; dedicated target systems required
Disk to Disk
Advantages: Provides single access point for replicated data; transparent to systems being replicated; better performance can be achieved due to hardware optimization; dedicated target systems may not be required
Disadvantages: Additional hardware and software investments often required; software often proprietary to disk vendor; may be incompatible between different storage platforms and vendors
Synchronous
Advantages: Can provide real-time data copy at remote location
Disadvantages: Performance can be adversely impacted; higher bandwidth requirements; may not be cost-effective
Asynchronous
Advantages: Can provide near real-time data copy at remote location; more cost-effective
Disadvantages: Some data loss may occur
Design Considerations
When replication has been determined to be the appropriate solution, several key metrics must be understood to size it. These are as follows:
1. Recovery objectives
2. Environment specifications
3. Amount of data to be replicated
4. Change rate of data to be replicated
A formal analysis should be conducted upfront to assess these metrics and other appropriate objectives. The following sections discuss each of these areas in more detail.
Recovery Objectives
The target objectives for recovery should come from the business owners in the form of a Business Impact Analysis (BIA) or equivalent data source. The BIA helps to quantify risk levels such as acceptable downtime parameters, as well as financial impact for business functions. The findings also include two metrics: Recovery Point Objective (RPO) and Recovery Time Objective (RTO). RPO is the maximum amount of lost data that a business can sustain because of an outage. RPO is measured in units of time. RTO is the maximum allowable time between an outage and the resumption of business. RPO and RTO play a key role in the design process.
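The two metrics can be made concrete with a short sketch that checks a single outage against them. The class and function names are hypothetical and the timestamps are illustrative; the only assumptions taken from the text are that RPO bounds the window of lost data (measured in time) and that RTO bounds the time between the outage and the resumption of business.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    rpo: timedelta  # maximum tolerable window of lost data
    rto: timedelta  # maximum tolerable time from outage to resumption


def check_recovery(objectives: RecoveryObjectives,
                   last_replicated_at: datetime,
                   outage_at: datetime,
                   service_restored_at: datetime) -> tuple:
    """Return (rpo_met, rto_met) for one outage."""
    # Data written after the last successful replication is lost.
    data_loss_window = outage_at - last_replicated_at
    # Downtime runs from the outage until business resumes.
    downtime = service_restored_at - outage_at
    return (data_loss_window <= objectives.rpo, downtime <= objectives.rto)


objectives = RecoveryObjectives(rpo=timedelta(hours=1), rto=timedelta(hours=4))
outage = datetime(2024, 1, 1, 12, 0)
rpo_met, rto_met = check_recovery(
    objectives,
    last_replicated_at=outage - timedelta(minutes=30),
    outage_at=outage,
    service_restored_at=outage + timedelta(hours=5),
)
```

In this illustrative case the 30-minute replication gap is within the 1-hour RPO, but the 5-hour restoration exceeds the 4-hour RTO, so the two objectives have to be designed for and tested separately.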
Environment Specifications
Whether the existing storage environment is being utilized or new equipment is being purchased, the specifications of the hardware and software will be a large factor in what replication capabilities exist. For Host to Host replication, platform and OS play a large role in determining which software is appropriate. For Disk to Disk replication, the make and model of the array impacts what type of replication solution can be deployed. Based on the business's recovery objectives, upgrades or migration might be necessary for the production storage environment. Multiple copies of the data are recommended to prevent data corruption from propagating to the target storage. This should be accounted for upfront when storage capacity is allocated at the target location. Always consider maintaining a gold copy, a known-good copy of the data that can be used for recovery.
Another environmental consideration is connectivity from a storage and network perspective. Data replication technologies might require only a basic IP connection with sufficient bandwidth or might require complex storage extension components. Bandwidth requirements can range from a T-1 (1.5 Mbps) to an OC-48. Determining the replication bandwidth requirement is critical to a successful design and is discussed further in the following section.
After the amount of data to be replicated is determined, the change rate of the data must be identified. The data change rate should be captured both as a daily total and as a peak. One method of getting a rough estimate of the daily change rate is to reference daily backup log files. If incremental or differential backups are being performed daily, a comparison against a full backup can provide the daily change rate. With the RPO, data amount, and data change rate, the bandwidth requirement is a simple calculation. After the data has been seeded at the target site, only changes are replicated. An example of this calculation is provided here.
Sample Bandwidth Calculation
Metrics
RPO: 1 hour
Amount of data: 1000 gigabytes (1 TB)
Daily change rate: 5% (50 GB)
Calculated Requirements
Daily replication volume: 1000 GB * 0.05 = 50 gigabytes, or 400,000 megabits (decimal units)
Maximum potential replicated data: 50 GB, or 400,000 megabits, in 1 hour (worst case, in which the full daily change falls within a single RPO window)
Bandwidth required: 400,000 Mbit / 3600 sec = approximately 111 megabits/sec, which exceeds a DS-3 (about 45 Mbps) and fits within an OC-3 (about 155 Mbps) circuit
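The same style of estimate can be expressed as a small sketch. The function names and the input values below are hypothetical and chosen only for illustration; the sketch assumes decimal units (1 GB = 10^9 bytes) and the worst case used in the sample calculation, in which the whole daily change must be transferable within one RPO window.

```python
def daily_change_bytes(total_bytes: float, change_fraction: float) -> float:
    """Volume of data that changes per day, given the total data amount."""
    return total_bytes * change_fraction


def change_fraction_from_backups(incremental_bytes: float,
                                 full_bytes: float) -> float:
    """Rough daily change rate: size of a daily incremental vs. a full backup."""
    return incremental_bytes / full_bytes


def required_bits_per_second(change_bytes: float, window_seconds: float) -> float:
    """Bandwidth needed to move change_bytes within window_seconds."""
    return change_bytes * 8 / window_seconds


GB = 10**9  # decimal gigabyte (assumption)

# Illustrative inputs only: 200 GB of data, 2% daily change, 4-hour RPO.
change = daily_change_bytes(200 * GB, 0.02)
bps = required_bits_per_second(change, window_seconds=4 * 3600)
print(f"Required bandwidth: {bps / 10**6:.2f} Mbit/s")
```

Keeping the change volume and the window as separate parameters makes it easy to rerun the estimate with a measured peak-hour change volume instead of the daily total.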
The preceding formula is useful for projecting estimated requirements but assumes a linear change rate. In reality, data write activity can vary greatly depending on the application and many other factors. Although this method provides a good indicator, it is recommended that a separate analysis be performed to obtain the change rate at a more granular level over several days and during peak utilization. This can provide an accurate utilization trend and expose any spikes in disk writes that need to be accounted for in the capacity planning phase of the design process. Many tools are available to capture these data points. Storage vendors typically offer a service to perform the data collection and analysis. The result of the assessment is utilized to determine hardware and network requirements for replication. The following sample chart shows change rate graphically over a period of time.
Sample Data Change Rate Analysis Chart (vertical axis: megabytes/sec)
In reviewing the chart, it is apparent that the change rate exceeds the available bandwidth capacity for a period of time. Unless the bandwidth is increased, this translates to an increase in RPO until activity slows down and the replication process can catch up. Obtaining this data upfront allows for proper capacity planning from a bandwidth and hardware perspective. Data compression can significantly reduce bandwidth requirements, but the amount of compression that can be achieved depends on the type of data. For planning purposes, a compression ratio of 2:1 or less should be assumed if a formal test is not possible.
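The catch-up behavior described above can be sketched as a simple backlog model. The function name, the per-interval sample values, and the link capacity are hypothetical; the model assumes that data changed in an interval is compressed by a fixed ratio and that anything the link cannot carry in that interval waits in a queue.

```python
def replication_backlog(change_mb_per_interval: list,
                        link_mb_per_interval: float,
                        compression_ratio: float = 1.0) -> list:
    """Return the data (MB on the wire) still awaiting replication
    at the end of each sampling interval."""
    backlog = 0.0
    history = []
    for changed_mb in change_mb_per_interval:
        # Newly changed data joins the queue after compression.
        backlog += changed_mb / compression_ratio
        # The link drains at most its capacity per interval.
        backlog = max(0.0, backlog - link_mb_per_interval)
        history.append(backlog)
    return history


# Illustrative samples: a burst of writes in the third and fourth intervals.
samples = [2, 2, 10, 10, 2, 2]
uncompressed = replication_backlog(samples, link_mb_per_interval=5)
compressed = replication_backlog(samples, link_mb_per_interval=5,
                                 compression_ratio=2.0)
```

Dividing a backlog value by the link capacity gives a rough estimate of how many intervals the target lags behind the source, which is the effective increase in RPO during the spike. In this illustrative run the uncompressed backlog builds during the burst and has not drained by the last sample, while the assumed 2:1 compression keeps the queue empty throughout.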
Summary
Data replication technologies provide an effective method for recovering mission-critical data with little or no data loss. Replication can be deployed at the host or disk level and can operate in synchronous or asynchronous modes. Careful analysis is recommended upfront when replication is being explored by your business. The business's recovery requirements (RPO/RTO) must be understood because they profoundly impact the design and implementation of the replication environment. Several key metrics must be quantified upfront to ensure that the replication design meets recovery requirements. These metrics should be derived from a formal analysis conducted internally or by a third party. Replicated environments must be integrated into production and tested regularly. If implemented properly, data replication technologies can provide seamless recovery of information and systems if a full-scale disaster or a simple unplanned outage occurs.