Introducing Data Science & Big Data Analytics For Business Transformation
Copyright 2015 EMC Corporation. All Rights Reserved. Published in the USA. EMC believes the information in this publication is accurate as of
its publication date. The information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS IS. EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY
KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY
OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. The trademarks, logos,
and service marks (collectively "Trademarks") appearing in this publication are the property of EMC Corporation and other parties. Nothing
contained in this publication should be construed as granting any license or right to use any Trademark without the prior written permission of
the party that owns the Trademark.
Revision Date: December 2015
Revision Number:
MR-1WP-RPFD
RecoverPoint Fundamentals
This course covers an overview of the RecoverPoint solution, its offerings, architecture,
features, and functionality.
This module introduces the RecoverPoint solution by describing the industry and business
challenges surrounding it and the solutions offered by RecoverPoint to overcome those
challenges. It also covers some key application use cases for RecoverPoint, which include
integration of RecoverPoint with Oracle and VMware.
With all of the interest surrounding data loss and recovery, it is helpful to understand what
the EMC RecoverPoint product is and why it is a trusted solution. RecoverPoint is a backup
and recovery solution that gives both enterprise and commercial customers comprehensive
data protection, with integrated continuous data protection and continuous remote
replication to recover applications and data to any point in time.
EMC typically refers to data replication challenges as the pain points of remote replication.
They are:
The impact on an application's response time. For example, with a remote synchronous
solution, the application must wait for the acknowledgement from the
remote system before proceeding with the next dependent write.
Infrastructure and additional equipment are required to support the replication process.
These pain points or data replication challenges are addressed by the RecoverPoint solution.
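The response-time pain point can be illustrated with a rough calculation. The sketch below assumes a simplified model in which a synchronously replicated write costs the local write time plus one WAN round trip; the function name and the latency figures are hypothetical and are not taken from RecoverPoint documentation.

```python
def write_latency_ms(local_ms: float, wan_rtt_ms: float, synchronous: bool) -> float:
    """Estimated delay before the application can issue its next dependent write."""
    # Synchronous: the write completes only after the remote system acknowledges it.
    # Asynchronous: the write completes locally and is shipped to the remote site later.
    return local_ms + wan_rtt_ms if synchronous else local_ms


# Hypothetical figures: 1 ms local write, 20 ms WAN round trip.
sync_ms = write_latency_ms(local_ms=1.0, wan_rtt_ms=20.0, synchronous=True)
async_ms = write_latency_ms(local_ms=1.0, wan_rtt_ms=20.0, synchronous=False)
print(f"synchronous: {sync_ms} ms, asynchronous: {async_ms} ms")
```

In this simplified model the WAN round trip dominates the synchronous case, which is why distance matters so much for remote synchronous replication.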
The benefits of the RecoverPoint solution are:
The RecoverPoint solution provides continuous replication, where the replication is
performed synchronously or asynchronously over any distance, contributing to continuous
data protection. It also enables non-stop operation with full access to the production data
and the replicas.
The multisite support of RecoverPoint enables a centralized Disaster Recovery site
implementation and provides multiple replicas of production data to different target devices
or sites for additional data protection.
RecoverPoint makes Disaster Recovery and Operational Recovery easy for organizations
with its continuous data protection for recovery to any point in time, optimizing Recovery
Point Objective and Recovery Time Objective.
RecoverPoint reduces WAN bandwidth consumption and utilizes available bandwidth
optimally. Its built-in WAN optimization consists of compression and advanced bandwidth
reduction algorithms that reduce WAN bandwidth consumption up to 90%.
RecoverPoint protects storage array LUNs, allowing data replication of mixed array types
through VPLEX, where the target array can be different from the source array type.
It allows testing of the application data while replication continues unaffected. This allows
for development testing without interrupting production.
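To put the "up to 90%" WAN reduction figure from the list above into perspective, the following sketch computes how much data would cross the WAN for a given amount of changed data. The function name and the 500 GB figure are hypothetical; the reduction actually achieved depends on the data and is not guaranteed to reach 90%.

```python
def wan_traffic_gb(changed_gb: float, reduction_pct: int) -> float:
    """Data sent over the WAN after compression and bandwidth reduction."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    return changed_gb * (100 - reduction_pct) / 100


# Hypothetical daily change rate of 500 GB.
best_case_gb = wan_traffic_gb(changed_gb=500, reduction_pct=90)
no_reduction_gb = wan_traffic_gb(changed_gb=500, reduction_pct=0)
print(f"best case: {best_case_gb} GB, no reduction: {no_reduction_gb} GB")
```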
RecoverPoint allows you to keep your production data and its copies in any of your
locations, in a variety of configurations. It can be either local or remote.
Local copies can be used for operational recovery and remote copies can be used for
disaster recovery.
RecoverPoint for Virtual Machines provides local data protection and replication by creating
point-in-time copies. Using RecoverPoint for Virtual Machines, we can scale from tens to
thousands of virtual machines and restore individual or multiple virtual machines.
RecoverPoint for Virtual Machines key features include:
Replicate VMs utilizing VMDK and RDM devices accessed by any type of storage
connectivity supported by VMware (i.e. FC, FCoE, iSCSI, NAS, DAS, SAS and SVD)
referenced in the VMware Hardware Compatibility List.
Replicate Consistency Groups (CG) that contain VMs for application-consistent recovery.
Provide local or remote data replication and replicate over any distance, sync or async.
Integrated with VMware vCenter and deployed in VMware ESXi 5.1 U1 or VMware 5.5
with vCenter vSphere Web Client environments.
With RecoverPoint for VMs, you can protect, replicate, and recover virtual machines in
VMware environments. You can find use cases in scenarios like natural disasters, accidents,
utility outages, and technical malfunctions. Also, by using RecoverPoint for VMs, you can
recover data from daily operational mishaps like data corruptions, virus attacks, and
operational errors.
RecoverPoint for VMs data replication also helps during system upgrades and data
migrations in the case of a data center move or expansion. It provides built-in orchestration
and automation, and can be integrated with VMware vCenter.
Integration of VMware Site Recovery Manager with RecoverPoint provides the ability to
recover virtual machines between sites non-disruptively. It makes use of heterogeneous
network based replication and RecoverPoint replica images to test disaster recovery failover
of the virtual machines, without any effect on production.
Virtual machines can be brought back online rapidly with no data loss when RecoverPoint is
used with VMware vCenter Site Recovery Manager to orchestrate and streamline data
protection and failover processes. It provides on-demand failover to any point in time for
a selected subset of virtual machines or for all of them.
It is the most flexible approach to protect virtualized data by replicating VMware vStorage
VMFS to recover a single virtual machine or the entire VMware ESX server. It automates
failback with vCenter server plug-in for RecoverPoint.
MetroPoint provides a solution for three-data-center availability. The MetroPoint topology is
enabled by combining EMC VPLEX Metro (active-active multi-site infrastructure) and
RecoverPoint together to allow continuous data replication to the remote third site.
Here we see three RecoverPoint clusters, sites A, B, and C. The clusters at site A and B are
attached to both legs of the VPLEX distributed virtual volume. This ensures replication
continues in the event of a site failure. The remote copy is stored at site C.
This module covered common data replication challenges that drive business requirements
with RecoverPoint solutions, highlighted advanced functionalities, and provided information
on key application use cases, which include RecoverPoint integration with Oracle and
VMware.
This module covers RecoverPoint architecture and components, which set the stage for the
concepts, terms, and base infrastructure for the solution. This module also focuses on
a comparison of RecoverPoint options.
RecoverPoint supports replication of up to four distinct sites, all with their own copy of the
data.
The RecoverPoint splitter is EMC software that is installed on one of the shown storage
systems.
The RecoverPoint splitter is used to split the application writes so that one copy is sent to
the RecoverPoint appliance and another copy is written to the production volume.
RecoverPoint array-based write splitters are built into VNX, VMAX, and VPLEX storage
systems.
In addition, RecoverPoint replication of XtremIO arrays uses Snap-Based replication
technology instead of write-splitter.
The VNX splitter runs in each storage processor of VNX and splits (mirrors) all writes to a
VNX volume. It sends one copy to the original target and the other copy to the
RecoverPoint appliance. Both RecoverPoint and RecoverPoint with VNX support the VNX
splitter.
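The write-splitting behavior described above can be pictured with a toy model. This is only an illustrative sketch: the class and attribute names are hypothetical, the production volume is a plain dictionary, and the appliance is a simple list, none of which reflect how the actual splitter is implemented.

```python
class WriteSplitter:
    """Toy model of a splitter that mirrors every write to two destinations."""

    def __init__(self) -> None:
        self.production = {}       # block address -> data (stands in for the volume)
        self.appliance_queue = []  # writes forwarded to the (simulated) appliance

    def write(self, lba: int, data: bytes) -> None:
        # One copy goes to the original target ...
        self.production[lba] = data
        # ... and the other copy goes to the RecoverPoint appliance.
        self.appliance_queue.append((lba, data))


splitter = WriteSplitter()
splitter.write(0, b"first")
splitter.write(0, b"second")  # overwrite: the volume keeps only the latest data
print(splitter.production)
print(splitter.appliance_queue)
```

Note that the volume holds only the latest contents of a block, while the appliance side sees every write in order, which is what makes recovery to earlier points in time possible.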
VMAX is a high-end information storage system used to store and protect critical customer
information.
The VMAX splitter splits (mirrors) all writes to a VMAX volume. It sends one copy to the
original target and the other copy to the RecoverPoint appliance. The VMAX RecoverPoint
write splitter uses EMC Open Replicator to split writes to RecoverPoint. This provides a
method for copying device data from various types of arrays within a storage area network
(SAN) infrastructure to or from a VMAX storage array. It also offers an embedded,
cost-effective, and simple RecoverPoint solution for those that use the VMAX array.
VPLEX is a network based, virtualized storage product designed for data protection,
consistency, and availability.
A VPLEX splitter is built into the VPLEX engine and supports both local and remote
replication using any supported array type. It also supports virtualized environments and
clustered environments like VMware HA, Microsoft MSCS/Failover clustering, etc.
XtremIO uses snap-based replication, which utilizes the array's Snapshot feature. RecoverPoint
transfers the difference between each Snapshot to create point-in-time copies at the
target. By contrast, normal RecoverPoint asynchronous replication makes use of array-based
write splitting, a method where writes are intercepted by the splitter, batched together,
and sent to the target. RecoverPoint with XtremIO supports both homogeneous and
heterogeneous arrays. This allows the production copy (source) to be on an XtremIO
array and the copy (target) to be on either an XtremIO array or another supported array type.
When XtremIO is at the production, there is no write-splitter or extra installations required
on the array. When XtremIO is at the target, RecoverPoint uses snap based replication. The
snap based replication utilizes array based snaps for comparison and transfers the
difference between these snaps to the target.
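The idea of transferring only the difference between two snapshots can be sketched as follows. This is a simplified illustration, not the array's actual mechanism: snapshots are modeled as dictionaries of block address to data, and the function names are hypothetical.

```python
from typing import Dict, Optional


def snapshot_diff(previous: Dict[int, bytes],
                  current: Dict[int, bytes]) -> Dict[int, Optional[bytes]]:
    """Blocks that differ between two snapshots; None marks a block that was removed."""
    changes: Dict[int, Optional[bytes]] = {}
    for lba, data in current.items():
        if previous.get(lba) != data:
            changes[lba] = data
    for lba in previous.keys() - current.keys():
        changes[lba] = None
    return changes


def apply_diff(target: Dict[int, bytes], changes: Dict[int, Optional[bytes]]) -> None:
    """Bring the target copy up to date using only the transferred differences."""
    for lba, data in changes.items():
        if data is None:
            target.pop(lba, None)
        else:
            target[lba] = data


snap_1 = {0: b"A", 1: b"B"}
snap_2 = {0: b"A", 1: b"C", 2: b"D"}
delta = snapshot_diff(snap_1, snap_2)  # only blocks 1 and 2 need to be sent
target_copy = dict(snap_1)             # the target already holds the older snapshot
apply_diff(target_copy, delta)
```

Block 0 is unchanged between the two snapshots, so it never crosses the link.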
RecoverPoint for virtual machines allows for the replication of virtual machines with
virtual-machine-level granularity. This solution runs in VMware virtual environments.
One of the components of this solution is the RecoverPoint write splitter, also known as the
ESXi splitter, which is embedded in the vSphere hypervisor and vRPA. The ESXi splitter
sends Write-IOs to the production VMDK file and the RecoverPoint cluster. Splitters are
aggregated within a VMware cluster.
The ESXi splitter is a fully virtualized solution, protecting VMs with fine granularity. It
protects any application and supports all types of storage, both EMC and non EMC. As the
ESXi splitter operates from within the virtual layer, it can replicate any storage.
The RecoverPoint Appliance (RPA) is the data protection controller for RecoverPoint. RPA
nodes utilize private LAN and shared RPA volumes for communication using standard TCP
protocol. No FC/IP converters are needed to replicate across the WAN.
A set of RPAs constitutes an RPA cluster. Each RPA cluster includes between two and eight
RPAs, a number that is set during RecoverPoint system installation. The cluster size must be
the same at all clusters in an installation.
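The two sizing rules just stated (two to eight RPAs per cluster, and the same size at every cluster) can be expressed as a small check. This is a planning aid sketched for illustration only; the function name and the cluster names are hypothetical and it is not part of any RecoverPoint tool.

```python
from typing import Dict, List


def check_cluster_sizes(rpas_per_cluster: Dict[str, int]) -> List[str]:
    """Return a list of problems with a proposed layout; an empty list means it passes."""
    problems = []
    for cluster, count in rpas_per_cluster.items():
        if not 2 <= count <= 8:
            problems.append(f"{cluster}: {count} RPAs is outside the range of 2 to 8")
    if len(set(rpas_per_cluster.values())) > 1:
        problems.append("clusters have different numbers of RPAs")
    return problems


ok = check_cluster_sizes({"site_a": 4, "site_b": 4})
bad = check_cluster_sizes({"site_a": 1, "site_b": 4})
print(ok)
print(bad)
```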
In normal operation, all RPAs in a cluster are active all of the time. Consequently, if one of
the RPAs in a cluster goes down, the RecoverPoint system supports immediate switchover
of the functions to another RPA in the cluster.
RPAs perform hardware status notifications. If a hard drive or power supply on the
RecoverPoint appliance fails, a hardware event notification is raised in the system logs. If
configured, the event will also create a call-home event, sending a system request to EMC
Customer Service. Call-home events can be suppressed during maintenance periods to
avoid service request generation.
Virtual RecoverPoint Appliances (vRPAs) are virtual machines that run the RecoverPoint
software. They access the repository, journal, production, and copy volumes via the iSCSI
protocol, and therefore do not require any FC infrastructure. vRPA is a great way to get
many of the benefits of RecoverPoint without the need for physical appliances or a SAN
infrastructure.
One important consideration is that vRPAs are only available for use with the VNX storage
array. Another important consideration is that since RecoverPoint supports synchronous
replication over IP, if the WAN is sufficiently robust, vRPAs can be used for remote
synchronous replication. vRPAs can replicate any block data, regardless of how the hosts
are connected to the VNX.
A special volume must be dedicated on the SAN-attached storage for each RPA cluster. This
volume is called the Repository Volume. It stores the configuration information about the
RPAs, the cluster, and consistency groups. This enables a properly functioning RPA to
seamlessly assume the replication activities of a failing RPA from the same RPA cluster. This
volume is presented to each RPA, either via the SAN or using iSCSI for virtual RPAs.
Journal volumes hold snapshots of data to be replicated. Each journal volume holds as
many point-in-time images as its capacity allows, after which the oldest image is removed
to make space for the newest.
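The retention behavior described above, where the oldest image is dropped to make room for the newest, resembles a size-bounded queue. The sketch below is a simplified model under that assumption; the class name, the byte-based accounting, and the sizes used are hypothetical and do not describe the real journal format.

```python
from collections import deque


class Journal:
    """Toy journal that keeps as many point-in-time images as its capacity allows."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.images = deque()  # (timestamp, size_bytes), oldest first
        self.used_bytes = 0

    def add_image(self, timestamp: int, size_bytes: int) -> None:
        if size_bytes > self.capacity_bytes:
            raise ValueError("image is larger than the whole journal")
        # Remove the oldest images until the new one fits.
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, old_size = self.images.popleft()
            self.used_bytes -= old_size
        self.images.append((timestamp, size_bytes))
        self.used_bytes += size_bytes


journal = Journal(capacity_bytes=100)
for ts in (1, 2, 3):
    journal.add_image(timestamp=ts, size_bytes=40)
print(list(journal.images))  # the image with timestamp 1 no longer fits
```

A larger journal therefore translates directly into a longer recoverable history.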
Journals consist of one or more volumes, presented to all RPAs in the cluster. Space can be
added to allow for a longer history to be stored, without affecting replication. The size is
determined by analyzing the environment and adjusted later.
The size of a journal volume is based on several factors:
The amount of time between point-in-time images (could be as small as each write)
Journal volumes are required on local and remote sites. Each copy of data in a consistency
group must contain one or more volumes that are dedicated to hold point-in-time history of
the data. The type and amount of information contained in the journal differs according to
the journal type. The maximum size of a journal volume should be 10 TB (per copy for a
consistency group).
Copy journals
Production journals
RecoverPoint copies are the volumes of a consistency group that are either a source or a
target of replication at a given RecoverPoint Appliance (RPA) cluster.
The following types of copies exist in a consistency group:
Production copy - consists of all volumes that are the sources of replication for a
consistency group, as well as the production journal volumes. A maximum of one
production copy and up to four non-production copies can be configured per consistency
group.
Local copies - consists of all volumes that are the targets of replication for a specific
consistency group. There can be only one production copy and one local copy.
Remote copies - consists of all the volumes that are the targets of replication for a
specific consistency group. If a local copy exists, there can be up to three remote copies;
otherwise, there can be up to four remote copies.
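The copy limits listed above can be condensed into a short validation function. This is a sketch for illustration; the function name is hypothetical, and it encodes only the limits stated in this section (one production copy is implied, at most one local copy, at most four non-production copies in total).

```python
def copies_allowed(local_copies: int, remote_copies: int) -> bool:
    """Check a proposed copy layout for one consistency group."""
    if local_copies < 0 or remote_copies < 0:
        return False
    if local_copies > 1:  # only one local copy is permitted
        return False
    # At most four non-production copies: three remote with a local copy, four without.
    return local_copies + remote_copies <= 4


print(copies_allowed(local_copies=1, remote_copies=3))
print(copies_allowed(local_copies=1, remote_copies=4))
print(copies_allowed(local_copies=0, remote_copies=4))
```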
Consistency groups are composed of one or more replication sets. Each replication set
consists of a production volume and any local or remote copy volume to which it is
replicating. The number of replication sets in the system is equal to the number of
production volumes being replicated.
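The relationship between consistency groups, replication sets, and volumes can be modeled with two small data classes. The class, field, and volume names below are hypothetical and serve only to show that there is one replication set per production volume.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReplicationSet:
    production_volume: str
    copy_volumes: List[str] = field(default_factory=list)  # local and/or remote targets


@dataclass
class ConsistencyGroup:
    name: str
    replication_sets: List[ReplicationSet] = field(default_factory=list)

    def production_volumes(self) -> List[str]:
        return [rs.production_volume for rs in self.replication_sets]


group = ConsistencyGroup(
    name="orders_db",
    replication_sets=[
        ReplicationSet("prod_lun_1", ["local_lun_1", "remote_lun_1"]),
        ReplicationSet("prod_lun_2", ["local_lun_2", "remote_lun_2"]),
    ],
)
print(len(group.replication_sets), group.production_volumes())
```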
RecoverPoint 4.1.2 SP2 simplifies RecoverPoint licensing. Licensing details for the three
RecoverPoint products are shown in the table.
RecoverPoint/SE licenses are based per array and can only be purchased via VNX
software suites.
RecoverPoint/EX licenses are based per array for the VNX and per registered capacity for
the VMAX, VMAX10K, and VPLEX. Registered capacity means the amount of data on the
array that is being protected.
RecoverPoint/CL licenses are based on the replicated capacity, which simply means the
amount of data that is being replicated. RP/CL licenses are not tied to any arrays. This
makes the replication environment very flexible, since adding, changing, and refreshing
arrays does not require any change to the RecoverPoint license.
This module covered the architecture of the RecoverPoint system and its components. This
module also provided the overview of key concepts and terminologies related to
RecoverPoint and the comparison between various RecoverPoint licensing models.
This module covers the capabilities of RecoverPoint and the demonstration of its most
commonly used features.
A key feature of RecoverPoint is the ability to test point-in-time images of the production
data. The Test a Copy feature ensures the copies (images) in the journal can be used to
restore data, recover from disaster, or seamlessly take over production. This feature also
offloads task overhead from production.
The Test a Copy wizard available in Unisphere for RecoverPoint is used to verify that the
copy is reliable and consistent with the production storage image.
The Test a Copy and Recover Production wizard available in Unisphere for RecoverPoint
guides the process of selecting a copy location, accessing a copy image, verifying the
image, and using a copy journal to roll the production storage back to a previous
point in time to correct file or logical corruption.
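Choosing which image to roll back to amounts to finding the newest point-in-time image taken before the corruption occurred. The sketch below shows that selection under the assumption that image timestamps are available as a list sorted in ascending order; the function name and timestamps are hypothetical, and in practice the wizard performs this selection interactively.

```python
import bisect
from typing import List, Optional


def latest_image_before(image_times: List[float],
                        corruption_time: float) -> Optional[float]:
    """Newest image strictly older than the corruption, or None if there is none."""
    # bisect_left returns the index of the first image at or after corruption_time,
    # so the image just before that index is the last clean one.
    idx = bisect.bisect_left(image_times, corruption_time)
    return image_times[idx - 1] if idx > 0 else None


image_times = [10.0, 20.0, 30.0]  # ascending timestamps of images in the journal
restore_point = latest_image_before(image_times, corruption_time=25.0)
print(restore_point)
```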
Failing over a consistency group to a local copy or a remote copy allows system operations
to continue as usual from the copy. Hosts attached to the copy continue operations by
running applications. The same failover procedure can be used for planned maintenance at
the production site while the copy site takes over normal operations. When the production
storage has been restored or the planned maintenance has been completed, system
operations can be resumed at the original production source by failing over again.
The Test a Copy and Fail Over wizard available in Unisphere for RecoverPoint guides the
process of selecting a copy image, testing it, and failing over to the copy or failing back to the
production from the selected image.
RecoverPoint integration with SRM enables users to test or failover SRM protection groups
to any point-in-time. The process is simple. Apply the point-in-time copy to use with SRM,
run the SRM test or failover procedure, and the test or failover will occur using the
specified point in time.
SRDF and RecoverPoint are supported with the same source device. The figure
demonstrates SRDF R1 devices are replicated locally with RecoverPoint local replication at
the same time that the R1 is replicated via SRDF to the R2 device. The Any Point-in-Time
feature of RecoverPoint enables protection against local corruption, while SRDF provides the
disaster recovery solution. If SRDF is already configured, you can add the RecoverPoint
local protection. If RecoverPoint local protection is already configured, you can add SRDF
for remote protection.
Typical usage for this solution would be a customer that currently has a database
application set up with synchronous SRDF for compliance reasons. SRDF can protect against
site failure, but not against local corruption of the database. By adding local replication to
the R1 device, the customer gets DVR-like recovery with minimal or zero RPO in case of
data corruption. This solution also provides a better RPO and RTO when compared with local
replication solutions such as TimeFinder Snap and Clones with backup of database logs.
Virtual provisioning is the ability to present an application with more capacity than is
physically allocated to it in the storage array. The physical storage is then allocated to the
application on-demand from a shared pool of capacity. RecoverPoint replicates only the
allocated space that is in use by the application. For example, the application may believe it
has 8 GB allocated but may have only 2 GB in use; in that case, RecoverPoint replicates only the 2 GB.
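The 8 GB versus 2 GB example can be modeled with a toy thin volume in which physical blocks are allocated only on first write. The class name and the one-block-per-gigabyte simplification are hypothetical choices made for illustration.

```python
class ThinVolume:
    """Toy thin-provisioned volume: capacity is presented, blocks are allocated on write."""

    def __init__(self, presented_blocks: int) -> None:
        self.presented_blocks = presented_blocks  # what the application believes it has
        self.allocated = {}                       # block index -> data, filled on demand

    def write(self, index: int, data: bytes) -> None:
        if not 0 <= index < self.presented_blocks:
            raise IndexError("write beyond presented capacity")
        self.allocated[index] = data

    def blocks_to_replicate(self) -> dict:
        # Only blocks that are actually in use need to be replicated.
        return dict(self.allocated)


volume = ThinVolume(presented_blocks=8)  # think of each block as 1 GB
volume.write(0, b"data")
volume.write(5, b"more")
print(volume.presented_blocks, len(volume.blocks_to_replicate()))
```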
This module covered the capabilities of RecoverPoint and demonstrated its most commonly
used features.
Unisphere for RecoverPoint provides a single point of management for the entire
RecoverPoint system. Users familiar with any one of the Unisphere interfaces will find it
easy and intuitive to navigate through Unisphere for RecoverPoint. This significantly reduces
the learning curve for new users. Unisphere for RecoverPoint can run on any supported web
browser from any system that has TCP/IP connectivity to the RecoverPoint appliances.
Check the RecoverPoint release notes for a list of the supported browsers.
Unisphere for RecoverPoint is a web based client that replaces the RecoverPoint
Management Application available with previous versions of RecoverPoint. Users familiar
with any one of the EMC Common User Interface Toolkit (ECUIT) based interfaces will find it
easy and intuitive to navigate through the Unisphere for RecoverPoint GUI. When logged in,
users are immediately presented with a topology of their RecoverPoint system, as well as
health and traffic information. All protection and recovery operations are wizard based with
clearly defined, intuitive steps.
RecoverPoint offers a Command Line Interface (CLI) for use with both replication and
system maintenance tasks. This interface allows multiple users to access the RecoverPoint
cluster at the same time. Replication tasks such as failover commands can be scripted using
this interface.
The Maintenance User capability allows for checking system health and for log collection.
Please refer to the RecoverPoint CLI Reference Guide for more detailed CLI use information.
The VMware vSphere Web Client is used for management and monitoring of VMs.
RecoverPoint for VMs has a plugin for the web client which is automatically installed on the
vCenter server during the RecoverPoint for VMs cluster deployment.
For new RecoverPoint installations, once the License Authorization Code (LAC) is received,
simply log in to support.emc.com, fill out the necessary information, and download the
license file. Everything is self-service and does not require a service request. All RP
licensing records are available online, which makes it quick and easy to verify licensing
purchases and prove compliance during company audits.
Please note that upgrades from previous RecoverPoint licenses to RecoverPoint 4.1.2 SP2
require a service request. After the initial upgrade, all further upgrades and changes can be
done using the eLicensing service.
RecoverPoint also includes another management interface called Deployment Manager that
is used to install, maintain, or upgrade RecoverPoint. This tool can be used for both physical
and virtual RPAs. Deployment Manager can also be used to install and maintain earlier
versions of RecoverPoint.
This module covered RecoverPoint Management options, RecoverPoint for VMs management
and licensing.
This course covered an overview of the RecoverPoint solution, its offerings, architecture,
features, and functionality.
This concludes the course.