Disaster Recovery - Best Practices White Paper
Introduction
Performance Indicators for Disaster Recovery
High-Level Process Flow for Disaster Recovery
Management Awareness
Identify Possible Disaster Scenarios
Build Management Awareness
Obtain Management Sign-Off and Funding
Disaster Recovery Planning Process
Establish a Planning Group
Perform Risk Assessments and Audits
Establish Priorities for Your Network and Applications
Develop Resiliency Design and Recovery Strategy
Prepare Up-to-Date Inventory and Documentation of the Plan
Develop Verification Criteria and Procedures
Implementation
Resiliency and Backup Services
Assess Network Resiliency
Review and Implement Backup Services
Vendor Support Services
Related Information
Introduction
A disaster recovery plan covers both the hardware and software required to run critical business applications
and the associated processes to transition smoothly in the event of a natural or human-caused disaster. To
plan effectively, you need to first assess your mission-critical business processes and associated applications
before creating the full disaster recovery plan.
This best-practice document outlines the steps you need to take to implement a successful disaster recovery
plan. We'll look at the following critical steps for best-practice disaster recovery: Management Awareness,
Disaster Recovery Planning, Resiliency and Backup Services, and Vendor Support Services.
Management Awareness
Management Awareness is the first and most important step in creating a successful disaster recovery plan. To
obtain the necessary resources and time required from each area of your organization, senior management has
to understand the business impacts and risks and support the plan. Several key tasks are required to achieve
management awareness.
The following are examples of possible disasters: fire, storm, water, earthquake, chemical accidents, nuclear
accidents, war, terrorist attacks and other crime, cold winter weather, extreme heat, airplane crash (loss of key
staff), and avalanche. The likelihood of each scenario depends on factors such as geographical location and
political stability.
Note: Most disasters are caused by fire and we therefore recommend you start with fire as your first case
study.
Assess the impact of a disaster on your business from both a financial and physical (infrastructure) perspective
by asking the following questions:
• Mission Critical: Network or application outage or destruction that would cause an extreme
disruption to the business, cause major legal or financial ramifications, or threaten the health and
safety of a person. The targeted system or data requires significant effort to restore, or the restoration
process is disruptive to the business or other systems.
• Important: Network or application outage or destruction that would cause a moderate disruption to
the business, cause minor legal or financial ramifications, or provide problems with access to other
systems. The targeted system or data requires a moderate effort to restore, or the restoration process is
disruptive to the system.
• Minor: Network or application outage or destruction that would cause a minor disruption to the
business. The targeted systems or network can be easily restored.
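The three priority levels above can be recorded against an inventory so that recovery work is ordered consistently. The following is a minimal Python sketch and is not part of the original paper; the names Priority and recovery_order and every system name in the inventory are hypothetical, chosen only for illustration.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Priority levels from the list above; a lower value is restored first."""
    MISSION_CRITICAL = 1
    IMPORTANT = 2
    MINOR = 3


def recovery_order(inventory):
    """Return system names sorted by priority, then alphabetically."""
    return sorted(inventory, key=lambda name: (inventory[name], name))


# Hypothetical inventory, for illustration only.
inventory = {
    "intranet-wiki": Priority.MINOR,
    "order-processing": Priority.MISSION_CRITICAL,
    "email": Priority.IMPORTANT,
    "core-routing": Priority.MISSION_CRITICAL,
}

for name in recovery_order(inventory):
    print(f"{inventory[name].name:17} {name}")
```

Assigning a level to a real system still requires the business judgment described in the definitions; the code only keeps the resulting order consistent.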
Develop a recovery strategy to cover the practicalities of dealing with a disaster. Such a strategy may be
applicable to several scenarios; however, the plan should be assessed against each scenario to identify any
actions specific to different disaster types. Your plan should address the following: people, facilities, network
services, communication equipment, applications, clients and servers, support and maintenance contracts,
additional vendor services, lead time of Telco services, and environmental situations.
Your recovery strategy should include the expected downtime of services, action plans, and escalation
procedures. Your plan should also determine thresholds, such as the minimum level at which the business can
operate, the systems that must have full functionality (all staff must have access), and the systems that can be
minimized.
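One way to make these thresholds concrete is to record them per service and compare an ongoing outage against them. The sketch below is illustrative only: the RecoveryTarget fields, the needs_escalation helper, the service names, and the hour figures are all assumptions, not values from this paper.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTarget:
    service: str
    expected_downtime_hours: float  # expected downtime agreed in the plan
    full_functionality: bool        # True if all staff must have access


def needs_escalation(target, hours_down):
    """True once an outage has lasted longer than the planned downtime."""
    return hours_down > target.expected_downtime_hours


# Hypothetical targets, for illustration only.
targets = [
    RecoveryTarget("order-processing", 4.0, True),
    RecoveryTarget("intranet-wiki", 72.0, False),
]

for t in targets:
    print(t.service, needs_escalation(t, hours_down=6.0))
```

A real plan would tie the escalation check to the documented escalation procedures rather than to a single boolean.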
It's important that you test and review the plan frequently. We recommend documenting the verification
process and procedures, and designing a proof-of-concept process. The verification process should include
an experience cycle; disaster recovery is based on experience and each disaster has different rules. You may
want to call on experts to develop and prove the concept, and product vendors to design and verify the plan.
Implementation
Now it's time to make some key decisions: How should your plan be implemented? Who are the critical staff
members, and what are their roles? Leading up to the implementation of your plan, try to practice for disaster
recovery using roundtable discussions, role playing, or disaster scenario training. Again, it's essential that your
senior management approves the disaster recovery and implementation plans.
• Network links
♦ Carrier diversity
♦ Local loop diversity
♦ Facilities resiliency
♦ Building wiring resiliency
• Hardware resiliency
♦ DNS resiliency
♦ DHCP resiliency
♦ Other services resiliency
All system and application backup strategies depend upon network connections. Disaster handling requires
communication services, and the impact of a disaster could be greatly limited by having available
communication services.
The following table shows possible backup services (across the top row) for a primary connection (down the
left column). Based on your location, some of the services may not be available, or may only be available with
limited bandwidth. An X represents a possible backup services solution; an O represents a limited backup
services solution; and a blank box represents an option that is not sufficient as a backup service solution.
Backup services, in column order: IP Services; PLC (E1, T1, fractional); ISDN; Frame Relay; ATM;
SDH / SMDS; POTS; VSAT; Microwave; Communication by Light.

Primary connection          Entries as listed, left to right
IP Services                 X X X X X X X X X X
PLC (E1, T1, fractional)    X X X X X O O O O
ISDN                        X X X X X O O O O
Frame Relay                 X X X X X
A backup service (marked with an X) should offer 60 percent of the bandwidth requirements of the primary
service. The backup service must be compatible, and in some cases additional interfaces for routers, switches,
adapters, and protocols are required.
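The 60 percent guideline above reduces to simple arithmetic. The sketch below reads the guideline as "at least 60 percent", which is an assumption; the function names are hypothetical. The example uses the nominal line rates of an E1 (2048 kbps), a T1 (1544 kbps), and an ISDN basic rate interface (128 kbps).

```python
BACKUP_PERCENT = 60  # the 60 percent guideline stated above


def backup_is_sufficient(primary_kbps, backup_kbps):
    """True if the backup offers at least BACKUP_PERCENT of the primary.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return backup_kbps * 100 >= primary_kbps * BACKUP_PERCENT


E1_KBPS = 2048
T1_KBPS = 1544
ISDN_BRI_KBPS = 128

# An E1 primary needs 2048 * 60 / 100 = 1228.8 kbps of backup bandwidth.
print(backup_is_sufficient(E1_KBPS, T1_KBPS))
print(backup_is_sufficient(E1_KBPS, ISDN_BRI_KBPS))
```

Bandwidth is only one criterion; the compatibility and interface requirements in the paragraph above still have to be checked separately.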
Most vendors have experience handling disaster situations and can offer additional support. Cisco offers a
wide range of Service & Support Solutions (registered customers only) and can assist with limiting downtime in
the case of an unexpected outage.
Related Information
• Technical Support - Cisco Systems
All contents are Copyright © 2006-2007 Cisco Systems, Inc. All rights reserved. Important Notices and Privacy Statement.