Module 7 Business Continuity Management
Table of Contents
Module 7: Business Continuity Management
SECTION 1: OVERVIEW
MODULE 7: BUSINESS CONTINUITY MANAGEMENT (6%)
Objective:
Task Statements
Knowledge Statements
Relationship of Task Statements with Knowledge Statements
Task and Knowledge Statements Mapping
Knowledge Statement Reference Guide
SECTION 2: CONTENTS
Chapter 1: Business Continuity Management, Business Continuity Planning and Disaster Recovery Planning
Learning objectives
1.1 Introduction
1.2 Definitions of key terms
1.3 Key concepts of Disaster Recovery, Business Continuity Plan and Business Continuity Management
1.3.1 Contingency Plan (CP)
Components of contingency planning
1.3.2 Business Continuity Plan vs. Disaster Recovery Plan
1.3.3 Business Continuity Management
1.4 Objectives of BCM and BCP
1.4.1 Objectives of Business continuity plan:
1.4.2 Objectives of Business Continuity Management (BCM)
1.5 What is a disaster?
1.5.1 Types of disasters
1. Natural disasters
2. Man-made disasters
1.5.2 Phases of disaster
1. Crisis phase
2. Emergency response phase
3. Recovery phase
4. Restoration phase
1.5.3 Examples of disaster
1.5.4 Impact of disaster
1.6 Summary
1.7 Questions
1.8 Answers and Explanations
Chapter 2: Strategies for Development of Business Continuity plan
Learning Objectives
2.1 Introduction:
2.2 Pre-requisites in developing a Business Continuity Plan
2.2.1 Phase 1: Business Impact Analysis
2.2.2 Phase 2: Risk assessment and methodology of risk assessment
2.2.3 Phase 3: Development of BCP
2.2.4 Phase 4: Testing of BCP and DRP
2.2.5 Phase 5: Training and Awareness
2.2.6 Phase 6: Maintenance of BCP and DRP
Role of IS Auditor in testing of BCP:
2.3 Incident Handling and Management.
2.3.1 Incident Response
2.3.2 Incident Classification
2.3.3 Norms and procedure for declaring an Incident as a Disaster:
Collection of data under IRP
Reactions to incidents
Incident Notifications
Documenting an Incident
Incident containment strategies
Recovering from incident
After action review
Incident response plan review and maintenance:
2.4 Invoking a DR Phase/BCP Phase
2.4.1 Operating Teams of contingency planning
2.4.2 DRP scope and objectives
2.4.3 Disaster recovery phases
2.4.4 Key Disaster recovery activities
2.4.5 DRP
2.4.6 Disaster Recovery Team
General Responsibilities
General Activities
Administrative Responsibilities
Supply Responsibilities
Public Relations Responsibilities
Management Team Call Checklist
Hardware Responsibilities
Software Responsibilities
Network Responsibilities
Operations Responsibilities
Salvage Responsibilities
New Data Center Responsibilities
New Hardware Responsibilities
Resumption of Normal Operations
2.5 Documentation: BCP Manual and BCM Policy
2.5.1 BCM Policy
2.5.2 BCP Manual
Elements of BCP Manual
2.6 Data backup, Retention and Restoration practices
2.6.1 Back up strategies
Types of Backup
2.6.2 Recovery strategies
Strategies for Networked Systems
Strategies for Data communications
Strategies for Voice Communications
2.7 Types of recovery and alternative sites
2.7.1 Mirror Site/ Active Recovery Site
Mirror site
Hot Site
Cold Site
Warm Site
2.7.2 Offsite Data protection
Data Vaults
Hybrid onsite and offsite vaulting
2.8 System Resiliency Tools and Techniques
2.8.1 Fault Tolerance
2.8.2 Redundant array of inexpensive disks (RAID)
2.9 Insurance coverage for BCP
2.9.1 Coverage
2.9.2 Kinds of Insurance
(a) First-party Insurances: Property damages
(b) First-party Insurances: Business Interruption
(c) Third-party Insurance: General Liability
(iv) Third-party Insurance: Directors and Officers
2.10 Summary
2.11 Questions
2.12 Answers and Explanations
Chapter 3: Audit of business Continuity plan
Learning Objectives
3.1 Introduction
3.2 Steps of BCP Process
3.2.1 Step 1: Identifying the mission or business-critical functions
3.2.2 Step 2: Identifying the resources that support critical functions
1. Human Resources
2. Processing capability
3. Automated applications and data
4. Computer-based services
5. Physical infrastructure
6. Documents and papers
3.2.3 Step 3: Anticipating potential contingencies or disasters
3.2.4 Step 4: Selecting contingency planning strategies
3.2.5 Step 5: Implementing the contingency strategies
1. Implementation
2. Back up
3. Documentation
4. Assigning responsibility
5. No. of BC Plans and responsibility
3.2.6 Step 6: Testing and Revising
3.3 Audit and Regulatory requirements
3.3.1 Role of IS Auditor in BCP Audit
3.3.2 Regulatory requirements
Using COBIT best practices for evaluating regulatory compliances
3.3.3 Regulatory compliances of BCP
Basel Committee on E Banking
Indian legislations
3.4 Using best practices and frameworks for BCP
3.4.1 COBIT 5
DSS04: Manage continuity
BAI04: Manage Availability and Capacity
3.4.2 ISO 22301: Standard on Business Continuity Management
3.3.3 ITIL
3.3.4 SSAE 16
3.4.3 Audit Tools and Techniques
3.4.4 Service Level Agreement
3.5 Services that can be provided by an IS Auditor in BCM
3.6 Summary
3.7 References
3.8 Questions
3.9 Answers and Explanations
SECTION 3: APPENDIX
Checklists and control matrix
Appendix 1: Checklist for a Business Continuity Plan and Audit
Appendix 2: RCM and audit guidelines for DRP and BRP
Appendix 3: Sample of BCP Audit Finding
SECTION 1: OVERVIEW
MODULE 7: BUSINESS CONTINUITY MANAGEMENT (6%)
Objective:
Provide assurance or consulting services to confirm whether the Business Continuity
Management (BCM) strategy, processes and practices meet the organisation's requirements to
ensure timely resumption of IT-enabled business operations and minimize the business
impact of a disaster.
Task Statements
7.1 Distinguish between Disaster recovery plan, Business Continuity Plan and BCM.
7.2 Evaluate the organisation's business continuity plan to assess its adequacy and capability to
continue essential business operations during an IT or non-IT disruption.
7.3 Apply industry best practices and regulatory requirements as relevant for BCM such as
COBIT/ISO, etc.
7.4 Map business continuity management practices to organisation requirements, objectives and
budgets.
7.5 Review the organisation processes of business resilience in the context of BCM.
7.6 Identify the business and operational risks inherent in an entity’s disaster recovery/business
continuity plan.
7.7 Assess the process of business Impact analysis.
7.8 Identify recovery strategies and their adequacy to meet business needs.
7.9 Assess impact of RPO/RTO on Computer setup and IT Service Design.
7.10 Assess adequacy of operations and end-user procedures for managing disruptions and
incident management.
7.11 Perform various types of tests for different aspects of Business continuity.
7.12 Assess adequacy of documentation and maintenance process of BCM.
7.13 Assess service level management practices and the components within a service level
agreement.
7.14 Review monitoring of third party compliance with the organisation controls as relevant to BCM.
7.15 Evaluate adequacy of BCP processes and practices to confirm they meet business continuity
requirements.
7.16 Evaluate the organisation's BCM practices to determine whether they meet the organisation's
requirements.
Knowledge Statements
7.1 DRP, BCP and BCM processes and practices and related documentation.
7.2 Industry best practices as relevant such as COBIT, ISO standard for BCP/DRP.
7.3 IT deployment in organisations and business continuity requirements at various levels of IT
such as hardware, network, system software, database software, application software, data,
facilities, human resources, etc.
7.4 System resiliency tools and techniques (e.g., fault tolerant hardware, elimination of single
point of failure, etc.).
7.5 Business impact analysis (BIA) related to disaster recovery planning.
7.6 Development and maintenance of BCM, BCP and DRP.
7.7 Problem and incident management practices (e.g., help desk, escalation procedures,
tracking).
7.8 Analyzing SLA reports and relevant provisions.
7.9 Backup & Recovery strategies, Recovery Window, RPO and RTO.
7.10 Data backup, storage, maintenance, retention and restoration practices.
7.11 Regulatory, legal, contractual and insurance issues related to BCM.
7.12 Types of alternate processing sites and methods (e.g. hot sites, warm sites, cold sites).
7.13 Processes used to invoke the disaster recovery plans and BCP as relevant.
7.14 Testing methods for DRP/BCP and BCM.
7.15 Auditing the BCP-DRP plans.
Task Statement | Knowledge Statements
7.12 Assess adequacy of documentation and maintenance process of BCM. | 7.1 BCM, BCP, DRP and related documentation; 7.10 Data backup, storage, maintenance, retention and restoration practices.
7.13 Assess Service level management practices and the components within a service level agreement. | 7.8 Identifying recovery strategies and their adequacy to meet business needs.
7.14 Review monitoring of third party compliance with the organisation controls as relevant to BCM. | 7.11 Regulatory, legal, contractual and insurance issues related to BCM.
7.15 Evaluate adequacy of BCP processes and practices to confirm it meets business continuity requirements. | 7.9 Backup & Recovery strategies, Recovery Window, RPO and RTO; 7.10 Data backup, storage, maintenance, retention and restoration practices; 7.12 Types of alternate processing sites and methods (e.g., Near site, hot sites, warm sites, cold sites); 7.1 BCM, BCP, DRP and related documentation.
7.16 Evaluate organisation BCM practices to determine whether it meets organisation requirements. | 7.14 Testing methods for DRP/BCP and BCM; 7.15 Auditing the BCP and DRP plans; 7.13 Processes used to invoke the disaster recovery plans and BCP as relevant.
KS 7.1 DRP, BCP, BCM processes and practices and related documentation.
KS 7.2 Industry best practices as relevant such as COBIT, ISO standard for BCP/DRP
KS 7.4 System resiliency tools and techniques (e.g., fault tolerant hardware, elimination of single
point of failure, etc.)
KS 7.7 Problem and incident management practices (e.g., help desk, escalation procedures,
tracking)
Key Concepts | Reference
Understand the need of having an Incident Management Process. | 1.2, 1.3, 1.5, 2.2
KS 7.9 Backup & Recovery strategies, Recovery Window, RPO and RTO
KS 7.12 Types of alternate processing sites and methods (e.g. hot sites, warm sites, cold sites)
KS 7.13 Processes used to invoke the disaster recovery plans and BCP as relevant.
Key Concepts | Reference
Understand the processes to initiate a DR process for restoration of uptime of IT Services at the time of the happening of a critical incident. | 2.4, 3.3, 2.9
Understand the process to initiate a BCP following a DR Process for complete restoration of Core Business Operations. | 2.4, 3.3, 2.9
SECTION 2: CONTENTS
CHAPTER 1: BUSINESS CONTINUITY MANAGEMENT, BUSINESS CONTINUITY
PLANNING AND DISASTER RECOVERY PLANNING
Learning objectives
The objective of this chapter is to provide knowledge about the key concepts of Business Continuity
Management (BCM), Business Continuity Planning (BCP), Disaster Recovery Planning (DRP),
incident response, contingency planning and disasters. It is important to understand these concepts
as they form the base for the content explained in Chapters 2 and 3. A DISA candidate is
expected to understand the key terms and related concepts, as this is critical for designing,
implementing or reviewing business continuity. A good understanding and working knowledge of
this area will help DISAs provide assurance and consulting services.
1.1 Introduction
Information is said to be the currency of the 21st century and is considered the most valuable
asset of an organisation. This is especially true of organisations that depend heavily on
Information Technology (IT). Modern organisations run their business on information that is
processed using Information and Communication Technology (ICT), and ICT plays a central role in
their business activities. For example, the stock market is virtually paperless. Banks and
financial institutions have moved online, and customers rarely need to set foot in the branch
premises. Business is conducted with a heavy dependence on real-time information from
information technology assets, which makes information a critical factor for the continued
success of the business. This dependence is most explicit in organisations that now rely on IT
to perform their regular business operations. We can understand the criticality of IT by
imagining the impact of failure or non-availability of IT on the following types of organisations:
A bank using a core banking solution with a million accounts, credit cards, loans and customers.
Companies using centralized ERP software with operations in multiple locations.
An airline serving customers on flights daily using IT for all operations.
A pharmacy system filling millions of prescriptions per year (some of the prescriptions are life-saving).
An automobile factory manufacturing hundreds of vehicles daily using automated solutions.
Railways managing thousands of train routes and passengers through automated ticketing and reservation.
The above situations clearly demonstrate the heavy dependence on IT systems. Modern organisations
cannot think of running their business operations without IT, yet IT can fail due to multiple
factors and is prone to increasing risks. A disruption of business operations can be caused by an
unforeseen man-made or natural disaster and may lead to loss of productivity, revenue and market
share, among many other impacts. Hence, organisations have to take the necessary steps to minimize
the impact of such disasters and to build resilience that ensures continuity of critical
operations in the event of disruptions. It is therefore becoming increasingly important for
organisations to have a business contingency plan for their Information Systems so that operations
can be resumed after a disruption.
The criticality of the plan depends on how severely a failure or non-availability of IT would
affect critical business operations and service delivery. A failure of IT could be caused by
one or more of the following:
Hacker break-in
Sabotage or theft
Organisations worldwide are increasingly dependent on computers to assist in and carry out
decision-making processes and to record business transactions. An organisation is
extremely dependent on several IS resources such as computers, employees, servers and
communication links. If any of these resources is not available, the organisation will not be able
to function at full strength. The longer one or more of these resources are unavailable, the longer
it takes the organisation to get back to its original state. Sometimes, an organisation can never
get back to its original state. As a result, it is important to have a tested plan for disaster
recovery, particularly in the information systems area, to ensure business continuity.
Organisations that think ahead have a better chance of survival.
Business Continuity Planning: Business continuity planning is the process of developing prior
arrangements and procedures that enable an organisation to respond to an event in such a manner
that critical business functions can continue within planned levels of disruption. The end result
of the planning is called a Business Continuity Plan.
Business Impact Analysis: The process of analyzing functions and the effect that a business
disruption might have upon them.
Crisis: An abnormal situation which threatens the operations, staff, customers or reputation of the
organisation.
Disaster: A physical event which interrupts business processes sufficiently to threaten the viability
of the organisation.
Disaster Recovery Planning: A disaster recovery plan (DRP) is a documented process or set of
procedures to recover and protect a business's IT infrastructure in the event of a disaster.
Emergency Management Team (EMT): This team, comprising executives at all levels including
IT, is vested with the responsibility of commanding the resources required to recover the
enterprise's operations.
Incident: An event that has the capacity to lead to loss of or a disruption to an organisation’s
operations, services, or functions – which, if not managed, can escalate into an emergency, crisis
or disaster.
Incident Management Plan: A clearly defined and documented plan of action for use at the time
of an incident, typically covering the key personnel, resources, services and actions needed to
implement the incident management process.
Minimum Business Continuity Objective (MBCO): As per ISO 22301:2012, clause 3.28, MBCO is the
minimum level of services and/or products that is acceptable to the organisation to achieve its
business objectives during a disruption, such as an incident, emergency or disaster. MBCO is
used to develop the test plan for testing the BCP.
Maximum Acceptable Outage (MAO): This is the time frame during which a recovery must
become effective before an outage compromises the ability of an Organization to achieve its
business objectives and/or survival. This refers to the maximum period of time that an organization
can tolerate the disruption of a critical business function, before the achievement of objectives is
adversely affected. MAO is also known as maximum tolerable outage (MTO), maximum downtime
(MD). Maximum Tolerable Period Downtime (MTPD).
Vulnerability: The degree to which a person, asset, process, information, infrastructure or other
resources are exposed to the actions or effects of a risk, event or other occurrence.
4. Disaster recovery plan (DRP)
A DR plan includes tasks such as planning for disaster recovery, crisis management and recovery operations.
The Disaster Recovery Plan is the set of plans to be executed first, at the moment of crisis.
These plans include measures to control the disaster, mitigate its effects and initiate the recovery of
the resources that are needed for the continuity of business, in particular the resources that have
been affected by the disaster.
There are three basic strategies that encompass a disaster recovery plan: preventive measures,
detective measures, and corrective measures. Preventive measures will try to prevent a disaster
from occurring. These measures seek to identify and reduce risks. They are designed to mitigate
or prevent an event from turning into a disaster. These measures may include keeping data backed
up and off site, using surge protectors, installing generators and conducting routine inspections.
Detective measures are taken to discover the presence of any unwanted events within the IT
infrastructure. Their aim is to uncover new potential threats. They may detect or uncover unwanted
events. These measures include installing fire alarms, using up-to-date antivirus software, holding
employee training sessions, and installing server and network monitoring software. Corrective
measures aim to restore a system after a disaster or other unwanted event takes place.
These measures focus on fixing or restoring the systems after a disaster and may include keeping
critical documents in the DRP or securing proper insurance policies.
organisation's assets, BCM requires plans and strategies that should cater for and allow
responses, contingency plans and procedures to recover as quickly as possible. BCM looks at the
entirety of the business of the entity. It is a continuous process whereby risks which
are inherent to the business are closely monitored and mitigated.
The primary objective of a Business Continuity Plan (BCP) is to enable an organisation to continue
to operate through an extended loss of any of its business premises or functions.
• Reduce likelihood of a disruption occurring that affects the business through a risk
management process.
• Enhance organisation’s ability to recover following a disruption to normal operating
conditions.
• Minimize the impact of that disruption, should it occur.
• Protect staff and their welfare and ensure staff know their roles and responsibilities.
• Tackle potential failures within organisation’s IS environment.
• Protect the business.
• Preserve and maintain relationships with customers.
• Mitigate negative publicity.
• Safeguard organisation’s market share and/or competitive advantage.
• Protect organisation’s profits or revenue and avoid financial losses.
• Prevent or reduce damage to the organisation’s reputation and image.
Need for BCM at business Level
The need for BCM arises because of the following present day requirements of business
A disaster can be defined as an unplanned interruption of normal business processes. It can be described
as a disruption of business operations, caused by the absence of critical resources, that stops an
organisation from providing critical services. The occurrence of a disaster cannot always be foreseen;
hence we need to be prepared for all the types of disasters that can arise and to handle them effectively
in the shortest possible time.
Vulnerabilities are also a major cause or source of disasters. Vulnerabilities are weaknesses
associated with an organisation’s assets. These weaknesses may be exploited by a threat causing
unwanted incidents that may result in loss, damage or harm to those assets. Vulnerability in itself
does not cause harm; it is merely a condition or set of conditions that may allow a threat to affect
an asset. As part of the risk assessment exercise, it is important to understand the various vulnerabilities
and threats; coupled with their probability, these allow the organisation to assess the impact. This is evaluated
in the business impact analysis and mitigated by implementing appropriate counter
measures. In contemporary academia, disasters are seen as the consequence of inappropriately
managed risk. These risks are the product of a combination of both hazard/s and vulnerability.
Hazards that strike in areas with low vulnerability will never become disasters, as is the case in
uninhabited regions.
environment. A disaster can be defined as any tragic event stemming from events such as
earthquakes, floods, catastrophic accidents, fires, or explosions. It can cause damage to life and
property and destroy the economic, social and cultural life of people. For a clearer understanding
of the concept of disasters, disasters can be classified into two major categories as:
1. Natural disasters
2. Man-made disasters
1. Natural disasters
Natural disasters are those which result from natural environmental factors. A natural disaster
affects the businesses located in the geographical area where it
has struck. Natural disasters are caused by natural events and include fire, earthquake, tsunami,
typhoon, floods, tornado, lightning, blizzards, freezing temperatures, heavy snowfall, pandemic,
severe hailstorms, volcanic eruptions etc.
2. Man-made disasters
Man-made disasters are artificial disasters which arise due to the actions of human beings. An artificial
disaster has its impact on the specific business entity in which it has occurred. Artificial disasters
include terrorist attack, bomb threat, chemical spills, civil
disturbance, electrical failure, fire, HVAC failure, water leaks, water stoppage, strikes, hacker
attacks, viruses, human error, loss of telecommunications, data center outage, lost data,
corrupted data, loss of network services, power failure, prolonged equipment outage, UPS loss,
generator loss and anything that diminishes or destroys normal data processing capabilities.
1. Crisis phase
2. Emergency response phase
3. Recovery phase
4. Restoration phase
1. Crisis phase
The Crisis Phase is under the overall responsibility of the Incident Control Team (ICT). It comprises
the first few hours after a disruptive event starts or the threat of such an event is first identified; and
is caused by, for example:
• Ongoing physical damage to premises which may be life threatening, such as a fire;
or
• Restricted access to premises, such as a police cordon after a bomb incident.
During the crisis phase, the fire and other emergency evacuation procedures (including
bomb threat and valuable object removal procedures) will apply; and the emergency
services should be summoned as appropriate.
3. Recovery phase
The Recovery Phase may last from a few days to several months after a disaster and ends when
normal operations can restart in the affected premises or replacement premises, if appropriate.
During the recovery phase, essential operations will be restarted (this could be at temporary
premises) by one or more recovery teams using the BCP; and the essential operations will continue
in their recovery format until normal conditions are resumed.
4. Restoration phase
This phase restores conditions to normal. It will start with a damage assessment, usually within a
day or so of the disaster. When the cause for evacuation or stopping of operations has ended,
normal working will be restarted. During the restoration phase, any damage to the premises and
facilities will be repaired.
Scenario: Phases that apply
Serious fire during working hours: All phases in full
Serious fire outside working hours: All the phases; however, no staff and public evacuation
Very minor fire during working hours: Crisis phase only; staff and public evacuation but perhaps no removal of valuable objects; fire service summoned to deal with the fire
Gas main leak outside during working hours, repaired after some hours: Only the emergency response phase is appropriate
• Total destruction of the premises and its contents. For example as a result of a terrorist
attack;
• Partial damage, preventing use of the premises. For example through flooding; or
• No actual physical damage to the premises but restricted access for a limited period, such
as enforced evacuation due to the discovery nearby of an unexploded bomb.
The impact of a disaster may result in one or more of the following:
ii. Loss of human life: The extent of loss depends on the type and severity of the disaster.
Protection of human life is of utmost importance and is the overriding principle behind
continuity plans.
iii. Loss of productivity: When a system failure occurs, employees may be handicapped in
performing their functions. This could result in productivity loss for the organisation.
iv. Loss of Revenue: For many organisations like banks, airlines, railways, stock brokers,
effect of even a relatively short breakdown may lead to huge revenue losses.
v. Loss of Market share: In a competitive market, inability to provide services in time may
cause loss of market share. For example, a prolonged non-availability of services from
services providers, such as Telecom Company or Internet Service Providers, will cause
customers to change to different service providers.
vi. Loss of goodwill and customer services: In case of a prolonged or frequent service
disruption, customers may lose confidence resulting in loss of faith and goodwill.
vii. Litigation: Laws, regulations and contractual obligations in the form of service level agreements
govern the business operations. Failure to comply may expose the company to
litigation and lawsuits.
When considering the impact of a disaster, it should be remembered that it will never happen at a
convenient time; and is always unpredictable. There is no way of knowing:
1.6 Summary
This chapter has provided an overview of the key concepts relating to management of BCP, DRP
and Incident Responses. Together, these are to be implemented as part of Business Continuity
management. The ultimate objective of a BCM is to recover from a crisis as fast as possible and
at the lowest possible cost. Business Continuity is applicable to organisations of all sizes and types
of business. Business Continuity is most crucial to organisations which use IT Resources for their
critical business functions. We have also understood that the need for BCP arises due to a
disruptive event which could result in a disaster. A disaster is an event, either natural or man-made,
that causes interruption to the ongoing business functions. The phases of a disaster (crisis
phase, emergency response phase, recovery phase and restoration phase) have also been
discussed.
1.7 Questions
1. An organisation's disaster recovery plan should address early recovery of:
3. Which of the following BEST describes difference between a DRP and a BCP? The DRP:
A. works for natural disasters whereas BCP works for unplanned operating incidents
such as technical failures.
B. works for business process recovery and information systems whereas BCP works
only for information systems.
C. defines all needed actions to restore to normal operation after an un-planned
incident whereas BCP only deals with critical operations needed to continue working
after an un-planned incident.
D. is the awareness process for employees whereas BCP contains procedures to
recover the operation.
4. The MOST significant level of BCP program development effort is generally required
during the:
5. Disaster recovery planning for a company's computer system usually focuses on:
A. Risk
B. Vulnerability
C. Disaster
D. Resilience
7. Which of the following strategies is not part of a disaster recovery plan?
A. Preventive
B. Detective
C. Corrective
D. Administrative
C. Reduce the costs involved in reviving the business from the incident
D. Mitigate negative publicity
A. Crisis Phase
B. Emergency Response Phase
C. Recovery Phase
D. Restoration Phase
A. Loss of Productivity
B. Loss of Revenue
C. Loss of Human Life
D. Loss of Goodwill & Market Share
3. C. The difference pertains to the scope of each plan. A disaster recovery plan recovers
all operations, whereas a business continuity plan retrieves business continuity (minimum
requirements to provide services to the customers or clients). Choices A, B and D are
incorrect because the type of plan (recovery or continuity) is independent from the sort of
disaster or process and it includes both awareness campaigns and procedures.
4. A. A company in the early stages of business continuity planning (BCP) will incur the
most significant level of program development effort, which will level out as the BCP
program moves into maintenance, testing and evaluation stages. It is during the planning
stage that an IS Auditor will play an important role in obtaining senior management's
commitment to resources and assignment of BCP responsibilities.
5. D. It is important that disaster recovery identify alternative processes that can be put in
place while the system is not available.
7. D. There are three basic strategies that encompass a disaster recovery plan: preventive
measures, detective measures, and corrective measures. Preventive measures will try to
prevent a disaster from occurring. These measures seek to identify and reduce risks.
Detective measures are taken to discover the presence of any unwanted events within
the IT infrastructure. Their aim is to uncover new potential threats. Corrective measures
are aimed to restore a system after a disaster or otherwise unwanted event takes place.
9. D. The restoration phase will start with a damage assessment, usually within a day or so of
the disaster. When the cause for evacuation or stopping of operations has ended, normal
working will be restarted. During the restoration phase, any damage to the premises and
facilities will be repaired.
10. C. Protection of human life is of utmost importance and is the overriding principle behind
continuity plans. All the others are to be considered later.
CHAPTER 2: STRATEGIES FOR DEVELOPMENT
OF BUSINESS CONTINUITY PLAN
Learning Objectives
This chapter forms the core of Business Continuity Management (BCM). The objective of this
chapter is to provide understanding on how to design a Business Continuity Plan (BCP). The key
steps of BCP such as Business Impact Analysis, performing risk assessment and designing tests
for the BCP are explained. BCP requires planning in advance and planning requires extensive
documentation and communication so that implementation happens as per plan. Specific aspects
of BCP which need to be documented are explained with sample contents, steps and procedures.
These cover BCP manual, backup and recovery strategies, recovery and alternate sites, other
strategies and types of insurance requirements. Critical events such as how and when to invoke a
disaster recovery plan and the specific tasks and responsibilities are explained. This chapter also
explains what management has to do with respect to BCP. A thorough knowledge of these topics will
help DISAs to perform a BCP audit or provide consulting services on any/all
aspects of BCP. BCP audit is explained in the next chapter.
2.1 Introduction:
An organisation’s ability to weather losses caused by unexpected events depends on proper
planning and execution of such plans. Without a workable plan, unexpected events can cause
severe damage to information resources and assets. Businesses that do not have a
disaster plan often go out of business after a major loss like a fire, a break-in, or a storm. A formal policy
provides the authority and guidance necessary to develop an effective Business Continuity plan.
The Business Impact Analysis helps to identify and prioritize critical IT systems and components.
It also helps to identify the control measures to be in place to reduce the effects of system
disruptions, to increase system availability and to reduce contingency life cycle costs. Developing
planned recovery strategies ensures that critical information systems are recovered quickly and
effectively following a disruption. Business impact analysis helps organisations in choosing the
right recovery strategies. The BCP should contain detailed guidance and procedures to be followed
till the restoration of the damaged system. Testing the plan identifies planning gaps, whereas training
prepares the recovery personnel for plan activation; both activities improve plan effectiveness
and overall organisation preparedness. BCP should be a living document that is updated regularly
as per defined policy.
A BCP cannot be considered as a project which is completed within a specified time. BCP has to be
a continuous process and has to be an integral part of the day to day business processes. To
be effective, a BCP has to be tested and updated regularly, at least once a year if not more frequently.
Critical business functions may keep changing and hence a plan that does not keep pace with the
changes in the organisation is not of any use. Therefore, one may have a working BCP on a given
day, but the BCP has to be a continuous affair to ensure success.
The primary objectives of a BCP are to guide an organisation in the event of a disaster and to
effectively re-establish critical business operations within the shortest possible period of time with
minimal loss of data. The pre-requisite in developing a BCP includes planning for all phases and
making it part of business process by assigning responsibility to specific business process owners.
The goals of planning the project are to assess current and anticipated vulnerabilities, define the
requirements of the business and IT, design and implement risk mitigation procedures and provide
the organisation with a BCP that will enable it to react quickly and efficiently in the event of a disaster.
The objectives of planning the project are to gain an understanding of the existing and planned future
IT environment of the organisation, define the scope of the project, develop the project schedule,
and identify risks to the project. In addition, a project sponsor/champion and steering committee
should be established. The project sponsor or champion should be a member of senior
management team with required authority to push the project to completion. The steering
committee should be responsible for guiding the project team. The committee should have
members from both functional and IT departments. Further, a project manager and / or a BCP
coordinator should be appointed to lead, monitor completion and maintain the project. In
implementing a BCP project, the key tasks are:
operating expense and P&L impact. If stakeholders do not see the big picture, they surely will not
accept the details.
BIA is the first phase in the contingency planning process. It provides data about systems and the threats
they face. It also provides detailed scenarios/effects of attacks. The contingency planning team conducts
Business Impact Analysis in the following stages:
Risk is an exposure to unwanted loss. In terms of Business Continuity, it is the risk of an incident
happening which may result in unwanted loss of an asset or delay in operations.
Risk Assessment is the systematic identification of all risks, their investigation and grading
relative to each other and to the department, so that the management can be given a clear and
full understanding of the risks it faces.
Risk Assessment is an important phase in the development of a Business Continuity Plan (BCP).
The objectives of Risk Assessment are to:
the maximum amount of time allowed for the recovery of the business function. This is
the amount of downtime of the business process that the business can tolerate and still remain
viable. If this time is exceeded, then severe damage to the organisation will result. From the
IT point of view, recovery usually means restoring support for the processing and
communication functions that are considered to be critical to the business and then restoring
support for other non-critical/ancillary systems. From the business perspective, recovery
means being able to execute the business functions that are key to the survival of the
business and then being able to execute the non-critical/ancillary functions. Another key
factor to consider when defining recovery is the timeframe. Recovery of the function and the
system is evaluated considering several time-based factors such as:
i. Recovery Time Objective (RTO): RTO is the measure of the user’s tolerance to
downtime. It indicates the point in time by which the business operations must
resume after a disaster. For example, a critical monitoring system must have a very low
or zero RTO, which may be measured in minutes or less.
ii. Service Delivery Objective (SDO): Service Delivery Objective (SDO) is the level of
services to be reached during the alternate process mode until the normal situation is
restored. This is directly related to the business needs.
iii. Recovery Point Objective (RPO): RPO is a measure of how much data loss due to a
node failure is acceptable to the business. A large RPO means that the business can
tolerate a great deal of lost data. Depending on the environment, the loss of data could
have a significant impact. A rule of thumb is that the lower the RPO, the higher the overall
cost of maintaining the environment for recovery. With an RPO of 5 minutes, up to 5 minutes
of data can be lost, whereas a zero RPO means no loss of data. Like RTO / SDO, RPO
may vary with services and systems. It is also important to understand the
dependencies between the systems and to take them into consideration while determining
the critical systems. RTO and RPO are not closely related: they may both be almost
zero, they may both be large, or one may be small but the other large. Once a company
decides what RPO and RTO are applicable to an application, the method for backup and
recovery of that application becomes much more evident.
Examples of need of various systems in different organisations and scenarios are illustrated here:
i. A stock exchange trading system must be restored very quickly and cannot afford to lose
any data. Since the price of the next trade depends upon the previous trade, the loss of a
trade will make all subsequent transactions wrong. In this case, the RTO may be measured
as a few minutes or less, but the RPO must be zero.
ii. A critical monitoring system such as those used by power grids, nuclear facilities, or
hospitals for monitoring patients must have a very small RTO, but the RPO may be large.
In these systems, monitoring must be as continuous as possible; but the data collected
becomes stale very quickly. Thus, if data is lost during an outage (large RPO), this perhaps
impacts historical trends; but no critical functions are lost. However, an outage must end
as quickly as possible so that critical monitoring can continue. Therefore, a very small RTO
is required.
iii. A Web-based online ordering system must have an RPO close to zero (the company does
not wish to lose any sales or, even worse, acknowledge a sale to a customer and then not
deliver the product). However, if shipping and billing are delayed by even a day, there is
often no serious consequence, thus relaxing the RTO for this part of the application.
iv. A bank’s ATM system is even less critical. If an ATM is down, the customer, although
aggravated, will find another one. If an ATM transaction is lost, a customer’s account may
be inaccurate until the next day when the ATM logs are used to verify and adjust customer
accounts. Thus, neither RPO nor RTO need to be small.
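As a rough illustration of how RPO and RTO narrow down the choice of backup and recovery method, the following Python sketch checks candidate strategies against the objectives of two of the systems described above. All names and numeric values here (for example RecoveryObjective, the 5-minute RTO and the nightly backup figures) are hypothetical assumptions made for illustration and are not taken from the text. The sketch also assumes that the worst-case data loss of a periodic backup equals one full backup interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryObjective:
    """Recovery targets for one system, in minutes (illustrative values only)."""
    name: str
    rto_minutes: float  # maximum tolerable downtime
    rpo_minutes: float  # maximum tolerable window of lost data


@dataclass(frozen=True)
class RecoveryStrategy:
    """A hypothetical backup and recovery option."""
    name: str
    backup_interval_minutes: float  # time between successive backups or replications
    restore_minutes: float          # estimated time to bring the service back


def meets_objective(objective, strategy):
    """Return True if the strategy satisfies both the RPO and the RTO."""
    # Assumption: worst-case data loss equals one full backup interval.
    data_loss_ok = strategy.backup_interval_minutes <= objective.rpo_minutes
    downtime_ok = strategy.restore_minutes <= objective.rto_minutes
    return data_loss_ok and downtime_ok


# Assumed figures: a few minutes of RTO and zero RPO for the trading system,
# a full day for both objectives for the ATM system.
stock_exchange = RecoveryObjective("stock exchange trading system", rto_minutes=5, rpo_minutes=0)
atm_network = RecoveryObjective("bank ATM system", rto_minutes=24 * 60, rpo_minutes=24 * 60)

nightly_backup = RecoveryStrategy("nightly backup", backup_interval_minutes=24 * 60, restore_minutes=8 * 60)
sync_replication = RecoveryStrategy("synchronous replication", backup_interval_minutes=0, restore_minutes=2)

for objective in (stock_exchange, atm_network):
    for strategy in (nightly_backup, sync_replication):
        verdict = "meets" if meets_objective(objective, strategy) else "does not meet"
        print(f"{strategy.name} {verdict} the objectives of the {objective.name}")
```

Under these assumed figures, only the replication option satisfies the trading system, while either option is adequate for the ATM system, which is the kind of conclusion the RPO and RTO are meant to make evident.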
Maximum Tolerable Outage: Maximum tolerable outage is the maximum time the organisation
can support processing in alternate mode. After this point, different problems may arise, especially
if the alternate SDO is lower than the usual SDO, and the backlog of information waiting to be brought
up to date can become unmanageable.
Interruption window: Interruption window is the time the organisation can wait from the point of
failure to the restoration of critical services/applications. After this time, the progressive losses caused
by the interruption are unaffordable.
Natural
o Fire
o Flood
o Storm
o Lightning
o Power Failure
Deliberate/Intentional
o Bomb
o Sabotage
o Theft
o Strike
Accidental
o Outage
o Errors
o Disclosure
Organisations should also consider threats arising from various issues as:
Shared Premises
o Should agree who is responsible for what
o Are legally required to co-operate and co-ordinate during an emergency
Risk assessment has to be a coordinated activity with participation from personnel departments,
vendors and interested stakeholders. A sample checklist which can be used for risk assessment is
given below:
1. Risk Ranking
2. Value ranges
3. Formulae for comparing risks
4. Computer software if suitable
1. Risk Ranking
The ability of a company to cope with interruption of a business process determines the
TOLERANCE of the business process. This tolerance depends on the length of the disruption and
may also be linked to the time of the day or month the interruption occurs. In practice, tolerance is
usually expressed as a monetary amount – the cost to the company if business process is
interrupted for a given unit of time. This cost of interruption is inversely related to the tolerance.
The various business processes may be classified on their critical recovery time period.
i. Critical: These are functions that cannot be done manually under any circumstances. Unless the
company locates identical capabilities to replace the damaged capabilities, these functions cannot
be performed. These functions have zero or very low tolerance to interruption and consequently,
the cost of interruption is high.
ii. Vital: These functions can be performed manually but only for a brief period of time. There is
relatively higher tolerance to interruption as compared to critical functions and consequently
somewhat lower cost of interruption. The function classified as vital can withstand a brief
suspension of operations but cannot withstand an extended period of downtime.
iii. Sensitive: These processes can be carried out by manual means for an extended period,
though with some difficulty. They may require additional staff to perform and when restored, may
require considerable amount of time to restore the data to current or usable form.
iv. Non-Critical: These processes have a high tolerance to interruption and can be interrupted for
an extended period of time with little or no adverse consequences. Very little time is required to
restore the data to a current or usable form.
2. Value ranges
To assist in comparison, a range of values should be set for each of the following:
• Asset cost;
• Likelihood of threat happening;
• Vulnerability; and
• Assessment of the risk.
The following ranges can be used:
• 1 - Very Low;
• 2 - Low;
• 3 - Moderate;
• 4 - High;
• 5 - Very High.
Example: The risk of flooding to the premises has been assessed and the following values awarded:
Using the formula given above, the score for this particular risk is:
(5 + 3 + 2) / 3 = 10 / 3 ≈ 3.3
This translates into words as 'Moderate' risk.
Thus, a threat, which had a high impact, might still be rated highly although its occurrence would
be relatively rare. Conversely, a frequent threat (especially where the organisation was vulnerable)
could be rated highly even though its impact was only minor. The example above and the above
formulae used are simple and should be adequate for uncomplicated Risk Assessments.
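The averaging used in the worked example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the score is assumed to be the simple mean of the individual 1 to 5 ratings (as the arithmetic in the example suggests), and the mapping from score to wording is assumed to round to the nearest value in the range table.

```python
# Wording for each value in the 1-5 range used in this section.
RISK_LABELS = {1: "Very Low", 2: "Low", 3: "Moderate", 4: "High", 5: "Very High"}


def risk_score(ratings):
    """Return the mean of the individual ratings, each of which must be 1 to 5."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in RISK_LABELS:
            raise ValueError(f"rating {rating!r} is outside the 1-5 range")
    return sum(ratings) / len(ratings)


def risk_label(score):
    """Translate a score into words by rounding half up to the nearest range value."""
    # Assumption: nearest-value rounding; the text does not state the rounding rule.
    nearest = int(score + 0.5)
    return RISK_LABELS[nearest]


# The three values awarded in the flooding example.
score = risk_score([5, 3, 2])
print(f"{score:.1f} -> {risk_label(score)}")
```

A simple mean weights every factor equally; an organisation that wants impact to dominate could substitute a weighted formula without changing the rest of the sketch.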
report to senior corporate management who are not intimately involved in the
process. Even where multiple sites are covered by the plan, the membership of this
team will be relatively constant. Only the leaders from the site specific recovery team
will vary.
ii) Crisis Management/Public Relations Team: Dealing with external agencies or
interest groups is an inevitable part of any post-disaster activity. The work performed
by this team may extend beyond the provision of detail on what has happened, and
what is being done to keep the business in operation. For example, this team may
also have to handle:
[Figure: BCP team structure chart showing the Business Continuity Team, Recovery Management Team, Crisis Management Team, Facilities Team, Damage Assessment Team, Administration Team and Hardware Installation Team]
To be effective, this team must have all of the necessary information at their disposal and include
appropriate senior corporate officials who are comfortable in dealing with the media and relaxed in
front of cameras. Mishandling public relations can severely damage an organisation’s reputation
and cause more harm than the disaster itself.
Dealing with the media is a skill which must be developed, and there are consulting firms which
specialize in this service. In addition to being comfortable in the handling of the media, this team
must have a predefined plan for issuing statements, an agreed location for making those
statements and an understanding of the level of information to be issued. Experience has shown
that saying nothing or “no comment” can be more costly to the organisation than providing full and
open disclosure. If the media cannot get the information from official sources, it is very good at
finding other, not necessarily reliable, information from other sources.
The crisis management team or public relations team must be thoroughly prepared to handle the
media and all other external communications. Appropriate spokespersons should be identified and
trained. The functionalities of teams under the control of Recovery management team and Crisis
management team are elaborated below:
team staff may also be responsible for such matters as staff travel arrangements,
catering, petty cash control, telephone services, mail services, and some personnel
functions.
vi) Damage Assessment Team: Damage assessment will be one of the first activities
performed after a disaster occurs. Depending on the nature of the operations at the
site impacted, the performance of such an assessment may require a number of
different skill sets.
vii) Other teams which may be established include:
Preliminary damage assessment: Once management has been notified of the problem,
a preliminary damage assessment should be performed. This need not be a detailed
assessment, but will provide an initial indication as to whether the plan needs to be
activated.
Put recovery site on standby: Where the BCP involves the use of a commercial
recovery center, that site should be put on notice. Most vendors favour being put on notice
and appreciate an advance warning.
Assemble damage assessment team: If the preliminary assessment is not conclusive,
the full damage assessment team can be assembled. If all staff are on site when disaster
strikes, this should be a relatively easy task. However, if the incident occurs after office
hours, it will be necessary to call staff at home to notify all team members of the problem.
Determining strategy: The identification of the most appropriate strategy will typically
require a decision by the recovery management team, based on the damage assessment
report. Once an appropriate strategy has been identified, the adoption of that solution
must be approved by the senior management.
Establish emergency command center: While it may not be necessary to establish
a command center for the damage assessment team alone, once the decision has been
made to invoke the plan it will be necessary to activate that location. All members of the
Assemble and brief recovery teams: This involves notifying all team members to
report to the command center. Each notification call should include:
Details of who is calling
A brief synopsis of disaster status
Instruction to call all staff or alternates on the list of the person being called
Instruction on where to report, when and with what materials
A record of all calls made should be retained.
Notify recovery site: The recovery site to be used, including any commercial site, should be
notified of the decision to use the facility and requested to prepare the site in accordance
with the contract.
Arrange movement of backup materials: Once the decision is made to move to the
recovery site, all of the necessary materials should be recovered from off-site storage and
shipped to that location. The shipment of any required special forms from backup supplies
should also be coordinated at this time.
Notify impacted staff: Once the recovery operations are under way, the staff that will be
impacted but will not be required for recovery activities should also be notified. It is
preferable that they receive notification from the organisation rather than from the media.
File insurance claims: It may not be possible to file the claims immediately, as further
damage assessment may still be required. However, as soon as the necessary information
is available, the claims should be prepared.
Detail procedures for recovery: Step-by-step instructions for recovering systems at
the recovery site should be written down. Some of these instructions are:
assemble and check site
check off site materials and install equipment
test operating system
recover applications
test applications
hire temporary staff
update to disaster (if the recovery site is not shadowing all data
processed at the primary site, data entered up to the point of the disaster must
be entered again at the recovery site)
process backlog
configure networks and test network
establish external links
redirect mail
redirect communications
correct problem / monitor
establish controls
Primary site procedures: While the detailed recovery procedures concentrate on
restoring critical business operations at alternate facilities, the primary site must also be
rebuilt. Steps remain to be taken at that location following the damage assessment
and the decision to invoke the plan.
Return to normal operations: Once the primary site is refurbished or a new primary site
is available, operations must be relocated to that site.
Post recovery reviews: Once the return to normal operations has been completed and
approved, the normal job schedules and operating instructions should be reintroduced. In
addition, a review of the recovery operations should be performed to identify any areas in
which the plan can be improved. This post-mortem should be performed as soon as
possible to ensure that concerns and problems experienced are still clear in staff members’ minds.
The next step involves the documentation of the plan, which means creation of the BCP
Manual. A BCP Manual houses all the relevant steps of the plan that need to be followed
during a crisis. A BCP Manual contains the Disaster Recovery Plans, Business Continuity Plan and
the contingency plans. The elements of a BCP Manual are discussed in 2.10.
hardware or data communications have occurred. The objectives of testing the disaster recovery
plan are:
BCP Testing
The effectiveness of BCP has to be maintained through regular testing. The five types of tests of
BCP are:
1. Checklist test
2. Structured walk through test
3. Simulation test
4. Parallel test
5. Full interruption test
1. Checklist test: In this type of test, copies of the plan are distributed to each business unit’s
management. The plan is then reviewed to ensure that the plan addresses all procedures and
critical areas of the organisation. In reality, this is considered a preliminary step to a real test and
is not a satisfactory test in itself.
2. Structured walk through test: In this type of test, business unit management representatives
meet to walk through the plan. The goal is to ensure that the plan accurately reflects the
organisation’s ability to recover successfully, at least on paper. Each step of the plan is walked
through in the meeting and marked as performed. Major faults with the plan should be apparent
during the walkthrough.
3. Simulation test: In this type of test, all of the operational and support personnel who are
expected to perform during an actual emergency meet in a mock practice session. The objective
is to test the ability and preparedness of the personnel to respond to a simulated disaster. The
simulation may go to the point of relocating to the alternate backup site or enacting recovery
procedures, but does not perform any actual recovery process or alternate processing.
4. Parallel test: A Parallel test is a full test of the recovery plan, utilizing all personnel. The
difference between this and the full interruption test is that the primary production processing of
the business does not stop, the test processing runs in parallel to the real processing. The goal of
this type of test is to ensure that critical systems will actually run at the alternate processing backup
site. Systems are relocated to the alternate site, parallel processing is started, and the results of
the transactions and other elements are compared. This is the most common type of disaster
recovery plan testing.
5. Full interruption test: During a full interruption test, a disaster is replicated even to the point of
ceasing normal production operations. The plan is implemented as if it were a real disaster, to the
point of involving emergency services. This is a very severe test, as it can cause a disaster on its
own. It is nonetheless the most thorough way to test a disaster recovery plan, because the plan either
works or doesn’t.
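The comparison step of the parallel test (item 4 above) can be illustrated with a short sketch. The Python below is a minimal illustration only: the function name `compare_parallel_results` and the idea of keying results by a transaction ID are assumptions made for this example, not a prescribed procedure.

```python
def compare_parallel_results(primary, alternate):
    """Compare per-transaction results from the primary and alternate sites.

    Both arguments are dictionaries mapping a transaction ID to the result
    produced for that transaction (hypothetical data layout).
    """
    primary_ids = set(primary)
    alternate_ids = set(alternate)
    return {
        # Processed at the primary site but absent at the alternate site.
        "missing": sorted(primary_ids - alternate_ids),
        # Present at the alternate site only.
        "unexpected": sorted(alternate_ids - primary_ids),
        # Processed at both sites but with different results.
        "mismatched": sorted(
            tid for tid in primary_ids & alternate_ids
            if primary[tid] != alternate[tid]
        ),
    }
```

An empty list under every key would suggest that the alternate site reproduced the primary processing for the transactions compared.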
I. What happened;
II. What was tested successfully; and
III. What needs to be changed?
If a test indicates that the BCP needs to be changed, the change should be made and the test
repeated until all aspects are completed satisfactorily. When all the components have been tested
satisfactorily, the whole BCP is ready for testing. It should not be assumed that because the
components work individually there is no need to test the whole BCP. Putting it all together may
reveal problems which did not show up in lower level testing. When preparing for testing, the
participants should be given all the information and instruction they need.
1. To train recovery team participants who are required to execute plan segments in the event of
a disaster.
2. To train the management and key employees in disaster prevention and awareness and the
need for DRP.
The training of the organisation’s user management in disaster recovery planning is crucial. A DRP
must have continued support from the organisation’s user management to ensure future effective
participation in plan testing and updating. It is not solely the responsibility of the Disaster Recovery
Coordinator to initiate updates to the disaster recovery plan. User management must be aware of
the basic recovery strategy; how the plan provides for rapid recovery of their information technology
systems support structure. It is the responsibility of each recovery team participant to fully read and
comprehend the entire plan, with specific emphasis on their role and responsibilities as part of the
recovery team. On-going training of the recovery team participants will continue through plan tests
and review of the plan contents and updates provided by the Disaster Recovery Coordinator.
From the start of the BCP development project, positive action should be taken to create awareness
of the BCP during the development, testing and training phases. Briefings should be held for all
staff at an early stage of BCP development to explain the reasons for the BCP, its benefits to
everyone and how it will be developed. The organisation should also take care that all new staff are
briefed about the BCP as a part of their induction to the organisation. It should be remembered that
every test also trains the participants. If the full Recovery Teams are not used in each test, the
participants should be rotated so that they all gain an adequate experience. At the end of the testing
phase, further training and experience requirements should be identified. The Recovery Team
leaders should be consulted for their opinions as they should have the best understanding of the
present abilities of their Team members.
Training methods
The training methods to be used may include:
1. Walkthrough session
2. Scenario workshop; or
3. Live test simulation
1. Walkthrough Session
For a walkthrough session, the participants sit round a table, each with a copy of the BCP (or
appropriate part of the BCP), and ‘walk’ through it by reading and discussing each part in
sequence. Walkthrough sessions should be conducted at a quiet place without interruption
because the objective is to identify any weaknesses, errors and omissions by allowing participants'
thoughts to flow freely as they go through the plan. The only limit on discussion is that the whole
part must be read to the end. All components of the BCP should first be tested using this method
as it is highly likely to identify changes needed. One good walkthrough per component is usually
sufficient if the suggested changes are then reviewed and agreed by a few of the testers for one
Recovery Team. Links with other Teams should be noted and raised during their walkthroughs.
2. Scenario Workshop
This is similar to a walkthrough except that a scenario is devised before the workshop, and the key
members of recovery teams are involved, although it is preferable to include all teams.
thinking and doing. The objective is to identify errors, omissions and weaknesses, and to establish
whether the plan performs as intended. For this method to be effective, participants must:
1. is held outside normal working hours so that resources can be used without affecting
normal operations;
2. involves the use of the planned and contracted contingencies if this is practical;
3. requires relocation of some staff to another site if appropriate; and
4. has to be as near to real life as possible so that all aspects of the BCP, including
contingencies, are tested.
Because it is a test, some shortcuts may be taken, such as sending only a token number of people
to the contingency site, or doing only token amounts of work to prove that the operation has been
successfully recovered. However, even if shortcuts are used it must still be possible at the end of
the test to conclude that all aspects of the BCP are effective. A simulation of a live test, perhaps
more than the other methods, needs to be carefully planned to:
strategies and plans are complete, current and accurate; and identifies opportunities for
improvement. Agreements may also need to reflect the changes. If additional equipment is needed,
it must be maintained and periodically replaced when it is no longer dependable or no longer fits
the organisation’s architecture. The BCM maintenance process provides documented
evidence of the proactive management and governance of the organisation’s business continuity
program. It demonstrates that the key people who are to implement the BCM strategy and plans are
trained and competent, that the BCM risks faced by the organisation are monitored and controlled,
and that material changes to the organisation’s structure, products and services, activities,
purpose, staff and objectives have been incorporated into the organisation’s business continuity
and incident management plans. Similarly, the maintenance tasks undertaken in development of BCP are to:
Determine the ownership and responsibility for maintaining the various BCP strategies
within the organisation;
Identify the BCP maintenance triggers to ensure that any organisational, operational, and
structural changes are communicated to the personnel who are accountable for ensuring
that the plan remains up-to-date;
Determine the maintenance regime to ensure the plan remains up-to-date;
Determine the maintenance processes to update the plan; and
Implement version control procedures to ensure that the plan is maintained up-to-date.
Triggers (circumstances that cause IR team activation and IR plan initiation) are
to be defined.
What must be done to react to the particular situation is to be elaborated.
How to stop the incident if it is ongoing is also to be addressed, along with the
way in which the source of the problem can be eliminated.
Three broad categories of incident indicators are: possible, probable and definite.
Four possible indicators of actual incidents are: presence of unfamiliar files, presence or execution
of unknown programs or processes, unusual consumption of computing resources and unusual
system crashes.
Four probable indicators of actual incidents are: activities at unexpected times, presence of
new accounts, reported attacks and notification from an IDS.
Five definite indicators of an incident are: use of dormant accounts, modified or
missing logs, presence of hacker tools, notifications by partner or peer and notification by hacker.
Five events which indicate that an incident is underway are: loss of availability, loss of
integrity, loss of confidentiality, violation of policy and violation of law.
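The three indicator categories can be captured in a simple lookup table. The following Python sketch is illustrative only: the indicator strings paraphrase the lists above, while the function name `classify_indicators` and the rule of reporting the strongest category observed are assumptions made for this example.

```python
# Illustrative lookup of the indicator categories listed above.
# The wording of each entry and the "strongest category wins" rule are assumptions.
INDICATORS = {
    "possible": {
        "unfamiliar files",
        "unknown programs or processes",
        "unusual resource consumption",
        "unusual system crashes",
    },
    "probable": {
        "activity at unexpected times",
        "new accounts",
        "reported attacks",
        "ids notification",
    },
    "definite": {
        "use of dormant accounts",
        "modified or missing logs",
        "hacker tools present",
        "notification by partner or peer",
        "notification by hacker",
    },
}

# Categories ordered from weakest to strongest evidence.
STRENGTH = ["possible", "probable", "definite"]


def classify_indicators(observed):
    """Return the strongest category among the observed indicators, or None."""
    strongest = None
    for item in observed:
        for category in STRENGTH:
            if item.lower() in INDICATORS[category]:
                if strongest is None or STRENGTH.index(category) > STRENGTH.index(strongest):
                    strongest = category
    return strongest
```

For example, a report containing both "new accounts" and "modified or missing logs" would be classified as "definite" under this rule.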
Reactions to incidents
How and when to activate IR plans is determined by the IR strategy the organisation chooses to
pursue. In formulating an incident response strategy, many factors influence an organisation’s
decision. The IR plan is designed to stop the incident, mitigate its effects and provide data that
facilitates recovery. The two general categories of strategic approach for an organisation as it
responds to an incident are 1. Protect and forget and 2. Apprehend and prosecute.
Incident Notifications
As soon as the IR team determines that an incident is in progress, the appropriate people must be
notified in the correct order. The alert roster is the document containing contact information for
the individuals to be notified during an incident. There are two ways to activate the alert roster,
namely sequentially and hierarchically. The alert message should contain a scripted description of
the incident with enough information for each responder to know what to do on being alerted.
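The two activation styles can be contrasted in a short sketch. It assumes the common reading of the terms: in sequential activation one contact person calls everyone on the roster in turn, and in hierarchical activation each person calls only those listed beneath them. The function names and the dictionary layout of the call tree are hypothetical.

```python
from collections import deque


def sequential_calls(caller, roster):
    """Sequential activation: one contact person calls each individual in order."""
    return [(caller, person) for person in roster]


def hierarchical_calls(root, call_tree):
    """Hierarchical activation: each person calls only the people listed under them.

    `call_tree` maps a person to the list of people that person must call
    (hypothetical structure for this sketch).
    """
    calls = []
    queue = deque([root])
    while queue:
        person = queue.popleft()
        for callee in call_tree.get(person, []):
            calls.append((person, callee))
            queue.append(callee)
    return calls
```

The returned list of (caller, callee) pairs can also serve as the record of calls made. Sequential activation puts every call on one person, whereas hierarchical activation spreads the calls but depends on each person in the chain being reachable.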
Documenting an Incident
Documenting the incident should begin immediately after the incident is confirmed and the
notification process is underway. The documentation should record the who, what, when, where, why
and how of each action taken while the incident is occurring. The purpose is for the documentation
to serve as a case study to determine whether the right and effective actions were taken. It helps
to prove that the organisation did everything possible to prevent the spread of the incident. It
can also be used as a simulation in future training sessions on future versions of the IR plan.
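One way to make sure every action is recorded with all six elements is to use a fixed record structure. The Python sketch below is a minimal illustration; the class name `IncidentLogEntry`, the helper `record_action` and the choice of UTC timestamps are assumptions made for this example.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class IncidentLogEntry:
    """One action taken during an incident (hypothetical record layout)."""
    who: str
    what: str
    when: datetime
    where: str
    why: str
    how: str


def record_action(log, who, what, where, why, how, when=None):
    """Append an entry to the log, time-stamping it now if no time is given."""
    entry = IncidentLogEntry(
        who=who,
        what=what,
        when=when or datetime.now(timezone.utc),
        where=where,
        why=why,
        how=how,
    )
    log.append(entry)
    return entry
```

Because each entry is immutable once created, the log is harder to alter by accident, which supports its later use as evidence and as training material.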
must be made when, and if, a disaster occurs. It should also inform about the responsibility to keep
this document current. It should be approved by appropriate authority.
The overall objectives of this plan are to protect the organisation’s computing resources and
employees, to safeguard the vital records held by Information Technology Systems and to
guarantee the continued availability of essential Information Technology services. The role of this
plan is to document the pre-agreed decisions and to design and implement a sufficient set of
procedures for responding to a disaster that involves the data center and its services.
A disaster is defined as the occurrence of any event that causes a significant disruption in
Information Technology capabilities. This plan assumes the most severe disaster, the kind that
requires moving computing resources to another location. Less severe disasters are controlled at
the appropriate management level as a part of the total plan.
The basic approach, general assumptions, and possible sequence of events that need to be
followed are stated in the plan. It will outline specific preparations prior to a disaster and emergency
procedures immediately after a disaster. The plan is a roadmap from disaster to recovery. Depending
on the nature of the disaster, the steps outlined may be skipped or performed in a different sequence.
The general approach is to make the plan as threat-independent as possible. This means that it
should be functional regardless of what type of disaster occurs.
For the recovery process to be effective, the plan is organized around a team concept. Each team
has specific duties and responsibilities once the decision is made to invoke the disaster recovery
mode. The plan represents a dynamic process that will be kept current through updates, testing,
and reviews. As recommendations are completed or as new areas of concern are recognized, the
plan will be revised to reflect the current IT and business environment. The IS Auditor has to
review the process followed for preparation of the DRP and assess whether it meets the
requirements of the organisation and provide recommendations on any areas of
weaknesses identified.
1. Disaster Assessment: The disaster assessment phase lasts from the inception of the disaster
until it is under control and the extent of the damage can be assessed. Cooperation with emergency
services personnel is critical.
2. Disaster recovery activation: When the decision is made to move primary processing to
another location, this phase begins. The Disaster Recovery Management Team will assemble and
call upon team members to perform their assigned tasks. The most important function is to fully
restore operations at a suitable location and resume normal functions. Once normal operations are
established at the alternate location, Phase 2 is complete.
3. Alternate site operation/data center rebuild: This phase involves continuing operations at the
alternate location. In addition, the process of restoring the primary site will be performed.
4. Return to primary site: This phase involves the reactivation of the primary site at either the
original or possibly a new location. The activation of this site does not have to be as rushed as the
activation of the alternate recovery site. At the end of this phase, a thorough review of the disaster
recovery process should be undertaken. Any deficiencies in this plan can be corrected by updating the
plan.
2.4.5 DRP
The DRP should contain details of the vital records, including the location where each record is
stored and who is in charge of it. It also contains information about what is stored offsite, such
as:
General Responsibilities
The IT Disaster Recovery Management Team (MGMT) is responsible for the overall coordination
of the disaster recovery process from an Information Technology Systems perspective. The other
team leaders report to this team during a disaster. In addition to their management activities,
members of this team will have administrative, supply, transportation, and public relations
responsibilities during a disaster. Each of these responsibilities should be headed by a member of
the MGMT team.
General Activities
Assess the damage and if necessary, declare a disaster (damage assessment forms are
included in this plan)
Coordinate efforts of all teams
Secure financial backing for the recovery effort
Approve all actions that were not pre-planned
Give strategic direction
Be the liaison to upper management
Expedite matters through all bureaucracy
Provide counselling to those employees that request or require it
After The Disaster
Administrative Responsibilities
The administrative function provides administrative support services to any team requiring this
support. This includes the hiring of temporary help or the reassignment of other clerical personnel.
Activities by Phase
Procedures during Disaster Recovery Activation Phase
Supply Responsibilities
The supply function is responsible for coordinating the purchase of all needed supplies during the
disaster recovery period. Supplies include all computing equipment and supplies, office supplies
such as paper and pencils, and office furnishings.
Activities by Phase
Procedures during Disaster Recovery Activation Phase
Activities by Phase
All Phases
Hardware Responsibilities
The responsibility of the Hardware Team is to acquire (along with the Facilities Team), configure
and install servers and workstations for the organisation’s Information Technology users.
Activities by Phase
Procedures during Disaster Recovery Activation Phase
Notify users
Ensure data is backed up
Relocate equipment
After The Disaster
Software Responsibilities
The responsibility of the Software Team is to maintain the systems software at the alternate site
and reconstruct the system software upon returning to the primary site. In addition, the Software
Team will provide technical support to the other teams.
Activities by Phase
Procedures during Disaster Recovery Activation Phase
Network Responsibilities
The Network Team is responsible for preparing for voice and data communications to the alternate
location data center and restoring voice and data communications at the primary site.
Activities by Phase
Procedures during disaster recovery activation phase
Operations Responsibilities
The Operations responsibilities include the daily operation of computer services and management
of all backup tapes. When a disaster is declared, the team must secure the correct tapes for
transport to the alternate location. Once operations are established at the alternate location,
arrangements must be made with an offsite storage service.
Activities by Phase
Procedures during Disaster Recovery Activation Phase
The disaster recovery plan should contain a Disaster Recovery Technical Support Team Call
Checklist. It should specify the contact information of the team leader and the team members,
with details of the function for which each can be contacted. The disaster recovery plan should
also contain details about the Facility Team and its sub-teams, such as the salvage team, the new
data center team and the new hardware team, and their respective responsibilities.
Salvage Responsibilities
The Salvage Team is responsible for minimizing the damage at the primary site and for working with
the insurance company for settlement of all claims. This depends on a quick determination of what
equipment is salvageable and what is not. Repair and replacement orders will be filed for what is
not in working condition. This team is also responsible for securing the disaster recovery data
center.
Activities by Phase
Procedures during Disaster Recovery Activation Phase
Activities by Phase
Procedures during Remote Operation/Data Center Rebuild Phase
1. Servers
2. Printers
3. Switches, Routers, Hubs
4. Work stations
5. Environmental systems
6. UPS Equipment
Activities by Phase
Procedures during Disaster Recovery Activation Phase
Change control, preventative action, corrective action, document control and record
control processes;
Local Authority Risk Register;
Exercise schedule and results;
Incident log; and
Training Program
To provide evidence of the effective operation of the BCM, records demonstrating the operation
should be retained as per policy of the organisation and as per applicable laws, if any. These
records also include reference to all business interruptions and incidents, irrespective of the nature
and length of disruption. This also includes general and detailed definition of requirements as
described in developing a BCP. In this, a profile is developed by identifying resources required to
support critical functions, which include hardware (mainframe, data and voice communication and
personal computers), software (vendor supplied, in-house developed, etc.), documentation (user,
procedures), outside support (public networks, DP services, etc.), facilities (office space, office
equipment, etc.) and personnel for each business unit.
1. Purpose of the plan: Included in this section should be a summary description of the
purpose of the manual. It should be made clear that the manual does not address recovery
from day to day operational problems. Similarly, it must be stressed that the manual does
not attempt to foresee all possible disasters, but rather provides a framework within which
management can base recovery from any given disaster.
2. Organisation of the manual: A brief description of the organisation of the manual, and
the contents of each of the major sections, will provide the reader with the direction to the
relevant section of the manual in an emergency situation. Any information which is
external to the manual but will be required in an emergency should be identified in this
section.
3. Disaster definitions: It may assist the user of the manual if a definition of disaster
classification is provided, together with an identification of the relevance of the plan to that
situation. Four types of classification can generally be used:
Major disaster: Events or disruptions that cause significant impact and may have an
effect on outside clients.
Catastrophic disaster: Events or disruptions that have significant impact and adversely
affect the organisation’s “going concern” status.
The BCP manual of each organisation is expected to classify disasters after taking into account
the size and nature of its business. The time and cost associated with each kind of disaster should
be defined as per the requirements of the individual organisation. It should be noted, however, that
development of a plan based on each classification is not recommended. The need to invoke the
plan should be determined by the length and associated cost of the expected outage and not the
classification of the disaster, although there is a direct correlation. These definitions will be most
useful for communication with senior management.
4. Objectives of the plan: The objectives of the manual should be clearly stated in the
introductory section. Typically, such objectives include:
5. Scope of the plan: In order that there is no confusion as to the situations in which the plan
will apply, the scope of the plan must be clearly identified. Any limitations must be
explained.
6. Plan approach / recovery strategy: A step by step summary of the approach adopted
by the plan should be presented. For ease of reference, it may be good to provide this
overview by means of a schematic diagram. In particular, it may be useful to set up the
recovery process as a project plan in this section.
7. Plan administration: The introductory section should also identify the person or persons
responsible for the business continuity plan manual, and the expected plan review cycles.
These persons will be responsible for issuing revisions which will ensure that the plan
remains current. Because the manual will include staff assignments, it is also advisable
that the personnel or human resource function accept responsibility for notifying the plan
administrators of all personnel changes which must be reflected in the plan.
8. Plan management: Following a disaster, the normal reporting channels and lines of
management are unlikely to be strictly adhered to. During a disaster, reporting by
exception may be the only feasible way to operate. This does not however negate the
requirement for formalized management. The management responsibilities and reporting
channels to be observed during disaster recovery should be clearly established in
advance.
9. Disaster notification and plan activation procedures: The procedures represent the
first steps to be followed when any disaster occurs. It is recommended that the procedures
be written in a task oriented manner and provide a logical flow to enable ease of
management.
Dual recording of data: Under this strategy, two complete copies of the database are
maintained. The databases are concurrently updated.
Periodic dumping of data: This strategy involves taking a periodic dump of all or part of
the database. The database is saved at a point in time by copying it onto some backup
storage medium such as magnetic tape, removable disk or optical disk. The dump may be
scheduled.
Logging input transactions: This involves logging the input data transactions which
cause changes to the database. Normally, this works in conjunction with a periodic dump.
In case of complete database failure, the last dump is loaded and the transactions
logged since the last dump are reprocessed.
Logging changes to the data: This involves copying a record each time it is changed by
an update action. The changed record can be logged immediately before the update
action changes the record, immediately after, or both.
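How periodic dumping and input transaction logging work together can be shown with a toy example. The Python sketch below is illustrative only: the class `Database` and its methods are hypothetical, and the dump and log are held in memory purely for brevity, whereas in practice both would be written to separate backup media.

```python
import copy


class Database:
    """Toy key-value store illustrating a periodic dump plus an input transaction log."""

    def __init__(self):
        self.records = {}
        self._dump = {}   # last periodic dump (kept on backup media in practice)
        self._log = []    # input transactions logged since the last dump

    def apply(self, key, value):
        """Log the input transaction, then apply it to the live data."""
        self._log.append((key, value))
        self.records[key] = value

    def dump(self):
        """Take a periodic dump; earlier logged transactions are now covered by it."""
        self._dump = copy.deepcopy(self.records)
        self._log.clear()

    def recover(self):
        """Load the last dump and reprocess the transactions logged since then."""
        self.records = copy.deepcopy(self._dump)
        for key, value in self._log:
            self.records[key] = value
```

After a failure, calling `recover()` brings the data back to the state just before the failure, provided that both the dump and the log survived. This is why the two are normally stored away from the primary database.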
Apart from the database backup strategies mentioned above, it is important to implement email and
personal files backup policies. The policy can range from periodically burning DVDs with the folders
and documents of importance to more detailed and automated functions. The choice depends on and
varies with the size, nature and complexity of the situation. For example, individuals may be
responsible for taking backups of personal files and folders. Alternatively, a policy may allow
individual users to transfer personal files and folders from the PC to an allocated server space.
The data so transferred to the server will be backed up by the IT department as a part of its
routine backup. Email backups should necessarily include the address book backup. However, the most
important and critical part of the backup strategy is to include a restoration policy. Test
restoration of the data from the backup media and devices provides assurance that the data can be
restored in time of emergency; otherwise, a failed backup is a double disaster. Restoration should
be tested for all backups at least twice a year.
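A restoration test can be partly automated by restoring a backup copy to a scratch location and comparing its checksum with one recorded when the backup was taken. The Python sketch below is a minimal illustration using only the standard library; the function names `sha256_of` and `verify_restore`, and the practice of recording a SHA-256 digest at backup time, are assumptions made for this example.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(backup_copy, expected_digest):
    """Restore the backup copy to a scratch directory and check its digest.

    `expected_digest` is assumed to have been recorded when the backup was taken.
    """
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / Path(backup_copy).name
        shutil.copy2(backup_copy, restored)  # stands in for the real restore step
        return sha256_of(restored) == expected_digest
```

A result of `False` indicates that the backup copy no longer matches what was backed up, which is exactly the failure a periodic restoration test is meant to surface before an emergency.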
Types of Backup
When back-ups are taken of the system and data together, they are called total system back-ups.
An organisation has to choose the right type of back-up for each of the critical components of
IS and data to meet specific business requirements. The various types of back-ups are:
Full Back up: A full backup captures all files on the disk or within the folder selected for
backup. With a full backup system, every backup generation contains every file in the
backup set. However, the amount of time and space such a backup takes prevents it from
being a realistic proposition for backing up a large amount of data.
Incremental Backup: An incremental backup captures files that were created or
changed since the last backup, regardless of backup type. This is the most economical
method, as only the files that changed since the last backup are backed up. This saves a
lot of backup time and space. However, incremental backups are slower and more cumbersome
to restore: one has to start by recovering the last full backup, and then recover from every
incremental backup taken since, in order.
Differential Backup: A differential backup stores files that have changed since the last
full backup. Because only these files are saved, a differential backup is faster than a
full backup and more economical in its use of backup space. Restoring from a
differential backup is a two-step operation: restoring from the last full backup, and then
restoring the appropriate differential backup. The downside to using differential backup is
that each differential backup probably includes files that were already included in earlier
differential backups.
Mirror back-up: A mirror backup is identical to a full backup, with the exception that the
files are not compressed in zip files and they cannot be protected with a password. Mirror
backup is most frequently used to create an exact copy of the backup data.
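The difference between full, incremental and differential backups, and the restore sequence each implies, can be sketched as follows. This is a simplified illustration under stated assumptions: files are represented by a last-modified timestamp only, and the names `Backup`, `files_to_copy` and `restore_chain` are hypothetical rather than part of any backup product.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    kind: str      # "full", "incremental" or "differential"
    taken_at: int  # hypothetical timestamp, e.g. a day number


def files_to_copy(files, history, kind):
    """Return the file names a backup of the given kind would capture.

    files   -- dict mapping file name to its last-modified timestamp
    history -- earlier backups, oldest first
    """
    if kind not in ("full", "incremental", "differential"):
        raise ValueError(f"unknown backup kind: {kind}")
    full_times = [b.taken_at for b in history if b.kind == "full"]
    if kind == "full" or not full_times:
        return sorted(files)               # nothing to build on: copy everything
    if kind == "incremental":
        cutoff = history[-1].taken_at      # last backup of any type
    else:
        cutoff = full_times[-1]            # last full backup only
    return sorted(name for name, modified in files.items() if modified > cutoff)


def restore_chain(history):
    """Return the backups to restore, in order, to reach the latest state."""
    full_positions = [i for i, b in enumerate(history) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup available to restore from")
    chain = [history[full_positions[-1]]]
    later = history[full_positions[-1] + 1:]
    diff_positions = [i for i, b in enumerate(later) if b.kind == "differential"]
    if diff_positions:
        # The latest differential already contains every change since the full backup.
        chain.append(later[diff_positions[-1]])
        later = later[diff_positions[-1] + 1:]
    chain.extend(b for b in later if b.kind == "incremental")
    return chain
```

With a full backup followed by two incrementals, `restore_chain` returns all three backups; with a full backup followed by two differentials, it returns only the full backup and the latest differential, which matches the two-step restore described above.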
LAN Systems
Peer-to-Peer: Each node has equivalent capabilities and responsibilities. For example, five PCs
can be networked through a hub to share data.
Client/Server: Each node on the network is either a client or a server. A client can be a PC or a
printer, and it relies on a server for resources. A LAN's topology, protocol, architecture, and
nodes will vary depending on the organisation. Thus, contingency solutions for each organisation
will be different. Listed below are some of the strategies for recovery of LANs.
1. Eliminating Single Points of Failure (SPOF): When developing the LAN contingency
plan, the organisation should identify single points of failure that affect critical systems
or processes outlined in the Risk Assessment. These single points of failure are to be
eliminated by providing alternative or redundant equipment.
2. Redundant Cabling and Devices: Contingency planning should also cover threats to
the cabling system, such as cable cuts, electromagnetic and radiofrequency
interference, and damage resulting from fire, water, and other hazards. As a solution,
redundant cables may be installed when appropriate. For example, it might not be cost-
effective to install duplicate cables to every desktop. However, it might be cost-effective
to install a redundant cable between floors so that hosts on both floors could be
reconnected if the primary cable were cut. Contingency planning also should consider
network connecting devices such as hubs, switches, bridges, and routers.
3. Remote Access: Remote access is a service provided by servers and devices on the
LAN. Remote access provides a convenience for users working off-site or allows for a
means for servers and devices to communicate between sites.
Remote access can be conducted through various methods, including dialup access and virtual
private network (VPN). Remote access may serve as a means by which users can access corporate data
even when they are not in a position to reach the physical premises due to some calamity. If remote
access is established as a contingency strategy, data bandwidth requirements should be identified
and used to scale the remote access solution. Additionally, security controls such as one-time
passwords and data encryption should be implemented, if the communication traffic contains
sensitive information.
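Returning to the first strategy in the list above, single points of failure in a LAN can be identified by checking which devices, if removed, would disconnect the rest of the network. The following is a minimal brute-force sketch under that assumption; the device names and the functions `is_connected` and `single_points_of_failure` are hypothetical, and a real assessment would also weigh the criticality of each device as identified in the Risk Assessment.

```python
def is_connected(nodes, links):
    """Return True if every node can reach every other node over the links."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {node: set() for node in nodes}
    for a, b in links:
        if a in nodes and b in nodes:      # ignore links to removed devices
            adjacency[a].add(b)
            adjacency[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def single_points_of_failure(nodes, links):
    """Return devices whose loss would split the remaining network."""
    result = []
    for candidate in nodes:
        remaining = [node for node in nodes if node != candidate]
        if not is_connected(remaining, links):
            result.append(candidate)
    return sorted(result)


# Hypothetical LAN: one router, two switches, one server, two PCs
nodes = ["router", "switch-1", "switch-2", "server", "pc-1", "pc-2"]
links = [("router", "switch-1"), ("switch-1", "server"), ("switch-1", "switch-2"),
         ("switch-2", "pc-1"), ("switch-2", "pc-2")]
```

In this hypothetical layout both switches are single points of failure; adding redundant links so that each device reaches both switches removes them from the result.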
Wireless LANs
Wireless local area networks can serve as an effective contingency solution to restore network
services following a wired LAN disruption. Wireless networks do not require the cabling
infrastructure of conventional LANs; therefore, they may be installed quickly as an interim or
permanent solution. However, wireless networks broadcast the data over a radio signal, enabling
the data to be intercepted. When implementing a wireless network, security controls such as data
encryption should be implemented if sensitive information is to be communicated.
application, data is replicated among servers at each location, and users access the system from
their local server. The contingency strategies for a distributed system reflect the system's reliance
on LAN and WAN availability. Based on this fact, when developing a distributed system contingency
strategy, the following methods applicable to system backups should be considered for
decentralized systems. In addition, a distributed system should consider WAN communication link
redundancy and the possibility of using Service Bureaus and Application Service Providers (ASPs).
i. Cellular phone backup: If the regular voice system is inoperative, key employees can
be provided with cellular phones as a backup. Given that cellular phones are not run by
the major carriers from the same central offices, this also provides coverage for the loss
of the central office. Such phones could also be used on an on-going basis and could be
used to balance the load on the main PBX switch. Cellular services can also be extended
to data and facsimile transmission.
ii. Carrier call rerouting systems: Most of the major carriers now provide customers with
call rerouting services such that all calls to a given number can be rerouted to another
number temporarily. While this will not be possible in the case of a carrier outage, it can
be used for the rerouting of critical business communications following a disaster at a
client’s offices. Calls can be rerouted to a call management service, for example, to support
the client in the interim.
Mirror site
The single most reliable system backup strategy is to have fully redundant systems called an active
recovery or mirror site. While most companies cannot afford to build and equip two identical data
centers, those companies that can afford to do so have the ability to recover from almost any
disaster. This is the most reliable and also the most expensive method of systems recovery.
Hot Site
A dedicated contingency center, or ‘hot site’ is a fully equipped computer facility with electrical
power, heating, ventilation and air conditioning (HVAC) available for use in the event of a
subscriber’s computer outage. These facilities are available to a large number of subscribers on a
membership basis and use of site is on a ‘first come, first served’ basis. In addition to the computer
facility, these facilities offer an area of general office space and computer ready floor space on
which the users can build their own long term recovery configuration. Some of the vendors also
offer remote operations facilities for use in tests or emergency. Where the recovery center is in a
city other than the subscriber’s home location, this can be used to reduce the need to transport
staff and resources.
A hot site is a duplicate of the original site of the organisation, with full computer systems as well
as near-complete backups of user data. Real time synchronization between the two sites may be
used to completely mirror the data environment of the original site using wide area network links
and specialized software. Following a disruption to the original site, the hot site exists so that the
organisation can relocate with minimal losses to normal operations. Ideally, a hot site will be up
and running within a matter of hours or even less. Personnel may still have to be moved to the hot
site so it is possible that the hot site may be operational from a data processing perspective before
staff has relocated. The capacity of the hot site may or may not match the capacity of the original
site depending on the organisation's requirements. This type of backup site is the most expensive
to operate. Hot sites are popular with organisations that operate real time processes such as
financial institutions, government agencies and ecommerce providers.
Cold Site
A cold site is the least expensive type of backup site for an organisation to operate. It does not
include backed up copies of data and information from the original location of the organisation, nor
does it include hardware already set up. The lack of hardware contributes to the minimal start-up
costs of the cold site, but requires additional time following the disaster to have the operation
running at a capacity close to that prior to the disaster.
Warm Site
A warm site is a compromise between hot and cold. These sites will have hardware and
connectivity already established, though on a smaller scale than the original production site or even
a hot site. Warm sites will have backups on hand, but they may not be complete and may be
between several days and a week old. An example would be backup tapes sent to the warm site
by courier.
Data Vaults
Backups are stored in purpose built vaults. There are no generally recognized standards for the
type of structure which constitutes a vault. Commercial vaults fit into three categories:
1. Underground vaults
2. Free-standing dedicated vaults
3. Insulated chambers sharing facilities
transmits data to a service provider. Recent backups are retained locally, to speed data recovery
operations. There are a number of cloud storage appliances on the market that can be used as a
backup target, including appliances from CTERA Networks, Nasuni, StorSimple and TwinStrata.
ii. Redundancy: Providing multiple identical instances of the same system and switching to
one of the remaining instances in case of a failure (failover);
iii. Diversity: Providing multiple different implementations of the same specification and
using them like replicated systems to cope with errors in a specific implementation.
RAID levels: Levels 0, 1, and 5 are the most commonly found, and cover most requirements.
Generally, most organisations use RAID-1 to RAID-5 for data redundancy.
Electronic vaulting: Electronic vaulting is a backup type where the data is backed up to an offsite
location. The data is backed up, generally, through batch process and transferred through
communication lines to a server at an alternate location.
Database shadowing: Database shadowing uses the live processing of remote journaling, but
creates even more redundancy by duplicating the database sets to multiple servers.
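For the RAID levels mentioned above, the trade-off between usable capacity and redundancy can be worked out with simple arithmetic. The sketch below assumes identical disks and the standard layouts (striping for level 0, mirroring for level 1, single distributed parity for level 5); the function name `raid_summary` is hypothetical.

```python
def raid_summary(level, disk_count, disk_size_gb):
    """Return (usable capacity in GB, number of disk failures tolerated)."""
    if level == 0:
        if disk_count < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return disk_count * disk_size_gb, 0          # striping, no redundancy
    if level == 1:
        if disk_count < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_size_gb, disk_count - 1          # every disk holds a full copy
    if level == 5:
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disk_count - 1) * disk_size_gb, 1    # one disk's worth of parity
    raise ValueError(f"RAID level {level} is not covered by this sketch")
```

For example, four 1000 GB disks give 4000 GB with no fault tolerance at level 0, but 3000 GB with tolerance of one disk failure at level 5.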
2.9.1 Coverage
Insurance policies usually can be obtained to cover the following resources:
2.10 Summary
We can summarize the following key concepts covered in this chapter:
The development of a Business Continuity Plan can be done with the support of the BCP Policy
existing in an organisation. The BCP Policy sets the scope of the plan. Development of a BCP
involves planning the BCP as a project, which includes conducting a Business Impact Analysis and
Risk Assessment, testing the BCP, providing training and awareness, and continuous
maintenance of the BCP.
Contingency planning encompasses Incident Management planning, Disaster Recovery planning
and Business Continuity planning.
The hierarchy for invoking a Business Continuity Plan is: Incident Handling and
Response, followed by Disaster Recovery, followed by Business Continuity.
Business Continuity Management would contain the following minimum documents:
o Business Continuity Policy which documents the scope for the Business Continuity
o Business Continuity Manual which documents the step by step process to achieve
Business Continuity and details of relevant contacts.
Backup and Recovery Strategies, Types of Alternative Sites, system resiliency tools and
techniques etc., are some strategies which are considered in developing a Business Continuity
Plan.
Insurance is a mode of transferring the risk that arises due to the threats to the Business
Continuity. The various types of insurance and coverage have been discussed in this chapter.
2.11 Questions
1. Which of the following control concepts should be included in a complete test of disaster
recovery procedures?
3. All of the following are security and control concerns associated with disaster recovery
procedures EXCEPT:
4. Which of the following business recovery strategies would require the least expenditure
of funds?
A. Warm site
B. Empty shell
C. Hot site
D. Reciprocal agreement
6. Which of the following would ensure quick continuity of operations when the recovery
time window is short?
7. For which of the following applications would rapid recovery be MOST crucial?
A. Point-of-sale
B. Corporate planning
C. Regulatory reporting
D. Departmental chargeback
8. Which of the following principles must exist to ensure the viability of a duplicate
information processing facility?
A. The site is near the primary site to ensure quick and efficient recovery is achieved.
9. While reviewing the business continuity plan of an organisation, the IS auditor observed
that the organisation's data and software files are backed up on a periodic basis. Which
characteristic of an effective plan does this demonstrate?
A. Deterrence
B. Mitigation
C. Recovery
D. Response
10. As updates to an online order entry system are processed, the updates are recorded on
a transaction tape and a hard copy transaction log. At the end of the day, the order entry
files are backed up onto tape. During the backup procedure, the disk drive malfunctions
and the order entry files are lost. Which of the following are necessary to restore these
files?
A. The previous day's backup file and the current transaction tape
B. The previous day's transaction file and the current transaction tape
C. The current transaction tape and the current hardcopy transaction log
D. The current hardcopy transaction log and the previous day's transaction file
2. D. Hot sites can be made ready for operation normally within hours. However, the use of
hot sites is expensive, should not be considered as a long-term solution and does require
that equipment and systems software be compatible with the primary installation being
backed up.
3. D. The inability to resolve system deadlock is a control concern in the design of database
management systems, not disaster recovery procedures. All of the other choices are
control concerns associated with disaster recovery procedures.
4. D. Reciprocal agreements are the least expensive because they usually rely on a
gentlemen's agreement between two firms.
5. D. A UPS typically cleanses the power to ensure wattage into the computer remains
consistent and does not damage the computer. All other answers are features of a
UPS.
9. B. An effective business continuity plan includes steps to mitigate the effects of a disaster.
To have an appropriate backup plan, an organisation should have a process capability
established to restore data and files on a timely basis, mitigating the consequence of a
disaster. An example of deterrence is when a plan includes installation of firewalls for
information systems. An example of recovery is when a plan includes an organisation's
hot site to restore normal business operations.
10. A. The previous day's backup will be the most current historical backup of activity in the
system. The current day's transaction file will contain all of the day's activity. Therefore,
the combination of these two files will enable full recovery up to the point of interruption.
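The recovery described in answer 10 is a roll-forward: restore the most recent backup, then replay the transactions recorded since that backup. A minimal sketch follows, assuming an in-memory order file and a transaction log of "upsert" and "delete" entries; the record layout and the function name `roll_forward` are hypothetical.

```python
def roll_forward(backup, transaction_log):
    """Rebuild the current file from the last backup plus later transactions."""
    restored = {order_id: dict(record) for order_id, record in backup.items()}
    for entry in transaction_log:          # replay in the order recorded
        if entry["op"] == "upsert":
            restored[entry["order_id"]] = dict(entry["data"])
        elif entry["op"] == "delete":
            restored.pop(entry["order_id"], None)
        else:
            raise ValueError(f"unknown operation: {entry['op']}")
    return restored


# Hypothetical data: previous day's backup and the current day's transaction tape
previous_backup = {"ORD-1": {"qty": 10}, "ORD-2": {"qty": 3}}
current_transactions = [
    {"op": "upsert", "order_id": "ORD-2", "data": {"qty": 5}},
    {"op": "upsert", "order_id": "ORD-3", "data": {"qty": 1}},
    {"op": "delete", "order_id": "ORD-1"},
]
recovered = roll_forward(previous_backup, current_transactions)
```

The backup itself is left unmodified, so the replay can be repeated if the recovery has to be restarted.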
CHAPTER 3: AUDIT OF BUSINESS CONTINUITY
PLAN
Learning Objectives
This chapter deals with the regulatory requirements that make it mandatory for an organisation to
have Business Continuity Management. Best practices frameworks such as COBIT can be used
by adapting it as per organisation requirements to achieve effective Business Continuity
Management. This chapter provides details of audit procedures that are to be followed by the IS
Auditor. The audit is performed to provide assurance to management on the availability of the
required controls which mitigate identified risks. A good understanding of the concepts covered in
chapters 1 to 3 will help DISAs to provide assurance and consulting services in the area of BCM,
BCP and DRP.
3.1 Introduction
A business continuity plan audit is a formalized method for evaluating how business continuity
processes are being managed. The goal of an audit is to determine whether the plan is effective
and in line with the company's objectives. A business continuity plan audit should define the risks
or threats to the success of the plan and test the controls in place to determine whether or not
those risks are acceptable. An audit should also quantify the impact of weaknesses of the plan and
offer recommendations for business continuity plan improvements.
1. Human Resources
People are perhaps an organization's most obvious resource. Some functions require the effort of
specific individuals, some require specialized expertise, and some only require individuals who can
be trained to perform a specific task. Within the information technology field, human resources
include both operators (such as technicians or system programmers) and users (such as data entry
clerks or information analysts).
2. Processing capability
Traditionally contingency planning has focused on processing power (i.e., if the data center is
down, how can applications dependent on it continue to be processed?). Although the need for
data center backup remains vital, today's other processing alternatives are also important. Local
area networks (LANs), minicomputers, workstations, and personal computers in all forms of
centralized and distributed processing may be performing critical tasks.
4. Computer-based services
An organization uses many different kinds of computer-based services to perform its functions.
The two most important are normally communications services and information services.
Communications can be further categorized as data and voice communications; however, in many
organizations these are managed by the same service. Information services include any source
of information outside of the organization.
5. Physical infrastructure
For people to work effectively, they need a safe working environment and appropriate equipment
and utilities. This can include office space, heating, cooling, venting, power, water, sewage, other
utilities, desks, telephones, fax machines, personal computers, terminals, courier services, file
cabinets, and many other items. In addition, computers also need space and utilities, such as
electricity. Electronic and paper media used to store applications and data also have physical
requirements.
1. Implementation
Much preparation is needed to implement the strategies for protecting critical functions and their
supporting resources. For example, one common preparation is to establish procedures for
backing up files and applications. Another is to establish contracts and agreements, if the
contingency strategy calls for them. Existing service contracts may need to be renegotiated to add
contingency services. Another preparation may be to purchase equipment, especially to support
a redundant capability.
2. Back up
Backing up data files and applications is a critical part of virtually every contingency plan. Backups
are used, for example, to restore files after a personal computer virus corrupts the files or after a
hurricane destroys a data processing center.
3. Documentation
It is important to keep preparations, including documentation, up-to-date. Computer systems
change rapidly and so should backup services and redundant equipment. Contracts and
agreements may also need to reflect the changes. If additional equipment is needed, it must be
maintained and periodically replaced when it is no longer dependable or no longer fits the
organization's architecture.
4. Assigning responsibility
Preparation should also include formally designating people who are responsible for various tasks
in the event of a contingency. These people are often referred to as the contingency response
team. This team is often composed of people who were a part of the contingency planning team.
Process Scope
Evaluate that IT processes and IT-supported business processes are compliant with laws,
regulations and contractual requirements. Obtain assurance that the requirements have been
identified and complied with, and integrate IT compliance with overall organisation compliance.
Process Purpose
Ensure that the organisation is compliant with all applicable external requirements.
Management Practices
1. On a continuous basis, identify and monitor for changes in local and international laws,
regulations and other external requirements that must be complied with from an IT perspective.
This will be achieved by performing the following activities:
Assign responsibility for identifying and monitoring any changes of legal, regulatory and
other external contractual requirements relevant to the use of IT resources and the
processing of information within the business and IT operations of the organisation.
Identify and assess all potential compliance requirements and the impact on IT activities
in areas such as data flow, privacy, internal controls, financial reporting, industry-specific
regulations, intellectual property, health and safety.
Assess the impact of IT-related legal and regulatory requirements on third-party contracts
related to IT operations, service providers and business trading partners.
Obtain independent counsel, where appropriate, on changes to applicable laws,
regulations and standards.
Maintain an up-to-date log of all relevant legal, regulatory and contractual requirements,
their impact and required actions.
Maintain a harmonized and integrated overall register of external compliance
requirements for the organisation.
2. Review and adjust policies, principles, standards, procedures and methodologies to ensure that
legal, regulatory and contractual requirements are addressed and communicated. Consider
industry standards, codes of good practice, and best practice guidance for adoption and
adaptation. This can be achieved by performing the following activities:
4. Obtain and report assurance of compliance and adherence with policies, principles, standards,
procedures and methodologies. Confirm that corrective actions to address compliance gaps are
closed in a timely manner. To implement this management practice, the following activities are to be
performed:
Obtain regular confirmation of compliance with internal policies from business and IT
process owners and unit heads.
Perform regular (and, where appropriate, independent) internal and external reviews to
assess levels of compliance.
If required, obtain assertions from third-party IT service providers on levels of their
compliance with applicable laws and regulations.
If required, obtain assertions from business partners on levels of their compliance with
applicable laws and regulations as they relate to intercompany electronic transactions.
Monitor and report on non-compliance issues and, where necessary, investigate the root
cause.
Integrate reporting on legal, regulatory and contractual requirements at an organisation
wide level, involving all business units.
Indian legislations
There are various Indian legislations such as the Information Technology Act, Indian Income Tax
Act, Central Sales Tax Act, State VAT Acts, Services Tax Act, Central Excise Act etc. which require
data retention for a specific number of years. Organisations which have to comply with these
requirements have to ensure that they have a proper business continuity plan which meets these
requirements. The Reserve Bank of India provides regular guidelines to financial institutions
covering various aspects of IT deployment. These guidelines cover business continuity and
disaster recovery procedures for various types of business operations which are dependent on the IT
environment.
Bank Audit
The Long Form Audit Report in the case of statutory audit of banks contains two key points relating
to business continuity and disaster recovery which need to be evaluated and commented on by the
statutory auditor.
Whether regular back-ups of accounts and off-site storage are maintained as per the
guidelines of the controlling authorities of the bank?
Whether adequate contingency and disaster recovery plans are in place for
loss/encryption of data?
The first point may be irrelevant in case of audit of branches where core banking solution is
implemented. However, a general review of the contingency and disaster recovery plans has to be
made by auditor and required comments provided. In case of internal audit or concurrent audit of
banks, there are specific areas of BCP which need to be reviewed by the auditors.
adapted as required. Below is an extract of management practices and activities from COBIT which
is applicable to BCP.
Process Purpose
Continue critical business operations and maintain availability of information at a level acceptable
to the organisation in the event of a significant disruption.
Identify internal and outsourced business processes and service activities that are critical
to the organisation operations or necessary to meet legal and/or contractual obligations.
Identify key stakeholders and roles and responsibilities for defining and agreeing on
continuity policy and scope.
Define and document the agreed-on minimum policy objectives and scope for business
continuity and embed the need for continuity planning in the organisation culture.
Identify essential supporting business processes and related IT Services.
2. Maintain a continuity strategy: Evaluate business continuity management options and choose
a cost-effective and viable continuity strategy that will ensure organisation recovery and continuity
in the face of a disaster or other major incident or disruption.
Identify potential scenarios likely to give rise to events that could cause significant disruptive
incidents.
Conduct a business impact analysis to evaluate the impact over time of a disruption to
critical business functions and the effect that a disruption would have on them.
Establish the minimum time required to recover a business process and supporting IT
based on an acceptable length of business interruption and maximum tolerable outage.
Assess the likelihood of threats that could cause loss of business continuity and identify
measures that will reduce the likelihood and impact through improved prevention and
increased resilience.
Analyze continuity requirements to identify the possible strategic business and technical
options.
Determine the conditions and owners of key decisions that will cause the continuity plans
to be invoked.
Identify resource requirements and costs for each strategic technical option and make
strategic recommendations.
Obtain executive business approval for selected strategic options.
3. Develop and implement a business continuity response: Develop a business continuity plan
(BCP) based on the strategy that documents the procedures and information in readiness for use
in an incident to enable the organisation to continue its critical activities.
Define the incident response actions and communications to be taken in the event of
disruption. Define related roles and responsibilities, including accountability for policy and
implementation.
Develop and maintain operational BCPs containing the procedures to be followed to
enable continued operation of critical business processes and/or temporary processing
arrangements, including links to plans of outsourced service providers.
Ensure that key suppliers and outsource partners have effective continuity plans in place.
Obtain audited evidence as required.
Define the conditions and recovery procedures that would enable resumption of business
processing, including updating and reconciliation of information databases to preserve
information integrity.
Define and document the resources required to support the continuity and recovery
procedures, considering people, facilities and IT infrastructure.
Define and document the information backup requirements required to support the plans,
including plans and paper documents as well as data files, and consider the need for
security and off-site storage.
Determine required skills for individuals involved in executing the plan and procedures.
Distribute the plans and supporting documentation securely to appropriately authorised
interested parties and make sure they are accessible under all disaster scenarios.
4. Exercise, test and review the BCP: Test the continuity arrangements on a regular basis to
exercise the recovery plans against predetermined outcomes and to allow innovative solutions to
be developed and help to verify over time that the plan will work as anticipated.
Define objectives for exercising and testing the business, technical, logistical,
administrative, procedural and operational systems of the plan to verify completeness of
the BCP in meeting business risk.
Define and agree with stakeholders on exercises that are realistic, validate continuity
procedures, and include roles and responsibilities and data retention arrangements that
cause minimum disruption to business processes.
Assign roles and responsibilities for performing continuity plan exercises and tests.
Schedule exercises and test activities as defined in the continuity plan.
Conduct a post-exercise debriefing and analysis to consider the achievement.
Develop recommendations for improving the current continuity plan based on the results
of the review.
5. Review, maintain and improve the continuity plan: Conduct a management review of the
continuity capability at regular intervals to ensure its continued suitability, adequacy and
effectiveness. Manage changes to the plan in accordance with the change control process to
ensure that the continuity plan is kept up to date and continually reflects actual business
requirements.
Review the continuity plan and capability on a regular basis against any assumptions
made and current business operational and strategic objectives.
Consider whether a revised business impact assessment may be required, depending on
the nature of the change.
Recommend and communicate changes in policy, plans, procedures, infrastructure, and
roles and responsibilities for management approval and processing via the change
management process.
Review the continuity plan on a regular basis to consider the impact of new or major
changes to: organisation, business processes, outsourcing arrangements, technologies,
infrastructure, operating systems and application systems.
6. Conduct continuity plan training: Provide all concerned internal and external parties with
regular training sessions regarding the procedures and their roles and responsibilities in case of
disruption.
Define and maintain training requirements and plans for those performing continuity
planning, impact assessments, risk assessments, media communication and incident
response. Ensure that the training plans consider frequency of training and training
delivery mechanisms.
Develop competencies based on practical training including participation in exercises and
tests.
Monitor skills and competencies based on the exercise and test results.
7. Manage backup arrangements: Maintain availability of business critical information.
Process Purpose
Maintain service availability, efficient management of resources, and optimization of system
performance through prediction of future performance and capacity requirements.
Identify only those solutions or services that are critical in the availability and capacity
management process.
Map the selected solutions or services to application(s) and infrastructure (IT and facility)
on which they depend to enable a focus on critical resources for availability planning.
Collect data on availability patterns from logs of past failures and performance monitoring.
Use modelling tools that help predict failures based on past usage trends and
management expectations of new environment or user conditions.
Create scenarios based on the collected data, describing future availability situations to
illustrate a variety of potential capacity levels needed to achieve the availability
performance objective.
Determine the likelihood that the availability performance objective will not be achieved
based on the scenarios.
Determine the impact of the scenarios on the business performance measures (e.g.,
revenue, profit, customer services). Engage the business line, functional (especially
finance) and regional leaders to understand their evaluation of impact.
Ensure that business process owners fully understand and agree to the results of this
analysis. From the business owners, obtain a list of unacceptable risk scenarios that
require a response to reduce risk to acceptable levels.
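As a rough illustration of the scenario activities above, the following minimal Python sketch estimates the likelihood that an availability performance objective will not be achieved. The scenario names, probabilities, downtime figures and the 99.5% objective are all hypothetical values chosen for illustration; they do not come from COBIT or from any real organisation.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    name: str
    probability: float      # assumed likelihood that the scenario occurs
    downtime_hours: float   # projected downtime over the period in this scenario

def availability(period_hours: float, downtime_hours: float) -> float:
    """Fraction of the period during which the service is available."""
    return (period_hours - downtime_hours) / period_hours

def likelihood_of_missing_objective(scenarios, period_hours, objective):
    """Sum the probabilities of the scenarios whose availability is below the objective."""
    return sum(
        s.probability
        for s in scenarios
        if availability(period_hours, s.downtime_hours) < objective
    )

# Illustrative figures only (assumptions, not taken from the source).
PERIOD_HOURS = 720.0   # a 30-day month
OBJECTIVE = 0.995      # hypothetical availability objective of 99.5%

scenarios = [
    Scenario("normal load", 0.70, 1.0),
    Scenario("peak season", 0.20, 3.0),
    Scenario("peak season with hardware failure", 0.10, 8.0),
]

risk = likelihood_of_missing_objective(scenarios, PERIOD_HOURS, OBJECTIVE)
print(f"Likelihood of missing the objective: {risk:.0%}")
```

With these assumed figures only the third scenario falls below the objective, so the estimated likelihood equals that scenario's probability. In practice the probabilities would be derived from logs of past failures and agreed with the business process owners.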
3. Plan for new or changed service requirements: Plan and prioritise availability, performance
and capacity implications of changing business needs and service requirements.
Establish a process for gathering data to provide management with monitoring and
reporting information for availability, performance and capacity workload of all
information-related resources.
Provide regular reporting of the results in an appropriate form for review by IT and
business management and communication to organisation management.
Integrate monitoring and reporting activities in the iterative capacity management
activities (monitoring, analysis, tuning and implementations).
Provide capacity reports to the budgeting processes.
5. Investigate and address availability, performance and capacity issues: Address deviations
by investigating and resolving identified availability, performance and capacity issues.
Identify performance and capacity gaps based on monitoring current and forecasted
performance. Use the known availability, continuity and recovery specifications to classify
resources and allow prioritization.
Define corrective actions (e.g., shifting workload, prioritizing tasks or adding resources)
when performance and capacity issues are identified.
Integrate required corrective actions into the appropriate planning and change
management processes.
Define an escalation procedure for swift resolution in case of emergency capacity and
performance problems.
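The gap identification and prioritization described above can be sketched in a few lines of Python. This is only an illustrative sketch: the resource names, load figures and the `recovery_priority` field are hypothetical, and a real assessment would take them from monitoring data and from the organisation's continuity and recovery specifications.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    capacity: float          # maximum sustainable load (e.g. requests per second)
    current_load: float      # load observed through monitoring
    forecast_load: float     # load expected from capacity forecasting
    recovery_priority: int   # 1 = most critical (assumed to come from recovery specifications)

def capacity_gap(resource: Resource) -> float:
    """Shortfall between the higher of current/forecast load and capacity (0 if none)."""
    peak = max(resource.current_load, resource.forecast_load)
    return max(0.0, peak - resource.capacity)

def gaps_by_priority(resources):
    """Return (resource, gap) pairs that have a shortfall, most critical resource first."""
    with_gaps = [(r, capacity_gap(r)) for r in resources if capacity_gap(r) > 0]
    return sorted(with_gaps, key=lambda pair: (pair[0].recovery_priority, -pair[1]))

# Illustrative figures only (assumptions, not taken from the source).
resources = [
    Resource("payments database", 1000.0, 800.0, 1200.0, 1),
    Resource("reporting server", 500.0, 300.0, 350.0, 3),
    Resource("web front end", 2000.0, 2100.0, 2300.0, 2),
]

for resource, gap in gaps_by_priority(resources):
    print(f"{resource.name}: shortfall of {gap:.0f} (priority {resource.recovery_priority})")
```

Sorting by recovery priority first, rather than by the size of the gap, reflects the guidance to use continuity and recovery specifications to classify resources; an organisation could reasonably choose a different ordering.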
For more information, please visit: www.isaca.org/cobit.
3.3.3 ITIL
Information Technology Infrastructure Library (ITIL), which originated in the UK, is a collection of best practices in
IT service management, consisting of a series of books giving guidance on the provision of quality
IT services. ITIL is drawn from the public and private sectors internationally, supported by a
comprehensive qualification scheme and accredited training organisations. ITIL is the most widely
adopted approach for IT Service Management in the world. It provides a practical, no-nonsense
framework for identifying, planning, delivering and supporting IT services to the business. It
includes descriptions of best practice in information security management as well as other related
disciplines. For more information, please visit: www.itil-officialsite.com.
3.3.4 SSAE 16
Statement on Standards for Attestation Engagements (SSAE) No. 16, known as SSAE 16, has
been put forth by the Auditing Standards Board (ASB) of the American Institute of Certified Public
Accountants (AICPA). SSAE 16 is an “attest” standard that closely mirrors its international
“assurance” equivalent, ISAE 3402, which was issued by the International Auditing and Assurance
Standards Board (IAASB), a standard-setting board of the International Federation of Accountants
(IFAC). SSAE 16 is generally applicable when an auditor (called the “user
auditor”) is auditing the financial statements of an entity (“user organisation”) that obtains services
from another organisation (“service organisation”). The service organisations that provide such
services could be application service providers, bank trust departments, claims processing centers,
Internet data centers, or other data processing service bureaus. For more information, please
visit: www.ssae16.org
i. Automated Tools: Automated tools make it possible to review large computer systems
for a variety of flaws in a short time period. They can be used to find threats and
vulnerabilities such as weak access controls, weak passwords, lack of integrity of the
system software, etc.
ii. Internal Control Auditing: This includes inquiry, observation and testing. The process
can detect illegal acts, errors, irregularities or lack of compliance with laws and regulations.
iii. Disaster and Security Checklists: A checklist can be used against which the system
can be audited. The checklist should be based upon disaster recovery policies and
practices, which form the baseline. Checklists can also be used to verify changes to the
system from a contingency point of view.
iv. Penetration Testing: Penetration testing can be used to locate vulnerabilities in the
network.
During the term of the agreement, it serves as the standard for measuring and adjusting the
services. Service levels are often defined to include hardware and software performance targets
(such as user response time and hardware availability) but can also include a wide range of other
performance measures. Such measures might include financial performance measures (such as
year to year incremental cost reduction), human relationship measures (such as resource planning,
staff turnover, development and training) or risk management measures (compliance with control
objectives).
The IS auditor should be aware of the different types of measures available and should ensure that
they are comprehensive and include risk, security and control measures as well as efficiency and
effectiveness measures.
Where the functions of a BCP are outsourced, the IS auditor should determine how management
gains assurance that the controls at the third party are properly designed and operating effectively.
Several techniques can be used by management, including questionnaires, onsite visits or an
independent third-party assurance report such as an SSAE 16 SOC 1 report or SOC 2 or SOC 3
report.
3. Designing Test Plans and Conducting Tests of the BCP/DRP. CAs can design plans
that can be used by the management for regular testing of the BCP. They can also
evaluate the tests that have been conducted by the management.
4. Consultancy Services in revising and updating the BCP/DRP. Maintenance of the BCP is
a periodic process. Technologies evolve and the Business Environment often changes
and hence it is necessary to revise and update the BCP.
5. Conducting Pre Implementation Audit, Post Implementation Audit, General Audit of the
BCP/DRP.
A Chartered Accountant can provide assurance on whether the BCP would suffice for the
organisation.
6. Consultancy Services in Risk Assessment and Business Impact Analysis. Conducting a
proper Business Impact Analysis and assessing the risks that are present in the
organisation’s environment is really crucial for the correct development of the BCP/DRP.
CAs can help in the development stages by conducting BIA and Risk Assessment for
the organisation.
7. CAs can be involved in any/all areas of BCP implementation or review. These areas could
be pertaining to:
a. Risk Assessment
b. Business Impact Assessment
c. Disaster Recovery Strategy Selection
d. Business Continuity Plan Development
e. Fast-track Business Continuity Development
f. BCP / DRP Audit, Review and Health-check Services
g. Development and Management of BCP / DRP Exercises and Rehearsals
h. Media Management for Crisis Scenarios
i. Business Continuity Training
3.6 Summary
A BCP is not merely about information technology assets but is also about people's reactions in
case of a crisis. In a crisis, people have to assume responsibilities that are different from their
normal day to day tasks. This requires a series of coordinated actions on the part of the personnel
involved. A BCP is rarely a standalone document. It is, usually, part of a set of documents... There
may be a separate plan, the Cyber Incident Response Plan, to take care of threats like computer
viruses and network intrusions. An Occupant Emergency Plan (OEP) may be in use for the
evacuation of premises during a fire or medical emergencies. Insurance is yet another tool that
supplements BCP. Monetary losses can be minimized by transferring certain risks to an insurance
company on the payment of a premium. A BCP that exists on paper without being tested serves
no useful purpose. The worst possible way to "test" is to see whether it works during a real disaster.
Ideally, while framing the objectives of the BCP, the organisation spells out the "acceptance
criteria", that is, the tests that will validate the BCP. Testing a BCP can be a complex undertaking
as many personnel will have to carry out the tests even while continuing with normal operations.
After the completion of Chapter 3, we can summarize the broad coverage as follows:
IS Auditor has to understand BCP processes and key activities for each of the key processes.
This chapter has provided an overview of the BCP processes.
Regulations such as the Sarbanes-Oxley Act, HIPAA and Basel II make it mandatory for an
organisation to have Business Continuity Management. Standards such as ISO 22301, 27000
etc. and frameworks such as COBIT and ITIL lay down the steps that could be followed by the
management of the organisation to have efficient Business Continuity Management practices.
Audit processes that are to be followed by an IS Auditor. A control is always placed against an
identified risk by the management. It is essential for an IS Auditor to verify the controls that
have been put in place by the management for adequacy and existence. IS auditor has to
follow the standard auditing procedures and guidance notes (if any) issued by the governing
bodies (Like ICAI in case of Chartered accountants, ISACA in case of certified information
system auditors etc.) while discharging their duties.
3.7 References
www.icai.org
www.isaca.org
www.csoonline.com
www.thebci.org
www.aicpa.org
www.iso.org
www.bsigroup.org
3.8 Questions
1. An IS auditor reviewing an organisation's information systems disaster recovery plan
should verify that it is:
2. Which of the following would an IS auditor consider to be the MOST important to review
when conducting a business continuity audit?
3. Which of the following findings would an IS auditor be MOST concerned about when
performing an audit of backup and recovery and the offsite storage vault?
4. A company performs full back-up of data and programs on a regular basis. The primary
purpose of this practice is to:
7. Which of the following offsite information processing facility conditions would cause an IS
auditor the GREATEST concern?
8. Which of the following methods of results analysis, during the testing of the business
continuity plan (BCP), provides the BEST assurance that the plan is workable?
A. Quantitatively measuring the results of the test
B. Measurement of accuracy
C. Elapsed time for completion of prescribed tasks
D. Evaluation of the observed test results
10. Which of the following would be of MOST concern for an IS auditor reviewing back-up
facilities?
2. D. Without data to process, all other components of the recovery effort are in vain. Even
in the absence of a plan, recovery efforts of any type would not be practical without data
to process.
3. C. More than one person would need to have a key to the vault and location of the vault
is important, but not as important as the files being synchronized. Choice A is incorrect
because more than one person would typically need to have a key to the vault to ensure
that individuals responsible for the offsite vault can take vacations and rotate duties.
Choice B is not correct because the IS auditor would not be concerned whether paper
documents are stored in the offsite vault. In fact, paper documents such as procedural
documents and a copy of the contingency plan would most likely be stored in the offsite
vault.
4. B. Back-up procedures are designed to restore programs and data to a previous state
prior to computer or system disruption. These backup procedures merely copy data and
do not test or validate integrity. Back-up procedures will also not prevent changes to
program and data. On the contrary, changes will simply be copied. Although backup
procedures can ease the recovery process following a disaster, they are not sufficient in
themselves.
review of program code and documentation generally does not provide evidence
regarding recovery/restart procedures.
6. C. Adequate fire insurance and fully tested backup processing facilities are important
elements for recovery, but without the offsite storage of transaction and master files,
it is generally impossible to recover. Regular hardware maintenance does not relate
to recovery.
7. A. The offsite facility should not be easily identified from the outside. Signs identifying
the company and the contents of the facility should not be present. This is to prevent
intentional sabotage of the offsite facility should the destruction of the originating site
be from malicious attack. The offsite facility should not be subject to the same natural
disaster that affected the originating site. The offsite facility must also be secured and
controlled just as the originating site. This includes adequate physical access controls
such as locked doors, no windows and human surveillance.
10. C. Adequate fire insurance and fully tested backup processing facilities are important
elements for recovery, but without the offsite storage of transaction and master files, it is
generally impossible to recover. Regular hardware maintenance does not relate to
recovery.
SECTION 3: APPENDIX
CHECKLISTS AND CONTROL MATRIX
Appendix 1: Checklist for a Business Continuity Plan and
Audit
Process Objectives:
This checklist is to be used by the IS Auditor who is conducting the BCP Audit. This checklist
covers the entire BCP Process but it has to be customized as per the specific needs of the
assignment. An IS Auditor can use this checklist as a basis for recording observations and for
collecting evidence for the Audit engagement. This checklist is an illustrative example as to
how an IS Auditor could conduct a BCM Audit at an organisation. It can be taken as a base for
conducting such audit engagements.
Sl. No. Checkpoints/Particulars
Policy and Procedure
1. Is business continuity plan documented and implemented?
2. Whether the scope and objectives of a BCP are clearly defined in the policy
document?
(Scope to cover all critical activities of business. Objectives should clearly spell
out outcomes of the BCP)
3. Whether there exist any exceptions to the scope of BCP i.e. in terms of location
or any specific area, and whether the management has justifications for
exclusion of the same.
4. What is the time limit for such exclusion and what is the current strategy of
covering such exclusions
5. Are the policy and procedure documents approved by the Top Management?
(Verify sign off on policy and procedure documents and budget allocations
made by the management for a BCP)
6. Does the business continuity plan ensure the resumption of IS operations
during major information system failures?
(Verify that the IS disaster recovery plan is in line with strategies, goals and
objectives of corporate business continuity plan).
7. Are users involved in the preparation of business continuity plan?
(Managerial, operational, administrative and technical experts should be
involved in the preparation of the BCP and DRP).
8. Do the policy and procedure documents include the following?
List of critical information assets.
List of vendors for service level agreements.
Current and future business operations.
Identification of potential threats and vulnerabilities.
Business impact analysis.
Involvement of technical and operational experts in preparation of BCP and
Disaster recovery plans.
Recovery procedure to minimize losses and interruptions in business
operations.
Disaster recovery teams.
Training and test drills.
Compliance with statutory and regulatory requirements
9. Are the BCP policy and procedures circulated to all concerned?
(Verify availability and circulation of the BCP & DRP to all concerned, including
onsite and offsite storage).
10. Is the business continuity plan updated and reviewed regularly?
(Verify minutes of meeting where policy and procedures are reviewed. Verify
amendments made to the policy and procedure documents due to the change
in business environment).
Risk Assessment
1. Has the management identified potential threats/vulnerabilities to business
operations?
(Verify the business environment study report and the risk assessment report.)
2. Are the risks evaluated by the Management?
(Verify the probability or occurrence of the threat / vulnerability review carried
out by the management).
3. Has the organisation selected the appropriate method for risk evaluation?
4. Has the organisation carried out the assessment of internal controls?
(Verify the internal controls mitigating the risk).
5. Has the organisation taken an appropriate decision on the risks identified?
(Verify the decision-making on the options - accepted, reduced, avoided or
transferred – for the risks identified).
6. Is the risk assessment carried out at regular intervals?
(Verify the review frequency.)
Business Impact Analysis
1. Does the organisation carry out business impact analysis (BIA) for business
operations?
2. Has the organisation identified a BIA team?
3. Are RTO and RPO defined by the management?
4. Whether the SDO has been defined based upon RTO & RPO
5. Whether the organisation has measured BIA?
(Impact of risks on business operations can be measured in the form of
business loss, loss of goodwill etc.)
6. Is the business impact analysis carried out at a regular interval?
Development and Implementation of the BCP and DRP
1. Has the organisation prioritized recovery of interrupted business operations?
(Prioritization of activities is based on RTO and RPO)
2. Has the organisation identified the various BCP and DRP Teams?
(Verify employees are identified, informed and trained to take an action in the
event of disaster).
3. Are the responsibilities for each team documented?
(Verify the roles and responsibilities assigned to employees for actions to be
taken in the event of incident/disaster)
4. Does the BCP document(s) include the following?
Scope and objective.
Roles and responsibilities of BCP and DRP Teams.
Incident declaration.
Contact list.
Evacuation and stay-in procedure.
Activity priorities.
Human resource and welfare procedure.
Escalation procedures.
Procedure for resumption of business activities.
Media communication.
Legal and statutory requirements.
Backup and restore procedures.
Offsite operating procedures
5. Are the copies of up-to-date BCP Documents stored offsite?
6. Does the offsite facility have the adequate security requirements?
(Verify the logical access, physical access and environmental control of the
offsite).
7. Does the BCP include training for employees?
(Verify the evidence of training given).
8. Whether the organisation has adequate media and document backup and
restoration procedures?
(Verify the backup and restoration schedules adopted by the organisation)
9. Are logs for backup and restoration maintained and reviewed?
(Verify the logs maintained and review of the same by an independent person).
10. Whether the media library has adequate access control?
(Verify the physical and logical access controls to the media library).
11. Are the BCP and DRP communicated to all the concerned?
(Verify availability and circulation of BCP & DRP to all concerned, including
Onsite and offsite storage).
Maintenance of BCP and DRP
1. Whether the business continuity plan is tested at regular intervals?
2. Has the organisation reviewed the gap analysis of testing results?
(Review process that includes a comparison of test results to the planned
results).
3. How has the organisation decided to reduce the gaps identified, and what is the
time limit set for addressing the same?
4. Has the organisation got a testing plan?
(Verify copy of test plan and updates).
5. Are test drills conducted at appropriate intervals?
6. Does the organisation document and analyse testing results?
(Verify the copies of test results and analysis of the report).
7. Has the organisation prepared action points to rectify the testing results?
(Verify the corrective action plan for all problems encountered during the test
drill).
8. Does the organisation carry out retesting activity for action points?
(Verify the evidences of retesting activities).
9. Does the organisation review the BCP and DRP at regular intervals?
10. Whether a review of the BCP includes the following?
BCP policy and procedure
Scope and exclusion of BCP
Inventory of IS assets
Validating assumptions made during risk assessment and preparation of BCP
and DRP
Risk assessment
Business impact analysis
Back up of system and data
Training to employees
Test drills
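Several checkpoints above ask whether RTO and RPO are defined, whether recovery is prioritised on that basis, and whether backup schedules are adequate. The following minimal Python sketch shows one way an IS Auditor might record and test such figures. The process names and hour values are hypothetical, and the two tests (backup interval against RPO, last tested recovery time against RTO) are an illustrative interpretation of the checkpoints rather than a prescribed audit procedure.

```python
from dataclasses import dataclass

@dataclass
class Process:
    name: str
    rto_hours: float               # recovery time objective
    rpo_hours: float               # recovery point objective
    backup_interval_hours: float   # time between successive backups
    tested_recovery_hours: float   # recovery time observed in the last test drill

def recovery_order(processes):
    """Order processes for recovery: shortest RTO first, ties broken by shorter RPO."""
    return sorted(processes, key=lambda p: (p.rto_hours, p.rpo_hours))

def findings(process):
    """List the ways a process fails to meet its own objectives."""
    issues = []
    if process.backup_interval_hours > process.rpo_hours:
        issues.append("backup interval exceeds RPO")
    if process.tested_recovery_hours > process.rto_hours:
        issues.append("tested recovery time exceeds RTO")
    return issues

# Illustrative figures only (assumptions, not taken from the source).
processes = [
    Process("payroll", 48.0, 24.0, 24.0, 30.0),
    Process("order processing", 4.0, 1.0, 0.5, 3.0),
    Process("customer support", 8.0, 4.0, 24.0, 10.0),
]

for p in recovery_order(processes):
    issues = findings(p)
    status = "; ".join(issues) if issues else "no exceptions noted"
    print(f"{p.name} (RTO {p.rto_hours:g}h, RPO {p.rpo_hours:g}h): {status}")
```

A backup taken every 24 hours can lose up to 24 hours of data, so it cannot satisfy a 4-hour RPO; this is the reasoning behind the first test. Any exception reported this way would still need to be confirmed against the organisation's documented BIA and test reports.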
Risk: ... would lead to data being restored but not possible to render such retrieved data due to lack of software.
Control: ... prescribed in the Business Continuity Policy and signoffs should be obtained along with notification from the backup utility.
Audit procedures:
• Review information backup procedures in general. The availability of backup data could be critical in minimizing the time needed for recovery.
• Determine if the disaster recovery/business resumption plan covers procedures for disaster declaration, general shutdown and migration of operations to the backup facility.

Risk: Insufficient testing of the BCP could lead to difficulties at the time of actual disaster.
Control: Testing and revisions should be a part of the Policy. Test Plans drafted should be executed and reports regarding the tests should be maintained.
Audit procedures:
• Determine if a test plan exists and to what extent the disaster recovery/business resumption plan has been tested.
• Determine if a testing schedule exists and is adequate (at least annually). Verify the date of the last test. Determine if weaknesses identified in the last tests were corrected.

Risk: Lack of the resources that are essential to execute a DRP/BCP will lead to a failed execution.
Control: The required resources should be procured and preserved. The resources need to be reviewed periodically and changed as there could be wear and tear due to efflux of time. The required resources should be in accordance with the BCP/DRP.
Audit procedures:
• Determine if resources have been made available to maintain the disaster recovery/business resumption plan and keep it current.
• Have resources been allocated to prevent the disaster recovery/business resumption plan from becoming outdated and ineffective?

Risk: BCP/DRP without correct review of the existing plans, Business Impact Analysis and Risk Assessment would lead to ...
Control: BCP/DRP which has been made for the organisation should be relevant and adequate to the size and nature of the organisation.
Audit procedures:
• Obtain and review the existing disaster recovery/business resumption plan.
• Obtain and review plans for disaster recovery/business ...
Observation
Max Infotech does not have an alternate disaster recovery site. In addition, there is no documented
Disaster Recovery Plan (DRP) or business continuity plan.
Exposure
The DRP is a key plan ensuring availability of resources critical to the business operations. In
the absence of documented procedures and policies for the same, it may be difficult to recover
from a disaster resulting in non-availability of data and applications to the users for an unacceptable
period of time, thereby interrupting business processes and impacting the business.
Cause
Recommendation
Ensure that Max Infotech has an alternate disaster recovery site and documented
procedures and policies for disaster recovery. This document should include: