Chapter 4

ITSM (Information Technology Service Management) is a set of guidelines for managing IT services. It is split into two groups - Service Delivery and Service Support - which contain eleven disciplines for areas like financial management, capacity planning, availability management, change management, and incident management. The goal of ITSM is to align IT services with business needs by setting clear expectations and allowing customers and IT to assess service delivery. Business continuity planning (BCP) and disaster recovery (DR) are important parts of ensuring critical IT services can continue after a disruption through strategies, testing, and backup/restoration.

Uploaded by

Sachal Raja

 ITSM is a set of guidelines for different aspects of best-practice data management.
 Split into two groups and eleven disciplines.
 The two groups inside of ITSM are Service Delivery and Service Support.


 Service Delivery
◦ IT Financial Management
◦ Capacity Management
◦ Availability Management
◦ IT Continuity Management
◦ Service Level Management
 Service Support
◦ Change Management
◦ Release Management
◦ Problem Management
◦ Incident Management
◦ Configuration Management
◦ Service Desk
 Provides a way to align the IT services with the business requirements.
 Customers and IT personnel can discuss and assess how well a service is being delivered.


 Primary Objective:
◦ Provide a way for setting clear expectations with both customers and user groups
 The primary management of IT services
 Service Level Management is dependent
upon all the other areas of service delivery
 Business processes associated with SLM:
◦ Reviewing existing services
◦ Negotiating with the customers
◦ Implementation of Service Improvement policies
and processes
◦ Planning for service growth
◦ Involvement in accounting to assess costs of
services
 Financial Management includes budgeting,
accounting, and charging for IT services being
delivered to the customers.
 Budgeting and accounting involve
understanding the costs of providing various
services
 Financial Management ensures that any IT service
proposed is justified from a budget point of view.
 Allows IT departments to function as a business
unit.
 Allows customers to demand a value for their
money.
 The discipline of ensuring the IT infrastructure
is obtained at the most effective price
◦ Not necessarily the cheapest!
 Calculating the costs of providing the service
so the organization can justify the costs of its
IT services
◦ Costs can then be recovered from the customer of the service
 IT Costs can be divided into different units:
◦ Equipment
◦ Software
◦ Organization (staff, overtime, etc.)
◦ Transfer (costs of 3rd party service providers)
 Costs can also be divided into direct and
indirect costs.
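
The cost breakdown above can be sketched in a few lines of Python. This is only an illustration: the `CostItem` record, the `summarize` helper, and all amounts are hypothetical and are not part of the ITSM guidelines.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    name: str
    unit: str      # "equipment", "software", "organization" or "transfer"
    amount: float  # cost for the period, in any one currency
    direct: bool   # True if the cost belongs to a single service


def summarize(items):
    """Return (totals per cost unit, total direct cost, total indirect cost)."""
    by_unit = {}
    direct = indirect = 0.0
    for item in items:
        by_unit[item.unit] = by_unit.get(item.unit, 0.0) + item.amount
        if item.direct:
            direct += item.amount
        else:
            indirect += item.amount
    return by_unit, direct, indirect


# Made-up example figures for one service.
items = [
    CostItem("mail server", "equipment", 4000.0, True),
    CostItem("mail licences", "software", 1500.0, True),
    CostItem("shared help desk staff", "organization", 2500.0, False),
    CostItem("hosted spam filter", "transfer", 500.0, True),
]
by_unit, direct_total, indirect_total = summarize(items)
```

A charging scheme would then decide how the indirect total is apportioned across services; that rule is left out here because the slides do not specify one.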
 Measuring availability:
◦ Agreement statistics – what is included within the agreed service
◦ Availability – agreed service times, response times
◦ Help Desk Calls – number of incidents raised, response times, resolution times
◦ Capacity – performance timings for online transactions, report production, numbers of users, etc.
◦ Costing Details – charges for the service, and any penalties should service levels not be met.
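
As an illustration of the availability measure, the sketch below uses the common definition of availability as the share of agreed service time during which the service was actually up. The function names and the 99.5% target are assumptions made for the example, not figures from the slides.

```python
def availability_pct(agreed_minutes, downtime_minutes):
    """Percentage of the agreed service time in which the service was up."""
    if agreed_minutes <= 0:
        raise ValueError("agreed service time must be positive")
    return 100.0 * (agreed_minutes - downtime_minutes) / agreed_minutes


def meets_target(agreed_minutes, downtime_minutes, target_pct=99.5):
    """True if measured availability reaches the (hypothetical) agreed target."""
    return availability_pct(agreed_minutes, downtime_minutes) >= target_pct
```

For example, 10 minutes of downtime in 1000 agreed minutes gives `availability_pct(1000, 10)`, which evaluates to 99.0 and would miss a 99.5% target.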
 Includes planning, sizing, and controlling
service solution capacity to satisfy user
demands.
 This requires a collection of information about usage scenarios and patterns as well as stated performance requirements.
 Inputs:
◦ Performance monitoring
◦ Workload monitoring
◦ Application sizing
◦ Resource forecasting
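
Resource forecasting can be as simple as extending a trend fitted to workload monitoring data. The sketch below assumes a straight-line (least-squares) trend, which is only one possible forecasting method; `linear_forecast` and its inputs are hypothetical names.

```python
def linear_forecast(history, periods_ahead):
    """Extrapolate a least-squares straight line fitted to past usage.

    history: usage measurements for consecutive periods (oldest first).
    periods_ahead: how many periods after the last measurement to predict.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two measurements")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + periods_ahead)
```

With monthly disk usage of 10, 20, 30 and 40 GB, `linear_forecast([10, 20, 30, 40], 2)` predicts about 60 GB two months later.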
 Also known as contingency management
◦ Focuses on minimizing the disruptions to
businesses caused by failure of “mission-critical”
systems
◦ Deals with planning to cope with and recover
from an IT disaster
◦ Provides guidance on safeguarding existing
systems
◦ Also considers what activities need to be taken in
the event certain services are not available
 Basic steps:
◦ Prioritizing the businesses to be recovered by conducting a Business Impact Analysis (BIA)
◦ Performing a Risk Assessment (Risk Analysis) for each of the IT Services to identify the assets, threats, vulnerabilities and countermeasures for each service.
◦ Evaluating the options for recovery
◦ Producing the Contingency/Recovery Plan
◦ Testing, reviewing, and revising the plan on a regular basis
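
The first two steps can be sketched as data plus two small helpers. Everything here is an assumption for illustration: the estimated loss per hour as the BIA ranking criterion, the 1 to 5 scales, and the likelihood times impact score are common conventions, not something the slides prescribe.

```python
from dataclasses import dataclass, field


@dataclass
class Threat:
    name: str
    likelihood: int           # 1 (rare) to 5 (frequent), hypothetical scale
    impact: int               # 1 (minor) to 5 (severe), hypothetical scale
    countermeasure: str = ""  # planned response, if any


@dataclass
class Service:
    name: str
    loss_per_hour: float      # BIA estimate of loss while the service is down
    threats: list = field(default_factory=list)


def recovery_order(services):
    """BIA step: services with the highest estimated loss are recovered first."""
    return sorted(services, key=lambda s: s.loss_per_hour, reverse=True)


def top_risk(service):
    """Risk assessment step: the threat with the highest likelihood x impact."""
    return max(service.threats, key=lambda t: t.likelihood * t.impact)


# Made-up example data.
billing = Service("billing", 9000.0, [
    Threat("disk failure", 3, 4, "mirrored storage"),
    Threat("flood", 1, 5, "offsite backup"),
])
intranet = Service("intranet", 200.0, [Threat("virus", 4, 2, "scanner")])
ordered = [s.name for s in recovery_order([intranet, billing])]
```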
 Continuity management and disaster recovery
are an important, yet often overlooked, part of IT
security and risk analysis
 Inadequate contingency planning is looked at
as a risk to the business, and is often
overlooked until it is too late.
◦ When a security or other breach results in the loss
of supporting IT systems or valuable information
 Service Support revolves largely around a
strong service or help desk.
 Service desks can be unskilled (used for incident tracking and call dispatching) or skilled (incidents are solved at the helpdesk).
 Goal:
◦ Provide a single point of contact for the user for all their IT queries.
◦ Proactively identify problems as well as create
resolutions to incidents.
 Release Management is responsible for the
management of software support,
development, and installation.
 Goal:
 Plan the rollout of software.
 Implement procedures for the distribution of changes to
the IT system.
 Path:
 Proper Release Management means to organize a way of
controlling and monitoring distribution of software, often
by creating a single point of storage for all software.
 For internal software this also means creating a logical
store to track releases, and store previous versions of
software.
 Create a standardized process for implementing software.
 Configuration Management is a process that
tracks all of the individual items in a system.
A system may be a single server, or an entire
IT department.
 Goals:
◦ Create a list of every hardware and software item in the system and define their relationship while tracking their current status as well as their history.
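
A minimal sketch of such an item list follows. The `ConfigItem` record, the `inventory` dictionary and the `impacted_by` helper are hypothetical names chosen for the example; real configuration databases hold far more detail.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigItem:
    name: str
    kind: str                                        # "hardware" or "software"
    status: str                                      # current status
    depends_on: list = field(default_factory=list)   # names of related items
    history: list = field(default_factory=list)      # earlier statuses, oldest first

    def set_status(self, new_status):
        """Record the old status in the history before changing it."""
        self.history.append(self.status)
        self.status = new_status


def impacted_by(inventory, name):
    """Names of all items that depend, directly or indirectly, on `name`."""
    impacted = set()
    frontier = [name]
    while frontier:
        current = frontier.pop()
        for item in inventory.values():
            if current in item.depends_on and item.name not in impacted:
                impacted.add(item.name)
                frontier.append(item.name)
    return impacted


# Made-up example: an application on a database on a server.
inventory = {
    "server-1": ConfigItem("server-1", "hardware", "in service"),
    "database": ConfigItem("database", "software", "in service", ["server-1"]),
    "payroll": ConfigItem("payroll", "software", "in service", ["database"]),
}
inventory["server-1"].set_status("under repair")
```

Recording relationships is what lets the other disciplines ask questions such as "which items are affected if server-1 is taken down for a change?".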
Business continuity planning (BCP) and contingency planning in
support of operations are elements of a system of internal control
that is established to manage availability of critical processes in the
event of interruption. The most important part of such a plan deals
with the cost-effective support of the information system.
Availability of business data is vital to the sustainable development
and/or even to the survival of any organization.
 BCP and disaster recovery planning (DRP)
processes
 Business impact analysis (BIA)
 Recovery strategies and alternatives
 Plan testing
 Backup and restoration
 Audit considerations
 Inability to maintain critical customer services
 Damage to market share, image, reputation or brand
 Failure to protect the company assets, including intellectual property and personnel
 Business control failure
 Failure to meet legal or regulatory requirements
The purpose of business continuity/disaster
recovery is to enable a business to continue
offering critical services in the event of a
disruption and to survive a disastrous
interruption to its activities. Rigorous
planning and commitment of resources are
necessary to adequately plan for such an
event.
 Earthquakes
 Floods
 Tornados
 Thunderstorms
 Fire
 Discontinuation of Services like Electrical Power,
Telecommunications, Natural Gas
 Terrorist Attacks
 Hacker Attacks
 Virus Attacks
 System Malfunctions
 Accidental File Deletions
 Network Denial of Service (DoS) Attacks
 Intrusion
The BCP process can be divided into the following life cycle phases:
 Creation of a business continuity policy
 Business Impact Analysis (BIA)
 Classification of operations and criticality analysis
 Identification of IS processes that support critical organizational functions
 Development of a BCP and IS disaster recovery procedures
 Development of resumption procedures
 Training and awareness program
 Testing and implementation of plan
 Monitoring
 Interruption window: The time the organization can wait from the point of failure to the critical services/applications restoration. After this time, the progressive losses caused by the interruption are unaffordable.
 Service delivery objective (SDO): Level of services to be reached during the alternate process mode until the normal situation is restored. This is directly related to the business needs.
 Maximum tolerable outages: Maximum time the organization can support processing in alternate mode. After this point, different problems may arise, especially if the alternate SDO is lower than the usual SDO, and the information pending to be updated can become unmanageable.
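
The two time limits above can be checked against a planned or actual outage as in the sketch below. The function name, parameter names and the use of hours as the unit are assumptions made for the example.

```python
def assess_outage(hours_to_restore_critical, hours_in_alternate_mode,
                  interruption_window, max_tolerable_outage):
    """Return a list of the time limits that an outage would exceed.

    hours_to_restore_critical: time from failure until critical services run
        again (possibly in alternate mode).
    hours_in_alternate_mode: time spent processing in alternate mode before
        the normal situation is restored.
    """
    findings = []
    if hours_to_restore_critical > interruption_window:
        findings.append("interruption window exceeded")
    if hours_in_alternate_mode > max_tolerable_outage:
        findings.append("maximum tolerable outage exceeded")
    return findings
```

For instance, with an interruption window of 4 hours and a maximum tolerable outage of 72 hours, `assess_outage(6, 48, 4, 72)` reports only the interruption window as exceeded.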
 Hot sites are fully configured and ready to operate within several
hours. The equipment, network and systems software must
be compatible with the primary installation being backed up.
The only additional needs are staff, programs, data files and
documentation.
 Costs associated with the use of a third-party hot site are
usually high, but less than creating a redundant site, and are
often cost justifiable for critical applications.
Warm sites are partially configured, usually with network
connections and selected peripheral equipment, such as disk
drives and other controllers, but without the main computer.
Sometimes a warm site is equipped with a less-powerful
central processing unit (CPU) than the one generally used.
The assumption behind the warm site concept is that the
computer can usually be obtained quickly for emergency
installation (provided it is a widely used model) and, since the
computer is the most expensive unit, such an arrangement is
less costly than a hot site. After the installation of the needed
components, the site can be ready for service within hours;
however, the location and installation of the CPU and other
missing units could take several days or weeks.
Cold sites have only the basic environment (i.e., electrical wiring,
air conditioning, flooring, etc.) to reduce the cost. The cold
site is ready to receive equipment, but does not offer any
components at the site in advance of the need. Activation of
the site may take several weeks.
A mobile site is a specially designed trailer that can be quickly transported to
a business location or to an alternate site to provide a ready-
conditioned facility. These mobile sites can be connected to form
larger work areas and can be preconfigured with servers, desktop
computers, communications equipment, and even microwave and
satellite data links. They are a useful alternative when there are no
recovery facilities in the immediate geographic area. They are also
useful in case of a widespread disaster and are a cost-effective
alternative to duplicate facilities for a multi-office organization.
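
The trade-off between hot, warm and cold sites can be sketched as picking the least costly option that can still be activated in time. The activation figures below are illustrative assumptions loosely based on the descriptions above (hours for a hot site, up to days or weeks for a warm site, several weeks for a cold site); real values depend on the provider and contract. Mobile sites are left out because no activation time is given for them.

```python
# Site types ordered from least to most costly, with an ASSUMED worst-case
# activation time in hours (illustrative values only).
SITE_OPTIONS = [
    ("cold", 504),  # about three weeks
    ("warm", 168),  # about one week, allowing time to obtain the main computer
    ("hot", 8),     # several hours
]


def choose_site(interruption_window_hours):
    """Return the least costly site type that can be activated in time,
    or None if even a hot site would be too slow."""
    for name, activation_hours in SITE_OPTIONS:
        if activation_hours <= interruption_window_hours:
            return name
    return None
```

If `choose_site` returns `None`, the interruption window is shorter than any of these alternatives can meet, and a different recovery strategy would have to be considered.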
 Incident response team: A team that has been designated to receive the information about
every incident that can be considered as a threat to assets/processes. This reporting can be
useful for coordinating an incident in progress and/or for postmortem analysis. The analysis of
all incidents also provides input for updating the recovery plans.
 Emergency action team: They are first responders, designated fire wardens and bucket
crews, whose function is to deal with fires or other emergency response scenarios. One of
their primary functions is the orderly evacuation of personnel and the securing of human life.
 Information security team: The main mission of this team is to develop the needed steps to
maintain a similar level of information and IT resource security as was in place at the
primary site before the contingency, and implement the needed security measures in the
alternative procedures environment. Additionally, this team must continually monitor the
security of system and communication links, resolve any security conflicts that impede the
expeditious recovery of the system, and assure the proper installation and functioning of
security software. The team is also responsible for the security of the organization's assets
during the disorder following a disaster.
 Damage assessment team: Assesses the extent of damage following the disaster. The team
should be comprised of individuals who have the ability to assess damage and estimate the time
required to recover operations at the affected site. This team should include staff skilled in the use
of testing equipment, knowledgeable about systems and networks, and trained in applicable safety
regulations and procedures. In addition, they have the responsibility to identify possible causes of
the disaster and their impact on damage and predictable downtime.
 Emergency management team: Responsible for coordinating the activities of all other
recovery/continuity/response teams and handling key decision making. They determine the
activation of the BCP. Other functions entail arranging the finances of the recovery, handling legal
matters arising from the disaster, and handling public relations and media inquiries.
 Offsite storage team: Responsible for obtaining, packaging and shipping media and records to
the recovery facilities, as well as establishing and overseeing an offsite storage schedule for
information created during operations at the recovery site.
 Software team: Responsible for restoring system packs, loading and testing operating systems
software, and resolving system-level problems.
 Applications team: Travels to the system recovery site and restores user packs and application
programs on the backup system. As the recovery progresses, this team may have the responsibility
of monitoring application performance and database integrity.
 Pretest: The set of actions necessary to set the stage for the actual test. This
ranges from placing tables in the proper operations recovery area to
transporting and installing backup telephone equipment. These activities are
outside the realm of those that would take place in the case of a real
emergency, in which there is no forewarning of the event and, therefore, no
time to take preparatory actions.
 Test: This is the real action of the business continuity test. Actual operational
activities are executed to test the specific objectives of the BCP. Data entry,
telephone calls, information systems processing, handling orders, and
movement of personnel, equipment and suppliers should take place.
Evaluators review staff members as they perform the designated tasks. This is
the actual test of preparedness to respond to an emergency.
 Post-test: The cleanup of group activities. This phase comprises such
assignments as returning all resources to their proper place, disconnecting
equipment, returning personnel, and deleting all company data from third-
party systems. The post-test cleanup also includes formally evaluating the
plan and implementing indicated improvements.
 Desk-based evaluation/paper test: A paper walk-through of the
plan, involving major players in the plan's execution who reason out
what might happen in a particular type of service disruption. They
may walk through the entire plan or just a portion. The paper test
usually precedes the preparedness test.
 Preparedness test: Usually a localized version of a full test, wherein
actual resources are expended in the simulation of a system crash.
This test is performed regularly on different aspects of the plan and
can be a cost-effective way to gradually obtain evidence about how
good the plan is. It also provides a means to improve the plan in
increments.
 Full operational test: This is one step away from an actual service
disruption. The organization should have tested the plan well on
paper and locally before endeavoring to completely shut down
operations. For purposes of the BCP testing, this is the disaster.
