ITIL v3 Service Operation
Service Operation
1.0 Introduction
1.1 Overview 1.2 Context
1.1 Overview
Service Operation (SO) is the phase of the ITSM lifecycle responsible for business-as-usual activities. SO can be viewed as the "factory" of IT: day-to-day activities must be conducted, controlled and managed. The purpose of SO is to deliver and support IT services. Staff should have in-place (shared or interfacing) processes and support tools that give an overview of the operation, detect any failure in service quality and manage cross-organizational workflows. SO includes many functions, processes and activities.
1.2 Context
1.2.1 Service Management 1.2.2 Good Practices in Public Domain 1.2.3 ITIL as good practice in Service Mgmt.
People Products
Process
Managed services
Partners
Assessments
Measurable Targets
Process Improvement
Metrics
- 70% achieving tangible and measurable benefits
- 85% resolution at FPOC
- cost per call down 30%
- 50% reduction in new product cycle
- 79% reduction in downtime and other factors
- total savings per user c. $800 p.a.
- ROI up 1300%
- Downtime reduced from 60 to 15 mins
IDC survey
Barclays
Procter and Gamble
- Proprietary knowledge is deeply embedded in the organization.
- Proprietary knowledge is customized for the local context and specific business needs.
- Such knowledge is available only under commercial terms.
- Public frameworks are vetted by diverse sets of partners, suppliers and competitors.
- Knowledge of public frameworks is easy to acquire through the labour market.
- Collaboration and coordination across organizations are easier on the basis of shared practices and standards.
Available Frameworks
COBIT
COBIT was first released in 1996; the current version, COBIT 5, was published in 2012. Its mission is to research, develop, publish and promote an authoritative, up-to-date, international set of generally accepted information technology control objectives for day-to-day use by business managers, IT professionals and assurance professionals.
ISO/IEC 20000
ISO/IEC 20000 is based on and replaces BS 15000, the internationally recognized British Standard. ISO/IEC 20000 is published in two parts:
- Part One is the specification for service management, covering IT service management. This is the part you can be audited against; it sets out the minimum requirements that must be achieved in order to gain certification.
- Part Two is the code of practice for service management, which describes the best practices for service management processes within the scope of the specification.
DO
Implement and Operate
Control Processes
Configuration Management Change Management
ACT
Continual Improvement
Resolution Processes
Incident Management Problem Management
Release Process
Release Management
Relationship Processes
Business Relationship Management Supplier Management
CHECK
Monitor, Measure and Review
ITIL:
- Is best practice in IT Service Management, developed by OGC and supported by publications, qualifications and an international user group
- Assists organisations to develop a framework for IT Service Management
- Is the most widely used best practice for IT Service Management worldwide
- Consists of a series of Core books giving guidance on the provision of quality IT services
Inter Relationships
- ISO 20000 Part 1: Specification for Service Management
- ISO 20000 Part 2: Code of Practice for Service Management
- BIP 0005: A Manager's Guide
- BIP 0015: Self Assessment Workbook
BIP 0015 ISO 20000 Part 1 ISO 20000 Part 2 BIP 0005 Objective to Achieve
Code of Practice
Management Overview
Self Assessment
ITIL
Process Definition
Deploy Solution
ITIL V2 Books
1.2.3 ITIL and good practice in Service Management ITIL V3 The Structure
Core
Core Best Practice Guidance
Complementary
Support for particular market sector or technology
Web
Value added products, process maps, templates, studies
Customised implementation
ITIL Core
The ITIL Core consists of five publications. Each provides the guidance necessary for an integrated approach as required by the ISO/IEC 20000 standard specification:
The Big Picture, Service Model Maps, Practice Basics, Getting Started
V3 Manager Bridge 5
3
V2 Service Manager 17
2 credits
Evolvement of SM
The origins of Service Management are in traditional service businesses such as airlines, banks, hotels and phone companies. Organizational capabilities are shaped by the challenges they are expected to overcome. It is also a professional practice supported by an extensive body of knowledge, experience and skills Its practice has grown with the adoption by IT organizations of a service-oriented approach to managing IT applications, infrastructure and processes. Solutions to business problems and support for business models, strategies and operations are increasingly in the form of services. The popularity of shared services and outsourcing has contributed to the increase in the number of organizations that are service providers, including internal organizational units.
Meaning of Service
Objective
The objective of ITIL Service Operation is to make sure that IT services are delivered effectively and efficiently. This includes:
1. fulfilling user requests
2. resolving service failures
3. fixing problems
4. carrying out routine operational tasks
Processes
- Event Management
- Incident Management
- Request Fulfilment
- Access Management
- Problem Management
- IT Operations Control
- Facilities Management
- Application Management
- Technical Management
Event Management
Objective: The objective of ITIL Event Management is to make sure CIs and services are constantly monitored. Event Management aims to filter and categorize Events in order to decide on appropriate actions if required.
Process Description: Essentially, the activities and process objectives of the Event Management process are identical in ITIL V3 and V2. In ITIL 2011, Event Management has been updated to reflect the concept of 1st Level Correlation and 2nd Level Correlation.
Sub Processes
1. Maintenance of Event Monitoring Mechanisms and Rules - To set up and maintain the mechanisms for generating meaningful Events and effective rules for their filtering and correlating.
2. Event Filtering and 1st Level Correlation - To filter out Events which are merely informational and can be ignored, and to communicate any Warning and Exception Events.
3. 2nd Level Correlation and Response Selection - To interpret the meaning of an Event and select a suitable response if required.
4. Event Review and Closure - To check if Events have been handled appropriately and may be closed. This process also makes sure that Event logs are analyzed in order to identify trends or patterns which suggest corrective action must be taken.
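The two correlation levels described above can be illustrated with a minimal Python sketch. It is not taken from ITIL: the metric name, the thresholds in RULES and the response strings are hypothetical, chosen only to show how Events might be classified as Informational, Warning or Exception before a response is selected.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    EXCEPTION = "exception"


@dataclass
class Event:
    ci: str        # Configuration Item that raised the Event
    metric: str    # hypothetical metric name, e.g. "disk_used_pct"
    value: float


# Hypothetical filtering rules: (warning threshold, exception threshold) per metric.
RULES = {"disk_used_pct": (80.0, 95.0)}


def first_level_correlation(event: Event) -> EventType:
    """Classify an Event; metrics without a rule are treated as informational."""
    warning, exception = RULES.get(event.metric, (float("inf"), float("inf")))
    if event.value >= exception:
        return EventType.EXCEPTION
    if event.value >= warning:
        return EventType.WARNING
    return EventType.INFORMATIONAL


def second_level_correlation(event: Event, event_type: EventType) -> str:
    """Select a response for an Event classified by 1st Level Correlation."""
    if event_type is EventType.EXCEPTION:
        return f"log Incident for {event.ci}"
    if event_type is EventType.WARNING:
        return f"alert IT Operations about {event.ci}"
    return "no action"
```

In practice the rules would come from the Event Filtering and Correlation Rules maintained by the first sub-process rather than from a hard-coded dictionary.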
Definitions
The following ITIL terms and acronyms (information objects) are used in the ITIL Event Management process to represent process outputs and inputs:
- Event - see Event Record.
- Event Categorization Scheme - The Categorization Scheme for Events supports a consistent approach to dealing with specific types of Events. Ideally, this scheme should be harmonized with the schemes to categorize CIs, Incidents and Problems.
- Event Filtering and Correlation Rules - Rules and criteria used to determine if an Event is significant and to decide upon an appropriate response. Event Filtering and Correlation Rules are typically used by Event Monitoring systems. Some of those rules are defined during the Service Design stage, for example to ensure that Events are triggered when the required service availability is endangered.
- Event Record - A record describing a change of state which has significance for the management of a Configuration Item or service. The term Event is also used to mean an alert or notification created by any IT service, Configuration Item or monitoring tool. Events often require IT operations personnel to take actions, and may lead to Incidents being logged.
- Event Trends and Patterns - Any trends and patterns identified during analysis of significant Events, which suggest that improvements to the infrastructure are needed.
IT OPM A[1]R[2] A A AR
IT OPR R R -
EMS R R -
Other Roles R -
Incident Management
Objective: ITIL Incident Management aims to manage the lifecycle of all Incidents. The primary objective of Incident Management is to return the IT service to users as quickly as possible. Parent Process: Service Operation Process Owner: Incident Manager
- Guidance has been improved in Incident Management on how to prioritize an Incident (see Checklist Incident Prioritization Guideline).
- Additional steps have been added to Incident Resolution by 1st Level Support to explain that Incidents should be matched (if possible) to existing Problems and Known Errors.
- Incident Resolution by 1st Level Support and Incident Resolution by 2nd Level Support have been considerably expanded to provide clearer guidance on when to invoke Problem Management from Incident Management. The emphasis is now on restoring services as quickly as possible, and on seeking the help of Problem Management if the underlying cause of an Incident cannot be resolved with a minor Change and/or within the committed resolution time.
- The Incident Management sub-process Incident Closure and Evaluation now states more clearly that it is important to check whether there are new Problems, Workarounds or Known Errors that must be submitted to Problem Management.
The process overview of ITIL Incident Management shows the most important interfaces (see Figure 1).
Medium (M)
Low (L)
Staff are able to deliver an acceptable service but this requires extra effort Customers are inconvenienced but not in a significant way
Medium (M)
Low (L)
A minimal number of users is affected A minimal number of customers is affected The financial impact of the Incident is (for example) likely to be less than $1,000 The damage to the reputation of the business is likely to be minimal
Incident Priority is derived from urgency and impact. If classes are defined to rate urgency and impact (see above), an Urgency-Impact Matrix can be used to define priority classes, identified in this example by colors and priority codes:
                 Impact
                 H    M    L
Urgency    H     1    2    3
           M     2    3    4
           L     3    4    5
Priority Code    Target Resolution Time
1                1 Hour
2                4 Hours
3                8 Hours
4                24 Hours
Very low
1 Day
1 Week
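Deriving a priority from urgency and impact can be sketched as a simple lookup. The sketch below is illustrative only: the matrix values are an assumed reading of the example Urgency-Impact Matrix (H/M/L for High/Medium/Low), and the target hours cover only priority codes 1 to 4 as listed above. The function names are hypothetical.

```python
# Priority codes indexed by (urgency, impact); assumed reading of the example matrix.
PRIORITY_MATRIX = {
    ("H", "H"): 1, ("H", "M"): 2, ("H", "L"): 3,
    ("M", "H"): 2, ("M", "M"): 3, ("M", "L"): 4,
    ("L", "H"): 3, ("L", "M"): 4, ("L", "L"): 5,
}

# Target resolution times in hours for priority codes 1 to 4.
# Priority 5 is deliberately left out: its target is not stated clearly in the text.
TARGET_HOURS = {1: 1, 2: 4, 3: 8, 4: 24}


def incident_priority(urgency: str, impact: str) -> int:
    """Look up the priority code for an urgency/impact pair (each "H", "M" or "L")."""
    return PRIORITY_MATRIX[(urgency, impact)]


def target_resolution_hours(priority: int):
    """Return the target resolution time in hours, or None if none is defined."""
    return TARGET_HOURS.get(priority)
```

For example, an Incident with high urgency and medium impact would receive priority 2 under these assumptions, with a 4 hour resolution target.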
Some of the key characteristics that make these Major Incidents are:
- The ability of significant numbers of customers and/or key customers to use services or systems is or will be affected.
- The cost to customers and/or the service provider is or will be substantial, both in terms of direct and indirect costs (including consequential loss).
- The reputation of the Service Provider is likely to be damaged.
AND
- The amount of effort and/or time required to manage and resolve the incident is likely to be large and it is very likely that agreed service levels (target resolution times) will be breached.
A Major Incident is also likely to be categorized as a critical or high priority incident.
9 Sub-Processes
- Incident Management Support - To provide and maintain the tools, processes, skills and rules for an effective and efficient handling of Incidents.
- Incident Logging and Categorization - To record and prioritize the Incident with appropriate diligence, in order to facilitate a swift and effective resolution.
- Immediate Incident Resolution by 1st Level Support - To solve an Incident (service interruption) within the agreed time schedule. The aim is the fast recovery of the IT service, where necessary with the aid of a Workaround. As soon as it becomes clear that 1st Level Support is not able to resolve the Incident itself or when target times for 1st level resolution are exceeded, the Incident is transferred to a suitable group within 2nd Level Support.
- Incident Resolution by 2nd Level Support - To solve an Incident (service interruption) within the agreed time schedule. The aim is the fast recovery of the service, where necessary by means of a Workaround. If required, specialist support groups or third-party suppliers (3rd Level Support) are involved. If the correction of the root cause is not possible, a Problem Record is created and the error correction is transferred to Problem Management.
- Handling of Major Incidents - To resolve a Major Incident. Major Incidents cause serious interruptions of business activities and must be resolved with greater urgency. The aim is the fast recovery of the service, where necessary by means of a Workaround. If required, specialist support groups or third-party suppliers (3rd Level Support) are involved. If the correction of the root cause is not possible, a Problem Record is created and the error correction is transferred to Problem Management.
- Incident Monitoring and Escalation - To continuously monitor the processing status of outstanding Incidents, so that countermeasures may be introduced as soon as possible if service levels are likely to be breached.
- Incident Closure and Evaluation - To submit the Incident Record to a final quality control before it is closed. The aim is to make sure that the Incident is actually resolved and that all information required to describe the Incident's life-cycle is supplied in sufficient detail. In addition to this, findings from the resolution of the Incident are to be recorded for future use.
- Pro-Active User Information - To inform users of service failures as soon as these are known to the Service Desk, so that users are in a position to adjust themselves to interruptions. Proactive user information also aims to reduce the number of inquiries by users. This process is also responsible for distributing other information to users, e.g. security alerts.
- Incident Management Reporting - To supply Incident-related information to the other Service Management processes, and to ensure that improvement potentials are derived from past Incidents.
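Incident Monitoring and Escalation, as described in the list above, amounts to comparing the elapsed time of each outstanding Incident against its target resolution time. The following Python sketch shows one possible check. The 80% warning fraction, the function names and the tuple layout are assumptions made for illustration and are not prescribed by ITIL.

```python
from datetime import datetime, timedelta


def at_risk(logged_at: datetime, target: timedelta, now: datetime,
            warn_fraction: float = 0.8) -> bool:
    """Return True once the elapsed time reaches warn_fraction of the target."""
    return (now - logged_at) >= target * warn_fraction


def incidents_to_escalate(open_incidents, now: datetime):
    """Return the IDs of open Incidents whose target resolution time is at risk.

    open_incidents: iterable of (incident_id, logged_at, target) tuples.
    """
    return [incident_id
            for incident_id, logged_at, target in open_incidents
            if at_risk(logged_at, target, now)]
```

Flagging Incidents before the target is actually exceeded leaves time to introduce countermeasures, which is the stated aim of the sub-process.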
Definitions
The following ITIL terms and acronyms (information objects) are used in the ITIL Incident Management process to represent process outputs and inputs:
- Incident - An Incident is defined as an unplanned interruption or reduction in quality of an IT service (a Service Interruption).
- Incident Escalation Rules - A set of rules defining a hierarchy for escalating Incidents, and triggers which lead to escalations. Triggers are usually based on Incident severity and resolution times. See also: Checklist Incident Priority
- Incident Management Report - A report supplying Incident-related information to the other Service Management processes.
- Incident Model - An Incident Model contains the pre-defined steps that should be taken for dealing with a particular type of Incident. This is a way to ensure that routinely occurring Incidents are handled efficiently and effectively.
- Incident Prioritization Guideline - The Incident Prioritization Guideline describes the rules for assigning priorities to Incidents, including the definition of what constitutes a Major Incident. Since Incident Management escalation rules are usually based on priorities, assigning the correct priority to an Incident is essential for triggering appropriate escalations. See also: Checklist Incident Prioritization Guideline
- Incident Record - A set of data with all details of an Incident, documenting the history of the Incident from registration to closure. An Incident is defined as an unplanned interruption or reduction in quality of an IT service. Every event that could potentially impair an IT service in the future is also an Incident (e.g. the failure of one hard-drive of a set of mirrored drives). See also: ITIL Checklist Incident Record
- Incident Status Information - A message containing the present status of an Incident sent to a user who earlier reported a service interruption. Status information is typically provided to users at various points during an Incident's lifecycle.
- Major Incident - Major Incidents cause serious interruptions of business activities and must be solved with greater urgency. See also: Checklist Incident Priority: Major Incidents
- Major Incident Review - A Major Incident Review takes place after a Major Incident has occurred. The review documents the Incident's underlying causes (if known) and the complete resolution history, and identifies opportunities for improving the handling of future Major Incidents.
- Notification of Service Failure - The reporting of a service failure to the Service Desk, for example by a user via telephone or e-mail, or by a system monitoring tool.
- Pro-Active User Information - A notification to users of existing or imminent service failures even if the users are not yet aware of the interruptions, so that users are in a position to prepare themselves for a period of service unavailability.
- Status Inquiry - An inquiry regarding the present status of an Incident or Service Request, usually from a user who earlier reported an Incident or submitted a request.
- Support Request - A request to support the resolution of an Incident or Problem, usually issued from the Incident or Problem Management processes when further assistance is needed from technical experts.
- User Escalation - Escalation regarding the processing of an Incident or Service Request, initiated by a user experiencing delays or a failure to restore their services.
- User FAQs - Self-help information for users supplied by the Service Desk, usually as part of the Support Pages on the intranet.
Number of Escalations
Incident Resolution Time First Time Resolution Rate Resolution within SLA Incident Resolution Effort
Printer
Link/ Attribution to another Incident (if a similar outstanding Incident exists, to which the new Incident is able to be attributed)
Assigned triggers to the Escalation Hierarchy (conditions/ rules, which lead to the Escalation to a particular level within the Escalation Hierarchy)
Documentation of applied Workarounds Documentation of the root cause of the Service interruption Documentation of the applied resolution to eliminate the root cause Date of the Incident resolution Date of the Incident closure
Statistical evaluations
Roles | Responsibilities
- Incident Manager (Process Owner) - The Incident Manager is responsible for the effective implementation of the Incident Management process and carries out the corresponding reporting. He represents the first stage of escalation for Incidents, should these not be resolvable within the agreed Service Levels.
- 1st Level Support - The responsibility of 1st Level Support is to register and classify received Incidents and to undertake an immediate effort in order to restore a failed IT service as quickly as possible. If no ad-hoc solution can be achieved, 1st Level Support will transfer the Incident to expert technical support groups (2nd Level Support). 1st Level Support also keeps users informed about their Incidents' status at agreed intervals.
- 2nd Level Support - 2nd Level Support takes over Incidents which cannot be solved immediately with the means of 1st Level Support. If necessary, it will request external support, e.g. from software or hardware manufacturers. The aim is to restore a failed IT service as quickly as possible. If no solution can be found, 2nd Level Support passes on the Incident to Problem Management.
- 3rd Level Support - 3rd Level Support is typically located at hardware or software manufacturers (third-party suppliers). Its services are requested by 2nd Level Support if required for solving an Incident. The aim is to restore a failed IT Service as quickly as possible.
- Major Incident Team - A dynamically established team of IT managers and technical experts, usually under the leadership of the Incident Manager, formed to concentrate on the resolution of a Major Incident.
Responsibility Matrix
ITIL Role / Sub-Process Incident Manager A[1]R[2] A 1st Level Support R 2nd Level Support Major Incident Team -
Responsibility Matrix: ITIL Incident Management Applications Analyst[3] Technical Analyst[3] IT Operator[3] -
A AR AR A A AR
R R R R -
R -
R -
R[4] -
R[4] -
R[4] R -
Remarks [1] A: Accountable according to the RACI Model: Those who are ultimately accountable for the correct and thorough completion of the Incident Management process. [2] R: Responsible according to the RACI Model: Those who do the work to achieve a task within Incident Management. [3] see Role descriptions... [4] In cooperation, as required. 2nd Level Support Groups often include Applications Analysts and/ or Technical Analysts.
Request Fulfilment
Objective: To fulfil Service Requests, which in most cases are minor (standard) Changes (e.g. requests to change a password) or requests for information.
Process Description: Request Fulfilment was added as a new process to ITIL V3 with the aim of having a dedicated process dealing with Service Requests. This was motivated by a clear distinction in ITIL V3 between Incidents (Service Interruptions) and Service Requests (standard requests from users, e.g. password resets).
In ITIL 2011, Request Fulfilment has been completely revised:
- To reflect the latest guidance, Request Fulfilment now consists of five sub-processes, which provide a detailed description of all activities and decision points.
- Request Fulfilment now contains interfaces with Incident Management (if a Service Request turns out to be an Incident) and with Service Transition (if fulfilling a Service Request requires the involvement of Change Management).
- A clearer explanation of the information that describes a Service Request and its life cycle has been added.
- The concept of Service Request Models is explained in more detail.
The process overview of ITIL Request Fulfilment shows the most important interfaces (see Figure 1).
Sub Processes
- Request Fulfilment Support - To provide and maintain the tools, processes, skills and rules for an effective and efficient handling of Service Requests.
- Request Logging and Categorization - To record and categorize the Service Request with appropriate diligence and check the requester's authorization to submit the request, in order to facilitate a swift and effective processing.
- Request Model Execution - To process a Service Request within the agreed time schedule.
- Request Monitoring and Escalation - To continuously monitor the processing status of outstanding Service Requests, so that counter-measures may be introduced as soon as possible if service levels are likely to be breached.
- Request Closure and Evaluation - To submit the Request Record to a final quality control before it is closed. The aim is to make sure that the Service Request is actually processed and that all information required to describe the request's life-cycle is supplied in sufficient detail. In addition to this, findings from the processing of the request are to be recorded for future use.
Definitions
- Request for Service - A formal request from a user for something to be provided, for example a request for information or advice, to reset a password, or to install a workstation for a new user. The details of a Request for Service are recorded by Request Fulfilment in a Service Request Record.
- Service Request Model - A (Service) Request Model defines specific agreed steps that will be followed for a Service Request of a particular type (or category).
- Service Request Record - A record containing all details of a Service Request. Service Requests are formal requests from a user for something to be provided, for example a request for information or advice, to reset a password, or to install a workstation for a new user.
- Service Request Status Information - A message containing the present status of a Service Request sent to a user who earlier requested a service. Status information is typically provided to users at various points during a Service Request's lifecycle.
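A Service Request Model can be pictured as a pre-agreed list of steps keyed by request category. The sketch below is a hedged illustration only: the category name, the step texts and the fallback of forwarding unknown categories to a fulfilment group are assumptions, not content from ITIL. The authorization flag stands in for the requester check performed during Request Logging and Categorization.

```python
# Hypothetical Service Request Models: request category -> agreed steps.
REQUEST_MODELS = {
    "password_reset": [
        "verify requester identity",
        "reset password",
        "notify user",
    ],
}


def execute_request(category: str, authorized: bool) -> list:
    """Return the steps to carry out for a Service Request of the given category.

    Raises PermissionError if the requester is not authorized to submit it.
    """
    if not authorized:
        raise PermissionError("requester is not authorized to submit this request")
    steps = REQUEST_MODELS.get(category)
    if steps is None:
        # No model defined: assume the request goes to a specialized group.
        return ["forward to Service Request Fulfilment Group"]
    return list(steps)
```

Returning a copy of the step list keeps callers from modifying the shared model by accident.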
Roles | Responsibilities
- Incident Manager (Process Owner) - The Incident Manager is responsible for the effective implementation of the Incident Management process and carries out the respective reporting. He represents the first stage of escalation for Incidents, should these not be resolvable within the agreed Service Levels.
- 1st Level Support - The responsibility of 1st Level Support is to register and classify received Incidents and to undertake an immediate effort in order to restore a failed IT service as quickly as possible. If no ad-hoc solution can be achieved, 1st Level Support will transfer the Incident to expert technical support groups (2nd Level Support). 1st Level Support also processes Service Requests and keeps users informed about their Incidents' status at agreed intervals.
- Service Request Fulfilment Group - Groups specializing in the fulfilment of certain types of Service Requests. Typically, 1st Level Support will process simpler requests, while others are forwarded to the specialized Fulfilment Groups.
ITIL Role / Sub-Process Request Fulfilment Support Request Logging and Categorization Request Model Execution Request Monitoring and Escalation Request Closure and Evaluation
Incident Manager
A[1]R[2]
AR
Remarks [1] A: Accountable according to the RACI Model: Those who are ultimately accountable for the correct and thorough completion of the Request Fulfilment process. [2] R: Responsible according to the RACI Model: Those who do the work to achieve a task within Request Fulfilment.
Access Management
Objective: ITIL Access Management aims to grant authorized users the right to use a service, while preventing access by non-authorized users. The Access Management processes essentially execute policies defined in Information Security Management. Access Management is sometimes also referred to as Rights Management or Identity Management.
Part of: Service Operation
Process Owner: Access Manager
Process Description: Access Management was added as a new process to ITIL V3. The decision to include this dedicated process was motivated by Information Security reasons, as granting access to IT services and applications only to authorized users is of high importance from an Information Security viewpoint.
In ITIL 2011:
- An interface between Access Management and Event Management has been added, to emphasize that (some) Event filtering and correlation rules should be designed by Access Management to support the detection of unauthorized access to services.
- A dedicated activity has been added to revoke access rights if required, to make this point clearer.
- It has been made clearer in the Request Fulfilment and Incident Management processes that the requester's authorization must be checked.
The process overview of ITIL Access Management shows the most important interfaces (see Figure 1).
Sub Processes
Maintenance of Catalogue of User Roles and Access Profiles
Process Objective: To make sure that the catalogue of User Roles and Access Profiles is still appropriate for the services offered to customers, and to prevent unwanted accumulation of access rights.
Definitions
- Access Rights - A set of data defining what services a user is allowed to access. This definition is achieved by assigning the user, identified by his User Identity, to one or more User Roles.
- Request for Access Rights - A request to grant, change or revoke the right to use a particular service or access certain assets.
- User Identity Record - A set of data with all the details identifying a user or person. It is used to grant rights to that user or person.
- User Identity Request - A request to create, modify or delete a User Identity.
- User Role - A role as part of a catalogue or hierarchy of all the roles (types of users) in the organization. Access rights are based on the roles that individual users have as part of an organization.
- User Role Access Profile - A set of data defining the level of access to a service or group of services for a certain type of user (User Role). User Role Access Profiles help to protect the confidentiality, integrity and availability of assets by defining what information computer users can utilize, the programs that they can run, and the modifications that they can make.
- User Role Requirements - Requirements from the business side for the catalogue or hierarchy of user roles (types of users) in the organization. Access rights are based on the roles that individual users have as part of an organization.
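The relationship between User Identities, User Roles and User Role Access Profiles defined above can be sketched as two lookups. This Python example is illustrative only; the user names, role names and service names are hypothetical, and a real implementation would read them from a directory or identity management system.

```python
# Hypothetical User Role Access Profiles: role -> services the role may use.
ROLE_ACCESS_PROFILES = {
    "hr_clerk": {"payroll"},
    "developer": {"source_control", "build_server"},
}

# Hypothetical assignment of User Identities to one or more User Roles.
USER_ROLES = {
    "alice": {"hr_clerk"},
    "bob": {"developer", "hr_clerk"},
}


def access_rights(user: str) -> set:
    """Collect the services a user may access through all assigned roles."""
    rights = set()
    for role in USER_ROLES.get(user, set()):
        rights |= ROLE_ACCESS_PROFILES.get(role, set())
    return rights


def may_access(user: str, service: str) -> bool:
    """Return True if any of the user's roles grants access to the service."""
    return service in access_rights(user)
```

Because rights are derived from roles rather than stored per user, removing a role assignment revokes the corresponding access, which helps to prevent unwanted accumulation of access rights.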
A[1]R[2]
AR
Remarks [1] A: Accountable according to the RACI Model: Those who are ultimately accountable for the correct and thorough completion of the Access Management process. [2] R: Responsible according to the RACI Model: Those who do the work to achieve a task within Access Management.
Problem Management
Objective: The objective of ITIL Problem Management is to manage the lifecycle of all Problems. The primary objectives of Problem Management are to prevent Incidents from happening, and to minimize the impact of Incidents that cannot be prevented. Proactive Problem Management analyzes Incident Records, and uses data collected by other IT Service Management processes to identify trends or significant Problems.
Process Description: Essentially, the activities and process objectives of ITIL Problem Management are identical in ITIL V3 and ITIL V2. A new sub-process, Major Problem Review, was introduced in ITIL V3 to review the solution history of major Problems in order to prevent a recurrence and learn lessons for the future.
In ITIL 2011:
- The new sub-process Proactive Problem Identification has been added to emphasize the importance of proactive Problem Management.
- In Problem Categorization and Prioritization, it has been made clearer that categorization and prioritization should be harmonized with the approach used in Incident Management, to facilitate matching between Incidents and Problems.
- The concept of recreating Problems during Problem Diagnosis and Resolution is now more prominent. This sub-process has been completely revised to provide clearer guidance on how this process cooperates with Incident Management.
The process overview of ITIL Problem Management shows the most important interfaces (see Figure 1).
Note: The new ITIL 2011 books also contain an expanded section on problem analysis techniques and examples for situations where the various techniques may be applied.
Sub-Processes
1. Proactive Problem Identification - To improve overall availability of services by proactively identifying Problems. Proactive Problem Management aims to identify and solve Problems and/or provide suitable Workarounds before (further) Incidents recur.
2. Problem Categorization and Prioritization - To record and prioritize the Problem with appropriate diligence, in order to facilitate a swift and effective resolution.
3. Problem Diagnosis and Resolution - To identify the underlying root cause of a Problem and initiate the most appropriate and economical Problem solution. If possible, a temporary Workaround is supplied.
4. Problem and Error Control - To constantly monitor outstanding Problems with regards to their processing status, so that where necessary corrective measures may be introduced.
5. Problem Closure and Evaluation - To ensure that, after a successful Problem solution, the Problem Record contains a full historical description, and that related Known Error Records are updated.
6. Major Problem Review - To review the resolution of a Problem in order to prevent recurrence and learn any lessons for the future. Furthermore it is to be verified whether the Problems marked as closed have actually been eliminated.
7. Problem Management Reporting - To ensure that the other Service Management processes as well as IT Management are informed of outstanding Problems, their processing status and existing Workarounds (see "Problem Management Report").
Definitions
- Known Error - A Problem that has a documented root cause and a Workaround. Known Errors are managed throughout their lifecycle by the Problem Management process. The details of each Known Error are recorded in a Known Error Record stored in the Known Error Database (KEDB). As a rule, Known Errors are identified by Problem Management, but Known Errors may also be suggested by other Service Management disciplines, e.g. Incident Management, or by suppliers.
- Known Error Database (KEDB) - A database created by Problem Management and used by Incident and Problem Management to manage all Known Error Records.
- Problem - The cause of one or more Incidents. The cause is not usually known at the time a Problem Record is created.
- Problem Management Report - A report supplying Problem-related information to the other Service Management processes.
- Problem Record - A record containing all details of a Problem, documenting the history of the Problem from detection to closure (see: ITIL Checklist Problem Record).
- Suggested new Known Error - A suggestion to create a new entry in the Known Error Database, for example raised by the Service Desk or by Release Management. Known Errors are managed throughout their lifecycle by Problem Management.
- Suggested new Problem - A notification about a suspected Problem, handed over to Problem Management for further investigation, possibly leading to the formal logging of a Problem.
- Suggested new Workaround - A suggestion to enter a new Workaround in the Known Error Database, for example raised by the Service Desk or by Release Management. Workarounds are managed throughout their lifecycle by Problem Management.
- Workaround - A temporary solution aimed at reducing or eliminating the impact of Known Errors (and thus Problems) for which a full resolution is not yet available. As such, Workarounds are often applied to reduce the impact of Incidents or Problems if their underlying causes cannot be readily identified or removed.
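How Incident Management might use the Known Error Database to find a Workaround can be shown with a small Python sketch. It is an assumption-laden illustration: the record fields, the exact-match lookup on CI and symptom, and the sample entry are hypothetical, and real KEDB tools typically offer richer search.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownError:
    error_id: str      # hypothetical identifier of the Known Error Record
    ci: str            # affected Configuration Item
    symptom: str       # symptom as categorized for Incidents and Problems
    workaround: str    # documented temporary solution


# Hypothetical KEDB content.
KEDB = [
    KnownError("KE-1", "Server A", "Symptom a", "restart the affected service"),
]


def find_workaround(ci: str, symptom: str) -> Optional[str]:
    """Return the Workaround of the first Known Error matching CI and symptom."""
    for known_error in KEDB:
        if known_error.ci == ci and known_error.symptom == symptom:
            return known_error.workaround
    return None
```

A lookup that returns None would be a natural trigger for raising a "Suggested new Problem" with Problem Management. Matching only works reliably when Incident and Problem categories are harmonized.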
Definition
Number of Problems: Number of Problems registered by Problem Management, grouped into categories
Average time for resolving Problems, grouped into categories
Number of Problems where the underlying root cause is not known at a particular time
Number of reported Incidents linked to the same Problem after problem identification
Average time between first occurrence of an Incident and identification of the underlying root cause
Average work effort for resolving Problems, grouped into categories
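Two of the indicators above (number of Problems per category and average resolution time per category) could be computed along the following lines. This is a rough sketch: the record layout (category, logged, resolved) and the sample data are hypothetical and stand in for whatever the Problem Management tool actually exports.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical Problem Records: (category, logged, resolved or None if still open)
problems = [
    ("Hardware error", datetime(2024, 1, 1, 9), datetime(2024, 1, 3, 9)),
    ("Hardware error", datetime(2024, 1, 2, 9), datetime(2024, 1, 6, 9)),
    ("Software error", datetime(2024, 1, 5, 9), None),
]


def problems_per_category(records):
    """Count registered Problems, grouped by category."""
    counts = defaultdict(int)
    for category, _logged, _resolved in records:
        counts[category] += 1
    return dict(counts)


def avg_resolution_hours(records):
    """Average hours from logging to resolution, per category.

    Open Problems (resolved is None) are ignored.
    """
    durations = defaultdict(list)
    for category, logged, resolved in records:
        if resolved is not None:
            hours = (resolved - logged).total_seconds() / 3600
            durations[category].append(hours)
    return {category: mean(hours) for category, hours in durations.items()}


counts = problems_per_category(problems)
avg_hours = avg_resolution_hours(problems)
```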
Printer
Links to
Incidents associated with this Problem
Other Problems whose resolution is associated with this Problem
Priority (for example in stages 1, 2 and 3): The result from the combination of urgency and the degree of severity
Relationships to CIs
Problem category, usually selected from a category tree as in the following example (Problem categories should be harmonized with CI and Incident categories to support matching between Incidents, Problems and CIs):
Hardware error
Server A
Component x
Component y
Symptom a
Symptom b
Software error
Links to related Problem Records (if there are other outstanding Problems related to this one)
Links to related Incident Records (if outstanding Incidents exist whose solution depends on the solution of this Problem)
Links to Known Errors and Workarounds (if Known Errors and Workarounds related to the Problem have been identified)
Problem Recovery Procedures: Any procedures that must be performed to eliminate the Problem. These procedures may need to be performed as part of removing Workarounds that were applied while solving related Incidents.
Activity log / resolution history
Date and time
Person in charge
Description of activities
New Problem status (if the activity results in a change of status)
Priority (e.g. in stages 1, 2 and 3): A function of urgency and the degree of severity
Documentation of the root cause of the Problem (Known Error)
Documentation of possible Workarounds
Documentation of the applied (causal) resolution
Date of Problem resolution
Date of Problem closure
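The Problem Record checklist describes priority as a function of urgency and degree of severity, expressed in stages such as 1, 2 and 3. One possible way to derive it is sketched below. The three-level scales and the thresholds are assumptions chosen for illustration; the source does not define a specific mapping, and each organization would set its own.

```python
# Assumed three-level scales for both inputs (not defined by the source).
LEVELS = {"high": 3, "medium": 2, "low": 1}


def problem_priority(urgency: str, severity: str) -> int:
    """Map urgency and severity to a priority stage (1 = highest, 3 = lowest).

    The thresholds below are illustrative only.
    """
    score = LEVELS[urgency] + LEVELS[severity]  # ranges from 2 to 6
    if score >= 5:
        return 1
    if score >= 4:
        return 2
    return 3


priority = problem_priority("high", "medium")
```

An unknown urgency or severity label raises a KeyError, which keeps invalid input from silently receiving a priority.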
Problems with special importance regarding Availability, Capacity, IT Service Continuity and IT Security Management
Description
Problem cause
Applied resolution strategy
Elimination of the root cause
Possible Workarounds
Other important Problems with extensive effects upon the quality of the IT Services
Description
Problem cause
Applied resolution strategy
Elimination of the root cause
Possible Workarounds
Roles | Responsibilities
Responsibility Matrix: ITIL Problem Management
ITIL Role | Sub-Process: Problem Manager, Applications Analyst[3], Technical Analyst[3]
Proactive Problem Identification
Problem Categorization and Prioritization
Problem Diagnosis and Resolution
A[1]R[2] AR -
AR
AR
AR AR AR
Problem Manager (Process Owner) - The Problem Manager is responsible for managing the lifecycle of all Problems. The primary objectives of this role are to prevent Incidents from happening and to minimize the impact of Incidents that cannot be prevented. To this end, the Problem Manager maintains information about Known Errors and Workarounds.
Remarks
[1] A: Accountable according to the RACI Model: Those who are ultimately accountable for the correct and thorough completion of the Problem Management process.
[2] R: Responsible according to the RACI Model: Those who do the work to achieve a task within Problem Management.
[3] see Role descriptions...
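The A and R codes used in the responsibility matrix can be checked mechanically. The sketch below applies the common RACI convention that each sub-process has exactly one Accountable role and at least one Responsible role. The role assignments in the sample matrix are placeholders for illustration and are not taken from the matrix above.

```python
def check_raci(matrix):
    """Return a dict of sub-process -> list of issues found.

    `matrix` maps each sub-process to {role: code}, where code is a
    string such as "AR", "R" or "-".
    """
    issues = {}
    for sub_process, assignments in matrix.items():
        accountable = [role for role, code in assignments.items() if "A" in code]
        responsible = [role for role, code in assignments.items() if "R" in code]
        found = []
        if len(accountable) != 1:
            found.append(f"expected exactly one Accountable, got {len(accountable)}")
        if not responsible:
            found.append("no Responsible role assigned")
        if found:
            issues[sub_process] = found
    return issues


# Placeholder assignments (assumed for illustration only).
matrix = {
    "Proactive Problem Identification": {
        "Problem Manager": "AR", "Applications Analyst": "-", "Technical Analyst": "-",
    },
    "Problem Diagnosis and Resolution": {
        "Problem Manager": "AR", "Applications Analyst": "R", "Technical Analyst": "R",
    },
}
issues = check_raci(matrix)
```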
IT Operations Control
Objective: IT Operations Control aims to monitor and control the IT services and their underlying infrastructure. The process executes day-to-day routine tasks related to the operation of infrastructure components and applications. This includes job scheduling, backup and restore activities, print and output management, and routine maintenance.
Part of: Service Operation
Process Owner: IT Operations Manager
Process Description
ITIL does not provide a detailed explanation of all aspects of IT Operations, as the activities to be carried out depend on the specific applications and infrastructure components in use. Rather, ITIL 2011 highlights common operational activities and assists in identifying important interfaces with other Service Management processes. The official ITIL publications treat IT Operations Control as a "function". The process overview of IT Operations Control shows the most important interfaces (see Figure 1). Remark: In ITIL V3, IT Operations Control activities were covered in the process "IT Operations Management".
Roles | Responsibilities
IT Operations Manager (Process Owner) - An IT Operations Manager is needed to take overall responsibility for a number of Service Operation activities. For instance, this role ensures that all day-to-day operational activities are carried out in a timely and reliable way.
IT Operator - IT Operators are the staff who perform the day-to-day operational activities. Typical responsibilities include performing backups, ensuring that scheduled jobs are performed, and installing standard equipment in the data center.
Remarks
[1] A: Accountable according to the RACI Model: Those who are ultimately accountable for the correct and thorough completion of the IT Operations Control process.
[2] R: Responsible according to the RACI Model: Those who do the work to achieve a task within IT Operations Control.
IT Facilities Management
Objective: The objective of ITIL Facilities Management is to manage the physical environment where the IT infrastructure is located. Facilities Management includes all aspects of managing the physical environment, for example power and cooling, building access management, and environmental monitoring.
Part of: Service Operation
Process Owner: Facilities Manager
Process Description
ITIL Facilities Management is part of ICT Infrastructure Management in ITIL V2, where some aspects of managing facilities are described in more detail than in the new ITIL V3 books. Interfaces between Facilities Management and the other ITIL processes were adjusted to reflect the new ITIL V3 process structure. The process overview of ITIL Facilities Management shows the most important interfaces (see Figure 1). Note: The official ITIL publications treat Facilities Management as a "function".
Roles | Responsibilities
Facilities Manager (Process Owner) - The Facilities Manager is responsible for managing the physical environment where the IT infrastructure is located. This includes all aspects of managing the physical environment, for example power and cooling, building access management, and environmental monitoring.
Responsibility Matrix: ITIL Facilities Management
ITIL Role | Sub-Process: Facilities Manager
Facilities Management (no sub-processes specified): A[1]R[2]
Remarks
[1] A: Accountable according to the RACI Model: Those who are ultimately accountable for the correct and thorough completion of the ITIL Facilities Management process.
[2] R: Responsible according to the RACI Model: Those who do the work to achieve a task within ITIL Facilities Management.
Application Management
Process Description
Application Management is treated in ITIL as a "function". It plays an important role in the management of applications and systems. Many, but not all, Application Management activities are embedded in various ITIL processes. For this reason, at IT Process Maps we decided to introduce an Application Management process as part of the ITIL Process Map, which contains the Application Management activities not covered in any other ITIL process. Application Management activities embedded in other processes are shown there, with responsibility assigned to the Applications Analyst role. The process overview of ITIL Application Management shows the most important interfaces (see Figure 1).
A[1]R[2]
Technical Management
Process Description
Technical Management is treated in ITIL as a "function". It plays an important role in the management of the IT infrastructure. Many, but not all, Technical Management activities are embedded in various ITIL processes. For this reason, at IT Process Maps we decided to introduce a Technical Management process as part of the ITIL Process Map, which contains the Technical Management activities not covered in any other ITIL process. Technical Management activities embedded in other processes are shown there, with responsibility assigned to the Technical Analyst role. The process overview of ITIL Technical Management shows the most important interfaces (see Figure 1).
Roles | Responsibilities
Technical Analyst - Process Owner - is a Technical Management role which provides technical expertise and support for the management of the IT infrastructure. There is typically one Technical Analyst or team of analysts for every key technology area. This role plays an important part in the technical aspects of designing, testing, operating and improving IT services. It is also responsible for developing the skills required to operate the IT infrastructure.
Remarks
[1] A: Accountable according to the RACI Model: Those who are ultimately accountable for the correct and thorough completion of the ITIL Technical Management process.
[2] R: Responsible according to the RACI Model: Those who do the work to achieve a task within ITIL Technical Management.
Responsibility Matrix: ITIL Technical Management
ITIL Role | Sub-Process: Technical Analyst
A[1]R[2]