Facility Operations Maturity Model For Data Centers: White Paper 197
Executive summary
An operations & maintenance (O&M) program determines
to a large degree how well a data center lives up to its
design intent. The comprehensive data center facility
operations maturity model (FOMM) presented in this
paper is a useful method for determining how effective
that program is, what might be lacking, and for benchmarking
performance to drive continuous improvement
throughout the life cycle of the facility. This understanding
enables ongoing concrete actions that make the data
center safer, more reliable, and operationally more
efficient.
by Schneider Electric White Papers are now part of the Schneider Electric white
paper library produced by Schneider Electric’s Data Center Science Center
Introduction
Every data center relies on effective operation, maintenance, and management by well-trained,
organized human beings. This program of operations and maintenance (O&M) plays
a critical role in how successful a data center is in meeting its design goals and business
objectives. White Paper 196, Essential Elements of Data Center Facility Operations,
describes twelve key components that make up an effective O&M program. This information
can be used to develop a program or be used as a tool for performing a quick and basic gap
analysis on an existing program. This “maturity model” white paper, on the other hand,
moves beyond just describing the high level elements of a good program. This paper
provides a more detailed framework for evaluating and benchmarking all aspects of an
existing program. This comprehensive and standardized framework offers a means to
determine to what level or degree the program is implemented, used, managed, and measurable.
Armed with this information, facility operations teams can better ensure their O&M
program continuously lives up to their data center’s specific design and business goals
throughout the life cycle of the facility.
Maturity model’s role in the data center life cycle
Figure 1 shows the various phases of the data center life cycle. The primary focus of a
facility operations team is naturally the “Operate” phase. However, facilities team
involvement in the early planning, design, and commissioning phases is important. Their
detailed and practical knowledge of operations and maintenance can help avoid poor design
and construction choices that might otherwise compromise performance,
efficiency, and/or availability once the data center becomes operational.
Figure 1
Assessing performance and
O&M maturity are key tasks
within the data center life
cycle
To learn more about the benefits of including facility operation teams in earlier phases of the
life cycle, see The Green Grid’s White Paper 52, An Integrated Approach to Operational
Efficiency and Reliability.
As described in White Paper 196, Essential Elements of Data Center Facility Operations, it is
important to monitor, measure, and report on the performance of the data center so that
performance, efficiency, and resource-related problems can be avoided or, at least, identified
early. Besides problem prevention, assessments are necessary to benchmark performance,
determine whether changes are needed and what specific steps are required to reach the
next desired performance or maturity level. The maturity model presented in this paper offers
a framework for assessing the completeness and thoroughness of an O&M program. Ideally,
an organization would do the first assessment during Commissioning for new data centers or
as soon as possible for an existing data center. Next, results should be compared against
the data center’s goals for criticality, efficiency, and budget. Gaps should be identified and
decisions made as to whether any changes need to be made in the program. Once the level
of maturity has been benchmarked in this way, periodic assessments using the model should
be conducted at regular intervals (perhaps annually) or whenever there is a major change in
personnel, process, budget, or goals for the facility that might warrant a significant change in
the O&M program.
How the model works
The Schneider Electric data center facility operations maturity model (FOMM) proposed in
this paper has a form and function based on the IT Governance Institute’s maturity model
structure¹. The model is built around 7 core disciplines (see Figure 2). Each discipline has
several operations-related elements associated with it. Each element is further divided into
several sub-elements. Each sub-element is graded or ranked on a scale of “1” to “5” (see
Figure 3) with “1” being the least mature and “5” being the most developed. For each of these
program sub-elements, each of the five maturity levels is defined in terms of the specific
criteria needed to achieve that particular score. The score criteria and the model they support
have been tested and vetted with real data centers and their owners. The score criteria
represent a realistic view of the spectrum and depth of O&M program elements that owners
have in place today, ranging from poorly managed data centers to highly evolved, forward-thinking
data centers with proactive, measurable programs.
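The discipline, element, and sub-element hierarchy described above can be sketched as a small data structure. This is a minimal illustration only, not the FOMM tool itself: the class names and the plain averaging used to roll scores up are assumptions, since the paper does not define how scores are aggregated.

```python
from dataclasses import dataclass, field
from statistics import mean

# Scale taken from the text: 1 = least mature, 5 = most developed.
MIN_SCORE, MAX_SCORE = 1, 5


@dataclass
class SubElement:
    """The unit that is actually graded in an assessment."""
    name: str
    score: int

    def __post_init__(self) -> None:
        if not MIN_SCORE <= self.score <= MAX_SCORE:
            raise ValueError(
                f"{self.name}: score must be between {MIN_SCORE} and {MAX_SCORE}"
            )


@dataclass
class Element:
    """An operations-related element made up of several sub-elements."""
    name: str
    sub_elements: list[SubElement] = field(default_factory=list)

    def average_score(self) -> float:
        # Assumption: a plain mean; the paper does not define a roll-up rule.
        return float(mean(s.score for s in self.sub_elements))


@dataclass
class Discipline:
    """One of the core disciplines; groups several elements."""
    name: str
    elements: list[Element] = field(default_factory=list)

    def average_score(self) -> float:
        # Assumption: every sub-element in the discipline is weighted equally.
        scores = [s.score for e in self.elements for s in e.sub_elements]
        return float(mean(scores))
```

An assessor could build one `Discipline` per core discipline, fill in the graded sub-elements, and compare `average_score()` values between assessments to see where the program has moved.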
Figure 2
The FOMM is divided into 7 disciplines that are further divided into elements and
sub-elements. This image shows the 7 disciplines and their 26 elements only.
The 7 disciplines are: Environmental Health & Safety Management, Emergency Preparedness &
Response, Maintenance Management, Site Management, Operations Management, Change
Management, and Quality Management.
1 http://www.itgi.org/
Figure 3
Maturity level scale: 1 2 3 4 5
2 Formal training is defined as a set of activities that combines purpose-specific written materials with
oral presentation, practical demonstration, or hands-on practice, along with a written evaluation.
• Monitoring is performed.
• The activity is under constant improvement.
• Formal training on the activity is being routinely performed and tracked.
• Automated tools are employed, but in a limited and fragmented way.
Level 5: Optimized
• Affected personnel are trained in the means and goals of the activity.
• Documentation is present.
• Monitoring is performed.
• The activity is under constant improvement.
• Formal training on the activity is being routinely performed and tracked.
• Automated tools are employed in an integrated way, to improve quality and effectiveness
of the activity.
Figure 4
Example of how to depict a sub-element’s present level of maturity score; the
colors indicate to what degree the score meets goals. The example charts these
sub-elements against Levels 1 to 5: Hazardous Materials, Hazardous Comms,
Program Structure, Hazard Analysis, Training, LOTO, PPE.

Level qualification
1 Non-existent/Initial
2 Reactive
3 Proactive
4 Managed & Measured
5 Optimized

Score quantification
Not observed or N/A
Target
Partially achieved
Achieved
Figure 5 shows a unique score graphic called a “Risk Identification Chart” which shows the
level of risk (i.e., threat of system disruption; 100% represents highest risk) by line of inquiry.
That is, for any element in the model, each has sub-elements related to one of three “lines of
inquiry”: process, awareness & training, and implementation in the field (of whatever task,
knowledge, resources, etc. are required for that element to be in place). The scores for the
sub-elements are then grouped and divided based on these three lines of inquiry. These
particular lines of inquiry represent three key focus areas of any highly reliable and mature
data center facility operations team. Knowing which of the three areas poses the greatest risk
to the facility helps organizations more quickly identify the type and amount of resources
needed to make corrections. Immediate corrective action plans should be developed to
address any element with risk levels at 60% or above.
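The grouping by line of inquiry and the 60% action threshold can be illustrated with a short sketch. The paper does not state how a risk percentage is derived from sub-element scores, so the linear mapping below (an average score of 5 gives 0% risk, an average of 1 gives 100%) is purely an assumption, as are the function names.

```python
from collections import defaultdict

# The three lines of inquiry named in the text.
LINES_OF_INQUIRY = ("process", "awareness & training", "implementation")

# Threshold from the text: 60% or above calls for an immediate corrective action plan.
ACTION_THRESHOLD = 60.0


def risk_percent(scores):
    """Map 1-5 maturity scores to a risk percentage.

    Assumption (not from the paper): risk falls linearly from 100% at an
    average score of 1 to 0% at an average score of 5.
    """
    average = sum(scores) / len(scores)
    return (5 - average) / 4 * 100


def risk_by_line_of_inquiry(graded):
    """Group (line_of_inquiry, score) pairs and return risk % per line."""
    grouped = defaultdict(list)
    for line, score in graded:
        if line not in LINES_OF_INQUIRY:
            raise ValueError(f"unknown line of inquiry: {line!r}")
        grouped[line].append(score)
    return {line: risk_percent(scores) for line, scores in grouped.items()}


def needs_immediate_action(risks):
    """Return the lines of inquiry whose risk is at or above the threshold."""
    return [line for line, risk in risks.items() if risk >= ACTION_THRESHOLD]
```

With a different scoring-to-risk rule, only `risk_percent` would need to change; the grouping and threshold check follow the text directly.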
Figure 6 shows a method for taking sub-elements that are deemed to have unacceptable
scores and ranking them based on how easy they are to improve (or implement) vs. their
impact on operations (if corrected). This is an effective way to help organizations prioritize
“where to go from here” based on FOMM goals, business objectives, time, and available
resources. “Quick wins” can be easily identified and separated from items that fit longer-term
strategic objectives that might require significant changes in staff competencies and behaviors.
Baselining the current implementation of the O&M program against the organization’s
desired levels should then lead to a concrete action plan with defined goals and owners.
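The ease-versus-impact ranking can be sketched as a simple classification. The 1 to 5 rating scales, the threshold of 3, and the two labels other than “quick win” and “strategic” are hypothetical choices made for illustration; the paper only distinguishes quick wins from longer-term strategic items.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A sub-element with an unacceptable score, rated for prioritization."""
    name: str
    ease: int    # assumed scale: 1 = hardest/costliest to implement, 5 = easiest
    impact: int  # assumed scale: 1 = least operational impact, 5 = most


def classify(candidate, threshold=3):
    """Place a candidate in one of four quadrants (threshold is an assumption)."""
    easy = candidate.ease >= threshold
    high_impact = candidate.impact >= threshold
    if easy and high_impact:
        return "quick win"
    if high_impact:
        return "strategic"    # high impact but hard: longer-term objective
    if easy:
        return "fill-in"      # hypothetical label: easy but low impact
    return "deprioritize"     # hypothetical label: hard and low impact


def rank(candidates):
    """Order candidates by impact first, then by ease, both descending."""
    return sorted(candidates, key=lambda c: (-c.impact, -c.ease))
```

Rating each unacceptable sub-element once and passing the list through `rank` gives a first-cut ordering for the action plan, which the team can then adjust against time and available resources.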
Figure 6
Example illustration of how to rank elements in terms of ease of implementation and impact on operations.
Axes: cost and ease of implementation (most to least) versus operational impact significance (most to least).
Items plotted: Metrics Management, CMMS/DCIM, Staffing, Operational Procedures, Organizational
Documentation, Change Control, Training, Review/Revision, Security Policy, Mission Statement.
Those who determine they lack the required time, expertise, or objectivity would be best
served to hire a third-party service provider with good facility operations experience. A third
party would more likely play an independent and objective role in the process having no
investment in the way things “have always been done”. There’s also value in having a “new
set of eyes” judge the program whose fresh viewpoint might yield more insightful and
actionable data analysis. Experienced service vendors offer the benefit of having knowledge
gained through the repeated performance of data center assessments throughout the
industry. Broad experience makes the third party more efficient and capable. This knowledge
makes it possible, for example, to provide their customer with an understanding of how their
O&M program compares to their peers or other data centers with similar business requirements.
Beyond performing the assessment and helping to set goals, experienced third
parties can also be effective at providing implementation oversight which might lead to a
faster return on investment, especially when resources are already constrained.
Conclusion
Preventing or reducing the impact of human error and system failures, as well as managing
the facility efficiently, all require an effective and well-maintained O&M program. Ensuring
such a program exists and persists over time requires periodic reviews and effort to reconcile
assessment results with business objectives. With an orientation towards reducing risk, the
Facility Operations Maturity Model presented and attached to this paper is a useful framework
for evaluating and grading an existing program. Use of this assessment tool will enable
teams to thoroughly understand their program including:
• Whether and to what degree the facility is in compliance with statutory regulations and
safety requirements
• How responsive and capable staff is at handling and mitigating critical events and
emergencies
• The level of risk of system interruption from day-to-day operations and maintenance
activities
• Levels of staff knowledge and capabilities
Also know that grading and assessment of results is best done by an experienced, unbiased
assessor.
Patrick Donovan is a Senior Research Analyst for the Data Center Science Center at
Schneider Electric. He has over 18 years of experience developing and supporting critical
power and cooling systems for Schneider Electric’s IT Business unit including several award-
winning power protection, efficiency and availability solutions. An author of numerous white
papers, industry articles, and technology assessments, Patrick's research on data center
physical infrastructure technologies and markets offers guidance and advice on best practices
for planning, designing, and operating data center facilities.
© 2014 Schneider Electric. All rights reserved.