Section 4 Data Center Electrical Design - Architecture Resilience
Sub-Transmission System
The sub-transmission system is pivotal in the electricity distribution process, bridging the
gap between the high-voltage (HV) transmission network and medium-voltage (MV)
substations within a specific region or locality. Its primary purpose is to facilitate the
transition of electricity from the HV grid, designed for long-distance transmission, to levels
suitable for local distribution via the MV network.
In terms of voltage levels, sub-transmission lines typically operate at voltages higher than
those used in MV distribution but lower than the HV transmission lines. Common voltage
levels range from 34.5 kV to 161 kV, tailored to regional standards and operational
requirements.
Substations within the sub-transmission system play a crucial role by stepping down
voltages from the HV grid to levels appropriate for the MV distribution network. They serve
as key points for voltage transformation, switching operations, monitoring, and control
before electricity proceeds to local consumers.
In summary, the sub-transmission system acts as a critical intermediary between the long-
distance HV transmission network and local MV distribution networks. Through its meshed
topology and strategic voltage management, it facilitates reliable electricity distribution
within regional or local contexts, essential for sustaining economic activities and
community infrastructure.
MV Distribution Networks
MV distribution networks play a critical role in delivering electricity from HV/MV
substations to medium-voltage (MV) customers and MV/LV distribution transformers,
operating within voltage ranges typically from 2.4 kV to 34.5 kV as per regional standards.
These networks are characterized by an open-loop topology, where radial feeders extend
from substations to various endpoints such as industrial zones, commercial areas, and
residential neighborhoods. Radial feeders are unidirectional paths that do not loop back to
the substation, ensuring straightforward power flow direction. This design offers
advantages such as efficient fault isolation, enabling quick identification and containment
of faults to minimize disruptions and expedite power restoration to unaffected areas.
Additionally, the open-loop topology provides operational flexibility, allowing for easy
network reconfiguration during maintenance or changes in electricity demand, optimizing
load balancing and overall network efficiency. Scalability is another key benefit, as the
design allows for expansion by adding new feeders or connecting additional distribution
transformers, supporting the growth and adaptation of urban and industrial areas to meet
evolving energy needs and ensuring reliable electricity supply across diverse sectors.
LV Distribution Networks
Low Voltage (LV) networks are essential for distributing electricity from MV/LV
transformers directly to low-voltage (LV) customers, including residential buildings, small
businesses, and other low-power consumers. These networks typically operate at
standard voltages like 120V or 240V for residential areas and 208V or 480V for commercial
and industrial zones. Using a radial architecture, LV networks feature unidirectional
feeders that extend from MV/LV transformers to individual customers or groups of
customers. Each feeder originates from a transformer and extends outward to serve a
specific area, ensuring electricity flows along a single path without looping back.
Customers connect at the end of the feeder line, receiving electricity directly from the
transformer through the feeder. Radial architectures are prized for their simplicity, cost-
effectiveness, and ease of maintenance, making them ideal for areas with consistent load
characteristics and predictable electricity demand patterns. Key advantages include
reduced infrastructure needs, straightforward fault detection and isolation, and scalability
through the addition of new feeders or transformers to accommodate future growth in
electricity demand or customer distribution changes. LV networks with radial architecture
efficiently deliver reliable and cost-effective power supply to residential, commercial, and
industrial areas while enabling straightforward expansion and maintenance as needed to
support evolving energy requirements.
Grid HV Level Operations and Emergency Procedures
Fault Handling on HV Lines
The HV (High Voltage) grid employs a meshed topology, comprising interconnected paths
of lines and substations, which enhances reliability by providing redundancy and
alternative routes for electricity flow. Zone protection substations strategically located
throughout the grid employ protective relays and devices to swiftly detect faults and
isolate affected sections. When a fault occurs, protective relays at the nearest zone
protection substation detect it within milliseconds, triggering actions like opening circuit
breakers or disconnecting switches to isolate the faulted area.
During the fault clearing process, there is a transient voltage drop in the affected zone,
which can impact nearby substations. However, due to the meshed topology and rapid
fault detection and isolation, the overall impact on customers is minimized. Customers
typically experience brief interruptions or voltage dips, but power is restored swiftly,
ensuring no permanent loss of service. Engineers and operators closely monitor voltage
stability throughout this process using protective devices and automation systems to
maintain grid stability and minimize disruptions.
In summary, the HV grid's meshed topology with zone protection substations enables
efficient fault management, ensuring minimal customer downtime and enhancing the
overall reliability and resilience of the HV power transmission system.
The restoration timeline for permanent faults varies widely, typically ranging from a few
minutes to several hours depending on factors like accessibility to the fault location,
availability of replacement equipment, and the complexity of the repair needed. Utilities
prioritize preventive maintenance and monitoring programs to minimize the occurrence of
faults and maximize system reliability. Effective emergency response plans enable rapid
mobilization of repair crews and efficient deployment of resources during fault restoration,
ensuring swift recovery and minimal disruption to MV customers.
Grid Supply Reliability Performance Variability
Reliability Performance Variation
Grid supply reliability is a pivotal factor in designing data center electrical systems,
influencing decisions on backup strategies and overall resilience. This reliability varies
significantly by country and location, with developed nations generally boasting more
stable grids due to advanced infrastructure and robust maintenance. In contrast,
developing countries may face more frequent outages due to infrastructure challenges and
underinvestment. Regulatory frameworks also play a crucial role, shaping grid standards,
maintenance schedules, and investment requirements for utilities. Within countries, urban
areas typically enjoy more reliable power due to higher infrastructure density and better
maintenance, whereas rural areas often contend with longer outage durations and less
frequent maintenance. Proximity to power generation sources such as plants and dams
can enhance reliability, reducing transmission losses. Natural disasters and geographical
obstacles like mountains or forests can further complicate grid reliability. Evaluating
historical outage data and consulting local utilities are essential steps in assessing grid
reliability for data center design, enabling informed decisions to mitigate risks and ensure
uninterrupted operations.
Example in Europe:
In Europe, the reliability of electrical utilities varies significantly depending on the type and
frequency of interruptions experienced across different voltage levels. Medium voltage
(MV) grid connections generally encounter more frequent interruptions and voltage drops
compared to high voltage (HV) grid connections. For instance, MV grid connections may
experience regional blackouts (lasting over 4 hours) approximately 0.01-0.02 times per
year, a rate comparable to that of HV grid connections. Long interruptions (over 3
minutes) occur on MV grids about 0.5-5 times per year, with HV grids experiencing them
less frequently, around 0.1 times annually. Short interruptions (less than 3 minutes) are
more common on MV grids, ranging from 0.5 to 20 times per year, compared to about 0.5
times per year on HV grids. Severe voltage drops are also more frequent on MV grids,
occurring 10-100 times per year, compared to 5-15 times per year on HV grids.
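The interruption statistics above can be turned into a rough availability estimate. The sketch below is illustrative only: the event frequencies come from the pessimistic end of the MV ranges quoted above, while the average per-event durations are assumptions chosen for the example, not figures from the text.

```python
# Illustrative sketch: expected annual unavailability from interruption
# statistics like those quoted above. Per-event durations are assumptions.

def expected_downtime_hours(events_per_year, avg_duration_hours):
    """Expected downtime contribution of one interruption class."""
    return events_per_year * avg_duration_hours

# Worst-case MV figures from the text, with assumed average durations.
mv_downtime = (
    expected_downtime_hours(0.02, 6.0)    # regional blackouts (>4 h), assume 6 h
    + expected_downtime_hours(5, 0.5)     # long interruptions (>3 min), assume 30 min
    + expected_downtime_hours(20, 0.02)   # short interruptions (<3 min), assume ~1 min
)

hours_per_year = 8760
availability = 1 - mv_downtime / hours_per_year
print(f"Expected MV downtime: {mv_downtime:.2f} h/yr, "
      f"availability = {availability:.5f}")
```

Even under these pessimistic assumptions, the grid alone delivers roughly "three nines" of availability, which is why backup layers are still needed for data center targets.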
These variations in reliability performance between MV and HV grids influence the design
considerations for data center electrical systems in Europe. Designers must prioritize
robust backup power solutions like UPS systems and generators to maintain continuous
operations during interruptions, particularly for longer outages typical on MV grids.
Redundancy in power paths and systems within the data center infrastructure is crucial to
mitigate the impact of interruptions and voltage fluctuations on critical operations. Real-
time monitoring systems play a pivotal role in promptly detecting interruptions and
automatically initiating backup power sources as needed to ensure uninterrupted service.
Compliance with local regulations and standards regarding grid reliability and backup
power requirements is essential to safeguard operational resilience and avoid potential
penalties. Understanding these reliability metrics enables data center designers in Europe
to effectively plan for high availability and minimize downtime during grid disturbances.
MV Grid Connection is suitable for data centers with lower power needs, typically up to
several megawatts, offering quicker and less costly establishment compared to HV
options. In contrast, HV Grid Connection with a dedicated HV/MV substation is essential
for high-demand data centers, supporting tens of megawatts with enhanced reliability and
reduced transmission losses.
Key considerations include evaluating local grid infrastructure, reliability needs, initial
setup costs, long-term investment benefits, scalability for future expansions, and
compliance with regulatory standards. Implementing the chosen grid connection involves
conducting feasibility studies, meticulous design and planning, stakeholder coordination
with utility providers and regulatory bodies, and infrastructure development to ensure
seamless integration and operational reliability.
Backup for Short Grid Interruptions
Utility interruptions below 3 minutes
In scenarios where short utility interruptions (lasting less than 3 minutes) occur, deploying
UPS (Uninterruptible Power Supply) technologies with small storage capacities is
essential. These interruptions, while brief, can disrupt operations, particularly impacting
sensitive equipment and critical processes in data centers or essential facilities. UPS
systems with small storage offer immediate backup power during such interruptions,
ensuring continuous operation without the need for manual intervention or system
reboots. They are typically sized to provide power for several minutes, while also
stabilizing voltage and frequency to protect against accompanying power quality issues
like voltage sags or spikes. When implementing UPS with small storage, it's crucial to
ensure the system's capacity matches critical load requirements, consider scalability for
future expansion, and comply with local regulatory standards for electrical equipment
safety and performance.
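The capacity-matching point above can be illustrated with a small runtime check. The battery energy, inverter efficiency, and load figures below are assumptions for the sketch, not recommendations.

```python
# Minimal sizing sketch: check that a UPS battery string covers a short
# grid interruption plus margin. All figures are illustrative assumptions.

def ride_through_minutes(battery_wh, inverter_eff, load_w):
    """Runtime in minutes the stored energy can support the load."""
    return battery_wh * inverter_eff / load_w * 60

battery_wh = 5000      # usable stored energy (assumption)
inverter_eff = 0.95    # inverter efficiency (assumption)
load_w = 20000         # critical IT load (assumption)

runtime = ride_through_minutes(battery_wh, inverter_eff, load_w)
required = 3 * 1.5     # 3-minute interruption plus 50 % safety margin

print(f"Runtime: {runtime:.1f} min, required: {required:.1f} min")
adequate = runtime >= required
```

The same check, rerun with projected future loads, covers the scalability consideration mentioned above.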
Rotary AC UPS
In a rotary UPS system, the output sine wave is generated by rotating machinery, typically
a motor-generator set (often combined with a flywheel for short-term energy storage),
rather than synthesized electronically from stored energy as in static UPS
systems. Here, a motor is powered by the utility grid, driving a generator that produces a
clean AC sine wave for the IT load. When the UPS detects deviations in utility voltage or
frequency, it uses its rectifier and inverter to supply controlled power to the motor, which
drives the generator maintaining stable output.
During a complete blackout, batteries supply power to the motor, allowing time for a
standby generator (external to the UPS) to start and reach operational speed, providing
sustained power to the facility. Unlike modular static UPSs, rotary UPS systems are non-
modular and must be oversized to accommodate potential future load increases. They also
need additional ventilation, both for heat dissipation and, in diesel rotary (DRUPS)
designs, for engine exhaust, and are typically housed in separate buildings or
purpose-built rooms.
Both rotary and static UPS systems are highly reliable, but rotary UPSs can offer longer
lifespans when maintained properly due to their reliance on motors rather than large
battery banks. These systems are best suited for centralized power architectures in data
centers and environments expecting frequent short power surges, like broadcast stations
with power-hungry amplifiers.
While static UPS systems dominate the data center market globally, rotary UPSs are more
prevalent in Europe. As data centers scale up and demand for reliable power increases, a
mix of UPS types may be deployed, each offering distinct advantages and considerations.
Therefore, selecting the right UPS type involves careful evaluation of specific operational
needs and considerations for the data center's power infrastructure.
UPS at Rack Level
Rack-mounted servers are central to modern network computing systems, requiring
reliable and compact power protection solutions that can keep up with their increasing
demands. These servers are typically housed in standard 19-inch server racks, which allow
for dense hardware configurations within a small footprint. Equipment is mounted in these
racks by fastening its front panel or mounting rails to the rack posts.
Integrating both servers and UPS systems within a single rack offers significant space-
saving benefits. Rack UPS systems are specifically designed for servers, networking
equipment, and other critical applications in rackmount environments. They are well-
suited for mission-critical systems, network workstations, IDF/network closets, large
network peripherals, VoIP systems, and workstations.
In today's converged networks, high availability and reliability are paramount. UPS systems
designed for rack installations must accommodate mixed load voltages and plug types
while remaining easy to install and maintain. These systems also feature communication
capabilities such as SNMP and web-based management, along with connections for
environmental sensors, enabling comprehensive monitoring and management.
Rackmount UPS units are versatile and can be deployed across various locations and
environments, ranging from large data centers to network closets and remote edge
locations. Their design ensures they can operate effectively and reliably wherever they are
installed, supporting critical IT infrastructure with efficient power protection and
management solutions.
Static DC UPS
The concept of using DC (Direct Current) throughout data centers instead of AC
(Alternating Current) has been proposed as a way to simplify power distribution.
Traditionally, data centers use AC power from the grid, which is then converted to DC to
charge UPS batteries and subsequently converted back to AC for distribution to racks. In
the early 2000s, some companies suggested eliminating the AC conversion steps by
directly distributing DC power to the racks after charging the batteries with DC. ABB, for
instance, showed interest by acquiring Validus, a prominent DC technology provider, in
2011.
However, despite these proposals, the widespread adoption of DC in data centers did not
materialize. Concerns about the safety of DC, compared to AC, played a significant role in
this reluctance. Additionally, the data center industry tends to be conservative when it
comes to adopting new technologies that diverge from established norms.
Despite the initial setback, the idea of using DC in data centers remains relevant. As
technology advances and priorities shift towards efficiency and sustainability, there may
be renewed interest in exploring DC-based solutions for data center power distribution in
the future.
Static UPS Systems with Lead-Acid Batteries are traditional and the most common type
used in data centers due to their reliability and proven track record. Generally less
expensive than lithium-ion batteries, they remain a popular choice for many data
centers. Lead-acid batteries have been used for decades,
providing a well-understood and reliable power backup solution. They are robust and can
handle high power loads, making them suitable for large-scale data center applications.
However, lead-acid batteries are heavy and require more space compared to lithium-ion
batteries, which can be a constraint in space-limited data centers. They also require
regular maintenance and have a shorter lifespan compared to lithium-ion batteries.
Additionally, lead-acid batteries have lower energy density and efficiency, leading to higher
operational costs over time.
Static UPS Systems with Lithium-Ion Batteries are becoming increasingly popular due to
their higher energy density, longer lifespan, and reduced maintenance needs. Lithium-ion
batteries provide higher energy density, allowing for more compact and lighter UPS
systems, which saves valuable space in data centers. They have a longer lifespan
compared to lead-acid batteries, reducing the frequency of replacements and associated
costs. Lithium-ion batteries require less maintenance, leading to lower operational costs
and increased reliability. They also offer higher efficiency, resulting in lower energy losses
and improved overall performance of the UPS system. However, lithium-ion batteries are
more expensive upfront compared to lead-acid batteries. They also require effective
thermal management systems to prevent overheating and ensure safe operation.
When choosing UPS systems, key considerations include load requirements, physical
space and weight constraints, cost versus performance, and environmental impact. It is
important to assess the power load requirements of the data center to determine the
appropriate UPS capacity and battery type. Consider the physical space and weight
constraints of the data center when selecting between lead-acid and lithium-ion batteries.
Evaluate the trade-offs between initial cost, maintenance, lifespan, and overall
performance to choose the most suitable UPS system for the data center's needs.
Additionally, consider the environmental impact and disposal requirements of the
batteries, especially for large-scale data centers.
In conclusion, static UPS systems with lead-acid and lithium-ion batteries are both widely
used in data centers, each offering distinct advantages and disadvantages. Lead-acid
batteries provide a cost-effective and reliable solution, while lithium-ion batteries offer
higher efficiency, longer lifespan, and reduced maintenance. The choice between these
two types depends on the specific requirements, constraints, and priorities of the data
center.
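The trade-off between the two chemistries can be sketched as a simple total-cost comparison. All prices, lifespans, and maintenance figures below are hypothetical placeholders; the structure of the calculation, not the numbers, is the point.

```python
# Hedged TCO sketch comparing lead-acid and lithium-ion battery strings
# over a UPS service life. All cost and lifetime figures are assumptions
# for illustration, not vendor data.

def battery_tco(upfront, lifespan_years, annual_maintenance, horizon_years):
    """Total cost over the horizon, including full replacements."""
    purchases = -(-horizon_years // lifespan_years)  # ceiling division
    return upfront * purchases + annual_maintenance * horizon_years

horizon = 15  # years of data center operation (assumption)

lead_acid = battery_tco(upfront=100_000, lifespan_years=5,
                        annual_maintenance=8_000, horizon_years=horizon)
lithium_ion = battery_tco(upfront=180_000, lifespan_years=10,
                          annual_maintenance=2_000, horizon_years=horizon)

print(f"Lead-acid 15-yr cost:   {lead_acid}")
print(f"Lithium-ion 15-yr cost: {lithium_ion}")
```

With these placeholder inputs the higher upfront cost of lithium-ion is recovered through fewer replacements and lower maintenance, which mirrors the qualitative argument above.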
Innovations in UPS Technologies for Data Centers
Why?
Uninterruptible Power Supply (UPS) systems and their associated energy storage
components are critical elements in the infrastructure of data centers. These systems
significantly impact the costs and energy losses associated with Low Voltage (LV)
equipment.
The initial investment in acquiring UPS systems and their energy storage components,
such as batteries, is a substantial part of the capital expenditures (CAPEX) for setting up a
data center. These costs cover not only the purchase of the equipment but also its
installation and integration into the existing infrastructure. UPS systems consume energy
continuously, even when not in active use, leading to ongoing operational expenses. These
energy costs are a critical factor in the overall cost of ownership and operation of a data
center.
The efficiency of energy storage systems directly affects energy costs. During charging and
discharging cycles, energy losses are inherent, reducing the overall efficiency of the
system. By improving battery efficiency, these losses can be minimized, leading to lower
energy costs over time. Higher energy density batteries, such as lithium-ion, offer a more
compact solution, reducing the physical footprint of the UPS systems and associated
cooling requirements. This optimization of space can lead to further cost reductions and
more efficient use of available data center real estate.
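The effect of UPS efficiency on operating cost described above can be quantified with a short sketch. The load, tariff, and efficiency values below are assumptions for illustration.

```python
# Illustrative calculation of how UPS efficiency affects annual energy
# cost. Load, efficiency, and tariff values are assumptions.

def annual_loss_cost(load_kw, ups_efficiency, price_per_kwh):
    """Cost of the energy dissipated in the UPS over one year."""
    input_kw = load_kw / ups_efficiency
    loss_kw = input_kw - load_kw
    return loss_kw * 8760 * price_per_kwh

load_kw = 1000   # critical load carried by the UPS (assumption)
price = 0.12     # energy tariff per kWh (assumption)

cost_94 = annual_loss_cost(load_kw, 0.94, price)
cost_97 = annual_loss_cost(load_kw, 0.97, price)
savings = cost_94 - cost_97

print(f"Annual loss at 94 % efficiency: {cost_94:,.0f}")
print(f"Annual loss at 97 % efficiency: {cost_97:,.0f}")
print(f"Savings from the efficiency gain: {savings:,.0f}")
```

A three-point efficiency gain on a 1 MW load saves tens of thousands per year under this tariff, which is why conversion losses dominate UPS OPEX discussions.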
The importance of UPS and energy storage in Low Voltage (LV) equipment is multifaceted,
influencing both capital expenditures and ongoing operational costs. Enhancing energy
efficiency and optimizing battery performance are key strategies for reducing the financial
impact of these critical systems in data centers.
Modular UPS systems consist of multiple smaller UPS units that can be combined to
provide the required power capacity. The key benefits of modular UPS systems include
scalability, redundancy, and efficiency. They offer easy scalability, allowing data centers to
adjust power capacity according to changing needs, thereby optimizing capital
expenditure (CAPEX). These systems also enhance reliability since each module can
operate independently, reducing the risk of a single point of failure. Additionally, they can
be operated at optimal efficiency, reducing energy losses and operational expenditure
(OPEX).
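The redundancy idea above can be sketched as a simple N+1 sizing calculation, assuming a hypothetical module rating.

```python
# Sketch of N+1 sizing for a modular UPS, under an assumed module rating.
import math

def modules_needed(load_kw, module_kw, redundancy=1):
    """Number of modules to carry the load with `redundancy` spare(s)."""
    n = math.ceil(load_kw / module_kw)  # modules required for the load
    return n + redundancy               # plus redundant spares

load_kw = 850    # design load (assumption)
module_kw = 200  # per-module rating (assumption)

total = modules_needed(load_kw, module_kw)  # N+1 configuration
print(f"{total} x {module_kw} kW modules for a {load_kw} kW load (N+1)")
```

Growing the load later means adding modules rather than replacing the whole system, which is the CAPEX advantage the paragraph above describes.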
Three-level converters are a power conversion topology that reduces voltage stress on
components and improves efficiency. These converters lead to improved efficiency by
lowering switching losses and reducing harmonic distortion. This results in higher overall
efficiency and a longer lifespan of components due to reduced voltage stress, which
decreases maintenance costs and OPEX.
Silicon Carbide (SiC) and Gallium Nitride (GaN) semiconductors offer superior electrical
properties compared to traditional silicon-based devices. These semiconductors provide
higher efficiency by reducing conduction and switching losses, thereby improving energy
efficiency. They also allow for more compact and efficient UPS designs due to smaller and
lighter components, potentially reducing CAPEX. Additionally, their better thermal
performance reduces cooling requirements, lowering OPEX.
Dynamic line-interactive UPS systems combine the features of line-interactive and online
double-conversion UPS systems, providing efficient power conditioning and protection.
These systems offer energy savings by avoiding double conversion losses during normal
operation, and they ensure quick response by transitioning to battery mode rapidly during
power disturbances, thus ensuring continuous protection.
Transformerless UPS systems eliminate the traditional transformer used in UPS systems,
resulting in a more compact and efficient design. The benefits include reduced weight and
size, making installation easier and saving space, which optimizes CAPEX. These systems
also improve efficiency by lowering energy losses without the transformer, thereby
reducing OPEX and lowering initial cost and maintenance expenses.
Innovations in power conversion topologies for static UPS systems are driving significant
improvements in both CAPEX and OPEX for data center operators. By adopting new
technologies such as modular UPS systems, three-level converters, advanced
semiconductors, dynamic line-interactive UPS, transformerless designs, and advanced
battery management systems, data centers can achieve higher efficiency, reliability, and
scalability. These advancements not only reduce initial investment and operational costs
but also enhance the overall resilience and performance of critical power infrastructure.
UPS function embedded at the server power supply unit (PSU) level
Embedding UPS functionality directly into the server power supply unit (PSU) represents a
significant innovation in data center power management. This approach integrates a small-
scale UPS within each server's PSU, decentralizing power backup and offering several
advantages across efficiency, cost, space utilization, scalability, and reliability.
By embedding UPS functionality at the PSU level, data centers can achieve improved
efficiency through minimized power conversion steps, reducing the energy losses
associated with traditional centralized UPS systems. Backup power is supplied directly to
servers without intermediary systems, enhancing overall power usage efficiency.
The elimination of centralized UPS units also frees up valuable floor space, optimizing
space utilization and potentially accommodating more servers per rack. The compact
design of embedded UPS systems fits within the existing footprint of server PSUs,
maximizing the use of available space.
On the cost side, this integration strategy lowers initial capital expenditures by reducing
reliance on large, expensive centralized UPS systems, and it eliminates the need for
separate UPS rooms or extensive cabling.
Embedding UPS functionality at the server PSU level also supports scalability and
flexibility: power backup capability grows simply by adding servers, enabling incremental
growth without a complete reconfiguration of the power backup system.
Reliability improves as well, because the centralized UPS as a single point of failure is
eliminated; a failure in one embedded UPS unit affects only the associated server, not the
entire system. Managing smaller, distributed UPS units simplifies UPS management and
reduces ongoing maintenance costs compared to maintaining a large centralized system,
and rapid response times during power disturbances ensure immediate continuity for each
server.
However, challenges such as increased thermal load from UPS integration within PSUs,
battery lifespan management, and higher initial per-server costs require careful
consideration to fully leverage the long-term benefits of embedding UPS functionality at
the server PSU level.
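The claim about minimized conversion steps can be illustrated by multiplying stage efficiencies in series. The per-stage figures below are assumptions for the sketch; real UPS and PSU efficiencies vary with load and topology.

```python
# Back-of-envelope comparison of end-to-end conversion efficiency for a
# centralized double-conversion UPS chain versus a PSU-embedded backup
# stage. Stage efficiencies are assumptions for illustration.

def chain_efficiency(*stage_effs):
    """Overall efficiency of conversion stages connected in series."""
    eff = 1.0
    for e in stage_effs:
        eff *= e
    return eff

# Centralized path: UPS rectifier -> UPS inverter -> server PSU (AC-DC)
centralized = chain_efficiency(0.97, 0.97, 0.94)

# Embedded path: a single server PSU stage with integrated backup
embedded = chain_efficiency(0.94)

print(f"Centralized chain: {centralized:.3f}")
print(f"Embedded chain:    {embedded:.3f}")
```

Each stage removed from the series chain raises end-to-end efficiency multiplicatively, which is the core of the efficiency argument above.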
DC UPS systems offer several potential benefits. They can be more compact than AC UPS
systems due to fewer components like transformers and inverters, leading to space
savings and lower installation costs. Moreover, with the increasing prevalence of DC-
powered IT equipment (such as servers and storage devices with built-in DC-DC
converters), DC UPS systems can directly support these loads without additional
conversion losses, thereby improving overall compatibility and efficiency.
Integration with advanced battery technologies optimized for DC storage, such as lithium-
ion batteries, further enhances performance and longevity while reducing maintenance
efforts. Modular designs allow for scalable deployments and redundancy options,
enhancing system reliability without the complexity of synchronized AC systems.
However, challenges remain, including the need for standardized interfaces and
compatibility with existing AC-powered infrastructure, as well as ensuring compliance with
safety standards and grid integration requirements. While the initial investment in DC UPS
systems may be higher due to specialized components and integration needs, a
comprehensive analysis of total cost of ownership (TCO) can demonstrate long-term
savings in energy costs, maintenance, and operational efficiency.
In conclusion, as data centers strive for greater energy efficiency and sustainability, DC
UPS systems present a promising alternative to traditional AC architectures. Continued
advancements in battery technology and system efficiency are likely to drive further
adoption and innovation in DC UPS solutions, supporting the evolving needs of modern
data center environments.
Setting UPS Energy Storage Autonomy
Generator starting and loading time
Setting the UPS energy storage autonomy based on generator operations is crucial for
ensuring uninterrupted power supply during grid failures. Understanding generator starting
and loading times is essential: the startup sequence varies by generator type (diesel,
natural gas) and configuration, influenced by factors like fuel availability, engine warm-up,
and automatic transfer switch operation. Once started, the generator requires time to
stabilize and reach full capacity to handle the data center's electrical load without issues.
Determining UPS energy storage autonomy involves ensuring it covers the time for
generator startup, stabilization, and load assumption. Estimating typical generator start
times using historical data or manufacturer specifications helps set a baseline, while
adding a safety margin accommodates unforeseen delays, ensuring reliable power
transition and minimizing downtime risks.
To determine and set UPS energy storage autonomy based on IT shutdown needs, estimate
the maximum time required for IT systems to complete an emergency shutdown process
based on operational procedures and historical data. Add a safety margin to this estimate
to accommodate unexpected delays or complexities in the shutdown process. Identify
critical IT loads that must remain operational until the end of the shutdown process to
maintain essential services.
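The two sizing criteria above (generator takeover time and IT emergency-shutdown time) can be combined in a small sketch: autonomy should cover the longer of the two paths plus a safety margin. All timings below are illustrative assumptions, not manufacturer data.

```python
# Sketch of setting UPS autonomy from the two criteria described above:
# generator takeover time and IT emergency-shutdown time. All timings
# are illustrative assumptions.

def required_autonomy_min(gen_start, gen_stabilize, transfer,
                          shutdown_time, margin=1.5):
    """Autonomy in minutes covering the longer of the two scenarios,
    with a multiplicative safety margin."""
    generator_path = gen_start + gen_stabilize + transfer
    return max(generator_path, shutdown_time) * margin

autonomy = required_autonomy_min(
    gen_start=0.5,       # engine start (min, assumption)
    gen_stabilize=1.0,   # reach stable voltage/frequency (assumption)
    transfer=0.5,        # ATS operation and load assumption (assumption)
    shutdown_time=5.0,   # orderly IT shutdown (min, assumption)
)
print(f"Recommended UPS autonomy: {autonomy:.1f} min")
```

Basing the margin on historical start-time data, as the text suggests, simply means replacing the placeholder inputs with measured values.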
On the other hand, opting for a standby diesel power plant entails initial investments in
diesel generators, fuel storage systems, automatic transfer switches (ATS), and related
infrastructure. Installation costs depend on the generator capacity required to meet data
center power demands. Operational expenses include fuel procurement, storage
management, and regular maintenance of generator systems. Fuel costs are subject to
market fluctuations and consumption rates during extended outage scenarios.
Diesel generators provide flexibility and independence from utility grid dependencies,
ensuring a reliable backup power source during prolonged grid outages. Control over fuel
supply and maintenance schedules can enhance operational resilience and
responsiveness during emergency situations.
Alternative Technologies:
Gas generators
As data centers continue to evolve in resilience and sustainability, the consideration of
alternative backup technologies like gas generators is gaining traction. Gas generators can
utilize natural gas or propane as fuel sources, which are often more stable in pricing
compared to diesel. Natural gas availability is generally reliable in urban and industrial
areas, reducing logistical challenges associated with fuel procurement during
emergencies.
In terms of operational efficiency, gas generators often offer higher efficiency ratings and
lower operating costs per kilowatt-hour compared to diesel counterparts. A continuous
supply of natural gas enables extended runtime without frequent refueling, enhancing
operational autonomy during prolonged grid outages.
Noise levels and maintenance requirements also favor gas generators in urban or noise-
sensitive environments. They generally operate quieter than diesel engines, contributing to
their suitability for densely populated areas. Moreover, reduced maintenance needs and
longer service intervals translate to lower lifecycle costs and enhanced reliability over
time.
When considering deployment in data centers, initial capital expenditure (CAPEX) for gas
generators includes equipment procurement, installation, and necessary infrastructure
adaptations. Costs may vary based on generator capacity, fuel storage requirements, and
integration with existing electrical systems.
Ensuring compliance with local regulations and emissions standards is crucial, especially
in urban or environmentally sensitive areas. Permitting processes and environmental
assessments may be required to meet legal requirements and community expectations.
Seamless integration with automatic transfer switches (ATS) and power distribution
systems is essential for gas generators to effectively serve as backup power sources.
Advanced monitoring and control systems enable real-time oversight and remote
management of generator operations during emergencies, ensuring swift transitions from
grid power to backup supply.
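The transfer behavior described above can be sketched as a minimal decision function. This is an illustrative simplification, not a real ATS controller: actual controllers add retransfer delays, engine warm-up checks, and cooldown logic, and the function name is invented for this example.

```python
def ats_source(grid_ok, generator_ready):
    """Select the power source feeding the critical bus (grid preferred)."""
    if grid_ok:
        return "grid"
    if generator_ready:
        return "generator"
    return "none"  # gap bridged by the UPS while the generator starts

# Normal operation: the grid feeds the load.
assert ats_source(grid_ok=True, generator_ready=False) == "grid"
# Grid outage, generator still starting: the UPS rides through the gap.
assert ats_source(grid_ok=False, generator_ready=False) == "none"
# Generator up to speed: the ATS transfers the load to backup supply.
assert ats_source(grid_ok=False, generator_ready=True) == "generator"
```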
In conclusion, gas generators represent a viable alternative to diesel generators for data
center backup power solutions. Their advantages in fuel availability, environmental
impact, operational efficiency, and maintenance make them a compelling choice as data
centers strive for enhanced resilience, reduced operational risks, and sustainable
practices. Comprehensive planning, including cost-benefit analysis and risk assessments,
supports optimal selection and deployment of backup technologies aligned with data
center performance objectives and long-term sustainability goals.
Fuel cells
Fuel cells are emerging as a promising technology for providing backup power in data
centers, particularly in scenarios involving long interruptions to grid supply. They offer
flexibility in fuel choice, utilizing hydrogen, natural gas, or methanol based on availability
and operational needs. Hydrogen fuel cells in particular produce no direct carbon emissions at the point of use, emitting only water vapor, aligning with environmental sustainability goals and regulatory requirements.
Fuel cells also operate with high efficiency, converting chemical energy directly into electrical energy through an electrochemical process rather than combustion. This results in lower operational costs and reduced fuel consumption compared to traditional combustion engines.
Reliability is another key benefit, as fuel cells provide consistent power output suitable for
continuous operation during grid outages or as a primary power source in off-grid
locations. Their scalability allows data centers to incrementally expand backup power
capacity, adapting to growing demand over time.
Initial capital expenditure (CAPEX) for fuel cell systems includes procurement of modules,
storage tanks (if applicable), and integration with electrical infrastructure, with costs
varying based on capacity and installation requirements. Ensuring fuel supply availability,
especially for hydrogen, requires investments in storage facilities and supply chain
management.
Assessing the maturity of fuel cell technology and availability of technical support are
crucial for long-term reliability. Partnering with experienced suppliers and service
providers ensures optimal performance and uptime in critical data center operations.
In conclusion, fuel cells offer a sustainable and efficient alternative to traditional backup
power solutions for data centers, addressing fuel flexibility, environmental impact,
operational efficiency, and scalability. As technology advances and regulatory frameworks
evolve, fuel cells are poised to enhance resilience, reduce carbon footprint, and support
reliable power supply in data center environments. Strategic deployment of fuel cell
technology enables data center operators to achieve operational continuity, mitigate grid
interruption risks, and advance sustainability objectives effectively.
2N Redundant Topology
In a 2N redundant topology for data centers, the design aims for maximum reliability and
uptime by incorporating redundant systems. Power from the utility grid enters the data
center through two independent paths, each equipped with its own UPS and PDU systems,
referred to as System A and System B. This setup ensures that electricity is distributed
through separate UPS units for each path before being routed through dedicated PDUs
associated with System A and System B. Servers or racks within the data center are
typically connected to both System A and System B PDUs, enabling redundant power
sources and minimizing the risk of downtime due to power supply failures.
Redundancy Coverage:
In a 2N redundant topology, redundancy is integrated throughout the entire system to
ensure continuous operation and accommodate both single- and dual-corded racks
effectively. Each server or rack connects to two independent power paths known as
System A and System B, each equipped with its own UPS units and PDUs for redundant
power supply. For single-corded racks, a rack-level automatic transfer switch (ATS) selects between the System A and System B PDU feeds, each backed by its own UPS. If the active path fails, for example through a UPS malfunction, the ATS shifts the load to the redundant path without interruption.
Dual-corded racks are connected to both System A and System B PDUs, ensuring each
cord receives power from separate UPS and PDU units. Power loads are balanced between
these paths to prevent overload and optimize operational continuity. This setup offers
flexibility to support various server configurations, reliability by ensuring continuous power
availability during failures or maintenance, scalability to meet increasing demands, and
fault tolerance across the entire power distribution network from the utility grid to server
connections.
In conclusion, the 2N redundant topology is ideal for data centers requiring maximum
uptime and reliability. By offering dual independent power paths with redundant UPS and
PDU systems, it ensures uninterrupted service delivery for mission-critical applications,
meeting stringent uptime requirements and enhancing overall operational continuity.
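The dual-path behavior can be sketched as a small model. This is a simplification for illustration only; the function and rack names are invented, and a real deployment tracks per-cord capacity and load, not just connectivity.

```python
def powered_racks(racks, failed_paths):
    """Return the racks still receiving power given a set of failed power paths."""
    return [name for name, paths in racks.items()
            if any(p not in failed_paths for p in paths)]

# Dual-corded racks: one cord to a System A PDU, one to a System B PDU.
racks = {
    "rack-01": ("A", "B"),
    "rack-02": ("A", "B"),
    "rack-03": ("A", "B"),
}

# Normal operation: every rack is powered.
assert powered_racks(racks, failed_paths=set()) == list(racks)
# Total loss of System A: System B alone keeps every rack powered.
assert powered_racks(racks, failed_paths={"A"}) == list(racks)
# Only a simultaneous loss of both independent paths drops the load.
assert powered_racks(racks, failed_paths={"A", "B"}) == []
```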
N+1 Redundant Topology
In an N+1 block redundant topology, also known as a catcher system, the power flow is
designed to maximize reliability and availability from the utility through the UPS/PDU
system to the server.
Each server rack or group of racks connects to redundant PDUs, which receive power from
the UPS units and distribute it to the servers and IT equipment. This dual-path redundancy
ensures continuous power supply to IT equipment even if one UPS unit or PDU fails.
Automatic Transfer Switches (ATS) or similar mechanisms facilitate seamless switching
between UPS units or PDUs in the event of failures, maintenance, or during load balancing
activities.
Servers are directly connected to the PDUs for power distribution, ensuring stable and
reliable power supply backed by redundant UPS systems. This setup minimizes the risk of
downtime due to power interruptions or equipment failures, supporting uninterrupted
operation of critical IT services within the data center.
Automatic Failover: Consider a distributed redundant arrangement with four systems. When one of them experiences a failure, whether due to equipment malfunction or scheduled maintenance, the load originally managed by the failed system is automatically redistributed across the remaining three operational systems.
Capacity Utilization: Given that each system is designed to handle up to 50% of the total
data center load during normal operations, there is ample capacity available in the
operational systems to absorb the additional load from the failed system. This setup
effectively prevents any single system from being overloaded, maintaining stability and
ensuring uninterrupted service delivery to critical IT infrastructure.
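The capacity margin can be checked with simple arithmetic, using the four-system, 50%-per-system figures from the text (the load values are illustrative):

```python
total_load = 100.0        # total critical load, expressed in percent
systems = 4
design_capacity = 50.0    # per-system capacity, percent of total load

normal_share = total_load / systems            # share per system, all healthy
failover_share = total_load / (systems - 1)    # share per surviving system

assert normal_share == 25.0
assert round(failover_share, 1) == 33.3
# Even after losing one system, each survivor stays well below its
# 50% design capacity, so no system is overloaded.
assert failover_share < design_capacity
```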
Operational Benefits:
High Availability: The topology is designed to ensure high availability by incorporating backup systems (N+1 configuration) that can seamlessly take over in case of a system failure. This minimizes downtime and ensures uninterrupted operation of critical IT services.
Scalability: Its modular design allows for easy scalability, enabling data centers to expand
or adjust their infrastructure by adding or removing systems as needed. This flexibility
supports growth and accommodates changes in IT demands without compromising
operational continuity.
Flexibility: The topology supports both single-corded and dual-corded rack configurations.
This flexibility is essential for accommodating various equipment setups and redundancy
preferences within data centers, ensuring compatibility with diverse IT infrastructure
requirements.
Overall, the N+1 distributed redundant topology enhances data center resilience,
scalability, and flexibility, making it a preferred choice for maintaining high availability and
supporting efficient IT operations.
N+1 Diesel Rotary UPS (DRUPS) Unit Topology
In an N+1 Diesel Rotary UPS (DRUPS) unit topology, the design integrates Diesel Rotary
UPS technology to provide reliable and cost-effective backup power solutions for data
centers. This setup combines a diesel engine generator with a flywheel energy storage
system and an AC alternator, enabling immediate backup power in the event of utility
outages or disturbances without requiring a transfer switch. The N+1 redundancy
configuration ensures operational continuity by deploying N active DRUPS units under
normal conditions, with an additional standby unit (+1) ready to seamlessly take over if any
active unit fails or undergoes maintenance. This approach guarantees uninterrupted power
supply, leveraging the stored kinetic energy from the flywheel to bridge the gap until the
diesel engine stabilizes power generation. DRUPS systems are recognized for their
efficiency in converting stored kinetic energy into electrical power, minimizing energy
losses and operational costs compared to traditional UPS systems, making them an
efficient choice for maintaining critical operations in data centers.
A critical component of this topology is the paralleling bus, which interconnects all active DRUPS units and allows them to share the load evenly. Balanced load sharing optimizes resource utilization, prevents any single DRUPS unit from being overloaded, and provides the efficiency and reliability needed to maintain a continuous, stable power supply to the data center's critical infrastructure.
Failover Scenario:
In a properly configured N+1 Diesel Rotary UPS (DRUPS) topology, if one DRUPS unit fails,
the remaining units are designed to seamlessly take over the load without interruption.
This redundancy ensures continuous power supply to the data center or facility, even
during DRUPS maintenance or unexpected failures, thereby minimizing downtime and
ensuring operational reliability.
Chokes, or inductors, installed on the common bus play a crucial role by limiting the rate
of current change (di/dt) during a short circuit event. This limitation helps mitigate voltage
spikes and fluctuations that could potentially harm sensitive IT equipment connected to
the DRUPS units. Additionally, the chokes contribute to stabilizing voltage levels on the
common bus, adhering to standards such as the ITI curve which defines safe voltage
tolerances for IT equipment.
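The limiting effect of the choke follows from the inductor relation v = L · di/dt, so the fault-current rise is capped at di/dt = v / L. The numbers below are example values chosen for illustration, not figures from this design:

```python
bus_voltage = 400.0        # volts, example LV common-bus voltage
choke_inductance = 100e-6  # henries, example 100 uH choke

# Maximum rate of fault-current rise permitted by the choke (A/s).
max_di_dt = bus_voltage / choke_inductance
assert round(max_di_dt) == 4_000_000  # i.e. 4 A per microsecond

# Time for the fault current to ramp to a 10 kA level, which is the
# window the zone protections have to open breakers and clear the fault.
time_to_10kA = 10_000.0 / max_di_dt
assert round(time_to_10kA * 1000, 2) == 2.5  # about 2.5 ms
```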
When a fault is detected, the zone protections initiate fault-clearing procedures by opening
circuit breakers or isolating switches to isolate the faulted section. Meanwhile, the DRUPS
units continue supplying power to unaffected areas, ensuring uninterrupted operation for
critical loads.
Benefits of these measures include enhanced fault tolerance, compliance with industry
standards, and operational continuity. By swiftly responding to faults and maintaining
stable power supply conditions, the DRUPS topology supports high availability and
reliability in data center environments, minimizing downtime and safeguarding critical IT
infrastructure from potential damage or disruption.
N+1 Block Redundancy Using IT Redundancy
In an N+1 Block Redundancy Using IT Redundancy topology, redundancy is focused at the
IT equipment level rather than at the block level. Each critical IT component, such as
servers, storage devices, and networking equipment, is configured with redundant units.
This approach ensures that if one unit fails or undergoes maintenance, the workload can
seamlessly transfer to the redundant unit without interruption.
At the block level, typically defined by a group of IT racks or cabinets, there is no inherent
redundancy. This means that within each block, IT equipment operates without duplication
or failover capabilities. Instead, redundancy is ensured and managed at the individual IT
equipment level.
The N+1 redundancy principle dictates that for every critical IT component, there is at least
one additional unit (N+1) available to assume operations in case of failure or maintenance.
This configuration enhances availability and reduces the risk of downtime caused by IT
equipment failures.
When a block-level failure occurs, the redundant servers take over the workload of the failed units without interrupting IT operations. The workload transitions smoothly to the redundant servers, maintaining uptime for critical applications and preserving service availability despite the loss of an entire block.
Investment and Redundancy Strategy:
In an N+1 Block Redundancy Using IT Redundancy architecture, where redundancy is
focused on IT equipment rather than at the LV (Low Voltage) distribution level, several
implications and advantages emerge:
Simplified Design and Maintenance: The architecture simplifies the design and
maintenance of the LV electrical distribution system within the data center. It minimizes
the need for redundant LV components and associated maintenance costs, focusing
instead on ensuring redundant IT equipment operates efficiently.
Considerations:
Initial Investment: While the architecture avoids redundancy at the LV level, the initial
investment in redundant IT equipment can be higher. Organizations must weigh the upfront
costs against the benefits of improved reliability and reduced downtime risk.
Tier 1 and 2
Tier I data centers have basic capacity components and rely on a single path for power and
cooling distribution. They lack built-in redundancy in their infrastructure for power and
cooling systems. Tier I data centers are designed to have an availability of approximately
99.671%, which translates to about 28.8 hours of downtime per year.
Tier II data centers, on the other hand, enhance reliability by incorporating redundant
capacity components for power and cooling. This includes redundancy for critical systems
such as UPS (Uninterruptible Power Supply) units, chillers, and power distribution units.
Tier II facilities aim for an availability of approximately 99.741%, equating to about 22.7 hours
of downtime per year.
Tier 3 and 4
Tier III data centers, categorized as Concurrently Maintainable, build upon Tier II by adding
stringent requirements for concurrent maintainability. This means that all critical systems
and components, including power and cooling infrastructure, can be maintained or
replaced without causing any interruption to the data center's operations.
Key features of Tier III data centers include redundant capacity components similar to Tier
II. However, the critical enhancement lies in the ability to conduct maintenance activities
without disrupting data center operations. This capability is crucial for ensuring
continuous uptime and reliability.
In terms of availability, Tier III data centers are designed to achieve an uptime of approximately 99.982%, corresponding to roughly 1.6 hours of downtime per year. This high availability level reflects the robust redundancy and maintainability features
implemented in Tier III facilities.
Moving to Tier IV data centers, classified as Fault Tolerant, these facilities extend the
capabilities of Tier III by focusing on eliminating single points of failure within the critical
infrastructure.
Key features of Tier IV data centers include all the attributes of Tier III, such as redundant
capacity components and concurrent maintainability. Additionally, Tier IV facilities
typically implement a 2N redundancy strategy, ensuring that every critical component has
a backup in case of failure. This redundancy approach effectively eliminates any single
points of failure, further enhancing the data center's resilience.
In terms of availability, Tier IV data centers offer the highest level of reliability, aiming for
approximately 99.995% uptime. This translates to about 26.3 minutes of downtime per
year, showcasing the robustness and fault tolerance built into Tier IV facilities.
Overall, the Tier classification system provides a standardized framework for evaluating
and categorizing data center reliability and availability levels, helping organizations select
facilities that align with their uptime requirements and business continuity goals.
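The downtime figures cited for each Tier follow directly from the availability percentages applied to one 8,760-hour year (the Tier II result of about 22.7 hours is often rounded to 22 in published summaries):

```python
HOURS_PER_YEAR = 8760  # 365-day year

def annual_downtime_hours(availability_pct):
    """Annual downtime implied by an availability percentage."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR

tiers = {"I": 99.671, "II": 99.741, "III": 99.982, "IV": 99.995}

assert round(annual_downtime_hours(tiers["I"]), 1) == 28.8    # hours
assert round(annual_downtime_hours(tiers["II"]), 1) == 22.7   # hours
assert round(annual_downtime_hours(tiers["III"]), 1) == 1.6   # hours
assert round(annual_downtime_hours(tiers["IV"]) * 60, 1) == 26.3  # minutes
```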
To assess and mitigate risks associated with utility reliability, data center operators
conduct thorough risk assessments. These assessments evaluate various factors such as
historical outage data specific to the region, the overall stability of the grid, local
infrastructure conditions, and the effectiveness of contingency plans designed to handle
utility failures. By understanding these factors, data centers can implement proactive
measures to enhance resilience, mitigate potential risks, and maintain operational
continuity even in the face of utility disruptions.
Regarding fault tolerance in Tier III and Tier IV designs, it's important to clarify that these
tiers require fault-tolerant architecture, including redundancy in critical systems such as
generators and UPS units. However, achieving fault tolerance does not necessarily
mandate oversizing beyond practical operational needs. Designing for fault tolerance
involves ensuring that critical systems have adequate redundancy to continue operating
seamlessly during component failures or maintenance activities. Oversizing beyond
operational requirements can lead to unnecessary costs without proportional benefits in
terms of operational reliability.
The key components in this setup include double-corded servers and redundant power
paths. Double-corded servers are designed with two power supplies to provide a fail-safe
mechanism. If one power path is disrupted, the other can immediately take over,
preventing any loss of power to the server. Redundant power paths consist of two
independent power supply routes, labeled as Path A and Path B. These paths are
engineered to function autonomously, so maintenance work on one path does not affect
the other. This design is critical for high-availability environments, such as data centers,
where uninterrupted power supply is essential.
During planned maintenance, one power path (e.g., Path A) can be safely shut down
without impacting the operation of the servers. The servers will continue to receive power
from the other path (e.g., Path B), ensuring that there is no downtime or disruption in
service. This seamless transition is facilitated by the server's dual power supplies, which
automatically draw power from the remaining active path. As a result, maintenance tasks
can be performed without any adverse effects on the servers or the services they support.
This redundancy model is not only beneficial for planned maintenance but also for
unplanned outages or failures in one of the power paths. The design ensures that the
servers remain operational even if one path experiences a fault. This level of redundancy is
a critical aspect of data center reliability, enabling organizations to maintain high levels of
uptime and service availability. The ability to perform maintenance without service
interruptions also contributes to better management of the data center infrastructure,
allowing for regular updates and repairs while ensuring continuous operation.
N+1 Block Redundant configurations provide one additional system's worth of capacity on a standby reserve (catcher) bus. This setup facilitates seamless transitions during outages or maintenance but may leave the reserve system underutilized, impacting overall efficiency.
N+1 Isolated Parallel Bus configurations use isolated parallel buses for redundancy,
emphasizing cost efficiency while requiring robust protection systems and complex
operational oversight.
N+1 Block Redundant and IT Redundancy integrates IT redundancy into the block
redundancy setup, suitable for data centers managing their IT processes but requiring
careful coordination and oversight.