A Data Centre Air Flow Model for Predicting Computer Server Inlet Temperatures

Raymond Lloyd, Jer Hayes, Marek Rebow∗, Brian Norton∗

IBM, IBM Research Dublin, Ireland
∗Dublin Institute of Technology, Dublin Energy Lab, Dublin, Ireland

ABSTRACT

Data centres account for approx. 1.3% of the world's electricity consumption, of which up to 50% is dedicated to keeping the actual equipment cool. This represents a huge opportunity to reduce data centre energy consumption by tackling the cooling system operations with a focus on thermal management. This work presents a novel Data Centre Air Flow Model (DCAM) for prediction of server inlet temperatures. The model is a physics-based model underpinned by turbulent jet theory, allowing a reduction in the solution domain size by using only local boundary conditions in front of the servers. Current physics-based modeling approaches require a solution domain of the entire data centre room, which is expensive in terms of computation even if a small change occurs in a localised area. By limiting the solution domain and boundary conditions to a local level, the model focuses on the airflow mixing that affects temperatures while also simplifying the related computations. The DCAM model does not have the usual complexities of numerical computations, dependencies on computational grid size, meshing or the need to solve a full domain solution. The input boundary conditions required for the model can be supplied by the Building Management System (BMS), Power Distribution Units (PDU), sensors, or output from other modeling environments that only need updating when significant changes occur. Preliminary results validated on a real world data centre yield an overall prediction error of 1.2°C RMSE. The model can perform in real-time, giving way to applications for real-time monitoring, as input to optimise control of air conditioning units, and can complement sensor networks.

I. INTRODUCTION

Data centres are major consumers of electricity. According to the Uptime Institute's most recent survey [1], the average worldwide Power Utilisation Effectiveness (PUE) remained around 1.7 in recent years. PUE is the data centre total power / IT power and a value of 1.7 means over 40% of data centre energy is consumed by the Heating, Ventilating, and Air Conditioning (HVAC) systems. Ultimately, the goal is to maximise performance of Information and Communications Technology (ICT) equipment while minimising overall power consumption. To achieve this goal, new business practices and technologies are continuously being developed. Moreover, with the steep rise in energy costs, energy consumption has become a critical issue in all reaches of the world, with electricity use in data centres being about 1.3% worldwide in 2010, and 2% for the US, burning 198 Billion Kilowatts per year, and it is expected to increase year on year [2].

The motivation for the model outlined in this paper stems from work performed in the area of thermal management and energy efficiency of data centres where the need for real-time temperature modeling often arose. Real-time monitoring of inlet temperatures of server equipment can help ensure energy efficient operations are continued, monitored and maintained.

The measuring of temperatures in data centres can of course be completed directly by placing wired or wireless sensors in appropriate locations. Measurement-based solutions give real-time sensor readings but generally have sensors sparsely placed within the data centre due to the installation and maintenance cost. Indirect modeling approaches of air flow and heat transfer provide alternatives to costly sensor deployments. Numerical methods such as Computational Fluid Dynamics (CFD) based on the Navier-Stokes equations with heat transfer offer accurate representations of the physics but at a cost of computational complexity and long solution computation times. To simulate complex geometries, domain specific compact models are used to simplify the physics, yielding significant improvements in computation time but inherently generating inaccuracies which are deemed acceptable. They are usually expensive and require trained personnel to operate. CFD models are not suitable for real-time applications where rapid solutions are required that may feed into a Computer Room Air Conditioner (CRAC) control system, for example.

Reduced order physics models based on Potential Flow Theory (PFT) [3] offer faster solution times but at the expense of accuracy and still require a full domain solution. Accuracy was improved by the addition of buoyancy effects with Enhanced Potential Flow Models (EPFM) [4] but this doubled the execution time. Another approach [5] [6] deals with buoyancy and recirculation physics by implementing a Rankine Vortex superposition with the core placed vertically at the top of the racks, halving the prediction error overall throughout the data centre room with significant improvement at server inlets, but it still requires a full domain solution. Measurement-based physical modeling [7] [8] approaches, which leverage real-time sensor data as the boundary conditions, offer a good compromise between sensor placement density and fitting the missing data with physics-based prediction.

978-1-5090-2994-5/$31.00 ©2017 IEEE 830 16th IEEE ITHERM Conference

Authorized licensed use limited to: University of Fribourg - Bibliothèque cantonale et universitaire. Downloaded on May 22,2023 at 12:02:08 UTC from IEEE Xplore. Restrictions apply.
Statistical solutions are very computationally efficient and model non-linearity very well but require training data which is not always to hand or easily obtainable from actual measurements, so simulated data is optionally used, e.g. Proper Orthogonal Decomposition (POD) [9][10]. Statistical models are susceptible to changes in physical conditions, e.g. layout change, and lose accuracy rapidly when they occur. They do provide rapid predictions which would be suitable for optimising CRAC control systems. However, in an ever changing data centre, the requirements for re-training may outweigh the adoptability of this approach.

The need to obtain temperature predictions quickly is evident when we consider how fast conditions change, e.g. variations in perforated tile flow volume, air supply temperature or server power loads. The prediction data may be useful in many ways, from input into the HVAC control system, to alerting and reporting. Such a solution could be used stand-alone to provide temperature prediction, however it could also be used alongside or integrated with other modeling techniques and sensor readings to provide complete flow and temperature fields.

Our approach to reduce the complexity of obtaining temperature predictions is to reduce the solution domain. Solving local changes rather than solving the complete solution domain would be far more efficient, requiring only local boundary conditions. Thus we propose a model that will focus on the prediction of server inlet temperatures. The prediction method is fast enough to be real-time acceptable and accurate enough to be useful. The DCAM model is underpinned by turbulent jet theory but adapted for use in data centres.

We validate the model with real-world temperature data collected from a data centre which has undergone 25 step changes to the CRAC flow rates. The 25 scenarios provide a range of different input boundary conditions that test adaptability of the model to other data centres.

II. ASSUMPTIONS AND LIMITATIONS

The research is focused on data centres utilising a raised-floor cooling arrangement where the cold air is supplied by perforated floor tiles placed directly in front of the server racks. The model predicts inlet temperatures in front of servers and is limited to server racks which are not placed at the end of aisles. Power consumption by server racks is assumed to be uniformly distributed vertically from bottom to top of the rack. Possible increases in server fan speeds due to increased inlet temperatures are not currently accounted for.

III. DATA CENTRE AIRFLOW MODEL

We now describe the DCAM model where the goal of this physics-based approach is to limit the solution domain and boundary conditions to a local level. The local level is defined as the area in front of the server, the perforated tile supplying the cool air, and the server itself. This lends itself to the application of turbulent jet theory to model the air jet from the perforated tile. The model is adapted to take into account unique characteristics of data centre air flow such as the intake of air from the cooling jet by the servers.

Research carried out on plume and jet theory models shows that the velocity flow field of the jet after the initial formation conditions expresses self-similarity characteristics when looking at the cross section of the jet. This is also true of the concentration field, which is the mixing of the jet fluid and the ambient fluid together and is defined by the concentration of jet fluid in the mix. This characteristic is leveraged in the design of the DCAM to simplify computation.

The study of jets and plumes has a long history [11] and it has been applied to studies of industrial chimney stacks, volcanic eruptions and underwater plume activity on the sea bed and more recently to building ventilation systems. Raised floor or Under Floor Air Distribution (UFAD) was first introduced in the 1950s to cool computer rooms. More recently UFAD is emerging as a leading ventilation design for modern office buildings. As such, much work has been carried out to model UFAD in the office environment in terms of providing user comfort. Of interest, plume theory modeling approaches have been used to explore the impact of UFAD on room temperature stratification by means of a buoyant heat plume and fountain jet model for the inertial diffuser flow [12], [13], [14], [15]. These approaches model steady state layered stratification of displacement ventilation in a room. The effect of different configurations of heat sources and cooling sources has been studied theoretically and experimentally.

In terms of applicability to a data centre, the lower zone of the UFAD models does not represent the dynamic airflow activity between the supply air, server consumption and exhaust of heated air very well. Firstly, the inertial forces are far greater than the buoyancy influences (Richardson number Ri in the order of 0.1). Secondly, the models are zone-based, assuming a well mixed zone, but in fact in data centres it is the mixing within a zone which is of interest and this is highly variable. The upper layer, which is at the ceiling height of a data centre, is relatively well mixed, as the warm air travels back to the CRAC units for cooling and recirculation. In free cooling systems, displacement ventilation occurs as the heated air is displaced through ceiling vents.

More appropriately representing the airflow from a data centre's perforated tile are studies into turbulent jets, since the air exiting the perforated tiles exhibits a high Reynolds number in the region of 27,000 (Eqn. 1 and Eqn. 2 for a 25% open perforated tile at 0.142 m³/s). Importantly, the high inertial forces generated by the pressurised under floor plenum lend themselves to fit the model of a turbulent rather than a buoyant plume.

$$\mathrm{Re} = \frac{\rho v D_H}{\mu} \tag{1}$$

where ρ is the density of air, μ is the viscosity, v is the velocity and $D_H$ is the characteristic length.

$$\mathrm{Re} = \frac{1.204 \times 3.1 \times 0.142}{1.983 \times 10^{-5}} = 26{,}727 \tag{2}$$
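The arithmetic of Eqns. 1 and 2 can be reproduced with a few lines of Python. This is a minimal sketch: the function and constant names are illustrative, and the SI units noted in the comments are assumed, since the text gives only the numbers.

```python
def reynolds_number(rho: float, v: float, d_h: float, mu: float) -> float:
    """Reynolds number Re = rho * v * D_H / mu (Eqn. 1)."""
    return rho * v * d_h / mu


# Values exactly as substituted in Eqn. 2; SI units are an assumption.
RHO_AIR = 1.204    # air density, kg/m^3
VELOCITY = 3.1     # velocity v, m/s
D_H = 0.142        # characteristic length D_H, m
MU_AIR = 1.983e-5  # viscosity of air, Pa*s

re = reynolds_number(RHO_AIR, VELOCITY, D_H, MU_AIR)
print(f"Re = {re:.0f}")
```

Rounded to the nearest integer, the result agrees with the 26,727 quoted in Eqn. 2.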
Fig. 1 shows a round free turbulent jet. The air stream exits the orifice along the jet axis y and spreads as the jet moves away from the orifice. This angle of spread has been empirically shown to be approximately 11.8° from the y axis, with little effect from the fluid type (water, air), orifice diameter, shape or discharge velocity [16][17][18]. The intersection of the spread lines under the orifice reveals the virtual source of the jet where y = 0. The relationship between the increase in radial distance r of the spread to the distance along the y axis is tan(11.8°) ≈ 1/5.

$$R(y) = \frac{1}{5} y \tag{3}$$

The distance from the orifice to the virtual source follows from $\frac{d}{2} = \frac{1}{5} y$, therefore

$$y = \frac{5d}{2} \tag{4}$$
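The spread relation and the virtual source offset of Eqns. 3 and 4 can be checked numerically. The sketch below is illustrative only: the helper names and the orifice diameter of 0.6 m are hypothetical and not taken from the paper.

```python
import math

SPREAD_ANGLE_DEG = 11.8  # empirical angle of spread quoted in the text


def jet_radius(y: float) -> float:
    """Eqn. 3: jet radius R at distance y from the virtual source."""
    return y / 5.0


def virtual_source_offset(d: float) -> float:
    """Eqn. 4: distance from the virtual source to an orifice of diameter d."""
    return 5.0 * d / 2.0


# tan(11.8 degrees) is close to the 1/5 slope used in Eqn. 3.
slope = math.tan(math.radians(SPREAD_ANGLE_DEG))
print(f"tan(11.8 deg) = {slope:.4f}")

d = 0.6  # hypothetical orifice diameter in metres
y_orifice = virtual_source_offset(d)
# At the orifice plane the jet radius should equal half the orifice diameter.
print(y_orifice, jet_radius(y_orifice))
```

At the orifice plane the computed radius is d/2, which is the condition used in the text to locate the virtual source.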
The distribution of velocity of the air jet exhibits a Gaussian bell curve shape as the flow develops along y. Initially on exit of the orifice, a top hat profile of velocity can be seen with an average exit velocity of $U_0$. The transition from a top hat to a Gaussian shape has been studied experimentally [19]. The experimental measurements reveal that the Zone of Flow Establishment (ZFE) ends at ≈ 6.2d where d is the diameter of the orifice. Up to the Zone of Established Flow (ZEF) the change of maximum centreline velocity $u_{max}$ is minimal; from there self similarity begins, as the Gaussian profile now characterises the velocity distribution across the jet, and $u_{max}$ starts to reduce almost linearly. Fig. 1 shows the transition from ZFE to ZEF. The transition to full Gaussian occurs when the eddies from the side ambient entrainment reach the flow axis of the jet y.

Fig. 1. Round Free Turbulent Jet with Zone of Flow Establishment and Zone of Established Flow.

The Gaussian or bell curve shape of the velocity profile across the jet cross-section can be represented by

$$u(r) = u_{max} \exp\left(-\frac{r^2}{2\sigma^2}\right) \tag{5}$$

where u is the velocity, r is the radial distance from the centreline and σ is the standard deviation. Since we know that 4σ make up approximately 95% of the diameter of the jet, which is 2R, from Eqn. 3 we can say $\sigma = \frac{y}{10}$, therefore we can write

$$\sigma = \frac{y}{10} \;\rightarrow\; \frac{1}{2\sigma^2} = \frac{50}{y^2} \;\rightarrow\; u(y, r) = u_{max} \exp\left(-\frac{50 r^2}{y^2}\right) \tag{6}$$

The curve characteristic studies show the self similarity of the velocity profile as the flow develops away from the source of flow [17] [20]. Self similarity, from a modeling perspective, helps with the simplification of the data centre model due to the deterministic behavior of the flow with respect to distance from the source, initial velocity, and size of the orifice. This eliminates the complications of grid sizing, iterative processes and convergence as seen in numerical models. Another important observation from the jet flow is that the velocity distribution across the jet is proportional to the concentration between the jet fluid and ambient fluid. This correlation between velocity and concentration has been found empirically [21].

Importantly, the mixing concentrations of the ambient fluid and the jet fluid represent the mixing of the hot exhausted air from the servers recirculating into the cold aisles and the cooling air supplied by the perforated tile. The inlet temperature is a function of the concentration proportions between the two, i.e. $t = t(C_1, C_2)$ where $C_1$, $C_2$ are the jet and ambient fluid concentrations.

To determine the velocity of the centre line we consider the jet entering a body of fluid with the absence of external acceleration/deceleration. In this case the momentum flux in the jet's cross-section remains constant, that is:

$$M = \int_0^\infty \rho u^2 \, 2\pi r \, dr = \rho U^2 \frac{\pi d^2}{4} \tag{7}$$

where the momentum flux is ρu×u, with ρ the density and u the velocity, accumulated over the jet cross-section with respect to the radius r. U and d are the average exit velocity and the diameter taken at the orifice. Evaluating the integral with the profile of Eqn. 6 gives the centreline velocity

$$u_{max} = \frac{5d}{y} U \tag{8}$$
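As a consistency check on Eqns. 6 to 8, the momentum flux integral of Eqn. 7 can be evaluated numerically at any height and compared with the exit value $\rho U^2 \pi d^2 / 4$. The following is a minimal sketch, not the authors' implementation: the function names, the midpoint-rule integration and the sample values of U and d (borrowed from the numbers in Eqn. 2) are assumptions made for illustration, and y is taken as the distance from the virtual source.

```python
import math


def centreline_velocity(u_exit: float, d: float, y: float) -> float:
    """Eqn. 8: u_max = 5 d U / y, with y measured from the virtual source."""
    return 5.0 * d * u_exit / y


def jet_velocity(u_exit: float, d: float, y: float, r: float) -> float:
    """Eqn. 6 combined with Eqn. 8: Gaussian velocity at radius r and height y."""
    return centreline_velocity(u_exit, d, y) * math.exp(-50.0 * r * r / (y * y))


def momentum_flux(u_exit: float, d: float, y: float,
                  rho: float = 1.204, steps: int = 20000) -> float:
    """Midpoint-rule estimate of the integral in Eqn. 7 at height y."""
    r_max = y  # the integrand has decayed by a factor exp(-100) at r = y
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        u = jet_velocity(u_exit, d, y, r)
        total += rho * u * u * 2.0 * math.pi * r * dr
    return total


U_EXIT = 3.1  # assumed exit velocity, m/s
D = 0.142     # assumed orifice diameter, m
m_exit = 1.204 * U_EXIT ** 2 * math.pi * D ** 2 / 4.0

for y in (1.0, 2.0):
    print(y, momentum_flux(U_EXIT, D, y), m_exit)
```

The integrated flux matches the exit value at both heights to within the integration error, which is the constant momentum flux premise from which Eqn. 8 follows.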
In the following text, a theoretical two dimensional simplified model is proposed to model the behavior of the jet by placing a boundary wall consisting of a rack of server equipment to the side of the jet. The perforated tile airflow is modeled as a two dimensional round free turbulent jet, Fig. 2. The area of the perforated tile is a 600mm square shape with an effective open area determined by the tile perforation percentage or damper setting (adjustable perforations).

Fig. 2. Round Free Turbulent Jet with a computer server rack placed to the side of the jet

In the DCAM, the velocity field is calculated on the premise of constant momentum flux, therefore we can use a concentrated jet placed in front of the rack with effective tile opening of $A_{eff}$, where $\delta_{perf}$ is the percentage of perforated tile opening.

$$A_{eff} = \delta_{perf} A_{tile} \tag{9}$$

The effective two dimensional cross-section value of d is therefore

$$d = \frac{A_{eff}}{2} \tag{10}$$

Typically, in a data centre, the perforated tiles are positioned directly in front of the server equipment. The jet is prohibited from expanding on one side, with the server equipment restricting the flow of air from the perforated tile. The symmetrical velocity profile of the jet is now disturbed with a rack of servers on one side. The rack of servers prevents entrainment on this side, therefore we assume that mixing from entrainment on the server side of the flow axis y does not take place. The transition from top hat to Gaussian is schematically represented as a central flat spot on the top of the curve where $u_{max}$ is constant as the flow progresses along y until the curve shape is reached. The simplified model assumes a smooth transition between ZFE and ZEF in terms of top hat to Gaussian profile.

The volumetric flux Q increases, due to entrainment of surrounding ambient air E, as the jet travels along the jet axis y. The entrainment is the rate at which the volumetric flux grows and can be represented as

$$E = \frac{dQ}{dy} \tag{11}$$

where the volumetric flux Q is

$$Q = \int_0^\infty u \, 2\pi r \, dr = \frac{\pi}{50} u_{max} y^2 = \frac{\pi}{10} d U y \tag{12}$$

so E becomes

$$E = \frac{dQ}{dy} = \frac{\pi d U}{10} \tag{13}$$

The radial entrainment transverse velocity v carries the entrainment into the jet so that

$$dQ = v \, dA \tag{14}$$

with $dA = 2\pi R \, dy$ the lateral area of the jet section. Substitution of dA and further substitution of R from Eqn. 3 gives

$$\frac{dQ}{dy} = 2\pi R v = \frac{2\pi y v}{5} \tag{15}$$

Equating this to the previous dQ/dy yields the value for the transverse velocity of entrainment v in terms of the average jet velocity $\bar{u}$

$$v = \frac{U d}{4 y} = \frac{u_{max}}{20} = 0.10\,\bar{u} \tag{16}$$

Fig. 3 is a two dimensional schematic showing the first step in the adapted model. In this schematic, the servers are not operational, therefore do not consume any of the jet flow, and the server boundary interface is ignored. The sampling of the jet cross-section is represented by the horizontal line at the bottom of the curve. The sampling distance along y can be as granular as required. Here it is set at 305mm intervals to match the experimental data resolution and sampling location. The volume of cool air discharged from the perforated tile is represented in terms of its velocity profile, with the initial volume of cool air exiting the perforated tile represented as a two dimensional rectangle $A_{cool}$, i.e. a top hat on exit. As

the jet progresses along y the area of $A_{cool}$ remains constant but the shape changes to that of a Gaussian shape.

$$\frac{\partial A_{cool}}{\partial y} = 0 \tag{17}$$

On the server side, the velocity profile is squared off as entrainment on that side is ignored.

Fig. 3. Round Free Turbulent Jet - the velocity profile is simplified. A ZFE line is drawn from the edge of the perforated tile flow at $U_0$ to the centre of the curve at $u_{max}$ at the point along y where the ZEF begins. The intersection of the ZFE line with any of the sample lines determines the knee of the curve (start of the Gaussian curve).

To determine the shape of the curve along y in the ZFE, a line (ZFE line) is drawn from the edge of the perforated tile to the end of the ZFE where it intersects the y axis. Where this ZFE line intersects any horizontal sampling line at point k (within the ZFE), this is represented as the "knee" of the curve, i.e. the start of the Gaussian curve, which is the point at which entrainment of ambient air has penetrated into the jet. The tail of the Gaussian curve is determined by the intersection of the entrainment line with the horizontal sampling line, point j. Recall that the slope of the entrainment line is a universal angle of 11.8° so y = 1/5x and that the start of the ZEF is 6.2d, therefore points k and j can be calculated as follows

$$k = d - \frac{y}{12.4d} \tag{18}$$

$$j = d + 5y \tag{19}$$

where d is the width of the effective perforated tile.

In Fig. 3, to the left of the Gaussian curve, the curve is flattened off. This represents the flat spot $u_{max}$, the maximum velocity of the jet. This value of $u_{max}$ is constant up to the end of the ZFE, where $u_{max} = U_0$. After this point the value of $u_{max}$ can be determined by Eqn. 8.

After the ZFE, in the ZEF, the knee of the curve remains on the jet flow axis y and follows self similarity for values of y > 6.2d, the start of the ZEF. The DCAM model leverages the velocity/concentration profiles to determine mixing ratios between cooling supply air from the perforated tile and warmer ambient air in the data centre.

The ratios are captured as a two dimensional representation between the area of cooling air and the area of ambient air, which of course represent the volume of cooling air and the volume of ambient air respectively.

The model requires scaling from units of length which determine the jet characteristics (diameter d in mm and height along the y axis in mm) to units of volumetric flow (m³/s) on the x-axis and a dimensionless value on the y-axis, $u_{max}/U_0$, for each calculation of mixing ratios. The y axis is conveniently scaled to this dimensionless unit whereby the areas representing the velocity profiles in the model can be calibrated directly to the volumetric flow rate. At jet exit from the perforated tile the top hat profile has a vertical height of $u_{max}/U_0 = U_0/U_0 = 1$, so the x-axis can be scaled to the exit volumetric flow rate in m³/s.

The initial unit height is denoted as h. As we sample further up the server rack, both the shape of the area and the vertical height change. Where the servers are non-operational, the area of cooling air supply remains constant, only the shape of the area changes.

The next step is to model the server consumption at the inlets of the servers. The server consumption is depicted in Fig. 4 as a red vector $V_s$, with the volume of air consumed depicted as the red rectangle $A_{V_s} = V_s * h$. The server consumption rectangle has its height scaled to the unit height h = 1 and its length scaled to the volumetric flow rate demanded by the server fans $V_s$ (see legend in Fig. 4). The available cooling air is shown as $A_{cool}$.

The server air demand $V_s$ is positioned horizontally on the y axis at the server inlet layer $l_i$. The sampling mixture area is depicted as the dashed rectangle $A_{cons}$ and is equal to the server consumption $A_{V_s}$. It is drawn from the position of $x = V_s$ away from the server inlet (where x = 0 at the servers). The area $A_{cons}$ is removed from the jet stream. The mixing proportions of cooling air to ambient air are determined by the areas bounded by $A_{cons}$.

As the sampling location moves up the server rack along the y axis to the next server $l_{i+1}$, the proportion of cooling air consumed by the server in the previous step $l_i$ is deducted from the available supply air $A_{cool}$ and is denoted as $A_{coolcons}$, shown as "Reduction due to previous Server cool air demand" in orange on the schematic in Fig. 4. This is effectively the remaining silhouette of $A_{cool}$, if the servers were non-operational.

The process continues as the sample location moves up the y-axis from server to server as in Fig. 4. In our case the

sampling location is every 305mm starting at $l_0$ = 152mm to $l_7$ = 2286mm from the perforated tile to match the "layers" in the validation experimental data, Section IV-B1. This sampling location could in fact be varied to suit the configuration of the servers installed in the rack. In Fig. 4 we can see the remaining $A_{cool}$ reduces as the servers consume more and more of the original supply air from the perforated tile.

Fig. 4. Adapted Model depicting the demand of the computer servers on the jet as h increases

$$A^{l_i}_{cool} = A^{l_0}_{cool} - \sum_{0}^{i} A^{l_i}_{coolcons} \tag{20}$$

Calculation of the temperature at the inlet $T_{inlet}$ is a function of the volumetric mixing of air temperatures between the supply air $T_{cool}$ and the surrounding ambient temperature $T_{amb}$ contained in the sampling area $A^{l_n}_{cons}$. This can be seen in the area enclosed by $A_{cons}$ in Fig. 5, which can be defined as

$$T^{l_i}_{inlet} = \frac{T_{cool} * \left(A^{l_i}_{cons} \cap A^{l_i}_{cool}\right) + T_{amb} * \left(A^{l_i}_{cons} - \left(A^{l_i}_{cons} \cap A^{l_i}_{cool}\right)\right)}{A^{l_i}_{cons}} \tag{21}$$

where

$$A_{cons} = V_s * h \tag{22}$$

IV. RESULTS

Validation of the DCAM model is performed with real-life data, however the model was first verified against a simplified CFD simulation to verify that the software implementation of the algorithm was correct.

A. Comparison with simulation

The model is first verified using CFD software TileFlow [22]. The purpose of this testing is to ensure the model behaves and captures the implementation of the physics in a controlled simulated experiment with a well trusted simulation package specifically developed and optimised for data centres. In a live data centre, even under experimental conditions, some parameters are difficult to set and control and are often off-limits, e.g., server loads which in turn cause changes in power consumption. The CFD model is created to represent a simple data centre comprising a single row of seven identical server racks with two CRAC units placed at each end and aligned with the hot aisles. The simulation results are taken from the centre server rack in the middle of the row of racks. Three simulations were created to model different degrees of recirculation from the top of the rack to the centre of the rack using different parameter settings for rack power and CRAC volumetric flow rate and temperature. The rack power is uniformly distributed from top to bottom of the rack.

The input parameters to the DCAM model are shown in Table I. The parameters, which are local boundary conditions, are extracted from the CFD model. The T Ambient value is taken from the temperature at the top centre of the rack.

Boundary Conditions
Test Case #   Rack Pwr (kW)   Tile Flow (cfm)   Tile Temp (°C)   T Ambient (°C)
1             5               612               12.8             23.6
2             6               612               12.8             32.8
3             5               460               12.8             34.6

TABLE I
INPUT PARAMETERS FOR DCAM MODEL, VALUES EXTRACTED FROM CFD SIMULATION

The reference temperatures are taken from the CFD simulations directly in front of the server rack. The temperature is sampled at 305mm intervals from 152mm to 2286mm. Using the extracted local boundary conditions from

the CFD simulation, the DCAM model is run and the results are shown in Fig. 6.

Fig. 6. Comparative verification results between DCAM model and CFD model in the simulated data centre

        L1      L2      L3      L4      L5      L6      L7      L8      RMSE
CFD1    12.8    12.8    13.2    13.9    14.9    17.5    22.4    23.6
PHY1    12.8    12.81   12.86   13.1    14.1    17.19   22.45   23.6
ERR1    0       -0.01   0.34    0.8     0.8     0.31    -0.05   0       0.432189
CFD2    12.8    12.8    13.3    14.5    17.4    22.9    31.3    32.8
PHY2    12.8    12.83   12.99   13.84   17.3    27.98   32.24   32.8
ERR2    0       -0.03   0.31    0.66    0.1     -5.08   -0.94   0       1.845014
CFD3    12.8    13      13.9    15.9    22.7    35.4    35.5    34.6
PHY3    12.81   12.86   13.18   14.82   21.58   33.44   34.31   34.6
ERR3    -0.01   0.14    0.72    1.08    1.12    1.96    1.19    0       1.013447

Total RMSE Error 1.096883

TABLE II
MODEL VS SIMULATION: DCAM SHOWS REASONABLE AGREEMENT WITH CFD RESULTS. HIGHLIGHTED ARE THE TWO MAX ERRORS.

Generally, the DCAM model agrees with the CFD model quite well except for one temperature location highlighted in Table II. The location of the error is at the inflection point of the curves, the most variable area of the inlet face of the servers. As both the DCAM and CFD models are simulations, a definitive justification for the differences between the models cannot be fully assessed; however, the shapes and general profiles agree, indicating that the DCAM model is executing the physics in the application as intended. To validate the model accuracy, real measured data is used and the results are presented below.

B. Validation via empirical data

1) Collecting Real World Empirical Data: Temperature data for validation was collected in a data centre using IBM Measurement and Management Technology (MMT) which was developed in IBM and is widely used in data centres around the world [23]. MMT uses a specially developed cart to collect temperatures from the data centre room. The cart consists of layers spaced vertically at 305mm intervals with nine thermocouples per layer and one humidity sensor. The cart has motion encoding wheels and hardware to capture the data relative to the location in the data centre. The MMT cart fits on a standard 600mm tile and is traversed tile by tile throughout the data centre to digitise a high resolution three dimensional snapshot of the thermal environment.

Simultaneously, information is recorded from the CRAC units, power distribution units and perforated tiles. The data was gathered from a real operational data centre which is 102.2 m² in size (25 x 11 tiles). The CRAC unit flow rates are varied with 25 different settings and the MMT scan conducted each time. The settings are shown in Table III.

The layout of the data centre is shown in Fig. 7. The racks are arranged in a hot aisle, cold aisle configuration with two CRAC units placed at each end of the data centre. The total IT power is 75 kW. Racks A5 and C2 are fitted with rear door heat exchangers accounting for the removal of approx. 25 kW of heat load. The data centre and measurements have been used previously in other modelling work, namely Measurement-based modelling [8] and Reduced-order modelling via Proper Orthogonal Decomposition [9], and thus have been held to be a valid data set by other researchers.

TABLE III
FLOW RATES FOR CRAC 1 AND CRAC 2 FOR 25 SETTINGS

2) Validation: The validation of the model is completed using temperature data as gathered from the MMT cart outlined in Section IV-B1. The combined overall prediction error is measured by different metrics, Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE), and results are shown in Table IV. Overall the model performs very well with an acceptable 1.20°C RMSE with standard deviation of 1.13°C and a mean absolute error of 0.86°C with standard deviation of 0.85°C.
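The error metrics named above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function names are assumptions, and the sample data are the Case 1 rows of Table II (CFD as reference, DCAM as prediction), used here only because the MMT measurements themselves are not listed in the text.

```python
import math


def rmse(pred, ref):
    """Root Mean Square Error between predicted and reference values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def mae(pred, ref):
    """Mean Absolute Error."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def mape(pred, ref):
    """Mean Absolute Percentage Error, relative to the reference values."""
    return 100.0 * sum(abs((p - r) / r) for p, r in zip(pred, ref)) / len(ref)


# Case 1 of Table II: CFD reference and DCAM (PHY) prediction, layers L1 to L8.
cfd1 = [12.8, 12.8, 13.2, 13.9, 14.9, 17.5, 22.4, 23.6]
phy1 = [12.8, 12.81, 12.86, 13.1, 14.1, 17.19, 22.45, 23.6]

print(f"RMSE = {rmse(phy1, cfd1):.6f} degC")
print(f"MAE  = {mae(phy1, cfd1):.5f} degC")
print(f"MAPE = {mape(phy1, cfd1):.3f} %")
```

For this sample the RMSE evaluates to the 0.432189 listed for Case 1 in Table II. Note that the "Total RMSE Error" in that table equals the mean of the three per-case values rather than an RMSE pooled over all 24 points.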

This is comparable to results from CFD with an RMS error of approximately 2°C [24] in a small controlled data centre. Reduced order methods such as measurement-based models [8], which have an inlet temperature RMSE ranging from 0.627°C to 2.044°C with the lower value requiring a quite large number of prescribed sensor nodes to supply the input boundary conditions, provide another example of error rates.

On a macro level, the error distribution at each layer for all sensors is shown in the chart in Fig. 8. The chart plots the root squared error between each of the predicted and measured temperatures for each layer. We can see that the maximum error for 50% of the temperature data for all layers is under 1.04°C with a range of approximately 0.60°C. This extends to approximately 1.68°C with a range of 1.17°C for 75% of the data and approximately 2.5°C for 90% of the temperature data. The extreme outliers occur beyond 95% of the data.

Fig. 8. The chart shows the root squared error of measured vs predicted temperatures for each layer. The x-axis is the percentage of the data and the y-axis is the RMSE.

Plotting histograms in Fig. 9 we can see that layers 4, 5, 6 and 7 have greater variability and indeed standard deviation. This is as expected, as the prediction of the inflection point in the curve is influenced by many factors in real data centres that are either not captured, or are difficult or computationally prohibitive to capture, in most modeling approaches. These factors include geometry of the server doors, individual server power loads (there can be up to 42 individual servers in a single rack), layout of the data centre, recirculation, missing servers or filler panels etc. The experimental data only contains rack locations with no information on the servers contained therein, therefore a generalised uniform power distribution is implemented. Given the availability of such information, the algorithm makes provisions to vary the sampling height increments and distribute the rack power to individual servers in the rack. Here the sampling height is dictated by the measurement data for direct comparison. The value at layer 8 is above the rack and used as the ceiling temperature, so is ignored.

Fig. 10 shows a color map representation of all results broken down over the 25 test cases. Each test case is a vertical column broken down into the layers. Within each case setting we predict 17 inlet temperature locations with 8 layers each. Each color in the column is the RMSE of all

RMSE and standard deviation of (predicted - measured)
                          L1      L2      L3      L4      L5      L6      L7  Overall   Range
RMSE                    0.42    0.64    1.12    1.46    1.40    1.50    1.40     1.20       -
Standard Deviation      0.14    0.51    0.92    1.37    1.28    1.38    1.40     1.13       -
Max Value               0.05    2.34    2.23    6.76    4.89    4.59    3.77     6.76       -
Min Value              -0.60   -1.39   -6.95   -4.74   -3.53   -3.63   -4.06    -6.95       -

MAE: statistics of sqrt((predicted - measured)^2)
                          L1      L2      L3      L4      L5      L6      L7  Overall   Range
Mean Value              0.39    0.51    0.79    1.03    1.05    1.20    1.04     0.86       -
Standard Deviation      0.14    0.39    0.80    1.04    0.93    0.91    0.94     0.85       -
Deviation Squared       8.58   65.46  271.27  459.60  362.94  347.99  373.70        -       -
90% error under         0.54    0.98    1.78    2.49    2.39    2.46    2.53     1.88    1.99
75% error under         0.51    0.68    1.06    1.57    1.43    1.68    1.56     1.21    1.17
50% error under         0.47    0.44    0.54    0.63    0.76    1.04    0.72     0.66    0.60
Max Value               0.05    2.34    6.95    6.76    4.89    4.59    4.06     6.95       -

MAPE: Mean Absolute Percentage Error
                          L1      L2      L3      L4      L5      L6      L7  Overall   Range
Mean Value              2.52    2.77    5.02    5.82    4.95    5.35    4.64     4.44       -
Standard Deviation      1.18    2.08    5.63    5.74    4.18    3.87    4.20     4.31       -
Max Value               4.58   14.10   50.00   29.39   20.81   19.01   20.20    50.00       -

TABLE IV
MODEL VS EMPIRICAL DATA: STANDARD DEVIATION OF LAYERS L1-L7 OVER ALL 25 CASES.

17 temperature predictions for that layer. As we can see, there is very good agreement between the predicted and actual temperature values, indicated by the blue color on the color map; however, there is an area where the predicted temperature deviates from the actual temperature. This can be explained when we examine the temperature distribution in the data centre in Fig. 11, which reveals a recirculation problem close to this location, where the exhaust air from the servers is concentrated. This is due to CRAC 2, which is switched off in these cases. Additionally, the data centre has an inherently poor layout, as the CRAC units and racks are parallel to each other. This prevents a direct return path for exhausted air, causing it to stagnate as the operational CRAC unit is too far away to draw the hot air back for cooling. This encourages side recirculation at our inlet sensor location, indicated by the red dot in Fig. 11. We account only for overhead recirculation in this preliminary version of the DCAM model, but in future work a three-dimensional DCAM is proposed to take into account the effects of side recirculation.

Fig. 9. Distribution of errors per layer, binned in 0.25 °C intervals.

Fig. 10. RMSE for all sensors per layer over the 25 case settings.
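The grid behind the Fig. 10 color map, one RMSE value per test case and layer computed over the sensor locations, can be sketched as follows. The nested-list data layout and the function names are assumptions made for illustration; the paper does not describe how the results were stored or plotted.

```python
import math


def rmse(errors):
    """Root mean square of a list of (predicted - measured) errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def rmse_grid(errors_by_case):
    """Build a case-by-layer grid of RMSE values.

    errors_by_case[c][l] is the list of per-sensor errors (deg C)
    for test case c and layer l (hypothetical layout).
    """
    return [[rmse(sensor_errors) for sensor_errors in layers]
            for layers in errors_by_case]


def worst_cell(grid):
    """Return (case_index, layer_index, rmse) of the largest grid entry."""
    value, case_idx, layer_idx = max(
        (cell, c, l)
        for c, row in enumerate(grid)
        for l, cell in enumerate(row)
    )
    return case_idx, layer_idx, value


# Two hypothetical cases, two layers, two sensors each.
errors = [
    [[1.0, -1.0], [0.0, 0.0]],
    [[3.0, 4.0], [2.0, -2.0]],
]
grid = rmse_grid(errors)
print(worst_cell(grid))
```

Locating the largest cell in such a grid is one simple way to flag case and layer combinations, like the recirculation hot spot discussed above, that merit a closer look.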

Fig. 11. Sensor location 2: build-up of hot exhaust air in the location of sensor 2 due to CRAC 2 being switched off and reduced air supply from CRAC 1.

V. CONCLUSIONS

The data centre is a complex dynamic environment in terms of airflow and temperature distribution. Current physics-based modeling approaches require solving the full numeric calculations for the complete data centre domain, which can be expensive in terms of computation time and is usually done offline. Our approach, underpinned by turbulent jet theory, is applied to only local boundary conditions and can be solved in isolation from the remainder of the data centre. The model is validated against a real-life data centre which was subject to 25 control change settings and returned an accuracy error of 1.2 °C (RMSE). These results are in line with those reported for CFD simulation software (approximately 2 °C RMS error [24]). The DCAM model can be used standalone or to complement real-world sensors. Potential future applications include real-time monitoring of server inlet temperatures that adapts to changes in server workloads, cooling flow rates and supply temperatures, and that could be used in conjunction with variable frequency drives on air conditioners.

ACKNOWLEDGMENT

We wish to acknowledge the support of various IBM data centre staff who have helped us over the years but especially Hendrik Hamann. We would also like to thank Kailash Karki of Innovative Research for use of TileFlow CFD software.

REFERENCES

[1] Uptime Institute, “2014 Uptime Institute data center industry,” 2014, https://journal.uptimeinstitute.com/2014-data-center-industry-survey/ (accessed June 2016).
[2] J. G. Koomey, “Growth in data center electricity use 2005 to 2010, (a report by Analytics Press, completed at the request of the New York Times),” August 2011, http://www.analyticspress.com/datacenters.html.
[3] M. Toulouse, G. Doljac, and C. Bash, “Exploration of a potential-flow-based compact model of air-flow transport in data centers,” in Proceedings of the 2009 International Mechanical Engineering Congress and Exposition, November 2009.
[4] J. W. VanGilder, X. Zhang, and C. M. Healey, “Data center airflow prediction with an enhanced potential flow model,” in Thermal Management; Data Centers and Energy Efficient Electronic Systems, 2013, doi:10.1115/ipack2013-7307.
[5] D. Lettieri, “Expeditious data center sustainability, flow, and temperature modeling: Life-cycle exergy consumption combined with a potential flow based, Rankine vortex superposed, predictive method,” PhD Thesis, UC Berkeley, 2012.
[6] M. M. Toulouse, D. J. Lettieri, V. P. Carey, C. E. Bash, and A. J. Shah, “Evaluation of a vortex model of buoyancy-driven recirculation in potential flow analysis of data center performance,” in 13th Inter-Society Conference on Thermal and Thermomechanical Phenomena in Electronic Systems, 2012, doi:10.1109/itherm.2012.6231413.
[7] H. F. Hamann, V. Lopez, and A. Stepanchuk, “Thermal zones for more efficient data center energy management,” in Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm), 2010 12th IEEE Intersociety Conference on, June 2010, pp. 1–6.
[8] V. Lopez and H. F. Hamann, “Measurement-based modeling for data centers,” in Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm), 2010 12th IEEE Intersociety Conference on, June 2010, pp. 1–8.
[9] E. Samadiani, Y. Joshi, H. F. Hamann, M. K. Iyengar, S. Kamalsy, and J. Lacey, “Reduced order thermal modeling of data centers via distributed sensor data,” in Proceedings of IPACK2009, 2009.
[10] E. Samadiani and Y. Joshi, “Proper orthogonal decomposition for reduced order thermal modeling of air cooled data centers,” ASME Transactions J. Heat Transfer, vol. 132, no. 7, pp. 071 402–071 402–14, 2010.
[11] O. Sutton, Micrometeorology. McGraw-Hill, 1953.
[12] Q. Liu, “Energy performance of underfloor air distribution (UFAD) systems part III: The fluid dynamics of a UFAD system,” 1996, PhD Thesis, University of California.
[13] Q. Li and P. Linden, “The fluid mechanics of underfloor air distribution,” Journal of Fluid Mechanics, vol. 554, pp. 323–341, May 2006.
[14] P. Linden and P. Cooper, “Multiple sources of buoyancy in a naturally ventilated enclosure,” Journal of Fluid Mechanics, vol. 311, pp. 177–192, March 1996.
[15] P. Cooper, “The theory of plumes adapted to model air movement in naturally ventilated buildings,” in Building Simulation ’93 Proceedings, August 1993.
[16] S. B. Pope, Turbulent Flows. New York, NY: Cambridge University Press, 2000.
[17] H. J. Hussein, S. Capp, and W. K. George, “Velocity measurements in a high-Reynolds-number, momentum-conserving, axisymmetric, turbulent jet,” J. Fluid Mech., vol. 258, pp. 31–75, 1994.
[18] B. Cushman-Roisin, Environmental Fluid Mechanics. New York, NY: John Wiley & Sons, Inc., 2010.
[19] M. L. Albertson, Y. Dai, R. Jensen, and H. Rouse, “Diffusion of submerged jets,” American Society of Civil Engineers, pp. 637–677, 1948.
[20] I. Wygnanski and H. Fiedler, “Some measurements in the self-preserving jet,” J. Fluid Mech., vol. 38, pp. 577–612, 1969.
[21] N. Kotsovinos, “A study of entrainment and turbulence on a plane buoyant jet,” 1975, Report No. KH-R-32, W. M. Keck Laboratory of Hydraulics and Water Resources, California Institute of Technology, Pasadena, California.
[22] INRES, “Tileflow,” http://inres.com/products/tileflow/overview.html.
[23] H. F. Hamann, M. Schappert, M. Iyengar, T. van Kessel, and A. Claassen, “Methods and techniques for measuring and improving data center best practices,” in ITHERM, 2008.
[24] W. A. Abdelmaksoud, H. E. Khalifa, T. Q. Dang, R. R. Schmidt, and M. Iyengar, “Improved CFD modeling of a small data center test cell,” in 12th IEEE Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems, 2010.
