
v1.5 | November 2024 | NVD-2031

NUTANIX VALIDATED DESIGN

Hybrid Cloud: AOS 6.5 with AHV On-Premises Design

Legal
© 2024 Nutanix, Inc. All rights reserved. Nutanix, the Enterprise Cloud Platform, the
Nutanix logo and the other Nutanix products, features, and/or programs mentioned
herein are registered trademarks or trademarks of Nutanix, Inc. in the United States
and other countries. All other brand and product names mentioned herein are for
identification purposes only and are the property of their respective holder(s), and
Nutanix may not be associated with, or sponsored or endorsed by such holder(s). This
document is provided for informational purposes only and is presented "as is" with no
warranties of any kind, whether implied, statutory or otherwise.
Nutanix, Inc.
1740 Technology Drive, Suite 150
San Jose, CA 95110

Contents

1. Executive Summary................................................................................. 5
Audience.............................................................................................................................................. 7
Purpose............................................................................................................................................... 7
Software Versions............................................................................................................................... 7
Document Version History.................................................................................................................. 8

2. Core Infrastructure Design..................................................................... 9


Core Infrastructure Conceptual Design............................................................................................ 11
Scalability...........................................................................................................................................12
Resilience.......................................................................................................................................... 14
VM Design.........................................................................................................................................15
Cluster Design...................................................................................................................................18
Storage Design..................................................................................................................................23
Network Design................................................................................................................................. 25
Management Components................................................................................................................ 33
Monitoring.......................................................................................................................................... 35
Security and Compliance.................................................................................................................. 39
Datacenter Infrastructure...................................................................................................................42

3. Backup and Disaster Recovery............................................................ 45


Backup and Disaster Recovery Conceptual Design.........................................................................51
Disaster Recovery............................................................................................................................. 51
Backup............................................................................................................................................... 56

4. Self-Service with Automation............................................................... 62


Self-Service with Automation Conceptual Design............................................................................ 67
Self-Service with Automation Logical Design................................................................................... 67
Self-Service with Automation Detailed Design................................................................................. 70

5. Ordering.................................................................................................. 76
Substitutions...................................................................................................................................... 76
Sizing Considerations........................................................................................................................77
Bill of Materials..................................................................................................................................77
6. Test Plan................................................................................................. 85

7. Appendix................................................................................................. 86
Windows VM Performance Tuning................................................................................................... 86
Linux VM Performance Tuning......................................................................................................... 86
Design Limits.....................................................................................................................................88
References........................................................................................................................................ 90

About Nutanix.............................................................................................91
List of Figures.............................................................................................................................................92

1. Executive Summary
Nutanix continues to innovate and engineer solutions that are simple to deploy and
operate. To further improve customer experience and add value for customers, Nutanix
uses robust validation to simplify designing and deploying solutions. This document
discusses the design decisions that support deploying a scalable, resilient, and secure
private cloud solution with two datacenters for high availability and disaster recovery.
Nutanix can deliver this Nutanix Validated Design (NVD), based on the Nutanix Hybrid
Cloud Reference Architecture, as a bundled solution for general server virtualization that
includes hardware, software, and services to accelerate and simplify the deployment
and implementation process. The architecture includes an automation layer, a business
continuity layer (with backup and restore), a management layer (with capacity planning,
access control, monitoring, platform life-cycle management, and real-time analytics),
a security and compliance layer, and a cluster that contains a virtualization layer
(hypervisor, networking, and storage) and a physical layer (compute, networking, and
storage).

© 2024 Nutanix, Inc. All rights reserved | 5



Figure 1: Architecture for Hybrid Cloud On-Premises Validated Designs

This scalable modular design, based on the Nutanix block-and-pod architecture, is well
suited to hybrid cloud use cases of all sizes. Some highlights of the NVD include:
• A full-stack solution for hybrid cloud deployments that integrates multiple products
including AOS, AHV, Prism Pro, Nutanix Cloud Manager (NCM) Self-Service (formerly
Calm), Flow Network Security, Nutanix Disaster Recovery, Mine, and HYCU
• Support for up to 7,500 general-purpose VMs per pod in a block-and-pod architecture,
which offers a repeatable and scalable design
• A multidatacenter design built for failure tolerance and 99.999 percent availability
• Active-active datacenters with two availability zones (AZs) that run at 50 percent
capacity to provide for full AZ failover in either direction
• Testing for both planned and unplanned full-site failover with standardized business
continuity and disaster recovery (BCDR) service levels
• NCM Self-service automation through the NCM Self-Service marketplace that includes
blueprints for Windows, Linux, LAMP, and WISA as well as standardized VM sizes


• Accelerated customer time-to-value and reduced risk


• A fully elaborated bill of materials for hardware, software, and services
This validated design is just one example of a supported hybrid cloud configuration. You
can design and build a hybrid cloud on Nutanix in many ways, and you can deviate from
this specific configuration while still following Nutanix best practices.
You can have this validated solution up and running in weeks with minimal burden on
your internal teams so that you can realize the full value of your infrastructure quickly.
After you place your order, Nutanix takes care of the rest.

Audience
This guide is part of the Nutanix Solutions Library and is intended for architects and
engineers responsible for scoping, designing, installing, and testing server virtualization
solutions. Readers of this document should already be familiar with the Nutanix Hybrid
Cloud Reference Architecture and Nutanix products.

Purpose
This document describes the components, integration, and configuration for the NVD
packaged hybrid cloud solution and covers the following topics:
• Core Nutanix infrastructure and related technology
• Backup and disaster recovery for the Nutanix platform and hosted applications
• NCM Self-Service automation and integration with third-party applications
• Bill of materials

Software Versions
Table: Software Versions Used in Validation Testing
Component Software Version
Nutanix Prism Central pc.2022.6.8
Nutanix AOS 6.5.4

Nutanix AHV el7.nutanix.20220304.242
Nutanix Mine Integrated Backup 3.0
HYCU 4.6
Nutanix Objects 3.6
Object manager 3.6
NCM Self-Service 3.6
Infoblox 8.4.4-386831
F5 BIG-IP 16.1.0 Build 0.0.19 Final
Nutanix Life Cycle Manager 2.5
Nutanix Microservices Platform 2.4.2.1
Nutanix Flow Network Security 6.0
Nutanix Cluster Check 4.6.0
Nutanix Foundation 5.3

Document Version History


Version Number Published Notes
1.0 December 2022 Original publication.
1.1 November 2023 Removed references to NearSync replication and reduced the number of protected VMs.
1.2 January 2024 Added the Test Plan section.
1.3 July 2024 Updated the Core Infrastructure Conceptual Design and Ordering sections.
1.4 September 2024 Updated the Ordering section.
1.5 November 2024 Updated AOS release cycle descriptions.


2. Core Infrastructure Design


The following lists provide core infrastructure design requirements, assumptions, risks,
and constraints.
Core infrastructure design requirements by component:
• Management:
› Deploy a unified management plane at the right scale to manage all clusters and
workloads in the environment.
› Deploy unified management for the dedicated management cluster at each
datacenter (dual Prism Central instances per pod).
› Configure management to integrate with Active Directory for authentication.
› Use Active Directory–based groups for access control.
• Virtual machines:
› Support at least three VM sizes: small, medium, and large.
› Support Windows Server 2019 and Red Hat Enterprise Linux (RHEL) 8 as VM
operating systems.
› Limit virtual CPU overcommitment to 4:1, or 4 vCPU per physical CPU core.


• Monitoring:
› Enable platform fault monitoring and use email to send alerts.
› Monitor performance metrics and store historical data for the past 12 months.
› Keep resource usage under 75 percent; usage over 75 percent generates an email
alert.
› Monitor resources critical to Nutanix AOS operations (for example, CPU, memory,
storage, and network resources); resource usage that exceeds configured limits
generates an alert.
› For resources with high-availability reservations, measure the resource usage
threshold against the usable capacity after subtracting the capacity reserved for
high availability.
› Monitor all network links (including host-switch and switch-switch) for bandwidth
utilization and store historical data for the past 12 months.
› Use email as the primary channel for event monitoring alerts.
› Ensure that event monitoring is resilient. For example, when the management plane
is the primary source of alerts, you need a secondary method for monitoring the
management plane itself. Then, if the management plane fails, an alert from the
secondary source can trigger the action to recover the management plane.
› Facilitate automated issue discovery and remote diagnostics.
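The HA-aware threshold rule above can be sketched in a few lines of Python. This is an illustrative sketch only; the 75 percent limit comes from this design, while the function name and example capacity figures are hypothetical:

```python
def usage_alert(used_ghz: float, total_ghz: float, ha_reserved_ghz: float,
                limit: float = 0.75) -> bool:
    """Return True when usage exceeds the design's 75 percent limit,
    measured against usable capacity (total minus the HA reservation)."""
    usable = total_ghz - ha_reserved_ghz
    return used_ghz > limit * usable

# Hypothetical cluster reserving one node's worth of CPU for HA:
# usable capacity is 1,500 GHz, so the alert threshold is 1,125 GHz.
print(usage_alert(used_ghz=1200.0, total_ghz=1600.0, ha_reserved_ghz=100.0))  # True
print(usage_alert(used_ghz=1100.0, total_ghz=1600.0, ha_reserved_ghz=100.0))  # False
```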

Note: You must confirm every assumption in the following list.

Core infrastructure design assumptions by component:


• Clusters: The maximum number of VMs per workload cluster is 1,860 (124 per usable
node).


• Monitoring:
› IT operations teams can continuously staff the mailbox that receives monitoring
alerts to address critical issues promptly.
› IT operations teams can provide email infrastructure with sufficient resilience to
send, receive, and access emails even during critical outages.
› Network security appliances allow the management plane to transmit telemetry data
to Nutanix.
• Infrastructure: IT operations teams can deploy Active Directory and DNS in a highly
available configuration in each management cluster.
Core infrastructure monitoring component design risk: If Prism Central becomes
unavailable for any reason, the platform can't send alerts. To mitigate this risk, configure
each Prism Element instance to send alerts as well. As this approach results in duplicate
alerts during normal operations, send Prism Element alerts to a different mailbox that you
can monitor when Prism Central is unavailable.
Core infrastructure design constraints by component:
• Clusters: The number of VMs per pod doesn't exceed 7,500 (the limit of Flow Network
Security policies per Prism Central instance).
• Monitoring: SMTP is an available channel in the environment that can receive event
monitoring alerts. Syslog captures logs but doesn't generate alerts on events.

Core Infrastructure Conceptual Design


The conceptual pod design has the following features:
• Two active-active datacenters in separate availability zones (AZs) with less than 5 ms
of round-trip time latency between sites
• A small management cluster in each AZ that hosts services such as Prism Central and
Active Directory
• An instance of Prism Central hosted in the management cluster of each AZ (dual
Prism Central deployment per pod)
• A workload cluster in each AZ that hosts the production workloads


• A backup cluster in each AZ replicated between sites for disaster recovery

Figure 2: Hybrid Cloud On-Premises Conceptual Pod Design

Scalability
Scalability is one of the core concepts of the Nutanix platform and refers to the ability to
increase storage and compute capacity to meet current and future workload demands. A
well-designed cluster meets current requirements while providing a path to support future
growth.


Scalability Conceptual Design


This NVD allows horizontal and vertical scaling within the boundaries set by running
workloads in a single rack per AZ across two AZs. If the workload grows, you can add
nodes and storage capacity to the cluster. This design has a maximum of 16 nodes per
cluster; to scale beyond that number, create additional Nutanix clusters.
Note: If the infrastructure changes in one AZ, you must upgrade the other AZ accordingly to ensure that a
failover can finish successfully.

Because this NVD supports three general VM sizes, each node's memory is fully
populated to accommodate the resulting mixed memory requirements. This approach
also provides maximum memory performance, even if you don't need it. If memory
pressure increases, add more nodes. The design uses all-flash disks to accommodate
peak workload demands.
The design uses two racks in the on-premises datacenter, with redundant top-of-rack
network switches. One rack holds management and backup clusters, and the other
holds workload clusters. Datacenter power and cooling limitations might introduce further
constraints; for more information, see the Datacenter Infrastructure section.
When you scale VM workloads, cluster design is the biggest constraint.
Table: Scalability Design Decisions
Design Option Validated Selection
Node memory population Fully populate node memory.
Node drive type Use all-flash drives.
Node drive population Don't fully populate nodes with disk drives.
Dual rack Use two racks per AZ.
Establish scalability boundaries Use X-Ray to confirm load per node.
Rack availability Don't use rack availability.

Configuration maximums also constrain solution scalability. For the limits specific to this
design, see the Design Limits section of this document or the configuration maximums
or the maximum system values on the Nutanix Support Portal (Nutanix Portal credentials
required). You might reach a constraint before you reach a configuration maximum. For
example, a workload node that contains only Linux LAMP all-in-one VMs can't hold more
than 48 VMs, assuming that you can use 100 percent of the available memory for VMs.
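The 48-VM figure follows directly from the workload node's memory and the large VM size used by the LAMP all-in-one blueprint. A back-of-the-envelope check, using numbers from the Platform Selection and VM sizing tables in this document:

```python
NODE_MEMORY_GB = 24 * 64   # workload node: 24 x 64 GB DIMMs = 1,536 GB
LAMP_ALL_IN_ONE_GB = 32    # large VM size used by the LAMP all-in-one blueprint

# Memory-bound VM capacity, assuming 100 percent of node memory is usable for VMs
max_vms = NODE_MEMORY_GB // LAMP_ALL_IN_ONE_GB
print(max_vms)  # 48
```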


Resilience
Nutanix provides many resilience features, including storage replication, snapshots,
block awareness, degraded node detection, and self-healing. These capabilities
increase the resilience of all workloads, even if the application itself has limited resilience
options. Nutanix layers these software features on hardware designed to be resilient
(for example, with redundant physical components and power supplies, many of
which are hot-swappable or otherwise easily serviceable). Running workloads in a
virtualized environment adds another kind of resilience, as you can perform many
maintenance operations without application downtime. A resilient network fabric that can
sustain individual link, node, or block failures without significant impact completes the
architecture.

Resilience Conceptual Design


All components are physically redundant. The physical components include the top-of-
rack switches, the nodes and their internal parts, and the datacenter itself in case of a
disaster.
To protect workloads to meet or exceed service-level agreements (SLAs), this NVD
separates the workload clusters from the management clusters. The workload cluster
sizing provides n + 1 failure redundancy. Monitoring and alerting ensure that any
issues result in an alert; consistently monitoring workload growth ensures that sufficient
headroom is available at any time.
Generic workloads don't have an ideal cluster size. This NVD uses 16-node building
blocks to take advantage of block awareness, a key platform resilience feature.
X-Ray test scenarios establish resilience boundaries for various failure scenarios.
Table: Resilience Design Decisions
Design Option Validated Selection
Component redundancy Ensure the full redundancy of all components in the AZ.
Resilience boundaries Use X-Ray to find resilience constraints.


VM Design
As the overall objective is to provide a hybrid cloud environment for general server
virtualization workloads, this NVD establishes three standard VM sizes to facilitate
consistent deployment, automation, sizing, and capacity planning for the environment.
The Cluster Design section specifies the maximums for each VM size to help with
capacity planning, but you can combine any number of VMs of any size up to the
maximums Nutanix designed this architecture to support. All VMs deploy with UEFI and
Secure Boot enabled.

VM Names
Nutanix recommends keeping the VM name and the guest OS host name the same. This
approach streamlines operational and support requirements and minimizes confusion
when you identify systems in the environment.

VM Guest Clustering
You can use VM guest clustering to form failover clusters using shared disk devices
with both Windows and Linux guest operating systems. With Nutanix AHV, you can use
a shared volume group between multiple VMs as part of a failover cluster—connect
the shared volume group to the VMs and install the necessary guest software. Nutanix
natively integrates SCSI-based fencing using persistent reservations and doesn't require
any complex configuration.

VM Standard Deployment Sizes


This NVD supports the VM configurations detailed in the following table.
Table: Supported VM Configurations
Configurable Small VM Medium VM Large VM
Virtual CPU 1 2 4
Virtual memory 8 GB 16 GB 32 GB
Virtual storage 50 GB 100 GB 200 GB
Virtual NIC 1 1 1
Virtual CD-ROM 1 1 1
Volume groups No No No


Maximum VM instances per node 124 92 46

Note: This design targets an oversubscription ratio of four or fewer virtual CPUs per physical CPU core.
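The per-node maximums in the table can be checked against the two limiting resources: CPU (at 4:1 overcommitment on the workload node's 48 physical cores) and memory (1,536 GB per node). This is an illustrative consistency check, not a sizing tool; the figures come from the Platform Selection section of this document:

```python
PHYSICAL_CORES = 2 * 24          # workload node: 2 x 24-core Xeon Gold 5318Y
VCPU_LIMIT = PHYSICAL_CORES * 4  # 4:1 overcommitment -> 192 vCPUs per node
MEMORY_GB = 24 * 64              # 24 x 64 GB DIMMs = 1,536 GB per node

# Per standard size: (vCPUs, memory in GB, maximum VMs per node)
sizes = {"small": (1, 8, 124), "medium": (2, 16, 92), "large": (4, 32, 46)}

for name, (vcpu, mem, max_vms) in sizes.items():
    # Each per-node maximum must fit within both the vCPU and memory limits.
    assert max_vms * vcpu <= VCPU_LIMIT, name
    assert max_vms * mem <= MEMORY_GB, name
    print(name, max_vms * vcpu, max_vms * mem)
```

For example, 46 large VMs consume 184 vCPUs and 1,472 GB, both within the node's limits.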

Windows VMs
All Windows VMs in this NVD are based on Windows Server 2019 Datacenter Edition.
Windows VMs use the standard blueprints detailed in the following table when
provisioned with NCM Self-Service.
Note: Secure Boot–enabled VMs don't support hot-plug operations.

Table: Standard Blueprints for Windows VMs


Configurable WISA All-in-One Template WISA Distributed Template Standard Blueprint Template
Base template size Large Medium Small
Virtual CPU 4 per VM 2 per VM 1 per VM
Virtual memory 32 GB per VM 16 GB per VM 8 GB per VM
Virtual storage 200 GB per VM (VirtIO-SCSI) 100 GB per VM (VirtIO-SCSI) 50 GB per VM (VirtIO-SCSI)
Virtual NIC 1 (VirtIO-Net: kNormal) 1 (VirtIO-Net: kNormal) 1 (VirtIO-Net: kNormal)
Virtual CD-ROM 1 1 1
UEFI or Legacy BIOS UEFI UEFI UEFI
Secure Boot Enabled Enabled Enabled

Note: Flow Network Security policies require the kNormal NIC type to function correctly.

Nutanix VirtIO Driver version 1.1.7 or later is required on all Windows Guest VMs.
The WISA (Windows Server, Internet Information Services, Microsoft SQL Server, and
ASP.NET) all-in-one blueprint installs and configures all necessary web, application, and
database components when deployed through NCM Self-Service.


The WISA distributed blueprint includes at least two load-balanced VMs for web servers,
two load-balanced application servers, and one database server. NCM Self-Service
provisions the individual VMs and installs their specific roles. The WISA distributed
blueprint predefines and automatically applies Prism Central categories and Flow
Network Security policies.
Refer to the appendix for Windows VM performance tuning recommendations.

Linux VMs
All Linux VMs in this NVD are based on Red Hat Enterprise Linux 8.4. Linux VMs use
the standard blueprints detailed in the following table when provisioned with NCM Self-
Service.
Note: Secure Boot–enabled VMs don't support hot-plug operations.

Table: Standard Blueprints for Linux VMs


Configurable LAMP All-in-One Template LAMP Distributed Template Standard Blueprint Template
Base template size Large Medium Small
Virtual CPU 4 per VM 2 per VM 1 per VM
Virtual memory 32 GB per VM 16 GB per VM 8 GB per VM
Virtual storage 200 GB per VM (VirtIO-SCSI) 100 GB per VM (VirtIO-SCSI) 50 GB per VM (VirtIO-SCSI)
Virtual NIC 1 (VirtIO-Net) 1 (VirtIO-Net) 1 (VirtIO-Net)
Virtual CD-ROM 1 1 1
UEFI or Legacy BIOS UEFI UEFI UEFI
Secure Boot Enabled Enabled Enabled

The LAMP (Linux, Apache, MySQL, and PHP) all-in-one blueprint has all necessary web,
application, and database components preinstalled and ready to deploy on demand as a
single VM through NCM Self-Service.
The LAMP distributed blueprint includes at least two load-balanced VMs for web servers,
two load-balanced application servers, and one database server. NCM Self-Service
provisions the individual VMs and installs their specific roles. The LAMP distributed


blueprint predefines and automatically applies Prism Central categories and Flow Network Security policies.
For Linux VM performance tuning recommendations, see the appendix.

Cluster Design
This design incorporates three distinct cluster types:
• Management: Critical infrastructure and environment management workloads
• Workload: The building block for all general server virtualization workloads
• Backup: Backup storage for the workload and management components
This section defines the overall high-level cluster design, platform selection, capacity
management, scaling, and resilience. This design follows the block-and-pod architecture
defined in the Nutanix Hybrid Cloud Reference Architecture.

Figure 3: Hybrid Cloud On-Premises Block-and-Pod Architecture


Cluster Conceptual Design


This NVD solution uses one region with two separate AZs. Both AZs host active
workloads, and each AZ provides a replication target for the other's workload
cluster building blocks, as shown in the Hybrid Cloud On-Premises Conceptual Pod
Design image. Cloud-native applications that have built-in redundancy don't require
infrastructure-level replication between AZs.
Table: Cluster Design Decisions
Design Option Validated Selection
Number of regions Use 1 region.
Number of AZs Use 2 AZs.
Number of datacenters Use 2 datacenters: 1 per AZ.
Workload types per cluster Use mixed workloads per cluster, as this design is for general server virtualization.
Minimum workload cluster building block size Use at least 4 nodes.
Workload cluster building block expansion increments Use 4 nodes.
Maximum workload cluster building block size Use at most 16 nodes.
Maximum number of supported VMs Maximum number of supported VMs across two AZs is 7,500.
Maximum number of running VMs per usable node in the workload cluster building block Use at most 124 small VMs, 92 medium VMs, or 46 large VMs per usable node.
Maximum number of VMs per workload cluster building block Use at most 1,860 small VMs per workload cluster building block.
Workload cluster building block node redundancy Use n + 1 for redundancy.
Maximum usable nodes per maximum workload cluster building block Configure at most 15 usable nodes per maximum workload cluster building block to ensure n + 1.
Workload cluster building blocks in one rack or split across multiple racks Use one rack per workload cluster building block.
Cluster replication factor Use replication factor 2.
Cluster high availability configuration Guarantee high availability.

Percentage of deployed VMs supported during disaster recovery failover Support 25% of VMs: protect up to 1,500 VMs with asynchronous replication and up to 500 VMs with synchronous replication, for up to 2,000 VMs per pod total.
Maximum number of VMs deployed per workload cluster building block accounting for disaster recovery capacity Deploy at most 930 small VMs per workload cluster building block.
Maximum usable resource capacity per workload building block accounting for disaster recovery failover Use at most 50 percent of the resource capacity.

Platform Selection
The following table provides details regarding hardware platform selection.
Table: Platform Selection
Hardware or Service Management Cluster Workload Cluster Backup (Mine) Cluster
Node type NX-1175S-G8 NX-3170-G8 NX-8155-G8
Node count 4 (increments of 1) 4–16 per building block (increments of 4, up to 16 maximum) 4 (increments of 1)
Processor 1 Intel Xeon Gold 6326 16-core 185 W 2.9 GHz 2 Intel Xeon Gold 5318Y 24-core 165 W 2.1 GHz (Ice Lake) 2 Intel Xeon Gold 6326 16-core 185 W 2.9 GHz
Memory 8 × 64 GB 3,200 MHz DDR4 RDIMM (512 GB total) 24 × 64 GB 3,200 MHz DDR4 RDIMM (1.5 TB total) 8 × 32 GB 3,200 MHz DDR4 RDIMM (256 GB total)
SSD 2 × 1.92 TB 6 × 3.84 TB 2 × 3.84 TB
HDD N/A N/A 8 × 18 TB
NIC 25 GbE Dual SFP+ 25 GbE Dual SFP+ 25 GbE Dual SFP+
Form factor 1RU single node 1RU single node 2RU single node
Support 3Y Production 3Y Production 3Y Production


Capacity Management
This NVD sizes the management and backup (Nutanix Mine Integrated Backup) clusters
to host typical workloads as defined in the Management Components and Backup
sections of this document. If those clusters need more resources, you can expand them
one node at a time. NCM Intelligent Operations can help forecast resource demand.
The main unit of expansion for workload clusters is the building block. In this design,
each workload cluster building block has a maximum of 16 nodes, with 15 nodes of
usable capacity and 1 node for failure capacity, and a minimum of 4 nodes with 3
usable (following the n + 1 principle). You can expand a workload cluster building block
in increments of 4 nodes, up to the maximum. Based on the small VM specification,
you can have a maximum of 1,860 VMs per workload cluster building block. When a
workload cluster building block reaches the maximum number of nodes, the administrator
starts a new building block with the 4-node minimum, then can expand the new block in
increments of 4 nodes as needed.
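The 1,860-VM figure for a full building block follows from the n + 1 rule and the per-node small-VM maximum. A quick check, using the numbers from this section (the function name is illustrative):

```python
def max_small_vms(total_nodes: int, per_node: int = 124) -> int:
    """Running small-VM capacity of a workload cluster building block
    with one node reserved for failure and maintenance (n + 1)."""
    usable_nodes = total_nodes - 1
    return usable_nodes * per_node

print(max_small_vms(16))  # 15 usable nodes x 124 = 1,860
print(max_small_vms(4))   # minimum block: 3 usable nodes x 124 = 372
```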
Each pod can support a maximum of 7,440 VMs. When a pod reaches the maximum
number of VMs, start a new pod. This NVD sets the workload cluster building block
maximum at 16 nodes to allow you to complete nondisruptive Nutanix software,
hardware, firmware, and driver maintenance using Nutanix LCM within a 48-hour
maintenance window (using Nutanix NX model hardware). You can split the maintenance
window into shorter segments if needed. You can also use a smaller maximum size per
workload building block to shorten maintenance windows and deploy more small clusters
per pod without changing the maximum number of nodes or VMs each pod supports.
For example, an 8-node workload cluster building block reduces maintenance windows
by half and provides twice the number of clusters per pod without changing the number
of nodes supported. However, the number of usable nodes decreases with the smaller
cluster size, as one node per cluster is logically reserved for maintenance and failure.
Software upgrades of AHV and AOS run for approximately 1 hour per node and firmware
(BMC, BIOS, host bus adapters) upgrades run for approximately 1.5 hours per node.
Note: Nutanix OEM partner hardware platforms might require more or less time depending on the specific
OEM partner recommendations.
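A quick planning sketch using the per-node durations above (about 1 hour for AHV and AOS software, about 1.5 hours for firmware) shows why a 16-node block fits within the 48-hour window; these are rough planning figures for NX hardware, and OEM platforms may differ.

```python
# Rough maintenance-window estimate from the per-node durations above.
SOFTWARE_HOURS_PER_NODE = 1.0   # AHV + AOS upgrades via LCM
FIRMWARE_HOURS_PER_NODE = 1.5   # BMC, BIOS, host bus adapters

def estimated_window_hours(nodes):
    """Serial, one-node-at-a-time estimate for a full LCM pass."""
    return nodes * (SOFTWARE_HOURS_PER_NODE + FIRMWARE_HOURS_PER_NODE)

for nodes in (4, 8, 16):
    hours = estimated_window_hours(nodes)
    verdict = "fits" if hours <= 48 else "exceeds"
    print(f"{nodes}-node cluster: ~{hours:g} h ({verdict} a 48-hour window)")
```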

In the following figure, the first pod (Pod 1) reached capacity and the administrator
started a new pod (Pod n). If existing management clusters have enough capacity for the
additional pods, you can reuse them and not implement additional management clusters.

© 2024 Nutanix, Inc. All rights reserved | 21
Figure 4: Hybrid Cloud On-Premises Scaling

The following table displays the maximum number of VMs per workload cluster building
block and per node.
Table: Maximum Number of VMs

Scalability Consideration | Small VMs | Medium VMs | Large VMs
Maximum running VMs per workload cluster building block | 1,860 | 1,380 | 690
Maximum running VMs per node | 124 | 92 | 46
Maximum deployed VMs per workload cluster building block to ensure disaster recovery capacity | 930 | 690 | 345

Note: The maximum deployed VMs per workload cluster is 50 percent of the running maximum to ensure
disaster recovery capacity.

Cluster Resilience
Replication factor 2 protects against the loss of a single component in case of failure or
maintenance. During a failure or maintenance scenario, Nutanix rebuilds any data that
falls out of compliance much faster than traditional RAID data protection methods, and
rebuild performance increases linearly as the cluster grows.


The Nutanix architecture rapidly recovers in the event of failure and has no single points
of failure. You can configure the cluster to maintain three copies of data; however, for
general server virtualization, Nutanix recommends that you distribute application and VM
components across multiple clusters to provide greater resilience at the application level.
You can achieve rack-aware resilience when you split clusters evenly across at least
three racks, but this NVD doesn't use that approach because it adds configuration and
operational complexity. Nutanix cluster replication factor 2 in this design is sufficient to
exceed five nines of availability (99.999 percent).
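To put the five-nines figure in concrete terms, an availability percentage translates into an annual downtime budget; the helper below is illustrative only.

```python
# Convert an availability percentage into maximum downtime per year.
def downtime_minutes_per_year(availability_pct):
    minutes_per_year = 365.25 * 24 * 60
    return minutes_per_year * (1 - availability_pct / 100)

print(f"{downtime_minutes_per_year(99.999):.2f} minutes/year")  # about 5.26
print(f"{downtime_minutes_per_year(99.9):.0f} minutes/year")    # about 526
```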

Figure 5: Availability Chart

Storage Design
Nutanix uses a distributed, shared-nothing architecture for storage. For details on
Nutanix storage constructs, see the Storage Design section in the Nutanix Hybrid
Cloud Reference Architecture. For information on node types, counts, and physical
configurations, see the Cluster Design section in this document.
Creating a cluster automatically creates the following storage containers:
• NutanixManagementShare: Used for Nutanix features like Files and Objects and other
internal storage needs; this storage container doesn't store workload vDisks.


• SelfServiceContainer: Used by the NCM Self-Service Portal and automation services


• Default-Container-XXXX: Used by VMs to store vDisks for user VMs and applications
In both AZs, the management cluster uses the Default-Container to store VMs and their
vDisks. Because this NVD provisions workloads from images with NCM Self-Service, the
SelfServiceContainer serves the workload and backup clusters. This NVD enables inline
compression and erasure coding on the Default-Container and the SelfServiceContainer
for all management, workload, and backup clusters in both datacenters. Because these
clusters have a fault tolerance level of 1, the replication factor for the containers is 2. The
system reserves rebuild capacity to ensure that the cluster can rebuild in the event of
node failures and to provide early warning of pending capacity constraints.

Data Reduction Options


To increase the effective capacity of the cluster, the design enables inline compression
and erasure coding with the default strip size on the container used for workloads, as the
intended workload is general server virtualization.
For the Default-Container-XXXX, NutanixManagementShare, and SelfServiceContainer,
enable compression and erasure coding and disable deduplication across both the
primary and disaster recovery clusters.
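The capacity effect of combining replication factor 2 with erasure coding can be approximated as follows. The actual EC strip size is the AOS default and varies with cluster size, and savings depend on how much data is cold; the 4 data / 1 parity strip and 50 percent cold fraction below are assumptions for illustration only.

```python
# Illustrative effective-capacity math for RF2 with erasure coding.
def effective_multiplier(rf=2, ec_data=4, ec_parity=1, cold_fraction=0.5):
    """Fraction of raw capacity available for data.

    Hot data keeps rf full copies; cold data is erasure-coded at
    (data + parity) / data overhead (1.25x for a 4/1 strip).
    """
    ec_overhead = (ec_data + ec_parity) / ec_data
    avg_overhead = (1 - cold_fraction) * rf + cold_fraction * ec_overhead
    return 1 / avg_overhead

raw_tib = 100  # hypothetical raw capacity
print(f"Effective data stored: {raw_tib * effective_multiplier():.1f} TiB")
```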
Table: Storage Design Decisions

Design Option | Validated Selection
Sizing a cluster | Use an all-flash cluster to provide enough usable SSD capacity to support the application's active data set.
Node type vendors | Use all Nutanix NX nodes. Don't mix node types from different vendors in the same cluster.
Node and disk types | Use identical node types that have similar disks.
Sizing for node redundancy for storage and compute | Size all clusters for n + 1 failover capacity.
Fault tolerance and replication factor settings | Configure the cluster for fault tolerance 1 and configure the container for replication factor 2.
Inline compression | Enable inline compression.
Deduplication | Disable deduplication.
Erasure coding | Enable erasure coding.
Availability domain for workload cluster | Use block awareness.
Availability domain for backup cluster | Use node awareness.
Availability domain for management cluster | Use node awareness.
Reserve rebuild capacity | Enable reserve rebuild capacity.

Network Design
A Nutanix cluster can tolerate multiple simultaneous failures because it maintains a set
redundancy factor and offers features such as block awareness and rack awareness.
However, this level of resilience requires a highly available network connecting a cluster's
nodes.
Nutanix clusters send each write to another node in the cluster. As a result, a fully
populated cluster sends storage replication traffic in a full mesh, using network bandwidth
between all Nutanix nodes. Because storage write latency directly correlates to the
network latency between Nutanix nodes, any increase in network latency adds to storage
write latency. Protecting the cluster's read and write storage capabilities requires highly
available connectivity between nodes. Even with intelligent data placement, if network
connectivity between multiple nodes is interrupted or becomes unstable, VMs on the
cluster can experience write failures and enter read-only mode.
Nutanix recommends using datacenter-grade switches designed to handle high-
bandwidth server and storage traffic at low latency. For more information, see the Nutanix
physical networking best practice guide.
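The oversubscription target mentioned in the design decisions below is the ratio of downlink to uplink bandwidth on a top-of-rack switch; the port counts in this sketch are hypothetical.

```python
# Leaf-spine oversubscription sketch: downlink vs. uplink bandwidth.
def oversubscription(downlink_ports, downlink_gbps, uplink_ports, uplink_gbps):
    """Ratio of server-facing bandwidth to spine-facing bandwidth."""
    return (downlink_ports * downlink_gbps) / (uplink_ports * uplink_gbps)

# Example: 32 x 25 GbE server ports, 4 x 100 GbE uplinks -> 2.0 (a 1:2 ratio)
print(oversubscription(32, 25, 4, 100))
```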


Figure 6: Physical Network Architecture

Table: Physical Network Design Decisions

Design Option | Validated Selection
Datacenter switch | Use large-buffer 25 Gbps switches for the datacenter.
Network topology | Use the highly available leaf-spine network topology.
Top-of-rack switches | Populate each rack with two 25 Gbps top-of-rack switches.
Link aggregation group (LAG) type | Use an MLAG configuration to avoid stacking and ensure network availability during individual device failure.
Number of switches between nodes | Use at most three switches between any two Nutanix nodes in the same cluster.
Network oversubscription | Reduce network oversubscription to achieve a 1:2 ratio.
Network design | Use a layer 2 network.
IGMP snooping | Enable IGMP snooping.

Table: Node Connectivity Network Design Decisions

Design Option | Validated Selection
Controller Virtual Machine (CVM) and hypervisor VLAN | Configure the CVM and hypervisor VLAN as native, or untagged, on server-facing switch ports.
Switch ports for guest workloads | Use tagged VLANs on the switch ports for all guest workloads.
Top-of-rack switches | Connect a 25 GbE NIC to each top-of-rack switch.
Virtual switches | Use one vs0 virtual switch with two of the fastest uplinks of the same speed.
NICs | Use NICs from the same vendor in a bond.
Logical network separation | Use VLANs to separate logical networks.
Load balancing | Use LACP (Link Aggregation Control Protocol; active-active) uplink load balancing.
MTU size | Size the MTU at 1,500 bytes.
Terminate L2/L3 networking | Have L2/L3 networking terminate on the spine.

Table: Workload Cluster Networks

Network Variable | Recommendation
Shared infrastructure network subnet size | /24
VM network subnet size | /22
Number of addresses available per /24 network | 254
Number of addresses available per /22 network | 1,022
Number of VM networks | 4 per AZ
Present VM networks to other workload clusters | No
Stretch VM networks to secondary site | No
AHV virtual switch IGMP snooping | Enabled

Table: Management Cluster Networks

Network Variable | Recommendation
Shared infrastructure network subnet size | /24
VM network subnet size | /24
Number of addresses available per /24 network | 254
Number of VM networks | 1
AHV virtual switch IGMP snooping | Enabled
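The usable-address counts in the two network tables above can be verified with the standard-library ipaddress module; the example prefixes below are placeholders, not addresses from this design.

```python
# Verify the subnet-size figures from the network tables above.
import ipaddress

def usable_hosts(cidr):
    """Usable host addresses in a subnet (network and broadcast excluded)."""
    return ipaddress.ip_network(cidr).num_addresses - 2

print(usable_hosts("192.0.2.0/24"))  # 254 (infrastructure networks)
print(usable_hosts("10.50.0.0/22"))  # 1022 (workload VM networks)
```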

Network Microsegmentation
Flow Network Security enables VM- and application-based microsegmentation for traffic
visibility and control. This NVD uses Flow Network Security to protect the environment
from network attacks, create strict traffic controls that segment the network, and gain
visibility into application network behavior for all AHV hosts managed by a single Prism
Central instance. Flow Network Security applies all categories and security policies
uniformly across all clusters and VMs in this Prism Central instance.
When two Prism Central instances exist—for example, for disaster recovery or scalability
—the categories and policies don't replicate between them, so the designer and
administrator need to create a system to either automatically or manually sync categories
and policies between sites. This NVD uses a script to sync Flow Network Security
policies and categories between Prism Central instances in different AZs. To enable
Flow Network Security synchronization between two Prism Central instances, follow the
procedure in KB 12253.
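The supported synchronization procedure is the one in KB 12253; purely as an illustration of the general shape of such a script, the sketch below reads Flow security policies from a source Prism Central with the v3 REST API and filters them by the AZ naming prefix used later in this design. The endpoint path, payload fields, and filter logic are assumptions to verify against your Prism Central version before use.

```python
# Illustrative sketch only; see KB 12253 for the supported procedure.
import base64
import json
import urllib.request

def list_security_policies(pc_host, username, password, context=None):
    """Fetch Flow security policies from one Prism Central (v3 API)."""
    url = f"https://{pc_host}:9440/api/nutanix/v3/network_security_rules/list"
    body = json.dumps({"kind": "network_security_rule", "length": 500}).encode()
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response).get("entities", [])

def policies_to_sync(policies, az_prefix="AZ01"):
    """Select policies named for one AZ, per this NVD's naming convention."""
    return [
        p for p in policies
        if p.get("spec", {}).get("name", "").startswith(az_prefix)
    ]
```

A companion step (not shown) would diff the selected policies against the second Prism Central instance and create or update any that are missing.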
We created a list of requirements and limitations to keep in mind when protecting your
on-premises hybrid cloud environment with Flow Network Security:
• Flow Network Security relies on AHV to enforce policies in the hypervisor virtual
switch, so you can't use it to protect VMs running on ESXi or Hyper-V hypervisors.


• Nutanix Flow Network Security can secure fewer VMs than the maximum number of
VMs that Prism Central can manage. Consider these scalability limits when you design
clusters and Prism Central deployments.
• Consider the maximum number of VMs protected in a single policy when you design
the individual security policies. For detailed requirements and limitations, see the Flow
Network Security guide.
• In this NVD, all applications exist inside the same environment, so you don't need
isolation policies.
• You can achieve good application security by strictly controlling the inbound side of
the policy and allowing all traffic on the outbound side, but your situation might require
stricter outbound traffic regulation. If you don't have a physical north-south firewall
available for this task, the Flow Network Security application policy can perform this
function.
• You must determine whether you need policy hit logs to track allowed and blocked
connections. This NVD enables policy hit logs for all policies.
• This NVD creates application VMs through NCM Self-Service with the appropriate
AppType and AppTier categories assigned. You can also use external automation with
our APIs. When disaster recovery replicates VMs to another site, the categories also
replicate.
Table: Flow Network Security Design Decisions

Design Option | Validated Selection
Number of Prism Central instances | Use two Prism Central instances (one per AZ) and use a script to replicate security policies between Prism Central instances.
VM scale per Prism Central instance | Limit VM scale to 7,500 per Prism Central instance.
Isolation policies | Don't use isolation policies.
Application inbound and outbound traffic | Use inbound security policies and allow all outbound traffic.
Category creation | Create a unique AppType category for each application and reuse AppTier categories.
Category automation | Create and apply categories using NCM Self-Service when deploying applications.
Address groups | Create address groups to define the corporate network for easy rule creation.
Policy naming convention | Name each security policy as follows: <AZ number><policy name><policy number> (for example, AZ01AdProtection01).

Application Connectivity Requirements


The first step in identifying application connectivity is to define the scope of a single
application. This application becomes the center of an application policy, and you can
tag all VMs inside the application with the same AppType category (such as AppType:
AZ01App1).
Next, identify the tiers in the application, such as web and database, and create an
AppTier category for each. AppTier categories don't need to be unique between
AppTypes. Create the smallest number of application types and application tiers required
to uniquely identify and group your applications.
Note: Use the AZ number in the application name for easier identification when replicating across AZs.
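The security-policy naming convention from the design decisions, <AZ number><policy name><policy number>, can be generated consistently with a small helper like this (the function name is illustrative):

```python
# Build policy names per the convention <AZ number><policy name><policy number>.
def policy_name(az_number, name, sequence):
    """For example, policy_name(1, "AdProtection", 1) -> "AZ01AdProtection01"."""
    return f"AZ{az_number:02d}{name}{sequence:02d}"

print(policy_name(1, "AdProtection", 1))  # AZ01AdProtection01
```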

For each application policy, determine the inbound traffic required to this application
and whether the traffic is from another VM in the Nutanix environment or external. Next,
determine the required traffic between tiers of the application and whether you should
allow traffic within the same tier. Finally, decide whether you should allow outbound traffic
for this application.
The following table provides an example security policy for a single application.
Modify the name and specific addresses or categories based on the application you're
protecting.
Note: Enable hit logs for all policies.

Table: Example Application Security Policy: AZ01-Example-001

Purpose | Source | Destination | Port / Protocol
Allow corp clients to web | AddrCorpClient | AppType: AZ01-Example-001; AppTier: Web | TCP 80, 443
Allow web to app | AppType: AZ01-Example-001; AppTier: Web | AppType: AZ01-Example-001; AppTier: App | TCP 8080
Allow app to database | AppType: AZ01-Example-001; AppTier: App | AppType: AZ01-Example-001; AppTier: DB | TCP 3306
Allow example app out | AppType: AZ01-Example-001 | Allow all | All
Allow NCM Self-Service to manage app | AddrCalmAZ01 | AppType: AZ01-Example-001, all tiers | TCP 22, 5985–5986

For all inbound traffic rules concerning VMs that exist in the Nutanix environment but
aren't part of an existing AppType, create new top-level categories that you can use to
add the relevant VMs to the policy as sources. For sources and destinations that don't
exist in any Nutanix cluster, create an Addresses entity to group these networks and IP
addresses for easy policy management.
This NVD creates address groups with the following addresses of corporate servers and
clients to allow differentiated access for devices that don't run as AHV VMs. Replace
these placeholders with addresses specific to your deployment.
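Before loading address groups like these, it's worth sanity-checking that the client and server ranges actually fall inside the overall corporate range; the placeholder prefixes here mirror the table that follows.

```python
# Sanity-check that narrower address groups sit inside the corporate range.
import ipaddress

CORP_ALL = ipaddress.ip_network("10.0.0.0/8")        # AddrCorpAll placeholder
GROUPS = {
    "AddrCorpClient": ipaddress.ip_network("10.50.0.0/16"),
    "AddrCorpServer": ipaddress.ip_network("10.38.0.0/16"),
}

for name, net in GROUPS.items():
    assert net.subnet_of(CORP_ALL), f"{name} is outside AddrCorpAll"
    print(f"{name}: {net} is within {CORP_ALL}")
```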
Table: Address Groups

Name | Addresses | Purpose
AddrCorpAll | 10.0.0.0/8 | Identify all corporate IP addresses.
AddrCorpClient | 10.50.0.0/16 | Identify all IP addresses that belong to corporate client devices.
AddrCorpServer | 10.38.0.0/16 | Identify all IP addresses that belong to corporate server devices.
AddrCalmAZ01 | 10.38.100.10, 10.38.100.11, 10.38.100.12 | Identify all NCM Self-Service IP addresses in AZ01.
AddrCalmAZ02 | 10.38.200.10, 10.38.200.11, 10.38.200.12 | Identify all NCM Self-Service IP addresses in AZ02.

The following application security policies protect infrastructure VMs that run on AHV.
This infrastructure is unique for each site, so you don't need to replicate the policies
between Prism Central instances. Create these infrastructure policies in each Prism
Central instance.
Table: Active Directory Application Security Policy InfraAD-001

Purpose | Source | Destination | Port / Protocol
Allow all corp to Active Directory | AddrCorpAll | AppType: ActiveDirectory | See Microsoft documentation
Allow Active Directory out | AppType: ActiveDirectory | Allow all | All

For more information on Active Directory, see Microsoft's Active Directory and Active
Directory Domain Services Port Requirements article.
Table: Syslog Application Security Policy InfraSyslog-001

Purpose | Source | Destination | Port / Protocol
Allow corp servers to syslog | AddrCorpServer | AppType: Syslog | UDP 6514, TCP 6514
Allow corp clients to syslog | AddrCorpClient | AppType: Syslog | TCP 9000, UDP 514, TCP 514
Allow syslog out | AppType: Syslog | Allow all | All

This NVD modifies the forensic quarantine policy to allow quarantine of specific VMs
while also allowing access from the security operations team. VMs owned by the security
operations team for the explicit purpose of digital forensics and incident response have
the category Security: DFIR. Update this policy in both Prism Central instances.
Table: Quarantine Security Policy

Purpose | Source | Destination | Port / Protocol
Allow security VMs to investigate | Security: DFIR | Forensic: Quarantine | All
Block all quarantine outbound | Forensic: Quarantine | None | None

Management Components
Management components such as Prism Central, Active Directory, DNS, and NTP are
critical services that must be highly available. Prism Central is the global control plane
for Nutanix, responsible for VM management, replication, application orchestration,
microsegmentation, and other monitoring and analytics functions. You can deploy Prism
Central in either a single-VM or scale-out (three-VM) configuration.
When you design your management components, decide how many Prism Central
instances you need. This NVD uses a scale-out Prism Central instance in each AZ, for a
total of two Prism Central instances. This setup provides better scalability and increased
disaster recovery functionality when you use additional Nutanix portfolio products, such
as Flow Network Security, NCM Self-Service, and Objects.

Management Conceptual Design


Nutanix recommends having a dedicated management cluster in the datacenter AZ for
both Nutanix and non-Nutanix environment management and control plane instances.
For this NVD, the management clusters contain at least four nodes and include scale-
out Prism Central instances in both AZs. The management clusters only run core
infrastructure management components, not general user VM workloads.


Figure 7: Hybrid Cloud Management Plane

Management Detailed Design


In this NVD, all clusters run on AOS 6.5.
Table: Management Component Design Decisions

Design Option | Validated Selection
Management cluster architecture | Have one management cluster in each AZ.
Management cluster size | Create a four-node (n + 1) cluster.
Management cluster node specifications | See the Platform Selection section.
Prism Central deployment | Deploy scale-out Prism Central for enhanced cluster management.
Prism Central deployment location | Deploy a scale-out Prism Central instance in each management cluster in each AZ using runbook disaster recovery automation.
Prism Central deployment size | Use a large Prism Central deployment size: 3 VMs, each with 10 vCPU, 44 GB of memory, and 2,500 GiB of storage.
Prism Central container name | Use the default container name.
Active Directory authentication | Use Active Directory authentication.
Connection to Active Directory | Use SSL or TLS for Active Directory.

Monitoring
Monitoring in the NVD falls into two categories: event monitoring and performance
monitoring. Each category addresses different needs and issues.
In a highly available environment, you must monitor events to maintain high service
levels. When faults occur, the system must raise alerts promptly so that administrators
can take remediation actions as soon as possible. This NVD configures the Nutanix
platform's built-in capability to generate alerts in case of failure.
In addition to keeping the platform healthy, maintaining resource usage is also essential
to delivering a high-performing environment. Performance monitoring continuously
captures and stores metrics that are essential when you need to troubleshoot application
performance. A comprehensive monitoring approach should track metrics for the
following areas:
• Application and database
• Operating system
• Hyperconverged platform
• Network environment
• Physical environment


By tracking a variety of metrics in these areas, the Nutanix platform can also provide
capacity monitoring across the stack. Most enterprise environments inevitably grow,
so you need to understand resource utilization and the rate of expansion to anticipate
changing capacity demands and avoid any business impact caused by a lack of
resources.

Monitoring Conceptual Design


In this NVD, Prism Central performs most of the event monitoring. We use SMTP-based
email alerts as the channel for notifications in this design.
Note: This NVD uses syslog for log collection; for more information, see the Security and Compliance
section.

To cover situations where Prism Central might be unavailable, each Nutanix cluster in
this NVD sends out notifications using SMTP as well. The individual Nutanix clusters
send alerts to a different receiving mailbox that's only monitored when Prism Central isn't
available.

Figure 8: Hybrid Cloud On-Premises Monitoring Conceptual Design


By default, Prism Central captures cluster performance in key areas such as CPU,
memory, network, and storage utilization. When a Prism Central instance manages a
cluster, Prism Central transmits all Pulse data, so it doesn't originate from individual
clusters. When you enable Pulse, it detects known issues affecting cluster stability and
automatically opens support cases.
The network switches that connect the cluster also play an important role in cluster
performance. A separate monitoring tool that's compatible with the deployed switches
can capture switch performance metrics. For example, an SNMP-based tool can
regularly poll counters from the switches.

Figure 9: Hybrid Cloud Performance Metrics Systems


The following table provides descriptions of the monitoring design decisions.


Table: Monitoring Design Decisions

Design Option | Validated Selection
Platform performance monitoring | Use Prism Central to monitor Nutanix platform performance.
Network switch performance monitoring | Use a separate tool that performs SNMP polling to the switches and monitors network switch performance.
Management cluster storage utilization warning threshold | On a management cluster with AOS 6.5.x, leave the Prism Element storage utilization warning threshold at 75 percent (the default value).
Workload cluster storage utilization warning threshold | On a workload cluster with AOS 6.5.x, leave the Prism Element storage utilization warning threshold at 75 percent (the default value).
Prism Element health check CPU utilization warning threshold | For the Prism Element health check, leave the host CPU utilization warning threshold at 75 percent (the default value).
SMTP alerting | Use SMTP alerting; use enterprise SMTP service as the primary SMTP gateway for Prism Element and Prism Central.
SMTP alerting source email address | Configure the source email address to be clustername@<yourdomain>.com to uniquely identify the source of emails. For Prism Central, use the Prism Central host name in place of clustername.
SMTP alerting Prism Central recipient email address | Configure the Prism Central recipient email address to be primaryalerts@<yourdomain>.com.
SMTP alerting Prism Element recipient email address | Configure the Prism Element recipient email address to be secondaryalerts@<yourdomain>.com.
NCC reports | Configure daily NCC reports to run at 6:00 AM local time and send them by email to the primary alerting mailbox.
Nutanix Pulse | Configure Nutanix Pulse to send telemetry data to Nutanix.
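As a sketch of how these SMTP decisions shape an alert message (cluster name in the source address, Prism Central alerts to the primary mailbox, Prism Element alerts to the secondary one), the helper below builds such an email with the standard library; the host names and domain are placeholders.

```python
# Build an alert email shaped by the SMTP design decisions above.
from email.message import EmailMessage

PRIMARY = "primaryalerts"      # Prism Central alert mailbox
SECONDARY = "secondaryalerts"  # Prism Element alert mailbox

def build_alert(source_name, domain, subject, body, primary=True):
    """source_name is the cluster name (or Prism Central host name)."""
    message = EmailMessage()
    message["From"] = f"{source_name}@{domain}"
    message["To"] = f"{PRIMARY if primary else SECONDARY}@{domain}"
    message["Subject"] = subject
    message.set_content(body)
    return message

msg = build_alert("az01-mgmt-01", "yourdomain.com",
                  "NCC daily report", "Report body here")
print(msg["From"], "->", msg["To"])
# To deliver: import smtplib; smtplib.SMTP("<smtp-gateway>").send_message(msg)
```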

Security and Compliance


Nutanix recommends a defense-in-depth strategy for layering security throughout any
enterprise datacenter solution. This design section focuses on validating the layers that
Nutanix can directly oversee at the control and data plane levels. For more information
on network-based security, see the Network Design section, and for additional details,
see the Security and Compliance Layer section of the Nutanix Hybrid Cloud Reference
Architecture.

Security Domains
Nutanix recommends using firewalls to isolate the management and backup clusters
from the rest of the network, as backup clusters are often prime targets for
compromise. When designing your network security architecture, remember that the
backup and workload clusters exchange significant traffic. In addition, Nutanix
IPMI management interfaces must be directly accessible only from the management
domain.

Authentication and Authorization


All Nutanix control plane endpoints use Active Directory–hosted LDAPS. Active Directory
itself is redundant across the management clusters in both AZs. Only administrative
accounts map to admin roles, which are controlled through a named Active Directory
group.
This NVD rotates all default passwords for all accounts that aren't integrated
with Active Directory, such as emergency accounts or local accounts for
out-of-band interfaces. Because the clusters don't have lockdown mode enabled,
password-based SSH access remains enabled by default.
For more information on self-service and hosted VM access, see the Self-Service with
Automation section.

AOS Hardening
This NVD enables additional nondefault hardening options in each AOS cluster:


• Advanced Intrusion Detection Environment (AIDE)


• Hourly security configuration management automation (SCMA)
Both features are trivial to enable, introduce little to no discernible system overhead,
and help detect and prevent internal system configuration changes that might otherwise
compromise service availability. These features add to the intrinsic hardening built into
AOS.

Syslog
For each control plane endpoint, system-level internal logging goes to a centralized third-
party syslog server that runs in the local management cluster in each AZ. The system
sends logs for all available modules when they reach the syslog Error severity level.
TCP transport using TLS is preferred where available. Syslog coverage extends to
microsegmentation event logging from Prism Central with Flow Network Security.
Note: This NVD assumes that the centralized syslog servers in each AZ can replicate log messages
between sites, allowing inspection if the primary log system is unavailable.
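The forwarding behavior this section describes, sending only Error-level (and above) messages to the central collector, can be sketched with the standard-library logging module. The host name and UDP port mirror the configuration reference table later in this section; treat the sketch as illustrative rather than a drop-in agent configuration.

```python
# Forward only Error-level (and above) records to the central syslog host.
import logging
import logging.handlers

def make_syslog_logger(host="infra-az-syslog", port=6514, name="infra"):
    """Logger that sends Error-and-above records over UDP syslog."""
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setLevel(logging.ERROR)
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.ERROR)  # drop anything below Error severity
    return logger

# Usage (host name from the configuration reference table):
#   logger = make_syslog_logger()
#   logger.error("forwarded to infra-az-syslog:6514/udp")
```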

Certificates
SSL endpoints serve all Nutanix control plane web pages. This NVD replaces the default
self-signed certificates with certificates signed by an internal certificate authority from a
Microsoft public key infrastructure (PKI). Any client endpoints that interact with the control
plane should have the trusted certificate authority chain preloaded, preventing browser
security errors.
Note: Certificate management is an ongoing activity, and certificates need to be rotated periodically. The
NVD signs all certificates for one year of validity.
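Because certificates in this design expire yearly, an expiry check is a natural piece of rotation tooling; the helper below parses an OpenSSL-style notAfter timestamp with the standard-library ssl module (the function name and 30-day threshold idea are illustrative).

```python
# Days remaining before a certificate's notAfter timestamp.
import ssl
import time

def days_until_expiry(not_after, now=None):
    """not_after is OpenSSL-style, e.g. 'Jan  5 09:34:43 2018 GMT'."""
    expiry = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return (expiry - now) / 86400

# Deterministic example: one day before the timestamp shown.
print(days_until_expiry("Jan  5 09:34:43 2018 GMT", now=1515144883.0 - 86400))
```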

Data-at-Rest Encryption
Nutanix AOS can perform data-at-rest encryption (DaRE) at the cluster level;
however, because this NVD doesn't have a stated requirement that warrants
enabling it, this design doesn't use it. If requirements change, you can enable
DaRE nondisruptively after cluster creation and data population. Once you enable
DaRE, the system encrypts existing data in place and writes all new data in an
encrypted format.
To enable DaRE, you must also deploy an encryption key management solution.


Even if you decide not to use DaRE, you can still use in-guest encryption
techniques such as system-level encryption, database encryption (for example,
Microsoft SQL Server Transparent Data Encryption (TDE)), or encrypted file
storage. However, because this design enables compression and in-guest encrypted
data generally isn't compressible, using in-guest encryption might reduce the
amount of effective storage capacity.
Table: Security Design Decisions

Design Option | Validated Selection
DaRE | Disable DaRE; don't deploy a key management server.
SSL endpoints | Sign control plane SSL endpoints with an internal certificate authority (Microsoft PKI).
Certificates | Provision certificates with a yearly expiration date and rotate accordingly.
Authentication | Use Active Directory LDAPS authentication (port 636).
Control plane endpoint administration | Use a common administrative Active Directory group for all control plane endpoints.
Cluster lockdown mode | Don't enable cluster lockdown mode (allow password-driven SSH).
Nondefault hardening options | Enable AIDE and hourly SCMA.
System-level internal logging | Enable error-level logging to external syslog server for all available modules.
Syslog delivery | Use UDP transport for syslog delivery.

Table: Security Configuration References

Configuration Target | Key:Value
Active Directory | AD-admin-group:ntnx-ctrl-admins
Syslog Server | infra-az-syslog:6514 (udp)


Datacenter Infrastructure
This design assumes that datacenters in the hosting region can sustain two AZs without
intraregional fate-sharing—in other words, that failures in one datacenter's physical
plant or supporting utilities don't affect the other datacenter. To ensure high availability
at the physical level, this NVD addresses the connection between the hardware running
Nutanix software and the datacenter's power and networking components.

Rack Design
This NVD recommends dedicating one rack to the management and backup clusters,
and another rack to the workload clusters. Give each rack a pair of 10 or 25 Gbps
datacenter switches and a 1 Gbps out-of-band management switch. You can add
more racks as needed, depending on top-of-rack network switch density as well as the
datacenter's power, weight, and cooling density capabilities per square foot.
Rack 1 has the following requirements, assumptions, and constraints:
• Two top-of-rack switches
• One management switch
• Minimum power: 4,660 VA
• Minimum thermal: 15,888 BTU per hour
• Minimum weight: 365 lb
• Nutanix Mine cluster (four 2RU NX-8155-G8)
• Management cluster (four 1RU NX-1175-G8)
Rack 2 has the following requirements, assumptions, and constraints:
• Two top-of-rack switches
• One management switch
• Minimum power: 12,672 VA
• Minimum thermal: 43,216 BTU per hour
• Minimum weight: 654 lb


• Workload cluster (16 × 1RU NX-3170-G8)
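You can sanity-check the per-rack minimums programmatically when planning expansion. The per-node figures in this sketch are illustrative assumptions for a 16-node workload rack, not vendor specifications; substitute measured values from your hardware's technical specifications.

```python
# Sketch: validate that a rack's power, thermal, and weight budget covers its
# planned nodes. Per-node figures are illustrative assumptions, not NX specs.
NODE_PROFILES = {
    "NX-3170-G8": {"va": 792, "btu_hr": 2701, "lb": 38},  # assumed values
}

def rack_totals(nodes):
    """Sum power (VA), thermal (BTU/hr), and weight (lb) for a node list."""
    totals = {"va": 0, "btu_hr": 0, "lb": 0}
    for model, count in nodes.items():
        profile = NODE_PROFILES[model]
        for key in totals:
            totals[key] += profile[key] * count
    return totals

def fits(totals, budget):
    """True when every rack budget dimension covers the planned load."""
    return all(totals[k] <= budget[k] for k in totals)

# Rack 2: 16 workload nodes checked against the stated rack minimums.
rack2 = rack_totals({"NX-3170-G8": 16})
print(rack2)
print(fits(rack2, {"va": 12672, "btu_hr": 43216, "lb": 654}))
```

Extend NODE_PROFILES with one entry per model in the bill of materials to check mixed racks such as Rack 1.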

Figure 10: Hybrid Cloud Rack Layout

When you scale the environment, consider physical rack space, network port availability,
and the datacenter's power and cooling capacity. In most environments the workload
clusters are the most likely to grow, followed by the backup clusters.

In this design's physical rack space, one generic 42RU rack has 3RU reserved at the
top for two data switches and one out-of-band switch. The Nutanix nodes should be
populated in the racks starting at the bottom and working up.
For network ports, each of the nodes requires two ports on the datacenter switches and
one port on the out-of-band management switches. Each node must use the top-of-rack
switches in the same rack.
For power, cooling, and weight, you need the minimums specified; account for future
expansion. Datacenter selection is beyond the scope of this design; have a conversation
about fully loaded racks with datacenter management before the initial deployment.
Planning to properly support the environment's long-term growth might change where in
the facility you want to set up the equipment.
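The bottom-up population rule can be expressed as a small allocation sketch. It assumes the generic 42RU rack described above, with the top 3RU reserved for the two data switches and the out-of-band switch; node heights come from the bill of materials.

```python
# Sketch: assign rack units bottom-up in a 42RU rack with the top 3RU
# reserved for two data switches and one out-of-band switch.
RACK_UNITS = 42
RESERVED_TOP = 3  # 2 data switches + 1 out-of-band switch

def place_nodes(node_heights_ru):
    """Return (node_index, bottom_ru, top_ru) tuples, filling from U1 up."""
    layout, next_u = [], 1
    for i, height in enumerate(node_heights_ru):
        top = next_u + height - 1
        if top > RACK_UNITS - RESERVED_TOP:
            raise ValueError(f"node {i} does not fit below the switch space")
        layout.append((i, next_u, top))
        next_u = top + 1
    return layout

# Rack 1: four 2RU NX-8155-G8 (Mine) plus four 1RU NX-1175-G8 (management).
layout = place_nodes([2, 2, 2, 2, 1, 1, 1, 1])
print(layout[-1])  # last management node occupies U12
```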


3. Backup and Disaster Recovery


This NVD uses Nutanix Disaster Recovery and Nutanix Mine Integrated Backup to
provide a business continuity and disaster recovery (BCDR) solution to protect against
different types of events. This section defines the high-level disaster recovery, backup,
and backup storage designs.
For applications with native BCDR capabilities (for example, Microsoft SQL Always On
availability groups), use the native disaster recovery resilience. For applications that lack
this capability, use infrastructure BCDR. This design provides the following recovery point
objective (RPO) levels for data protection:
• Gold Tier RPO: 0 minutes
• Bronze Tier RPO: 2 hours
• Recovery from backup RPO: 24 hours
The solution provides the following recovery time objective (RTO) levels:
• Gold Tier RTO: 2 hours
• Bronze Tier RTO: 4 hours
To protect workloads against security threats like ransomware attacks, this NVD also
backs up data to an external backup system.
NVD BCDR requirements:
• Use crash-consistent snapshots.
• Use Prism Central categories for backup automation.
• Place workloads from different protection tiers into separate protection policies.
• Configure Nutanix snapshot schedules to retain the lowest number of snapshots while
meeting the retention policy.
• Provide an RPO of 0 minutes for a subset of all applications.
• Provide an RPO of no more than 2 hours for a subset of all applications.

• Provide an RTO of 2 hours for a subset of all applications.


• Protect approximately 25 percent of VMs (2,000) using Nutanix Disaster Recovery.
• Support full failover (including networking).
• Support automatic IP address reassignment on workloads after failover.
• Provide maximum automation and orchestration for failover and failback.
• Provide VM-centric disaster recovery capabilities.
• Support disaster recovery testing without affecting production workloads.
• Simplify disaster recovery, reducing human interaction to a minimum.
• Support the following disaster recovery events:
› Datacenter outage
› Single cluster outage
› Ransomware attack
› Top-of-rack switch outage
› Single VLAN outage
› Human error
› Software bug
› Performance degradation caused by infrastructure (Nutanix cluster or network) or
hardware components
• Provide proactive migration in cases of impending natural disasters.
• Choose a backup vendor to use with Nutanix Mine.
• Use Nutanix Objects as an archival tier for backups.

• Choose a backup solution that supports:


› Native Nutanix API integration
› Nutanix Files backup and restore using API
› Nutanix Files file-level backup and restore
› S3-compatible storage as a backup target
› Replication to a secondary location
› Archiving to S3-compatible storage
• Choose a target backup storage system that supports:
› Ransomware protection
› Write once, read many (WORM)
› File immutability

Note: You must confirm every assumption in the following list.

NVD BCDR assumptions:


• Disaster avoidance causes minimal application and VM downtime.
• You provide redundant WAN connectivity between availability zones (AZs).
• You provide WAN connectivity with sufficient bandwidth and latency (round-trip time
(RTT) below 5 ms) to meet RPO requirements.
• Supporting infrastructure elements like DNS, Active Directory, and IPAM are available
in both AZs.
• The solution doesn't provide partial failover capabilities.
Table: NVD BCDR Risks

Risk Description | Impact | Likelihood | Mitigation
Full outage of active AZ | Large | Unlikely | Fail over to remote AZ.
Full outage of management cluster | Medium | Unlikely | Fail management cluster over to remote AZ.
WAN link outage | Large | Unlikely | Provide redundant WAN connection.
Ransomware attack | Large | Likely | Use a backup solution with immutability. Replicate backup data between AZs.
Top-of-rack switch outage or misconfiguration | Large | Unlikely | Use two top-of-rack switches for redundancy.
Single Nutanix cluster outage | Medium | Unlikely | Replicate data and fail over to remote AZ.
Single VLAN outage or misconfiguration | Medium | Unlikely | Replicate data and fail over to remote AZ.
Human error | Large | Likely | Introduce automation. Replicate data and fail over to remote AZ.
Performance degradation caused by infrastructure or hardware components (Nutanix clusters, network) | Large | Unlikely | Replicate data and fail over to remote AZ.
Latency spikes above 5 ms in WAN cause application performance degradation | Large | Unlikely | Implement WAN monitoring to check latency on the link. Create SEV1 ticket for WAN latency spike events.

Table: NVD BCDR Constraints

Constraint Description | Comment
Use Nutanix Mine for backup. | Currently, only HYCU, Veeam, and Commvault support Nutanix Mine for backup.
Use Nutanix Disaster Recovery for disaster recovery orchestration. | Nutanix Disaster Recovery is the solution of choice to provide disaster recovery orchestration for approximately 25 percent of VMs (2,000 VMs).
Use HYCU as the backup solution. | Currently, only HYCU supports Prism Central categories to automate VM assignment to backup policies.

Table: NVD BCDR Design Decisions

Design Option | Validated Selection
Boundaries for recovery plans | Recovery plans don't span multiple protection policies.
Categories or VM names in recovery plans | Use categories in recovery plans so that you can cover more VMs in each recovery plan (the maximum number of VMs in a recovery plan is 1,000 when you use categories versus 275 when you use VM names). For product and design maximums, see Design Limits in the appendix.
Separate categories for different products | Use separate categories for disaster recovery and backup.
Disaster recovery orchestration product | Use Nutanix Disaster Recovery for disaster recovery orchestration, automation, and testing.
Disaster recovery failover and testing | Use Nutanix Disaster Recovery to orchestrate disaster recovery.
Supported RPOs | Use Nutanix Disaster Recovery with synchronous and asynchronous replication to support RPOs of 0 minutes and 2 hours.
Disaster recovery management and VM placement | Use Nutanix categories to simplify VM disaster recovery and backup manageability.
Maximum entities per protection policy | Asynchronous and synchronous protection policies support a maximum of 1,000 VMs.
Maximum entities per recovery plan | Asynchronous and synchronous recovery plans support a maximum of 1,000 VMs.
Ratio of protection plans to recovery plans | Keep a ratio of one protection policy to one recovery plan (1:1).
Nutanix local and remote snapshot retention policies | Keep a maximum of 12 hours of snapshot history on Nutanix for both local and remote sites.
Failover networks | To simplify network management, use dedicated failover networks to accommodate VMs after failover.
VM protection configuration | Protect 2,000 VMs in two protection tiers: 1,500 VMs at Bronze and 500 VMs at Gold.
Backup product | Use Nutanix Mine with HYCU for applications.
Maximum number of VMs assigned to a single HYCU backup controller | Assign up to 1,500 VMs (for backup and restore) per HYCU VM, based on HYCU recommendations; this design uses 1,000 VMs.
Back up workloads within AZs or across AZs | To optimize the backup window and save WAN bandwidth, Mine clusters back up workloads in the local AZ.
Backup policy RPO | Set a 24-hour RPO on backup policies.
Backup policies per HYCU backup controller | Configure a single HYCU policy with up to 1,000 VMs per backup controller.
Storage solution for backup repository | Use Nutanix Objects as the backup target.
Number of S3 buckets for backup repository | Use one object store with one bucket as the backup repository.
S3 storage advanced features | Enable WORM and set it for 365 days.
Replication method for backups between AZs | Use HYCU to manage backup replication.

Backup and Disaster Recovery Conceptual Design


Nutanix Prism Central is the management and control plane for disaster recovery
capabilities. Both disaster recovery and backup use categories to sort VMs into logical
groups to automate their association with a protection policy, a recovery plan, and a
backup policy.

Figure 11: Hybrid Cloud BCDR Conceptual Design

Disaster Recovery
The following sections provide a logical and detailed overview of this NVD's disaster
recovery solution.

Disaster Recovery Logical Design


This NVD provides comprehensive disaster recovery protection for applications across
both AZs in a single region. Applications can use underlying infrastructure to provide
disaster recovery resilience based on three protection levels with bidirectional replication
between AZs. The design provides granular disaster recovery to the single VM, IP
address, or IP subnet level.
Disaster recovery testing, failover, and failback are fully orchestrated and require only
minimal human involvement.


Figure 12: Hybrid Cloud On-Premises BCDR Logical Diagram

Disaster Recovery Detailed Design


This NVD uses categories in Prism Central to automate VM placement in the target
protection policy. To simplify failover and failback, the design assigns VMs to a local
category (for example, it assigns VMs that run on AZ02 to a category with the prefix
AZ02).
The following table provides guidance on how to design Nutanix Disaster Recovery
categories for 2,000 VMs (approximately 25 percent of the 7,440 VMs per pod).
Table: Nutanix Disaster Recovery Categories

Tier | Category Name | Value | Max VMs
Asynchronous | AZ01-DR-Bronze-01 | RPO2h | 375
Asynchronous | AZ01-DR-Bronze-02 | RPO2h | 375
Asynchronous | AZ02-DR-Bronze-01 | RPO2h | 375
Asynchronous | AZ02-DR-Bronze-02 | RPO2h | 375
Synchronous | AZ01-DR-Gold-01 | RPOZero | 125
Synchronous | AZ01-DR-Gold-02 | RPOZero | 125
Synchronous | AZ02-DR-Gold-01 | RPOZero | 125
Synchronous | AZ02-DR-Gold-02 | RPOZero | 125
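The category layout above follows a regular naming pattern, so it can be generated and checked programmatically. This sketch reproduces the table's tier parameters and confirms that the per-category maximums add up to the 2,000 protected VMs.

```python
# Sketch: generate the disaster recovery category layout described above and
# confirm the tier totals add up to the 2,000 protected VMs.
TIERS = {
    # tier: (category value, categories per AZ, max VMs per category)
    "Bronze": ("RPO2h", 2, 375),
    "Gold": ("RPOZero", 2, 125),
}

def dr_categories(azs=("AZ01", "AZ02")):
    cats = []
    for az in azs:
        for tier, (value, count, max_vms) in TIERS.items():
            for n in range(1, count + 1):
                cats.append((f"{az}-DR-{tier}-{n:02d}", value, max_vms))
    return cats

cats = dr_categories()
print(len(cats))                               # 8 categories
print(sum(max_vms for _, _, max_vms in cats))  # 2000 VMs
```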

The following two tables provide details on protection policy configuration for 2,000 VMs.
Each protection policy has VMs located in a single AZ.
Table: Protection Policy Configuration for AZ01

Policy Name | Category | No. of VMs | Source Cluster | Target Cluster | RPO
AZ01-AZ02-Bronze-01 | AZ01-DR-Bronze-01 | 375 | AZ01-CLS-01 | AZ02-CLS-01 | 2 h
AZ01-AZ02-Bronze-02 | AZ01-DR-Bronze-02 | 375 | AZ01-CLS-02 | AZ02-CLS-02 | 2 h
AZ01-AZ02-Gold-01 | AZ01-DR-Gold-01 | 125 | AZ01-CLS-01 | AZ02-CLS-01 | 0 min.
AZ01-AZ02-Gold-02 | AZ01-DR-Gold-02 | 125 | AZ01-CLS-02 | AZ02-CLS-02 | 0 min.

Table: Protection Policy Configuration for AZ02

Policy Name | Category | No. of VMs | Source Cluster | Target Cluster | RPO
AZ02-AZ01-Bronze-01 | AZ02-DR-Bronze-01 | 375 | AZ02-CLS-01 | AZ01-CLS-01 | 2 h
AZ02-AZ01-Bronze-02 | AZ02-DR-Bronze-02 | 375 | AZ02-CLS-02 | AZ01-CLS-02 | 2 h
AZ02-AZ01-Gold-01 | AZ02-DR-Gold-01 | 125 | AZ02-CLS-01 | AZ01-CLS-01 | 0 min.
AZ02-AZ01-Gold-02 | AZ02-DR-Gold-02 | 125 | AZ02-CLS-02 | AZ01-CLS-02 | 0 min.

The following two tables provide detailed information about recovery plans. To simplify
failover and failback, the design assigns VMs to a recovery plan from the same AZ. For
example, VMs located in AZ01 are assigned to the recovery plan for AZ01. You can
implement all recovery plans for a specific AZ in parallel because each AZ covers a
maximum of 1,000 VMs and the product maximum for VMs recovered in parallel is 1,000.
For other product maximums, see the appendix.
Table: Details of Recovery Plans for AZ01 VMs

Name | Stage | VM Category | Delay | Source Network | Failover Network | Test Failover Network | No. of VMs
AZ01-RP-Bronze-01 | Stage1 | AZ01-DR-Bronze-01 | 0 | Source-PG | Failover-PG | Test-PG | 375
AZ01-RP-Bronze-02 | Stage1 | AZ01-DR-Bronze-02 | 0 | Source-PG | Failover-PG | Test-PG | 375
AZ01-RP-Gold-01 | Stage1 | AZ01-DR-Gold-01 | 0 | Source-PG | Failover-PG | Test-PG | 125
AZ01-RP-Gold-02 | Stage1 | AZ01-DR-Gold-02 | 0 | Source-PG | Failover-PG | Test-PG | 125

Table: Details of Recovery Plans for AZ02 VMs

Name | Stage | VM Category | Delay | Source Network | Failover Network | Test Failover Network | No. of VMs
AZ02-RP-Bronze-01 | Stage1 | AZ02-DR-Bronze-01 | 0 | Source-PG | Failover-PG | Test-PG | 375
AZ02-RP-Bronze-02 | Stage1 | AZ02-DR-Bronze-02 | 0 | Source-PG | Failover-PG | Test-PG | 375
AZ02-RP-Gold-01 | Stage1 | AZ02-DR-Gold-01 | 0 | Source-PG | Failover-PG | Test-PG | 125
AZ02-RP-Gold-02 | Stage1 | AZ02-DR-Gold-02 | 0 | Source-PG | Failover-PG | Test-PG | 125

The following two tables provide details about mapping between protection policies,
recovery plans, and categories for 2,000 VMs.
Table: Protection Policy to Recovery Plan Mapping for AZ01

Policy Name | Recovery Plan Name | Category Name | RPO | RTO | No. of VMs
AZ01-AZ02-Bronze-01 | AZ01-RP-Bronze-01 | AZ01-DR-Bronze-01 | 2 h | 4 h | 375
AZ01-AZ02-Bronze-02 | AZ01-RP-Bronze-02 | AZ01-DR-Bronze-02 | 2 h | 4 h | 375
AZ01-AZ02-Gold-01 | AZ01-RP-Gold-01 | AZ01-DR-Gold-01 | 0 min. | 2 h | 125
AZ01-AZ02-Gold-02 | AZ01-RP-Gold-02 | AZ01-DR-Gold-02 | 0 min. | 2 h | 125

Table: Protection Policy to Recovery Plan Mapping for AZ02

Policy Name | Recovery Plan Name | Category Name | RPO | RTO | No. of VMs
AZ02-AZ01-Bronze-01 | AZ02-RP-Bronze-01 | AZ02-DR-Bronze-01 | 2 h | 4 h | 375
AZ02-AZ01-Bronze-02 | AZ02-RP-Bronze-02 | AZ02-DR-Bronze-02 | 2 h | 4 h | 375
AZ02-AZ01-Gold-01 | AZ02-RP-Gold-01 | AZ02-DR-Gold-01 | 0 min. | 2 h | 125
AZ02-AZ01-Gold-02 | AZ02-RP-Gold-02 | AZ02-DR-Gold-02 | 0 min. | 2 h | 125

Note: An RPO of 2 hours has a snapshot and replication schedule of 1 hour, assuming that the most recent
in-flight snapshot replication might fail part way through.
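The schedule math in the note generalizes: to tolerate one failed or in-flight replication, schedule snapshots at half the target RPO. A minimal sketch:

```python
# Sketch: derive a snapshot/replication schedule from a target RPO, assuming
# the most recent in-flight replication may fail part way through and the
# previous replicated snapshot must still satisfy the RPO.
def schedule_minutes(rpo_minutes: int) -> int:
    if rpo_minutes <= 0:
        return 0  # RPO 0 uses synchronous replication; no snapshot schedule
    return rpo_minutes // 2

print(schedule_minutes(120))  # 2 h RPO -> snapshot every 60 minutes
print(schedule_minutes(0))    # Gold tier relies on synchronous replication
```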

Backup
The following section provides a logical and detailed overview of this NVD's backup
solution. It also covers the logical design and use of Nutanix Mine Integrated Backup as
an external backup system and Nutanix Objects as the backup target storage.

Backup Logical Design


This NVD provides a backup option for workloads running in both AZs. To protect backup
data against cluster failure and datacenter failure, data replicates bidirectionally between
two backup instances across both AZs in one region. This design optimizes the backup
solution to back up workloads that run locally to the backup cluster. Using categories
helps organize VMs and ensures that the Nutanix Mine instance that's local to the AZ can
back them up.


Figure 13: Hybrid Cloud On-Premises Backup Architecture Logical Design

Nutanix Mine Integrated Backup Logical Design


To provide additional protection against data loss, this NVD has an external backup
system: Nutanix Mine Integrated Backup. Each datacenter contains an instance of the
backup system local to the workloads you want to back up and restore.
To provide maximum performance and the desired RPO and RTO across the
environment, each Nutanix Mine setup has multiple backup proxies. To simplify backup
policy management, Nutanix Mine has a 1:1 mapping between the backup policy and the
backup proxy. This approach helps scale the solution linearly as it grows.
For maximum performance, all backup components use the same network subnet:

• CVM
• AHV
• Object networks (storage and client)
• HYCU backup VMs
Nutanix Objects provides Mine with a high-performance backup target that's compatible
with the S3 API. S3-compatible storage provides advanced security features that help
protect data against common security threats.

Backup Detailed Design


The following sections describe the physical components of BCDR with Nutanix Mine as
the external backup system.
Categories in Prism Central automate VM placement in the target backup policy. The
backup system uses categories (named using the AZ##-Backup-## convention) to
identify VM location (local AZ versus remote AZ) and RPO tier. Each AZ has four
backup categories; a single category can hold at most 1,000 VMs, and each category
has an RPO of 24 hours.
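As a sketch of this capacity rule, the following assigns a VM to the first local backup category with spare capacity. The names follow the AZ##-Backup-## convention; the 1,000-VM cap and the four categories per AZ come from the design above, while the assignment helper itself is hypothetical.

```python
# Sketch: assign VMs to the first local backup category with spare capacity.
# Four categories per AZ, at most 1,000 VMs each (per the design above).
MAX_VMS_PER_CATEGORY = 1000
CATEGORIES_PER_AZ = 4

def backup_category(az: str, counts: dict) -> str:
    """Pick a local AZ##-Backup-## category that still has capacity."""
    for n in range(1, CATEGORIES_PER_AZ + 1):
        name = f"{az}-Backup-{n:02d}"
        if counts.get(name, 0) < MAX_VMS_PER_CATEGORY:
            counts[name] = counts.get(name, 0) + 1
            return name
    raise RuntimeError(f"all backup categories in {az} are full")

counts = {"AZ01-Backup-01": 1000}  # first category already at capacity
print(backup_category("AZ01", counts))  # falls through to AZ01-Backup-02
```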
One HYCU backup server can have one backup policy and a total of 1,000 VMs.
Backup proxies transfer backup data from source clusters to target storage. Each HYCU
backup proxy (four on each AZ) has the resource configuration shown in the following
list:
• 10 vCPU
• 16 GB of memory
• 200 GB of storage
• Appliance OS
Table: HYCU Backup Proxy Virtual Hardware Configuration

Virtual Hardware | Value | Type
Virtual CPU | 10 | vCPU
Virtual memory | 16 GB | Memory
Virtual storage | 200 GB | VirtIO-SCSI
Virtual NIC | 1 | VirtIO-Net
Virtual CD-ROM | 1 | IDE

Nutanix Objects Storage


The NVD uses Nutanix Objects as the backup target storage for all backup data. For
maximum performance, deploy the Nutanix Objects instance that serves HYCU with
three worker nodes and two load balancer nodes.


Figure 14: Nutanix Objects Logical Design for Hybrid Cloud



The object stores on the Nutanix Mine cluster in each AZ that hosts backup data
(AZ01Backup01.domain.local and AZ02Backup01.domain.local) have the following
resources:
• Two load balancer VMs with 2 vCPU and 4 GB of memory each
• Three worker VMs with 10 vCPU and 32 GB of memory each
• Maximum available storage allocated
The Nutanix S3 bucket has versioning turned off and WORM set to 365 days. Name
buckets using the AZ##Backup## convention and append -copy to copies. The number
of S3 buckets depends on the number of backup VMs: each backup VM maps 1:1 to an
S3 bucket for primary backup data and to an S3 bucket for the backup data copy on the
remote site.
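The 1:1 naming convention can be captured in a small helper; the function name and signature here are illustrative, while the AZ##Backup## and -copy conventions come from the text.

```python
# Sketch: derive the paired bucket names for a backup VM, following the
# AZ##Backup## convention with -copy appended for the remote-site copy.
def bucket_names(az_number: int, backup_number: int):
    primary = f"AZ{az_number:02d}Backup{backup_number:02d}"
    return primary, f"{primary}-copy"

print(bucket_names(1, 1))  # ('AZ01Backup01', 'AZ01Backup01-copy')
```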


4. Self-Service with Automation


This design incorporates Nutanix Cloud Manager (NCM) Self-Service so that
organizations can streamline the way they provision, scale, and manage new or existing
applications across multiple environments while managing consumption and governance.
In a common enterprise scenario, you must configure every application deployment with
IP addresses from an IPAM system or DNS, join directory services for authentication, or
get a virtual IP (VIP) address from a load balancer. The blueprints in this design include
integrations with these foundational services.
Use Prism Central categories in NCM Self-Service blueprints to mitigate the risk of not
applying a category, something that's likely to happen in a manual deployment. To learn
how to sync categories across Prism Central instances and allow NCM Self-Service to
recover after failure, see KB 12251.
NCM Self-Service with automation requirements by component:
• NCM Self-Service:
› Provide self-service for Windows, Linux, LAMP, and WISA applications.
› Be secure by design. Following DevSecOps principles, protect all application
networking and data from ransomware attacks.
› Support up to 7,500 VMs.
› Let IT users deploy applications in different clusters and locations.
› Provide cloud governance.
› Present application costs.
› Notify IT users when their applications are ready.
› Provide a seamless hybrid multicloud experience.
› Standardize the virtual hardware specifications for VMs.


• Integrate with:
› IPAM for configuring VM addresses
› Directory services for authentication
› Backup for VM protection
› Datacenter load balancers for configuring application VIP addresses

Note: You must confirm every assumption in the following list.

NCM Self-Service with automation assumptions by component:


• NCM Self-Service:
› NCM Self-Service can access any third-party system that the blueprint must
integrate with.
› As part of the blueprints, NCM Self-Service has WinRM (HTTP or HTTPS) or SSH
access to the networks where VMs are deployed.
› NCM Self-Service can connect to NCM Cost Governance (formerly Beam) in the
cloud.
› VMs deployed by NCM Self-Service can communicate with email infrastructure to
send notifications.


• Integration:
› IPAM infrastructure has sufficient resilience for the system to request, register, and
release IP addresses, even during critical outages.
› Directory services infrastructure has sufficient resilience for adding and removing
VMs, even during critical outages.
› Backup services infrastructure has sufficient resilience for backing up and restoring
VMs, even during critical outages.
› Email infrastructure has sufficient resilience to send, receive, and access emails,
even during critical outages.
› Load balancer infrastructure has sufficient resilience for handling API requests,
even during critical outages.
› Blueprints are also available in a source code management system.
NCM Self-Service with automation risks by component:
• NCM Self-Service:
› During NCM Self-Service upgrades, the service is unavailable.
› During NCM Self-Service downtime, the service is unavailable.
› Single-instance NCM Self-Service is a single point of failure.
› In the event of a disaster, applications recovered in another Prism Central instance
are unavailable in NCM Self-Service until you run the Prism Central–to–Prism
Central sync script.
• Integration:
› During IPAM downtime, new NCM Self-Service deployments might fail.
› During directory service downtime, new NCM Self-Service deployments might fail.
› During load balancer downtime, new NCM Self-Service deployments that need load
balancing might fail.
NCM Self-Service with automation constraints by component:


• NCM Self-Service:
› Blueprints must use existing approved VM templates.
› VM names must adhere to existing naming conventions.
› Virtual hardware has a maximum of three sizes.
• Integration:
› The IPAM solution is Infoblox.
› The backup solution is Nutanix Mine with HYCU.
› The network security solution is Nutanix Flow Network Security.
› The BCDR solution is Nutanix Disaster Recovery.
› The directory service is Microsoft Active Directory.
› The load balancer is F5.
Table: NCM Self-Service with Automation Design Decisions

Design Option | Validated Selection
NCM Self-Service deployment model | Use a standalone single virtual appliance.
NCM Self-Service deployment size | Use a large NCM Self-Service deployment.
NCM Self-Service recoverability | Protect NCM Self-Service using a Nutanix Disaster Recovery protection policy and a recovery plan as well as a Mine category for backup and archiving.
NCM Self-Service project structure and role-based access control (RBAC) configuration | Don't use the default NCM Self-Service project; instead, use dedicated NCM Self-Service projects with RBAC based on your Nutanix Services architecture workshop.
Active Directory authentication | Use Active Directory for NCM Self-Service access and project RBAC.
NCM Self-Service policy engine | Enable the NCM Self-Service policy engine; it's required for functionalities like quotas.
NCM Self-Service showback | Enable showback for Nutanix AHV provider accounts.
Show App Protection Status | Enable Show App Protection Status to provide application tracking in the event of a disaster when recovered in a different location.
Self-service method | The NCM Self-Service marketplace is the self-service portal for IT users.
SSL or TLS connection to Active Directory | Use WinRM over HTTPS for Windows blueprints.
Security | Blueprints use Prism Central categories for security (Flow Network Security), protection and recovery (Nutanix Disaster Recovery), and backup (Nutanix Mine) policies to address security earlier in the development process (DevSecOps).
Email notification after application deployment | An in-guest script sends emails from the VMs deployed by NCM Self-Service.
Blueprint development method | Develop blueprints using NCM Self-Service DSL, which generates multi-VM blueprints even if there's a single service (IaaS).
Blueprint development project | Develop blueprints in a dedicated project and make them available to other projects through the marketplace manager.
Standard sizing model | Standardize the virtual hardware on small (1 vCPU, 8 GB of memory), medium (2 vCPU, 16 GB of memory), and large (4 vCPU, 32 GB of memory) sizing models.
Integration with IPAM | Blueprints use Escript tasks in the precreate stage to communicate with the Infoblox API for CRUD tasks.
Integration with F5 | Blueprints use Escript tasks in the create stage to communicate with the F5 API for CRUD tasks.
Integration with Active Directory | Blueprints use PowerShell or Shell script tasks in the package install stage to communicate with Active Directory.
LAMP Blueprint Protection Policy | Use the Bronze SLA for HAProxy, the Silver SLA for Apache, and the Gold SLA for MongoDB.
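The standard sizing model lends itself to a simple lookup that a blueprint runtime variable could drive. The sizes come from the design decisions above; the function and dictionary names are illustrative.

```python
# Sketch: the three standard virtual hardware sizes from the design decisions,
# exposed as a lookup keyed by the size name a requester picks.
SIZES = {
    "small": {"vcpu": 1, "memory_gb": 8},
    "medium": {"vcpu": 2, "memory_gb": 16},
    "large": {"vcpu": 4, "memory_gb": 32},
}

def vm_spec(size: str) -> dict:
    try:
        return SIZES[size]
    except KeyError:
        raise ValueError(f"size must be one of {sorted(SIZES)}") from None

print(vm_spec("medium"))  # {'vcpu': 2, 'memory_gb': 16}
```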

Self-Service with Automation Conceptual Design


NCM Self-Service can deploy workloads in any AZ as part of the marketplace request. In
this request, users can specify different aspects of their workloads, such as compute type,
location, and data protection SLA, and preview the cost of the resources they're
requesting. Use the NCM Self-Service Portal to manage the service catalog, governance,
policies, operations, showback, and integrations (such as IPAM, load balancing, Active
Directory, and email) for your datacenters.

Figure 15: NCM Self-Service with Automation Conceptual Design for Hybrid Cloud

Self-Service with Automation Logical Design


NCM Self-Service uses a modular approach for meeting enterprise multitenancy
requirements following governance policies. This design provides guidance for
implementing NCM Self-Service in a deployment with multiple AZs. Each AZ in the
design has a project that contains all the AZ's Nutanix resources. The projects don't
span multiple AZs, so each project uses compute, storage, and networking resources
from one account that is configured to use the Prism Central instance on each AZ as the
application target. Each project has blueprints presented to the end users in the Nutanix
marketplace.

Figure 16: NCM Self-Service with Automation Logical Design

Accounts
NCM Self-Service needs at least one provider account so that projects can deploy
workloads. By default, enabling NCM Self-Service in Prism Central creates the
NTNX_LOCAL_AZ account. This account automatically discovers the AHV clusters
registered in Prism Central. Because the NVD uses a standalone NCM Self-Service
instance, no clusters are registered in Prism Central in this case. This NVD adds two
Nutanix accounts that connect to the Prism Central instances that manage the clusters in
each AZ.

Projects
Projects are like tenants, delivering governance and multitenancy. Usually, projects are
aligned with environments (development or production), operating systems (Windows or
Linux), departments (human resources or finance), or applications (Exchange or SAP).
A project must have at least one account, RBAC using Prism Central roles and Active
Directory, and an environment before project users can request workloads from the
marketplace. This NVD has one project for blueprint design and four projects to validate
the security aspects of tenant workloads.

Blueprints
Blueprints are project-specific and define how to automate workload deployment. An
important part of this design is the use of Prism Central categories in a blueprint to drive
DevSecOps and help prevent ransomware. To make a blueprint available for other
projects, you must publish it in the marketplace. This NVD has four blueprints: Windows,
Linux, WISA, and LAMP. All the blueprints integrate with IPAM, Active Directory, and
email. WISA and LAMP also integrate with the load balancer.

Marketplace
When projects submit blueprints for approval, an administrator reviews, categorizes, and
assigns a version number to them. After publishing a blueprint, an administrator can
assign it to projects for consumption through the marketplace page. This NVD uses two
projects with different blueprint assignments to validate its security.

Integrations
Integrations are part of the blueprint and occur at different stages of the life cycle. In
this NVD, most integrations use NCM Self-Service Escript (a Python library), with some
instances of PowerShell and Shell scripts for integrations where only a CLI is available.
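As an illustration of the Escript pattern for the IPAM integration, the sketch below builds a request payload for Infoblox's next-available-IP call. The WAPI base URL, version, network, and hostname are placeholders, and the func:nextavailableip syntax should be verified against the Infoblox WAPI documentation for your grid.

```python
# Sketch: build an Infoblox WAPI host-record payload that requests the next
# available IP from a network. Endpoint path, WAPI version, and the
# func:nextavailableip syntax are assumptions to check against your WAPI docs.
WAPI_BASE = "https://infoblox.example.local/wapi/v2.12"  # placeholder

def host_record_payload(fqdn: str, network_cidr: str) -> dict:
    return {
        "name": fqdn,
        "ipv4addrs": [{"ipv4addr": f"func:nextavailableip:{network_cidr}"}],
    }

payload = host_record_payload("web01.domain.local", "10.10.10.0/24")
print(payload["ipv4addrs"][0]["ipv4addr"])
# An Escript precreate task would POST this payload to f"{WAPI_BASE}/record:host"
# with the grid credentials, then read the assigned address from the response.
```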


Categories
Security policies with Flow Network Security, protection and recovery policies with
Nutanix Disaster Recovery, backup policies with Nutanix Mine, and HYCU archiving to
Nutanix Objects all use Prism Central categories. Using categories in blueprints helps
prevent ransomware because every deployment is secure by design.

Self-Service with Automation Detailed Design


Nutanix Services customizes this NVD to meet your specific requirements following an
architecture workshop. In the following tables, items marked TBD represent a value that
you determine collaboratively with Nutanix Services during the workshop.

NCM Self-Service
NCM Self-Service is a standalone instance in this NVD.
Table: Self-Service with Automation Deployment Model

Setting      Value
Deployment   One instance of NCM Self-Service on AHV
Resources    10 vCPU, 52 GB of memory, 581 GB disk
Network      Management (requires IP address)
Protection   Nutanix Disaster Recovery protection policy and recovery plan

This NVD uses the following settings to protect the NCM Self-Service virtual appliance
from a disaster scenario.
Table: Nutanix Categories

Category Name    Value           Assigned           Used By
AppType          CalmAppliance   Calm_on_AHV (VM)   Nutanix Disaster Recovery
AZ01-Backup-01   RPO24h          Calm_on_AHV (VM)   Nutanix Mine (backup)

Table: Protection Policy

Setting          Value
Policy name      AZ01-AZ02-Calm
Category         AppType: CalmAppliance
Source cluster   AZ01-MGMT-01
Target cluster   AZ02-MGMT-01
RPO              15 min.

Table: Recovery Plan

Setting                 Value
Name                    AZ01-RP-Calm
Stage                   Stage1
Category                AppType: CalmAppliance
Delay                   0
Source network          Source-PG
Failover network        Failover-PG
Test failover network   Test-PG

The AZ01-AZ02-Calm protection policy maps to the AZ01-RP-Calm recovery plan
because both use the AppType: CalmAppliance category and a 15-minute RPO.
This NVD uses a security policy that allows traffic from the AppType: CalmAppliance
category to the AppType: AZ01-Example-001, AppTier: Web, AppTier: App, and AppTier:
DB categories over TCP ports 22 and 5985–5986. This security policy allows the
NCM Self-Service virtual appliance to connect to VMs.
Table: NCM Self-Service with Automation Settings
Setting Value
Default landing page Yes
Marketplace apps No
Showback Yes
Policy engine Yes (requires IP address)
Protection status Yes


Table: NCM Self-Service with Automation Accounts

Account          Provider   Cluster         Cost Settings   Sync      Quotas
Region A: AZ01   Nutanix    Clusters AZ01   TBD             15 min.   N/A
Region A: AZ02   Nutanix    Clusters AZ02   TBD             15 min.   N/A

Table: NCM Self-Service with Automation Project: Blueprints-design

Project             RBAC                                           Accounts   Allow List Subnets and Quotas         Environments
Blueprints-design   Blueprint_Designers (Active Directory group)   All        All clusters and subnets, no quotas   Windows and Linux

Table: Other NCM Self-Service with Automation Projects

Cluster         Project   RBAC   Accounts   Allow List Subnets   Quotas   Environments
Clusters AZ01   TBD       TBD    All        TBD                  TBD      TBD
Clusters AZ02   TBD       TBD    All        TBD                  TBD      TBD

The general workflow of a workload deployment is as follows:


1. Launch: Capture user inputs (variables, hardware specifications, credentials).
2. Precreate: Integrate IPAM.
3. Create substrate: Bootstrap the VM name, network configuration, credentials, and
cloud-init (Linux) or sysprep (Windows).
4. Install packages: Install Nutanix Guest Tools (NGT), OS updates and hardening,
agents, and packages.
5. Create service-specific dependencies, configurations, and integrations.
6. Send email and integration notifications.
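The stages above can be sketched as plain Python in the style of an Escript task. Every function and variable name below is a hypothetical illustration, not an actual blueprint task from this design; the IPAM reservation and notification calls are placeholders for the real integrations.

```python
# Hypothetical sketch of the deployment workflow stages as plain callables.
# The real blueprints chain these stages through NCM Self-Service tasks.

def launch(user_inputs):
    # Stage 1: capture variables, hardware specification, and credential refs.
    return {"vm_name": user_inputs["vm_name"],
            "vcpus": user_inputs.get("vcpus", 2),
            "memory_gb": user_inputs.get("memory_gb", 4)}

def precreate_ipam(spec, network):
    # Stage 2: an IPAM call would reserve the next free address here.
    spec["ip"] = f"ipam-reserved({network})"  # placeholder for the real lookup
    return spec

def create_substrate(spec):
    # Stage 3: bootstrap VM name, network, credentials, cloud-init/sysprep.
    spec["substrate"] = f"{spec['vm_name']}:{spec['vcpus']}vcpu:{spec['memory_gb']}gb"
    return spec

def install_packages(spec, packages):
    # Stage 4: NGT, OS updates and hardening, agents, and packages.
    spec["packages"] = list(packages)
    return spec

def notify(spec, recipient):
    # Stage 6: email and integration notifications.
    return f"Deployed {spec['vm_name']} ({spec['ip']}); notified {recipient}"

spec = launch({"vm_name": "web01"})
spec = precreate_ipam(spec, "vlan100")
spec = create_substrate(spec)
spec = install_packages(spec, ["ngt", "os-hardening"])
message = notify(spec, "requester@example.test")
print(message)
```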


For the blueprints used in this design, see the Nutanix Validated Designs GitHub
repository.
The blueprints in the following table all use the IPAM, Active Directory, and email
integrations. All blueprints except the two single-service blueprints also require load
balancing.
Table: NCM Self-Service with Automation Blueprints

Blueprint   Component        Software Version                        Component Dependencies
Windows     Single service   Windows Server 2019                     N/A
Linux       Single service   CentOS 8.2                              N/A
WISA        Load balancer    BIG-IP 16.1.0 Build 0.0.19 Final        Scale-out web
WISA        Scale-out web    Windows Server 2019 + IIS 10            Database
WISA        Database         Windows Server 2019 + SQL Server 2019   N/A
LAMP        Load balancer    TBD                                     Scale-out web
LAMP        Scale-out web    CentOS 8 + PHP 8                        Database
LAMP        Database         CentOS 8 + MariaDB 10.6                 N/A

Table: NCM Self-Service with Automation Marketplace


Blueprint Available To Version Category
Windows TBD 1.0 TBD
Linux TBD 1.0 TBD
WISA TBD 1.0 TBD
LAMP TBD 1.0 TBD

Directory Services
This NVD adds every workload provisioned using NCM Self-Service to Active Directory
and uses Windows Server 2019 for Active Directory integration.


Table: NCM Self-Service with Automation Active Directory Connection Details


Connection Details
Domain nutanix.nvd
Domain Controller X.X.X.X or FQDN
Username svc_selfservice
Password xxx

IPAM
Based on user input, Infoblox provides every workload provisioned using NCM
Self-Service with an IP address in the selected network and configures its DNS.
Table: NCM Self-Service with Automation IPAM Connection Details
Connection Details
Infoblox API X.X.X.X or FQDN
Username svc_selfservice
Password xxx
Networks TBD
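A minimal sketch of how an Escript-style integration might request the next available address from the Infoblox WAPI. The WAPI version path, FQDN, and network below are illustrative assumptions, not values from this design; the actual HTTP call is omitted.

```python
# Hypothetical Infoblox WAPI payload builder. "func:nextavailableip:<cidr>"
# asks Infoblox to allocate the next free IP in the network while creating
# the host record (with DNS) in a single call.

import json

WAPI_BASE = "https://infoblox.example.test/wapi/v2.10"  # assumed endpoint

def host_record_payload(fqdn: str, network_cidr: str) -> dict:
    return {
        "name": fqdn,
        "ipv4addrs": [{"ipv4addr": f"func:nextavailableip:{network_cidr}"}],
        "configure_for_dns": True,
    }

payload = host_record_payload("web01.nutanix.nvd", "10.10.10.0/24")
print(json.dumps(payload, indent=2))
# The actual call (omitted here) would be an authenticated POST to
# f"{WAPI_BASE}/record:host" using the svc_selfservice credentials.
```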

Load Balancing
WISA and LAMP workloads integrate with the F5 load balancer for the web server tier.
Table: NCM Self-Service with Automation Load Balancer Connection Details
Connection Details
F5 API X.X.X.X or FQDN
Username svc_selfservice
Password xxx

This NVD uses the AddrLoadBalancer (X.X.X.X/32) Flow Network Security policy
to let the F5 load balancer send HTTP and HTTPS traffic to the application tier. In
the AddrLoadBalancer example load-balancing security policy, the destination is all
applications in the AppType: AZ01-Example-0001 and AppTier: App categories, and the
ports and protocol are TCP 80 and 443.
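As an illustration of the load-balancer integration, the sketch below builds an F5 iControl REST request that would add a newly provisioned web VM to a pool. The management address, pool name, and member IP are hypothetical; the real values come from blueprint variables, and the authenticated POST itself is omitted.

```python
# Hypothetical helper for the F5 iControl REST pool-member endpoint.
# iControl REST identifies objects as ~<partition>~<name> in the URL path
# and pool members as "<ip>:<port>".

F5_BASE = "https://f5.example.test"  # assumed management address

def pool_member_request(pool: str, member_ip: str, port: int,
                        partition: str = "Common"):
    url = f"{F5_BASE}/mgmt/tm/ltm/pool/~{partition}~{pool}/members"
    payload = {"name": f"{member_ip}:{port}", "partition": partition}
    return url, payload

url, payload = pool_member_request("wisa_web_pool", "10.10.10.21", 80)
print(url)
print(payload)
# The actual call (omitted) would POST this payload with the svc_selfservice
# credentials, typically using basic auth or an iControl REST token.
```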


Notifications
Every workload provisioned using NCM Self-Service sends an email to the requester.
This NVD uses the following software versions for notification integration.
Table: NCM Self-Service with Automation Notifications Software Versions
OS Notification Library
Windows Email Send-MailMessage
Linux Email smtplib and email.message

Table: Self-Service with Automation Notifications Connection Details


Connection Details
SMTP X.X.X.X or FQDN
Port 465
Sender [email protected]

Username svc_selfservice
Password xxx
Recipients NCM Self-Service requester, distribution list, or
both
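The Linux notification path can be sketched with the smtplib and email.message libraries named above. The sender, recipient, and SMTP host are placeholder assumptions; the design supplies the real values as connection details and blueprint variables.

```python
# Minimal sketch of the Linux email notification using email.message.
# Sending over SSL on port 465 (per the connection details above) is shown
# in the commented block; only message construction runs here.

from email.message import EmailMessage

def build_notification(requester: str, vm_name: str, ip: str) -> EmailMessage:
    msg = EmailMessage()
    msg["Subject"] = f"NCM Self-Service: {vm_name} deployed"
    msg["From"] = "selfservice@example.test"  # placeholder sender
    msg["To"] = requester
    msg.set_content(f"Your workload {vm_name} is ready at {ip}.")
    return msg

msg = build_notification("requester@example.test", "web01", "10.10.10.21")
print(msg["Subject"])
# Sending would look like:
#   import smtplib
#   with smtplib.SMTP_SSL("smtp.example.test", 465) as s:
#       s.login("svc_selfservice", "xxx")
#       s.send_message(msg)
```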


5. Ordering
This bill of materials reflects the validated and tested hardware, software, and services
that Nutanix recommends to achieve the outcomes described in this document. Consider
the following points when you build your orders:
• All software is based on core licensing whenever possible.
• Nutanix Professional Services or an affiliated partner selected by Nutanix provides all
services.
• Nutanix based the functional testing described in this document on NX series models
with similar configurations to validate the interoperability of software and services.

Note: Because available hardware, software, and services can change without notice, contact Nutanix
Sales when ordering to ensure that you have the correct product codes.

Substitutions
• Nutanix recommends that you purchase the exact hardware configuration reflected
in the bill of materials whenever possible. If a specific hardware configuration is
unavailable, choose a similar option that meets or exceeds the recommended
specification.
• You can make hardware substitutions to suit your preferences; however, such
changes might result in a solution that doesn't follow the recommended Nutanix
configuration.
• Avoid software product code substitutions except when:
› You need different quantities to maintain software licensing compliance.
› You prefer a higher license tier or support level for the same software product code.
• Adding any software or workloads that aren't specified in this design to the
environment (including additional Nutanix products) might affect the validated
density calculations and result in a solution that doesn't follow the
recommended Nutanix configuration.
• Nutanix Professional Services substitutions to accommodate customer preferences
aren't possible.

Sizing Considerations
This NVD is based on a block-and-pod architecture. A block consists of 32 nodes, or
two 16-node workload clusters—one in each datacenter for BCDR. A pod consists of the
following components:
• Two 4-node management clusters
• Enough 32-node blocks (sets of two 16-node workload clusters) to meet the desired
capacity
• Two Nutanix Mine backup clusters
Once the number of nodes, VMs, or clusters exceeds the maximum specified for the
solution, create a new pod with a new management cluster and Prism Central instance.
For smaller environments, you can downsize the workload clusters to 4, 8, or 12 nodes
based on your capacity requirements, but don't change the hardware configuration or
sizing associated with the management clusters or the Nutanix Mine backup clusters.
Note: Contact your HYCU sales team for more information regarding licensing.

Bill of Materials
The following sections show the bills of materials for the primary and secondary
datacenter management clusters, the primary and secondary datacenter workload
clusters, and the primary and secondary datacenter backup clusters.

Primary Datacenter Management Cluster


Table: Primary Datacenter Management Cluster: Hardware

Item              Specification
Platform          NX-1175S-G9
Configuration     One node
Type              All flash
Support Level     Production
NRDK Support      No
NR Node Support   No

Table: Primary Datacenter Management Cluster: Per-Node Hardware Configuration

Component         Description                                Quantity
Processor         Intel Xeon-Gold 6326 (2.9 GHz, 16 cores)   1
Memory            64 GB (3,200 MHz DDR4 RDIMM)               8
HDD               No HDD included                            0
SSD               1.92 TB                                    2
Network adapter   25 GbE, 2-port, SFP+                       1

Use the following software for the primary datacenter management cluster:
• Nutanix Cloud Infrastructure (NCI) Pro
• NCI Advanced Replication
• NCI Flow Network Security
• Nutanix Cloud Manager (NCM) Starter
Nutanix recommends the following Nutanix Professional Services for the primary
datacenter management cluster:
• Infrastructure Deploy - On-Prem NCI Cluster
• Nutanix Unified Storage Deployment
• FastTrack for NCM Intelligent Operations


Primary Datacenter Workload Cluster


Table: Primary Datacenter Workload Cluster: Hardware
Item Specification
Platform NX-3170-G8
Configuration One node
Type All flash
Support Level Production
NRDK Support No
NR Node Support No

Table: Primary Datacenter Workload Cluster: Per-Node Hardware Configuration

Component         Description                                   Quantity
Processor         Intel Xeon-Gold 5318Y (2.1 GHz, 24 cores)     2
Memory            64 GB (3,200 MHz DDR4 RDIMM)                  24
HDD               No HDD included                               0
SSD               3.84 TB                                       6
Network adapter   25 GbE, 2-port (NVIDIA Mellanox ConnectX-5)   1

Use the following software for the primary datacenter workload cluster:
• NCI Pro
• NCI Advanced Replication
• NCI Flow Network Security
• NCM Ultimate
Nutanix recommends the following Nutanix Professional Services for the primary
datacenter workload cluster:
• Infrastructure Deploy - On-Prem NCI Cluster


• Nutanix Unified Storage Deployment

Primary Datacenter Backup Cluster


Table: Primary Datacenter Backup Cluster: Hardware
Item Specification
Platform NX-8155-G8
Configuration One node
Type Hybrid
Support Level Production
NRDK Support No
NR Node Support No

Table: Primary Datacenter Backup Cluster: Per-Node Hardware Configuration

Component         Description                                   Quantity
Processor         Intel Xeon-Gold 6326 (2.9 GHz, 16 cores)      2
Memory            32 GB (3,200 MHz DDR4 RDIMM)                  8
HDD               18 TB, 3.5 in.                                8
SSD               3.84 TB                                       2
Network adapter   25 GbE, 2-port (NVIDIA Mellanox ConnectX-5)   1

Use the following software for the primary datacenter backup cluster:
• Nutanix Objects dedicated
• Nutanix Mine software
• HYCU License Bundle for Nutanix Mine, 1Ct for 3YR
Nutanix recommends the following Nutanix Professional Services for the primary
datacenter backup cluster:
• Infrastructure Deploy - On-Prem NCI Cluster


• Nutanix Unified Storage Deployment
• Nutanix Unified Storage Deployment - Objects

Note: You only need Nutanix Unified Storage Deployment - Objects on the bill of materials so that your
Nutanix representative can quote 18 TB HDD. A quantity of 1 TiB is sufficient.

Secondary Datacenter Management Cluster


Table: Secondary Datacenter Management Cluster: Hardware
Item Specification
Platform NX-1175S-G9
Configuration One node
Type All flash
Support Level Production
NRDK Support No
NR Node Support No

Table: Secondary Datacenter Management Cluster: Per-Node Hardware Configuration

Component         Description                                Quantity
Processor         Intel Xeon-Gold 6326 (2.9 GHz, 16 cores)   1
Memory            64 GB (3,200 MHz DDR4 RDIMM)               8
HDD               No HDD included                            0
SSD               1.92 TB                                    2
Network adapter   25 GbE, 2-port, SFP+                       1

Use the following software for the secondary datacenter management cluster:
• NCI Pro
• NCI Advanced Replication
• NCI Flow Network Security
• Nutanix Cloud Manager (NCM) Starter


Nutanix recommends the following Nutanix Professional Services for the secondary
datacenter management cluster:
• Infrastructure Deploy - On-Prem NCI Cluster
• Nutanix Unified Storage Deployment

Secondary Datacenter Workload Cluster


Table: Secondary Datacenter Workload Cluster: Hardware
Item Specification
Platform NX-3170-G8
Configuration One node
Type All flash
Support Level Production
NRDK Support No
NR Node Support No

Table: Secondary Datacenter Workload Cluster: Per-Node Hardware Configuration

Component         Description                                   Quantity
Processor         Intel Xeon-Gold 5318Y (2.1 GHz, 24 cores)     2
Memory            64 GB (3,200 MHz DDR4 RDIMM)                  24
HDD               No HDD included                               0
SSD               3.84 TB                                       6
Network adapter   25 GbE, 2-port (NVIDIA Mellanox ConnectX-5)   1

Use the following software for the secondary datacenter workload cluster:
• NCI Pro
• NCI Advanced Replication
• NCI Flow Network Security


• NCM Ultimate
Nutanix recommends the following Nutanix Professional Services for the secondary
datacenter workload cluster:
• Infrastructure Deploy - On-Prem NCI Cluster
• Nutanix Unified Storage Deployment

Secondary Datacenter Backup Cluster


Table: Secondary Datacenter Backup Cluster: Hardware
Item Specification
Platform NX-8155-G8
Configuration One node
Type Hybrid
Support Level Production
NRDK Support No
NR Node Support No

Table: Secondary Datacenter Backup Cluster: Per-Node Hardware Configuration

Component         Description                                   Quantity
Processor         Intel Xeon-Gold 6326 (2.9 GHz, 16 cores)      2
Memory            32 GB (3,200 MHz DDR4 RDIMM)                  8
HDD               18 TB, 3.5 in.                                8
SSD               3.84 TB                                       2
Network adapter   25 GbE, 2-port (NVIDIA Mellanox ConnectX-5)   1

Use the following software for the secondary datacenter backup cluster:
• Nutanix Mine software
• HYCU License Bundle for Nutanix Mine, 1Ct for 3YR


Nutanix recommends the Nutanix Unified Storage Deployment - Objects Nutanix
Professional Services subscription for the secondary datacenter backup cluster.

Nutanix Professional Services


With the following Nutanix Professional Services, Nutanix can implement this NVD as
designed, built, and tested:
• NCI Flow Network Security Microsegmentation Design Workshop
• NCI Flow Network Security Microsegmentation Deployment
• NCI Disaster Recovery Design Workshop
• NCI Disaster Recovery Deployment
• FastTrack for NCM Intelligent Operations
• FastTrack for NCM Self-Service
• FastTrack for NCM Cost Governance
These services are outcome-based, with fixed prices for the scope. For more information,
see the Nutanix Professional Services information available on Nutanix.com.


6. Test Plan
The test plan for the Nutanix Hybrid Cloud: AOS 6.5 with AHV On-Premises Design
validates a successful implementation (spreadsheet automatically downloads when
you click the link). Compare the result for each test plan item with the Expected Result
column, then select the correct response from the dropdown menu in the Result column.


7. Appendix

Windows VM Performance Tuning


For Windows VMs, consider the following performance tuning settings:
• Configure advanced settings for maximum performance: In the base OS image,
open the System Properties page and click the Advanced tab. In the Performance
section, click Settings, go to the Visual Effects tab, select Adjust for best
performance, and click OK.
• To set the VM graphics adapter hardware acceleration to full, open the Control
Panel. In the Display section, go to the Settings tab and click Advanced. On the
Troubleshooting tab, set Hardware Acceleration to Full.
• Disable screensavers and Windows search indexing.

Linux VM Performance Tuning


For Linux VMs, consider the following performance tuning settings:
• Specifically for Java Virtual Machine (JVM) systems, enable large pages with enough
memory to cover the JVM heap and any other memory requirements. Reserve the
memory required for the JVM heap, any other JVM memory, and basic OS functions.
Start with 2 vCPU and increase only if necessary. To enable large pages for a Sun
JVM, use the parameter -XX:+UseLargePages. To enable large pages on an IBM
JVM, use the parameter -Xlp.
• Edit the grub config and add the following settings to the correct kernel boot line:
transparent_hugepage=never iommu=soft elevator=noop powersaved=off
selinux=0 noselinux apm=off scsi_mod.use_blk_mq=1 dm_mod.use_blk_mq=1

• Add the following settings to sysctl.conf:

vm.overcommit_memory = 1
vm.dirty_background_ratio = 5
vm.dirty_ratio = 15
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.swappiness = 0
fs.aio-max-nr = 3145728
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.ipv4.tcp_rmem = 4096 262144 16777216
net.ipv4.tcp_wmem = 4096 262144 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# These should be on by default, but just to be sure
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
# Disable IP forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
# Allow many more connections
net.core.netdev_max_backlog = 5000
net.core.somaxconn = 10000
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_syn_backlog = 5000
# Maximum shared memory segments, in pages; depends on the memory configured
# on the server. Try the defaults first before using these.
#kernel.shmall = 4294967296 # 90% of memory, in pages
#kernel.shmmni = 4096
kernel.shmmax = 5368709120
kernel.sem = 250 256000 128 1024
# Large pages (for Java workloads and so on): replace the number of pages with
# the amount of memory to reserve divided by 2 MB (the page size), and replace
# the group ID (gid) with the ID of the group that locks the pages in memory;
# replace values in <>.
# vm.nr_hugepages = 2304
# vm.hugetlb_shm_group = 1002

• Add the following lines to limits.conf:

<gid java user> soft nofile 131070    # Enough file descriptors for all TCP ports and filesystem handles
<gid java user> hard nofile 131070    # Set the same as above
@<gid java user> soft memlock 4718592 # Must be sufficient to cover the number of reserved huge pages
@<gid java user> hard memlock 4718592 # Should be the same value as above
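The huge-page sizing comment in the sysctl example above divides the memory to reserve by the 2 MB page size. A quick sketch of that arithmetic follows; the 4 GiB heap and 512 MiB of other JVM memory are example values, not recommendations from this design.

```python
# Illustrative arithmetic for sizing vm.nr_hugepages with 2 MiB huge pages.

HUGE_PAGE_MIB = 2  # default x86_64 huge page size

def nr_hugepages(heap_mib: int, other_jvm_mib: int = 0) -> int:
    total_mib = heap_mib + other_jvm_mib
    # Round up so the reservation fully covers the requirement.
    return -(-total_mib // HUGE_PAGE_MIB)

# 4 GiB heap + 512 MiB other JVM memory -> 2304 pages, matching the
# vm.nr_hugepages = 2304 example; 2304 pages * 2048 KiB = 4718592 KiB,
# matching the memlock limit in limits.conf.
print(nr_hugepages(4096, 512))
```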

Design Limits
The following design limits apply to the software versions current at the time of
publishing.
Table: Hybrid Cloud On-Premises AOS 6.5 Design Limits

Name                                                                          Product Maximum   NVD Design Maximum
AHV: Maximum powered-on VMs per host                                          128               124
AHV: Nodes per cluster                                                        32                16
Flow Network Security: Target group VMs per Prism Central instance            7,500             7,500
Nutanix Disaster Recovery: Async guest VMs                                    1,000             375
Nutanix Disaster Recovery: Async protection policy guest VMs                  2,000             375
Nutanix Disaster Recovery: Async guest VMs parallel failover                  1,000             750
Nutanix Disaster Recovery: Synchronous or Metro guest VMs                     1,000             125
Nutanix Disaster Recovery: Synchronous or Metro protection policy guest VMs   1,000             125
Nutanix Disaster Recovery: Synchronous or Metro guest VMs parallel failover   1,000             250
Nutanix Disaster Recovery: Maximum protected VMs of any type per pod          2,000             2,000
HYCU: Maximum VMs protected per backup server VM                              1,500             1,000
Prism Central: Maximum VMs                                                    25,000            7,500
Prism Central: Maximum clusters                                               400               N/A
Prism Central: Maximum nodes                                                  2,000             N/A
NCM Self-Service: Appliance VM–managed blueprints                             7,500             7,500


References
1. Nutanix Hybrid Cloud Reference Architecture
2. Nutanix Cloud Manager Self-Service
3. Flow Network Security Guide
4. Nutanix Disaster Recovery (formerly Leap)
5. Nutanix Mine with HYCU User Guide
6. Nutanix Objects
7. Data Protection and Disaster Recovery
8. Physical Networking


About Nutanix
Nutanix offers a single platform to run all your apps and data across multiple clouds
while simplifying operations and reducing complexity. Trusted by companies worldwide,
Nutanix powers hybrid multicloud environments efficiently and cost effectively. This
enables companies to focus on successful business outcomes and new innovations.
Learn more at Nutanix.com.



List of Figures
Figure 1: Architecture for Hybrid Cloud On-Premises Validated Designs......................................................6

Figure 2: Hybrid Cloud On-Premises Conceptual Pod Design.................................................................... 12

Figure 3: Hybrid Cloud On-Premises Block-and-Pod Architecture.............................................................. 18

Figure 4: Hybrid Cloud On-Premises Scaling.............................................................................................. 22

Figure 5: Availability Chart........................................................................................................................... 23

Figure 6: Physical Network Architecture...................................................................................................... 26

Figure 7: Hybrid Cloud Management Plane.................................................................................................34

Figure 8: Hybrid Cloud On-Premises Monitoring Conceptual Design..........................................................36

Figure 9: Hybrid Cloud Performance Metrics Systems................................................................................37

Figure 10: Hybrid Cloud Rack Layout..........................................................................................................43

Figure 11: Hybrid Cloud BCDR Conceptual Design.................................................................................... 51

Figure 12: Hybrid Cloud On-Premises BCDR Logical Diagram...................................................................52

Figure 13: Hybrid Cloud On-Premises Backup Architecture Logical Design............................................... 57

Figure 14: Nutanix Objects Logical Design for Hybrid Cloud...................................................................... 60

Figure 15: NCM Self-Service with Automation Conceptual Design for Hybrid Cloud.................................. 67

Figure 16: NCM Self-Service with Automation Logical Design....................................................................68
