AZ-120 - Azure Recommendation For Managing SAP Workload
• The virtual networks the SAP application is deployed into don't have access to the internet.
• The database VMs run in the same virtual network as the application layer.
• The VMs within the virtual network should have static private IP address allocation. This is
important when deploying SAP HANA because some configuration attributes for HANA
reference IP addresses.
• To separate and isolate traffic to the DBMS VM, assign different NICs to the VM. Every NIC
gets a different IP address, and every NIC is assigned to a different virtual network subnet. For
example, one NIC can connect to the management subnet, and another NIC can facilitate
connectivity from the on-premises network or other Azure virtual networks.
• Traffic restrictions to and from Azure VMs hosting SAP workloads aren't controlled by using
operating system firewalls, but rather by using network security groups (NSGs).
• Divide virtual network address space into subnets. Each subnet can be associated with an
NSG that defines the access policies for the subnet. Place application servers on a separate
subnet so you can secure them more easily by managing the subnet security policies, not the
individual servers. Associate NSGs with subnets, rather than individual network adapters to
minimize management overhead. When an NSG is associated with a subnet, it applies to all
the Azure VMs connected to that subnet. For the list of ports required by SAP workloads,
refer to TCP/IP Ports used by SAP Applications (a CLI sketch follows this list).
• A supported VM size without accelerated networking enabled can have the feature
enabled only while it's stopped and deallocated (see the sketch after this list).
• SQL Server instances running with data files stored directly on blob storage are likely to
benefit greatly from accelerated networking.
• It's possible to have one or more accelerated NICs and a traditional non-accelerated network
card on the same VM.
• SAP application server to database server latency can be tested with the ABAP report /SSA/CAT -
> ABAPMeter.
• Inefficient “chatty” ABAP code or intensive operations such as large Payroll jobs or IS-Utilities
Billing jobs have shown significant improvement after enabling accelerated networking.
• To take advantage of accelerated networking in load balancing scenarios, make sure to use
Standard Azure load balancer (rather than Basic).
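To illustrate the accelerated networking and NSG recommendations above, here is a minimal Azure CLI sketch. The resource group, VM, NIC, virtual network, subnet, and NSG names are hypothetical placeholders, and the dispatcher port range is only an example.

# Hedged sketch — SAPRG, sapapp-vm1, sapapp-nic1, sap-vnet, sap-app-subnet, sap-app-nsg are hypothetical.
# Accelerated networking can only be enabled while the VM is stopped and deallocated.
az vm deallocate --resource-group SAPRG --name sapapp-vm1
az network nic update --resource-group SAPRG --name sapapp-nic1 --accelerated-networking true
az vm start --resource-group SAPRG --name sapapp-vm1

# Create an NSG and associate it with the application subnet (not with individual NICs).
az network nsg create --resource-group SAPRG --name sap-app-nsg
az network nsg rule create --resource-group SAPRG --nsg-name sap-app-nsg --name AllowDispatcher \
  --priority 100 --direction Inbound --access Allow --protocol Tcp --destination-port-ranges 3200-3299
az network vnet subnet update --resource-group SAPRG --vnet-name sap-vnet --name sap-app-subnet \
  --network-security-group sap-app-nsg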
In highly available configurations, the incoming traffic to the DBMS VM is always routed through the
load balancer. The outgoing traffic route from the DBMS VM to the application layer VM depends on
the configuration of the load balancer.
The load balancer offers an option of DirectServerReturn. If that option is configured, the traffic
directed from the DBMS VM to the SAP application layer isn't routed through the load balancer.
Instead, it goes directly to the application layer. When DirectServerReturn isn't configured, the return
traffic to the SAP application layer is routed through the load balancer.
Microsoft recommends that you configure DirectServerReturn combined with load balancers that are
positioned between the SAP application layer and the DBMS layer. This configuration reduces
network latency between the two layers.
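In Azure, DirectServerReturn corresponds to the Floating IP setting on a load-balancing rule. The following is a minimal sketch of creating such a rule on a Standard load balancer; the load balancer, frontend, backend pool, and probe names are hypothetical, and port 1433 (SQL Server) is only an example.

# Hedged sketch — SAPRG, sap-db-lb, dbFrontend, dbBackendPool, dbProbe are hypothetical names.
# --floating-ip true enables Direct Server Return on the rule.
az network lb rule create --resource-group SAPRG --lb-name sap-db-lb --name dbRule \
  --protocol Tcp --frontend-port 1433 --backend-port 1433 \
  --frontend-ip-name dbFrontend --backend-pool-name dbBackendPool \
  --probe-name dbProbe --floating-ip true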
Networking support of Azure for SAP workloads
Network security
• “For any SAP production system installed on Azure, it is required that you operate in Virtual
Private Networks that are connected to your datacenters with Azure site-to-site or
ExpressRoute connectivity. End-user access to the application should be routed through your
company's intranet and the Azure site-to-site or ExpressRoute connections to the applications
hosted in Azure Virtual Machine Services. This way, the network and other security policies
defined for on-premises applications are extended to the applications in the Azure Virtual
Machines.”
• “A design that is NOT supported is the segregation of the SAP application layer and the DBMS
layer into different Azure virtual networks that are not peered with each other. Segregate the
SAP application layer and DBMS layer using subnets within an Azure virtual network instead of
using different Azure virtual networks. If you decide not to follow the recommendation and
instead segregate the two layers into different virtual networks, the two virtual networks need to
be peered. Network traffic between two peered Azure virtual networks is subject to transfer
costs. With the huge data volume, often many terabytes, exchanged between the SAP application
layer and DBMS layer, substantial costs can accumulate if the two layers are segregated
between two peered Azure virtual networks.”
Network performance
• “It is NOT supported at all to run an SAP Application Server layer and DBMS layer split
between on-premises and Azure. Both layers need to completely reside either on-premises or
in Azure. It is also NOT at all supported to have SAP instances split between on-premises and
Azure. Per individual SAP system, the DBMS and all SAP application instance(s) must be in
the same location, either Azure or on-premise.
• The location of the Azure datacenter or region relative to your own datacenter can impact the
latency experienced between on-premises and Azure-hosted SAP systems. To minimize
latency between on-premises and Azure, it is advisable to select Azure regions that are close
to your own location.
• For functional, but more importantly for performance reasons, it is not supported to
configure Network Virtual Appliances (NVAs) on Azure in the communication path between the
SAP application and the DBMS layer of an SAP NetWeaver, Hybris, or S/4HANA based SAP
system. The communication between the SAP application layer and the DBMS layer needs to
be direct. For more information, check SAP Note #2731110. The restriction does not
include Application Security Group (ASG) and Network Security Group (NSG) rules if those ASG
and NSG rules allow direct communication. Further scenarios where NVAs are not supported
are in communication paths between Azure VMs that represent Linux Pacemaker cluster nodes
and SBD devices, or in communication paths between Azure VMs and a Windows Server
Scale-Out File Server (SOFS) setup.”
Network Reliability
• “Customers should use a good quality (low latency, sufficient bandwidth, no packet loss)
connection between their datacenter and Azure. Customers also should verify and monitor that
the bandwidth between on-premises and Azure is sufficient to handle the communication
workload.”
Azure VMs and SAP HANA on Azure (Large Instances) can benefit from the use of Accelerated
Networking and Proximity Placement Groups.
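A proximity placement group keeps the VMs of an SAP system physically close together to reduce network latency between the layers. A minimal Azure CLI sketch follows; the resource group, group name, VM name, size, and region are hypothetical examples.

# Hedged sketch — SAPRG, sap-ppg, sapdb-vm1 are hypothetical names.
az ppg create --resource-group SAPRG --name sap-ppg --location westeurope
az vm create --resource-group SAPRG --name sapdb-vm1 --image Ubuntu2204 \
  --size Standard_M64s --ppg sap-ppg --accelerated-networking true --generate-ssh-keys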
Write Accelerator
Write Accelerator is a disk capability for M-Series Azure VMs with Premium storage-based Azure
managed disks. Its purpose is to improve the I/O latency of writes. Write Accelerator is ideally suited
where log file updates are required to persist to disk in a highly performant manner for modern
databases.
Write Accelerator should be used for the volumes that contain the transaction log or redo logs of a
DBMS. It is not recommended to use Write Accelerator for the data volumes of a DBMS as the
feature has been optimized to be used against log disks.
When using Write Accelerator for Azure VM disks, these restrictions apply:
• The disk caching must be set to 'None' or 'Read Only'. No other caching modes are
supported.
• Azure Disk Backup does support backup of Write Accelerator-enabled disks. However, during
restore the disk will be restored as a normal disk. The Write Accelerator cache can be enabled on
the restored disk after mounting it to a VM.
• Only smaller I/O sizes (<=512 KiB) take the accelerated path. In workload situations
where data is bulk loaded or where the transaction log buffers of the DBMS are
filled to a larger degree before being persisted to storage, the I/O written
to disk will likely not take the accelerated path.
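Write Accelerator can be toggled per data disk LUN with the Azure CLI, as in this hedged sketch; the resource group, VM name, and LUN are hypothetical examples.

# Hedged sketch — SAPRG and hana-m64-vm are hypothetical names.
# Enable Write Accelerator on the data disk at LUN 1 (e.g., the /hana/log disk).
az vm update --resource-group SAPRG --name hana-m64-vm --write-accelerator 1=true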
Paging/swap file
Use the following recommendations when configuring the paging/swap file:
• The Windows operating system pagefile should reside on the D: drive (non-persistent disk).
• The Linux swap file should reside under /mnt/resource and be configured in the Linux Agent
configuration file /etc/waagent.conf. Add or change the following settings:
o ResourceDisk.EnableSwap=y
o ResourceDisk.SwapSizeMB=[size in MBs]
• To activate the changes, you need to restart the Linux Agent by running:
o sudo service waagent restart
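Putting those settings together, the relevant lines of /etc/waagent.conf would look like the following sketch; the 2048 MB swap size is an arbitrary example, not a recommendation.

# /etc/waagent.conf — excerpt (sketch; the size value is an example only)
ResourceDisk.Format=y          # format and mount the ephemeral resource disk under /mnt/resource
ResourceDisk.EnableSwap=y      # create a swap file on the resource disk
ResourceDisk.SwapSizeMB=2048   # swap file size in MB

# Apply the change by restarting the Linux Agent:
sudo service waagent restart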
Managed disks
The use of managed disks is recommended for all SAP workloads. Note that managed disks are required to
implement Write Accelerator.
Premium Storage
Premium Storage provides significantly better performance than Standard Storage, especially for critical
transaction log writes. Microsoft recommends Azure Standard SSD storage as the minimum for Azure VMs
hosting the SAP application layer and for non-performance-sensitive DBMS deployments, and Azure
Premium SSD storage for all other Azure VM DBMS workloads.
Multi-disk volumes
Stripe multiple Azure data disks using Storage Spaces to increase I/O bandwidth up to the target virtual
machine's IOPS and throughput limits. On Linux, use the mdadm utility to stripe disks together; mdadm is
a small program that allows you to configure and manage software RAID devices in Linux.
Storage latency is critical for DBMS systems, even for SAP HANA, which, for the most part, keeps data in-
memory. The critical path in storage is usually around the transaction log writes of the DBMS systems. However,
operations like writing savepoints or loading data in-memory after crash recovery can also be critical. Therefore, it
is mandatory to use Azure Premium Disks for the /hana/data and /hana/log volumes. To achieve the
minimum throughput of /hana/log and /hana/data as required by SAP, build a RAID 0 volume using mdadm or
LVM over multiple Azure Premium Storage disks, with the stripe size set to the recommended value for each
volume (a sketch follows).
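A minimal Linux sketch of building such a striped volume for /hana/data follows; the device names, the 256 KB chunk size, and the mount point are illustrative assumptions, not a sizing recommendation.

# Hedged sketch — /dev/sdc../dev/sdf and the 256 KB chunk size are illustrative assumptions.
sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=256 \
  /dev/sdc /dev/sdd /dev/sde /dev/sdf
sudo mkfs.xfs /dev/md0            # XFS is commonly used for HANA volumes
sudo mkdir -p /hana/data
sudo mount /dev/md0 /hana/data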
Caching
When you mount disks to VMs, you can choose whether the I/O traffic between the VM and those disks located
in Azure storage is cached. Standard and Premium storage use two different technologies for this type of
cache. The available caching options are:
• None
• Read
• Read/Write
• None + Write Accelerator, which is only for Azure M-Series VMs
• Read + Write Accelerator, which is only for Azure M-Series VMs
The Premium Storage-specific recommendation is to use Read caching for disks hosting SAP database data
files and no caching for the disks containing SAP database log files.
The same principle applies to SAP HANA, where the caching for volumes using Azure Premium Storage should
be set as follows:
• /hana/data - no caching
• /hana/log - no caching (except on M-Series VMs, where None + Write Accelerator applies)
• /hana/shared - read caching
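When attaching disks with the Azure CLI, the cache setting is chosen per disk, as in this hedged sketch; the resource group, VM, and disk names are hypothetical.

# Hedged sketch — SAPRG, hana-vm, and the disk names are hypothetical.
az vm disk attach --resource-group SAPRG --vm-name hana-vm --name hana-data-disk1 --caching None
az vm disk attach --resource-group SAPRG --vm-name hana-vm --name hana-shared-disk --caching ReadOnly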
For M-Series deployments, Microsoft recommends that you use Azure Write Accelerator for your DBMS
deployment. In fact, SAP HANA certification for Azure M-Series virtual machines requires that Azure
Write Accelerator be enabled for the /hana/log volume. The number of Azure Premium Storage VHDs per VM
that can be supported by Azure Write Accelerator is limited and depends on the VM size.
SAP Note #2015553 describes storage-specific provisions for Azure VMs hosting SAP workloads:
“We strongly recommend using Azure Premium Storage for all SAP production systems in Azure VMs. Even for
non-production systems, which require reliable and predictable performance, you should use Azure Premium
Storage instead of Azure Standard Storage for placing your DBMS files.”
• “To increase the total number of IOPS per volume presented to the guest operating system in the VM,
multiple disks can be striped using functionality that operating systems offer. Each disk is protected from
physical drive failure by means of mirroring, so using a software RAID level higher than RAID-0 isn't
necessary.”
• “DB log files should be stored on different disks than the DB data files.”
• “Azure Virtual Machines automatically offer a D:\ drive within the VM instance. This drive isn't persisted
and should NOT be used at all for any DBMS files/directories or any SAP files/directories.”
• The use of managed disks is recommended for SAP workloads.
The SAP HANA certified Azure storage types that can be considered for SAP HANA deployments
are Azure Premium SSD, Azure Ultra disk, and NFS v4.1 volumes on Azure NetApp Files.
The minimum SAP HANA certified conditions for the different storage types are:
• Azure Premium SSD - /hana/log is required to be cached with Azure Write Accelerator.
The /hana/data volume can be placed on Premium SSD without Azure Write Accelerator or on Ultra disk.
• Azure Ultra disk - at least for the /hana/log volume. The /hana/data volume can be placed on either
Premium SSD without Azure Write Accelerator or, for faster restart times, on Ultra disk.
• NFS v4.1 volumes on top of Azure NetApp Files - for /hana/log and /hana/data.
SAP HANA certification for Azure M-Series virtual machines applies exclusively with Azure Write Accelerator
for the /hana/log volume. As a result, production-scenario SAP HANA deployments on Azure M-Series virtual
machines are expected to be configured with Azure Write Accelerator for the /hana/log volume.
NFS v4.1 volumes on Azure NetApp Files
Important considerations
When considering Azure NetApp Files for SAP NetWeaver and SAP HANA, be aware of the following
important considerations:
Data protection
• Use the Azure Resource Manager deployment model.
• Enable Azure Defender for all of your storage accounts.
• Turn on soft delete for blobs.
• Turn on soft delete for containers.
• Lock storage account to prevent accidental or malicious deletion or configuration changes.
• Store business-critical data in immutable blobs.
• Require secure transfer (HTTPS) to the storage account.
• Limit shared access signature (SAS) tokens to HTTPS connections only.
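Several of these settings can be applied with the Azure CLI, as in this hedged sketch; the resource group and storage account names are hypothetical, and the retention period is an arbitrary example.

# Hedged sketch — SAPRG and sapstorageacct are hypothetical names.
# Require secure transfer (HTTPS) on the storage account.
az storage account update --resource-group SAPRG --name sapstorageacct --https-only true
# Turn on soft delete for blobs (retain deleted blobs for 14 days, an example value).
az storage account blob-service-properties update --resource-group SAPRG \
  --account-name sapstorageacct --enable-delete-retention true --delete-retention-days 14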
Networking
• Configure the minimum required version of Transport Layer Security (TLS) for a storage account.
• Enable the Secure transfer required option on all of your storage accounts.
• Enable firewall rules.
• Allow trusted Microsoft services to access the storage account.
• Use private endpoints.
• Use VNet service tags.
• Limit network access to specific networks.
• Configure network routing preference.
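A hedged Azure CLI sketch of the TLS and firewall items above; the resource group, account, virtual network, and subnet names are hypothetical, and the subnet is assumed to have the storage service endpoint enabled.

# Hedged sketch — SAPRG, sapstorageacct, sap-vnet, sap-app-subnet are hypothetical names.
# Enforce a minimum TLS version and deny network traffic by default.
az storage account update --resource-group SAPRG --name sapstorageacct \
  --min-tls-version TLS1_2 --default-action Deny --bypass AzureServices
# Allow access only from a specific virtual network subnet.
az storage account network-rule add --resource-group SAPRG --account-name sapstorageacct \
  --vnet-name sap-vnet --subnet sap-app-subnet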
If there isn't enough free space available, the disk can be resized to 2048 GB. Oracle Database and redo log files
need to be stored on separate data disks. There's an exception for the Oracle temporary
tablespace. Tempfiles can be created on D:\ (non-persistent drive). The non-persistent D:\ drive also offers better
I/O latency and throughput (except for A-Series VMs). To determine the right amount of space for the tempfiles,
you can check the sizes of the tempfiles on existing systems.
Only single-instance Oracle using NTFS formatted disks is supported. All database files must be stored on the
NTFS file system on managed (recommended) or on unmanaged disks. Microsoft strongly recommends using
Azure Managed Disks with Premium SSD storage for your Oracle Database deployments. Network drives or
remote shares like Azure file services aren't supported for Oracle Database files.
Azure Automation provides the following capabilities:
• Write runbooks - Author PowerShell, PowerShell Workflow, graphical, Python 2, and DSC runbooks in
common languages.
• Build and deploy resources - Deploy virtual machines across a hybrid environment using runbooks and
Azure Resource Manager templates. Integrate into development tools, such as Jenkins and Azure
DevOps.
• Configure VMs - Assess and configure Windows and Linux machines with configurations for the
infrastructure and application.
• Share knowledge - Transfer knowledge into the system on how your organization delivers and maintains
workloads.
• Retrieve inventory - Get a complete inventory of deployed resources for targeting, reporting, and
compliance.
• Find changes - Identify changes that can cause misconfiguration and improve operational compliance.
• Monitor - Isolate machine changes that are causing issues and remediate or escalate them to
management systems.
• Protect - Quarantine machines if security alerts are raised. Set in-guest requirements.
• Govern - Set up Azure RBAC for teams. Recover unused resources.
Assuming there is an XFS file system spanning four Azure virtual disks, the following steps (sketched
below) provide a consistent snapshot that represents the HANA data area:
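A hedged sketch of the freeze/snapshot/unfreeze sequence follows; the resource group, disk names, and mount point are hypothetical, and the exact procedure should be validated against current documentation.

# Hedged sketch — SAPRG and the disk names are hypothetical; validate against current guidance.
# 1. Freeze the XFS file system so all four disks are write-consistent.
sudo xfs_freeze -f /hana/data
# 2. Snapshot each of the four underlying Azure managed disks.
for disk in hana-data-disk1 hana-data-disk2 hana-data-disk3 hana-data-disk4; do
  az snapshot create --resource-group SAPRG --name "${disk}-snap" --source "$disk"
done
# 3. Unfreeze the file system to resume writes.
sudo xfs_freeze -u /hana/data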
• Survey and inventory the current SAP landscape. Identify the SAP Support Pack levels and determine if
patching is required to support the target DBMS. In general, the operating system compatibility is
determined by the SAP kernel and the DBMS compatibility is determined by the SAP_BASIS patch level.
• Build a list of SAP OSS Notes that need to be applied in the source system, such as updates for
SMIGR_CREATE_DDL. Consider upgrading the SAP kernels in the source systems to avoid a large
change during the migration to Azure (e.g., if a system is running an old 7.41 kernel, update to the
latest 7.45 on the source system beforehand).
• Develop and document the high availability and disaster recovery solution. The documentation should
break up the solution into the DB layer, ASCS layer, and SAP application server layer. Separate solutions
might be required for standalone solutions such as TREX or Livecache.
• Develop a sizing and configuration document that details the Azure VM types and storage configuration.
How many premium disks, how many data files, how the data files are distributed across disks, usage of
Storage Spaces, NTFS allocation unit size = 64 KB. Also, document backup/restore and DBMS configuration,
such as memory settings, max degree of parallelism, and traceflags.
• Develop a network design document including VNet, Subnet, NSG and UDR configuration.
• Document and implement security and hardening concept. Remove Internet Explorer, create an Active
Directory container for SAP service accounts and servers, and apply a firewall policy blocking all but a
limited number of required ports.
• Create an OS/DB migration design document detailing the package and table splitting concept, number of
R3loads, SQL Server traceflags, sorted/unsorted, Oracle RowID setting, SMIGR_CREATE_DDL settings,
Perfmon counters (such as BCP rows/sec and BCP throughput kb/sec, CPU, memory), RSS settings,
Accelerated Networking settings, log file configuration, BPE settings, TDE configuration.
• Create a “Flight Plan” graph showing the progress of the R3load export/import on each test cycle. This
allows the migration team to validate whether tunings and changes improve R3load export or import
performance. The X axis is the number of packages completed, and the Y axis is the elapsed time. This
flight plan is also critical during the production migration so that the planned progress can be compared
against the actual progress and any problems identified early.
• Create a performance testing plan. Identify the top ~20 online reports, batch jobs and
interfaces. Document the input parameters (such as date range, sales office, plant, company
code, etc.) and runtimes on the original source system. Compare them to the runtimes on Azure. If
there are performance differences, run SAT, ST05, and other SAP tools to identify inefficient
statements.
• Audit deployment and configuration, and ensure that cluster timeouts, kernels, network
settings, NTFS format size are all consistent with the design documents. Set perfmon counters
on important servers to record basic health parameters every 90 seconds. Verify that the SAP
servers are in a separate AD container and that the container has a group policy applied to it
with firewall configuration.
• Check that the lead OS/DB migration consultant is licensed! Request the consultant's name, S-user
ID, and certification date. Open an OSS message to BC-INS-MIG and ask SAP to confirm that the
consultant is current and licensed.
• If possible, have the entire project team associated with the VLDB migration project within one
physical location and not geographically dispersed across several continents and time zones.
• Make sure that a proper fallback plan is in place and that it is part of the overall
schedule.
• Select Intel CPU models with high per-thread performance for the R3load export servers. Do not use
“Energy Saver” CPU models, as they have much lower performance, and do not use 4-socket servers. The
Intel Xeon Platinum 8158 is a good example.
Various SKUs are available for HANA Large Instances, supporting up to 20 TB single instance (60 TB
scale-out) of memory for S/4HANA or other SAP HANA workloads. Two classes of servers are
offered:
• Type I class of SKUs comprising S72, S72m, S96, S144, S144m, S192, S192m, and S192xm
• Type II class of SKUs comprising S384, S384m, S384xm, S384xxm, S576m, S576xm, S768m,
S768xm, and S960m
For greenfield implementations, SAP Quick Sizer is available to calculate the memory requirements of the
implementation of SAP software on top of HANA.
The storage used in HANA Large Instances has a file size limitation of 16 TB. Unlike with file size
limitations in EXT3 file systems, HANA is not implicitly aware of the storage limitation enforced by
the HANA Large Instances storage. As a result, HANA will not automatically create a new data file
when the 16 TB file size limit is reached. As HANA attempts to grow a file beyond 16 TB, HANA
will report errors and the index server will eventually crash. To prevent HANA from trying to grow data
files beyond the 16 TB file size limit of HANA Large Instance storage, set the following
parameters in the global.ini configuration file of HANA (see the sketch after this list):
• datavolume_striping = true
• datavolume_striping_size_gb = 15000
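In global.ini, these parameters belong in the persistence section; a minimal excerpt follows. The [persistence] section name is an assumption based on the standard HANA configuration layout.

# global.ini — excerpt (sketch; [persistence] section assumed per standard HANA layout)
[persistence]
datavolume_striping = true
datavolume_striping_size_gb = 15000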
• The HANA Large Instance units of your customer tenant are connected through another
ExpressRoute circuit into your virtual networks. To separate load conditions, the on-premises to
Azure virtual network ExpressRoute circuits and the circuits between Azure virtual networks
and HANA Large Instances don't share the same routers.
• The workload profile between the SAP application layer and the HANA Large Instance consists
typically of small requests and burst data transfers (result sets) from SAP HANA into the
application layer.
• The SAP application architecture is more sensitive to network latency than typical scenarios
where data is exchanged between on-premises and Azure.
• The Azure ExpressRoute gateway has at least two ExpressRoute circuits: one circuit
connected from on-premises and one connected from HANA Large Instances. This
leaves room for only two additional circuits from different MSEEs to connect to the
ExpressRoute gateway. All the connected circuits share the maximum incoming bandwidth
of the ExpressRoute gateway.
With ExpressRoute Global Reach, customers can link ExpressRoute circuits together to make a
private network between on-premises networks. Global Reach can be used for HANA Large
Instances in two scenarios:
• Enable direct access from on-premises to your HANA Large Instance units deployed in different
regions.
• Enable direct communication between your HANA Large Instance units deployed in different
regions.
Network considerations for disaster recovery with SAP HANA on Azure (Large Instances)
To take advantage of the disaster recovery functionality of HANA Large Instances, you need to design
network connectivity to the two Azure regions. You need an Azure ExpressRoute circuit connection
from on-premises to your main Azure region, and another circuit connection from on-premises to your
disaster recovery region.
This measure covers a situation in which there's a problem in an Azure region, including a Microsoft
Enterprise Edge Router (MSEE) location.
As a second measure, you can connect all Azure virtual networks that connect to SAP HANA on
Azure (Large Instances) in one region to an ExpressRoute circuit that connects HANA Large
Instances in the other region. With this cross connect, services running on an Azure virtual network in
Region 1 can connect to HANA Large Instance units in Region 2, and the other way around. This
measure addresses a case in which only one of the MSEE locations that connects to your on-
premises location with Azure goes offline.
Backup considerations for SAP HANA on Azure (Large Instances)
Do it yourself (DIY)
After you make sure that there's enough disk space, perform full database and log backups by using
one of the following disk backup methods. You can back up (using native tools such as SAP HANA
Cockpit) either directly to volumes attached to the HANA Large Instance units or to NFS shares that
are set up in an Azure virtual machine (VM).
Infrastructure backup and restore functionality
You can also use the backup and restore functionality
that the underlying infrastructure of SAP HANA on Azure (Large Instances) provides. This option
fulfills the need for backups and fast restores.
• Although the hardware can sustain 255 snapshots per volume, you want to stay well below this
number. The recommendation is 250 or fewer.
• Before you perform storage snapshots, monitor and keep track of free space.
• Lower the number of storage snapshots based on free space. You can lower the number of
snapshots that you keep, or you can extend the volumes. You can order additional storage in 1
TB units.
• During activities such as moving data into SAP HANA with SAP platform migration tools
(R3load) or restoring SAP HANA databases from backups, disable storage snapshots on
the /hana/data volume.
• During larger reorganizations of SAP HANA tables, avoid storage snapshots if possible.
Storage snapshots are a prerequisite to taking advantage of the disaster recovery capabilities of SAP
HANA on Azure (Large Instances).
SAP HANA on Azure (Large Instances) security considerations
Data transferred between HANA Large Instances and VMs is not encrypted. As an alternative, you
have the option of enabling application-level encryption between the HANA DBMS and JDBC/ODBC-
based applications.
The storage used for HANA Large Instance allows transparent encryption of the data as it's stored on
the disks. When a HANA Large Instance unit is deployed, you can enable this kind of encryption. You
also can change to encrypted volumes after the deployment takes place. The move from non-
encrypted to encrypted volumes is transparent and doesn't require downtime.
• By default, HANA Large Instances use storage encryption based on transparent data
encryption (TDE) for the data at rest.
• Data in transit between HANA Large Instances and the virtual machines is not encrypted. To
encrypt the data transfer, enable application-specific encryption. See SAP Note #2159014.
• Isolation provides security between the tenants in the multi-tenant HANA Large Instance
environment. Tenants are isolated using their own VLAN.
• Azure network security best practices provide helpful guidance.
• As with any deployment, operating system hardening is recommended.
• For physical security, access to Azure datacenters is limited to authorized personnel only. No
customers can access the physical servers.