Db2 Optimization Techniques for SAP Database Migration to the Cloud
Dino Quintero
Frank Becker
Holger Hellmuth
Joern Klauke
Thomas Rech
Alexander Seelert
Hans-Jürgen Zeltwanger
Tim Simon
Redbooks
IBM Redbooks
June 2023
SG24-8531-00
Note: Before using this information and the product it supports, read the information in “Notices” on
page xi.
This edition applies to the following software and operating system levels:
AIX 7.2 TL 5 SP 3
Db2 Version 11.1 MP4 FP6 SAP4
Db2 Version 11.5 MP7 FP0 SAP2
SAP ERP 6.08 (BS7i2016), kernel release 753
Red Hat Enterprise Linux 8.4
Red Hat Enterprise Linux 8.5
Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Chapter 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 SAP NetWeaver and Db2 in the cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Methods outside the scope of this document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Heterogeneous system copy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 Unicode conversion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6 Database vendor change . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Notices
This information was developed for products and services offered in the US. This material might be available
from IBM in other languages. However, you may be required to own a copy of the product or product version in
that language in order to access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
Statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and
represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to actual people or business enterprises is entirely
coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are
provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use
of the sample programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines
Corporation, registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright
and trademark information” at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks or registered trademarks of International Business Machines Corporation,
and might also be trademarks or registered trademarks in other countries.
AIX®, Aspera®, BLU Acceleration®, Db2®, FASP®, IBM®, IBM Cloud®, InfoSphere®, POWER®, pureScale®, Redbooks®, Redbooks (logo)®, z/OS®
Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel
Corporation or its subsidiaries in the United States and other countries.
The registered trademark Linux® is used pursuant to a sublicense from the Linux Foundation, the exclusive
licensee of Linus Torvalds, owner of the mark on a worldwide basis.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States,
other countries, or both.
Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its
affiliates.
OpenShift and Red Hat are trademarks or registered trademarks of Red Hat, Inc. or its subsidiaries in the United
States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other company, product, or service names may be trademarks or service marks of others.
Preface
For many years, SAP migrations have been a standard process. We see an increasing
number of customers changing their database software to IBM Db2 for Linux, UNIX, and
Windows or moving their existing Db2 based infrastructure from on-premises into the cloud.
When moving to the cloud, a heterogeneous system copy is often needed because the
underlying hardware architecture and operating system change.
This book provides in-depth information about best practices and recommendations for the
source system database export, advanced migration techniques, database layout and
configuration, database import, and SAP NetWeaver Business Warehouse, along with
background information about Unicode. We summarize our recommendations in one chapter
that can be used as a quick reference for experienced migration consultants. The book
describes optimization strategies and best practices for migrating SAP systems to IBM Db2
for Linux, UNIX, and Windows. It is intended for experienced SAP migration consultants and
discusses IBM Db2® specific recommendations and best practices. It addresses advanced
SAP migration techniques and considerations for database layout and tuning, and presents
unique Db2 capabilities.
All techniques discussed within this book are based on extensive tests and experiences
collected from countless migration projects. However, it is important to understand that some
advanced optimizations described in this document may have side effects or may introduce
risks to the overall process due to their complexity. Other optimizations may require changes
to the production system. Therefore, these features must be chosen wisely and should be
used only if the migration downtime window makes them necessary.
We want this book to be as helpful as possible. If you would like to provide feedback on our
recommendations, or have suggestions or questions, you are welcome to contact the authors.
Authors
This book was produced by a team of specialists from around the world working at IBM
Redbooks, Austin Center.
Dino Quintero is a Systems Technology Architect with IBM® Redbooks®. He has 28 years of
experience with IBM Power technologies and solutions. Dino shares his technical computing
passion and expertise by leading teams developing technical content in the areas of
enterprise continuous availability, enterprise systems management, high-performance
computing (HPC), cloud computing, artificial intelligence (AI) (including machine and deep
learning), and cognitive solutions. He is a Certified Open Group Distinguished Technical
Specialist. Dino is formerly from the province of Chiriqui in Panama. Dino holds a Master of
Computing Information Systems degree and a Bachelor of Science degree in Computer
Science from Marist College.
Frank Becker joined IBM in 2001 as a student in business informatics. After his certification
as an SAP Technology Consultant for OS/DB migration, he acquired extensive skills in
homogeneous and heterogeneous SAP system copy projects around the world. This was
followed by combined upgrade and Unicode conversion projects as a technical analyst,
coach for the SAP Basis team, and member of the project management team. He was deeply
involved in proofs of concept regarding the IBM advanced SAP system copy procedure using IBM
InfoSphere® Data Replication Change Data Capture for downtime minimization. He holds a
degree in business informatics from the Berufsakademie Dresden (Germany).
Holger Hellmuth is a software engineer in the Db2 for LUW SAP Porting Team in Walldorf
(Germany). Previously he worked in technical marketing and sales support for SAP on Db2
for Linux, UNIX, and Windows at the IBM SAP International Competence Center in Walldorf.
He has about 30 years of experience with SAP and Db2 and has worked at IBM for 35 years.
He holds a degree in physical engineering from the University of Applied Sciences in Heilbronn
(Germany). His areas of expertise include Db2 for SAP software, Db2 backup and restore,
and the implementation of Db2 in modern hardware and software environments. He has
written extensively on the value of Db2 LUW for SAP NetWeaver-based implementations.
Joern Klauke leads the SAP on Db2 LUW development team. He joined IBM in 2008 and for
fourteen years gained experience as a software developer for SAP on Db2 for Linux, UNIX,
and Windows at the IBM Research and Development Lab in Böblingen (Germany) developing
the scripted interface and NX842 compression for Db2 backups and log archives. He also
worked as a support analyst for Db2 Advanced Support. Joern holds a degree in Computer
Science from the Martin-Luther-University of Halle (Germany).
Thomas Rech has been working in the Db2 area since 1996. Starting as a Db2 course
instructor, he soon moved to the SAP on Db2 environment in 1997. He joined the SAP/IBM
Db2 Development Team, followed by a role as technical sales consultant. He led several
lighthouse projects like the first Db2/SAP implementation on HP-UX, the internal SAP
implementation of Db2, and the world’s largest SAP Business Warehouse. He took over the
role of IT architect and team lead in the IBM SAP Db2 Center of Excellence, helping
customers in critical situations and everything around Db2 and SAP. Thomas now works as IT
architect in the Db2/SAP development team out of the IBM Research & Development
Laboratory in Böblingen/Germany. He holds a degree in computer science from the University
of Applied Sciences in Worms.
Alexander Seelert joined IBM in 1998 as a certified SAP Technology Consultant for OS/DB
Migration. He acquired deep skills in project management, migrations, and upgrades for all
kinds of SAP applications and databases. Together with selected clients, he used advanced
migration methods, such as the IBM Data Replication application and the SAP near-zero
downtime approach, to migrate complex systems to selected cloud hyperscalers. In 2022, he
became an IBM Technical Sales Specialist. In this role, he provides insight and individual
solutions for hybrid cloud scenarios.
Hans-Jürgen Zeltwanger is a member of the SAP on Db2 for Linux, UNIX, and Windows
development team at IBM Germany Research & Development. He joined IBM in 1997 and
has more than 25 years of experience with SAP and Db2. Hans-Jürgen started as a
developer for SAP Ready-to-Run systems, followed by nearly a decade as a pan-European
technical sales consultant. As team lead in the IBM SAP Db2 Center of Excellence,
Hans-Jürgen successfully planned and executed countless proofs of concept and SAP
migration projects. For many years now, he has been involved in the development of training
material for using Db2 as the database for SAP NetWeaver-based products and is leading the
development of the SAP on Db2 Learning Journey. Hans-Jürgen holds a degree in electrical
engineering from the Baden-Württemberg Cooperative State University and a degree in
industrial engineering from Pforzheim University.
Tim Simon is an IBM Redbooks Project Leader in Tulsa, Oklahoma, US. He has over 40
years of experience with IBM, primarily in a technical sales role working with customers to
help them create IBM solutions to solve their business problems. He holds a BS degree in
Math from Towson University in Maryland. He has worked with many IBM products and has
extensive experience creating customer solutions by using IBM Power, IBM Storage, and IBM
zSystems throughout his career.
Thanks to the following people for their contributions to this project:
Phil Downey
Technical Product Manager, IBM Research Discovery Tooling Research
IBM Australia
Karin Eberhart
SAP Db2 LUW Installation Tools, Data and AI Software
IBM Germany
Michael Hoffmann
BTS Automation Integration DACH Global Sales - Cloud Platform Sales
IBM Germany
Gürsad Kücük
Worldwide Customer/Partner Engagement Leader of Db2 and SAP Software
IBM Germany
Carola Langwald
SAP on Db2 LUW Development Support
IBM Germany
Thomas Matthä
Senior Development Engineer, SAP Certified Consultant
IBM Germany
Beck Tang
SAP Integration and Support, Hybrid Data Management, IBM Analytics IBM Hybrid Cloud
IBM Canada
Frank-Martin Haas
SAP SE
Karen Kuck
SAP SE
Jörg Schön
SAP SE
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
[email protected]
Mail your comments to:
IBM Corporation, IBM Redbooks
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
Chapter 1. Introduction
The information technology landscape is constantly changing, and one change that is now
prevalent in the industry is a growing interest in running workloads in the cloud. There are
many benefits that can be achieved by moving to the cloud. Often they are financial in nature,
but another consideration is that outsourcing the information technology mission frees up
company resources to focus on the primary business objectives.
Moving workloads to another platform, whether it is a new cloud platform or not, involves
migration of your data from your existing database environment to the new database. This
book provides some tips and techniques to optimize that migration when using Db2 in an SAP
environment.
1.1 Overview
SAP offers software for various hardware platforms and database systems. The procedure to
migrate an existing SAP system from one database system or operating system to another is
known as a heterogeneous system copy or OS/DB migration.
In addition to long-established platforms like IBM Power servers (with IBM AIX® as the
operating system) or servers running HP-UX, a new set of infrastructure platforms is gaining
more and more traction: the public cloud.
This is true not only for various cloud-native applications from SAP, but also for classic SAP
NetWeaver-based applications like Enterprise Resource Planning (ERP) or SAP Business
Warehouse (BW).
Those SAP NetWeaver based systems can be operated in the public cloud in an
Infrastructure as a Service (IaaS) model.
SAP has developed a set of tools that allows customers to export their source database in a
database independent format and import it into the target database. These tools are
integrated in the SAP Software Provisioning Manager (SWPM) as part of the Software
Logistics Toolset (SL Toolset). The same set of tools allows converting a non-Unicode SAP
system into a Unicode one. The tools also allow the declustering of SAP table clusters.
This book describes various optimizations and best practices for converting a Db2 database
to Unicode and for migrating an SAP system from a non-Db2 database to Db2. The focus is
on Db2 specific configuration and optimization options.
There are three main service models in cloud computing that handle the management for you:
Infrastructure-as-a-Service or IaaS
Platform-as-a-Service or PaaS
Software-as-a-Service or SaaS
The models differ in which layers are managed by the cloud provider. The differentiation for
each model is shown in Figure 1-1 on page 3.
IaaS
IaaS is defined as access to cloud-hosted computing infrastructure: server, storage, and
networking resources.
The user can provision the machines through the use of a graphical user interface (GUI) or by
using API calls. The cloud service provider hosts, manages, and maintains the hardware and
computing resources in its own data centers. IaaS is accessed using an internet connection,
and customers pay for the resources through a subscription or on a pay-as-you-go basis.
Typically, IaaS customers can choose between virtual machines or bare metal servers on
dedicated physical hardware.
The benefits of the IaaS model include the flexibility to build and scale resources as needed,
the avoidance of up-front expenses, and no need to plan capacity for workload spikes.
SAP Business Suite on Db2 is a good example of a workload for the IaaS model.
PaaS
Platform-as-a-service (PaaS) is another step further from on-premises infrastructure. In the
PaaS model a provider offers the services from the IaaS model and delivers this platform to
the user as an integrated solution, solution stack, or service through an internet connection.
The SAP Integration Suite, Red Hat OpenShift, and IBM Db2 Warehouse on Cloud are
examples of the PaaS model.
SaaS
The Software-as-a-service (SaaS) model, also known as cloud application services, is the
most comprehensive form of cloud computing service. It provides the services from IaaS and
PaaS and enhances those models with services for the application and data. The model
delivers an entire application that is managed by a provider, using a web browser. The
customer does not have to care about software updates, bug fixes, or maintenance for the
whole stack. You may already know SaaS applications as they are common. An example is
your email application where you access your mails using a web browser. Other examples are
SAP SuccessFactors or Red Hat Insights.
Since IBM Db2 with SAP NetWeaver and SAP NetWeaver-based products is available in an
IaaS model only, this book concentrates mainly on this model. A positive side effect is that all
of the findings and recommendations in this book are also valid for virtualized on-premises
environments and bare metal server implementations.
Before reading this publication you should already be familiar with the following tools and
terms:
SAP Software Provisioning Manager (SWPM)
SAP Migration Monitor and SAP Distribution Monitor
SAP Package Splitter
SAP table split tools like R3ta and SAPup
If you want to refresh your knowledge about the above tools, refer to the SAP documentation:
System Copy for SAP Systems Based on the Application Server ABAP of SAP NetWeaver
7.3 EHP1 to 7.52 on UNIX or System Copy for SAP Systems Based on the Application
Server ABAP of SAP NetWeaver 7.3 EHP1 to 7.52 on Windows.
Target Databases: SAP ASE; SAP MaxDB; Oracle; IBM Db2 for z/OS; IBM Db2 for
Linux, UNIX, and Windows; MS SQL Server
In addition to the essential tools for a heterogeneous system copy, other options exist that will
not be covered in detail and were not used while writing this book. Some examples are:
– The SAP Software Update Manager (SUM) includes a feature called the Database Migration Option (DMO).
– The IBM Db2 family is available on multiple platforms: IBM Db2 for z/OS®, IBM Db2 for
IBM i or IBM Db2 for Linux, UNIX, and Windows. In this book, we only cover IBM Db2
for Linux, UNIX, and Windows.
While exporting or importing, the SAP system must be offline. No user activity is permitted.
Usually, you allow for a weekend’s time frame to perform a heterogeneous system copy.
In cases where the system is large or when the time frame is tight, you must apply special
optimizations to the export and import process. This book describes those optimizations and
preferred practices available for IBM Db2.
Even if your source database is not Db2, most of the chapters in this book will provide
valuable information. For the export, you may need to consult documentation for the specific
source database type and combine that information with the information about Db2 that you
find in this book.
This chapter provides the foundations that can be used to plan and size your environment for
a cloud implementation using Db2 as the database engine.
For details about the prerequisites, supported virtual machines, and bare-metal server types
that are available with SAP, see the following SAP Notes:
– 2552731 - SAP Applications on Alibaba Cloud: Supported Products and IaaS VM
Types
– 1600156 - DB6: Support statement for Db2 on Amazon Web Services
– 2233094 - DB6: SAP Applications on Azure Using IBM Db2 for Linux, UNIX, and
Windows - Additional Information
– 2927211 - SAP Applications on IBM Cloud Virtual Private Cloud (VPC) Infrastructure
environment
– 2414097 - SAP Applications on IBM Cloud Classic Infrastructure environment
– 2855850 - SAP Applications on IBM Power Virtual Servers
– 2456432 - SAP Applications on Google Cloud: Supported Products and Google Cloud
machine types
– 3000343 - SAP Applications on Google Cloud: Supported Products on Google Cloud
Bare Metal Solutions
The option to use massively parallel processing (MPP), also known as the Database
Partitioning Feature (DPF), is also supported when running Db2 in the cloud.
A notable exception from this list of fully supported features is the use of cluster managers
for high availability solutions. The use of cluster managers requires a documented blueprint
from the cloud service provider.
For Microsoft Azure and Amazon Web Services, the Pacemaker cluster manager is
documented and is the recommended solution. For the Google Cloud Platform, this cluster
manager is not certified yet, but certification is planned. Check whether the certification is
available by the time you read this book.
Common for all Db2 and SAP-supported cluster managers on the different cloud platforms is
that they are based on the Db2 High Availability Disaster Recovery (HADR) feature. HADR
provides a high availability solution for both partial and complete site failures. HADR protects
against data loss by replicating data changes from a source database (“primary database”) to
the target databases (“standby databases”).
Cluster managers add the functionality of automated failover to the standby database in case
of a planned or unplanned outage of the primary database server. The HADR functionality
can be used for migration to the cloud infrastructure as well. In section 11.2, “Db2 HADR” on
page 186, we will discuss the option to run the primary server on-premises and build the
standby database server in the cloud environment. This option reduces the required
downtime to a minimum.
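As an illustration only, the following sketch shows the basic Db2 commands involved in such a setup. The database name PRD, the instance name db2prd, the host names, and the port number are placeholders, and the complete HADR configuration (log archiving, HADR_TARGET_LIST, and so on) depends on your Db2 release and landscape.
# On the cloud standby server, after restoring a backup of the on-premises database
db2 "UPDATE DB CFG FOR PRD USING HADR_LOCAL_HOST standby.cloud.example HADR_LOCAL_SVC 51012"
db2 "UPDATE DB CFG FOR PRD USING HADR_REMOTE_HOST primary.onprem.example HADR_REMOTE_SVC 51012 HADR_REMOTE_INST db2prd"
db2 "UPDATE DB CFG FOR PRD USING HADR_SYNCMODE ASYNC"
db2 "START HADR ON DB PRD AS STANDBY"
# On the on-premises primary server
db2 "START HADR ON DB PRD AS PRIMARY"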
The Db2 pureScale® feature is a shared disk cluster with multiple active database servers
that delivers high-availability and scalability functionality. IBM Db2 pureScale is not currently
supported by any of the cloud vendors in combination with SAP applications.
When you move your SAP NetWeaver-based infrastructure to a cloud service provider, you
have a series of VM instances or bare metal servers available that are certified for SAP. You
are responsible for choosing which instance to use as the instances differ in CPU and
memory configuration. In addition to the CPU/memory configuration, multiple types of disks
are available for your storage configuration.
The question then is how to choose a machine type so that your system performance meets
the requirements. To determine the correct configuration, an SAP sizing is required.
Customers perform the sizing themselves. You can find SAP NetWeaver sizing information
at https://www.sap.com/sizing. One tool that is often used to size the SAP landscape is the
SAP Quick Sizer.
As a result of this sizing process, you will get an SAP Application Performance Standard
(SAPS) number. The SAPS number for each of the certified virtual machines and bare-metal
servers is documented in the SAP Notes mentioned in section 2.1, “SAP in the cloud” on
page 24. Also included in those documents are the infrastructure characteristics of the
servers, including the number of CPUs and memory.
Table 2-1 provides some sample configurations derived from SAP Note 2927211 - SAP
Applications on IBM Cloud Virtual Private Cloud (VPC) Infrastructure environment.
Profile          vCPUs    Memory (GB)    SAPS
mx2(d)-8x64      8        64             10,283
bx2(d)-2x8       2        8              2,306
With the results of the SAP sizing and the SAPS information in the SAP Notes for each cloud
service provider, you can map the SAP landscapes to the correct virtual machines.
If you have an on-premises system on an operating system that is also supported by the
cloud service provider, a good sizing method is based on data collected from the existing
system that will be moved to the cloud. Make sure that your cloud-based environment
provides at least the same SAPS capacity as your existing system.
To gather this sizing data, you can use operating system tools such as iostat and vmstat, or
the performance monitor on Windows operating systems. In addition, several third-party tools
exist for monitoring and collecting data. Because there is a large variety of tools available, we
have not attempted to list them all in this book.
The data we used to write this book was gathered using the free-of-charge tool Nigel's
Performance Monitor (nmon), developed by Nigel Griffiths, which is a standard tool that comes
with the AIX operating system. For Linux environments, it can be downloaded from SourceForge.
The tool can collect data at defined intervals over a period of time, and the collected data
can be processed with Microsoft Excel to visualize it graphically, as shown in Figure 2-1.
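For example, a typical nmon collection for a full business day might look like the following sketch; the interval and snapshot count are only an illustration and should be adapted to the period you want to capture.
# Write one snapshot every 60 seconds, 1440 snapshots (24 hours), to a file for later analysis in Excel
nmon -f -s 60 -c 1440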
No matter which tool you use to collect the system resource data, you need to ensure that the
resource usage is collected during peak workloads. The data should be collected and
analyzed during a high workload period, for example, during a quarter-end closing process.
The nmon tool and other comparable tools do not provide SAPS numbers, but after you have
gathered the necessary CPU and memory requirements, you can map them to the certified
cloud instances that are offered by your cloud vendor.
The cloud service providers offer different options for storage that range from classic
hard-disk drives (HDD) to solid-state disk drives (SSD). The hard-disk drive offerings are
usually cost-effective entry-level resources that typically are not feasible for SAP NetWeaver
database servers and may only be used for test or education systems.
Storage that is generally offered for SAP NetWeaver-based systems is usually SSD based
storage devices for the database server. Other storage types may be used for backup, and
temporary or archived data.
Storage services offered by the cloud service providers have different names and
characteristics. The following list provides examples of the different storage services that are
typically used for SAP applications:
– Azure: Ultra Disks, Premium SSD, Standard SSD or Standard HDD
– AWS: General Purpose SSD, Provisioned IOPS SSD, Throughput Optimized HDD or
Cold HDD.
– GCP: Persistent Disks with the flavors Standard, Balanced, SSD and Extreme.
– IBM Cloud®: three different IOPS tiers or customized IOPS profiles.
The offerings for each cloud vendor also differ in characteristics such as:
• Availability options
• Replication capabilities
• Storage encryption
• Snapshot capabilities
You need to carefully compare these details when comparing different cloud service
providers.
The most important storage characteristic for Db2 and other relational database management
systems is the number of input/output operations per second (IOPS). This applies especially
to online transaction processing (OLTP) workloads like SAP Business Suite.
Again, the different cloud service providers document the IOPS performance differently, both
across vendors and sometimes within the same vendor for the different classes of storage.
Some of these are:
• IOPS per gigabyte of capacity
• IOPS per volume.
For example:
• If you provision a disk in the IBM Cloud, you can choose between three, five, or ten
IOPS/GB.
• On Microsoft Azure, you can provision Standard SSD disks with up to 6,000 IOPS,
depending on the size of your disk.
• Also on Microsoft Azure, you can provision Ultra Disks, which offer up to 160,000
IOPS, depending on the size of your disk.
Be aware of the fact that the IOPS performance often increases with the allocated capacity.
Table 2-2 shows an example of the correlation between capacity and IOPS. The details and
ranges differ somewhat between cloud service providers, but in general the concept remains
the same.
Capacity (GB)    IOPS range
10-99            100-1,000
100-199          100-2,000
200-499          100-4,000
500-999          100-10,000
During the sizing process, you may end up in a situation where you need to allocate more
storage capacity than is required to store the data, due to the IOPS requirements. For example,
using the values in Table 2-2, if your database size is 700 GB with 20,000 IOPS required, you
need to allocate two volumes (even though a single volume would provide enough disk
space) to meet your IOPS requirements.
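The arithmetic behind this example can be sketched as follows, assuming the 500-999 GB tier from Table 2-2 with a maximum of 10,000 IOPS per volume.
# 700 GB of data, 20,000 IOPS required, at most 10,000 IOPS per volume in this tier
# The volume count is driven by IOPS, not by capacity: ceil(20000 / 10000) = 2 volumes
echo $(( (20000 + 10000 - 1) / 10000 ))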
A good aid for IOPS sizing is history monitoring of your source system. If you monitor the
IOPS – ideally separately for log volumes and data volumes – over a longer period, you can
assess the real number of IOPS required when you move to the cloud. The advantage of
IOPS monitoring is that the numbers are comparable independent of the operating system.
If you plan to move from AIX on-premises to an IaaS model on Linux, the historical IOPS
data will be similar to the IOPS you will generate in the cloud.
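On a Linux source system, for example, the raw numbers can be captured with standard tools such as iostat; the interval, sample count, and output file below are only an illustration, and the reads and writes per second (r/s and w/s) per device approximate the IOPS.
# Extended device statistics every 60 seconds for 24 hours, written to a file for later analysis
iostat -x 60 1440 > iostat_peak_period.txt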
3.1 Introduction
The migration or heterogeneous system copy to the IaaS model requires a downtime of the
system. In this book we will show you features and options to minimize the system downtime
during this process. While various Db2 features can improve performance, the most impactful
optimizations come from ensuring that you are effectively using the resources available to you
through the optimal scheduling of the R3load processes.
The outcome of our tests, along with information gathered in consulting on customer
migrations, is a series of tips and recommendations. We have separated the
recommendations into three areas:
1. A general set that is true for each heterogeneous system copy.
2. A set of recommendations for standard migrations.
3. Recommendations for downtime critical or advanced system copies that require tuning.
The SAP tools for heterogeneous system copies evolve over time and new functionality may
be added. For the most recent information and recommendations, please refer to SAP Note
3311643 - DB6: Recommendations for migrations or system copies.
further optimization options. For more information see section 3.3, “Resource usage
during the migration” on page 34.
3. Choose a suitable number for the concurrently running R3load import and export
processes. A good start for both is 80%-100% of the number of CPUs on the export or
import server. For more information see Chapter 5, “Export optimization techniques” on
page 65 and section 6.6, “Import optimization” on page 106.
4. Perform one or more rehearsal migrations, execute the migtime package and analyze the
bottlenecks of the rehearsal migration. A good source of monitoring information is the
resource usage section at the end of each R3load log file. For more information about this
see section 4.4, “Time analyzer (MIGTIME)” on page 47.
5. Run the migration check reports SMIGR_CHECK_DB6 and SMIGR_CREATE_DDL prior
to the migration and verify the correctness of the objects in the source database. For more
information see Chapter 4, “Tools overview” on page 43 and Chapter 6, “Basic Db2 layout
and configuration options” on page 85.
For more information see Chapter 6, “Basic Db2 layout and configuration options” on
page 85.
3. After each rehearsal migration, use the self-tuning memory management (STMM)
optimized parameter settings from that run as the starting values for the next rehearsal or
the final migration (a way to capture these values is sketched after this list).
For more information see section 6.14.1, “Using self-tuning memory management
(STMM)” on page 127.
4. Export the data unsorted whenever possible.
For more information see section 5.2, “Unsorted versus sorted export” on page 66.
5. Use the following R3load options during the import:
– COMPRESS_ALL - Use Db2 automatic dictionary creation to enable Db2
compression.
– ANY_ORDER - The migration throughput might show only a low single-digit
improvement, but it does not affect any migration negatively.
– -nolog - Use this option as it reduces logging I/O for small tables.
Important: Do not use the -nolog parameter of R3load when using parallel import with
INSERT.
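As a sketch of how the STMM-tuned values from a rehearsal run (item 3 above) can be captured and reused as starting values for the next run, you can display the currently active configuration values; the instance name db2prd and the database name PRD are placeholders.
# Show the currently active (STMM-adjusted) values next to the stored configuration values
db2 attach to db2prd
db2 "GET DB CFG FOR PRD SHOW DETAIL" > db_cfg_after_rehearsal.txt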
1. Use a dedicated application server for the export. Exporting the system consumes a lot of
CPU resources. The workload is not only database related but may also include CPU
usage for codepage conversion, export dump file compression or declustering. If you
decide to use dedicated application servers for the export and are on Db2 11.5.7.0 SAP5
or higher, enable the Db2 “query block prefetch” feature. More details about the “query
block prefetch” feature can be found in the SAP Blog: “New Query Block Prefetch Feature
with Db2 11.5.6" from Frank-Martin Haas. For more information see section 5.5, “Export
server scenarios – local or remote” on page 75.
2. Preallocate the tablespaces. We recommend that you analyze the tablespace sizes after a
first test migration and then create the tablespaces manually with the initial size of the
tablespaces. You can combine this feature with setting the Db2 registry variable
DB2_USE_FAST_PREALLOCATION = OFF to minimize fragmentation on the file system
level as much as possible. For more information see section 6.4.8, “Preallocation of
tablespaces” on page 93.
3. Use the environment variable DB6LOAD_CPU_PARALLELISM to optimize the import of
selected large tables. The R3load default CPU parallelism of 4 is a good balance between
overall performance and resource usage. You may set this environment variable to a larger
value of 8-12 in a dedicated migration monitor environment for the largest tables (see the
command sketch after this list). The import of those tables may improve by about 10%,
while most of the tables are still imported with a balanced configuration. Using the higher
value for all tables may result in a similar performance increase but with increased CPU
usage. For more information see section 6.7.2, “DB6LOAD_CPU_PARALLELISM” on page 111.
4. Optimize Db2 statistics collection. SWPM does not enable Db2 automatic RUNSTATS for
the database until the import is completed. A simple method of optimization is to enable
the automatic RUNSTATS feature when the import starts. This may add a few additional
CPU cycles and I/O operations on the target but can be beneficial. Use the advanced
statistics collection feature with the db2look utility only if required and for a selected set of
tables. For more information see section 6.12, “Optimizing statistics collection” on
page 123.
5. Optimize any table splits. If you are using table splitting, use a reasonable number of splits
for the export and import, in the range of 10-20 splits per table or up to 100 GB of data
volume per table split, and ensure that you set an appropriate number of concurrent
R3load processes in the orderby.txt file. The total number of concurrent R3load processes
for the split tables and the remaining packages should not exceed the number of available
CPUs or vCPUs. Use the R3load parameter “SPLITTED_LOAD” to import the tables,
together with the environment variable DB6LOAD_TEMP_TBSPACE and without the
R3load option “DEF_CRT”. We strongly recommend that you read section 7.1, “Socket
transfer and table split overview” on page 134 carefully and assess the feasibility of the
R3load option “LOAD_FORCED” and the usage of parallel INSERT.
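The following sketch summarizes the commands behind items 2 to 4 above; the database name PRD and the parallelism value of 8 are placeholders, and the registry change requires an instance restart to take effect.
# Item 2: minimize file system fragmentation during tablespace preallocation
db2set DB2_USE_FAST_PREALLOCATION=OFF
# Item 3: raise the LOAD CPU parallelism in a dedicated migration monitor environment for the largest tables
export DB6LOAD_CPU_PARALLELISM=8
# Item 4: enable automatic RUNSTATS on the target database when the import starts
db2 "UPDATE DB CFG FOR PRD USING AUTO_MAINT ON AUTO_TBL_MAINT ON AUTO_RUNSTATS ON"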
After our intensive work with the various tuning options for heterogeneous system copies, we
have found options that are either specific to a certain aspect of the process or have a
significant impact on the process flow. Some of them come with the promise of significant
improvements. We strongly recommend that you read the full details in this book if you are
considering the use of these optimizations to fully understand the boundary conditions and
side effects:
Note: Please do not confuse extent size with page size. Do not use a different page
size for the tablespace.
The R3load socket option is an alternative to table splitting. It does not write the compressed
dump files to disk and therefore saves I/O and CPU resources for dump file compression.
Consider this option for tables with LOB fields that do not compress well, to limit the
resulting increase in network bandwidth usage.
As with other computer systems, the main resource for a migration is the available compute
power which is determined by the number and performance of the available central
processing units (CPU), the available main memory and the throughput of write and read
operations to and from disks. Another important resource is the network connection and its
capability to transport data from the source to the target system.
The available resources must be used in the best possible way. An optimal resource usage
means:
Utilize but do not over-utilize your resources.
Ensure the constant utilization of resources.
Balance the resources (CPU, IO, memory, and network).
Optimize the process to achieve the best overall throughput and do not perform local
optimizations.
Avoid unnecessary steps, tasks, and resource consumption.
Utilize resources that are beneficial for a certain step in the migration.
This indicates that the storage subsystem is heavily loaded and there is no headroom for
peak workloads. If the disk is constantly 100% busy, disk queuing may increase, leading to
increased load on the system due to longer wait times, and this may slow down overall
performance.
The same is true if the CPU or network bandwidth are constantly at or around 100%
utilization. Figure 3-2 on page 36 shows CPU resources being used 100% over a long period.
The graph also shows that the number of blocked processes is high between 08:30 and
10:00. Considering these facts together indicates an overloaded system where decreasing
the workload will most likely help to improve overall performance.
The information in Figure 3-2 is a good example of a system that is overloaded by too many
parallel R3load processes. You might be tempted to start many, sometimes even hundreds of
R3load processes in parallel, but a good rule of thumb is not to start more R3load import
processes than 80% - 100% of the number of CPUs available.
One option to add resources to the migration is to run parts of the export or import on an
additional machine. A typical scenario could be a migration from an older generation of
machine on-premises with an import to a more modern machine in the cloud. In this case, it
could be beneficial to use the target machine also to execute parts of the export. Of course,
you must schedule the export in such a way that it does not overload the network bandwidth.
Let’s have a look at an example: in one of our customer projects, we found that a large z-table
determined the overall migration time of 27 hours.
Figure 3-4 shows the import times of the table EDI40 (of the same migration) that was split
into 60 parts. The import runtime varies between two hours and four hours. This means that
the import runtime varies by a factor of 2 although the imports were almost identical in the
number of records.
Further analysis showed that the system was short on CPU during a large portion of the time
and that the import of the large z-table ran alone for several hours. So, the number of parallel
R3load processes was reduced to free CPU resources for the z-table import. The import
times for the EDI40 table increased, but the large z-table import was expedited. As a result,
the duration of the full migration was reduced from 27 hours to 24 hours, allowing the
downtime target to be met.
Some of the most important examples have been mentioned in this section already but you
will find more details later in this book.
If you use the R3load option LOAD_FORCED together with a split export, Db2 loads each of
the parts sequentially into the target database. If you choose to create the indexes on the
table before the data is loaded, the complete index is rebuilt after each part of the table is
loaded. Db2 will eventually switch from this rebuild to an incremental mode based on internal
calculations; however, a rebuild happens several times. To avoid this, either create the indexes
after the LOAD completes or use the incremental indexing mode.
Another example is the buffer pool flush after each LOAD operation completes. You cannot
prevent these flushes from happening, but you can decrease their negative impact by allocating
a smaller buffer pool. This results in a much smaller number of pages to be flushed and is
therefore more effective.
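As an illustration, the buffer pool can be fixed temporarily at a smaller size during the import phase; the buffer pool name IBMDEFAULTBP and the size of 100,000 pages are assumptions that must be adapted to your system, and with STMM the original automatic sizing can be restored afterward.
# Temporarily fix the buffer pool at a smaller size so that fewer pages must be flushed after each LOAD
db2 "ALTER BUFFERPOOL IBMDEFAULTBP IMMEDIATE SIZE 100000"
# After the import, hand the sizing back to STMM
db2 "ALTER BUFFERPOOL IBMDEFAULTBP SIZE AUTOMATIC"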
3.3.6 Utilize the most suitable resources for the correct purpose
The largest part of this publication is about this optimization. It is no surprise that the best way
to fulfill the migration tasks is using the correct resources. To achieve this correct resource
usage, a good knowledge of Db2 and R3load is required, and this is what this book wants to
address.
Table 3-1 gives you some guidance as to the major resources used by the various steps of
the migration process.
Data (De)compression x
Codepage Conversion x
Data Fetch x x
Endianness Conversion x
Unicode Conversion x
Index Create x x x
Optional data decompression. The source database may use data compression
features, and so the data needs to be decompressed when it is exported with R3load.
Data fetch. This process transfers the data from the database to the R3load application.
The networking resource usage depends on where you run the R3load export. If you run
the R3load export on a dedicated application server or the target database server, data
needs to be transferred over the network.
Codepage conversion from UTF-8 to UTF-16. Db2 stores data in UTF-8 format while
SAP applications use UTF-16. This codepage conversion can be executed quickly, but
requires a small amount of CPU resources.
Optional endianness conversion. This conversion step is required if you move from a
system with big-endian processor architecture like IBM POWER® to X86 based Linux
systems with little-endian processor architecture. During R3load processing you can
switch the endianness by specifying the appropriate codepages (4102 or 4103). This
endianness conversion can be done during the export or the import.
Optional declustering and depooling of data. During the export, you can decluster
table clusters or depool pool tables. For details, check SAP Note 2227432.
Optional Unicode conversion. If your system is still non-Unicode, you can convert to
Unicode during the export. This step cannot be executed during the import.
R3load compresses the dump files to save space for the export. The compression
rates depend on the table content and structure. Typically the export dump is only 10% of
the data table size in the database.
R3load writes the data into the export dump files.
When the export of a table or package is completed, the data is moved or copied to the target
system. In case of a migration to the cloud, this may be through the use of a shared file
system or data transfer tools. This step usually impacts network resources and to some
extent the storage resources as well.
An important fact about the usage of R3load is that the executable is single threaded.
Therefore, some CPU resource intensive steps cannot easily be scaled. A good example is
the compression and decompression of the export dump files. The only option to scale those
steps is to split the packages and tables into multiple R3load processes. For many of the
database related steps, Db2 can use symmetric-multi-threading (SMT) and parallelization
within the database management system. One example is the use of Db2 intra-partition
parallelism during the export or index creation.
More examples:
Memory optimization: One of the most prominent examples of optimal resource usage
is the efficient use of memory. As discussed previously, a smaller buffer pool can help to
avoid the impact of buffer pool flushes. Furthermore, a smaller buffer pool will free up
memory that can be used for sort operations. It makes a significant difference whether you
sort in memory or spill intermediate objects to disk.
Optimal CPU usage (optimal in the sense of the ratio of CPU usage to throughput): A
Db2 load can use all the CPUs on the host. Using a CPU parallelism of 16 can ensure
better throughput than a CPU parallelism of four. However, the benefit is typically only in
the range of a 3-5% improvement. Therefore, by default R3load uses a CPU parallelism of
four to achieve optimal performance with a given CPU resource usage.
Efficient use of network resources: If you decide to use a dedicated host for exporting
data, you should consider using the Db2 query prefetch feature. By doing so, you avoid
unnecessary time spent in waiting for the database to send data. On the other hand, the
optimized data transfer between the Db2 server and the remote R3load might reduce the
bandwidth available between your on-premises source and the migration target.
Therefore, you may consider using additional local servers for the export instead of the
target machine.
There are many more examples of optimal resource usage which you will find in the next
chapters of this publication.
As a first approach, you can use the migration monitor as it is started by the SAP Software
Provisioning Manager (SWPM) as part of the SAP SL Toolset 1.0.
One of the most powerful options of the migration monitor is the “orderBy” configuration
option. It allows you to schedule different packages or tables with a different number of
R3load jobs. This configuration option is often used for table splits during export and import.
In addition, you can also use different load arguments for different packages.
You can use the “orderBy” configuration option to achieve different optimization goals.
Start a package or table prior to other tables and with different options. For example, you
can use INSERT instead of the Db2 LOAD feature for a table containing large LOB fields.
This allows you to avoid interference with the other processes due to logging.
Control the number of processes per package or table. This configuration setting is often
used for table splits.
Keep in mind that the number of total processes is the sum of the jobs defined in the
“import_monitor_cmd.properties” file and the configuration in the “orderBy” file.
Specify different arguments for export or import.
The following example shows the usage of different load arguments for a specific table. The
“import_monitor_cmd.properties” file specifies the default load arguments for the process.
This is shown in Example 3-1.
orderBy=/export/orderby.txt
In this configuration, the migration monitor uses 20 parallel R3load processes to import
packages and the Db2 compression feature OPT_COMPRESS to load the data into the
target database. In addition, the migration monitor uses the information in the file
“/export/orderby.txt” to import the data in a specific order.
The orderby.txt file contains the following information for the table SOFFCONT1. This is
shown in Example 3-2.
jobNum=1
This configuration option sets the number of parallel R3load jobs for this table to 1 and does
not use the OPT_COMPRESS feature for this table. With this option, the table might not be
fully compressed on the target, but the additional time for the two-step compression is
eliminated.
Note: The examples do not show a complete configuration but only the parts required to
illustrate the use case.
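Putting the two fragments together, a fuller, but still purely illustrative, sketch could look like the following. The property names jobNum, loadArgs, and orderBy are documented in the Migration Monitor Guide, while the concrete loadArgs values and the section syntax of the orderby.txt file shown here are assumptions that must be verified against your R3load version and the guide.
# import_monitor_cmd.properties (excerpt): 20 parallel R3load jobs and OPT_COMPRESS as the default
jobNum=20
loadArgs=-loadprocedure fast LOAD:OPT_COMPRESS
orderBy=/export/orderby.txt
# /export/orderby.txt (excerpt): one job and no OPT_COMPRESS for table SOFFCONT1
[SOFFCONT1]
jobNum=1
loadArgs=-loadprocedure fast LOAD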
Using the “orderBy” file to configure and schedule the R3load processes is often enough to
achieve a decent migration throughput. In some cases, it can be beneficial to run multiple
instances of the migration monitor. To do so, you must set up the multiple instances manually
and configure them accordingly.
You can use multiple migration monitor instances to achieve different optimization goals.
Start a package or table first.
The migration process starts when the specific instance of the migration monitor is
started. So, a simple form of scheduling is to start the instances of the migration monitor
manually at different times. For example, it is possible to start the migration process for all
small tables after the large tables are completed.
You need to assess if the potential increase in network usage, and the competition for
resources between the export process and the import process provides a benefit to the
overall migration. An enhancement of this concept is to use dedicated application servers to
either run import or export processes. The application servers would be available on a
temporary basis and the SAP applications are never started on the application server. Only
the migration monitor and the R3load processes should be run on this server. If you are
utilizing a cloud infrastructure, the flexibility of that cloud environment provides the ability to
easily create these temporary servers.
Using a server for the export other than the database server can be beneficial if the original
database server hardware is older because the single thread performance of older CPUs is
poor compared to newer machines. Also use a different server if the number of CPUs on the
export database server is limited. This allows the offloading of the R3load specific processing
of codepage conversion or dump file compression, leaving only the database workload
running on the original database server.
Figure 3-5 on page 43 shows a possible setup of multiple migration monitor instances.
This chapter is not intended to replace the SAP training or other available documentation. The
chapter will introduce the tools that are required for the Db2 optimizations and will provide
some pointers in addition to hints and tricks that we found during our tests.
To simplify the migration process, the SAP installation tool provides software life-cycle
options. For example, the following options are available for a source system during a
heterogeneous system migration:
Export preparation
Table splitting preparation
Database instance export
SWPM does not use the currently installed kernel binaries; instead, SWPM uses the
embedded LOADTOOLS to guarantee matching versions during the entire system copy
process. Also, a new replacement for the R3ta utility has been introduced, called
SAPuptool. This tool is used during table splits for SAP release 7.4 or higher because it is
much faster at calculating where to split a table.
Note: LOADTOOLS support starts with SAP release 7.4 and will not be used in SAP
release 7.31 or earlier.
If you need to manually update LOADTOOLS, check the following SAP notes for the most
recent information:
– 2472835 - Embedding load tools in Software Upgrade and SWPM
– 2557361 - How to update R3load, R3szchk and R3ldctl for SUM and SWPM
R3load
R3load is the “core” migration tool. It exports SAP ABAP table data from the source database
and imports it into the target database. In addition to this base functionality, it offers advanced
options. For example, it supports Db2 adaptive compression while importing data. To ensure
the functionality, R3load reads and writes several files that contain metadata. The functions
provided by R3load and its metadata files are shown in Figure 4-1.
R3ldctl
This tool unloads SAP ABAP data dictionary structures from the source system. It generates
structure files (*.STR files) that describe the definition of tables, indexes, and views. In
addition, it generates more files that are required during the migration:
Database specific template files (*.TPL files) with definitions of DDL statements (for
example, statements to create a table, to drop a table, etc.).
In the case of declustering/depooling during the export, R3ldctl generates so-called logical
files during the preparation so that the import can create the new structures accordingly.
The files ending with “.WHR” can include SQL statements to export only chunks of a table
and are described in detail in section 7.3, “Table splitting” on page 139.
The task files (*.TSK) control the tasks of the utility and log the information about
completed or failed tasks.
The table of content file (*.TOC) stores the information that is used by R3load to find the
data offset and length of the data portion in a dump file.
Important: When performing production migrations, the R3ldctl tool must be executed
while the SAP application is offline and isolated to avoid potential loss of data.
R3szchk
R3szchk computes the size of tables and indexes for the target database.
The functions provided by R3ldctl and R3szchk are shown in Figure 4-2.
R3szchk may run for a long time because, in some cases, the target database size is
calculated using "select count(*)" statements against all tables. This is the case if the
database management system is changed during the migration. If you experience long
R3szchk runtimes, check whether R3szchk is issuing "select count(*)" statements against the
tables. You can use db2top or other database monitoring tools to inspect the in-flight
statements.
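If db2top is not available, one option is to query the in-flight statements directly. The
following sketch uses the SYSIBMADM.MON_CURRENT_SQL administrative view; the exact column set
may vary between Db2 versions:

-- List currently executing statements, longest running first
SELECT APPLICATION_HANDLE, ELAPSED_TIME_SEC,
       SUBSTR(STMT_TEXT, 1, 80) AS STATEMENT
FROM SYSIBMADM.MON_CURRENT_SQL
ORDER BY ELAPSED_TIME_SEC DESC

Long-running statements of the form "select count(*) from <table>" issued by R3szchk should
show up at the top of this list.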
Note: R3szchk requires the STR files created by R3ldctl to calculate the target size. If you
run R3szchk and R3ldctl while the system is up and running, you must not make any
data definition changes (for example, by importing new transports), because these changes
will not be reflected in the STR files and the export will be inconsistent or will fail.
The tool is fully integrated into the system copy procedure of the Software Provisioning
Manager as part of the SAP SL Toolset 1.0. Although it is part of the system copy tools, the
migration monitor is an important component, and you should be familiar with the tool and
how to use it as part of the Software Provisioning Manager and when it is used manually.
The tool is described in SAP Note 784118 - System Copy Tools for ABAP Systems. Its
parameters, functions and control files are explained in more detail in the Migration Monitor
Guide that you can find in the SAP Help Portal.
After extracting the archive, up-to-date documentation is available in the file "TimeAnalyzer.pdf".
You will find all the necessary information about how to start the scripts and choose the
parameters in that documentation. Therefore, we only present some examples of the analysis
output.
The result of every analysis is a text file and optionally an HTML file with the graphical
representation and an XML file as an input file for the time-join script.
The following examples are taken from an export_time.txt file gathered by the
export_time-script.
This file has two parts. The first part, which is shown in Example 4-1, relates to the package
export times. The second part relates to the table export times.
-----------------------------------------------------------------------------------------
package time start date end date size MB MB/min
-----------------------------------------------------------------------------------------
VBFA-1 6:55:16 2021-06-03 16:54 2021-06-03 23:49 1688.43 4.07
0002_ZSDRRP 5:07:40 2021-06-03 16:53 2021-06-03 22:01 8352.69 27.15
VBAP-1 4:58:21 2021-06-03 16:54 2021-06-03 21:52 2939.31 9.85
CFIN_ACCIT_APP-1 4:55:12 2021-06-03 16:54 2021-06-03 21:49 3739.24 12.67
VBFA-80 4:25:56 2021-06-03 19:42 2021-06-04 00:08 1737.28 6.53
0057_APQD 4:16:53 2021-06-03 21:29 2021-06-04 01:46 6880.56 26.78
EDIDS-1 4:13:01 2021-06-03 16:54 2021-06-03 21:07 2950.09 11.66
0017_EDI40 4:06:57 2021-06-03 19:06 2021-06-03 23:13 58818.55 238.18
CFIN_ACCCR-1 4:06:15 2021-06-03 16:54 2021-06-03 21:00 3298.64 13.40
NAST-1 4:01:08 2021-06-03 16:54 2021-06-03 20:55 3350.71 13.90
0009_VBRK 4:00:46 2021-06-03 19:01 2021-06-03 23:02 15866.39 65.90
The list is sorted in descending order by export time, so you get a quick overview
of which packages need the most time.
Example 4-2 shows the table export times shown in the second part of the file. The top, h or
m options of the export_time script control how many tables are shown.
Using this information, you can easily identify the tables that contribute the most to the export
time of a package consisting of more than one table. You then may consider further export
tuning activities for those tables.
You get the graphical representation of the first part of the export_time.txt when you use the
HTML option of the export_time script to generate a Package Time Diagram. Figure 4-3
shows a part of such a diagram.
The diagram is sorted by the start time of the exported package and therefore provides a
chronological overview of the export. The length of the bars represents the export time.
This file has two parts. The first part relates to the package import times. The second part
relates to the table import times. Part one is shown in Example 4-3.
----------------------------------------------------------------------------------------
package time start date end date size MB MB/min
----------------------------------------------------------------------------------------
VBOX__DT 17:49:36 2021-06-04 14:04 2021-06-05 07:53
VBFA__DT 8:14:31 2021-06-04 14:29 2021-06-04 22:44
VBRP__DT 5:08:06 2021-06-04 13:36 2021-06-04 18:44
FAGLFLEXA__DT 3:45:54 2021-06-04 02:38 2021-06-04 06:23
COEP__DT 3:18:29 2021-06-04 11:50 2021-06-04 15:09
0014_JEST 3:10:52 2021-06-03 21:15 2021-06-04 00:25 6564.81 34.39
0009_VBRK 2:22:07 2021-06-03 22:09 2021-06-04 00:31 15866.39 111.64
VBFS__DT 2:19:10 2021-06-04 04:25 2021-06-04 06:44
0002_ZSDRRP 2:14:27 2021-06-03 21:02 2021-06-03 23:17 8352.69 62.12
VBPA__DT 2:04:08 2021-06-04 03:18 2021-06-04 05:22
0011_LIKP 1:58:41 2021-06-03 22:16 2021-06-04 00:14 10719.99 90.32
0010_BSAS 1:49:29 2021-06-03 21:54 2021-06-03 23:44 10980.55 100.29
CDHDR__DT 1:43:31 2021-06-03 22:27 2021-06-04 00:10
GLPCA__DT 1:43:24 2021-06-04 03:16 2021-06-04 04:59
0013_VRPMA 1:39:24 2021-06-03 20:41 2021-06-03 22:20 5672.43 57.07
NAST__DT 1:39:19 2021-06-04 02:59 2021-06-04 04:38
CE1CCIP__DT 1:38:05 2021-06-04 07:54 2021-06-04 09:32
VAPMA__DT 1:37:12 2021-06-03 21:56 2021-06-03 23:34
0020_VLPMA 1:35:45 2021-06-03 20:33 2021-06-03 22:09 7322.53 76.48
0018_CHVW 1:35:05 2021-06-03 21:17 2021-06-03 22:52 7828.94 82.34
0001_S502 1:35:02 2021-06-03 19:33 2021-06-03 21:08 7289.13 76.70
BSIS__DT 1:34:51 2021-06-04 04:00 2021-06-04 05:35
0007_S033 1:31:44 2021-06-03 20:22 2021-06-03 21:54 7432.97 81.03
0006_LTAP 1:30:49 2021-06-03 21:15 2021-06-03 22:45 12185.81 134.18
LIPS__DT 1:29:42 2021-06-04 08:34 2021-06-04 10:04
0003_S501 1:29:08 2021-06-03 18:50 2021-06-03 20:19 6937.67 77.83
0028_CMFP 1:28:32 2021-06-03 22:45 2021-06-04 00:14 5774.55 65.22
0025_VEKP 1:27:05 2021-06-03 22:45 2021-06-04 00:13 8404.73 96.51
VBUP__DT 1:23:29 2021-06-04 02:14 2021-06-04 03:38
The list is sorted in descending order by import time, so you get a quick overview
of which packages need the most time.
Example 4-4 shows the table that is related to the second section of the file. The top, h or m
options of the import_time script control how many tables are shown.
Using this information, you can easily identify the tables that contribute the most to the import
time of a package which comprises more than one table.
You get the graphical representation of the first part of the import_time.txt when you use the
HTML option of the import_time script to generate a package time diagram. Figure 4-4 shows
a part of such a diagram.
The diagram is sorted by the start time of the imported package and therefore gives a
chronological overview of the import. The length of the bars represents the import time. The
dashed lines for table D010TAB mean that the import was interrupted and continued later.
Note that in Figure 4-5 the records are in one line inside the text file.
Figure 4-6 on page 53 shows the graphical representation of the Package Join Time
Diagram.
A package may contain data from multiple tables e.g., tables A, B, and C in the figure.
A package may contain all of the data from a single table e.g., table D in the figure.
A package may contain only a subset of a table e.g., as is shown for table E in the figure.
To export a single table into multiple data packages, the R3load process requires a
WHERE file which defines the range of records that should be extracted from the table.
In most cases, the source database contains a small set of large tables. You can export these
tables in parallel to dedicated package files where each file contains the data of one table. For
large tables, performance can be improved even further by exporting and importing a single
table with multiple R3load processes in parallel.
Depending on the selected execution option of migcheck, the checker can be executed on the
target database host to verify that the import has been performed successfully, using
analysis strategies of different depth.
The Package Checker is used to validate that all packages are successfully loaded. It
validates only the TSK files and does not compare the actual tables within the database itself.
The Object Checker is used to verify that all objects (tables, views, indexes, primary keys) are
successfully created/loaded in the database, based on log files and TSK files only.
The Table Checker validates the TOC file row counts against the actual database table
records and shows the differences. A sample output is shown in Example 4-5 on page 55.
tocDirs=/tmp/ibmas1/tmpexp1/ABAP/DATA
driver=com.ibm.db2.jcc.DB2Driver
unicode
numberOfDBConnections=100
url=jdbc:db2://localhost:5912/TAR:currentSchema=SAPTAR;deferPrepares=0;connectionCloseWithinFlightTransaction=2;
user=db2tar
password=mygoodpassword
./table_checker.sh
+ /db2/db2tar/sqllib/java/jdk64/jre/bin/java -cp
./.:./migcheck.jar:/db2/db2tar/sqllib/java/db2jcc4.jar com.sap.inst.lib.app.SecureStartup
'' com.sap.inst.migcheck.TableChecker
Table 'DDNTF_HIST' has different row count: 20330 and 20365.
Table 'DDNTF' has different row count: 1316922 and 1327594.
Both reports can be executed using SAP transaction SE38. SMIGR_CHECK_DB6 should be
run first, followed by SMIGR_CREATE_DDL if you choose this option.
4.7.1 SMIGR_CHECK_DB6
During the lifetime of the system, database administrators may have introduced Db2
database specific features to optimize the data storage.
In some cases, there might be a mismatch between the data placement in the database and
the parameters defined within the SAP Data Dictionary. One example is the placement of a
table in a dedicated tablespace without changing the SAP data class assignment. In such a
case, the table will be created according to the SAP assignment and, as a result, will not be
created in the dedicated tablespace during the migration. This is particularly true when range
partitioned tables exist and their data definition does not match the SAP boundary conditions.
Therefore, the SAP report SMIGR_CHECK_DB6 looks for these special cases and generates
a summary that can be used to fix existing issues prior to the start of the migration. This
check should be executed well before the migration to allow for changes without time
pressure.
The result of the report reflects special Db2 objects and checks the data placement that will
exist on the target system.
The WLM concurrency threshold is active when tables of type "column-organized" exist in the
source system. This could include objects that were created for test purposes but are not
actively used anymore.
To protect yourself from these issues, execute the report prior to each heterogeneous system
copy project and resolve the reported problems before starting the migration.
The report can export the results to a file that can be used as a record of the actions that are
required and the actions that were taken as preparation. For more details about the report,
including the prerequisites, refer to:
SAP Note 3246738 - DB6: Migration Check Tool SMIGR_CHECK_DB6.
4.7.2 SMIGR_CREATE_DDL
SMIGR_CREATE_DDL creates DDL files to support the creation of Db2-specific objects in
the target database. Execute this report on the source system before exporting the database.
Historically, the report was designed for special SAP NetWeaver Business Warehouse
tables, but it was extended to address ERP-specific enhancements as well. These
enhancements are:
Support for range-partitioned tables
User-customized LOB inline sizes
Both table types may exist on the source system. If no ".SQL" files exist, this information is
lost during the migration because the tables are created with the SAP default settings.
For example, if the system uses range partitioned tables to overcome the 4TB LOB size limit,
you may experience an error “ARCHITECTURAL PAGE LIMIT REACHED” as partitioning
information is missing in the target system and the tables are created without partitions.
For user-customized LOB inline sizes, the system will not encounter an immediate error, but
performance may degrade if you had introduced a different LOB inline size for certain tables.
The report SMIGR_CREATE_DDL requires you to specify a location for the generated “.SQL”
files. During SWPM export, you are prompted for this location as well. During the SWPM
export processing, the “.SQL” files will be copied to the correct location for the import and
automatically used during import. Figure 4-8 shows the inputs generated for R3load to
customize the target database.
Figure 4-8 Input Files to R3load for Importing the Target Database
If you do not want certain objects to be created with the source database settings, you must
manually delete those tables from the ".SQL" files.
R3load can use these DDL files to create database objects that are based on the syntax
provided in the file.
Example 4-7 shows the input files that are used by R3load. The example was generated for a
standard E fact table of an InfoCube.
ind: /BIC/EBFABCUBE5~01
ind: /BIC/EBFABCUBE5~03
ind: /BIC/EBFABCUBE5~P
sql: CREATE UNIQUE INDEX "/BIC/EBFABCUBE5~P" on "/BIC/EBFABCUBE5"
("KEY_BFABCUBE5T",
"KEY_BFABCUBE51",
"KEY_BFABCUBE52",
"KEY_BFABCUBE5U",
"KEY_BFABCUBE5P")
CLUSTER
ALLOW REVERSE SCANS;
Note: The report SMIGR_CREATE_DDL may run for several hours as a dialog process. To
successfully execute the report, ensure that the SAP profile parameter "rdisp/max_wprun_time" is
set to at least 7200, or to 0 for no timeout. You may set it temporarily using SAP transaction RZ11.
For more details about the report, including the prerequisites, refer to SAP Note 3208238 -
DB6: Enhancements to SMIGR_CREATE_DDL for range partitioning and inline LOB size.
It is also important to validate the current installation prerequisites for your Db2 software
installation, including the operating system kernel settings, and to look for known issues. Use
only a certified SAP/Db2 software image for the installation. All other Db2 images, even those
from IBM itself, are not supported.
Running a system copy requires deep skills regarding both the source and the target
database. For the migration execution itself, a certified OS/DB migration consultant is
mandatory as per SAP Note 82478 - SAP system OS/DB migration. To ensure a successful
production migration and go-live, it is also mandatory to utilize the SAP OS/DB migration check
service.
Consult the following SAP Notes and Guides before executing any technical activity:
System Copy for SAP Systems Based on the Application Server Java of SAP NetWeaver
7.5, and SAP Solution Manager 7.2 SR2 Java, on UNIX
Database Migration Option: Target Database IBM Db2 for Linux, UNIX, and Windows
Database Administration Guide for SAP on IBM Db2 for Linux, UNIX, and Windows
SAP Note 82478 - SAP system OS/DB migration.
SAP Note 101809 - DB6: Supported Db2 Versions and Fix Pack Levels
SAP Note 1718576 - Migration from SAP HANA to another database system
SAP Note 888210 - NW 7.**: System copy (supplementary note)
SAP Note 816773 - DB6: Installing the Application-Specific Db2 License from SAP
SAP Note 1680045 - Release Note for Software Provisioning Manager 1.0
Guides can be found in the Guide Finder for SAP NetWeaver section of the SAP Help Portal.
Note: Before executing the SAP system copy to Db2 you should implement the following
SAP code corrections in the current source SAP system:
SAP Note 2267446 - DB6: Support of tablespace pools.
SAP Note 1456402 - DB6: DBA Cockpit Correction Collection SAP Basis 7.02 / 7.30 /
7.31 / 7.40.
SAP Note 3246738 - DB6: Migration Check Tool SMIGR_CHECK_DB6
SAP Note 3208238 - DB6: Enhancements to SMIGR_CREATE_DDL for range
partitioning and inline LOB size.
Table clusters have some boundary conditions that will prevent unsorted exports. The
unsorted export is also not possible if a codepage conversion takes place during the
migration or if declustering is performed. Table clusters are discussed in 5.2.5, “Table clusters
and cluster tables” on page 70. For other exceptions regarding sorted export see SAP Note
954268.
Note: The unsorted export is a powerful optimization to improve the export time. However,
the data is also imported unsorted and thus the database performance may not be
completely optimized in the target system. Use the unsorted export wisely. You may want
to plan for a subsequent reorganization of unsorted tables in the target system.
Figure 5-1 shows improved runtime in minutes with unsorted export. The improvement
typically ranges from 30% to 50%, but can be more. For table SOC3 it is only about 10% and
so it may be wise to export the table sorted to optimize the data placement on the target side
during import.
When using the MigMon, you can use the ddlMap option in the export properties file, which
names a file that contains the mapping between the package names and the two DDL
template files: DDL<DBS>.TPL and DDL<DBS>_LRG.TPL. Example 5-2 shows an
example of such a mapping file.
The DDL<DBS>_LRG.TPL template file does not contain the ORDER_BY_PKEY keyword in
the prikey-section and is generated by R3ldctl in addition to the DDL<DBS>.TPL file. It can be
used to set up exports for different tables with or without sorting.
In Example 5-2, the packages SAPCLUST and SAPSDIC are using the DDL<DBS>.TPL
template file while the table MSEG (which has a package of its own) and the package
SAPAPPL1 are unloaded by using the DDL<DBS>_LRG.TPL template file.
The sorted export does not increase the CPU usage significantly but one remarkable effect
during a sorted export may be spikes in the write operations per second. See Figure 5-2.
As shown in Figure 5-2 on page 68, there is a constant write throughput that reflects the IO of
writing the R3load dump files to disk. There is one significant peak in write operations at
around 13:53. This is not due to a magically increased throughput. It is due to write
operations of Db2 caused by a sorting operation that spills data to disk.
Table 5-1 shows the results of database metrics during a sorted and unsorted export.
Metric                                         Sorted export    Unsorted export
Total sorts                                    70               23
Total sort time (ms)                           2528650          157
Sort overflows                                 4                0
Active sorts                                   0                0
Buffer pool temporary data logical reads       2438151          19
Buffer pool temporary data physical reads      1228390          0
Buffer pool data writes                        1218386          0
Asynchronous pool data page writes             1217750          0
Buffer pool index logical reads                2095225          5611
Buffer pool index physical reads               762257           614
Total buffer pool read time (milliseconds)     21442746         9702406
Total buffer pool write time (milliseconds)    4358579          0
Total elapsed asynchronous write time          3931277          0
Asynchronous data read requests                1977833          3086773
Asynchronous index read requests               384730           2
The Db2 monitoring metric “Total sort time (ms)” shows a significantly longer time for sorting
with the sorted export. It also shows four sort overflows that result in the peak write workload
as data is materialized on disk. The data is written to temporary tablespaces and
subsequently read again.
As more data is written and read, more buffer pool accesses are performed. Buffer pool
accesses do not generate I/O but consume additional CPU cycles, which explains the
increased runtime.
The monitoring data also shows significantly more index accesses when the export is sorted.
This is not a bad sign in itself, but it is a good indicator that sorting is involved and an index is
used for the access to the data.
If code page conversions are performed or if the table cluster is converted to one or more
transparent tables, then the table cluster must be exported sorted. This, together with the fact
that some of the table clusters are very large (for example, typical candidates are CDCLS,
EDI40, RFBLG) requires special attention to this table type.
Also keep in mind that for table clusters the process for compressing the R3load dump files
and the declustering is more CPU intensive.
Table clusters are the physical representation of one or multiple logical tables. For example,
the table cluster CDCLS contains the cluster tables CDPOS and PCDPOS. A logical row in a
cluster table is mapped to one or more physical rows in the table cluster.
Let us have a closer look at the definition of the table cluster DOKCLU, which contains just the
cluster table DOKTL and is shown in Figure 5-3.
You can identify every logical row of a cluster table by the key fields of the table cluster. The
data for the logical rows of a cluster table are stored in the column VARDATA. If the data is
longer than the defined length of VARDATA, additional records for the same key are stored
with an increased number of PAGENO. The field PAGELG describes the length of the data
stored inside the VARDATA-field. See Figure 5-4 as an example.
Figure 5-4 Records of Table Cluster DOKCLU for one Logical Row
Figure 5-4 shows four records for the key value “DT RSMD_CONVERSION D E 0038” with an
increasing PAGENO. The first three records are filled up with data in the field VARDATA. This
can be seen from the PAGELG with a value of 3800 that just matches the length of the
VARDATA field.
The fourth record with PAGENO 3 has a PAGELG of 60, meaning that this record is not filled
up to the end; it marks the end of the rows that belong together.
During code page conversions, the contents and the length of the records may change. Even
the number of the physical records belonging to a logical record may change.
Because the physical records are built together to a logical record, the data must be read in a
sorted way to find all physical records that belong to a logical record. Therefore, an unsorted
unload is not possible.
This also includes codepage conversions due to a switch to a hardware architecture with
different endianness. One example is the move from AIX on IBM Power servers or HP-UX on
PA-RISC CPUs to an x86-based infrastructure running Linux. In this case, the codepage
changes from 4102 (big-endian) to 4103 (little-endian).
The restriction of sorted export also applies to declustering during the export. The logical
records need to be built first, followed by the conversion to one or more transparent table
records. If no changes to the contents of the records are made, the logical record does not
have to be constructed and the table cluster can be unloaded unsorted.
Note: Even if you specify to use a non-sorted export, R3load will ignore this and will
perform a sorted export if necessary and write a message to the R3load log file.
The size of the compressed export dump files ranges from 20% to 40% of the physical size of
the table for many transparent tables. For cluster tables and INDX-like tables, the dump file
size is around 80% of the physical table size. For some tables, the dump is even larger than
the table object, for example, table COVREF in Figure 5-5 on page 71.
You can use this information to assess the space required for the export dump and the
amount of data that needs to be transferred from source to target.
A good start is to use the statement shown in Example 5-3 to get the allocated size of data,
long field (LF), and large object (LOB) data in the database. It returns the size of all tables
without the defined indexes.
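As a rough sketch of such a statement (the ABAP schema name SAPSRC is an assumption and
must be adapted), the physical sizes can be summed up from the SYSIBMADM.ADMINTABINFO
administrative view; the sizes are reported in KB:

-- Total physical size of data, long field, and LOB data (indexes excluded)
SELECT SUM(DATA_OBJECT_P_SIZE + LONG_OBJECT_P_SIZE + LOB_OBJECT_P_SIZE) AS PHYS_SIZE_KB
FROM SYSIBMADM.ADMINTABINFO
WHERE TABSCHEMA = 'SAPSRC'

Note that querying ADMINTABINFO can take a while on large databases because it collects size
information for every table.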
Alternatively, you can sum up all physical table sizes by using the SAP DBA Cockpit. You can
use the panel “Top Space Consumers”. In this panel, sum up the Phys. Data Size, Phys. Long
Data Size and the Physical LOB Data size. You can also run the SQL statement shown in
Example 5-4 on the command line to get this same result.
With this, you will get the sizes of the largest 100 tables, and you may use this as input for
splitting packages.
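Building on the same administrative view, a sketch for listing the largest 100 tables (again
assuming the schema SAPSRC) could look as follows:

-- Largest 100 tables by physical data, long field, and LOB size
SELECT VARCHAR(TABNAME, 30) AS TABNAME,
       DATA_OBJECT_P_SIZE + LONG_OBJECT_P_SIZE + LOB_OBJECT_P_SIZE AS PHYS_SIZE_KB
FROM SYSIBMADM.ADMINTABINFO
WHERE TABSCHEMA = 'SAPSRC'
ORDER BY PHYS_SIZE_KB DESC
FETCH FIRST 100 ROWS ONLY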
If you look at the results, many tables benefit from the R3load compression and as such, the
migration process will benefit as well. Smaller R3load dump files mean less IO and less data
to be transferred over the network. However, for cluster tables and INDX-like tables, the
compression does not seem to be effective. In many cases, the data cannot be compressed
well due to the nature of the data. In these cases, CPU resources are wasted during the
export to compress the data with minimal savings in I/O or network bandwidth.
The default algorithm for compression is based on the Lempel–Ziv–Welch (LZW) algorithm,
which normally delivers decent lossless data compression. The new algorithm uses
run-length encoding (RLE), which requires less compute power but typically also
provides less effective compression than LZW.
Figure 5-6 shows that the table CONVREF exports more than 8 times faster. The tables STXL
and BALDAT also showed an improvement of factor 2 or more. So, this is a powerful
optimization for reducing your export runtime.
Typically, saving on one resource type leads to increased usage of another. Our first
expectation is that the R3load dump files will increase in size along with increased I/O rates.
Figure 5-7 shows that the export dump files increase between 60% and 80% but in many
cases, this is worth the effort as it saves significantly on CPU resources and reduces export
time.
Important: During one test, the export dump size for table VBOX increased from 5 GB to
over 70 GB – a factor of 35. So, you need to carefully enable this feature.
The R3load parameter "-compress" affects only the export of the data. The R3load on the
import side detects the dump file compression method automatically and chooses the correct
option for the data decompression.
Note: The optimized compression is only beneficial for cluster and INDX-like tables and
will use the appropriate algorithm automatically. Always use the “adapt” keyword for this
optimization and do not use “rle” or “lzw”. If you choose the wrong compression algorithm,
the export dump size can increase significantly.
Usually there are a lot of tables in the APPL-data classes, and typically there are some large
tables in data class APPL1.
When exporting such a system without splitting the structure files, the processing of packages
which contain several large tables may dominate the total runtime of the export.
One solution is to split a single structure file into multiple files with the Package Splitting Tool.
You can invoke the package splitter manually or by using SWPM. For details how to use the
package splitter tool, please refer to the SAP System Copy Guide.
If you want to know how many tables belong to each data class in your system, you can use
the SAP DBA Cockpit. In the DBA Cockpit, go to Configuration → Data Classes. In addition,
the report SMIGR_CHECK_DB6 can be used to identify tables with a mismatch between the
data class definition and the physical location.
Note: Historically, data classes have been used to assign tables to tablespaces. Starting
with SAP SL Toolset 1.0 SP18, tablespace pools were introduced with Db2. With
tablespace pools, you can achieve a more even distribution of tables in your SAP system.
For details, please refer to the SAP Note 2267446 - DB6: Support of tablespace pools.
Starting the export from one or more dedicated application servers is an option to free
resources on the SAP database server and to shift the R3load related processing to different
machines.
This can make sense if the single-thread performance of the source database server is
significantly slower than that of a dedicated application server. Besides dedicated application
servers, it may also make sense to use the importing database server for this purpose.
The following sections describe exports tests that were performed using two different system
setups:
Exporting from the SAP database server
The exporting R3load is started on the same machine where the Db2 database resides
and thus competes with the database for system resources.
Exporting from an SAP application server
The exporting R3load is started on an SAP application server on a separate machine. The
application server is remotely connected to a Db2 database. Export dump files are written
to the application server machine. With this setup, R3load and Db2 use their own, distinct
system resources.
Figure 5-9 shows the export duration of some tables including a Unicode conversion. It
indicates that the runtime for all tables except CDCLS was around 1 hour, whereas the
CDCLS export took nearly 8 hours.
The NMON utility on AIX has a useful functionality to show the CPU consumed by Workload
Management Classes. This can be used to monitor CPU usage of different users and
processes. We have used the setup to monitor the CPU usage by R3load that is started as
the SAP administrator <sid>adm and for Db2 processes and threads that are running under
the Db2 instance owner db2<sid>.
Figure 5-10 on page 77 shows the CPU resource consumption. This illustrates that R3load is
using the majority of the available CPU resources, whereas Db2 is only using a minor part.
Figure 5-10 shows that at the beginning of the export – when 8 R3load processes were
running in parallel – the CPU utilization was nearly 100%. If the workload generated by
R3load could be moved to another machine, this would free CPU resources on the DB
Server.
Figure 5-12 and Figure 5-13 on page 78 show the CPU usage for both the SAP database
server and for the dedicated application server.
When you compare the CPU utilization on the database server to the previous test, it
becomes clear that more CPU resources could be used by the Db2 database engine,
because the CPU load that was generated by R3load was completely offloaded to the
application server.
The R3load log files contain a resource usage section at the end of each file which is shown
in Example 5-6. This section can be used to identify candidates for shifting to dedicated
application servers. This resource usage section distinguishes between the time spent in the
database, the general R3load processing, and the file processing. One example of such a
candidate is a package where the time spent in the GENERAL section is significantly higher
than in the DATABASE section.
Note: The resource usage for the R3load compression is reported under the FILE section
in this report. A large amount of time in the FILE section does not necessarily indicate an
I/O bottleneck.
Example 5-6 Excerpt of the R3load log file with resource usage data
GENERAL times: 3998.905/1944.467/0.020 real/usr/sys
DATABASE times: 387.097/ 35.725/1.062 real/usr/sys
FILE times: 1367.156/ 587.576/15.46 real/usr/sys
In general, exporting through dedicated application servers causes additional network traffic
that competes with the transfer of other export dump files to the target server. Therefore, this
approach may not always be beneficial to the overall migration process. If you want to use this
optimization, ensure that you closely
monitor the resource usage for CPU and network and shift only a portion of the export to
dedicated application servers.
In the example discussed in section 5.5.2, “Exporting from an SAP application server” on
page 77, a good setup would be to use a dedicated application server only for the CDCLS
table as it determines the overall runtime of the migration process.
Using a dedicated application server for only selected tables cannot be done with SWPM
alone. Instead, you need to run two separate instances of the Migration Monitor,
each running with a mutually exclusive set of tables. Before you use this feature, familiarize
yourself with the Migration Monitor and how to use it outside the SWPM.
With the query block prefetch feature enabled, the Db2 client will request another block before
the client application has processed all records.
Based on the tests conducted for this book, the export throughput increases up to 33% if the
export is done on a dedicated application server and 10% if the export is started on the
database server itself.
The test results shown in Figure 5-14 on page 79 were obtained from executing 6 parallel
R3load processes and the network bandwidth was not saturated. Adding more processes
which would saturate the network bandwidth would show the advantage provided by the
query prefetch feature disappearing.
This feature, however, perfectly matches the previously proposed setup of dedicated
application servers during the export. The setup is most valuable if only a few critical tables
are exported on dedicated application servers.
Even if the export is started on the database server, the feature can help to some extent as
the local Db2 client also communicates using the network protocol and the same
optimizations apply as well. The benefit is not significant as seen with a physical network, but
the test still showed a 10% improvement.
If you are running on Db2 Version 11.5.6 or higher, enable this feature for both local and
remote exports. If you are running on a Db2 version lower than 11.5.6 on the export server,
you may use a Db2 version 11.5.6 or later client for testing. Because the query block prefetch
feature is a client enhancement, it can also be used with older Db2 server releases.
Using different client and server versions generates an error and the connection is refused.
If R3load reports the error "ERROR => Db2 software level mismatch between client and
server", follow SAP Note 503122 - DB6: Connection refused due to DB software mismatch to
overcome this.
Note: A mixed version setup is only recommended for testing purposes. If you find the
feature useful, please upgrade to Db2 Version 11.5.6 or higher as part of the migration
process.
However, when you use Db2 columnar tables, often also referred to as BLU tables, a
threshold for the maximum number of concurrent queries is recommended and is enforced
during installation of a Db2 fix pack update by the db6_update_db script.
For details, please refer to the database administration guides for Db2 and SAP Note
1152411 - DB6: Using Db2's Workload Management in an SAP Environment.
For example, if the source system has 128 cores but the Db2 WLM threshold is set to 16, only
16 parallel R3load processes are concurrently active, regardless of whether they export
column-organized or row-based tables. The blocked R3load processes will not fail
immediately but will wait in a queue to be executed.
So, for heterogeneous system copies, the WLM threshold should be disabled or set to a high
number. You can use the SAP report SMIGR_CHECK_DB6 or the Db2 SQL query shown in
Example 5-8 to see if a threshold is set and enabled.
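As a sketch of such a check, the SYSCAT.THRESHOLDS catalog view lists the defined thresholds
together with their state and maximum value (the exact column set may differ slightly
between Db2 versions):

-- Show defined WLM thresholds, whether they are enabled, and their maximum value
SELECT VARCHAR(THRESHOLDNAME, 30) AS THRESHOLDNAME, ENABLED, MAXVALUE
FROM SYSCAT.THRESHOLDS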
Once you have verified that the Db2 WLM threshold is enabled, you can either increase the
threshold to the maximum number of parallel R3load processes or you can disable and drop
the threshold using the SQL statements shown in Example 5-9 and Example 5-10.
Example 5-9 SQL Statement to increase the parallelism of the Db2 WLM threshold
ALTER THRESHOLD <THRESHOLDNAME> WHEN CONCURRENTDBCOORDACTIVITIES > <NEW_MAXVALUE> STOP EXECUTION
Example 5-10 SQL Statement to disable and drop Db2 WLM threshold
ALTER THRESHOLD <THRESHOLDNAME> DISABLE
DROP THRESHOLD <THRESHOLDNAME>
Note: If you run a test export on the production system, make sure to switch the
configuration back to its initial state. If you execute a test export on a system copy, ensure
to disable the Db2 WLM threshold again for the final migration.
Example 5-11 SQL Statement to get overflow accesses for table BALDAT
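A sketch of such a statement uses the MON_GET_TABLE table function; the ABAP schema name
SAPSRC is an assumption and must be adapted:

-- Overflow accesses and rows read for table BALDAT across all members
SELECT VARCHAR(TABNAME, 20) AS TABNAME, OVERFLOW_ACCESSES, ROWS_READ
FROM TABLE(MON_GET_TABLE('SAPSRC', 'BALDAT', -2)) AS T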
Plan the disk assignment together with your cloud service provider as different providers have
different types of disks and different concepts of how required throughput, IOPS and size can
be assigned to the virtual machines.
The heterogeneous system copy also includes the possibility to assign tables and indexes to
dedicated tablespaces. With this, you can optimize the layout with respect to manageability
and performance.
During the migration, the database size, and with this the size of tablespaces and containers,
are defined in the DBSIZE.XML file that is created during the export. The SAP installation tool
SWPM provides several options to choose from – for example the storage path option allows
you to specify the number of file systems (sapdata1 – sapdata<n>) used for the Db2 database.
The tool also lets you create tablespaces manually.
The number of containers should be configured wisely. It has an impact on performance and
you cannot easily change this setup while the system is running in production. A discussion
about the appropriate disk layout is beyond the scope of this book and depends also on the
underlying disk subsystem.
However, we would like to provide you some basic recommendations that remain the same for
all implementations:
Use separate disks for logging and for tablespaces.
Do not configure operating system I/O (for example, swap, paging or heavily used spool)
on Db2 data disks or disks that are used for logging.
Along with the configuration parameters described in this chapter, the appropriate layout and
data placement is essential for the optimal operation of the database. The migration also lets
you assign large tables to dedicated tablespaces to optimize the layout. As the growth of the
SAP database is monitored in production, a good estimation of the future growth is possible
and should be considered.
Figure 6-1 on page 87 shows a sample list of the largest tablespaces in an SAP database.
We use this example to show what a tablespace layout should NOT look like.
In this case, the BTABD tablespace is many times larger than the next largest tablespace. This
may not have an impact on the performance of the daily workload, but it does have an impact
on administration. For example, when backing up your database, Db2 backs up tablespaces
in parallel to improve performance. This improvement cannot be fully exploited if one
tablespace is much larger than the others.
To optimize the layout, the largest tables can be assigned to dedicated tablespaces,
tablespace pools can be used, or both options can be combined.
With the traditional Db2 tablespace layout for SAP systems, this type of functional grouping is
used by grouping master data and transaction data, for example. This layout typically results
in a few large tablespaces, each containing one or more large tables. However, this kind of
unbalanced tablespace setup may cause issues, for example increased backup run times.
In a nutshell, we recommend using tablespace pools. Building a new target system as part of
a system copy process is a perfect opportunity to implement the tablespace pool concept.
In contrast to the traditional tablespace layout, where SAP data classes are assigned with
dedicated tablespaces, the tablespace pool concept assigns data classes to a tablespace
pool. Table 6-1 shows an example.
The assignment of data classes to tablespaces is stored in the following tables: TADB6 (for
tables), IADB6 (for indexes) and LADB6 (for long field and large object data). Note that the
LADB6 table based definition is only fully utilized when using tablespace pools.
A tablespace pool consists of multiple pool tablespaces. Depending on the pool size, an equal
number of tablespaces for regular data (D), index (I) and large tablespaces (L) is created.
Let’s assume that we create a tablespace pool named “DATA” with a size of 20. Then, the
following 60 pool tablespaces will be created as shown in Table 6-2.
When using a tablespace pool, the distribution of tables, indexes and long field or large object
data to tablespaces no longer follows a predefined, static definition from the SAP data
dictionary. Instead, objects are mapped to tablespaces dynamically and automatically using a
hash algorithm. This hash functionality is built into the DB6 database shared library (DBSL).
The input value for the hash algorithm is the table name and its output is the number of the
target pool tablespace. For instance, let's assume the algorithm would hash the name of table
“BSIS” to the value “05”. Then this table will be created in tablespace
<SAPSID>#DATA@05D and its indexes in <SAPSID>#DATA@05I.
Through the use of the hash algorithm, an equal distribution of tables across the available
pool tablespaces can be achieved. You can use SAP report “RSDB6TBSPOOLDISTRIB” to
simulate the distribution of objects to pool tablespaces.
Note: The pool does not have to exist when running the simulation, so you can execute
this report in your source system prior to the database export.
As of SL Toolset 1.0 SP18 the Software Provisioning Manager (SWPM) allows you to set up
tablespace pools, both for new installations and during an R3load based system copy.
If your source system already runs with tablespace pools, all is fine, the setup will be
preserved in your target system. If your source is based on the traditional layout, you can use
SWPM to set up tablespace pools.
Note: Starting at SL Toolset 1.0 SP18 the use of tablespace pools is the default in SWPM.
If you want to preserve the traditional layout, choose “custom mode” in SWPM and
uncheck the “use tablespace pools” check box in the respective dialogue.
For small to mid-sized systems, you may want to store all objects in a tablespace pool.
However, for larger systems, especially if they contain a few large tables, we recommend
associating these large objects with their own, dedicated tablespaces.
If you already have performed a separation of large tables in your source system and used
custom data classes for these objects, you are ready to go. The use of tablespace pools in
your target system only affects SAP's standard data classes, that is, APPL0, APPL1, APPL2,
CLUST, POOL, SDIC, SDOCU, SLDEF, SLEXC, SLOAD, SPROT, SSDEF, SSEXC, SSRC,
TEMP, USER, and USER1. Thus, all objects associated with custom data classes (prefixes Z,
Y, or USR, and USER2 to USER9) or with BW-specific data classes (DFACT, DDIM, DODS)
will not be assigned to the tablespace pool but will stay in their original tablespaces.
If you are running a mid-sized to large system with a few large tables that are not residing in
their own tablespaces yet, consider assigning custom data classes to those tables in your
source system before export so that they will be assigned to their own, dedicated tablespaces
in the target system. One approach to do so is the following:
1. Create the new dedicated tablespaces in your source system (with minimal size), for
example, in the DBA Cockpit using Space → Tablespaces → Add.
2. In the DBA Cockpit, create a new data class and assign the newly created tablespaces to it
using Configuration → Data Classes. Check SAP Note "515968 - DB6: Creating data
classes and tablespaces in DBA cockpit" for details.
3. In transaction SE11, go to the technical settings of the respective tables and modify the
data class.
4. Do not perform the table conversion at this time (for example, using report DB6CONV). The
tables will be moved into the new target tablespaces when you perform the import into the
target system.
Note: The migration check report "SMIGR_CHECK_DB6" will show this as an inconsistency
in the tablespace assignment. However, you may ignore it, because it is intended in this case.
Alternatively: If you perform the assignment of new custom data classes well before the
migration, you may want to consider performing the table conversion in the source system
as described in SAP note “1513862 - DB6: Table conversion using DB6CONV version 6 or
higher”.
The new assignment of tables to tablespaces and data classes takes effect during the migration,
and you can validate it by checking the DDLDB6.TPL file.
Example 6-1 shows a classic layout while Example 6-2 shows the tablespace pool layout.
Example 6-1 Excerpt from DDLDB6.TPL with the classic layout and custom dataclass ZAPP1
# table storage parameters
loc: USER6 NZW#ES40AD NZW#ES40AD NZW#ES40AD
APPL1 NZW#BTABD NZW#BTABI NZW#BTABD
ZAPP1 NZW#BTAB1D NZW#BTAB1I NZW#BTAB1D
Example 6-2 Excerpt of DDLDB6.TPL with tablespace pools and custom dataclass ZAPP1
# table storage parameters
To evaluate the correctness of the tablespaces assignment, please use the SAP report
“SMIGR_CHECK_DB6”.
With Db2, the default I/O mechanism for newly created tablespace containers on most UNIX,
Linux and Windows platforms is concurrent I/O or direct I/O (CIO/DIO) and should be used
during the heterogeneous system copy and for normal production also.
On the other hand, the extent size influences the size of the database as it is the minimum
allocation unit for tables. A large extent size may result in wasted disk space for small or
empty tables.
The current SAP recommendation is to use an extent size of 2, which together with the
recommendation of 16 KB pages is 32 KB. This is a good default for the tablespaces,
including Db2 system temporary tablespaces.
You may use a larger extent size if wasting space due to small tables is not an issue. Larger
extent sizes have other benefits, such as improved backup performance. The extent size can
also influence the import. We have seen a single-digit increase in throughput with an extent
size of 16. With larger extent sizes, we have seen a slight decrease in performance.
You may create tablespaces with extent size 16 for large tables that reside in a dedicated
tablespace or for dedicated tablespaces that serve the temporary tables during a split table
load.
Note: SAP SWPM always creates the tablespaces with extent size 2. If you want to use a
different extent size, you must create the tablespaces manually and grant the right to use
the tablespace to the role SAPAPP.
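A minimal sketch of such a manually created tablespace, using a hypothetical tablespace
name and following the structure of Example 6-3, could look like this:

-- Dedicated data tablespace with extent size 16 (name is an example only)
CREATE LARGE TABLESPACE "SRC#CDCLSD"
  PAGESIZE 16384 MANAGED BY AUTOMATIC STORAGE
  EXTENTSIZE 16
  PREFETCHSIZE AUTOMATIC
  BUFFERPOOL "IBMDEFAULTBP";
-- Allow the SAP connect users (role SAPAPP) to use the new tablespace
GRANT USE OF TABLESPACE "SRC#CDCLSD" TO ROLE SAPAPP;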
6.4.4 NUM_IOCLEANERS
This parameter allows specifying the number of asynchronous page cleaners for a Db2
database. Page cleaners write changed pages from the buffer pool to disk before a database
agent requires space in the buffer pool.
As we use the db2load API to load the biggest tables, this parameter is not that important for
the overall migration process. However, for index creation – if temporary data is written from
buffer pool to temporary tablespace – this will have some impact on performance.
During the migration evaluation project our tests showed no additional improvement by
defining more than 20 I/O cleaners for index creation. Unless the disk throughput is below the
expected performance for a given I/O configuration, we recommend that you set this
configuration parameter to AUTOMATIC and let Db2 determine the appropriate number.
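The corresponding configuration change can be sketched as follows; the database name SRC is
an assumption and must be adapted:

# Let Db2 choose the number of asynchronous page cleaners
db2 update db cfg for SRC using NUM_IOCLEANERS AUTOMATIC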
Setting the prefetch size to AUTOMATIC is a good starting point and should only be changed if
the I/O performance is below the expected rate for the given disk subsystem.
To increase the parallelism even further, you can set the Db2 registry variable
DB2_PARALLEL_IO=<NUMBER>. With this, Db2 is informed to use a parallelism of <NUMBER> for
each Db2 Container.
The recommendation of 4 Storage paths is intended to be the minimum or a feasible value for
smaller systems. With larger databases, you should define up to 16 storage paths and control
parallelism using this.
We have not seen any performance improvement when setting this variable; therefore, the
recommendation is to leave it at the default.
Note: You should contact the cloud service provider for details about the best storage
layout and logical volume manager (LVM) configurations. The IO sizing is critical for the
migration and the subsequent production operation.
Db2 11.5 and higher supports this modern disk format and can optimize the I/O for this type of
disk with the registry variable DB2_4K_DEVICE_SUPPORT=ON. If this variable is set to ON, Db2
adjusts the memory layout of data structures and the parameters of disk I/O operations to be
compatible with the requirements of this type of storage.
To create a database on disks with a 4K sector size, this registry variable must be set to ON;
otherwise, the create database statement will fail.
If you enable this setting in an environment configured with storage devices that use a 512
byte sector size, performance may be degraded. Disks with 4K sector support may also
provide a 512 byte compatibility mode.
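Enabling the support can be sketched as follows; as with most Db2 registry variables, an
instance restart is required for the change to take effect:

# Enable 4K sector support at the instance level
db2set DB2_4K_DEVICE_SUPPORT=ON
db2stop
db2start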
During the import, tablespaces grow from their initial size and are extended all the time.
Therefore, it is an option to manually set the increase size to a large value, for example 1 GB.
Db2 enables fast preallocation of containers on VxFS, JFS2, GPFS, ext4 (Linux only) and XFS
(Linux only) file systems by default, so the duration of each increase is short, but it still exists.
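A sketch of such a change for a single tablespace is shown below; the tablespace name is taken
from Example 6-3 and serves only as an illustration:

-- Grow the tablespace in 1 GB steps during the import
ALTER TABLESPACE "SRC#DATA@19D" INCREASESIZE 1 G;
-- After the migration, reset INCREASESIZE to a smaller value again (32 MB used as an example)
ALTER TABLESPACE "SRC#DATA@19D" INCREASESIZE 32 M;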
We also recommend analyzing the tablespace sizes after a first test migration and creating
the tablespace manually to move this effort outside the downtime window completely.
You can extract the tablespace definitions after the first test migration with the db2look utility.
This utility generates the DDL statements that create the tablespaces.
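The corresponding db2look call can be sketched as follows; the database name SRC is an
assumption, and the -l option extracts the DDL for user-defined tablespaces, buffer pools,
and database partition groups:

# Extract tablespace, buffer pool, and partition group DDL into a file
db2look -d SRC -l -o tablespace_ddl.sql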
Example 6-3 DDL Statement extracted after a test migration with the db2look utility
CREATE LARGE TABLESPACE "SRC#DATA@19D" IN DATABASE PARTITION GROUP SAPNODEGRP_SRC
PAGESIZE 16384 MANAGED BY AUTOMATIC STORAGE
USING STOGROUP "IBMSTOGROUP"
AUTORESIZE YES
INITIALSIZE 32 M
MAXSIZE NONE
EXTENTSIZE 2
PREFETCHSIZE AUTOMATIC
BUFFERPOOL "IBMDEFAULTBP"
DATA TAG INHERIT
OVERHEAD INHERIT
TRANSFERRATE INHERIT
DROPPED TABLE RECOVERY OFF;
In the second step, extract the current size of the tablespaces by using the MON_GET_TABLESPACE
table function and change the INITIALSIZE parameter for each tablespace accordingly.
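A sketch of such a query, which lists the currently allocated size per tablespace in megabytes,
could look like this:

-- Currently allocated size per tablespace, largest first
SELECT VARCHAR(TBSP_NAME, 30) AS TBSP_NAME,
       (TBSP_TOTAL_PAGES * TBSP_PAGE_SIZE) / 1048576 AS TOTAL_MB
FROM TABLE(MON_GET_TABLESPACE(NULL, -2)) AS T
ORDER BY TOTAL_MB DESC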
Before the next migration run, you can create the tablespaces with the known size. You can
combine this feature with setting the Db2 registry variable DB2_USE_FAST_PREALLOCATION = OFF to
minimize fragmentation on file system level as much as possible.
This avoids many small wait situations that occur between the time a tablespace-full event
occurs and the time the increase is completed. Each wait is typically only a few seconds, but
it adds up at the end. We have seen up to a 5% increase in throughput with this concept.
Important: Do not forget to reset the modified INCREASESIZE definition again after the
migration completes.
While archival logging allows point in time recovery, the circular logging mode allows only
offline backups and no roll forward after a restore because log files are not archived but
overwritten.
Database load is a special operation and a recovery situation during the migration is unlikely
(this almost always means failure of the migration and a restart). There is no need to
configure archival logging. In addition, the circular logging can be beneficial for performance
as it does not log LOB data. Therefore, the amount of logging I/O will be reduced.
We recommend that you run the database in circular logging mode by setting the
configuration parameters as shown in Example 6-4.
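A minimal sketch of such a setting, assuming the database name SRC, switches off both log
archiving methods, which results in circular logging:

# Circular logging is in effect when both log archiving methods are OFF
db2 update db cfg for SRC using LOGARCHMETH1 OFF
db2 update db cfg for SRC using LOGARCHMETH2 OFF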
The Db2 database configuration parameter LOGINDEXBUILD enables the database manager to
log the creation of indexes, which would generate a significant amount of logging. By default,
this parameter is turned off, and you should ensure that this is the case in your environment.
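A quick way to verify and, if necessary, correct the setting is sketched below; the database
name SRC is an assumption:

# Check the current value of LOGINDEXBUILD
db2 get db cfg for SRC | grep -i LOGINDEXBUILD
# Switch index build logging off if it was enabled
db2 update db cfg for SRC using LOGINDEXBUILD OFF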
However, a certain amount of logging occurs during the migration because some tables are
imported by INSERT and the db2load API logs the allocation of new extents. Although the
amount of logging is small compared to production use, you can influence the behavior.
SAP provides good default settings for Db2 transactional logging, so you can leave the defaults
in place.
Although logging is limited during a heterogeneous system copy, ensure that enough disk
space is available for the Db2 transactional log files and that the number and size of primary
and secondary log files are configured large enough. You should also enable circular logging
to optimize the migration in this area.
An exception may be if you use multiple parallel inserts for split table imports. In this case,
closely monitor the logging performance and if this becomes a bottleneck, increase the IOPS
available to Db2 logging.
Note: After the system copy completes, review and set the logging parameters according
to your system and adapt the configuration.
Next to the obvious space savings (and the related cost savings), we often see additional
performance benefits with using compression. When you consider that pages residing in the
Db2 buffer pools are stored in compressed format, it becomes clear that more data will fit into
the given buffer pools as the same amount of data can be stored in fewer pages. This reduces
the number of I/O operations and – as many systems are more I/O bound than CPU bound –
typically leads to performance benefits. We therefore recommend making use of Db2
compression in your target database during import.
Note: You can enable classic compression for a table or you can enable adaptive
compression which also enables classic compression. By default, with SAP installations
you will typically get adaptive compression if you decide to compress a table.
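To verify which compression mode the imported tables ended up with, a query against the Db2
catalog can be used. The following sketch assumes the target schema SAPTAR; a ROWCOMPMODE of
'A' indicates adaptive and 'S' static compression:

-- Compression attributes of the imported tables
SELECT VARCHAR(TABNAME, 30) AS TABNAME, COMPRESSION, ROWCOMPMODE
FROM SYSCAT.TABLES
WHERE TABSCHEMA = 'SAPTAR' AND TYPE = 'T'
ORDER BY TABNAME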
For more details, go to the IBM documentation for Db2 and search for “Compression” or refer
to the following SAP Notes:
– 1126127 – DB6: Deferred table creation and row compression
– 1354186 - DB6: LONG/LOB type mapping and database object check in DDIC
– 2481425 - DB6: How to calculate current LOB size for INLINE LENGTH
No compression
COMPRESS_NO_ALL: Tables are created with the “COMPRESS NO” attribute. Therefore, table
data will not be compressed. Also, corresponding indexes will not be compressed. This is
shown in Figure 6-2.
Automatic compression
COMPRESS_ALL: All tables will be created with the "COMPRESS" attribute. This means that
both classic (static) and adaptive compression will be used. All indexes of the tables will be
compressed as well. Using the R3load option COMPRESS_ADAPTIVE_ALL is equivalent to
COMPRESS_ALL.
The table wide compression dictionary will be created by using the automatic dictionary
creation (ADC) feature, which creates the dictionary after a few megabytes of data has been
loaded. Automatic dictionary creation typically is a good compromise between space savings
and import runtime. This is shown in Figure 6-3.
The static compression dictionary is built by a table REORG after all data has been loaded.
First, the complete table is loaded in an uncompressed format, the compression dictionary is
built, all data is truncated and re-loaded.
While this results in the most optimal compression dictionary, the table will at first allocate
space for the uncompressed data. In addition, the reorganization, truncation, and re-loading
of data increase the import runtime significantly. We therefore do not recommend the use of
this option in general; limit it to specific scenarios only.
The second option to influence the behavior is to set the environment variable
DB6LOAD_MAX_SAMPLE_SIZE. The default value is infinity, which means all available data is
sampled and loaded. It is possible to specify the number of megabytes to be sampled. After
this amount of data is loaded in the sampling phase, the dictionary is built.
Both options must be set as environment variables and are valid for all R3load processes
started in this session or migration monitor instance.
After the data sample has been loaded, a reorganization is performed to build the static
compression dictionary. The table is then truncated, and the full set of data is loaded. This
option results in a near ideal compression dictionary but may have a noticeable overhead on
import runtime. This is shown in Figure 6-5 on page 99.
The amount of data sampled may have an impact on the reorganization process that builds the
compression dictionary. But for a table with one terabyte of data, the table sample is just 10
GB and the reorganization process should complete in under 5 minutes.
There is a special use case where limiting the size of the sample that is used may provide
some benefit. This is related to the way that the OPT_COMPRESS feature processes data.
Remember that during the sampling phase, all available R3load Dump files are read, data is
decompressed and codepage conversion may occur. This is done for all records in the
R3load export dump, but only one out of 100 records will be written into the database table.
For tables that contain LOB columns, it is possible to reduce the amount of CPU and the
number of reads done during the processing of the R3load dump files by using the
DB6LOAD_MAX_SAMPLE_SIZE parameter. If, for example, the specified maximum size is reached
after half of the R3load dump files are read, the sampling phase stops and the remaining 50%
of the R3load dump files are not processed during this phase. To show this effect, Example 6-5
shows an import of a table with the default
sampling size. With the default setting, the import and reorganization of the sample took 198
seconds, and the final import completed in 1385 seconds.
For comparison, the same table is imported with a max sample size of 4 MB. The results are
shown in Example 6-6.
export DB6LOAD_MAX_SAMPLE_SIZE=4
START REORG: 20230126002954
Table VBEP has been reorganized to create compression dictionary
A1 EDB6 017 END REORG: 20230126002955
The data sample is generated with frequency 100 and maximum sample size 4 MB.
6177 out of 617665 rows with maximum row size 696 have been sampled.
Finished import for object "VBEP" of type "table" in 2.457 sec, requesting
repeat
A2 ETSK 005 Finished import for object "VBEP" of type "table" in 1405.233
In Example 6-6, the import and reorganization of the sample took only 2.5 seconds, and the final import completed in 1405 seconds. The final import time is almost identical; the difference is in the sampling phase. With the default setting, the sampling phase took 198 seconds, of which the reorganization took 9 seconds. With the small sample size, the sampling phase took 2.5 seconds, of which the reorganization contributed 1 second. So, the largest part of the reduction comes from the sample load and not from the reorganization process.
This setting can be an optimization when the sampling phase of the OPT_COMPRESS option significantly contributes to the overall runtime. If you use this option, be aware that the compression rate may degrade.
Tables that are not listed in the file are not compressed, unless the CREATE TABLE statements in the *.SQL files already contain a COMPRESS attribute.
You can use this option to set up specific tables with no compression, with adaptive compression, or with static compression alone. Alternatively, you can specify different compression options for each package in the OrderBy.txt file. As static compression alone is neither a recommended nor a frequently used option, the LIST_COMPRESS parameter is likely to be used only in certain special cases.
Unused tables inside SAP applications lead to unnecessarily allocated disk space. To address this, SAP has introduced the concept of virtual tables. When installing or migrating an SAP system, virtual tables can be created in the database instead of physical tables. Virtual tables are read-only views that are defined on base tables with no physical representation in the database. As soon as an application attempts to write data into one of these views, it is materialized into a physical table.
The R3load option "DEF_CRT" enables deferred table creation. It can be combined with other options, for example compression. In the following, we describe some examples of how to enable deferred table creation.
Example 6-7 shows how to use the deferred table creation option with R3load. At first, R3load creates a view "STXL". If there is no data in the export dump file, R3load exits.
When loading data, Db2 returns the SQL error "SQL0150N", which can be ignored. Afterwards, the view is converted to a table and the data is imported. Any compression flag that is used is applied. See Example 6-8.
Example 6-8 R3load with Deferred Table Creation Option – Log File
Finished create for object "STXL" of type "table" in 0.047 sec
Starting import for object "STXL" of type "table"
[IBM][CLI Driver][DB2/LINUXX8664] SQL0150N
Virtual Table STXL has been successfully converted (ignore SQL0150N)
Deferred table creation is beneficial, and there is no good reason to disable this feature. Check SAP Note "1058437 - DB6: R3load options for compact installation" for more details.
Like other optimization options, you can use different R3load compression options with the OrderBy.txt file or with dedicated Migration Monitor instances. For example, you may use COMPRESS_ALL for most tables and OPT_COMPRESS only for a subset of the tables.
The next sections provide some analysis and recommendations for the use of the import optimizations described. The goal of this optimization is the best possible compression rate, to minimize the database size and, to some extent, also improve import performance.
In short, the best compression rate can be achieved when the table is imported and
subsequently reorganized, and a new compression dictionary is created.
– This is the R3load option “FULL_COMPRESS[_ALL]”.
– R3load with Db2 automatic dictionary creation (ADC) (R3load Option “COMPRESS_ALL”)
delivers a good but not optimal compression rate.
– R3load with the optimized compression process (R3load Option “OPT_COMPRESS”) also
delivers excellent compression rates that are close to the optimal compression rate.
Figure 6-6 shows a comparison of the compression options on a database. The test database is a little larger than 2 TB without the use of compression.
Using ADC compression reduces the size to 800 GB, and the options FULL_COMPRESS and OPT_COMPRESS further reduce the size to 675 GB, which is an additional 15% better compression rate. One more fact to consider is that with full compression, the data is first imported without compression enabled, thus allocating the most disk space, which is not given back to the file system unless you clean it up manually. The same is true if you do not enable compression during the migration but only after the system is copied.
Important: Do not interpret the results as absolute numbers that you will achieve in every case. The compression rates and the import runtime will vary, for instance based on the number and type of large tables and the system resources available.
There are more options available to implement optimal compression, but they require additional manual work, and the additional improvement in compression rates may be minimal. Therefore, we do not recommend using these options unless a 100% optimal compression rate is required.
You can reuse the table after a test import if an optimal compression dictionary was built – for example, by a reorganization with the FULL_COMPRESS option or by any other manual sampling. This, however, requires that the database is reused and that the table is manually truncated. To accomplish this, you need to change the task file accordingly. While this approach is valid, it is complex and error prone. Thus, we do not recommend it in general and do not document it further in this book.
One method of enabling or optimizing Db2 compression can always be used in conjunction with all of the R3load options: rebuilding the compression dictionary after the heterogeneous system copy is completed. You can do this in the SAP DBA Cockpit, or you can issue the Db2 command REORG TABLE <SCHEMANAME>.<TABLENAME> RESETDICTIONARY. Alternatively, you may use the report "DB6CONV" in your SAP application to perform a table conversion that optimizes the compression dictionary.
In any case, you may want to check whether additional compression savings are possible on the target system. You can use the SAP DBA Cockpit for this and go to Space → Compression Candidates to analyze potential compression improvements. If you want to do this before the SAP system is started, you can use the Db2 ADMIN_GET_TAB_COMPRESS_INFO table function to assess potential improvements.
Db2 also does not compress data in LONG VARCHAR fields, which were used in tables before LOB fields became available.
Large tables that are known to contain LOB fields are cluster tables or INDX-like tables – for example STXL, DBTABLOG, COVREF, and others. The SAP data type for such columns is typically "RAW" or "LRAW".
SAP uses a feature of Db2 that is known as LOB inlining. With this feature, for each LOB column in the table, a part of the data is stored within the data page and can be compressed as well. Although this part of the table can technically be compressed, the data itself may prevent the compression algorithm from working efficiently. This happens if the data is already compressed by the application.
Consequently, compression rates for tables with a large portion of the data in LOB columns may not be good.
Figure 6-8 depicts the compression results for the table STXL. The primary axis shows the compressed size of the table. The table has a total size of 52 GB without compression. Even with the most efficient way of enabling compression, the size is 49 GB, which is a reduction of just 6%.
Figure 6-8 Example: Compression result and runtime for different R3load Options.
On the secondary axis you can see the runtime of the import. Without compression or with ADC compression, the runtime is about 50 minutes, but it increases dramatically with full compression or OPT compression. With full compression, the subsequent reorganization contributes to the runtime. With the OPT_COMPRESS method, the R3load dump files are read completely twice, decompressed by R3load, and samples are imported. If the dump files reside on slow disks – as in our example – this significantly extends the import runtime.
Based on these results, our recommendation for tables with LOB fields is to use ADC compression as the default and to switch to full compression or OPT compression only if additional compression may be beneficial.
To do this, execute the first test migration with ADC compression and analyze the tables after they are imported.
Note: As the database may be declustered during the export, perform the analysis on the
target system as some tables with LOB Fields may no longer exist on the target system –
for example the table RFBLG.
Typically, the largest tables determine the overall compression rate, and therefore you should analyze only the largest tables. You can use the SQL statement shown in Example 6-9 to identify the 25 largest tables.
Example 6-9 SQL Statement to identify tables with column types that are not compressed
SELECT SUBSTR(A.TABNAME,1,18) AS TABNAME, SUBSTR(A.COLNAME,1,18) AS COLNAME,
SUBSTR(A.TYPENAME,1,18) AS COLTYPE, B.NPAGES
FROM SYSCAT.COLUMNS A, SYSCAT.TABLES B
WHERE A.TABSCHEMA = B.TABSCHEMA
AND A.TABNAME = B.TABNAME
AND A.TYPENAME IN ('BLOB','CLOB','DBCLOB','LONG VARCHAR')
AND A.TABSCHEMA = '<TABSCHEMA>'
AND B.TYPE ='T'
ORDER BY B.NPAGES DESC
FETCH FIRST 25 ROWS ONLY
Execute this statement on the target database, as table clusters are declustered and may no longer contain LOB columns. For example, the table cluster CDCLS contains a RAW field but is converted to the tables CDPOS and PCDPOS, which do not contain LOB columns anymore.
Replace <TABSCHEMA> with the table schema in your database and change the number of rows to fetch if you want to analyze more tables.
The result is a list of at most 25 tables for further analysis, as shown in Example 6-10.
Example 6-10 Sample output of SQL Query for tables with column types that are not compressed
TABNAME COLNAME COLTYPE NPAGES
------------------ ------------------ ------------------ --------------------
STXL CLUSTD BLOB 2101600
DBTABLOG LOGDATA BLOB 1750537
COVREF CLUSTD BLOB 311907
SOFFCONT1 CLUSTD BLOB 297153
DDNTF FIELDS BLOB 10569
In a second step, analyze the tables with the Db2 ADMIN_GET_TAB_COMPRESS_INFO table function as shown in Example 6-11.
As a result of the query, you get the estimated compression savings for the table if you perform a reorganization. Replace <TABSCHEMA> with the SAP schema and <TABNAME> with the table to be analyzed. The reported saving is also close to the compression results that you can get with the R3load options "FULL_COMPRESS" or "OPT_COMPRESS".
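A query along the following lines can be used for this check. This is a sketch of what Example 6-11 contains; the exact statement may differ, and <TABSCHEMA> and <TABNAME> are placeholders:
SELECT TABSCHEMA, TABNAME, PCTPAGESSAVED_CURRENT, PCTPAGESSAVED_ADAPTIVE
FROM TABLE(SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO('<TABSCHEMA>', '<TABNAME>')) AS T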
Looking at the query results shown in Example 6-12, you can evaluate whether the tables STXL and DBTABLOG are candidates for optimized compression.
Example 6-12 Sample output of SQL Query to assess further compression savings
TABSCHEMA TABNAME PCTPAGESSAVED_CURRENT PCTPAGESSAVED_ADAPTIVE
---------- ---------- --------------------- ----------------------
SAPSR3 DBTABLOG 52 84
TABSCHEMA TABNAME PCTPAGESSAVED_CURRENT PCTPAGESSAVED_ADAPTIVE
---------- ---------- --------------------- ----------------------
SAPSR3 STXL 4 7
The tables have been imported using ADC compression. In this example, the table DBTABLOG is a candidate for further optimization because significant savings are possible. The table STXL, however, will not benefit from enhanced compression.
To optimize the compression process without increasing the runtime, you may use OPT_COMPRESS as a global option and specify the option "COMPRESS" (which equals ADC) in the OrderBy.txt file for the table STXL and all other tables that may not benefit from compression.
6.5.6 Conclusion
Table 6-3 summarizes the pros and cons of the different options for Db2 compression.
(Table 6-3 fragment: OPT_COMPRESS – optimal compression rate, moderate to high runtime overhead.)
Although all information provided here is based on real-life migrations and intensive tests, there is no fixed set of parameters and values that applies to all migrations. The parameters must be adapted to the given system environment.
Db2 provides different methods of populating tables with bulk data. The options are the Db2 LOAD command, the db2load API, the INGEST command, the IMPORT command, and SQL INSERT
with different features for optimization. SAP uses only the db2load API and INSERT in the R3load executable. Therefore, we use the word "import" as a generic term for populating tables, irrespective of the method used; the method is either the db2load API or INSERT.
The INSERT procedure is sufficient for installations and migrations of smaller databases. If a large amount of data must be moved from one system to another, the db2load API is much faster because it writes formatted pages directly into the database.
This is true for most of the SAP tables, but in some situations INSERT can be the better choice.
Processing with Db2 LOAD is optimized for throughput. The overall load process is divided into several distinct phases: loading, building indexes and statistics, the delete phase, and the index copy phase. R3load uses the db2load API with a subset of the available Db2 LOAD features. Therefore, only two phases are normally shown in the Db2 diagnostic log file when using R3load:
1. Load – the phase during which the data is written to the table
2. Build – the phase during which the existing indexes are created
Note: The data imported with the db2load API is not logged. A subsequent roll forward will therefore mark all tables populated by Db2 LOAD as not accessible. A Db2 backup after a successful import is therefore mandatory!
In a test with the table SWWLOGHIST (60 million records, 25 GB of data), we compared an import using INSERT with an import using the db2load API. It took approximately 12 hours and 30 minutes to insert all data into the table, while the same table was loaded within 32 minutes – about 20 times faster.
Figure 6-9 includes different test results. Db2 LOAD typically shows significant performance improvements over INSERT for most tables, such as GLPCA and SWWLOGHIST.
Figure 6-9 Runtime Comparison – Db2 LOAD versus INSERT (Improvement Factor)
However, the improvement factor is smaller for tables such as CDCLS, RFBLG, or BSIS, so these tables could be candidates for INSERT combined with the table split option, which allows multiple R3load processes to populate the table in parallel. Even in the worst case, however, the db2load API is twice as fast as a single INSERT stream. Therefore, the db2load API is the default for R3load in SWPM.
The advantage of the db2load API is significant for tables with few columns, such as VBOX, CDHDR, ACCTCR, or SWWLOGHIST.
There are two classes of tables with less improvement when using the db2load API:
For tables with many columns, for example 90 or more, the performance advantage can be smaller.
– Examples of such tables are BSIS, COEP, or ACCTIT.
Another class of tables that are typically only a factor of 2 to 5 faster compared to INSERT are tables with LOB columns where most of the data is not inlined.
– Typical examples are cluster tables, INDX-like tables, or tables with LRAW 32000 fields.
– Tables that fall into this category are CE1IDEA, SOFFCONT1, or STXL.
There is another class of tables that does not benefit from Db2 LOAD: small tables with less than 200 KB of data. In this case, the initialization overhead of the utility is large compared to the performance improvement. Because of this, you will see tables that are imported by R3load with INSERT even if you specify the use of the db2load API. R3load automatically switches to INSERT for small tables.
Table 6-4 can be used as an example to assess the differences in throughput for Db2 LOAD. These are just examples and should not be considered a reference for your environment. The actual throughput depends on the IT infrastructure, the scheduling of the R3load jobs, and the table content.
Note: If you want to use Db2 LOAD instead of INSERT, you can use the R3load parameter LOAD_FORCED to force the use of LOAD in all cases. We discuss a viable scenario for this option later in section 7.3, "Table splitting" on page 139.
Check SAP Note 454173 - DB6: Accelerated R3load migration through CLI LOAD for details about the various R3load options.
Tip: As a best practice, do not set the environment variable temporarily in the user
environment. We recommend creating a copy of the import_monitor script and adding the
environment variable there. Another option is to use a dedicated script that sets the
environment variable and calls the import_monitor script.
DB6LOAD_CPU_PARALLELISM=<n>
This variable controls the Db2 LOAD CPU_PARALLELISM parameter.
DB6LOAD_DATA_BUFFER_SIZE=<n>
This variable controls the Db2 LOAD BUFFER_SIZE parameter.
DB6LOAD_DISK_PARALLELISM=<n>
This variable controls the Db2 LOAD DISK_PARALLELISM parameter.
DB6LOAD_INDEXING_MODE=<n>
This variable controls the Db2 LOAD INDEXING_MODE parameter, where 0 equals AUTOSELECT (default), 1 equals REBUILD, 2 equals INCREMENTAL, and 3 equals DEFERRED.
DB6LOAD_FORCE_LOAD
This variable forces Db2 to use Db2 LOAD even for small or split tables if it is set to any
value (for example, 1). You can also specify the force option directly in the R3load
arguments as LOAD_FORCED.
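Following the tip above, a wrapper script for the Migration Monitor might look like the following sketch. The script name import_monitor.sh and the values shown are placeholders for illustration only:
#!/bin/sh
# These variables apply to all R3load processes started by this Migration Monitor instance.
export DB6LOAD_CPU_PARALLELISM=4
export DB6LOAD_INDEXING_MODE=0
# Optionally force Db2 LOAD even for small or split tables:
# export DB6LOAD_FORCE_LOAD=1
./import_monitor.sh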
6.7.2 DB6LOAD_CPU_PARALLELISM
Use this parameter to exploit intra-partition parallelism. The parameter specifies the
maximum number of processes or threads used by the db2load API to parse, convert, and
format data records. The maximum value allowed is 30. If this parameter is not specified, the
db2load API selects a default value that is based on the number of CPUs on the system,
which normally is a good choice.
The LOAD utility is designed to load a small number of tables in parallel and to use a larger number of CPUs. For an SAP heterogeneous system copy, typically 50 or more R3load processes are running in parallel. In addition, experience and tests have shown that with a CPU parallelism of 4, 80-90% of the maximum possible throughput is achieved.
Therefore, R3load uses a CPU parallelism of 4 as the default. This allows for decent throughput and ensures that the CPU resources are used optimally.
The throughput decreases significantly if the CPU parallelism is set to 1, which is therefore not recommended. The effective CPU parallelism can decrease automatically when there is not enough memory available in the Db2 utility heap.
This means that even if you set DB6LOAD_CPU_PARALLELISM to a dedicated value or use the R3load default, Db2 may decrease this number if not enough memory is available. In section 6.7.3, "Db2 utility heap and DB6LOAD_DATA_BUFFER_SIZE" on page 111, we explain this behavior, how to detect it, and which settings we recommend.
Typically, there is no reason to specify a larger CPU parallelism. One scenario could be that the target system temporarily has a large amount of CPU resources available while the number of R3load processes is limited. For example, if your target machine has 416 virtual CPUs and you are running 50 R3load processes in parallel, it could be beneficial to increase the CPU parallelism.
Another use case could be to specify a larger CPU parallelism for the table (or the small number of tables) that determines the overall import runtime. This would mean that most of the tables are imported with the default CPU parallelism of four, while a few tables are imported with a higher degree of parallelism.
To do so, two separate instances of the Migration Monitor with mutually exclusive sets of tables are required. If you want to use this feature, familiarize yourself with the Migration Monitor and how to use it outside the SWPM.
With this setup, the Migration Monitor instance started by SWPM runs with the default setting of a CPU parallelism of 4, and you start a second instance with a larger setting. To do so, set the environment variable DB6LOAD_CPU_PARALLELISM only in the terminal session of this second instance.
The memory for a LOAD operation is calculated based on the number of processes defined for loading data and on the page size and extent size of the tablespace in which the table resides. This memory is allocated from the utility heap. If there is not enough memory available, the number of CPUs used for LOAD is reduced.
The Db2 utility heap can be configured with a fixed value between 16 and 2147483647 4 KB pages, which means up to 8 TB. It can also be set to AUTOMATIC, which means that the memory is allocated within the boundaries specified by the Db2 configuration parameter INSTANCE_MEMORY.
As a sufficient buffer size for the Db2 LOAD utility is essential for optimal performance, it is best practice to specify a lower boundary for the Db2 utility heap together with the AUTOMATIC setting. A good starting point is 500,000 pages or more, using:
UPDATE DB CFG FOR <DBSID> USING UTIL_HEAP_SZ 500000 AUTOMATIC
Figure 6-10 shows the effect of a slowdown due to a shortage of the utility heap and the resulting reduced performance. The table CDCLS was exported in chunks and imported with R3load using the db2load API.
The throughput of the table varies significantly. While the imports of parts 1-8 achieved over 50,000 rows per second, parts 9-11 ran at only about 10,000 rows per second. During the import of parts 9-11, too many R3load processes were started in parallel, so Db2 LOAD did not get enough memory (less than 4,000 4 KB pages) and therefore reduced the CPU parallelism.
How do you identify such a slowdown? The Db2 diagnostic log file records the parallelism of each Db2 LOAD operation, so you can check whether the LOAD utility has decreased the CPU parallelism. This is shown in Example 6-13.
Example 6-13 Messages in Db2 diagnostic log file created by Db2 LOAD
2023-01-23-15.23.45.962927+060 I41900E554 LEVEL: Warning
PID : 1594211 TID : 140351612380928 PROC : db2sysc 0
INSTANCE: db2tar NODE : 000 DB : TAR
APPHDL : 0-44 APPID: *LOCAL.db2tar.230123135508
UOWID : 59 ACTID: 1
AUTHID : DB2TAR HOSTNAME: MIGTARGET
EDUID : 215 EDUNAME: db2agent (TAR) 0
FUNCTION: DB2 UDB, database utilities, sqluInitLoadParallelism, probe:1097
MESSAGE : Load parallelism modified from 4 to 1
Important: Ensure that Db2 is not throttling the CPU parallelism to a value of one, as this comes with a significant performance degradation.
The Db2 LOAD utility normally also records utility information in the history file, including the memory used for each load operation. In an SAP environment, the Db2 registry variable DB2_HISTORY_FILTER=L is temporarily set by SWPM to reduce potential contention on the history file.
For a detailed analysis, you can temporarily unset this registry variable; the buffers used for each Db2 LOAD process are then reported in the Db2 history file.
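A minimal sketch of this check, assuming the database alias <DBSID>, could look as follows; setting the registry variable to an empty value removes it:
db2set DB2_HISTORY_FILTER=
db2 LIST HISTORY LOAD ALL FOR <DBSID>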
6.7.4 DISK_PARALLELISM
The DISK_PARALLELISM parameter specifies the number of processes or threads used by the
LOAD utility to write data records to disk. Use this parameter to improve load performance.
The maximum number allowed is the higher value of either four times the CPU_PARALLELISM
value (used by the LOAD utility), or 50.
By default, DISK_PARALLELISM equals the sum of the tablespace containers on all tablespaces
that contain objects for the table being loaded - except where this value exceeds the
maximum number allowed. Typically, there is no need to change this value manually.
DEFERRED
The LOAD utility does not attempt to create an index if this mode is specified. Indexes are
marked as needing a refresh. The first – not LOAD related – access to such indexes
forces a rebuild, or indexes might be rebuilt when the database is restarted. However, this
option is ignored in many cases as almost all SAP tables have unique indexes that should
be defined prior to loading.
Changing any of these indexing mode values is typically not recommended, and the default should be used unless it is needed for specific optimizations. One of these specific optimizations is the usage of the db2load API when sequentially importing split tables. In this case, setting the indexing mode to INCREMENTAL can help to improve the overall performance of the import when the indexes are created before the load.
This functionality does not actively sort the data; it just preserves the order of the data. So, if an unsorted export is loaded, it remains unsorted. Therefore, it is beneficial to disable order preservation for all packages that are exported unsorted.
The R3load option ANY_ORDER enables the db2load API to ignore the order of the data handed to it for loading.
Even for tables that are exported with sorting, this parameter can help to improve load performance. The sorting is typically required for the R3load processing on the export side (for example, declustering or codepage conversion) but not for the import side. Therefore, the ANY_ORDER option can be used on the import side for sorted exports as well. This is also true if R3load requires the data to be sorted (for example, due to codepage conversion or declustering on the target), as the R3load processing is done before the db2load API processing.
You only need to avoid the ANY_ORDER option if you require strict sorting of the data on the target system, for example if you want to achieve a 100% optimal compression rate, which usually yields only minimal additional savings.
To better explain this, we show an example where a larger number of R3load processes results in decreased throughput.
The import time graph shows that the migration with 2 parallel load processes completes after two hours, as shown in Figure 6-11 on page 115.
The same import with 8 parallel processes took 3.5 hours, as shown in Figure 6-12.
Looking at the timings, you can see that the BKPF package went from 45 minutes to 3.5 hours, and because of this the overall runtime of the import expands. This is caused by using too many parallel processes. The reason for such a slowdown is typically overloaded resources – in the example shown in Figure 6-12, it was the storage subsystem.
In other cases, it might be the CPU utilization, and the following graph gives you some guidance based on the CPU usage characteristics. Figure 6-13 shows the CPU usage during an import of a single table with R3load and the db2load API with a CPU parallelism of 8 that is not I/O bound.
One CPU has a much higher workload compared to the others. A typical scenario is that one CPU is used up to 100% while the others are used between 10% and 20% for a single load process.
With this data, it becomes clear that the number of importing processes should not be higher than the number of available CPUs. In fact, a good recommendation is to start with only 75% to 80% of the CPU count as the maximum number of parallel processes.
For example, if the import server has 128 CPUs, you should not use more than 90 to 100 parallel R3load processes.
This can be used as a starting configuration, but you should monitor the resource usage during the migration process and ensure that no resource is overloaded and that no resources are idle.
<Empty file> corresponds to any file on the filesystem that has a size of 0 Byte.
If for any reason the LOAD terminates abnormally, the tablespaces in rare cases remain in the quiesce exclusive state (db2 list tablespaces shows state 0x0004). To remove this status, use the commands shown in Example 6-15.
This procedure unquiesces the tablespaces; afterwards, the table can be cleaned up from the "LOAD pending" state.
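The unquiesce step typically looks similar to the following sketch (our assumption of what Example 6-15 contains; <SCHEMA> and <TABNAME> are placeholders for the affected table):
db2 CONNECT TO <DBSID>
db2 QUIESCE TABLESPACES FOR TABLE <SCHEMA>.<TABNAME> RESET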
The R3load tool allows you to specify whether the indexes should be created before or after
the data load. If you create the indexes before the load, they are maintained during the build
phase of Db2 LOAD.
Example 6-16 on page 117 shows a db2diag.log with entries for the build phase of the Db2
LOAD.
116 Db2 Optimization Techniques for SAP Database Migration to the Cloud
Draft Document for Review June 20, 2023 5:35 pm 8531ch06.fm
Example 6-16 Messages in Db2 diagnostic log file created by Db2 LOAD
2022-07-20-03.46.17.195667 Instance:db2fcp Node:000
PID:5721(db2lrid) Appid:*LOCAL.db2fcp.010719161634
database_utilities sqlulPrintPhaseMsg Probe:0 Database:FCP
Starting BUILD phase at 07-20-2022 03:46:17.193097.
2022-07-20-03.56.41.297931 Instance:db2fcp Node:000
PID:5721(db2lrid) Appid:*LOCAL.db2fcp.010719161634
database_utilities sqlulPrintPhaseMsg Probe:0 Database:FCP
Completed BUILD phase at 07-20-2022 03:56:41.288616.
To decide on the order of index creation, also take the following considerations into account:
– The performance of the index creation depends to a large extent on the scheduling. If many tables create indexes in parallel, the I/O subsystem may become the bottleneck, and it may be more beneficial to pause the index creation for some tables and schedule it for a period with less I/O contention.
– If the index is created before the load and the export dump contains duplicate records for some reason, both the load and the index creation must be repeated. If the index is created after the load and duplicate records exist, it may be possible to clean up the records and redo only the index creation.
– For the cleanup of a split table load, the data is deleted in multiple chunks using appropriate SQL statements. If no index exists, the result is multiple slow table scans.
– The process of maintaining the index is different for tables that are imported with the db2load API and for tables that are imported using INSERT. The db2load API maintains the indexes during a dedicated phase, while INSERT maintains the indexes for each record inserted.
The default for index creation is AFTER_LOAD. Editing the first two lines in the DDLDB6.TPL file determines the order of index creation for R3load, as shown in Example 6-17.
Example 6-17 DDLDB6.TPL file is used to define the order of index creation
prikey: AFTER_LOAD ORDER_BY_PKEY
seckey: AFTER_LOAD
When tables are imported using INSERT instead of the db2load API, creating the indexes is faster in all cases when the default of AFTER_LOAD is used. The impact can be significant. See Figure 6-14 for our results.
Figure 6-14 Performance impact of index creation before and after with INSERT
The same is true when tables are loaded with the db2load API. In almost all cases, it is faster
to create the index after the tables are loaded. Figure 6-15 shows our test results.
Figure 6-15 Performance impact of index creation before and after with Db2 Load
Note: This is true for Db2 10.5 and higher and has changed from previous versions due to
optimizations in Db2.
In some cases, it might be useful to change the order of index creation for the primary index. A test with the tables EDIDC and COEP, for example, showed a noticeable improvement of about 10% when the primary key index is built before the table is loaded. This test is not conclusive enough to recommend this combination in general, but it is worth trying this configuration if you need to optimize the process further. You can implement this optimization by creating a special template file – for example, DDLDB6_INDX.TPL – and using it for specific tables in the same or in a dedicated Migration Monitor instance.
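Such a template might start with the following two lines. This is a sketch only; DDLDB6_INDX.TPL is a file name chosen for illustration, and we assume that BEFORE_LOAD is the intended setting for the primary key:
prikey: BEFORE_LOAD ORDER_BY_PKEY
seckey: AFTER_LOAD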
Another scenario where a deviation from the default could make sense is the usage of Db2
registry variable DB2_SMP_INDEX_CREATE that is explained in section 6.10, “SMP parallelism for
index creation” on page 120.
Note: The index creation typically generates the highest I/O load during the migration
process. A good optimization strategy, therefore, is to distribute the index creation
uniformly over the whole migration process to be able to use all available resources, for
example, CPU, disk, and memory.
6.8.2 SORTHEAP
Setting SORTHEAP correctly together with SHEAPTHRES_SHR is one of the most powerful optimizations for the import, and especially for the index creation phase. If set correctly, it minimizes I/O, and the index build can improve by up to 50%.
The SORTHEAP parameter defines the maximum number of private or shared memory pages to be used for a single sort. You should define this parameter as large as possible for the index creation, which might avoid sort overflows and spilling data to disk.
This parameter can be set significantly higher than a typical value for production. While the recommendation for productive operation of ERP systems is at least 2048, SORTHEAP should be set to a significantly larger number during migrations.
Although it is usually not possible to avoid spilled sorts during the migration of large tables, the goal is to avoid as many sort overflows as possible. So, if enough memory is available, set the SORTHEAP parameter to AUTOMATIC with a starting value of 1,000,000 pages. To optimize this value, monitor the database for sort overflows and adjust the parameter accordingly, together with the SHEAPTHRES_SHR configuration parameter. Consider assigning available memory to the sort areas instead of the buffer pools during the target system import.
6.8.3 SHEAPTHRES_SHR
SHEAPTHRES_SHR represents a soft limit for the total amount of database shared memory that can be used by sort memory consumers at any time, whereas SHEAPTHRES defines the limit on the instance level. As per the SAP recommendation, SHEAPTHRES is set to 0, which means that all shared sorts are performed within the database memory.
Ideally, you should set SHEAPTHRES_SHR to AUTOMATIC with a reasonable multiple of SORTHEAP as the initial value.
A good starting configuration is to set the value of SHEAPTHRES_SHR to the number of parallel R3load processes multiplied by the minimum SORTHEAP value. However, the optimal value could be higher, as indexes are created in parallel and multiple parallel sorts for a single index creation may run at the same time.
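A starting configuration along these lines can be set with the following sketch; the values are placeholders that you should adapt to the available memory and the number of parallel R3load processes:
db2 UPDATE DB CFG FOR <DBSID> USING SORTHEAP 1000000 AUTOMATIC SHEAPTHRES_SHR 20000000 AUTOMATIC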
To verify the correct setting of the SORTHEAP and SHEAPTHRES_SHR parameters, you need to monitor the sort overflows in the database.
As a starting point, you can use the MON_GET_DATABASE table function or the DBSUMMARY procedure, which are described in Chapter 10, "Tips for monitoring" on page 175. In both cases, you must analyze whether sort operations are spilling to disk. The reported values for the following metrics should decrease or be zero after the optimization of the sort memory:
sort_overflows The total number of sorts that ran out of sort heap and may have
required disk space for temporary storage.
post_threshold_sorts The number of sorts that have requested heaps after the sort
heap threshold has been exceeded.
post_shrthreshold_sorts The total number of sorts that were throttled back by the
sort-memory throttling algorithm. A throttled sort is a sort that
was granted less memory than requested by the sort-memory
manager.
Note: Most likely you will not be able to decrease these sort metrics to zero as the build of
indexes for large tables may exceed the available memory on the system. However, it is
desired to remove as many spilled sorts for other large and medium-sized tables as
possible to optimize the resource consumption on the storage subsystem.
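A quick check of these metrics might look like the following sketch; the query assumes that you are connected to the database and uses -2 to aggregate over all members:
SELECT TOTAL_SORTS, SORT_OVERFLOWS, POST_THRESHOLD_SORTS, POST_SHRTHRESHOLD_SORTS
FROM TABLE(MON_GET_DATABASE(-2)) AS T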
Unlike a 'normal' workload, the migration process is somewhat special, as the only workload on the database during the import phase of the migration is loading data and building indexes.
To analyze and optimize the use of the buffer pools, we need to take a closer look at how the buffer pools are used during the import. Figure 6-16 indicates that the use of the buffer pools is dominated by temporary data, and it becomes clear that we have a high amount of physical I/O, resulting in poor buffer pool quality.
Normally, you would increase the buffer pool to reduce the physical I/O. However, in this situation the temporary I/O is generated by the index build. Instead of increasing the buffer pool, you should optimize the sort memory areas to reduce the I/O activity to a minimum.
Note: The buffer pool configuration is of less importance for heterogeneous system copies
if you use the db2load API. The priority for memory optimization should be the utility heap
and the sorting memory configuration. Assign the remaining memory to the buffer pool and
the other Db2 memory areas. This configuration needs to be changed after the system
copy is completed.
The Db2 registry variable DB2_SMP_INDEX_CREATE increases the parallelism during index creation if CPU and I/O resources are available. As the registry variable is active for all processes, setting it to a large number may not be beneficial, because it locally optimizes the index creation but blocks resources for importing the data and degrades the performance of creating smaller indexes.
If the index creation process is the bottleneck and you have 64 or more CPUs on the importing server, you can set the value to a higher number – for example to 16 – for the whole heterogeneous system copy.
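As a sketch, the registry variable is set for the instance as follows; the value of 16 is an example only:
db2set DB2_SMP_INDEX_CREATE=16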
If the index creation of a single table or a few tables is the limiting factor, you can try to schedule the index creation of these tables at the end of the process and set the variable to a high number – for example, to the number of cores available.
You can achieve this by editing the relevant ".TSK" file and removing the task or setting it to "OK", so that you can create the index manually. Another option is to call R3load with the -o (omit) option to not create the indexes. See Example 6-18 for an example file setup.
In Example 6-18, the R3load task file has entries for the table EDIDC with two optimizations. The primary key index is created before the data is loaded, and the build steps of the secondary indexes are set to completed. After the initial system copy is completed, you can set DB2_SMP_INDEX_CREATE to a high value, revert the steps for the secondary indexes to "xeq", and start the R3load process again.
As the Db2 registry variable DB2_SMP_INDEX_CREATE is dynamic, you can also set the variable interactively during the migration process. After the value has been changed, all new R3load processes that trigger a CREATE INDEX statement will pick up the new value.
Figure 6-18 on page 122 shows the CPU usage of a single R3load process, creating one
index with DB2_SMP_INDEX_CREATE=24.
Attention: This variable can significantly affect your migration and can bring the whole heterogeneous system copy to a halt, as it comes with a significant increase in resource consumption for this single process. In the example shown in Figure 6-18, 100% of the CPU is used and no resources are left for other tasks.
6.11 LOCKTIMEOUT
During production operation, you must set this parameter to an appropriate value to avoid deadlocks and to ensure that no applications wait indefinitely.
During the migration process, we recommend setting the value of LOCKTIMEOUT to "-1".
If you set this parameter to "-1", lock timeout detection is turned off. This avoids an index creation task aborting because it is waiting for a lock held by another long-running index creation job.
This is also useful if you import multiple data packages in parallel using the db2load API (WHERE splitter). In this case, the first LOAD process holds the lock, and the other processes must wait until its data package is imported.
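For the migration phase, the parameter can be changed with the following sketch, which follows the convention used for the other configuration commands in this chapter:
db2 UPDATE DB CFG FOR <DBSID> USING LOCKTIMEOUT -1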
For large tables, this can be an expensive and time-consuming procedure. However, Db2 delivers some functions to improve the runtime of this step and therefore reduce the downtime of a migration.
Db2 collects table and index statistics automatically when this is enabled by the database configuration parameter AUTO_RUNSTATS. When enabled, Db2 periodically (in a 2-hour interval) checks whether statistics need to be updated and performs the required statistics collection during the defined online maintenance window. Furthermore, the Db2 database configuration parameter AUTO_SAMPLING should be enabled so that Db2 automatically uses page-level sampling.
The AUTO_STMT_STATS database configuration parameter is used for real-time statistics (RTS) collection. When enabled, the Db2 optimizer creates statistics synchronously or fabricates statistics based on other available information. In any case, the synchronous statistics collection will not spend more than 5 seconds in the process and will also use sampling for larger tables. If sampling is used or the runtime exceeds the 5-second limit, Db2 schedules an additional asynchronous statistics collection.
As part of the heterogeneous system copy, SAP SWPM schedules RUNSTATS jobs only for a selected set of tables and relies on the automatic or real-time statistics collection of Db2.
To get a recent and valid set of statistics, the following options for optimizing the statistics collection are available:
Enable automatic statistics collection early during the heterogeneous system copy.
Manually execute RUNSTATS.
Extract statistics from a previous test migration using db2look and apply them using the Db2 command line processor.
If this is done, the automatic statistics collection starts early. It comes with a slight increase in I/O and CPU resource consumption, with an impact in the low single-digit percentage range.
Besides the resource usage, there are two more potential trade-offs:
Some tables may be subject to multiple automatic statistics collections. This could happen, for example, if INSERT is used and the import takes 2 or more hours.
The last table or the last few tables that are imported will not be analyzed before the SAP system is started. To address this, the optimizations of manual RUNSTATS or the Db2 statistics copy with db2look can be used.
To enable automatic statistics collection during the import, issue the following command:
UPDATE DB CFG USING AUTO_RUNSTATS ON
Example 6-19 shows an SQL statement that compiles the RUNSTATS commands.
In the SQL statement, <SAP-SCHEMA> must be replaced by the actual name of the SAP schema. The SQL statement returns all non-volatile tables in the SAP schema that do not have statistics. Save the output to an SQL script, for example db2_stats.sql.
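The generating statement could look like the following sketch; this is our assumption of what Example 6-19 contains, and you may want to adapt the RUNSTATS clause to your needs:
SELECT 'RUNSTATS ON TABLE ' || RTRIM(TABSCHEMA) || '.' || RTRIM(TABNAME) ||
' WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL;'
FROM SYSCAT.TABLES
WHERE TABSCHEMA = '<SAP-SCHEMA>'
AND TYPE = 'T'
AND VOLATILE <> 'C'
AND STATS_TIME IS NULL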
You can then execute the script db2_stats.sql with the command:
db2 -tvf db2_stats.sql
As a further improvement, you can divide the script db2_stats.sql into different pieces
and execute the scripts in parallel.
As a best practice, run dedicated RUNSTATS commands for the largest tables, which you have also identified for export package or table splitting. In addition, let one or two manual RUNSTATS scripts run for the remaining tables.
Depending on the free resources, you can already start the RUNSTATS scripts while the import of the last tables is still running.
You can also combine this process with the early enablement of automatic statistics collection. The SQL statement shown above returns only tables that do not already have statistics. So, in the best case, only a few tables remain for manual statistics collection.
After the statistical values have been calculated, they are stored in Db2 catalog tables that can be updated. As statistics values usually exist from a previous test migration, you can extract them using the db2look command. Although the statistics information is typically based on an older snapshot of the production database and therefore outdated, you can use it to start production on the target system. The advantage is a sub-second execution time for applying the statistics information to the target database; the information itself is collected during a test migration or even from the source system, outside the downtime window.
Gathering RUNSTATS information from the source system using db2look and applying it on the target is only possible if the Db2 parameters shown in Example 6-20 are equal on the source and the target. If they do not match, subsequent Db2 automatic or real-time statistics collection could generate more up-to-date statistics that would be incorrect.
Attention: After table or index statistics are updated using the db2look procedure, the table is excluded from automatic and real-time statistics collection.
To re-enable automatic RUNSTATS after the import of the statistics with db2look, a manual RUNSTATS must be executed. To re-enable automatic and real-time statistics collection, use the following command for each corresponding table:
RUNSTATS ON TABLE <tabname> SET PROFILE NONE
This Db2 command can be executed while the SAP system is online and thus outside the downtime window. However, if you forget to execute it, subsequent statistics will not be collected.
Example 6-21 shows how to generate a db2look SQL file for the update of the table EDIDS in the database NZW with the schema name sapnzw. The name of the generated file is edids_db2look_stats.sql.
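The command might look like the following sketch, which is our assumption of what Example 6-21 shows; the -m option generates UPDATE statements that mimic the catalog statistics, and -r suppresses the generation of RUNSTATS commands:
db2look -d NZW -z sapnzw -t EDIDS -m -r -o edids_db2look_stats.sql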
Note: When executing the db2look command, make sure that you do not forget the option -r. Otherwise, the generated script contains RUNSTATS commands and the performance advantage is completely gone.
Finally, you must apply the extracted statistics information to the Db2 target system as part of the production migration. You can use the following command:
db2 -tvf <tabname>_db2look_stats.sql
You may combine the db2look procedure with the early enablement of automatic statistics collection and the manual statistics collection, and use the db2look-based procedure only for the last few remaining tables.
In many cases, the combination of early automatic statistics collection and manual RUNSTATS is sufficient. We recommend the db2look procedure only in exceptional situations.
In general, the benefit of using application servers for the import is not as big as during the export.
Figure 6-19 shows the result of tests with local and remote R3load processes. It shows the workload distribution – based on AIX WLM monitoring – between the database and R3load, gathered from a migration of a customer system.
Based on these results, a rule of thumb is that for the import about 10% - 15% of the CPU workload can be shifted away from the database server by using a separate application server for R3load. However, the question remains why the database load was not faster in our test case when using remote R3load. The issue is the network traffic, which increases significantly. This is shown in Figure 6-20 on page 127. The data between the SAP application server and the database server is not compressed, and so the process loses the advantage of shipping less data over the network.
Figure 6-20 Network I/O during Import with local and remote R3load
Combining the advantages of dedicated application servers for the export with the findings here, a potential setup could be to use the importing database server as an exporting server as well. This should not be done as a default; most likely, the best configuration is to use a dedicated Migration Monitor instance with this setup for some tables only.
STMM decreases the need for detailed memory configuration. It can tune the following
memory consumers from the Db2 database global memory:
– Database LOCKLIST and MAXLOCKS
– Package cache size (PCKCACHESZ)
– Sort memory (SHEAPTHRES_SHR, SORTHEAP)
– Buffer pools
While the values of LOCKLIST, MAXLOCKS, and PCKCACHESZ do not play a significant role in the performance of the heterogeneous system copy, the configuration for sorting and the buffer pools is important. Of these, the sorting configuration is the more important one, as the Db2 LOAD utility loads data into the database bypassing the buffer pools. However, the buffer pool is still important, and STMM can help to find the optimal configuration.
Figure 6-21 shows the configuration changes and runtime improvements for five migrations in a row.
During the five test migrations, STMM changed the memory configuration about 150 times; we call each of these actions an "STMM change point".
STMM performed the configuration changes for the buffer pool, the package cache, and the Db2 sort configuration. The sizes for MAXLOCKS and LOCKLIST were never changed during the tests and are not part of the figure. The primary axis on the left displays the memory size for each of the monitored memory consumers; on the secondary axis, the runtime of the import is shown.
You can see in Figure 6-21 that STMM rapidly adapts the sizes of SORTHEAP and SHEAPTHRES_SHR and increases the buffer pool a little more slowly. At position 50, the first test migration was completed, and the second test migration was already close to the optimal runtime.
Where does the improvement by STMM come from? STMM adapted the SORTHEAP and SHEAPTHRES_SHR configuration, which improved the index creation.
The chart also shows that the last test migration (positions 120-150) was slightly slower than the previous test run. This can happen and is not significant. More significant is the fact that the configuration changes at all, because this conflicts with the best practice of conducting the final migration with exactly the same settings and schedule as the last successful test migration.
6.14.2 INTRA_PARALLEL
This parameter specifies whether the database manager can use intra-partition parallelism. It can be set in accordance with SAP Note 2047006 – DB6: Use of Db2 SMP Parallelism (INTRA_PARALLEL=YES).
In our tests, we have not seen any noticeable positive or negative performance impact. In theory, the parameter could help if the R3load option "SPLITTED_LOAD" is used. It can improve the final copy phase, but typically the bottleneck is more I/O related, and thus the Db2 parallel processing does not help much.
On the other hand, lowering this value could influence the performance of the index build. The CHNGPGS_THRESH parameter can force temporary data for the index creation to be written to disk even if it would fit into the buffer pool.
Tuning this parameter is difficult, and we can give no general recommendation, but the default value of 20 seems to be a good starting point.
You can configure the page cleaners on your system to be proactive with the registry variable DB2_USE_ALTERNATE_PAGE_CLEANING.
When you set the registry variable to ON, the page cleaners behave more proactively in choosing which dirty pages get written out. After enablement, the page cleaners no longer respond to the value of the CHNGPGS_THRESH database configuration parameter.
During our tests, we have not seen a clear positive or negative impact, so we cannot give a recommendation to use it – or not to use it.
One tempting method to optimize the temporary I/O is to use a RAM drive and create the temporary tablespaces in this memory area. The advantage is fast I/O.
The results may be promising at first sight, but with a decent configuration of SORTHEAP and SHEAPTHRES_SHR, we have not seen a clear benefit from this concept. In short, if you have enough main memory on the target machine, configure Db2 with a large SORTHEAP and SHEAPTHRES_SHR instead of creating a RAM drive.
You can deactivate logging during R3load processing, which avoids disk contention and logging overhead during inserts. However, if you run the roll forward utility and it encounters a log record that indicates that a table in the database was populated with the NOT LOGGED INITIALLY option, the table is marked as unavailable. Therefore, perform a full offline backup after the migration to ensure that the database and all tables are recoverable.
The -nolog option reduces the amount of logging but is typically not faster than running with logging enabled. So, this option should only be used when the logging I/O is limiting the performance.
Important: Do not use the -nolog parameter of R3load when using the parallel import of table splits with INSERT. This results in lock wait situations and sequential processing of the tables.
With native database encryption, the database system itself encrypts the data before it calls
the underlying file system to write that data to disk. This means not only your current data is
protected, but also data in new tablespace containers or tablespaces that you might add in
the future. Native database encryption is suitable for protecting data in cases of either
physical theft of disk devices or privileged user abuse.
It also ensures that damaged or outdated disks that may be replaced and scrapped by the
cloud service provider do not contain readable data.
As always in IT, additional functionality comes at a certain cost. Db2 native encryption introduces an additional processing step in all I/O operations: the decryption process when data is read from disk, and the encryption process when data is written to the tablespaces or the Db2 transaction log files.
The encryption process is supported by hardware acceleration starting with Db2 11.1. This acceleration makes a significant difference in the impact on both system resource consumption and application throughput. Db2 automatically leverages the following CPU enhancements:
– Intel Advanced Encryption Standard New Instructions (AES-NI) support
– POWER8 in-core support for AES
The hardware acceleration is automatically detected and used by Db2. To determine whether your system is capable of this, check the Db2 diagnostic log file for the following entry:
"Encryption hardware acceleration detected"
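A simple check could look like the following sketch; the path to db2diag.log depends on your DIAGPATH setting and is an assumption here:
grep -i "Encryption hardware acceleration" /db2/<DBSID>/db2dump/db2diag.log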
Note: Most of the SAP-certified servers offered by the major cloud service providers provide AES-NI support for Linux and Windows. You should use Db2 native encryption only when support for this feature is available.
The effect of the additional processing needed for encryption and decryption shows up as slower disk access, although it is CPU or accelerator resources that are consumed. If you enable Db2 native encryption, monitor the following Db2 metrics closely before and after the enablement:
pool_read_time Indicates the total amount of time spent reading in data and index
pages from the tablespace containers (physical) for all types of
tablespaces.
pool_write_time Cumulative elapsed time for each asynchronous write to the
tablespace containers to complete
log_disk_wait_time The amount of time an agent spends waiting for log records to be
flushed to disk.
There are several options to enable Db2 native encryption. It can be enabled during a restore operation, which can be optimized with HADR to minimize downtime. It can also be enabled at database creation time – the SAP SWPM offers this option. The database is then created with Db2 native encryption enabled, and all data that is subsequently imported by R3load is encrypted.
As Db2 native encryption introduces additional processing, and with it additional resource usage, it may slow down the migration process. While we have seen only minimal impact during the export on an IBM Power server, the impact during the import is more significant. This is shown in Figure 6-22.
Figure 6-22 R3load Import with and without Db2 native encryption enabled.
The graph shows a significant impact on some tables. The import time, including the index build, for the tables MSEG and VBEP doubles, while the impact on other tables is a 20% to 50% performance degradation. The impact depends on the table structure, but even more on the number and width of the indexes defined.
Therefore, the overall impact on the migration process may vary, depending on the largest tables in the system. The impact, however, will be noticeable, and you should plan for an additional 50% of runtime for the heterogeneous system copy with Db2 native encryption enabled.
Besides the default optimization to minimize I/O by reducing spilled sorts and the -nolog option
for tables that are imported with INSERT, there is not much optimization potential for Db2 native
encryption.
You must assess if the heterogeneous system copy still fits in the downtime window or if you
want to implement the feature later using backup and restore, perhaps downtime optimized by
using HADR. For details about this process optimization with HADR, please check out the
Db2 for SAP Community page.
This chapter describes some important facts about both features to provide you a better
understanding of how they can impact your migration.
The motivation for socket transfer is obvious. You do not want to wait with the import until the
export is completed. You want to start with the import immediately when the export starts as
shown in Figure 7-1. With this, you can optimize the scheduling of the export and import and
save up to 50% of the time.
Through the use of table split, multiple goals can be achieved. The export can be improved by
running multiple R3load processes that export a single table in distinct chunks or splits. The
improvement for this comes to some extent from the database processing. Db2 is typically
able to deliver data much faster than a single R3load can receive, compress, and write to
disk.
The benefit of multiple R3load exports against a single table primarily comes from the
parallelization of the R3load internal processing. This is even more true if a code-page
conversion or declustering is done on the export side.
On the import side improvements are possible as well, but knowing the differences between
how Db2 handles LOAD versus INSERT processing is important to understand the different
import optimizations. This is presented in 6.6.1, “Db2 LOAD versus INSERT” on page 107.
You can execute only one R3load process that uses the db2load API per table. So, you may end up
in a situation where your export is running with multiple R3load processes, and the import
sequentially loads the export dumps. While this seems to be a disadvantage, it can also be
used to achieve similar effects as the socket transfer option: an overlapping of export and
import, as shown in Figure 7-2.
Figure 7-2 Table Split with single R3load using Db2 LOAD: “Forced Load Option”
To accomplish this overlap, you define a reasonable number of splits and do not execute them
in parallel but sequentially. You then transfer the files to the target by configuring the network
or FTP exchange directory and let the export monitor and the import monitor communicate
using the signal files to manage the import process.
With this, you achieve almost the same effect as with the socket option. Multiple chunks are
defined, but only one R3load is exporting at a time and, on the target, the import is managed
so that only one R3load using the db2load API is running at a time. This concept is called
“Forced LOAD” and the effect is almost the same as the socket transfer option. The difference
is that the importing R3load must wait for the first table split to complete the export.
Using INSERT is typically slower than Db2 LOAD but because multiple R3load processes
with INSERT can execute in parallel, another option is to use multiple R3load streams for both
import and export as shown in Figure 7-3.
Figure 7-3 Table Split with multiple R3load processes using INSERT
With multiple parallel R3load processes for one table on the importing side, you also
parallelize the R3load internal processing like decompressing the R3load dump files which
can provide an efficient use of resources and reduce the migration time.
The third option is to import the data into multiple temporary tables with the db2load API,
followed by a final database internal copy with LOAD FROM CURSOR into the final table.
This option is called the “Split Load” option and is shown in Figure 7-4.
Figure 7-4 Table Split with single R3load using Db2 LOAD: “Split Load Option”
Although the data is loaded twice, the procedure can be much faster than using just one R3load.
The final LOAD FROM CURSOR does not involve any R3load processing or Db2 internal
code page conversion (UTF-16 to UTF-8) and is typically faster by factors than an R3load
using the db2load API.
In our test environment we measured a reduction of the runtime by 35%. This can lead to a
substantial reduction of downtime during a heterogeneous system copy. But why is the
runtime not reduced by 50%, which is the best you can expect?
The best result you can achieve is when the export and import processes execute with the
same throughput. As soon as one part is slower, it determines the overall improvement. In the
test case with the table BKPF, the export was the limiting factor.
Figure 7-6 MigMon Configuration for Unicode Conversion with Socket Feature
To configure the socket option, you must set the parameters in the MigMon properties file as
shown in Table 7-1.
socket Socket operating mode. R3load will not write dump files to
the file system; instead, the export and import work through
the socket connection.
The port definition on export and import side must match. Otherwise, there will be no
connection between them.
You can also use this option within a server. A possible implementation scenario is to use the
importing server also as application server for the export and to define the socket
communication locally. If you want to combine this option with other techniques of data
transport, you must use multiple MigMon instances due to different configurations in the
properties file.
You can also use the socket option for Unicode conversions if you obey the restrictions
described in SAP Note 971646 - 6.40 Patch Collection Hom./Het.System Copy ABAP.
You need a stable network connection between the export and import side. If there is an error
in the data stream, the process fails and needs to be restarted. This is especially true if you
have an R3load pair that handles a large table.
Restart of a failed socket transfer can be more complex or may take more time. In general, if
one process of an R3load pair fails, the remaining processes must be stopped as well. For
various combinations of restarting a socket R3load, please refer to the SAP Heterogeneous
System Copy Guide.
When using the socket option, declustering needs to take place on the target server and thus
the migration monitor needs to be set up differently. You must set
SUPPORT_DECLUSTERING=false on the export side and
SUPPORT_DECLUSTERING=true on the import side. This looks like a small change, but it
deviates from the SAP default of declustering during the export.
Socket transfer does not compress the data stream. Usually, the R3load export dump files are
compressed. Although this is a CPU-intensive operation, the big advantage is that
significantly less data is transferred over the network. With the socket transfer, dump data is
not compressed, meaning that more data is shipped over the network. This might be an
issue for heterogeneous system copies from on-premises into the cloud. On the other hand,
the socket transfer decreases the CPU resource usage for R3load compression.
The Db2 specific R3load option “OPT_COMPRESS” cannot be used in conjunction with
socket transfers. You can only use the automatic dictionary creation by Db2 or the
“FULL_COMPRESS” option that reorganizes the table after it is loaded. We discuss the
various compression options in section 6.5.1, “Introduction to Db2 compression” on page 96.
Important: Table splitting is an advanced option, and you need to be aware that there are
three different methods available for importing split tables with a target Db2 database
(Forced LOAD, parallel insert or split LOAD). The decision on which flavor of the import is
used depends on the table structure, infrastructure and requires testing.
Also be aware, that the number of splits should be kept small – significantly lower than the
maximum of 200.
Consider 10-20 splits per table or up to 100 GB of data volume per table split as a good
place to start.
Table splitting uses the following files:
– The *.STR files, which are also used with the standard migration process.
– The *.WHR files, a dedicated file per split containing the WHERE clause for the table split.
Example 7-1 *.STR and *.WHR Files for table COVREF that is divided into 10 splits.
bash-5.1$ ls -ltr *COVR*
rwxrwxrwx 1 srcadm sapsys 1172 Mar 16 2022 COVREF.STR
rwxrwxrwx 1 srcadm sapsys 298 Jun 16 2022 COVREF-9.WHR
rwxrwxrwx 1 srcadm sapsys 310 Jun 16 2022 COVREF-8.WHR
rwxrwxrwx 1 srcadm sapsys 316 Jun 16 2022 COVREF-7.WHR
rwxrwxrwx 1 srcadm sapsys 322 Jun 16 2022 COVREF-6.WHR
rwxrwxrwx 1 srcadm sapsys 322 Jun 16 2022 COVREF-5.WHR
rwxrwxrwx 1 srcadm sapsys 322 Jun 16 2022 COVREF-4.WHR
rwxrwxrwx 1 srcadm sapsys 322 Jun 16 2022 COVREF-3.WHR
rwxrwxrwx 1 srcadm sapsys 320 Jun 16 2022 COVREF-2.WHR
rwxrwxrwx 1 srcadm sapsys 147 Jun 16 2022 COVREF-10.WHR
rwxrwxrwx 1 srcadm sapsys 165 Jun 16 2022 COVREF-1.WHR
As mentioned, the *.WHR files contain the WHERE clause of the SQL statement that is used
by R3load to export this chunk of the table. Comparing the two SQL statement parts, you will
find different values for the columns RELID, PROGNAME, and SRTF2 in Example 7-2 for the
table COVREF.
Example 7-2 Content of two different WHR-Files for the table COVREF
tab: COVREF
WHERE ('GK' < "RELID" OR ("RELID" = 'GK' AND '%ARBERP_H_ONSTMSA' < "PROGNAME")
OR ("RELID" = 'GK' AND "PROGNAME" = '%ARBERP_H_ONSTMSA' AND 0 < "SRTF2")) AND
("RELID" < 'IC' OR ("RELID" = 'IC' AND "PROGNAME" < '%ARBFND_H_MSGCSTA') OR
("RELID" = 'IC' AND "PROGNAME" = '%ARBFND_H_MSGCSTA' AND "SRTF2" <= 0))
tab: COVREF
WHERE ('FE' < "RELID" OR ("RELID" = 'FE' AND '%ARBERP_H_CXMLMSG' < "PROGNAME")
OR ("RELID" = 'FE' AND "PROGNAME" = '%ARBERP_H_CXMLMSG' AND 0 < "SRTF2")) AND
("RELID" < 'GK' OR ("RELID" = 'GK' AND "PROGNAME" < '%ARBERP_H_ONSTMSA') OR
("RELID" = 'GK' AND "PROGNAME" = '%ARBERP_H_ONSTMSA' AND "SRTF2" <= 0))
There are many options available to create and optimize the table split migration and
therefore we will focus on Db2 specific information only. For details about the table split,
please refer to the SAP heterogeneous system copy guide.
The ORDER BY clause is appended if the DDLDB6.TPL file contains the ORDER_BY keyword or if the
table needs to be exported sorted. For example, if a table cluster is exported and a code-page
conversion is required, a sorted export is performed. The ORDERCOLS are the
columns defined by the primary key of the table.
If table split is used, the statement defined in the WHR-File is merged into the general
statement. With this, the actual statement looks like what is shown in Example 7-3.
Example 7-3 Simplified SQL Statement for sorted export with table split
SELECT * FROM SAPSRC.COVREF WHERE ('FE' < "RELID" OR ("RELID" = 'FE' AND
'%ARBERP_H_CXMLMSG' < "PROGNAME") OR ("RELID" = 'FE' AND "PROGNAME" =
'%ARBERP_H_CXMLMSG' AND 0 < "SRTF2")) AND ("RELID" < 'GK' OR ("RELID" = 'GK'
AND "PROGNAME" < '%ARBERP_H_ONSTMSA') OR ("RELID" = 'GK' AND "PROGNAME" =
'%ARBERP_H_ONSTMSA' AND "SRTF2" <= 0)) ORDER BY RELID, PROGNAME, SRTF2
Example 7-3 reveals the additional optimization options that are related to SQL tuning.
R3load scheduling
Before we dig into the details, let’s first discuss what is possibly the most powerful
optimization: accurate scheduling and the number of R3load processes.
Table splitting typically uses the order-by.txt file to define a dedicated schedule for large
tables. As you make a decision on the import strategy, consider that the number of R3load
processes on the source system must be defined in coordination with the import process.
If you choose to use a sequential import using the db2load API, it might not be beneficial to
have many concurrent R3load exports. You might consider reducing them to a sequential
single R3load export, or a few exporting R3loads if a single export is slower than a single
import.
If you decide to import using the temporary table option, it is also beneficial if the simultaneously
exporting R3load processes are slightly faster than the importing R3load processes.
Note: You can define the number of concurrent R3load jobs in the Migration Monitor
properties file and the order-by.txt file. If you do so, the total number is the sum of the jobs
in both files.
FAGLFLEXA-4
FAGLFLEXA-5
FAGLFLEXA
[COVREF]
loadArgs=-dbcodepage 4103
jobNum=2
COVREF-1
COVREF-2
COVREF-3
COVREF-4
COVREF
There are more options available to distinguish between large and small packages, and you
can find details in the SAP Heterogeneous System Copy guide.
The take-away from this section is: be aware of the number of parallel R3load
processes, define them explicitly, and incorporate both the export and the import side in the decision.
As always, do not over-utilize any resource.
Figure 7-7 shows an export where all the available CPU resources are consumed solely by
the split export of the table CDCLS. As the cost of optimizing the table CDCLS, the machine
had no free resources for the remaining export. The picture also contains the workload
separation between R3load processing and database processing. It is obvious that Db2 is not
very busy.
Reorganize tables
When you experience a slow export on a split table, check the cluster factor of the index using
the DBA Cockpit in your source system.
If tables are being exported unsorted, a low cluster factor can significantly reduce the
performance of the data unload.
In this case, you could reorganize the table using the index before you start the export. Be
aware that reorganizing the table will increase the workload on the system.
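A minimal sketch for checking the cluster factor and reorganizing the table along the export index is shown below; the schema SAPSRC and the index name GLPCA~0 are examples and must be replaced with your actual schema and index name.
# Check the clustering statistics of the indexes on the table
db2 "SELECT SUBSTR(INDNAME,1,20) AS INDNAME, CLUSTERRATIO, CLUSTERFACTOR
     FROM SYSCAT.INDEXES
     WHERE TABSCHEMA = 'SAPSRC' AND TABNAME = 'GLPCA'"
# CLUSTERRATIO is -1 if detailed statistics were collected; use CLUSTERFACTOR in that case
# Reorganize the table according to the index that is used for the export
db2 "REORG TABLE SAPSRC.GLPCA INDEX SAPSRC.\"GLPCA~0\""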
Reorganizing GLPCA according to the index used for the export resulted in a significant
runtime reduction. Export performance improved by up to a factor of 6.7, as shown in
Figure 7-8.
Another important observation from Figure 7-8 is that the export scales almost linearly when
increasing the number of parallel jobs from 1 to 2 and then to 4 parallel jobs. When increasing
from 4 to 8 parallel jobs, the performance gain was only factor 1.3. With 8 parallel jobs the
CPU utilization was 30% and I/O wait time was up to 50%. We therefore conclude that in the
test case with 8 parallel R3load jobs the system performance was bound by the limited I/O
capacity of the storage.
Figure 7-9 CDCLS with performance slowdown in the first and last Split.
First, check whether the number of rows exported is equally distributed across the splits. If, as
in this case, the first and last split are slower, a non-optimal access plan may be the reason.
If this degrades the overall runtime, it needs to be fixed.
As there are different reasons for a non-optimal access plan and different solutions, the best
approach is to open an SAP support incident and optimize this together with SAP support.
One workaround that may help in this situation is to use the Db2 optimizer level “0” for these
statements. The optimizer level cannot be set as an R3load option, but you can use the
procedure described in SAP Note 1818503 – DB6: SAP Optimizer Profiles.
To enable an SAP optimizer profile, you must insert the guideline for optimizer level 0 and the
conditions when the guideline should be applied. These are set in the table
DB6_OPTPROFILE. This can be done using SAP transaction SE16 or manually as described
here.
The statement in Example 7-5 will add an SAP optimizer guideline into the table
DB6_OPTPROFILE that is valid for the table FAGLFLEXA and a SQL statement that performs
a SELECT on the table.
The guideline will be applied to all SQL statements that select data from the table
FAGLFLEXA. So, in this case, all table splits will pick up this guideline. This should normally
not be an issue, but if you want to apply the guideline only for a specific R3load export, you
must specify the SQL statement from that particular *.WHR file.
To verify the guideline, you can analyze the SQL query with dmctop or any other explain tool
and you should validate the optimization level in the output as shown in Example 7-6.
Furthermore, the SQL statement that is now running contains the information about the
optimizer guideline.
If the workaround works as desired, you will see an index access with appropriate start and
stop predicates, and the export will execute significantly faster. Still, it is advisable to open an
SAP incident to validate the workaround. For details about dmctop and db2expln, refer to
Chapter 10, “Tips for monitoring” on page 175.
Forced LOAD
The sequential import is used if you start the R3load with the following options:
This will force only one active R3load process per table, which can be used to achieve an
overlapping of export and import.
For the table CDCLS, such an order-by.txt file could look like Example 7-7.
CDCLS-3
CDCLS-4
CDCLS-5
CDCLS-6
CDCLS-7
CDCLS-8
CDCLS-9
In addition, ensure you have set the Db2 database configuration parameter "locktimeout" to
"-1" as a parallel R3load process may be scheduled for some reason. If you do not set the
lock timeout to -1, the scheduled but locked R3load will fail when reaching the timeout and
needs to be restarted.
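A minimal sketch of the required configuration change is shown below; replace <DBSID> with your database name and remember to reset LOCKTIMEOUT to its original value after the migration.
db2 "UPDATE DB CFG FOR <DBSID> USING LOCKTIMEOUT -1"
db2 "GET DB CFG FOR <DBSID>" | grep -i locktimeout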
Check the output of “db6util –sl” for lock wait situations as shown in the Example 7-8.
If you use the “FORCED_LOAD” option and you want to deviate from the default of creating
the indexes after the import is completed, you should use the environment variable
DB6LOAD_INDEXING_MODE=2 (Incremental) as this improves the index build significantly.
The incremental indexing mode can reduce the total runtime by up to 50%, depending on the
size and number of indexes defined for a table.
Note: Ensure that this variable is only set in the environment for the Migration Monitor
instance that uses the forced LOAD. Setting this variable for other R3load processes that
create the index before the data load will slow down the performance.
Parallel INSERT
The parallel import with INSERT is used if you start the R3load with the following options:
This will use SQL INSERT instead of the db2load API. It allows for parallel imports but
switches to a slower single-thread performance. You also need to take into consideration that
the R3load process now writes data to the Db2 transaction log files in addition to importing
the data, so logging performance might become a bottleneck. Most likely, the required
IOPS during a highly parallel INSERT by R3load are higher than what you see in production on the
source system or the target system. Monitor the Db2 logging using the
MON_GET_TRANSACTION_LOG table function.
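A minimal monitoring sketch using this table function is shown below; it reports the cumulative log write activity and the current log space consumption, so run it repeatedly and watch the deltas and the remaining log space.
db2 "SELECT LOG_WRITES, LOG_WRITE_TIME,
            TOTAL_LOG_USED / 1048576      AS LOG_USED_MB,
            TOTAL_LOG_AVAILABLE / 1048576 AS LOG_AVAIL_MB
     FROM TABLE(MON_GET_TRANSACTION_LOG(-2))"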
Note: Do not use the -nolog parameter of R3load when using parallel import with INSERT
as this may result in lock wait situations.
In “Performance comparison – Db2 LOAD versus INSERT” on page 108, we have shown that
for cluster tables, INDX-like tables or tables with large LOB columns, the performance
advantage of Db2 LOAD over INSERT is sometimes only a factor of 2-5. So, these tables are
good candidates for table splitting with multiple parallel INSERT operations.
You can combine the parallel INSERT only with the automatic dictionary creation or
COMPRESS_ALL. If specified, OPT_COMPRESS_ALL and FULL_COMPRESS_ALL
options will be ignored.
Split LOAD
The parallel import with Db2 LOAD and temporary tables is used if you start the R3load with
the following options:
This option combines the advantages of Db2 LOAD and INSERT to some extent. It
parallelizes the load processing without introducing a significant amount of additional logging
on the target side. It however doubles the write workload and increases the read workload as
data is loaded twice and read once within the database.
This concept makes perfect sense for tables that benefit from the db2load API. For the table
SWWLOGHIST, for example, we have measured that Db2 LOAD is faster by a factor of 15-20
compared to INSERT processing.
Let’s do a quick calculation: Assume a table of 500 GB needs to be imported, a single
LOAD-based R3load imports 100 GB per hour, and LOAD is 20 times faster than an R3load
using INSERT. If you import with five parallel R3load processes into 5 temporary tables, this
takes 1 hour. If the internal LOAD FROM CURSOR is 2 times faster than the R3load, it takes an
additional 2.5 hours for this final copy. In total, it takes 3.5 hours.
With five parallel R3load processes with INSERT, it would take 20 hours, as each of those achieves
only 5 GB per hour. To be as fast as the process with temporary tables, 29 R3load processes with
INSERT would be required.
Although this is a mathematical exercise, not taking all boundary conditions into account, it
explains why the SPLITTED_LOAD approach could make sense.
The split LOAD option creates temporary tables in the first phase and populates them. By
default, these temporary tables reside in the same tablespace as the permanent table. The
tables are deleted after the import completes, and the space is given back to the
tablespace and can be reused by other objects.
However, as the temporary tables are large and you will have multiple of them, the database
may allocate more space than required for a significant amount of time. If this occurs, you can
give the empty pages back to the file system by lowering the tablespace high watermark. This
can be done online in the production target system, but it adds additional I/O workload to the
database server.
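A minimal sketch for checking and lowering the high watermark of an automatic storage tablespace is shown below; the tablespace name is a placeholder.
# Check allocation and high watermark of the data tablespace
db2 "SELECT TBSP_USED_PAGES, TBSP_PAGE_TOP, TBSP_TOTAL_PAGES
     FROM TABLE(MON_GET_TABLESPACE('<data_tablespace>',-2))"
# Lower the high watermark and return unused extents to the file system
db2 "ALTER TABLESPACE <data_tablespace> REDUCE MAX"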
To overcome this effect, consider creating a dedicated tablespace for the temporary tables. To
do so you have to create a tablespace manually, grant use to the SAP APP Role on this
tablespace, and set the environment variable “DB6LOAD_TEMP_TBSPACE” to reference this
tablespace. The temporary tables are then created in the newly created tablespace and once
the heterogeneous system copy is completed, the tablespace can be dropped. This is shown
in Example 7-9.
Example 7-9 Enabling a dedicated tablespace for temporary split load tables
db2 “CREATE TABLESPACE SPLITTEMP MANAGED BY AUTOMATIC STORAGE USING STOGROUP
IBMSTOGROUP”
db2 “GRANT USE OF TABLESPACE SPLITTEMP TO ROLE SAPAPP”
export DB6LOAD_TEMP_TBSPACE=SPLITTEMP
It is essential to closely monitor the Db2 LOAD operations in the Db2 diagnostic log file or
while the R3load is running. Monitor the LOAD operations with respect to any decrease of
CPU Parallelism and ensure that the importing machine is not overloaded.
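A minimal sketch for observing the running LOAD operations and their parallelism from the command line is shown below; both commands are standard Db2 tools.
db2 LIST UTILITIES SHOW DETAIL
db2pd -utilities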
You can combine the split load with the different Db2-specific compression options. The
options COMPRESS_ALL and FULL_COMPRESS_ALL work transparently with the split
load. The OPT_COMPRESS_ALL option is also supported but works differently. The
compression dictionary is not created when the data is read from the R3load dump files.
Instead, the sampling happens when the data is copied from the temporary tables to the final
table.
Note: The R3load option for deferred table creation (DEF_CRT) should not be used with
split load. This option does not improve performance, as the large tables will always be
materialized, and in some cases it will cause aborts when using Db2 11.5.8 or older.
When an R3load process of one of the splits fails, data from other, already imported splits can be
retained and only the failing split needs to be restarted. To make this work, the table is not
truncated; instead, the respective data is deleted. The SQL DELETE statement is in principle
similar to the SQL SELECT statement in the WHR file.
Deleting a large portion of data is logged in the Db2 transaction log files. Thus, log-full
situations may occur when there is not enough space on the log device or when the number
of configured log files is completely consumed. See Example 7-10 on page 149.
Example 7-10 Example for an error message with log file full
DB21034E The command was processed as an SQL statement because it was not
a valid Command Line Processor command. During SQL processing it returned:
SQL0964C The transaction log for the database is full. SQLSTATE=57011
FUNCTION: DB2 UDB, data protection services, sqlpgResSpace, probe:2860
MESSAGE : ADM1823E
The active log is full and is held by application handle "<appl_hdl>".
Terminate this application by COMMIT, ROLLBACK or FORCE APPLICATION.
FUNCTION: DB2 UDB, data protection services, sqlpWriteLR, probe:6680
MESSAGE : ZRC=0x85100009=-2062548983=SQLP_NOSPACE
"Log File has reached its saturation point"
DIA8309C Log file was full.
To minimize the risk of such a situation, the DELETE processing during restart is performed in
a loop and frequently issues a COMMIT statement.
With SPLITTED_LOAD, indexes must be created after the load. The step of creating the
primary key index triggers the merge of the temporary tables to the final table. As no indexes
exist, with each loop, a table scan will be performed to identify the data to be deleted. In the
worst case, the last split fails and the table scans need to read a large table multiple times. As
an effect, the restart can take a long time. In fact, it may take several days if the table is large.
Runtime comparison
After all these details, you probably expect some decisive results and a clear recommendation
for the one and only perfect method to be used. We must disappoint you in this case. The best
approach depends on the various factors mentioned in this and other chapters.
The results shown in Figure 7-10 on page 150 show the comparison of three import options
on two representative tables.
This one simple test has shown that the best option for the table FAGLFLEXA is to use the
SPLITTED_LOAD, while for the table COVREF a single LOAD is the most beneficial setup.
With a parallelism of 15, it is likely that the parallel INSERT could become the best method. These
test results should not be considered as a recommendation for any specific approach. They
are intended to show you that there can be a significant impact on processing time depending
on which method is used.
In general, the SPLITTED_LOAD is a good starting option when used with close monitoring.
Based on the results seen from your monitoring, you may choose to switch to parallel INSERT
or FORCED_LOAD instead.
To minimize the monitoring and test phase, you can use the R3load option -discard
<NUMBER>. This option stops the export after <NUMBER> rows, so you do not need to
perform a full export. This export cannot be used for a final import, but it can be used to assess
the optimal method for the import, since only a subset of the data is exported or imported.
Figure 7-10 Import Runtime Comparison – 5 x Parallel Insert versus 5 x Parallel Load versus single
Load
7.3.4 Conclusions
Our general recommendation is that you familiarize yourself with the handling of table
splitting. This concept is a powerful option but requires practical knowledge of the tools and
Db2. Test this process and the different flavors on a sandbox system.
Most of these offerings can be accessed using the IBM Cloud Catalog.
Figure 8-1 shows the main use case for physical data shipping by example of Mass Data
Migration.
Figure 8-1 Use Case for Mass Data Migration Device – Source IBM website
We now present two use cases using IBM Aspera. The second use case is the actual setup
that we tested in our SAP migration test lab environment that was built for this book. This use
case describes a method to transfer data exported from an SAP on premises installation, into
a new SAP replacement system in the cloud. The data transfer process is part of every SAP
migration project. Optimizing this step can significantly reduce the overall migration downtime.
But first we have a look at how Aspera can be used to optimize connectivity to IBM Cloud
Object Storage.
IBM Aspera high-speed transfer provides advantages for large file transfers when compared
to FTP and HTTP approaches. This is especially true in networks with difficult conditions,
where the performance advantages can be significant. IBM Aspera high-speed
transfer uses the IBM FASP® protocol to transfer data. To get an idea of how big the
advantage can be, customers can use the IBM Aspera® file-transfer calculator.
Figure 8-4 shows that an IBM Aspera file transfer could be 45 times faster with the given
network characteristics for a 100 GB file conveyed by IBM Aspera compared to a traditional
TCP-based transfer.
In section 8.5, “IBM Aspera for SAP migrations example” on page 158, we provide some
more details and a description of a sample setup for the data transfer with IBM Aspera.
The speed-up of the transfer originates from a special network protocol that is separate from the
storage protocol. This higher performance can easily be 10 times faster than other ways of
transfer.
After having set up Direct Connect between your corporate data center and AWS, you can
directly transfer files to Amazon Elastic File System (EFS). For this, you can mount EFS file
systems directly on an on-premises server.
These devices allow an easy data transfer using the mentioned protocols. They support high
performance transfers by directly transferring the data into Azure. The devices have a local
cache, whose size depends on the size of the disks in Azure Stack Edge or the size of the
virtual disks in Azure Data Box Gateway. This speeds up copying files into the device. Note
that when you are transferring the data for migration purposes, the delay of transferring the
data into Azure should be minimized and the cache should be set up in a way that allows the
immediate transfer of the files.
To finally establish the connection to Azure, a virtual private network can be set up, which is
very flexible and can be established immediately. Alternatively, a connection using ExpressRoute
can be established. ExpressRoute connections go directly to the backbone of Microsoft Azure
and are provided by partners of Microsoft. The connection is dedicated, so the
speed is reliable and fast, with up to 100 Gbps and very low latency, which allows fast
uploads.
Both kinds of connections, VPN and ExpressRoute, are secured so that the security of your
data is maintained.
Gcloud
Gcloud was originally implemented to manage your Google Cloud resources and developer
workflows. However, one of the available options is a secure copy utility to transfer files to a
Google Cloud instance.
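As a sketch, a transfer of an export dump directory to a target VM with gcloud could look like the following; the instance name, zone, and paths are placeholders.
gcloud compute scp --recurse /export/ABAP/DATA sap-target-vm:/import/ABAP/DATA --zone=europe-west3-a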
Gsutil
Gsutil is a Python-based command-line tool that can be used to create and delete buckets,
list buckets and objects, move, copy, and rename objects, and more. One particular
option is to upload, download, or delete objects. These transfers can be secured by using HTTPS
with TLS. The tool is open source and, thus, is continuously enhanced.
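A sketch of a parallel upload of an export dump directory into a bucket could look like the following; the bucket name and path are placeholders, and the -m option enables parallel transfers.
gsutil -m cp -r /export/ABAP/DATA gs://sap-migration-dumps/ABAP/DATA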
Storage Transfer Service
Storage Transfer Service enables you to move or back up data to a cloud storage bucket either
from other cloud storage providers or from a local or cloud POSIX file system. This can be set up
periodically as part of a data processing pipeline or an analytical workflow.
It does not offer real-time support, meaning that it runs periodically with at least one hour in
between two periods. Thus, Storage Transfer Service does not support sub-hourly change
detection.
The transfer agents of the Storage Transfer Service require a Docker installation and they run
on Linux servers or virtual machines (VMs). To copy data on a CIFS or SMB file system, you
can mount the volume on a Linux server or VM and then run the agent from the Linux server
or VM.
It is possible to transfer up to 1 billion files that are hundreds of terabytes in size with several
10s of Gbps in transfer speed. If you have a larger data set than these limits, we recommend
that you split your data across multiple transfer jobs.
This hardware device is installed in your data center. For uploading the data, you can use
NFS on Linux and UNIX, or SCP and SSH on Windows. The appliance is then sent back to
Google, where the data is uploaded into the Google Cloud. The data is encrypted on the
appliance to meet high security requirements. Depending on the quality of your internet
connection, this can still be the fastest solution to upload the data.
Note: Aspera is available for all major cloud platforms. So this could become a unified tool
if you pursue a multicloud strategy.
We also tested a setup of two Aspera HSTS servers running the file transfer using watch
folders. The watch folder setup was tested with two options:
1. The SAP files were transferred as soon as they were fully written.
2. The transfer of the SAP files was started while they were still open and being filled with data.
Both approaches worked well and offered GUI support. In the test, we used the Aspera client
setup to limit the impact on the SAP system.
We started the shell script every 20 minutes as a cron job. To ensure that a file is transferred only
once, we use flock to set a write lock when the script starts and compile the list of files to be
transferred during the script runtime.
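A sketch of the corresponding crontab entry is shown below; the script and lock file names are placeholders, and flock -n ensures that a new run is skipped while the previous run still holds the lock.
*/20 * * * * /usr/bin/flock -n /home/aspwatch/migrate/transfer.lock /home/aspwatch/migrate/aspera_transfer.sh >> /home/aspwatch/migrate/transfer_cron.log 2>&1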
First, the script sends all of the *STR files from the source to the target server. Then it checks
if *SGN files are available and compiles the list of files to be transferred. Based on this list, the
Aspera client transfers all files and moves the transferred files to ../DATA/dump_archive. The
*SGN files are transferred in a second call of the Aspera client and are also moved to
../DATA_dump_archive. Finally, the script checks whether all files are transferred by
comparing the number of *STR and *SGN files, and transfers the export.statistics.properties
file if that condition is fulfilled.
Example 8-1 shows the script we used for the proof of concept. This example is not intended
to be a ready-to-use solution that works in every environment. You may use it as a base
and adapt the script to your environment and needs.
EXP_D=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/exchange
DUMP_D=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/DATA
SGN_F=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/exchange/signal_files
SGN2_F=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/exchange/signal_files_2
#Dump Files
EXP_DATA_F=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/exchange/export_files_data
#Signal Files
EXP_EXCH_F=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/exchange/export_files_exchange
#Location of archived files
SGN_F_A=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/dump_archive/signal_files
SGN2_F_A=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/dump_archive/signal_files_2
EXP_DATA_F_A=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/dump_archive/export_files_data
EXP_EXCH_F_A=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/dump_archive/export_files_exchange
#Aspera transfer command definitions
MIG_TAR_HOST=ASPERASERVER
MIG_TAR_DATA_LOC=/ASPERACLNT_watch_folder_receive_2/ASPERA_EXPORT/ABAP/DATA
MIG_TAR_EXCH_LOC=/ASPERACLNT_watch_folder_receive_2/ASPERA_EXPORT/ABAP/exchange
MOVE_TRANSFERED_FILES=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/dump_archive/
ASCP_LOG_DIR=/home/aspwatch/migrate/ASPERACLNT_source/ASPERA_EXPORT/ABAP/dump_archive/aspera_logs
# Create Counter for *STR Files in ../DATA
ls -la $EXP_D/STR_COUNTER.txt &> /dev/null
RC=$?
if [ $RC != 0 ]
then
ls -la $DUMP_D/*.STR |wc -l > $EXP_D/STR_COUNTER.txt
fi
#Build Aspera Transfer File: EXP_F for the --file-list option
cd $EXP_D
ls *SGN > /dev/null 2>> $EXP_D/err_no_signal_files.out
#Exit Condition if no new SGN Files
RC=$?
if [ $RC != 0 ]
then
date >> $EXP_D/err_no_signal_files.out
exit
else
ls *SGN > $SGN_F
sed -r 's/.{4}$//' $SGN_F > $SGN2_F
while read file
do
ls "$DUMP_D/$file".* >> $EXP_DATA_F
echo "$EXP_D/$file".SGN >> $EXP_EXCH_F
exit 0
While range-partitioned tables and insert time clustering tables can be used for OLTP
workloads, the other table types are designed to support OLAP workloads. For detailed
use cases, refer to SAP Note 1555903 - DB6: Supported IBM Db2 Database Features.
Common to all these table types is that they cannot be created by R3load natively.
R3load requires the existence of the so-called SQL files that are created by the SAP report
SMIGR_CREATE_DDL.
For further details, please check the SAP Community page for blogs about
SMIGR_CREATE_DDL and SMIGR_CHECK_DB6.
9.1.1 ITC
Insert time clustering tables provide an effective way of maintaining data clustering and easier
management of space utilization. In ITC tables, data is clustered by using a virtual column
which clusters together rows that are inserted at a similar time.
A heterogeneous system copy gives you the opportunity to change the table type. For ITC
tables, the process is supported by the SAP report DB6CONV, and the output of this report
can be used for R3load. In the report, choose to execute a conversion on the database level
and select “Convert Insert Time Clustering (ITC) Candidates to ITC Tables”. Create the
conversion queue but do not execute the conversion.
You can export the list of identified tables and use it for further processing with R3load:
save the tables identified by DB6CONV to a text file with the name “db6_itc_tablist.txt” and
exit the report DB6CONV without executing the queue. R3load provides the option “ITC_LIST”
to process this list and will append the syntax “ORGANIZE BY INSERT TIME” to these tables
automatically.
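For illustration, a minimal DDL sketch of such a generated statement is shown below; the table, column, and tablespace names are hypothetical.
db2 "CREATE TABLE ZITCDEMO (
       MANDT   VARCHAR(3)  NOT NULL,
       DOCNUM  VARCHAR(10) NOT NULL,
       PAYLOAD BLOB(1M) )
     IN ZSRC#BTABD INDEX IN ZSRC#BTABI
     ORGANIZE BY INSERT TIME"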
If your source system already contains ITC tables, the report SMIGR_CREATE_DDL will
generate the required SQL files and ITC tables are created using the SQL files. To do so,
specify the “Keep Source Settings” option for the target database.
Regarding optimization and performance – the findings in this book remain true but the
appropriate configuration of Utility Heap and the Load Buffer becomes more important.
9.1.2 MDC
Multidimensional clustering (MDC) provides an elegant method for clustering data in tables
along multiple dimensions in a flexible, continuous, and automatic way. This table type can be
used with SAP BW only.
The SAP report SMIGR_CREATE_DDL allows you to either retain MDC tables that already
exist in the source system (“Keep Source Settings” option) or convert suitable tables to MDC
tables (“Use Multi-Dimensional Clustering (MDC) for BW” option). In both cases, SQL files will
be created and processed by R3load.
Like ITC tables – the findings for performance and compression in this book remain true but
the appropriate configuration of Utility Heap and the Load Buffer becomes more important.
This form of partitioning is supported by SAP for a limited set of tables. To identify table
candidates and to define partitions, use the SAP Partitioning Administrator. For details,
please refer to SAP Note 1686102 - DB6: DB6 Partitioning Administrator.
For those tables, each partition can use its own tablespace for indexes and data. Those can
be mutually exclusive, or a subset of the partitions can share the same tablespaces. It is also
possible that certain or all partitions can use a global tablespace. Knowing this, the syntax for
the CREATE TABLE statement can have many variations. The report SMIGR_CHECK_DB6
checks for inconsistent or wrong assignment of the table and its partitions. The report
SMIGR_CREATE_DDL creates SQL files, containing the table DDL and a SQL file for the
tablespaces if required.
When SMIGR_CREATE_DDL creates the SQL files, the tablespace definitions for the
partitions are transferred 1:1 from the source system to the target system. This means that no
standard tablespace names that include the SID of the SAP system should be used. Instead,
it is recommended to use tablespace names derived from the table name of the partitioned
table with a numeric postfix.
Note: Although many different cases are incorporated in the reports, certain situations may
not be handled completely according to the planned layout. Therefore, pay special
attention to the DDL Statements for range-partitioned tables that are created by
SMIGR_CREATE_DDL.
Regarding optimization and performance for export or import of partitioned tables, the
findings in this book remain true. From an application point of view, range-partitioned tables
remain transparent and so no additional facts need to be considered.
Note: Both optimizations require Db2 knowledge and should be validated using a
four-eyes principle. Keep in mind that a mistake in the manual partition definition can lead
to a loss of data.
For details, please refer to the SAP Guide: SAP Business Warehouse on IBM Db2 for Linux,
UNIX, and Windows 10.5 and Higher: Administration Tasks
Before we show some specific information about column-organized tables, here is our
most important advice: the target release for Db2 should be Db2 11.5.8 or higher. Massive
improvements have been introduced in the db2load API, and thus, just by using this Db2
version, the import is significantly improved. Figure 9-1 on page 169 shows an improvement of
over 40% for a selected table. In addition, the improvement comes with reduced CPU
consumption – so a perfect match for heterogeneous system copies.
Column-organized tables have some special characteristics that affect the migration. One of the
major differences is the data compression process and how it is enabled. The compression is
effective and automated to a large extent. In fact, column-organized tables are always
compressed and there is no choice to disable this. The compression is based on a compression
dictionary and is fully automated. Therefore, the various Db2 compression options of R3load do
not have any effect on the import of column-organized tables. That means the R3load options
COMPRESS, COMPRESS_ALL, OPT_COMPRESS, and FULL_COMPRESS are ignored by R3load when
importing column-organized tables.
Compression for column-organized tables is enabled by a concept similar to default row
compression – the compression dictionary is built automatically and there is no way to
influence the procedure through R3load.
With row-organized tables, the compression dictionary is built when the table is populated
with 2 MB of data and while the dictionary is built, no data is written to the tables. The process
only takes a few seconds and so this is nothing to worry about. For column-organized table,
the creation of the compression dictionary depends on how the table is populated.
With INSERT (R3load -loadprocedure fast), the compression dictionary is built after 500,000
records are inserted, independently of the row size. For a table with an average row size of 1
KB, about 500 MB of data is sampled.
If the table is populated with the db2load API (R3load -loadprocedure fast LOAD), 128 GB of
data is sampled. If the table contains less data, the dictionary is smaller, of course. Building the
compression dictionary may take some time, and therefore the compression dictionary is
created asynchronously while the data load continues. Data that is not compressed in the first
step is automatically compressed asynchronously. The LOAD option is the preferred option as
it is faster by factors for the typical BW tables.
The dictionary creation requires memory from the utility heap, and therefore a large utility heap
is required for loading column-organized tables. You need to configure at least 1,000,000 pages
per table that is loaded in parallel. We have seen up to 25% improvement if you allow more
utility heap than for row-organized tables.
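A minimal sketch of the corresponding configuration change is shown below; <DBSID> is a placeholder, and the value of 4,000,000 pages is only an example that assumes four concurrent LOAD operations at the documented minimum of 1,000,000 pages each.
db2 "UPDATE DB CFG FOR <DBSID> USING UTIL_HEAP_SZ 4000000"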
The process of dictionary creation also writes a file to disk that can become 128 GB in size.
Therefore, you need to ensure enough disk space and throughput in the respective directory.
For each concurrently loaded table, up to 128 GB of space is required.
In this directory, a subdirectory structure is created with one directory for each running LOAD
process. The files are not deleted after the dictionary is built; they remain in the directory until
the table is completely loaded and are removed when the LOAD is completed. Example 9-2 is
an example of the contents of one of these subdirectories.
With column-organized tables, the LOAD utility builds the dictionary in the ANALYZE phase of the
LOAD process. Therefore, you may see the LOAD remaining in this phase for several minutes, as
shown in Example 9-3.
Phase Number = 4
Description = BUILD
Total Work = 2 indexes
Completed Work = 0 indexes
Start Time = Not Started
The Db2 columnar engine is designed for effective parallel processing, and therefore the CPU
resources are used efficiently when loading column-organized tables. This, however, has the
effect that the CPU load is significantly higher when loading column-organized tables.
Figure 9-2 shows the CPU usage when just one table is loaded: it already consumes 30% of the
CPU resources. So, plan for a single-digit number of parallel LOADs for column-organized
tables.
In a nutshell:
– Use Db2 11.5.8 or higher.
– All compression options of R3load are ignored.
– You need to configure at least 1,000,000 pages for the Db2 utility heap per concurrent LOAD
– the more, the better.
– Use the R3load option “-loadprocedure fast LOAD” as this is significantly faster compared
to INSERT processing.
– Plan for a low number of concurrent LOADs.
Db2 allows data to be distributed across multiple database partitions. This concept is often
referred to as the Database Partitioning Feature (DPF) and provides a shared-nothing
architecture for OLAP performance.
Tables in this environment are created to distribute the data evenly across all partitions
automatically. To do so, the tables are created with a “DISTRIBUTE BY HASH” or a
“PARTITIONING … USING HASHING” clause.
The DDL shown in Example 9-4 is generated for a standard E fact table of an InfoCube.
This DDL Statement is also created if the system does not have multiple partitions. It is
created for all tables that can be distributed across partitions. If you later add one or more
partitions, no change in the table definition is required. So, this partitioning type can be
ignored during a heterogeneous system copy to Db2.
While SAP BW systems are, in principle, also exported and imported using the same R3load
method as non-BW systems, you must perform special preparation (source system) and
post-processing (target system) steps because SAP has implemented database-specific
features for BW-based systems.
On the source system, you must execute the report SAP_DROP_TMPTABLES, that deletes
temporary BW application tables, which would be copied unnecessarily. Also on the source
system, you must execute the report SMIGR_CREATE_DDL to retain special table
definitions.
On the target system, after the database import, you must execute the report
RS_BW_POST_MIGRATION, which performs a number of SAP BW-specific post-migration
tasks and comes with two variants:
– SAP&POSTMGR is used when the DB platform has not changed, for example, when
performing a pure Unicode conversion.
– SAP&POSTMGRDB is used when the DB platform has changed. It includes several
additional repair operations that are necessary because of the SAP BW implementation
differences on the different database platforms.
With respect to the heterogeneous system copy, the hints and tips from this book apply, as
the target is a single-node system. However, there are some restrictions regarding the
supported table types. The following SAP-supported table types are not supported in a
Db2 pureScale environment:
– Multidimensional clustering (MDC) tables
– Insert time clustering (ITC) tables
Therefore, if you plan a subsequent conversion to Db2 pureScale, do not enable these table
types and convert tables of this type that currently exist on the source database to regular
tables prior to database export. For more details, refer to the SAP Guide: Running an SAP
System on IBM Db2 11.5 with the Db2 pureScale Feature.
10
You may also use other tools like the IBM Data Management Console, the SAP DBA Cockpit
or any other 3rd party tools.
The same is true for OS monitoring. We found the tool nmon useful for both ad-hoc
monitoring and historical monitoring. There are many other OS tools available, like iostat,
vmstat, top (on Linux), topas (on AIX), or perfmon (on Windows).
Last but not least, the cloud service providers also deliver tools to monitor the system
performance. So, you can use IBM Cloud Monitoring, Azure VM insights, AWS Cloudwatch,
Google Cloud Monitoring or 3rd party apps as well.
The key takeaway should be to use those tools to detect bottlenecks in Db2 or the system
environment and act accordingly.
For example, if you want to get the number of sort operations per interval, you get the total
number of sorts from a Db2 monitoring table function. You then must calculate the rate based
on the difference between two consecutive total sort values, divided by the collection interval.
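A minimal sketch of such a calculation for the sort counters is shown below; the database name is a placeholder, and the script assumes that the cumulative counters only grow between the two samples.
#!/bin/sh
# Minimal sketch: sample the cumulative sort counter twice and print the rate
DBSID=MIG        # placeholder: your database name
INTERVAL=60
db2 connect to $DBSID > /dev/null
S1=$(db2 -x "SELECT SUM(TOTAL_SORTS) FROM TABLE(MON_GET_WORKLOAD(NULL,-2))")
sleep $INTERVAL
S2=$(db2 -x "SELECT SUM(TOTAL_SORTS) FROM TABLE(MON_GET_WORKLOAD(NULL,-2))")
echo "Sorts per second: $(( (S2 - S1) / INTERVAL ))"
db2 terminate > /dev/null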
Db2 also includes tools that automate this processing. Examples are dmctop, dsmtop, db2top,
and the MONREPORT module. These tools prepare the data but do not provide all the
details compared to the native interfaces.
A different set of tools can provide many details, calculate metrics, and, in addition, persist data
to disk. Examples of these tools are the SAP DBA Cockpit, the IBM Data Management
Console, and many other commercial tools.
The utility db2top is based on the Db2 snapshot infrastructure that is no longer extended. So
db2top does not benefit from new performance metrics that are added to Db2. The tool is also
not available for Windows operating systems.
The tool dsmtop is a Java utility that uses monitoring table functions instead of Db2
snapshots. As it is written in Java, it is also available for Windows operating systems – with
some limitations.
An example for monitoring with db2top or dsmtop is to monitor the utilities running in the
database.
If you select the interactive command “u” of db2top or “Alt-u” in dsmtop, the utilities screen is
displayed. If you use the db2load API with R3load, the underlying Db2 LOAD processes are
displayed as shown in Example 10-1.
10.1.2 dmctop
The dmctop utility is a simple text-based tool for monitoring, like dsmtop. The dmctop utility
can monitor Db2 version 11.1.0 and later fix packs. Beginning with Db2 v11.5.6, the utility is
bundled with Db2.
The dmctop utility is a lightweight, low-overhead tool that works in a text-only environment,
such as a simple Unix command line. Monitoring is accomplished by using mon_get_xxx table
functions. Views can be set to fast refresh rates to provide a real-time view of activity in the
monitored database.
You can use dmctop to see key performance indicators in the same way that the db2top
command worked. dmctop is the most recent monitoring utility and should be used
instead of dsmtop or db2top. An example is shown in Figure 10-1.
The dmctop utility allows you to collect data in background mode. Unfortunately, the utility
collects only one set of data when it is called in background mode. Furthermore, you cannot
combine different areas to monitor. This means, you cannot collect data for the complete
database and all connections in one command. To collect the set of data in certain intervals
and multiple areas, you can use a simple script which is shown in Example 10-2.
This script will generate two comma separated files in the /tmp directory. One file for the
database overview and another one for details about the connections. If the files exist, the
data is appended to the files. If the files do not exist, new files are created. The newly created
files also contain one header line. This header line exists only once for each file.
You can use the generated files and import the data in a spreadsheet application and analyze
the data.
You can also use the SQL shown in Example 10-6 to monitor the reorganization.
If you used the “COMPRESS” option, you can use the following statement to get information
about how many rows were compressed as shown in Example 10-7.
MIGHOST:db2zwn 58> db2 "SELECT SUBSTR(APPL_NAME,1,10) AS APPL_NAME, APPL_STATUS FROM SYSIBMADM.SNAPAPPL_INFO where AGENT_ID=22"
APPL_NAME APPL_STATUS
---------- ----------------------
R3load UOWWAIT
1 record(s) selected.
10.1.7 Sorts
If you use R3load, sorts can occur, for example, during the export or during the index creation
phase of an import process. The command in Example 10-9 on page 181 provides
information about active sorts, total sorts, and sort overflows and therefore helps to tune the Db2
parameters related to sorting.
If multiple loads are running in parallel, you should monitor the utility heap to avoid heap-full
situations. You can do so by using the Db2 memory tracker tool (db2mtrk), db2top, or
dsmtop.
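A quick sketch with the memory tracker is shown below; in verbose mode, the utility heap appears as a separate line per database.
db2mtrk -d -v | grep -i "utility"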
With active Db2 LOAD processes, memory for “load data buffers” is allocated from the utility
heap. This is shown in Example 10-10.
Alternatively, you can use dmctop to extract SQL query information and to generate explain
information. To do so, start dmctop, press “s” to display the in-flight statements. Look for the
Application Name “R3load” and scroll down to the line by using the arrow key. Press Enter
followed by “L” and you will see the SQL statement that is executed. You now have the option
to export the statement to a file or to run the db2exfmt utility for this statement and save the
explain output. The explain output is automatically saved in the home directory of the user
that calls the dmctop utility.
A LOAD process consists of multiple phases; the current phase is marked in the output. In
Example 10-11 we see that for table BKPF, Phase #2 (LOAD) is the current one and that
3,305,066 rows (out of 17,520,379) have been loaded.
In exceptional situations, you can unset the registry variable DB2_HISTORY_FILTER to gather
this information. This can be useful for an analysis after a test migration because the history
contains information about the CPU parallelism, disk parallelism, buffer size, and indexing
mode, all in one place.
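A minimal sketch, assuming that the registry variable is currently set and that a short instance restart is acceptable; the SYSIBMADM.DB_HISTORY column selection is an assumption and should be verified against your Db2 release:

db2set DB2_HISTORY_FILTER=
db2stop
db2start
db2 "SELECT START_TIME, SUBSTR(TABSCHEMA,1,10) AS TABSCHEMA,
            SUBSTR(TABNAME,1,18) AS TABNAME
     FROM SYSIBMADM.DB_HISTORY
     WHERE OPERATION = 'L'
     ORDER BY START_TIME"

Entries with operation type 'L' represent LOAD operations recorded in the database history.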
For example, if the GENERAL times are high and the DATABASE times are low during an export,
a move to a dedicated remote application server can be beneficial, whereas it would not be
beneficial if the DATABASE time dominates the process.
Besides the individual components, you can assess whether there is a resource bottleneck in the
system. For example, if the DATABASE “real” time is significantly higher than the time for
“usr”, the database cannot process efficiently and is waiting on resources. This potentially
points to a resource bottleneck in the system. Example 10-13 shows an example file.
Note: The time and resources required to compress the R3load dump files are reported as
part of the FILE resource usage. Therefore, a high percentage of FILE processing time
does not necessarily point to an I/O issue for the export dump files.
The methods in this section also require the typical pre-migration and post-migration steps,
such as:
– Suspending batch jobs
– Locking users
– Disabling interfaces
– Setting up RFC communications
As this process is already well documented elsewhere, we do not provide the details here.
Refer to the chapter “IBM Db2 for Linux, UNIX, and Windows-Specific Procedures” in the SAP
system copy guide for details.
Generally, the use of HADR involves a cluster manager to provide automated failover. For this
use case, there is no need to install a cluster manager because the failover is performed
manually as a final migration step.
Even if you already have an active Db2 HADR cluster with a cluster manager for automated
failover, this method can be used by introducing a third, auxiliary standby server without
automated failover.
The concept is to set up a Db2 HADR standby node in the cloud or remote data center and to
perform the HADR takeover during the scheduled SAP migration weekend instead of running
time-consuming export, transfer, and import activities. This is shown in Figure 11-1.
The target operating system must be the same version and patch level as the source system.
Because rolling OS updates are allowed, a mismatch is acceptable for a short period of time,
but it is not recommended. The Db2 software version must be the same on the source and
target systems.
The details of the OS, Db2, network, and storage requirements can be found in the IBM
documentation for Db2.
Taking this into account, there are some limitations to using HADR as a system copy method
compared to the common SAP SWPM R3load approach. A primary limitation is that you
cannot change the database endianness or perform a Unicode conversion.
11.2.2 Procedure
Use the following steps to implement this procedure:
1. First, the SAP target environment needs to be installed as a plain SAP NetWeaver
System, matching your source system version including the Db2 database release. SAP
application servers can be prepared with DEFAULT profiles and INSTANCE profiles
matching the source environment.
2. This is followed by a Db2 backup on-premises.
3. Then transfer the backup image and Db2 transaction log files to the target.
4. On the target environment perform a Db2 restore from the backup image.
5. Using the transaction logs, perform a rollforward to end of logs and keep the database in
rollforward mode.
6. Next, the source and target databases then need to be updated with the corresponding
HADR settings.
If the source Db2 system is already using HADR on-premises, this can be done as an online
activity because the HADR_TARGET_LIST parameter is already configured and only needs
to be updated with the new standby node in the remote data center.
Otherwise, HADR activation can be done during a short system downtime outside of business
hours.
Example 11-1 contains the steps to configure the new auxiliary HADR standby for the purpose
of moving to the cloud if you already have an HADR cluster enabled on-premises.
db2start
db2 "update db cfg for <SID> using LOGINDEXBUILD = ON"
db2 "update db cfg for <SID> using HADR_TARGET_LIST
<primaryon-prem>:<port>|<standbyonprem>:<port> immediate"
db2 "update db cfg for <SID> using HADR_LOCAL_HOST <localhostname> immediate"
db2 "update db cfg for <SID> using HADR_LOCAL_SVC <port> immediate"
db2 "update db cfg for <SID> using HADR_REMOTE_HOST <primaryonprem> immediate"
db2 "update db cfg for <SID> using HADR_REMOTE_SVC <port> immediate"
db2 "update db cfg for <SID> using HADR_REMOTE_INST db2<SID> immediate"
db2 "update db cfg for <SID> using HADR_SYNCMODE NEARSYNC immediate"
db2 "update db cfg for <SID> using HADR_SPOOL_LIMIT AUTOMATIC immediate"
db2stop
db2start
db2 get db cfg for <SID> | grep HADR
db2 START HADR on DB <SID> as STANDBY
db2pd -db <SID> -hadr
Example 11-2 contains the steps to set up a new HADR pair solely for the purpose of moving
to the cloud, that is, when HADR is not yet in use on-premises.
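The following is a hedged sketch of such a setup, reusing the placeholders of Example 11-1 and introducing <standbycloud> as a placeholder for the new standby host in the cloud. The HADR_SYNCMODE value SUPERASYNC is an assumption chosen for WAN latency and must be adapted to your network and recovery requirements. On the new standby (after the restore and rollforward, with the database kept in rollforward-pending state):

db2 "update db cfg for <SID> using LOGINDEXBUILD ON"
db2 "update db cfg for <SID> using HADR_TARGET_LIST <primaryonprem>:<port>"
db2 "update db cfg for <SID> using HADR_LOCAL_HOST <standbycloud>"
db2 "update db cfg for <SID> using HADR_LOCAL_SVC <port>"
db2 "update db cfg for <SID> using HADR_REMOTE_HOST <primaryonprem>"
db2 "update db cfg for <SID> using HADR_REMOTE_SVC <port>"
db2 "update db cfg for <SID> using HADR_REMOTE_INST db2<sid>"
db2 "update db cfg for <SID> using HADR_SYNCMODE SUPERASYNC"
db2 START HADR ON DB <SID> AS STANDBY

On the on-premises primary:

db2 "update db cfg for <SID> using LOGINDEXBUILD ON"
db2 "update db cfg for <SID> using HADR_TARGET_LIST <standbycloud>:<port>"
db2 "update db cfg for <SID> using HADR_LOCAL_HOST <primaryonprem>"
db2 "update db cfg for <SID> using HADR_LOCAL_SVC <port>"
db2 "update db cfg for <SID> using HADR_REMOTE_HOST <standbycloud>"
db2 "update db cfg for <SID> using HADR_REMOTE_SVC <port>"
db2 "update db cfg for <SID> using HADR_REMOTE_INST db2<sid>"
db2 "update db cfg for <SID> using HADR_SYNCMODE SUPERASYNC"
db2 START HADR ON DB <SID> AS PRIMARY
db2pd -db <SID> -hadr

Start HADR on the standby before starting it on the primary, and verify the connection with db2pd.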
When the HADR cluster is operational, close monitoring is needed to check whether the log gap
is decreasing and the HADR members are approaching peer state. Peer state is required before
performing the cutover.
During the cutover, the SAP source application servers need to be isolated and stopped, as in
typical SAP migration scenarios. Then the HADR takeover at the cloud or new data center can
be performed, followed by starting the target SAP application servers. The common
homogeneous system copy tasks mentioned in the SAP system copy guide, such as key user
acceptance tests and go-live, are then executed.
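As a sketch, the HADR state and the log gap can be checked with db2pd on either member, and the role switch itself is a single command issued on the standby in the cloud; the grep pattern only filters the db2pd output:

db2pd -db <SID> -hadr | grep -E "HADR_ROLE|HADR_STATE|HADR_LOG_GAP"
db2 takeover hadr on db <SID>

Use the BY FORCE option of the TAKEOVER command only in emergency scenarios, because it can lead to data loss if the members are not in peer state.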
This step can be performed either by keeping the HADR cluster active for possible fallback
scenarios or by splitting the HADR cluster and removing the original source copy.
This approach enables customers to reduce the overall downtime by limiting the technical
system downtime, leaving mainly the business-related user acceptance and test activities. It
assumes that the file system contents of the SAP transport directory, the global directory,
and possible interface directories are synchronized before the cutover weekend.
An even more interesting procedure that involves HADR is discussed in 11.2.4, “HADR for
heterogeneous system copy” on page 189 which discusses how Db2 HADR can be used for a
heterogeneous system copy.
On the migration weekend, you can then switch the HADR role to primary in the remote data
center and subsequently perform the classic SWPM and R3load based migration approach. The
benefit is that the resulting export dump already resides within the new data center and is
copied locally.
This approach has been used successfully with customers, but it is an advanced option that
requires thorough planning.
Note: You cannot change the operating system with this option, so Db2 Shift can be used
for homogeneous system copies only.
The Db2 Click to Containerize family encompasses several tools that provide customers with
the ability to quickly modernize their Db2 landscape. The Db2 Shift utility is part of the Click to
Containerize family and can be used to clone a copy of Db2 into an OpenShift, Kubernetes,
Cloud Pak for Data (CP4D), or an IaaS Db2 instance. The utility is intended to help customers
move their existing Db2 databases on Linux into a cloud environment with a minimum
amount of effort. Some benefits of Db2 Shift are:
– Automated, fast, and secure movement of Linux databases to hybrid cloud environments
– Alignment with agile delivery and project lifecycles through its cloning capabilities
In addition to directly shifting the database from one location to another, the Db2 Shift
program also provides the ability to clone a database for future deployment. This feature is
useful for environments where the target server is air-gapped, or unavailable for direct
connection from the source server.
– For expert users, the Db2 Shift command can be issued with the appropriate options
and run directly from a command line or a script.
– For those users who require more help, the program can also be run in an interactive
mode, with detailed instructions and help for the various shift scenarios.
In summary, Db2 Shift provides the ability to quickly and easily shift your Db2 Linux database
into a containerized environment with a minimal amount of effort.
Perform the installation with SWPM and “Homogeneous System Copy” using the “Database Copy
Method”. Note that passwordless SSH for the db2<sid> instance user is required between the
source and the target system.
Follow the SWPM installation procedure until the exit step for the backup and restore, and run
the Db2 Shift procedure as a replacement instead. Afterward, continue with SWPM to run the
post-migration activities for SAP systems.
1. Before moving an SAP system, isolate it from an application point of view to avoid any kind
of communication while starting the new target system. Therefore, suspend batch jobs, lock
users, and disable interfaces and RFC communications.
2. Execute the offline move during the downtime, while the Db2 instance and all SAP
application servers are down and isolated, as shown in Example 11-4.
-source-dbname=c2c \
-source-hadr-host=ibmas-red-s \
-source-hadr-port=3700 \
-dest-server=db2c2c@ibmas-red-t \
-dest-hadr-host=ibmas-red-t \
-dest-hadr-port=3700 \
-source-owner=db2c2c \
-dest-owner=db2c2c
After the configuration and structure files have been generated, copy the entire database into
the staging directory (/tmp/cloneDir in Example 11-8) using Db2 Shift.
Example 11-8 db2shift command to copy the database into the staging area
db2 terminate; db2 deactivate db C2C
db2stop force
db2shift \
-mode=clone \
-source-dbname=C2C \
-source-owner=db2c2c \
-clone-dir=/tmp/cloneDir \
-offline \
-threads=8 \
-blank-slate=false
If the multi-parallel copy method completes successfully, copy the entire directory with all its
contents to your air-gapped target system before applying the clone. Afterward, apply the
clone using the --mirror-path option, which re-creates the exact Db2 directory structure of
your source system, as required for SAP systems. Optional DBM and DB CFG parameters,
such as INSTANCE_MEMORY, can be overridden. See Example 11-9 on page 194.
After the migration of the application data (including the data conversion), the update is
finalized, and the SAP system runs on the target database.
The process involves an R3load export and import, and thus the findings in this book are also
valuable for this procedure. For details, refer to the SAP guide Database Migration Option:
Target Database IBM Db2 for Linux, UNIX, and Windows.
Unfortunately, the DMO option is not available if the database management system is the
same on the source and target system. This means that DMO is not fully suitable for the use
case described in this book, migrations with Db2 from on premises to the cloud. It is,
however, possible to use the DMO option to migrate from other database vendors to Db2.
The DMO option is available on request as per SAP Note 3207766 (Database Migration
Option (DMO) of SUM 1.1 SP01). The DMO has to be requested by reporting an incident on
component BC-UPG-TLS-TLA.
One example of this procedure is IBM Data Replication - CDC Replication. This tool can be
embedded in a heterogeneous system copy with R3load, where the largest tables are
replicated while the source system is online. This allows you to reduce the downtime
significantly and can also be used for heterogeneous system copies with a change of
endianness. It also works for migrations from selected other database vendors to Db2. This is
shown in Figure 11-2.
Figure 11-2 Process overview with IBM Data Replication - CDC Replication
Note: This data replication process requires special knowledge and licensing, and thus a
dedicated, enabled team is required to run this procedure. Do not hesitate to contact the
authors of this book or your IBM representative for details about this procedure.
Related publications
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this book.
IBM Redbooks
The following IBM Redbooks publications provide additional information about the topic in this
document. Note that some publications referenced in this list might be available in softcopy
only.
Best Practices Guide for Databases on IBM FlashSystem, REDP-5520
Modernize SAP Workloads on IBM Power Systems, REDP-5577
SAP HANA on IBM Power Systems Virtual Servers: Hybrid Cloud Solution, REDP-5693
SAP HANA Data Management and Performance on IBM Power Systems, REDP-5570
SAP HANA on IBM Power Systems Backup and Recovery Solutions, REDP-5618
You can search for, view, download or order these documents and other Redbooks,
Redpapers, Web Docs, draft and additional materials, at the following website:
ibm.com/redbooks
Online resources
These websites are also relevant as further information sources:
IBM Db2 11.5 for Linux, UNIX and Windows documentation
https://www.ibm.com/docs/en/db2/11.5
System Copy for SAP ABAP Systems Based on UNIX: SAP HANA 2.0 Database - Using
Software Provisioning Manager 2.0
https://help.sap.com/docs/SLTOOLSET/afec789b34ba4a5fb5a26826fc8a2584/276253812bec452c8a63e61831db791c.html?version=CURRENT_VERSION_SWPM20
FAQ of IBM Cloud for SAP
https://cloud.ibm.com/docs/sap?topic=sap-faq-ibm-cloud-for-sap
SAP on IBM Power Systems Virtual Server
https://wiki.scn.sap.com/wiki/display/VIRTUALIZATION/SAP+on+IBM+Power+Systems+Virtual+Server
SAP Sizing
https://www.sap.com/about/benchmark/sizing.html
Troubleshoot and monitor Linux system performance with nmon
https://www.redhat.com/sysadmin/monitor-linux-performance-nmon