postgrespro.com
Peter Petrov, Database Engineer
September 09, 2021
Speaker bio
1. Started working with Oracle 10g Release 2 in 2013 and PostgreSQL 8.4 in 2015.
2. Wrote procedures to transfer data from Oracle to PostgreSQL; database sizes varied from 1TB to 5TB.
3. Consulted customers on correct business logic design and development using the PL/pgSQL programming language.
4. Participated in the optimization of various database queries.
5. Participated in troubleshooting various situations that occurred during PostgreSQL maintenance.
6. Designed some business logic using the PL/SQL and Java programming languages.
Agenda
Determining the source RDBMS features and
assessing migration feasibility (1)
The data schema can be migrated “as is” or with the following modifications:
It is possible to migrate only part of the schema, since some objects may be used only by deprecated code.
There can also be changes related to schema normalization and denormalization.
Replacing column data types with PostgreSQL-compatible counterparts.
Dividing some objects into smaller partitions to reduce the time of maintenance operations.
Creating additional data structures for storing intermediate computation results.
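Splitting a large object into smaller partitions can be sketched with PostgreSQL declarative partitioning (available since version 10); the table name and range bounds below are illustrative:

```sql
-- Hypothetical table partitioned by date range to shrink the scope
-- of maintenance operations (VACUUM, index rebuilds, archiving).
CREATE TABLE orders (
    order_id   bigint NOT NULL,
    created_at date   NOT NULL,
    payload    text
) PARTITION BY RANGE (created_at);

-- One partition per year; old years can be detached or dropped cheaply.
CREATE TABLE orders_2020 PARTITION OF orders
    FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
CREATE TABLE orders_2021 PARTITION OF orders
    FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');
```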
The application itself may rely on source RDBMS features and therefore must be partially rewritten when migrating to PostgreSQL:
Old join syntax in user queries.
Use of hierarchical queries in the application code.
Presence of user-defined data types.
Specific functions, procedures, or modules for communication with external systems.
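As an illustration of the first two points, an Oracle-style outer join and a CONNECT BY hierarchical query can be rewritten as follows (the classic emp/dept sample tables are assumed here):

```sql
-- Oracle old-style outer join:
--   SELECT e.ename, d.dname FROM emp e, dept d WHERE e.deptno = d.deptno(+);
-- PostgreSQL ANSI counterpart:
SELECT e.ename, d.dname
FROM emp e
LEFT JOIN dept d ON e.deptno = d.deptno;

-- Oracle hierarchical query:
--   SELECT empno, ename FROM emp
--   START WITH mgr IS NULL CONNECT BY PRIOR empno = mgr;
-- PostgreSQL recursive CTE:
WITH RECURSIVE h AS (
    SELECT empno, ename FROM emp WHERE mgr IS NULL
    UNION ALL
    SELECT e.empno, e.ename
    FROM emp e
    JOIN h ON e.mgr = h.empno
)
SELECT empno, ename FROM h;
```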
Brief data migration description (1)
Data migration 10
Brief data migration description (2)
Brief data migration description (3)
Data conversion tips (1)
1. A data type mapping must be established between the source and the destination DBMS.
2. If a required data type is absent in PostgreSQL, a transformation procedure should be used to convert it into an existing PostgreSQL type.
3. Some additional PostgreSQL modules provide extra data types for the application’s needs.
4. Consider the possibility of storing various document formats, as well as sequences of nested data structures, inside table columns.
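A type mapping from tip 1 might look like this in DDL; the doc table is a made-up example:

```sql
-- Oracle source (for reference):
--   CREATE TABLE doc (id NUMBER(10), title VARCHAR2(200),
--                     body CLOB, scan BLOB, created DATE);
-- PostgreSQL counterpart with common type replacements:
CREATE TABLE doc (
    id      bigint,        -- NUMBER(10,0): an integer type is usually enough
    title   varchar(200),  -- VARCHAR2(n) -> varchar(n)
    body    text,          -- CLOB -> text
    scan    bytea,         -- BLOB -> bytea
    created timestamp,     -- Oracle DATE carries a time-of-day component
    attrs   jsonb          -- nested structures and documents (tip 4)
);
```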
Data conversion tips (2)
ora2pg as a tool for schema and data migration
ora2pg is an open-source utility for working with the Oracle DBMS that provides the following features:
1. Creating a migration project.
2. Scanning the source RDBMS and extracting its schema definition and data.
3. Generating commands to create PostgreSQL-compatible structures.
4. Saving the source DBMS data in intermediate files, if necessary.
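A minimal ora2pg.conf sketch for this workflow might look as follows; the connection values and schema name are placeholders:

```
# Source Oracle connection (placeholders)
ORACLE_DSN    dbi:Oracle:host=source-host;sid=ORCL;port=1521
ORACLE_USER   system
ORACLE_PWD    secret
# Schema to scan
SCHEMA        APP_SCHEMA
# Export type: TABLE generates DDL; COPY dumps data to intermediate files
TYPE          TABLE
# File that will receive the generated PostgreSQL-compatible commands
OUTPUT        tables.sql
```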
pgloader is an open-source utility for working with the MS SQL Server, MySQL, and PostgreSQL DBMS. pgloader provides the following features:
• Scanning the source RDBMS and extracting its schema definition and data.
• Creating schema objects directly in PostgreSQL without generating intermediate SQL files.
• Generating a file containing the rows that caused an error during processing in the destination DBMS; the remaining rows are saved successfully.
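A pgloader command file for a MySQL source might look like this sketch; the connection strings and database names are placeholders:

```
LOAD DATABASE
     FROM mysql://app_user:secret@source-host/appdb
     INTO postgresql://app_user:secret@target-host/appdb
 WITH include drop, create tables, create indexes, reset sequences
 CAST type datetime to timestamptz;
```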
1. ora2pg for partial translation of Oracle and MySQL stored procedure code.
2. ANTLR4 and its grammar files for parsing PL/SQL and T-SQL code.
3. Third-party online conversion services such as sqlines.com.
13. Queries with filter and join conditions that cannot be calculated at query execution time.
14. Executing scheduled tasks by using the dbms_scheduler package.
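dbms_scheduler jobs can often be moved to the pg_cron extension; the job below is illustrative:

```sql
-- pg_cron must be installed and listed in shared_preload_libraries.
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Run a hypothetical cleanup every night at 03:00,
-- analogous to a dbms_scheduler job.
SELECT cron.schedule(
    'nightly-cleanup',
    '0 3 * * *',
    $$DELETE FROM event_log WHERE created_at < now() - interval '90 days'$$
);
```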
9. For stored code unit testing, the pgTAP and pg_prove modules should be used.
10. Autonomous transactions are available in the Postgres Pro Enterprise distribution; the dblink or pg_background modules could also be used for that.
11. pg_variables could be used for storing various data structures during user sessions.
12. PostgreSQL provides the RLS mechanism for the implementation of additional row access rules.
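Row-level security (item 12) can be sketched as follows; the account table and the ownership rule are illustrative:

```sql
CREATE TABLE account (
    owner   text,
    balance numeric
);

-- Once RLS is enabled, rows are visible only through matching policies.
ALTER TABLE account ENABLE ROW LEVEL SECURITY;

-- Each database role sees only its own rows.
CREATE POLICY account_owner ON account
    USING (owner = current_user);
```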
Preparing the system for the real-world workload
Preparing a code and data conversion solution (1)
The main goals are reducing downtime and avoiding code reconversion. The possible methods are presented below:
1. Capturing changes in the schema definition and its data.
2. Detecting unchanged data and converting it ahead of time.
3. Using triggers for capturing data changes.
4. Testing the data transfer procedure and measuring its execution time.
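Trigger-based change capture (method 3) can be sketched like this; the orders table and the jsonb change log are assumptions:

```sql
-- Service table that accumulates captured changes.
CREATE TABLE orders_changes (
    change_id  bigserial PRIMARY KEY,
    op         text,                      -- 'INSERT' / 'UPDATE' / 'DELETE'
    changed_at timestamptz DEFAULT now(),
    row_data   jsonb
);

CREATE OR REPLACE FUNCTION capture_order_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO orders_changes(op, row_data)
    VALUES (TG_OP, to_jsonb(COALESCE(NEW, OLD)));  -- OLD for DELETE
    RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;

-- EXECUTE FUNCTION requires PostgreSQL 11+; older versions use EXECUTE PROCEDURE.
CREATE TRIGGER orders_capture
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION capture_order_change();
```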
1. MATERIALIZED VIEW LOG and transferring its data to a message queue such as Apache Kafka:
https://github.com/averemee-si/oracdc. This solution is tailored to business reporting schemas as well as Oracle Business Suite.
4. SymmetricDS uses triggers to write changes to a service table; data packages are then formed and transferred for the exchange between the source and target nodes:
https://www.symmetricds.org/doc/3.7/html/user-guide.html#_architecture
5. Debezium connectors for retrieving data changes from various DBMS with subsequent transfer to Apache Kafka:
https://github.com/debezium/debezium/tree/master/debezium-connector-sqlserver
https://debezium.io/documentation/reference/connectors/sqlserver.html
https://docs.informatica.com/data-integration/powerexchange-for-cdc-and-mainframe/10-0/_cdc-guide-for-linux-unix-and-windows_powerexchange-for-cdc-and-mainframe_100_ditamap/powerexchange_cdc_data_sources_1/db2_for_linux_unix_and_windows_cdc.html
An example of transferring a corporate document
management system to the Postgres Pro Standard (1)
System’s summary:
1. Total database size is 600GB.
2. The number of concurrent user sessions on the application servers: 600-800.
3. All business logic was implemented on the application side.
An example of transferring a corporate document
management system to the Postgres Pro Standard (2)
Goals:
1. Develop an automated procedure for transferring data to the Postgres Pro Standard DBMS. Determine the list of objects to be transferred.
2. Suggest some solutions for improving the database performance.
3. Convert heavy queries to work with the Postgres Pro Standard DBMS.
An example of transferring a corporate document
management system to the Postgres Pro Standard (3)
Actions performed:
1. Scripts for automatic invocation of the ora2pg and Pentaho Kettle utilities for schema conversion and data transfer have been designed. The user could specify lists of transferred objects as well as the number of simultaneous reading and writing threads.
2. Heavy queries have been converted, and recommendations for their optimization have been developed.
3. Schema changes have been included in the migration procedures.
4. Consultations on DBMS and OS tuning have been provided.
An example of transferring a department system to
Postgres Pro Enterprise (1)
System’s summary:
1. Total database size is 5TB.
2. The number of concurrent user sessions on the application servers: 2000-5000.
3. 95% of business logic was implemented on the application side; 5% was implemented in the form of Oracle views for reporting.
4. There was a table with large binary data with a total size of 4.5TB, as well as tables with more than 1 billion rows.
An example of transferring a department system to
Postgres Pro Enterprise (2)
Goals:
1. Develop a procedure for transferring data to the Postgres Pro Enterprise DBMS. Determine the list of objects to be transferred.
2. Convert views to work with Postgres Pro Enterprise.
3. Optimize the Postgres Pro Enterprise DBMS to handle hundreds of user sessions on a server with many cores.
An example of transferring a department system to
Postgres Pro Enterprise (3)
Actions performed:
1. Scripts for automatic invocation of the ora2pg and Pentaho Kettle utilities for schema conversion and data transfer have been implemented. The user could specify lists of transferred objects as well as the number of simultaneous reading and writing threads.
2. Heavy views have been converted, recommendations for their optimization have been developed, and consultations on DBMS and OS tuning have been provided.
3. Methods for tracking changes in tables with binary data and a large number of rows have been proposed.
4. Patches for the DBMS kernel have been developed to optimize work with a large number of user sessions.
PostgreSQL features from version 12 (1)
PostgreSQL features from version 12 (2)
Useful links (1)
ora2pg: https://github.com/darold/ora2pg
pentaho kettle: https://help.pentaho.com/Documentation/9.1
pgloader: https://github.com/dimitri/pgloader
ANTLR4: https://github.com/antlr/antlr4
sqlines: https://github.com/dmtolpeko/sqlines
orafce: https://github.com/orafce/orafce
pg_cron: https://github.com/citusdata/pg_cron
Useful links (2)
pg_timetable: https://github.com/cybertec-postgresql/pg_timetable
plprofiler: https://github.com/bigsql/plprofiler
multicorn: https://github.com/Segfault-Inc/Multicorn
oracdc: https://github.com/averemee-si/oracdc
kafka-connect-oracle: https://github.com/erdemcer/kafka-connect-oracle
debezium + log miner: https://github.com/debezium/debezium-incubator/pull/185
debezium + xstream: https://github.com/debezium/debezium
mamonsu: https://github.com/postgrespro/mamonsu
Useful links (3)
http://postgrespro.com/
[email protected]
[email protected]