Lab - Batch Data Ingestion With DMS - Instructor Setup
Table of Contents
Introduction
Create the Instructor Environment
Create the Change Data Capture Environment (Optional)
Appendix: AWS CloudFormation Template
Database Migration Services Instructor Environment for the Lab
Introduction
***Make sure you select the us-east-1 (Virginia) region***
The AWS Database Migration Service (DMS) hands-on lab provides a scenario in which participants
learn to hydrate an Amazon S3 data lake from a relational database. To do that, participants
need a source endpoint, and this guide helps instructors set up a PostgreSQL database with a
public endpoint as the source database.
Create the Instructor Environment
1. Sign in to the Console where you will host the source database environment.
4. Enter a stack name and the key pair to use. If you don’t already have an Amazon EC2
key pair in the us-east-1 (Virginia) region, create one first by following the Amazon EC2
key pairs section of the user guide.
5. Enter a tag for the Name that identifies the resources as part of this lab.
6. Launch the stack. The stack creates a new VPC, subnets, security groups, an EC2 instance,
a route table, routes, and an RDS PostgreSQL instance, and takes about 15 to 20 minutes to
launch. The stack's Resources tab in the AWS CloudFormation console lists everything it
creates.
7. Once the stack has launched, navigate to the Amazon Relational Database Service
(Amazon RDS) page, select Instances > dmslabinstance, and copy the instance's
Endpoint value.
8. SSH to the EC2 instance created by this template and run the following commands
in sequence:
cd aws-database-migration-samples/PostgreSQL/sampledb/v1/
export PGPASSWORD=master123
To check on the job's progress, view the install.out file with the following command:
cat ~/install.out
Note:
i. The output may include messages about non-existent tables, but you should not see
any errors, and the background process ends when the load is complete. You can
check whether the process is still running with the following command:
ps -aef | grep psql
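Steps 4 through 7 can also be scripted instead of clicked through. The following is a minimal sketch, assuming the boto3 SDK is installed and AWS credentials are configured. The helper names (build_create_stack_request, extract_endpoint, launch_and_get_endpoint) and the stack name and tag value passed to them are hypothetical and not part of the lab; only the KeyName parameter and the dmslabinstance identifier come from the template in the appendix.

```python
def build_create_stack_request(stack_name, key_name, template_body, name_tag):
    """Build keyword arguments for CloudFormation's create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # KeyName is the only parameter the lab template declares.
        "Parameters": [{"ParameterKey": "KeyName", "ParameterValue": key_name}],
        # Stack-level Name tag, as entered in step 5.
        "Tags": [{"Key": "Name", "Value": name_tag}],
    }


def extract_endpoint(describe_response):
    """Return (address, port) from an RDS describe_db_instances response."""
    endpoint = describe_response["DBInstances"][0]["Endpoint"]
    return endpoint["Address"], endpoint["Port"]


def launch_and_get_endpoint(stack_name, key_name, template_path, name_tag):
    """Create the stack, wait for it to finish, and return the RDS endpoint."""
    import boto3  # imported lazily so the helpers above work without the SDK

    with open(template_path, encoding="utf-8") as handle:
        template_body = handle.read()

    cloudformation = boto3.client("cloudformation", region_name="us-east-1")
    cloudformation.create_stack(
        **build_create_stack_request(stack_name, key_name, template_body, name_tag)
    )
    # Blocks until the stack reaches CREATE_COMPLETE; raises on failure or timeout.
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)

    rds = boto3.client("rds", region_name="us-east-1")
    response = rds.describe_db_instances(DBInstanceIdentifier="dmslabinstance")
    return extract_endpoint(response)
```

Calling launch_and_get_endpoint("dms-lab-source", "my-key", "instructor_dmslab.json", "DMSLab") would return the address and port to hand out to participants; all four argument values here are placeholders.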
Create the Change Data Capture Environment (Optional)
When you want to generate transactions to demonstrate DMS CDC functionality, connect to the
source database with the following command, replacing <instanceaddress> with the instance endpoint:
psql --host=<instanceaddress> --port=5432 --dbname=sportstickets --username=master
Enter the password “master123” when prompted, then run the following at the psql
prompt (sportstickets=>):
select dms_sample.generateticketactivity(1000);
select dms_sample.generatetransferactivity(100);
Note:
When enabling CDC functionality in DMS, only one DMS instance or task should activate “Ongoing
replication”, to avoid conflicts.
When replicating to multiple targets, the fan-out of updates should begin with the Amazon S3
bucket that is the target of the DMS task responsible for ongoing replication. It should not
begin with the source database, because only one CDC process should be tracking and setting
the last committed transaction that was replicated.
1. Create a custom DB parameter group in the RDS console for the postgres10 family. Go to
Amazon RDS > Parameter groups and click the Create parameter group button.
3. Modify the RDS instance you created, associate the custom parameter group with it in
place of the default parameter group, and choose to apply the change immediately.
4. Once the instance's parameter group shows the “pending-reboot” state, reboot the
RDS instance from the RDS console so that the new static parameters take effect.
5. After the database has rebooted, SSH to your EC2 instance and connect to the database
with psql. Enter the password “master123” when prompted, then run the
following SQL script to create the wrappers needed for DMS CDC replication:
BEGIN;
CREATE SCHEMA IF NOT EXISTS fnRenames;
CREATE OR REPLACE FUNCTION fnRenames.pg_switch_xlog() RETURNS pg_lsn AS $$
SELECT pg_switch_wal(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlog_replay_pause() RETURNS VOID AS $$
SELECT pg_wal_replay_pause(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlog_replay_resume() RETURNS VOID AS $$
SELECT pg_wal_replay_resume(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_current_xlog_location() RETURNS pg_lsn AS $$
SELECT pg_current_wal_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_is_xlog_replay_paused() RETURNS boolean AS $$
SELECT pg_is_wal_replay_paused(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlogfile_name(lsn pg_lsn) RETURNS TEXT AS $$
SELECT pg_walfile_name(lsn); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_last_xlog_replay_location() RETURNS pg_lsn AS $$
SELECT pg_last_wal_replay_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_last_xlog_receive_location() RETURNS pg_lsn AS $$
SELECT pg_last_wal_receive_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_current_xlog_flush_location() RETURNS pg_lsn AS $$
SELECT pg_current_wal_flush_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_current_xlog_insert_location() RETURNS pg_lsn AS $$
SELECT pg_current_wal_insert_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlog_location_diff(lsn1 pg_lsn, lsn2 pg_lsn) RETURNS NUMERIC AS $$
SELECT pg_wal_lsn_diff(lsn1, lsn2); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlogfile_name_offset(lsn pg_lsn, OUT TEXT, OUT INTEGER) AS $$
SELECT pg_walfile_name_offset(lsn); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_create_logical_replication_slot(slot_name name, plugin name,
temporary BOOLEAN DEFAULT FALSE, OUT slot_name name, OUT xlog_position pg_lsn) RETURNS RECORD AS $$
SELECT slot_name::NAME, lsn::pg_lsn FROM pg_catalog.pg_create_logical_replication_slot(slot_name, plugin,
temporary); $$ LANGUAGE SQL;
COMMIT;
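Steps 4 and 5 depend on the parameter group change having taken effect before the wrapper script is run. The following is a minimal sketch for checking that from code rather than from the console, assuming boto3 is installed; the function names and the parameter group name dmslab-cdc-params in the usage note are hypothetical, since this guide does not name the group.

```python
def parameter_group_status(describe_response, group_name):
    """Return the ParameterApplyStatus for one parameter group, or None."""
    instance = describe_response["DBInstances"][0]
    for group in instance.get("DBParameterGroups", []):
        if group["DBParameterGroupName"] == group_name:
            return group["ParameterApplyStatus"]
    return None


def ready_for_wrappers(describe_response, group_name):
    """True once the instance is available and the group is in sync."""
    instance = describe_response["DBInstances"][0]
    return (
        instance.get("DBInstanceStatus") == "available"
        and parameter_group_status(describe_response, group_name) == "in-sync"
    )


def fetch_instance(instance_id="dmslabinstance"):
    """Fetch the describe_db_instances response for the lab instance."""
    import boto3  # imported lazily so the checks above work without the SDK

    rds = boto3.client("rds", region_name="us-east-1")
    return rds.describe_db_instances(DBInstanceIdentifier=instance_id)
```

For example, ready_for_wrappers(fetch_instance(), "dmslab-cdc-params") returns False while the group is still in the pending-reboot state and True after the reboot has completed.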
Appendix: AWS CloudFormation Template
Copy and paste this template into an instructor_dmslab.json file on your computer and save it.
Select that file in AWS CloudFormation for Step 3.
{
"AWSTemplateFormatVersion": "2010-09-09",
"Parameters" : {
"KeyName": {
"Description" : "Name of an existing EC2 KeyPair to enable SSH access to the instance",
"Type": "AWS::EC2::KeyPair::KeyName",
"ConstraintDescription" : "must be the name of an existing EC2 KeyPair in us-east-1 only."
}
},
"Resources": {
"dmsinstructorvpc": {
"Type": "AWS::EC2::VPC",
"Properties": {
"CidrBlock": "10.0.0.0/24",
"InstanceTenancy": "default",
"EnableDnsSupport": "true",
"EnableDnsHostnames": "true",
"Tags": [
{
"Key": "Name",
"Value": "DMSLabSourceDB"
}
]
}
},
"RDSSubNet": {
"Type": "AWS::EC2::Subnet",
"Properties": {
"CidrBlock": "10.0.0.0/28",
"AvailabilityZone": "us-east-1d",
"VpcId": {
"Ref": "dmsinstructorvpc"
},
"Tags": [
{
"Key": "Name",
"Value": "DMSLabRDS1"
}
]
}
},
"EC2SubNet": {
"Type": "AWS::EC2::Subnet",
"Properties": {
"CidrBlock": "10.0.0.32/28",
"AvailabilityZone": "us-east-1c",
"VpcId": {
"Ref": "dmsinstructorvpc"
},
"Tags": [
{
"Key": "Name",
"Value": "DMSLabEC2"
}
]
}
},
"RDSSubNet2": {
"Type": "AWS::EC2::Subnet",
"Properties": {
"CidrBlock": "10.0.0.16/28",
"AvailabilityZone": "us-east-1b",
"VpcId": {
"Ref": "dmsinstructorvpc"
},
"Tags": [
{
"Key": "Name",
"Value": "DMSLabRDS2"
}
]
}
},
"igw0887475a258f00277": {
"Type": "AWS::EC2::InternetGateway",
"Properties": {
"Tags": [
{
"Key": "Name",
"Value": "DMSLabIGW"
}
]
}
},
"dopt1cc25278": {
"Type": "AWS::EC2::DHCPOptions",
"Properties": {
"DomainName": "ec2.internal",
"DomainNameServers": [
"AmazonProvidedDNS"
]
}
},
"rtb0c3fae104a7b64456": {
"Type": "AWS::EC2::RouteTable",
"Properties": {
"VpcId": {
"Ref": "dmsinstructorvpc"
},
"Tags": [
{
"Key": "Name",
"Value": "DMSLabRT"
}
]
}
},
"instancei0f63b887480639040": {
"Type": "AWS::EC2::Instance",
"Properties": {
"DisableApiTermination": "false",
"InstanceInitiatedShutdownBehavior": "stop",
"EbsOptimized": "true",
"ImageId": "ami-04681a1dbd79675a5",
"InstanceType": "t3.2xlarge",
"KeyName": {"Ref" : "KeyName" },
"UserData" : {"Fn::Base64" : {"Fn::Join" : ["", [
"#!/bin/bash -xe\n",
"yum install -y postgresql\n",
"yum install -y git\n",
"yum update -y\n",
"cd /home/ec2-user\n",
"git clone https://github.com/aws-samples/aws-database-migration-samples.git\n"
]]}},
"Monitoring": "false",
"Tags": [
{
"Key": "Name",
"Value": "DMSLabEC2"
}
],
"NetworkInterfaces": [
{
"DeleteOnTermination": "true",
"Description": "Primary network interface",
"DeviceIndex": 0,
"SubnetId": {
"Ref": "EC2SubNet"
},
"PrivateIpAddresses": [
{
"PrivateIpAddress": "10.0.0.40",
"Primary": "true"
}
],
"GroupSet": [
{
"Ref": "sgDMSLabSG"
}
],
"AssociatePublicIpAddress": "true"
}
]
}
},
"rdsdmslabdb": {
"Type": "AWS::RDS::DBInstance",
"Properties": {
"AllocatedStorage": "20",
"AllowMajorVersionUpgrade": "false",
"AutoMinorVersionUpgrade": "true",
"DBInstanceClass": "db.t2.xlarge",
"DBInstanceIdentifier": "dmslabinstance",
"Port": "5432",
"PubliclyAccessible": "true",
"StorageType": "gp2",
"BackupRetentionPeriod": "7",
"MasterUsername": "master",
"MasterUserPassword": "master123",
"PreferredBackupWindow": "04:00-04:30",
"PreferredMaintenanceWindow": "sun:05:20-sun:05:50",
"DBName": "sportstickets",
"Engine": "postgres",
"EngineVersion": "10.4",
"LicenseModel": "postgresql-license",
"DBSubnetGroupName": {
"Ref": "dbsubnetdefaultdmsinstructorvpc"
},
"VPCSecurityGroups": [
{
"Ref": "sgrdslaunchwizard2"
}
],
"Tags": [
{
"Key": "workload-type",
"Value": "other"
}
]
}
},
"dbsubnetdefaultdmsinstructorvpc": {
"Type": "AWS::RDS::DBSubnetGroup",
"Properties": {
"DBSubnetGroupDescription": "Created from the RDS Management Console",
"SubnetIds": [
{
"Ref": "RDSSubNet"
},
{
"Ref": "EC2SubNet"
},
{
"Ref": "RDSSubNet2"
}
]
}
},
"sgDMSLabSG": {
"Type": "AWS::EC2::SecurityGroup",
"Properties": {
"GroupDescription": "launch-wizard-6 created 2018-08-29T15:10:01.302-04:00",
"VpcId": {
"Ref": "dmsinstructorvpc"
}
}
},
"sgrdslaunchwizard2": {
"Type": "AWS::EC2::SecurityGroup",
"Properties": {
"GroupDescription": "Created from the RDS Management Console: 2018/08/29 18:14:15",
"VpcId": {
"Ref": "dmsinstructorvpc"
},
"Tags": [
{
"Key": "Name",
"Value": "DMSLabRDS-SG"
}
]
}
},
"dbsgdefault": {
"Type": "AWS::RDS::DBSecurityGroup",
"Properties": {
"GroupDescription": "default"
}
},
"gw1": {
"Type": "AWS::EC2::VPCGatewayAttachment",
"Properties": {
"VpcId": {
"Ref": "dmsinstructorvpc"
},
"InternetGatewayId": {
"Ref": "igw0887475a258f00277"
}
}
},
"subnetroute1": {
"Type": "AWS::EC2::SubnetRouteTableAssociation",
"Properties": {
"RouteTableId": {
"Ref": "rtb0c3fae104a7b64456"
},
"SubnetId": {
"Ref": "RDSSubNet2"
}
}
},
"subnetroute2": {
"Type": "AWS::EC2::SubnetRouteTableAssociation",
"Properties": {
"RouteTableId": {
"Ref": "rtb0c3fae104a7b64456"
},
"SubnetId": {
"Ref": "RDSSubNet"
}
}
},
"subnetroute3": {
"Type": "AWS::EC2::SubnetRouteTableAssociation",
"Properties": {
"RouteTableId": {
"Ref": "rtb0c3fae104a7b64456"
},
"SubnetId": {
"Ref": "EC2SubNet"
}
}
},
"route1": {
"Type": "AWS::EC2::Route",
"Properties": {
"DestinationCidrBlock": "0.0.0.0/0",
"RouteTableId": {
"Ref": "rtb0c3fae104a7b64456"
},
"GatewayId": {
"Ref": "igw0887475a258f00277"
}
},
"DependsOn": "gw1"
},
"dchpassoc1": {
"Type": "AWS::EC2::VPCDHCPOptionsAssociation",
"Properties": {
"VpcId": {
"Ref": "dmsinstructorvpc"
},
"DhcpOptionsId": {
"Ref": "dopt1cc25278"
}
}
},
"ingress1": {
"Type": "AWS::EC2::SecurityGroupIngress",
"Properties": {
"GroupId": {
"Ref": "sgDMSLabSG"
},
"IpProtocol": "tcp",
"FromPort": "22",
"ToPort": "22",
"CidrIp": "0.0.0.0/0"
}
},
"ingress2": {
"Type": "AWS::EC2::SecurityGroupIngress",
"Properties": {
"GroupId": {
"Ref": "sgrdslaunchwizard2"
},
"IpProtocol": "tcp",
"FromPort": "5432",
"ToPort": "5432",
"SourceSecurityGroupId": {
"Ref": "sgDMSLabSG"
},
"SourceSecurityGroupOwnerId": "649225637812"
}
},
"ingress3": {
"Type": "AWS::EC2::SecurityGroupIngress",
"Properties": {
"GroupId": {
"Ref": "sgrdslaunchwizard2"
},
"IpProtocol": "tcp",
"FromPort": "5432",
"ToPort": "5432",
"CidrIp": "72.21.196.67/32"
}
},
"ingress4": {
"Type": "AWS::EC2::SecurityGroupIngress",
"Properties": {
"GroupId": {
"Ref": "sgrdslaunchwizard2"
},
"IpProtocol": "tcp",
"FromPort": "5432",
"ToPort": "5432",
"CidrIp": "0.0.0.0/0"
}
},
"egress1": {
"Type": "AWS::EC2::SecurityGroupEgress",
"Properties": {
"GroupId": {
"Ref": "sgDMSLabSG"
},
"IpProtocol": "-1",
"CidrIp": "0.0.0.0/0"
}
},
"egress2": {
"Type": "AWS::EC2::SecurityGroupEgress",
"Properties": {
"GroupId": {
"Ref": "sgrdslaunchwizard2"
},
"IpProtocol": "-1",
"CidrIp": "0.0.0.0/0"
}
}
},
"Description": "DMS Lab Instructor account",
"Metadata": {
"AWS::CloudFormation::Designer": {
"a79fb943-c167-4e59-8eda-911d4acc331f": {
"size": {
"width": 60,
"height": 60
},
"position": {
"x": 810,
"y": 390
},
"z": 1,
"embeds": []
}
}
}
}
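When the template is pasted from a PDF, long string values can end up split across lines, which makes the file invalid JSON, so it is worth checking the saved file before uploading it. The following is a minimal sketch, using only the Python standard library, that loads instructor_dmslab.json and confirms that every subnet CIDR sits inside its VPC and that no two subnets overlap. The function names are hypothetical, and the check covers only the VPC and subnet resources.

```python
import ipaddress
import json


def load_template(path="instructor_dmslab.json"):
    """Parse the saved template; raises json.JSONDecodeError if it is not valid JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def check_template_network(template):
    """Return a list of problems found in the template's VPC and subnet CIDRs."""
    resources = template["Resources"]
    vpcs = {
        name: ipaddress.ip_network(res["Properties"]["CidrBlock"])
        for name, res in resources.items()
        if res["Type"] == "AWS::EC2::VPC"
    }
    subnets = {
        name: (
            res["Properties"]["VpcId"]["Ref"],
            ipaddress.ip_network(res["Properties"]["CidrBlock"]),
        )
        for name, res in resources.items()
        if res["Type"] == "AWS::EC2::Subnet"
    }

    problems = []
    # Every subnet must fall inside the VPC it references.
    for name, (vpc_name, cidr) in subnets.items():
        if not cidr.subnet_of(vpcs[vpc_name]):
            problems.append(f"{name} ({cidr}) is outside {vpc_name}")
    # No two subnets may share addresses.
    names = sorted(subnets)
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            if subnets[first][1].overlaps(subnets[second][1]):
                problems.append(f"{first} overlaps {second}")
    return problems
```

An empty list from check_template_network(load_template()) means the file parsed and the network layout is consistent; it does not validate the rest of the template.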