In this post, we will explore one approach to Migrate PostgreSQL DB on EC2 instance to RDS instance using AWS Data Migration services.
Amazon Relational Database Service (Amazon RDS) is quite popular because of its easy setup, operation, and auto-scalability. It provides cost-efficient and resizable capacity while automating time-consuming administration tasks such as hardware provisioning, database setup, patching and backups. It frees you to focus on your applications so you can give them the fast performance, high availability, security, and compatibility they need.
At a high level this migration will be done in 3 steps, let’s briefly go over each piece of the puzzle:
- Prepare source (EC2 Postgres) for database migration service
- Prepare target (RDS Postgres) for database migration service
- Set up DMS for full load or full load + replication data transfer from source to target
Proof Of Concept
Assumptions and Prerequisites:
- Source at EC2 Postgres version 9.5.21 & target at RDS Postgres version 9.5.21
- Source has primary keys on all tables
- Superuser access on both source and target Postgres environment
1. Prepare source (EC2 Postgres) for database migration service
1a. Add the IP address of the AWS DMS replication server to the pg_hba.conf configuration file and enable replication and socket connections. For example:
# Replication Instance host all all 126.96.36.199/00 md5 # Allow replication connections from localhost, by a user with the # replication privilege. host replication dms 188.8.131.52/00 md5
1b. Set the following parameters and values in the postgresql.conf configuration file:
- Set wal_level = logical
- Set max_replication_slots to a value greater than 1. The max_replication_slots value should be set according to the number of tasks that you want to run. For example, to run five tasks you need to set a minimum of five slots. Slots open automatically as soon as a task starts and remain open even when the task is no longer running. You need to manually delete open slots.
- Set max_wal_senders to a value greater than 1. The max_wal_senders parameter sets the number of concurrent tasks that can run.
- Set wal_sender_timeout =0. The wal_sender_timeout parameter terminates replication connections that are inactive longer than the specified number of milliseconds. Although the default is 60 seconds, we recommend that you set this parameter to zero, which disables the timeout mechanism.
2. Prepare target (RDS Postgres) for database migration service
AWS DMS takes a table-by-table approach when migrating data from source to target in the Full Load phase. Table order during the full load phase cannot be guaranteed. Tables are out of sync during the full load phase and while cached transactions for individual tables are being applied. As a result, active referential integrity constraints can result in task failure during the full load phase.
In PostgreSQL, foreign keys (referential integrity constraints) are implemented using triggers. During the full load phase, AWS DMS loads each table one at a time. Recommendations are that you disable foreign key constraints during a full load, using one of the following methods:
- Temporarily disable all triggers from the instance, and finish the full load
- Use the
session_replication_roleparameter in PostgreSQL.
Alter system set session_replication_role='replica';
3. Setup DMS for (full load) or (full load + replication) data transfer between source and target
Following is the setup for demonstration of full load data transfer with a sample database of Postgres version 9.5.21 (source and target)
3a. Create and setup replication instance
3b. Create and setup source endpoint
3c. Create and set up target endpoint
3d. Create, setup and run migration task
Limitations for DMS: There are multiple limitations with different versions of Postgres, the following docs will help to identify the issue and mitigate it before running the migration.
Source & Target limitations
Great Read! Nicely explained !!