Amazon Relational Database Service (RDS) allows you to deploy scalable MySQL servers in minutes with cost-efficient and resizable hardware capacity.
You can ingest data from your Amazon RDS MySQL database using Hevo Pipelines and replicate it to a Destination of your choice. Hevo recommends connecting to a read-replica of your master MySQL database, although this is not mandatory. A read-replica reflects the changes to your master database in near real time and can reduce the load on the master database by serving all “read” requests. The Role attribute of the instance identifies its type.
Prerequisites
Perform the following steps to configure your Amazon RDS MySQL Source:
Create a Read Replica (Optional)
To use an existing read-replica or connect Hevo to your master database, skip to the Set up MySQL Binary Logs for Replication section.
To create a read-replica:
-
Open the Amazon RDS console.
-
In the left navigation pane, under Dashboard, click Databases (or Instances if you are using an older version).
-
In the Databases section on the right, select the DB identifier of the RDS MySQL instance you want to replicate. For example, amazon-rds-mysql.
-
In the Actions drop-down, click Create read replica.
-
On the Create read replica page, specify the following:
-
Select the instance specifications relevant to your requirements.
-
Under the Connectivity section, set the Public access option to Publicly accessible to allow connections to the DB instance from a public IP address, such as Hevo’s IP address.
-
Click Create read replica. It takes a few minutes for the read replica to be created, during which it shows the status as Creating. The status changes to Available when the replica is created.
You can now see the Read Replica instance in the Databases section.
Set up MySQL Binary Logs for Replication
A binary log is a collection of log files that records data modifications and data object modifications made on a MySQL server instance. Typically, binary logs are used for data replication and data recovery.
In order for the binary log to record the data modifications, automatic backups must be enabled with a duration of at least 1 day.
To enable binary logging, follow these steps:
-
Open the Amazon RDS console.
-
In the left navigation pane, click Databases (or Instances if you are using an older version).
-
In the Databases section on the right, click the DB instance that you want to connect.
-
Click the Configuration tab, and then click the link text under DB instance parameter group. If you have created the parameter group with Type as DB cluster parameter group, click the link text under DB cluster parameter group.
-
Click Edit.
-
Update the values of the parameters as follows:
Parameter Name | Value
binlog_format | ROW
binlog_row_image | full
gtid-mode | ON
enforce_gtid_consistency | ON
log_slave_updates | 1
Note:
-
The log_slave_updates setting is required only if you are connecting to a read replica. When it is set to 1, the replica writes the updates it receives from the main database to its own binary log.
-
Enabling GTID is recommended because it makes replication simpler and helps ensure that the primary and replica servers are in sync with each other.
-
Ensure that the value of the binlog_row_value_options parameter is not set to PARTIAL_JSON.
-
Click Save changes.
-
Reboot the database instance that you are using to connect to Hevo, to apply the above changes.
To do this:
-
In the left navigation pane, under Dashboard, select Databases.
-
In the Databases section on the right, select the DB identifier of the MySQL instance you are replicating.
-
In the Actions drop-down, click Reboot.
-
On the Reboot DB Instance page, click Confirm to reboot your DB instance.
The replication reference guide on MySQL’s documentation portal provides a complete reference of the options available for replication and binary logging.
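The parameter values listed above can be checked programmatically before you reboot the instance. The following is a minimal sketch in Python, not part of Hevo or AWS tooling: the function name find_mismatches and the input dictionary are hypothetical, and it assumes you have already read the current parameter values (for example, from the parameter group page) into a plain dictionary.

```python
# Required values, taken from the parameter table in this section.
REQUIRED_PARAMETERS = {
    "binlog_format": "ROW",
    "binlog_row_image": "full",
    "gtid-mode": "ON",
    "enforce_gtid_consistency": "ON",
}


def find_mismatches(current, is_read_replica=False):
    """Return {parameter: (expected, actual)} for every value that differs.

    `current` is a hypothetical dict of parameter name -> current value.
    """
    required = dict(REQUIRED_PARAMETERS)
    if is_read_replica:
        # log_slave_updates is required only when connecting to a read replica.
        required["log_slave_updates"] = "1"

    mismatches = {}
    for name, expected in required.items():
        actual = current.get(name)
        # Compare case-insensitively so that "full" and "FULL" are treated alike.
        if actual is None or str(actual).upper() != expected.upper():
            mismatches[name] = (expected, actual)

    # binlog_row_value_options must not be set to PARTIAL_JSON.
    if str(current.get("binlog_row_value_options", "")).upper() == "PARTIAL_JSON":
        mismatches["binlog_row_value_options"] = ("not PARTIAL_JSON", "PARTIAL_JSON")
    return mismatches
```

An empty result means that the supplied values match the table. Any entry in the result names a parameter to update before clicking Save changes.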
2. Enable Automatic Backups
-
In the Amazon RDS console, in the left navigation pane, click Databases (or Instances if you are using an older version).
-
In the Databases section on the right, select the instance for which you want to enable automatic backups, and then click Modify.
-
On the Modify DB Instance: <Instance name> page, scroll down to the Backup section.
-
Select the Enable automated backups check box and set the Backup retention period to any value greater than or equal to 1 day. A backup retention period of at least 3 days is recommended.
-
Click Continue.
-
Under the Schedule modifications panel, select Apply immediately, and then click Modify DB instance.
-
Log in to your Amazon RDS MySQL database instance with ADMIN privileges.
-
Run the following command to view the current BinLog retention period (in hours):
call mysql.rds_show_configuration;
-
If the BinLog retention period is less than 72 hours, run the following command to set it to at least 72 hours (three days).
call mysql.rds_set_configuration('binlog retention hours', 72);
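The retention check in the last two steps can be expressed as a small decision function. This is an illustrative Python sketch only: retention_statement is a hypothetical helper, and treating an unset (NULL) retention value as insufficient is an assumption of the sketch. It does not connect to the database; it only returns the statement you would run.

```python
MIN_RETENTION_HOURS = 72  # three days, as recommended in the step above


def retention_statement(current_hours, minimum=MIN_RETENTION_HOURS):
    """Return the SQL call that raises BinLog retention, or None if no change is needed.

    `current_hours` is the 'binlog retention hours' value reported by
    mysql.rds_show_configuration. None stands for a NULL (unset) value,
    which this sketch assumes is insufficient.
    """
    if current_hours is not None and current_hours >= minimum:
        return None
    return f"call mysql.rds_set_configuration('binlog retention hours', {minimum});"
```

For example, a reported value of 24 yields the call shown in the step above, while a value of 72 or more yields None.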
Allowlist Hevo IP addresses for your region
You need to allowlist the Hevo IP address for your region to enable Hevo to connect to your Amazon RDS MySQL database.
To do this:
-
Open the Amazon RDS console.
-
In the left navigation pane, click Databases (or Instances if you are using an older version).
-
In the Databases section on the right, click the DB identifier of the Amazon RDS MySQL instance to configure a security group.
Note: The instance does not necessarily have to be a replica as long as it allowlists the region’s IP addresses.
-
In the Connectivity & security tab:
-
Note the region in which your Hevo account is set up. You can identify it from the Hevo App URL. For example, the URL for the US region is https://us.hevodata.com/.
-
Click the link text under Security, VPC security groups to open the Security Groups panel.
-
Select the security group and then select Edit inbound rules from the Actions drop-down.
-
On the Edit inbound rules page:
-
Click Add rule.
-
In the Port range column, enter the port of your Amazon RDS MySQL instance. For example, 3306.
-
In the Source column, select Custom from the drop-down and enter Hevo’s IP addresses for your region.
-
Click Save rules.
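Before entering addresses in the Source column, it can help to validate them and convert them to CIDR form. The sketch below uses Python's standard ipaddress module. The function build_inbound_rules and the rule dictionary layout are hypothetical, and any addresses you pass in should be Hevo's published IP addresses for your region, which are not listed here.

```python
import ipaddress


def build_inbound_rules(addresses, port=3306):
    """Return one inbound rule description per address, with the source in CIDR form."""
    rules = []
    for address in addresses:
        # A bare address becomes a single-host network (/32 for IPv4).
        # Malformed input, or a network with host bits set, raises ValueError.
        network = ipaddress.ip_network(address, strict=True)
        rules.append({"protocol": "tcp", "port": port, "source": str(network)})
    return rules
```

Each returned entry corresponds to one row that you would add on the Edit inbound rules page: the port of your Amazon RDS MySQL instance and one source range.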
Create a Database User and Grant Privileges
1. Create a database user (Optional)
Perform the following steps to create a database user in your Amazon RDS MySQL database:
-
Connect to your Amazon RDS MySQL database as a root user with an SQL client tool, such as MySQL Workbench.
-
Create a database user:
CREATE USER <username>@'%' IDENTIFIED BY '<password>';
Note: Replace the placeholder values in the command above with your own. For example, <username> with hevo.
2. Grant privileges to the user
The database user specified in the Hevo Pipeline must have the following global privileges:
-
SELECT
- Select data from tables in your database which you want to replicate.
-
RELOAD
- Access to clear or reload internal caches, flush tables, or acquire locks.
-
SHOW DATABASES
- View list of database names in the server.
-
REPLICATION CLIENT
- Access replication status and log details.
-
REPLICATION SLAVE
- Access to the MySQL server BinLog for replication.
Note: The SELECT, RELOAD, and SHOW DATABASES privileges are only required during the historical load.
Perform the following steps to set up these privileges:
-
Connect to your Amazon RDS MySQL database as a root user with an SQL client tool, such as MySQL Workbench.
-
Grant the required privileges to the user:
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO <username>@'%';
-
Allow Hevo to access your database:
GRANT SELECT ON <database-name>.* TO <username>;
-
Grant privileges to the database user to read BinLog settings:
GRANT EXECUTE ON PROCEDURE mysql.rds_show_configuration TO '<username>'@'<hostname>';
-
Finalize the user’s permissions:
FLUSH PRIVILEGES;
Note: Replace the placeholder values in the commands above with your own. For example, <username> with hevo.
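If you set up several users or databases, the GRANT statements above can be generated from the placeholder values rather than edited by hand. The following Python sketch is one possible way to do that. The function grant_statements is hypothetical, and restricting names to a conservative character set is an assumption made here to keep the generated SQL safe, not a MySQL or Hevo rule.

```python
import re

# Global privileges listed in this section.
GLOBAL_PRIVILEGES = ["SELECT", "RELOAD", "SHOW DATABASES", "REPLICATION CLIENT", "REPLICATION SLAVE"]


def grant_statements(username, database, host="%"):
    """Return the GRANT statements from this section for one user and one database."""
    # Assumption: accept only simple names so that quoting cannot be broken.
    for value in (username, database):
        if not re.fullmatch(r"[A-Za-z0-9_]+", value):
            raise ValueError(f"unsupported characters in {value!r}")
    if not re.fullmatch(r"[A-Za-z0-9_.%-]+", host):
        raise ValueError(f"unsupported characters in {host!r}")

    account = f"'{username}'@'{host}'"
    return [
        f"GRANT {', '.join(GLOBAL_PRIVILEGES)} ON *.* TO {account};",
        f"GRANT SELECT ON `{database}`.* TO {account};",
        f"GRANT EXECUTE ON PROCEDURE mysql.rds_show_configuration TO {account};",
    ]
```

Calling grant_statements("hevo", "demo1") returns three statements that you can review and then run in your SQL client.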
Retrieve the Database Hostname and Port Number (Optional)
Note: The RDS hostnames start with your database name and end with rds.amazonaws.com.
For example:
Host : mysql-rds-replica-1.xxxxxxxxx.rds.amazonaws.com
Port : 3306
-
In the left navigation pane of the Amazon RDS console, click Databases (or Instances if you are using an older version).
-
In the Databases section on the right, click the DB identifier of the Amazon RDS MySQL instance.
-
Click the Connectivity & security tab, and copy the values under Endpoint and Port as the hostname and port number. You will specify these while creating your Hevo Pipeline.
Perform the following steps to configure your Amazon RDS MySQL Source:
-
Click PIPELINES in the Navigation Bar.
-
Click the Edge tab in the Pipelines List View and click + CREATE EDGE PIPELINE.
-
On the Create Pipeline page, under Source Configuration, do the following:
-
In the Selection screen, select Amazon RDS MySQL.
-
In the Amazon RDS MySQL screen, specify the following:
-
Source Name: A unique name for your Source, not exceeding 255 characters. For example, Amazon RDS MySQL Source.
-
In the Connect to your MySQL section:
-
Database Host: The MySQL host’s IP address or DNS. This is the endpoint that you obtained in the Retrieve the Database Hostname and Port Number section above.
Note: For URL-based hostnames, exclude the http:// or https:// part. For example, if the hostname URL is http://mysql-replica.westeros.inc, enter mysql-replica.westeros.inc.
-
Database Port: The port on which your Amazon RDS MySQL server listens for connections. This is the port number that you obtained in the Retrieve the Database Hostname and Port Number section above. Default value: 3306.
-
Database User: The authenticated user who has the permissions to read tables in your database. This user can be the one you created in the Create a Database User and Grant Privileges section above or an existing user. For example, hevouser.
-
Database Password: The password of your database user.
-
Database Names: The comma-separated list of databases from which you want to replicate data. For example, demo1, demo2.
-
(Optional) In the Additional Settings section:
-
Use SSH: Enable this option to connect to Hevo using an SSH tunnel instead of directly connecting your MySQL database host to Hevo. This provides an additional level of security to your database by not exposing your MySQL setup to the public.
If this option is turned off, you must configure your Source to accept connections from Hevo’s IP addresses.
-
Use SSL: Enable this option to use an SSL-encrypted connection. Specify the following:
-
CA File: The file containing the SSL server certificate authority (CA).
-
Client Certificate: The client’s public key certificate file.
-
Client Key: The client’s private key file.
-
Click TEST & CONTINUE to test the connection to your Amazon RDS MySQL Source. Once the test is successful, you can proceed to set up your Destination.
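The note for the Database Host field asks you to exclude the http:// or https:// part of URL-based hostnames. A minimal Python sketch of that cleanup, using the standard urllib.parse module, is shown below. The function name normalize_host is hypothetical, and dropping a port or path that follows the hostname is an extra convenience assumed here.

```python
from urllib.parse import urlsplit


def normalize_host(value):
    """Return the bare hostname to enter in the Database Host field."""
    value = value.strip()
    if "://" not in value:
        # Already a bare hostname or IP address.
        return value
    # urlsplit().hostname drops the scheme, any port, and any path.
    host = urlsplit(value).hostname
    if not host:
        raise ValueError(f"no hostname found in {value!r}")
    return host
```

With the example from the note, normalize_host("http://mysql-replica.westeros.inc") returns mysql-replica.westeros.inc.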
Data Type Mapping
Hevo maps the MySQL Source data type internally to a unified data type, referred to as the Hevo Data Type, in the table below. This data type is used to represent the Source data from all supported data types in a lossless manner.
The following table lists the supported MySQL data types and the corresponding Hevo data type to which they are mapped:
MySQL Data Type | Hevo Data Type
BIT(1), BOOLEAN, TINYINT(1), TINYINT UNSIGNED(1) | BOOLEAN
TINYINT(>1), SMALLINT, TINYINT UNSIGNED(>1) | SHORT
INT, MEDIUMINT, SMALLINT UNSIGNED, MEDIUMINT UNSIGNED, YEAR | INTEGER
BIGINT, INT UNSIGNED, BIGINT UNSIGNED | LONG
FLOAT(0-23) | FLOAT
REAL, DOUBLE, FLOAT(24-53) | DOUBLE
NUMERIC, DECIMAL | DECIMAL
CHAR, VARCHAR, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT, JSON, ENUM, SET | VARCHAR
TIMESTAMP | TIME_TZ
DATE | DATE
TIME | TIME
DATETIME | TIMESTAMP
BIT(>1), BINARY, VARBINARY, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB | BYTEARRAY
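The mapping table above can be read as a lookup with three size-dependent cases (TINYINT, BIT, and FLOAT). The Python sketch below illustrates that reading. It is not Hevo's implementation: the function hevo_type is hypothetical, and the handling of a type written without a width or precision (TINYINT treated as SHORT, BIT treated as BIT(1), FLOAT treated as FLOAT) is an assumption, since the table does not cover those cases.

```python
import re

# Direct lookups copied from the mapping table above.
_DIRECT = {
    "BOOLEAN": "BOOLEAN", "SMALLINT": "SHORT",
    "INT": "INTEGER", "MEDIUMINT": "INTEGER", "YEAR": "INTEGER",
    "SMALLINT UNSIGNED": "INTEGER", "MEDIUMINT UNSIGNED": "INTEGER",
    "BIGINT": "LONG", "INT UNSIGNED": "LONG", "BIGINT UNSIGNED": "LONG",
    "REAL": "DOUBLE", "DOUBLE": "DOUBLE",
    "NUMERIC": "DECIMAL", "DECIMAL": "DECIMAL",
    "TIMESTAMP": "TIME_TZ", "DATE": "DATE", "TIME": "TIME", "DATETIME": "TIMESTAMP",
}
for _name in ("CHAR", "VARCHAR", "TINYTEXT", "TEXT", "MEDIUMTEXT", "LONGTEXT", "JSON", "ENUM", "SET"):
    _DIRECT[_name] = "VARCHAR"
for _name in ("BINARY", "VARBINARY", "TINYBLOB", "BLOB", "MEDIUMBLOB", "LONGBLOB"):
    _DIRECT[_name] = "BYTEARRAY"

# Captures the base type, an optional leading size such as (1), and an optional UNSIGNED.
_TYPE_PATTERN = re.compile(r"\s*([A-Za-z]+)\s*(?:\(\s*(\d+)[^)]*\))?\s*(UNSIGNED)?", re.IGNORECASE)


def hevo_type(mysql_type):
    """Return the Hevo data type for a MySQL column type, or None if it is not in the table."""
    match = _TYPE_PATTERN.match(mysql_type)
    if match is None:
        return None
    base = match.group(1).upper()
    size = int(match.group(2)) if match.group(2) else None
    unsigned = match.group(3) is not None

    if base == "TINYINT":
        # Signed and unsigned map alike; a missing width is assumed to be SHORT.
        return "BOOLEAN" if size == 1 else "SHORT"
    if base == "BIT":
        # A missing width is assumed to mean BIT(1).
        return "BOOLEAN" if size in (None, 1) else "BYTEARRAY"
    if base == "FLOAT":
        # A missing precision is assumed to be single precision.
        return "FLOAT" if size is None or size <= 23 else "DOUBLE"

    key = f"{base} UNSIGNED" if unsigned else base
    return _DIRECT.get(key)
```

A None result only means that the type does not appear in the table above; it does not by itself indicate how Hevo treats that type.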
At this time, the following MySQL data types are not supported by Hevo:
Note: If any of the Source objects contain data types that are not supported by Hevo, they are marked as unsupported during object configuration in the Pipeline.
Source Considerations
- MySQL does not generate log entries for cascading deletes. So, Hevo cannot capture these deletes for log-based Pipelines.
Limitations
-
Hevo only fetches tables from the MySQL database. It does not fetch other entities such as functions, stored procedures, views, and triggers.
-
Hevo does not set the metadata column __hevo_is_deleted__ to True for data deleted from the Source table using the TRUNCATE command. This action could result in a data mismatch between the Source and Destination tables.