Hevo can replicate tables from your Aurora database, either hosted in Amazon RDS. The replication can happen at any frequency that you would like. Following replication modes are supported:
- Full dump and load - Full tables are replicated at a set frequency.
- Incremental load for append-only data - Only new rows are replicated from tables. Suitable for append-only tables
- Incremental load for mutable data - Updated or new rows are replicates from tables. Suitable for mutable data.
1. Create a new Pipeline
Click on PIPELINES option in the left navigation bar and click on Create New Pipeline.
2. Select Source Type
Select Aurora from the list on Select Source Type Screen.
3. Provide Connection Settings
Provide your Aurora database connection details on Aurora Connection Settings page. You will have following options in connection details block:
- Source Name - A unique name for this source
- Aurora Host - Aurora host's IP address or DNS
- Aurora Port - The port on which your Aurora server is listening for connections (default is 3306)
- Database User - The read-only user that can read the tables in your database.
- Database Password - Password for the read-only user
- Database Name - The database that you wish to replicate
- If you want to connect to Hevo using an SSH tunnel, check How to Connect through SSH. Else, you will have to whitelist Hevo's IP addresses as listed here.
Alternatively, you copy details from an existing source of Aurora type. Please note that it will create an independent copy of the selected source.
Click TEST CONNECTION to test the credentials and click CONTINUE once test succeeds.
4. Select Ingestion Mode
On this page, you will have 3 options for Ingestion mode. This will define how you want Hevo to read your data from Aurora source.
- Table: In this mode, your tables will be polled individually at a fixed frequency. You can use this mode when you are looking to fetch data from multiple tables in your database and you would like to have control over ingestion for every table individually.
- Custom SQL: If you are looking to fetch data in a different structure than how it is stored in your source tables, you can use this mode. It will allow you to write a custom query and data will be polled using that query at a fixed frequency.
- BinLog: Data will be polled using Aurora's BinLog in this mode. This mode is useful when you are looking to replicate complete database as it is. This mode is very efficient in replicating but leaves you with less control and manageability over data ingestion. A few pointers on setting up the binary logs may be found here.
Select the Ingestion Mode and hit CONTINUE.
5. Provide Job Settings
You will land on this screen if you selected Table or Custom SQL mode on the last screen.
In case of Table mode, you will be presented with the list of tables in your Aurora database. You can deselect the table that you don't want to replicate. For every table, you will have an option to provide replication mode.
In case of Query mode, you will be presented with Query editor, where you can write the custom SQL query using which you want to load the data. Select the Replication mode after writing the query.
Replication mode defines how Hevo will read data from your tables or query in every run. You have following options for replication:
- Full Load: Complete result set from the Query or Table will be replicated at a set frequency.
- Delta - Append-only: Specify an auto-increment column (usually the Primary key). Only new results from the Query result set or Table will be replicated on the basis of specified auto-increment id.
- Delta - Timestamp: Specify update timestamp column name in 'Timestamp column' field. New or updated data from Query or Table will be replicated.
- Change Data Capture: This mode will take care of both incrementing id and update timestamp for detecting delta change in data while replicating Query or Table.
After entering the details, click CONTINUE.
6. Select the Destination
Select the Destination where you want to replicate Aurora Data or Click on NEW DESTINATION to create a new Destination. Check out How to add Destination tutorial for the detailed walkthrough on steps needed for adding new Destination.
7. Pipeline Created
Your Pipeline will be created when you enter this page and you will have an option to see Sample Data and Map Schema.
While Hevo tries to load your schemas, you can select CONTINUE IN BACKGROUND if it is taking too much time. Click on CREATE SCHEMA MAPPING to map Source and Destination Schemas, check out Introduction to Schema Mapper to learn about Schema Mapper or you can select DO IT LATER to directly head to Pipeline page. You can map schemas later on Schema Mapper page in your pipeline.
Please note that your data will not start replicating in Destination tables until you map source and Destination schemas.