Amazon Redshift

Amazon Redshift is a fully managed, reliable data warehouse service in the cloud that offers large-scale storage and analysis of data sets and supports large-scale database migrations. It is part of the larger cloud-computing platform, Amazon Web Services (AWS).

You can ingest data from your Amazon Redshift database using Hevo Pipelines and replicate it to a Destination of your choice.

Prerequisites


Whitelist Hevo’s IP Addresses

You need to whitelist the Hevo IP addresses for your region to enable Hevo to connect to your Amazon Redshift database.

To do this:

  1. Log in to the Amazon Redshift dashboard.

  2. In the left navigation pane, click Clusters.

  3. Click the Cluster that you want to connect to Hevo.

  4. In the Configuration tab, under Cluster Properties, click the link under VPC security groups to open the Security Groups panel.

  5. In the Security Groups panel, click Inbound, and then, click Edit.

  6. In the Edit inbound rules dialog box:

    1. Click Add Rule.

    2. In the Type column, select Redshift from the drop-down.

    3. In the Port Range column, enter the port of your Amazon Redshift cluster. The default value is 5439.

    4. In the Source column, select Custom from the drop-down and enter Hevo’s IP addresses for your region.

    5. Click Save.

  7. In the Security Groups panel, click Outbound, and then, click Edit.

  8. Repeat Step 6 in the Outbound Rules tab to configure outbound rules.
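
If you manage security groups from a script rather than the console, the inbound rule from Step 6 can be sketched as a plain data structure. The following is a minimal sketch, not Hevo's or AWS's tooling: the function name build_inbound_rule is hypothetical, and the IP addresses are documentation placeholders, not Hevo's real addresses. The dictionary follows the IpPermissions shape accepted by the EC2 security group APIs, so it could be passed to a call such as boto3's authorize_security_group_ingress, which this sketch does not invoke.

```python
import ipaddress

REDSHIFT_DEFAULT_PORT = 5439  # default Amazon Redshift port, per the steps above


def build_inbound_rule(ip_addresses, port=REDSHIFT_DEFAULT_PORT):
    """Return an IpPermissions-style dict allowing TCP access on `port`.

    Hypothetical helper for illustration only.
    """
    ranges = []
    for ip in ip_addresses:
        # IPv4Network validates the address; a bare address becomes a /32 CIDR.
        network = ipaddress.IPv4Network(ip)
        ranges.append({"CidrIp": str(network), "Description": "Hevo"})
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": ranges,
    }


# Placeholder addresses from the TEST-NET-3 documentation range;
# replace them with Hevo's IP addresses for your region.
rule = build_inbound_rule(["203.0.113.10", "203.0.113.11"])
print(rule)
```

An invalid address raises a ValueError before any rule is built, which catches typos before they reach the security group.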


Create a Database User and Grant Privileges

1. Create a database user (optional)

To create a user, you must be a superuser or have the CREATE USER privilege.

Log in to your Amazon Redshift database and run the following command:

CREATE USER hevo WITH PASSWORD '<password>';

2. Grant privileges to the user

The database user specified in the Hevo Pipeline must have the SELECT privilege on the tables to be replicated.

To assign this privilege, log in to your Amazon Redshift database as a superuser and enter the following commands:

  • Grant SELECT privilege to all tables or specific tables:

    GRANT SELECT ON ALL TABLES IN SCHEMA <schema_name> TO hevo; -- all tables
    GRANT SELECT ON TABLE <schema_name>.<table_name> TO hevo; -- specific table
    
  • Optionally, view the list of tables available in a schema:

    SELECT DISTINCT tablename FROM pg_table_def WHERE schemaname = '<schema_name>';
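
When many tables are involved, the GRANT statements above can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper names quote_ident and grant_select_statements are hypothetical, and the sketch only builds the SQL text; it does not connect to Amazon Redshift or run anything.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_select_statements(user, schema, tables=None):
    """Build GRANT SELECT statements for a whole schema or for specific tables.

    Hypothetical helper: returns SQL strings only.
    """
    if not tables:
        # No table list given: grant on every table in the schema.
        return [
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {quote_ident(schema)} "
            f"TO {quote_ident(user)};"
        ]
    # One statement per named table.
    return [
        f"GRANT SELECT ON TABLE {quote_ident(schema)}.{quote_ident(table)} "
        f"TO {quote_ident(user)};"
        for table in tables
    ]


for statement in grant_select_statements("hevo", "public", ["orders", "customers"]):
    print(statement)
```

Quoting the identifiers keeps the generated statements valid when a schema or table name contains characters that would otherwise need escaping.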
    

Retrieve the Hostname and Port Number (Optional)

  1. Log in to the Amazon Redshift dashboard.

  2. In the left navigation pane, click Clusters.

  3. Click the Cluster that you want to connect to Hevo.

  4. Under Cluster Database Properties, locate the JDBC URL and the Port.

    The default Amazon Redshift port is 5439.

    Use the host name from this JDBC URL (the part between jdbc:redshift:// and the port) as the database host, and the Port as the database port, when creating your Pipeline in Hevo.
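
To pull the host, port, and database name out of a JDBC URL, a short parser can help. The following is a minimal sketch, assuming the URL has the form jdbc:redshift://<host>:<port>/<database>. The function name parse_redshift_jdbc_url and the example cluster endpoint are hypothetical.

```python
from urllib.parse import urlsplit

REDSHIFT_DEFAULT_PORT = 5439


def parse_redshift_jdbc_url(jdbc_url):
    """Split a jdbc:redshift:// URL into host, port, and database.

    Hypothetical helper for illustration only.
    """
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix + "redshift://"):
        raise ValueError("expected a URL starting with jdbc:redshift://")
    # Drop the "jdbc:" prefix so the remainder parses as an ordinary URL.
    parts = urlsplit(jdbc_url[len(prefix):])
    return {
        "host": parts.hostname,
        # Fall back to the default port when the URL omits it.
        "port": parts.port or REDSHIFT_DEFAULT_PORT,
        "database": parts.path.lstrip("/") or None,
    }


# Made-up endpoint used purely as an example.
example_url = (
    "jdbc:redshift://examplecluster.abc123xyz789."
    "us-west-2.redshift.amazonaws.com:5439/dev"
)
details = parse_redshift_jdbc_url(example_url)
print(details["host"], details["port"], details["database"])
```

The host value is what goes into the database host field, and the port value into the database port field.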


Configure Amazon Redshift Connection Settings

Perform the following steps to configure Amazon Redshift as a Source in Hevo:

  1. Click PIPELINES in the Asset Palette.

  2. Click + CREATE in the Pipelines List View.

  3. In the Select Source Type page, select Amazon Redshift.

  4. In the Configure your Amazon Redshift Source page, specify the following:

    • Pipeline Name: A unique name for your Pipeline.

    • Database Cluster Identifier: Amazon Redshift host’s IP address or DNS name.

    • Database Port: The port on which your Amazon Redshift server is listening for connections. Default value: 5439.

    • Database User: The authenticated user that can read the tables in your database.

    • Database Password: Password for the database user.

    • Database Name: The database that you wish to replicate.

    • Connect through SSH: Enable this option to connect to Hevo using an SSH tunnel, instead of directly connecting your Amazon Redshift database host to Hevo. This provides an additional level of security to your database by not exposing your Amazon Redshift setup to the public. Read Connecting Through SSH.

      If this option is disabled, you must whitelist Hevo’s IP addresses to allow Hevo to connect to your Amazon Redshift host.
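
Before filling in the form, it can help to collect the values in one place and check them for obvious mistakes. The following is a minimal sketch, not part of Hevo's API: the class RedshiftSourceSettings, its field names, and the example values are all hypothetical and simply mirror the fields listed above.

```python
from dataclasses import dataclass


@dataclass
class RedshiftSourceSettings:
    """Hypothetical container mirroring the Source configuration fields."""

    pipeline_name: str
    host: str          # Database Cluster Identifier: IP address or DNS name
    user: str
    password: str
    database: str
    port: int = 5439   # default Amazon Redshift port
    use_ssh: bool = False

    def problems(self):
        """Return a list of human-readable issues; empty if none were found."""
        issues = []
        for name in ("pipeline_name", "host", "user", "password", "database"):
            if not getattr(self, name).strip():
                issues.append(f"{name} must not be empty")
        if not 1 <= self.port <= 65535:
            issues.append("port must be between 1 and 65535")
        if "://" in self.host:
            # A full JDBC URL was pasted where only the host is expected.
            issues.append("host must be a bare host name or IP, not a URL")
        return issues


settings = RedshiftSourceSettings(
    pipeline_name="redshift_pipeline",
    host="examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com",
    user="hevo",
    password="<password>",
    database="dev",
)
print(settings.problems())
```

The check for "://" targets a common slip: pasting the whole JDBC URL into the host field instead of only the host name.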


Object Settings

Object settings must be configured if the Pipeline mode is Table.

To do this:

  1. After you specify the Source settings in Step 4 of the previous section, select the objects to be replicated in the SELECT OBJECTS YOU WANT TO REPLICATE page, and then, click CONTINUE.

    Note: Each object represents a table in your database.

  2. In the CONFIGURE SOURCE OBJECTS page, specify the query mode to be used for each selected object.


Limitations

None.


Revision History

Refer to the following table for the list of key updates made to this page:

Date          Release   Description of Change
Feb-22-2021   1.57      Revised the document to include the end-to-end procedure for configuring Amazon Redshift as a Source.

Last updated on 01 Jun 2021