Amazon Redshift

Amazon Redshift is a fully managed, reliable data warehouse service in the cloud that offers large-scale storage and analysis of data sets and supports large-scale database migrations. It is a part of the larger cloud-computing platform Amazon Web Services (AWS).

Hevo can load data from any of your Pipelines into an Amazon Redshift data warehouse. You can set up the Redshift Destination on the fly, as part of the Pipeline creation process, or independently. The ingested data is first staged in Hevo’s S3 bucket before it is batched and loaded to the Amazon Redshift Destination.


Whitelist Hevo’s IP Addresses

You need to whitelist Hevo’s IP addresses for your region to enable Hevo to connect to your Amazon Redshift database.

To do this:

  1. Log in to the Amazon Redshift dashboard.

  2. In the left navigation pane, click Clusters.

  3. Click the Cluster that you want to connect to Hevo.


  4. In the Configuration tab, under Cluster Properties, click the link under VPC security groups to open the Security Groups panel.


  5. In the Security Groups panel, click Inbound, and then click Edit.


  6. In the Edit inbound rules dialog box:


    1. Click Add Rule.

    2. In the Type column, select Redshift from the drop-down.

    3. In the Port Range column, enter the port of your Amazon Redshift cluster. The default value is 5439.

    4. In the Source column, select Custom from the drop-down and enter Hevo’s IP addresses for your region. Repeat this step to whitelist all the IP addresses.

    5. Click Save.

Create a User and Grant Privileges

1. Create a user (optional)

  1. Log in to your Amazon Redshift database as a superuser or a user with CREATE privilege.

  2. Enter the following command:

    CREATE USER hevo WITH PASSWORD '<password>';

2. Grant privileges to the user

  1. Log in to your Amazon Redshift database as a superuser.

  2. Enter the following commands:

    1. Grant CREATE privilege to the database user for an existing database:

      GRANT CREATE ON DATABASE <database_name> TO hevo;
    2. Grant SELECT privilege to all tables or specific tables:

      GRANT SELECT ON ALL TABLES IN SCHEMA <schema_name> TO hevo; -- all tables
      GRANT SELECT ON TABLE <schema_name>.<table_name> TO hevo; -- specific table
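Once these commands have run, you can optionally confirm the grants from the same session using Redshift’s built-in privilege functions. A minimal check, using the same placeholder names as above:

```sql
-- Returns true if the hevo user can create objects in the database.
SELECT has_database_privilege('hevo', '<database_name>', 'CREATE');

-- Returns true if the hevo user can read from a specific table.
SELECT has_table_privilege('hevo', '<schema_name>.<table_name>', 'SELECT');
```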

Retrieve the Hostname and Port Number (Optional)

  1. Log in to the Amazon Redshift dashboard.

  2. In the left navigation pane, click Clusters.

  3. Click the Cluster that you want to connect to Hevo.


  4. Under Cluster Database Properties, locate the JDBC URL and the Port.

    The default Amazon Redshift port is 5439.

    Use the hostname from this JDBC URL as the database host and the Port as the database port in Hevo while creating your Pipeline.

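For reference, an Amazon Redshift JDBC URL generally follows this pattern (the cluster name, Region, and database below are illustrative):

```
jdbc:redshift://examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev
```

The portion between jdbc:redshift:// and the colon is the hostname to use as the database host in Hevo, 5439 is the database port, and dev is the database name.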

Configure Amazon Redshift as a Destination

Perform the following steps to configure Amazon Redshift as a Destination in Hevo:

  1. Click DESTINATIONS in the Asset Palette.

  2. Click + CREATE in the Destinations List View.

  3. In the Add Destination page, select Amazon Redshift.

  4. In the Configure your Amazon Redshift Destination page, specify the following:


    • Destination Name: A unique name for your Destination.

    • Database Cluster Identifier: The IP address or DNS name of your Amazon Redshift host.

    • Database Port: The port on which your Amazon Redshift server listens for connections. Default value: 5439

    • Database User: A user with a non-administrative role in the Redshift database.

    • Database Password: The password of the user.

    • Database Name: The name of the Destination database where data will be loaded.

    • Database Schema: The name of the Destination database schema. Default value: public.

    • Additional Settings:

      • Connect through SSH: Enable this option to connect to Hevo using an SSH tunnel, instead of directly connecting your Amazon Redshift database host to Hevo. This provides an additional level of security to your database by not exposing your Redshift setup to the public. Read Connecting Through SSH.

        If this option is disabled, you must whitelist Hevo’s IP addresses.

      • Sanitize Table/Column Names?: Enable this option to remove all non-alphanumeric characters and spaces in a table or column name, and replace them with an underscore (_). Read Name Sanitization.

    • Advanced Settings:

      • Populate Loaded Timestamp: Enable this option to append the __hevo_loaded_at_ column to the Destination table to indicate the time when the Event was loaded to the Destination. See Loading Data to a Data Warehouse for more information.
  5. Click Test Connection to test connectivity with the Amazon Redshift warehouse.

  6. Once the test is successful, click SAVE DESTINATION.

Handling Source Data with Different Data Types

For teams created in or after Hevo Release 1.60, Hevo automatically modifies the data type of an Amazon Redshift Destination table column to accommodate Source data with a different data type. Read Handling Different Data Types in Source Data.

Note: Your Hevo release version is mentioned at the bottom of the Asset Palette.

Handling Source Data with JSON Fields

For Pipelines created in or after Hevo Release 1.74, Hevo uses Replicate JSON fields to JSON columns as the default parsing strategy to load the Source data to the Amazon Redshift Destination.

With the changed strategy, you can query your JSON data directly, eliminating the need to parse it as JSON strings. This change does not affect the functionality of your existing Pipelines; hence, if you want to apply the new parsing strategy to them, you must recreate those Pipelines.

In addition, the replication strategies, Flatten structs and split arrays to new Events and Replicate JSON fields as JSON strings and array fields as strings have been deprecated for newer Pipelines, and are no longer visible in the UI.

Read Parsing Nested JSON Fields in Events for more information.

Destination Considerations

  • Amazon Redshift is case insensitive to the names of database objects, including tables and columns. For example, if your JSON field names are in mixed or uppercase, such as Product or ITEMS, Amazon Redshift does not recognize these field names and hence cannot fetch data from them. To enable Amazon Redshift to identify such JSON field names, you must set the session parameter enable_case_sensitive_identifier to TRUE. Read SUPER configurations for more information.
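    As a sketch, assuming a table with a SUPER column named data that holds a mixed-case field Product (both names are illustrative), you would enable case-sensitive identifiers for the session before querying it:

    ```sql
    -- Match mixed-case JSON field names such as Product in SUPER columns.
    SET enable_case_sensitive_identifier TO true;

    -- The quoted identifier now preserves its case.
    SELECT data."Product" FROM <schema_name>.<table_name>;
    ```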

  • The SUPER data type only supports up to 1 MB of data for an individual SUPER field or object. Read SUPER type - Limitations for more information.

  • Hevo stages the ingested data in an Amazon S3 bucket, from where it is loaded to the Destination tables using the COPY command. Hence, if you have enabled enhanced VPC routing, ensure that your VPC is configured correctly. Enhanced VPC routing affects the way your Amazon Redshift cluster accesses other resources in your AWS network, such as the S3 bucket, specifically for the COPY and UNLOAD commands. Read Enhanced VPC Routing in Amazon Redshift.


  • Hevo replicates a maximum of 4096 columns to each Amazon Redshift table. Read Limits on the Number of Columns.

  • Hevo does not support writing to tables that have identity columns.

    Let us suppose you create a table with the default Identity column and manually map a Source table to it. When the Pipeline runs, Hevo would issue insert queries to write all the values from the Source table to this table. However, the writes would fail, since Amazon Redshift does not permit writing values to the identity column.
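    As an illustration, a table such as the following (the names are hypothetical) would cause such loads to fail, because Amazon Redshift generates the values of the id column itself and rejects explicit inserts into it:

    ```sql
    -- id is an identity column; Amazon Redshift auto-generates its values.
    CREATE TABLE <schema_name>.orders (
        id BIGINT IDENTITY(1, 1),
        amount DECIMAL(10, 2)
    );

    -- An insert that supplies a value for id, as a mapped load would,
    -- is rejected by Amazon Redshift.
    INSERT INTO <schema_name>.orders (id, amount) VALUES (1, 99.50);
    ```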

  • Hevo supports mapping of only JSON fields to the SUPER data type that Amazon Redshift uses to support JSON columns.

    Read SUPER type.
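    As a sketch, once a JSON field is loaded into a SUPER column (the column and field names below are illustrative), it can be queried directly with PartiQL-style navigation:

    ```sql
    -- order_details is a SUPER column holding the replicated JSON field.
    SELECT
        order_details.customer.name,
        order_details.items[0].sku
    FROM <schema_name>.orders;
    ```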

See Also

Revision History

Refer to the following table for the list of key updates made to this page:

Date Release Description of Change
Mar-07-2022 NA Updated the section, Destination Considerations for actions to be taken when Enhanced VPC Routing is enabled.
Feb-07-2022 1.81 Updated section, Whitelist Hevo’s IP Address to remove details about Outbound rules as they are not required.
Nov-09-2021 NA Updated section, Step 2. Create a Database User and Grant Privileges, with the list of commands to be run for granting privileges to the user.
Oct-25-2021 1.74 Added sections:
- Handling Source Data with JSON Fields.
- Destination Considerations.
Updated sections:
- Limitations to add the limitation about Hevo mapping only JSON fields.
- See Also.
Apr-06-2021 1.60 - Added section, Handling Source Data with Different Data Types.
Feb-22-2021 NA - Added the limitation that Hevo does not support writing to tables that have identity columns.
- Updated the page overview to state that the Pipeline stages the ingested data in Hevo’s S3 bucket, from where it is finally loaded to the Destination.
- Revised the procedural sections to include detailed steps for configuring the Amazon Redshift Destination.
Last updated on 24 Jun 2022