Amazon Redshift (Edge)

Last updated on Feb 19, 2025

Edge Pipeline is currently available under Early Access. You can request access to evaluate and test its features.

Amazon Redshift is a fully managed, reliable cloud data warehouse service that offers large-scale storage and analysis of data sets and supports large-scale database migrations. It is part of the larger cloud computing platform, Amazon Web Services (AWS).

Hevo can load data from any of your Pipelines into an Amazon Redshift data warehouse. You can set up the Redshift Destination on the fly, as part of the Pipeline creation process, or independently. The ingested data is first staged in Hevo’s S3 bucket before it is batched and loaded to the Amazon Redshift Destination.

If you are new to AWS and Redshift, you can follow the steps listed below to create an AWS account and then create an Amazon Redshift database to which the Hevo Pipeline will load the data. Alternatively, you can create users and assign them the required permissions to set up and manage databases within Amazon Redshift. Read AWS Identity and Access Management for more details.


Modifying Amazon Redshift Destination Configuration

After creating an Amazon Redshift Destination, you can modify the database user and password.

Note: Any configuration changes will affect all the Pipelines configured to use this Destination.

To modify the configuration of your Amazon Redshift Destination:

  1. In the detailed view of your Destination, click EDIT.

    (Image: Amazon Redshift Destination Detailed View)

  2. On the <Your Destination Name> page, you can do the following:

    1. Enter your new Database User.

    2. Click Change and enter your new Database Password.

    (Image: Change Database Password)

  3. Click TEST & SAVE to check the connection to your Amazon Redshift Destination and then save the modified configuration. You can also verify the new credentials yourself, as sketched below.
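
If you want to double-check the new database user and password outside Hevo, you can run a quick connectivity check against your cluster. The following is a minimal sketch using the psycopg2 Python client; the endpoint, port, database name, and credentials are placeholders, not values supplied by Hevo.

    # Minimal sketch: verify Redshift credentials with a PostgreSQL-compatible client.
    # All connection values below are placeholders; substitute your own cluster details.
    import psycopg2

    conn = psycopg2.connect(
        host="examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com",  # placeholder endpoint
        port=5439,                # default Amazon Redshift port
        dbname="dev",             # placeholder database name
        user="hevo_user",         # the new database user you configured
        password="new_password",  # the new database password
    )

    with conn.cursor() as cur:
        cur.execute("SELECT current_user, current_database();")
        print(cur.fetchone())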


Data Type Mapping

Hevo maps a Source data type internally to a unified data type, referred to as the Hevo Data Type in the table below. This data type is used to represent the Source data from all supported data types in a lossless manner. The Hevo data types are then mapped to the corresponding data types that are supported in each Destination.

Hevo Data Type                  Redshift Data Type
ARRAY, JSON, STRUCT             SUPER
BOOLEAN                         BOOLEAN
BYTEARRAY                       VARBYTE
BYTE, SHORT                     SMALLINT
DATE                            DATE
DATETIME, TIMESTAMP             TIMESTAMP
DECIMAL                         DECIMAL
DOUBLE                          DOUBLE PRECISION
FLOAT                           REAL
INTEGER                         INTEGER
LONG                            BIGINT
TIME                            TIME
TIMESTAMPTZ, ZONEDDATETIME      TIMESTAMPTZ
TIMETZ                          TIMETZ
VARCHAR                         VARCHAR
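
To see how these mappings surface in the Destination, you can inspect the column types of a loaded table directly in Amazon Redshift. The sketch below is illustrative only: it reuses the conn object from the earlier connectivity sketch and assumes a hypothetical Destination table named orders in the public schema.

    # Minimal sketch: list the column types of a loaded Destination table.
    # Reuses the `conn` object from the connectivity sketch above; the table
    # name "orders" and schema "public" are hypothetical examples.
    query = """
        SELECT column_name, data_type
        FROM information_schema.columns
        WHERE table_schema = 'public'
          AND table_name = 'orders'
        ORDER BY ordinal_position;
    """

    with conn.cursor() as cur:
        cur.execute(query)
        for column_name, data_type in cur.fetchall():
            # For example, a Source JSON field should appear as super and a
            # Source DOUBLE as double precision, per the mapping table above.
            print(f"{column_name}: {data_type}")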

Destination Considerations

  • The Amazon Redshift SUPER data type is case-insensitive by default. As a result, Amazon Redshift does not differentiate between JSON field names within the same object that vary only in case, such as Product, PRODUCT, or product. To enable Amazon Redshift to differentiate between such field names, you must set the warehouse parameter enable_case_sensitive_super_attribute to TRUE. You can then query fields whose names differ only in case within the same object. See the sketch after this list for an example of setting the case-sensitivity parameters.

    Read SUPER configurations.

  • Hevo converts the table and column names to lowercase during loading. Therefore, if you want to query your Destination irrespective of the case, ensure that the warehouse parameter enable_case_sensitive_identifier remains set to its default value, FALSE. If this parameter is set to TRUE, you need to use the table and column names in lowercase when querying the Destination.

  • Hevo stages the ingested data in an Amazon S3 bucket, from where it is loaded to the Destination tables using the COPY command. Hence, if you have enabled enhanced VPC routing, ensure that your VPC is configured correctly. Enhanced VPC routing affects the way your Amazon Redshift cluster accesses other resources in your AWS network, such as the S3 bucket, specifically for the COPY and UNLOAD commands. Read Enhanced VPC Routing in Amazon Redshift.

  • Hevo uses the Amazon Redshift COPY command to load data into the Destination tables. However, the COPY command has the following limitations:

    • If a Source object has a row size exceeding 4 MB, Hevo cannot load the object, as the COPY command supports a maximum row size of 4 MB. For example, an object having a VARBYTE row with 6 MB of data cannot be loaded, even though the VARBYTE data type supports up to 16 MB. To avoid such a scenario, ensure that each row in your Source objects contains less than 4 MB of data.

    • The command does not support Not a Number (NaN) values in FLOAT and DOUBLE columns, so Hevo loads NaN values in these columns as NULL.
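
The case-sensitivity parameters mentioned above are session-level settings that you can adjust when querying the Destination yourself. The following is a minimal sketch that reuses the conn object from the earlier connectivity sketch; the orders table and its properties SUPER column are hypothetical.

    # Minimal sketch: session-level case-sensitivity settings for SUPER data.
    # Reuses the `conn` object from the earlier sketch; the "orders" table and
    # its "properties" SUPER column are hypothetical examples.
    with conn.cursor() as cur:
        # Differentiate JSON field names that vary only in case, such as
        # Product vs. product, within the same SUPER object.
        cur.execute("SET enable_case_sensitive_super_attribute TO TRUE;")

        # Keep identifier matching case-insensitive (the default), since Hevo
        # lowercases table and column names during loading.
        cur.execute("SET enable_case_sensitive_identifier TO FALSE;")

        # Fields whose names differ only in case can now be queried separately.
        cur.execute("SELECT o.properties.Product, o.properties.product FROM orders o LIMIT 5;")
        print(cur.fetchall())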


Limitations

  • Hevo replicates a maximum of 1600 columns to each Amazon Redshift table, of which five are Hevo-reserved metadata columns used during data replication. Therefore, your Pipeline can replicate up to 1595 (1600-5) columns for each table.

  • Hevo does not support writing to tables that have IDENTITY columns.

    For example, suppose you create a table with a default IDENTITY column and manually map a Source table to it. When the Pipeline runs, Hevo issues insert queries to write all the values from the Source table to this table. However, the writes fail, as Amazon Redshift does not permit writing values to an IDENTITY column. This scenario is illustrated in the sketch after this list.

  • Hevo maps only JSON fields to the SUPER data type, which Amazon Redshift uses to support JSON columns.

    Read SUPER type.
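
To illustrate the IDENTITY limitation, the sketch below creates a hypothetical table with an IDENTITY column and attempts the kind of explicit insert that Amazon Redshift rejects. It reuses the conn object from the earlier sketches; the table name and values are examples only.

    # Minimal sketch: why writes to IDENTITY columns fail.
    # Reuses the `conn` object from the earlier sketches; the table is hypothetical.
    import psycopg2

    with conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE demo_identity (
                id   BIGINT IDENTITY(1, 1),  -- value is generated by Amazon Redshift
                name VARCHAR(256)
            );
        """)

        try:
            # A replicated Source table includes a column mapped to "id", so the
            # write specifies a value for it, which Amazon Redshift rejects.
            cur.execute("INSERT INTO demo_identity (id, name) VALUES (42, 'example');")
        except psycopg2.Error as err:
            conn.rollback()
            print(f"Insert rejected: {err}")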
