Amazon Redshift is a fully managed, reliable data warehouse service in the cloud that offers large-scale storage and analysis of data sets and performs large-scale database migrations. It is part of the larger cloud-computing platform, Amazon Web Services (AWS).
Hevo can load data from any of your Pipelines into an Amazon Redshift data warehouse. You can set up the Redshift Destination on the fly, as part of the Pipeline creation process, or independently. The ingested data is first staged in Hevo’s S3 bucket before it is batched and loaded to the Amazon Redshift Destination.
If you are new to AWS and Redshift, you can follow the steps listed below to create an AWS account and then create an Amazon Redshift database to which the Hevo Pipeline loads the data. Alternatively, you can create users and assign them the required permissions to set up and manage databases within Amazon Redshift. Read AWS Identity and Access Management for more details.
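As an illustration, the following is a minimal sketch of how such a database user might be created and granted privileges. The user name, password, database, and schema names are placeholders, and the exact privileges you grant should follow the steps in this document:

```sql
-- Create a database user for Hevo (name and password are placeholders)
CREATE USER hevo_user PASSWORD 'StrongPassw0rd!';

-- Allow the user to create and use objects in the target database and schema
GRANT CREATE ON DATABASE hevo_db TO hevo_user;
GRANT USAGE, CREATE ON SCHEMA hevo_schema TO hevo_user;

-- Allow the user to read and write data in existing tables of the schema
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA hevo_schema TO hevo_user;
```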
The following image illustrates the key steps that you need to complete to configure Amazon Redshift as a Destination in Hevo:
Handling Source Data with Different Data Types
For teams created in or after Hevo Release 1.60, Hevo automatically modifies the data type of an Amazon Redshift Destination table column to accommodate Source data with a different data type. Read Handling Different Data Types in Source Data.
Note: Your Hevo release version is mentioned at the bottom of the Navigation Bar.
Handling Source Data with JSON Fields
For Pipelines created in or after Hevo Release 1.74, Hevo uses Replicate JSON fields to JSON columns as the default parsing strategy to load the Source data to the Amazon Redshift Destination.
With the changed strategy, you can query your JSON data directly, eliminating the need to parse it as JSON strings. This change does not affect the functionality of your existing Pipelines; if you want to apply the changed parsing strategy to them, you need to recreate them.
In addition, the replication strategies, Flatten structs and split arrays to new Events and Replicate JSON fields as JSON strings and array fields as strings, have been deprecated for newer Pipelines and are no longer visible in the UI.
Amazon Redshift is case insensitive to names of database objects, including tables and columns. For example, if your JSON field names are in either mixed or uppercase, such as Product or ITEMS, Amazon Redshift does not recognize these field names and is, hence, unable to fetch data from them. Therefore, to enable Amazon Redshift to identify such JSON field names, you must set the session parameter `enable_case_sensitive_identifier` to TRUE. Read SUPER configurations.
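For example, assuming a hypothetical table `orders` with a `SUPER` column `data` containing a mixed-case key `Product`, the session parameter can be set and the field queried as follows:

```sql
-- Enable case-sensitive identifiers for the current session
SET enable_case_sensitive_identifier TO true;

-- Mixed-case JSON keys must now be double-quoted to preserve their case
SELECT data."Product"
FROM orders;
```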
The `SUPER` data type supports only up to 1 MB of data for an individual `SUPER` field or object. Read SUPER type - Limitations.
Hevo stages the ingested data in an Amazon S3 bucket, from where it is loaded to the Destination tables using the COPY command. Hence, if you have enabled enhanced VPC routing, ensure that your VPC is configured correctly. Enhanced VPC routing affects the way your Amazon Redshift cluster accesses other resources in your AWS network, such as the S3 bucket, specifically for the COPY and UNLOAD commands. Read Enhanced VPC Routing in Amazon Redshift.
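For reference, a COPY command of the following form is what loads staged files from an S3 bucket into a Redshift table, and it is this traffic that enhanced VPC routing redirects through your VPC. The table name, bucket path, and IAM role below are hypothetical placeholders, not the exact command Hevo runs:

```sql
-- Load staged files from S3 into a Destination table (placeholder names)
COPY hevo_schema.orders
FROM 's3://example-staging-bucket/orders/'
IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-role'
FORMAT AS CSV;
```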
Hevo replicates a maximum of 4096 columns to each Amazon Redshift table. Read Limits on the Number of Destination Columns.
Hevo does not support writing to tables that have IDENTITY columns.
Suppose you create a table with the default IDENTITY column and manually map a Source table to it. When the Pipeline runs, Hevo issues `insert` queries to write all the values from the Source table to this table. However, the writes fail, as Amazon Redshift does not permit writing values to an IDENTITY column.
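The following sketch reproduces this failure with a hypothetical table:

```sql
-- A table whose first column is an IDENTITY column
CREATE TABLE orders (
    order_id BIGINT IDENTITY(1,1),
    product  VARCHAR(100)
);

-- Fails: Redshift does not accept explicit values for an IDENTITY column
INSERT INTO orders (order_id, product) VALUES (1, 'book');
```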
Hevo supports mapping only JSON fields to the `SUPER` data type that Amazon Redshift uses to support JSON columns. Read SUPER type.
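As an illustration, a JSON document can be stored in and queried from a `SUPER` column as shown below; the table and field names are hypothetical:

```sql
-- A table with a SUPER column to hold JSON documents
CREATE TABLE events (payload SUPER);

-- JSON_PARSE converts a JSON string into a SUPER value
INSERT INTO events VALUES (JSON_PARSE('{"product": "book", "items": [1, 2, 3]}'));

-- Nested fields and array elements can be queried directly
SELECT payload.product, payload.items[0] FROM events;
```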
See Also
- Destination FAQs
- Enhanced VPC Routing in Amazon Redshift
- Loading Data to an Amazon Redshift Data Warehouse
- SUPER type
- SUPER configurations
- SUPER type - Limitations
Revision History
Refer to the following table for the list of key updates made to this page:
| Date | Release | Description of Change |
| --- | --- | --- |
| Sep-04-2023 | NA | Updated the page contents to reflect the latest Amazon Redshift user interface (UI). |
| Aug-11-2023 | NA | Fixed broken links. |
| Apr-25-2023 | 2.12 | Updated section, Configure Amazon Redshift as a Destination to add information that you must specify all fields to create a Pipeline. |
| Feb-20-2023 | 2.08 | Updated section, Configure Amazon Redshift as a Destination to add steps for using the connection string to automatically fetch the database credentials. Added the following sections:<br>- Set up an Amazon Redshift Instance<br>- Create a Database |
| Sep-21-2022 | NA | Added a note in section, Configure Amazon Redshift as a Destination. |
| Mar-07-2022 | NA | Updated the section, Destination Considerations for actions to be taken when Enhanced VPC Routing is enabled. |
| Feb-07-2022 | 1.81 | Updated section, Whitelist Hevo’s IP Address to remove details about Outbound rules as they are not required. |
| Nov-09-2021 | NA | Updated section, Step 2. Create a Database User and Grant Privileges, with the list of commands to be run for granting privileges to the user. Updated the following sections:<br>- Handling Source Data with JSON Fields<br>- Destination Considerations<br>- Limitations, to add the limitation about Hevo mapping only JSON fields<br>- See Also |
| Apr-06-2021 | 1.60 | Added section, Handling Source Data with Different Data Types. |
| Feb-22-2021 | NA | - Added the limitation that Hevo does not support writing to tables that have IDENTITY columns.<br>- Updated the page overview to state that the Pipeline stages the ingested data in Hevo’s S3 bucket, from where it is finally loaded to the Destination.<br>- Revised the procedural sections to include detailed steps for configuring the Amazon Redshift Destination. |