Snowflake
ON THIS PAGE
Snowflake offers a cloud-based data storage and analytics service, generally termed as data warehouse-as-a-service. Companies can use it to store and analyze data using cloud-based hardware and software.
In Snowflake, you can create both data warehouses and databases to store your data. Each data warehouse can further have one or more databases, although this is not mandatory. Snowflake provides you one data warehouse automatically when you create an account.
For Hevo to access your data, you must assign the required permissions. Snowflake uses Roles to assign permissions to users. You need ACCOUNTADMIN
, SECURITYADMIN
or SYSADMIN
privileges to create the required roles for Hevo. Read more about Roles in Snowflake.
The data from your Pipeline is staged in Hevo’s S3 bucket before being finally loaded to your Snowflake warehouse.
Prerequisites
-
An active Snowflake account.
-
The user has
ACCOUNTADMIN
orSECURITYADMIN
privileges in Snowflake to create a role for Hevo. -
The user must have
ACCOUNTADMIN
orSYSADMIN
privileges in Snowflake, if a warehouse is to be created. -
Hevo is assigned
USAGE
permissions on data warehouses. -
Hevo is assigned
USAGE
andCREATE SCHEMA
permissions on databases. -
Hevo is assigned
ALL
permissions on the current and future schemas.
Perform the following steps to configure your Snowflake Destination:
Create a Snowflake Warehouse (Optional)
When you create an account, Snowflake automatically creates a warehouse for you. However, you can choose to create a different warehouse for your Pipeline.
To do this:
-
Do one of the following to connect to your Snowflake account:
-
In your SQL client, connect to your Snowflake account.
-
Log into your Snowflake instance. Click Worksheets at the top of the page.
-
-
Execute the following commands:
USE ROLE <ACCOUNTADMIN or SYSADMIN>; CREATE WAREHOUSE <warehouse_name> WITH AUTO_RESUME = TRUE WAREHOUSE_SIZE = <size> //size = {XSMALL | SMALL | MEDIUM | LARGE | XLARGE | XXLARGE | XXXLARGE | X4LARGE} AUTO_SUSPEND = <time_in_seconds>; //default suspending time is 600 seconds
In the above commands:
-
Auto-resume specifies whether to automatically resume a warehouse when a SQL query is submitted to it.
-
Warehouse size is the size of the warehouse depends on your cost & usability - by default it is set to XSmall.
-
Auto-suspend specifies the number of seconds of inactivity after which the warehouse is automatically suspended. When data is received, the warehouse is automatically resumed if Auto-resume is set to True.
-
Create a Snowflake Database (Optional)
You can use the default database provided with your Snowflake data warehouse for loading the data or create another database as per your requirements.
To create a database:
-
Do one of the following to connect to your Snowflake account:
-
In your SQL client, connect to your Snowflake account.
-
Log into your Snowflake instance. Click Worksheets at the top of the page.
-
-
Execute the following commands:
USE ROLE <ACCOUNTADMIN or SYSADMIN>;// use one of the roles CREATE DATABASE <database_name>;# replace <database_name> with a database name
Create a Role and a User (Optional)
Snowflake uses the concept of Roles to assign privileges to a user. Perform the following steps to create a Hevo user and assign it the required role:
-
Do one of the following to connect to your Snowflake account:
-
In your SQL client, connect to your Snowflake account.
-
Log into your Snowflake instance. Click Worksheets at the top of the page.
-
-
Create a role for the Hevo user:
USE ROLE <ACCOUNTADMIN or SECURITYADMIN>;// use one of the roles CREATE ROLE <role_name> COMMENT = 'Role for Hevo access';# replace <role_name> with a name for the role
-
Optional: If you have created a hierarchy that assigns all custom roles to the SYSADMIN role, grant this role to the SYSADMIN:
GRANT ROLE <role_name> to role SYSADMIN;
-
Create the Hevo user and grant the role created above to it:
CREATE USER IF NOT EXISTS <user_name># replace <user_name> with a name for the user PASSWORD='<password>' COMMENT='User for Hevo database user' DEFAULT_ROLE='<role_name>' DEFAULT_WAREHOUSE='<warehouse_name>';
Grant Permissions to the Hevo User
You must grant the ALL
permission to the role you created for the Hevo user above to allow access to your data. This allows Hevo to connect , create, modify, and monitor usage at the object level for the warehouse and the database.
Do the following:
-
Grant warehouse-level privileges to the role you created:
GRANT USAGE ON WAREHOUSE <warehouse_name> TO ROLE <role_name>;
-
Grant database-level privileges to the role you created:
GRANT USAGE,CREATE SCHEMA ON DATABASE <database_name> TO ROLE <role_name>;
-
Grant
ALL
permission on the schema to the role you created:GRANT ALL ON SCHEMA <schema_name> TO ROLE <role_name>;
-
Grant
ALL
permission on the future schema to the role you created:GRANT ALL ON FUTURE SCHEMAS IN DATABASE <database_name> TO ROLE <role_name>;
Obtain your Snowflake Account and Region Names
The Snowflake data warehouse may be hosted on any of the following Cloud providers:
-
Amazon Web Services (AWS)
-
Google Cloud Platform (GCP)
-
Microsoft Azure (Azure)
You can derive the account name and region from your Snowflake web interface URL.
For most accounts, the URL looks like https://account_name.region.snowflakecomputing.com
.
For example, https://westeros.us-east-2.aws.snowflakecomputing.com
. Here, westeros
is your account name, us-east-2
is the region, and aws
is the service provider.
However, if your Snowflake instance is hosted on AWS, US West region, the URL looks like https://account_name.snowflakecomputing.com
.
Select the account name and region as per the applicable URL.
Configure Snowflake as a Destination
-
Click DESTINATIONS in the Asset Palette.
-
Click + CREATE in the Destinations List View.
-
In Add Destination page select Snowflake as the Destination type.
-
In the Configure your Snowflake Warehouse page, specify the following:
-
Destination Name: A unique name for your Destination.
-
Account Name: The name provided by Snowflake to your account.
-
Account Region: The region in which Snowflake account is located. If no account-region is present in URL then your account is located in US West (Oregon).
-
Database User: The Hevo user that you created. This user has a non-administrative role in the Snowflake database.
-
Database Password: Password of the database user.
-
Database Name: Name of the Destination database where the data is to be loaded.
-
Database Schema: Name of the schema in the Destination database where the data is to be loaded.
Note: Schema name is case-sensitive.
-
Warehouse: The Snowflake warehouse associated with your database, where the SQL queries and DML operations are performed.
-
Advanced Settings:
-
Populate Loaded Timestamp: Enable this option to append the
__hevo_loaded_at
column to the Destination table to indicate the time when the Event was loaded to the Destination. See Loading Data to a Data Warehouse for more information. -
Create Transient Tables: Enable this option to create transient tables. Transient tables have the same features as permanent tables minus the Fail-safe period. The fail-safe feature allows Snowflake to recover the table if you were to lose it, for up to seven days. Transient tables allow you to avoid the additional storage costs for the backup, and are suitable if your data does not need the same level of data protection and recovery provided by permanent tables, or if it can be reconstructed outside of Snowflake. See Transient Tables.
-
-
-
Click Test Connection to test connectivity with the Snowflake warehouse.
-
Once the test is successful, click SAVE DESTINATION.
Handling Source Data with Different Data Types
For teams created in or after Hevo Release 1.58, Hevo automatically modifies the data type of a Snowflake table column to accommodate Source data with a different data type. Datatype promotion is performed on tables that are less than 50GB in size. Read Handling Different Data Types in Source Data.
Note: Your Hevo release version is mentioned at the bottom of the Asset Palette.
Limitations
- Hevo replicates a maximum of 4096 columns to each Snowflake table, of which six are Hevo-reserved metadata columns used during data replication. Therefore, your Pipeline can replicate up to 4090 (4096-6) columns for each table. Read Limits on the Number of Columns.
See Also
Revision History
Refer to the following table for the list of key updates made to the page:
Date | Release | Description of Change |
---|---|---|
09-Mar-2021 | 1.58 | Added section Handling Source Data with Different Data Types. |
22-Feb-2021 | NA | - Updated the page overview to state that the Pipeline stages the ingested data in Hevo’s S3 bucket, from where it is finally loaded to the Destination. - Formatting-related edits. |