Snowflake

Snowflake offers a cloud-based data storage and analytics service, generally termed data warehouse-as-a-service. Companies use it to store and analyze data using cloud-based hardware and software.

In Snowflake, warehouses provide the compute resources for loading and querying data, while databases store the data itself. A warehouse can serve one or more databases, although this is not mandatory. Snowflake automatically provides one warehouse when you create an account.

For Hevo to access your data, you must assign the required permissions. Snowflake uses Roles to assign permissions to users. You need the ACCOUNTADMIN, SECURITYADMIN, or SYSADMIN role to create the required roles for Hevo. Read more about Roles in Snowflake.

The data from your Pipeline is staged in Hevo’s S3 bucket before being loaded to your Snowflake warehouse.


Prerequisites


Perform the following steps to configure your Snowflake Destination:

Create a Snowflake Warehouse (Optional)

When you create an account, Snowflake automatically creates a warehouse for you. However, you can choose to create a different warehouse for your Pipeline.
To do this:

  1. Do one of the following to connect to your Snowflake account:

    • In your SQL client, connect to your Snowflake account.

    • Log into your Snowflake instance. Click Worksheets at the top of the page.

      Connect to Snowflake account

  2. Execute the following commands:

    USE ROLE <ACCOUNTADMIN or SYSADMIN>;

    CREATE WAREHOUSE <warehouse_name> WITH
      WAREHOUSE_SIZE = <size>           -- size = {XSMALL | SMALL | MEDIUM | LARGE | XLARGE | XXLARGE | XXXLARGE | X4LARGE}
      AUTO_SUSPEND = <time_in_seconds>  -- default suspend time is 600 seconds
      AUTO_RESUME = TRUE;
    

    In the above commands:

    • Auto-resume specifies whether to automatically resume a warehouse when a SQL query is submitted to it.

    • Warehouse size is the size of the warehouse, which determines its compute capacity and cost. By default, it is set to XSMALL.

    • Auto-suspend specifies the number of seconds of inactivity after which the warehouse is automatically suspended. When data is received, the warehouse is automatically resumed if Auto-resume is set to True.
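As a concrete illustration, the commands below create a warehouse using hypothetical values (the name HEVO_WH, a SMALL size, and a 300-second suspend interval); substitute values appropriate for your account and workload:

```sql
USE ROLE SYSADMIN;

-- HEVO_WH is a hypothetical name; choose any name you prefer.
CREATE WAREHOUSE HEVO_WH WITH
  WAREHOUSE_SIZE = SMALL  -- scale up or down based on workload and cost
  AUTO_SUSPEND = 300      -- suspend after 5 minutes of inactivity
  AUTO_RESUME = TRUE;     -- resume automatically when a query is submitted
```

A shorter AUTO_SUSPEND interval reduces idle compute costs, at the price of brief resume delays when loads arrive infrequently.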


Create a Snowflake Database (Optional)

You can use the default database provided with your Snowflake data warehouse for loading the data or create another database as per your requirements.

To create a database:

  1. Do one of the following to connect to your Snowflake account:

    • In your SQL client, connect to your Snowflake account.

    • Log into your Snowflake instance. Click Worksheets at the top of the page.

      Connect to Snowflake account

  2. Execute the following commands:

    USE ROLE <ACCOUNTADMIN or SYSADMIN>;  -- use one of the roles
    CREATE DATABASE <database_name>;      -- replace <database_name> with a name for the database
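For example, using a hypothetical database name HEVO_DB:

```sql
USE ROLE SYSADMIN;
CREATE DATABASE HEVO_DB;
SHOW DATABASES LIKE 'HEVO_DB';  -- verify the database was created
```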
    

Create a Role and a User (Optional)

Snowflake uses the concept of Roles to assign privileges to a user. Perform the following steps to create a Hevo user and assign it the required role:

  1. Do one of the following to connect to your Snowflake account:

    • In your SQL client, connect to your Snowflake account.

    • Log into your Snowflake instance. Click Worksheets at the top of the page.

      Connect to Snowflake account

  2. Create a role for the Hevo user:

    USE ROLE <ACCOUNTADMIN or SECURITYADMIN>;                  -- use one of the roles
    CREATE ROLE <role_name> COMMENT = 'Role for Hevo access';  -- replace <role_name> with a name for the role
    
  3. Optional: If you have created a hierarchy that assigns all custom roles to the SYSADMIN role, grant this role to the SYSADMIN:

    GRANT ROLE <role_name> to role SYSADMIN;
    
  4. Create the Hevo user and grant the role created above to it:

    CREATE USER IF NOT EXISTS <user_name>  -- replace <user_name> with a name for the user
      PASSWORD = '<password>'
      COMMENT = 'Hevo database user'
      DEFAULT_ROLE = '<role_name>'
      DEFAULT_WAREHOUSE = '<warehouse_name>';
    GRANT ROLE <role_name> TO USER <user_name>;  -- grant the role created above to the user
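Putting these steps together with hypothetical names (HEVO_ROLE, HEVO_USER, and the HEVO_WH warehouse), the sequence might look like the following; replace the names and supply your own strong password:

```sql
USE ROLE SECURITYADMIN;

CREATE ROLE HEVO_ROLE COMMENT = 'Role for Hevo access';
GRANT ROLE HEVO_ROLE TO ROLE SYSADMIN;  -- optional: only if custom roles roll up to SYSADMIN

CREATE USER IF NOT EXISTS HEVO_USER
  PASSWORD = '<password>'               -- supply a strong password
  COMMENT = 'Hevo database user'
  DEFAULT_ROLE = 'HEVO_ROLE'
  DEFAULT_WAREHOUSE = 'HEVO_WH';

GRANT ROLE HEVO_ROLE TO USER HEVO_USER;  -- assign the role to the user
```

Note that setting DEFAULT_ROLE alone does not grant the role; the final GRANT ROLE ... TO USER statement is what actually assigns it.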
    

Grant Permissions to the Hevo User

You must grant the ALL permission to the role you created for the Hevo user above to allow access to your data. This allows Hevo to connect to, create, modify, and monitor usage of objects in the warehouse and the database.

Do the following:

  1. Grant warehouse-level privileges to the role you created:

    GRANT USAGE ON WAREHOUSE <warehouse_name> TO ROLE <role_name>;
    
  2. Grant database-level privileges to the role you created:

    GRANT USAGE, CREATE SCHEMA ON DATABASE <database_name> TO ROLE <role_name>;
    
  3. Grant ALL permission on the schema to the role you created:

    GRANT ALL ON SCHEMA <schema_name> TO ROLE <role_name>;
    
  4. Grant ALL permission on the future schema to the role you created:

    GRANT ALL ON FUTURE SCHEMAS IN DATABASE <database_name> TO ROLE <role_name>;
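With the hypothetical names used in the earlier examples (HEVO_WH, HEVO_DB, HEVO_ROLE, and the default PUBLIC schema), the full set of grants might look like this:

```sql
USE ROLE ACCOUNTADMIN;

GRANT USAGE ON WAREHOUSE HEVO_WH TO ROLE HEVO_ROLE;
GRANT USAGE, CREATE SCHEMA ON DATABASE HEVO_DB TO ROLE HEVO_ROLE;
GRANT ALL ON SCHEMA HEVO_DB.PUBLIC TO ROLE HEVO_ROLE;
GRANT ALL ON FUTURE SCHEMAS IN DATABASE HEVO_DB TO ROLE HEVO_ROLE;
```

The FUTURE SCHEMAS grant ensures that schemas created later in the database are automatically accessible to the role without repeating the grant.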
    

Obtain your Snowflake Account and Region Names

The Snowflake data warehouse may be hosted on any of the following Cloud providers:

  • Amazon Web Services (AWS)

  • Google Cloud Platform (GCP)

  • Microsoft Azure (Azure)

You can derive the account name and region from your Snowflake web interface URL.

For most accounts, the URL looks like https://account_name.region.snowflakecomputing.com.

For example, https://westeros.us-east-2.aws.snowflakecomputing.com. Here, westeros is your account name, us-east-2 is the region, and aws is the service provider.

However, if your Snowflake instance is hosted in the AWS US West (Oregon) region, the URL looks like https://account_name.snowflakecomputing.com.

Select the account name and region as per the applicable URL.
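If you prefer not to parse the URL manually, you can also query these values from a Snowflake worksheet; CURRENT_ACCOUNT and CURRENT_REGION are built-in Snowflake context functions:

```sql
SELECT CURRENT_ACCOUNT();  -- account locator, e.g. WESTEROS
SELECT CURRENT_REGION();   -- region, e.g. AWS_US_EAST_2
```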


Configure Snowflake as a Destination

  1. Click DESTINATIONS in the Asset Palette.

  2. Click + CREATE in the Destinations List View.

  3. In the Add Destination page, select Snowflake as the Destination type.

  4. In the Configure your Snowflake Warehouse page, specify the following:

    Snowflake Settings

    • Destination Name: A unique name for your Destination.

    • Account Name: The name provided by Snowflake to your account.

    • Account Region: The region in which your Snowflake account is located. If no region is present in the URL, your account is located in US West (Oregon).

    • Database User: The Hevo user that you created. This user has a non-administrative role in the Snowflake database.

    • Database Password: Password of the database user.

    • Database Name: Name of the Destination database where the data is to be loaded.

    • Database Schema: Name of the schema in the Destination database where the data is to be loaded.

      Note: Schema name is case-sensitive.

    • Warehouse: The Snowflake warehouse associated with your database, where the SQL queries and DML operations are performed.

    • Advanced Settings:

      • Populate Loaded Timestamp: Enable this option to append the __hevo_loaded_at column to the Destination table to indicate the time when the Event was loaded to the Destination. See Loading Data to a Data Warehouse for more information.

      • Create Transient Tables: Enable this option to create transient tables. Transient tables have the same features as permanent tables minus the Fail-safe period. The Fail-safe feature allows Snowflake to recover a lost table for up to seven days. Transient tables avoid the additional storage costs of this backup, and are suitable if your data does not need the level of protection and recovery provided by permanent tables, or if it can be reconstructed outside of Snowflake. See Transient Tables.

  5. Click Test Connection to test connectivity with the Snowflake warehouse.

  6. Once the test is successful, click SAVE DESTINATION.


Handling Source Data with Different Data Types

For teams created in or after Hevo Release 1.58, Hevo automatically modifies the data type of a Snowflake table column to accommodate Source data with a different data type. Data type promotion is performed on tables that are less than 50 GB in size. Read Handling Different Data Types in Source Data.

Note: Your Hevo release version is mentioned at the bottom of the Asset Palette.


Limitations

  • Hevo replicates a maximum of 4096 columns to each Snowflake table, of which six are Hevo-reserved metadata columns used during data replication. Therefore, your Pipeline can replicate up to 4090 (4096-6) columns for each table. Read Limits on the Number of Columns.



See Also


Revision History

Refer to the following table for the list of key updates made to this page:

Date         Release  Description of Change
Mar-09-2021  1.58     Added section Handling Source Data with Different Data Types.
Feb-22-2021  NA       - Updated the page overview to state that the Pipeline stages the ingested data in Hevo’s S3 bucket, from where it is finally loaded to the Destination.
                      - Formatting-related edits.
Last updated on 17 Sep 2021