Azure Synapse Analytics
On This Page
- Prerequisites
- Set up Azure Synapse Analytics Instance (Optional)
- Create a Dedicated SQL Pool (Optional)
- Whitelist Hevo’s IP addresses
- Obtain the Azure Synapse Analytics Data Warehouse Connection Settings
- Create a Database User and Grant Privileges (Optional)
- Configure Azure Synapse Analytics as a Destination
- Handling Source Data with Different Data Types
- Handling JSON Fields
- Destination Considerations
- Limitations
- Revision History
Azure Synapse is a cloud-based analytics service that combines the technologies required for data integration, data warehousing, and big data analytics. You can query and perform analytics on data using resources organized in both serverless and dedicated SQL pools.
Hevo integrates with dedicated SQL pools, which are the SQL data warehouses in Azure Synapse Analytics. Your dedicated SQL pool must be provisioned from within a Synapse workspace. A Synapse workspace enables secure collaboration between your Azure resources, which are logically grouped together in a container called a resource group.
If you are new to Azure Synapse Analytics and need assistance setting up your instance, follow the steps listed below to create an Azure portal account and then create a Synapse workspace.
The following image illustrates the key steps that you need to complete to configure Azure Synapse Analytics as a Destination in Hevo:
Prerequisites
- An Azure account with an active subscription for which billing is enabled.
- A user with the Contributor role in Azure RBAC exists in your subscription, to create the required Synapse resources.
- The connection settings for the Azure Synapse Analytics data warehouse are available.
- You are assigned the Team Collaborator or any administrator role except the Billing Administrator role in Hevo, to create the Destination.
Set up Azure Synapse Analytics Instance (Optional)
You need an Azure portal account to set up the Azure Synapse Analytics instance. Once you have an account, you need to create a Synapse workspace.
1. Access the Azure portal in one of the following ways:
   - Sign in to an existing Azure account:
     - Go to azure.microsoft.com and click Sign in.
     - Specify the email address that has access to the Azure portal.
     - Provide your password and click Sign in.
   - Sign up for an Azure free trial, which credits $200 with 30 days of validity.
     Note: A free trial allows you to access Azure services, such as Azure Active Directory (Azure AD) and Azure Synapse Analytics. If you decide to continue using the Azure services, you can move to a pay-as-you-go model.
     - Go to azure.microsoft.com and click Start free.
     - In the Sign in dialog, do one of the following:
       - Log in using an existing Microsoft account, a phone number, or a Skype account.
         Note: You should not have signed up for a free trial with this account earlier.
       - Click Create one! to create an account by performing the following steps:
         - Specify an email address and click Next. You can also set up a new email address at this time.
         - Provide a password and click Next.
         - Complete the required steps to create your Azure account.
     - Log in to the Azure portal and add a subscription to activate your free trial. A subscription must exist to be able to create an Azure Synapse workspace.
2. Create a Synapse workspace
Note: You must have an active subscription with billing enabled and a user with the Azure RBAC Contributor role under that subscription to perform the steps listed in this section.
1. Log in to the Azure portal, and in the Search bar, type azure synapse.
2. From the search results, under Services, select Azure Synapse Analytics.
3. In the Azure Synapse Analytics page, click Create.
4. In the Create Synapse Workspace page, Basics tab, provide the essential information for creating your workspace:
   - Subscription: A container comprising related business and technical resources that are used and billed together. Select the Azure subscription in which you want to create your workspace from the drop-down.
   - Resource group: A collection of related resources that may be logically grouped based on factors such as geography or team functions. You can either select an existing resource group from the drop-down or create one now.
     Note: You must have the Azure RBAC Owner or Contributor role on the existing resource group to access it.
   - Managed resource group: A container that holds the supporting resources created by Azure Synapse Analytics for your workspace. A managed resource group is created for you and a default name is assigned to it when your workspace is created. You can also specify a name of your choice for it now.
   - Workspace name: A unique name for your workspace.
   - Region: The geographical location in which you want to create your workspace. Select an Azure region from the drop-down list. It is recommended to co-locate your workspace with your data and client applications such as Hevo.
   - Select Data Lake Storage Gen2: The storage account and container that Azure Synapse uses to store the workspace data. Only Azure Data Lake Storage (ADLS) Gen2 accounts for which hierarchical namespace (HNS) is enabled are displayed. In these types of accounts, data objects and files are organized in a hierarchy of directories for efficient access.
     Select your account in one of the following ways:
     - From subscription:
       - Account name: Specify a unique name for your storage account if you are creating one now. Alternatively, you can select an existing storage account available in your subscription from the drop-down list. Read Prepare the existing storage account for information on how to use it with Azure Synapse Analytics.
       - File system name: Specify a unique name for your data storage container if you want to create one. This field is populated with the name of the storage container associated with it if you selected an existing storage account.
       - Assign myself the Storage Blob Data Contributor role…: If selected, allows the workspace to access the data in your ADLS Gen2 account. The check box is not displayed if you selected an existing storage account.
     - Manually via URL:
       - Account name: Specify the name of an existing HNS-enabled ADLS Gen2 account, which is prefixed to dfs.core.windows.net to create its URL. For example, if the URL of your ADLS Gen2 account is hevodocstorage1.dfs.core.windows.net, the account name is hevodocstorage1. You can obtain the account name from your Azure account administrator.
       - File system name: Specify a unique name for your data storage container.
   - Click Next: Security >.
5. (Required) In the Security tab, configure the options to secure your workspace:
   - Authentication method: Select Use both local and Azure Active Directory (Azure AD) authentication.
   - SQL Server admin login: Specify a unique name for the user with admin access to the workspace’s SQL pools. Default value: sqladminuser.
   - SQL Password and Confirm Password: Specify a strong password for your admin user.
   - Allow network access to Data Lake Storage Gen2 account: If selected, gives your workspace network access to the ADLS Gen2 account selected in the Basics tab. The check box is disabled for storage accounts that do not restrict network access.
   - Workspace encryption: An option to double-encrypt your data at rest in the workspace. Default value: Disable.
   - Click Next: Networking >.
6. (Optional) In the Networking tab, configure the network connectivity settings for your workspace:
   - Managed virtual network: An option to allow public network access to the workspace through a managed virtual network. Default value: Disable. Perform the additional steps displayed on the screen if you enable this option. Read Create an Azure Synapse workspace with a managed virtual network for more information.
   - Allow connections from all IP addresses: If enabled, adds a default firewall rule, allowing connections from all IP addresses to your workspace. Default value: Enabled.
     Note: This option is only available for Synapse workspaces not managed through a virtual network.
   - Minimum TLS version: The TLS version needed by the workspace to access your storage account’s data. Default selection: TLS 1.2. This field is non-editable.
   - Click Next: Tags > if you want to categorize your resources, or click Review + create and skip to Step 8.
7. (Optional) In the Tags tab: Specify name-value pairs if you want to categorize your resources. Tagging enables you to consolidate resources for purposes such as billing.
8. In the Review + create tab: Review the specified configuration, and click Create to provision your workspace resources. The provisioning may take around 10-15 minutes.
You can use the dedicated SQL endpoint of this workspace as the server name, the SQL Server admin login as the username, and the SQL password as the password, while configuring your Azure Synapse Analytics Destination with the Enter Connection Settings Manually option.
Create a Dedicated SQL Pool (Optional)
Use the steps in this section to create a dedicated SQL pool from within an existing Synapse workspace. A dedicated SQL pool is the data warehouse in Azure Synapse Analytics into which your data is replicated.
- Log in to the Azure portal, and in the Search bar, type azure synapse.
- Select Azure Synapse Analytics under Services.
- In the Azure Synapse Analytics page, click on your Synapse workspace name. For example, asea-workspace-doc.
- In the Overview page, click + New dedicated SQL pool.
- In the New dedicated SQL pool page, Basics tab, provide the essential information for your dedicated SQL pool:
  - Dedicated SQL pool name: A unique name for your data warehouse.
  - Geo-redundant: An option to replicate your data to another data center in the Azure region of your workspace. Default value: No selection.
    Note: This option may not be displayed for certain Azure regions, such as South India. Read Regions with availability zones and no region pair.
  - Performance level: The size of your data warehouse. Use the slider to change the value. Default value: DW1000c. The Estimated price displays the cost based on this size.
  - Click Next: Additional settings >.
- (Required) In the Additional settings tab, customize the settings for your dedicated SQL pool:
  - Use existing data: The data source from which your dedicated SQL pool must be populated. Select Backup or Restore point if you want to restore data to your dedicated SQL pool. Default value: None.
  - Collation: The rules to sort and compare data in your dedicated SQL pool. Default value: SQL_Latin1_General_CP1_CI_AS. You cannot change this value after the dedicated SQL pool is created. Read Help collations to obtain a list of all supported collations.
- Click Review + create to review your configuration, and then Create to provision the resources for your dedicated SQL pool.
You can provide this dedicated SQL pool as your database name while configuring your Azure Synapse Analytics Destination with the Enter Connection Settings Manually option.
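Optionally, after the dedicated SQL pool is provisioned and your client IP address has been allowed through the workspace firewall (see the next section), you can verify that the pool is visible by connecting to the workspace’s dedicated SQL endpoint with any SQL client and listing the databases. A minimal sketch; docsqlpool is only an example name:
-- The dedicated SQL pool (for example, docsqlpool) should be listed alongside master
SELECT name, state_desc
FROM sys.databases;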
Whitelist Hevo’s IP addresses
You must whitelist the Hevo IP address for your region to enable Hevo to connect to your Azure Synapse Analytics instance.
To do this:
1. Log in to your Azure portal dashboard.
2. In the All resources pane, click on your Synapse workspace name. For example, asea-workspace-doc.
3. In the right pane of your <Workspace name> page, Essentials section, click the Show firewall settings link under Networking.
4. Perform this step if your workspace is created in a managed virtual network, else skip to Step 5. In the Public network access section, ensure that public access to your workspace is Enabled.
5. Scroll down to the Firewall rules section, and perform the following:
   - Select the Allow Azure services and resources to access this workspace check box if you want to allow connections from your Azure services and resources to your Synapse workspace.
     Note: This setting is internal to Azure Synapse Analytics and does not affect the data replication process in Hevo.
   - Click + Add client IP to add your machine’s IP address, to allow connections from your machine to the Azure Synapse Analytics data warehouse.
   - Specify the following to add your firewall rules:
     - Rule name: A name to identify the rule. For example, HevoIndia.
     - Start IP: The starting IP address. This should be Hevo’s IP address for your region. For example, 13.235.131.126 is Hevo’s IP address for the India region.
     - End IP: The ending IP address. This must be the same as your Start IP address. For example, 13.235.131.126.
   - Repeat the step above to add all the IP addresses for your Hevo region.
6. Click Save.
Obtain the Azure Synapse Analytics Data Warehouse Connection Settings
Hevo connects to your Azure Synapse Analytics data warehouse in one of the following ways:
Connection Method: Using connection string
- Log in to the Azure portal, and in the Search bar, type azure synapse.
- Select Azure Synapse Analytics under Services.
- In the Azure Synapse Analytics page, click on your Synapse workspace name. For example, asea-workspace-doc.
- In the left navigation pane of your <Workspace Name> page, scroll down to the Analytics pools section, and click SQL pools.
- Click the desired dedicated SQL pool name. For example, docsqlpool.
- In the page that is displayed, view the connection string in one of the following ways:
  - In the right pane of your <Workspace Name/Dedicated SQL Pool Name> page, Essentials section, under Connection strings, click Show database connection strings.
  - In the left navigation pane of your <Dedicated SQL Pool Name> page, scroll down to the Settings section, and click Connection strings.
- In the Connection strings page, select the JDBC tab, and in the JDBC (SQL authentication) box, click the Copy icon to copy the connection string, and save it.
  Paste it in the Connection String box while configuring your Azure Synapse Analytics Destination in Hevo.
Connection Method: Using individual connection fields
- Log in to the Azure portal, and in the Search bar, type azure synapse.
- Select Azure Synapse Analytics under Services.
- In the Azure Synapse Analytics page, select your Synapse workspace name. For example, asea-workspace-doc.
- In the page that is displayed, from the Essentials section on the right, make a note of the following, and specify them while configuring Azure Synapse Analytics as a Destination in Hevo:
  - SQL admin username: This is your username.
  - Dedicated SQL endpoint: This is your server name.
- In the left navigation pane, scroll down to the Analytics pools section, and click SQL pools.
- Make a note of the dedicated SQL pool name into which you want to load data, and specify it as your database name while configuring Azure Synapse Analytics as a Destination in Hevo.
Create a Database User and Grant Privileges (Optional)
You can use any SQL tool that allows connecting to your Azure Synapse dedicated SQL pool (data warehouse) to create the database user and grant privileges to it. For example, Azure Data Studio or mssql for Visual Studio Code. In this section, we have used Azure Data Studio.
Perform the following steps:
- Create a login user and map it to a database user, to connect to your SQL database.
1. Create a connection for your SQL database
- Obtain the connection settings of your SQL database.
- Access Azure Data Studio.
- In the left navigation pane, click Connections, and in the SERVERS sidebar, click the New Connection icon.
- In the Connection Details page, select the Parameters option, and specify the following:
  - Connection type: The type of database. Default value: Microsoft SQL Server.
  - Server: The fully qualified server name. For example, asea-doc-ws.sql.azuresynapse.net.
  - Authentication type: The method with which you want to connect. Default value: SQL Login. Here, we will connect using SQL authentication.
  - User name: The SQL admin user for your database. For example, sqladmin.
  - Password: The password for the specified user name.
  - Database: The database to which you want to connect. Select from the drop-down list. For example, master.
  - Name (optional): A name for your connection. For example, MasterDB.
- Click Connect.
2. Create a database user
- Open Azure Data Studio and in the SERVERS sidebar, select the connection name created in Step 1.
- Click the Expand chevron to open Databases, System Databases, and then master.
- Right-click on master, and select New Query.
- In the query editor, enter the following command to create a login user and click Run:
  CREATE LOGIN [<login_name>] WITH PASSWORD=N'<strong_password>'
  GO
- Click Change Connection at the top in the query editor, and select your SQL database name from the drop-down list. For example, docsqlpool.
- In the query editor, enter the following command to create a database user for the login user in the selected database:
  CREATE USER [<username>] FOR LOGIN [<login_name>]
  GO
Note: Replace the placeholder values in all the commands above with your own. For example, replace <login_name> with hevouser.
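As an illustration, using the example login name hevouser mentioned above, the commands might look as follows. The password is a placeholder that you must replace with your own strong value:
-- Run against the master database
CREATE LOGIN [hevouser] WITH PASSWORD = N'<strong_password>'
GO
-- Run against the dedicated SQL pool (for example, docsqlpool) after switching the connection
CREATE USER [hevouser] FOR LOGIN [hevouser]
GO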
3. Grant privileges to the database user
The following table lists the privileges that Hevo requires to connect to and load data into your Azure Synapse Analytics Destination:
Privilege Name | Privilege Description | Applies to |
---|---|---|
ALTER | Allows Hevo to create, alter, and drop objects from the schema. | SCHEMA |
INSERT | Allows Hevo to insert rows, synonyms, tables and columns, and views and columns into the schema or database (dedicated SQL pool), as applicable. | SCHEMA, DATABASE |
SELECT | Allows Hevo to select rows, synonyms, tables and columns, and views and columns from the schema or database, as applicable. | SCHEMA, DATABASE |
UPDATE | Allows Hevo to update rows, synonyms, tables and columns, and views and columns in the schema or database, as applicable. | SCHEMA, DATABASE |
CREATE TABLE | Allows Hevo to create tables in the database. | DATABASE |
ADMINISTER DATABASE BULK OPERATIONS | Allows Hevo to bulk load data into the database using the COPY INTO command. | DATABASE |
Perform the following steps to grant the privileges:
- Open Azure Data Studio and in the SERVERS sidebar, select the connection name created in Step 1.
- Click the Expand chevron to open Databases, <your SQL database name>. For example, docsqlpool.
- Right-click on your SQL database name, and select New Query.
- In the query editor, enter the following commands to grant privileges to the database user created in Step 2 and click Run:
  GRANT ALTER, SELECT, INSERT, UPDATE ON SCHEMA::<schema_name> TO <username>
  GO
  GRANT CREATE TABLE, INSERT, SELECT, UPDATE ON DATABASE::<database_name> TO <username>
  GO
  GRANT ADMINISTER DATABASE BULK OPERATIONS ON DATABASE::<database_name> TO <username>
  GO
Note: Replace the placeholder values in all the commands above with your own. For example, replace <schema_name> with dbo.
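Optionally, you can confirm that the grants took effect by querying the catalog views on the dedicated SQL pool from the same connection. A minimal sketch, assuming the database user is named hevouser as in the example above:
-- List the permissions currently granted to the hevouser database user
SELECT pr.name AS user_name,
       pe.class_desc,
       pe.permission_name,
       pe.state_desc
FROM sys.database_principals AS pr
JOIN sys.database_permissions AS pe
    ON pe.grantee_principal_id = pr.principal_id
WHERE pr.name = 'hevouser';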
Configure Azure Synapse Analytics as a Destination
Perform the following steps to configure Azure Synapse Analytics as a Destination in Hevo:
- Click DESTINATIONS in the Navigation Bar.
- Click + CREATE in the Destinations List View.
- In the Add Destination page, select Azure Synapse Analytics.
- In the Configure your Azure Synapse Analytics Destination page, specify the following:
  - Destination Name: A unique name for your Destination, not exceeding 255 characters.
  - Connection Settings: Select one of the following methods to connect to your Azure Synapse Analytics data warehouse:
    - Paste Connection String: Enter the connection string obtained in the using a connection string method section above.
      If you added a login user in Step 5 above, you must edit the obtained connection string, and replace the value of user in it with your login name. For example, in the following connection string, replace sqladminuser with hevouser:
      jdbc:sqlserver://asea-workspace-doc….;user=sqladminuser@…;password={your_password_here};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.sql.azuresynapse.net;loginTimeout=30;
      - Connection String: The unique identifier for connecting to an Azure Synapse Analytics instance. The connection string automatically fetches details such as the server name, database port, database username, and database name.
      - Password: The password of the database username.
      - Schema Name: The name of the Destination database schema. Default value: dbo.
    - Enter Connection Settings Manually: Provide the connection settings obtained in the manual connection method section above.
      - Server Name: The dedicated SQL endpoint for connecting to your Azure Synapse Analytics data warehouse. For example, asea-test-ws.sql.azuresynapse.net.
      - Database Port: The port number over which the Azure Synapse server listens for connections. Default value: 1433.
      - Username: The name of the user with administrator access to your Synapse workspace. For example, sqladminuser. This can be the SQL Server admin login that you specified while creating the Synapse workspace, or the login user created in Step 5.
      - Password: The password of the specified username.
      - Database Name: The name of the dedicated SQL pool where the data is to be loaded.
      - Schema Name (Optional): The name of the Destination database schema. Default value: dbo.
  - Advanced Settings:
    - Connect through SSH: Enable this option to connect to Hevo using an SSH tunnel, instead of directly connecting your Azure Synapse Analytics database host to Hevo. This provides an additional level of security to your database by not exposing your Synapse setup to the public. Read Connecting Through SSH.
      If this option is disabled, you must whitelist Hevo’s IP addresses for your region.
    - Populate Loaded Timestamp: Enable this option to append the __hevo_loaded_at_ column to the Destination table to indicate the time when the Event was loaded to the Destination. Read Loading Data to a Data Warehouse.
    - Sanitize Table/Column Names?: Enable this option to remove all non-alphanumeric characters and spaces in a table or column name, and replace them with an underscore (_). Read Name Sanitization.
- Click TEST CONNECTION. This button is enabled once all the mandatory fields are specified.
- Click SAVE & CONTINUE. This button is enabled once all the mandatory fields are specified.
Handling Source Data with Different Data Types
Hevo automatically modifies the data type of an Azure Synapse Analytics Destination table column to accommodate Source data with a different data type. Read Handling Different Data Types in Source Data.
Note: Your Hevo release version is mentioned at the bottom of the Navigation Bar.
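If you want to inspect the data types that the Destination table columns currently have, you can query the standard INFORMATION_SCHEMA views on your dedicated SQL pool. A minimal sketch; dbo is the default schema, and my_table is a hypothetical table name:
-- Inspect the data types of the columns in a Destination table
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'my_table'
ORDER BY ORDINAL_POSITION;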
Handling JSON Fields
Read Parsing Nested JSON Fields in Events to know how Hevo parses the JSON data and makes it available at the Destination. JSON data ingested from the Source is either collapsed into JSON strings or passed as-is to the Destination. Read JSON Functions (Transact-SQL) for information on querying and extracting data from your Azure Synapse Analytics Destination tables using JSON functions.
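For example, if a nested Source field is loaded as a JSON string, you can extract scalar values from it in the Destination using the Transact-SQL JSON functions. A minimal sketch with hypothetical table (dbo.orders) and column (metadata) names:
-- Extract scalar values from a JSON string column
SELECT JSON_VALUE(metadata, '$.customer.id') AS customer_id,
       JSON_VALUE(metadata, '$.status')      AS order_status
FROM dbo.orders
WHERE ISJSON(metadata) = 1;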
Destination Considerations
- In Azure Synapse Analytics, tables in dedicated SQL pools are created with clustered columnstore index by default, as this reduces data storage costs and improves query performance. However, any variable-length column in these tables cannot contain more than 8000 bytes of data. As a result, any Events containing data that exceeds this size limit are sidelined by Hevo. You can use Transformations to reduce the data size of these Events and replay them.
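To gauge how close the values already loaded into a variable-length column are to this 8,000-byte limit, you can check their byte length with DATALENGTH. A minimal sketch with hypothetical table (dbo.orders) and column (description) names:
-- Find the largest value, in bytes, stored in a variable-length column
SELECT MAX(DATALENGTH(description)) AS max_bytes
FROM dbo.orders;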
Limitations
- Hevo does not support replication of data to serverless SQL pools in Azure Synapse Analytics.
Revision History
Refer to the following table for the list of key updates made to this page:
Date | Release | Description of Change |
---|---|---|
Apr-25-2023 | 2.12 | Updated section, Configure Azure Synapse Analytics as a Destination to add information that you must specify all fields to create a Pipeline. |
Mar-10-2023 | 2.09 | New document. |