Databricks

Last updated on Nov 02, 2023

Databricks is a cloud data platform built on Delta Lake, an open-source storage layer that allows you to operate a data lakehouse architecture. This architecture provides data warehousing performance at data lake costs. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs. Apache Spark is an open-source data analytics engine that can perform analytics and data processing on very large data sets. For more information, read A Gentle Introduction to Apache Spark on Databricks.

Hevo can load data from any of your Sources into a Databricks data warehouse. You can set up the Databricks Destination on the fly, while creating the Pipeline, or independently from the Navigation bar. The ingested data is first staged in Hevo’s S3 bucket before it is batched and loaded to the Databricks Destination.
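
The stage-then-batch-load flow described above can be pictured with a small, self-contained sketch. This is only an illustration of the general pattern and not Hevo's implementation: the names StagingArea, load_batches, and batch_size are hypothetical, an in-memory list stands in for the S3 bucket, and another list stands in for the Databricks table.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class StagingArea:
    """Hypothetical stand-in for a staging bucket: holds events until they are loaded."""

    events: list[dict[str, Any]] = field(default_factory=list)

    def stage(self, event: dict[str, Any]) -> None:
        # Ingested events are appended in arrival order.
        self.events.append(event)

    def drain_batches(self, batch_size: int) -> list[list[dict[str, Any]]]:
        # Split the staged events into batches of at most batch_size,
        # then clear the staging area so nothing is loaded twice.
        batches = [
            self.events[i:i + batch_size]
            for i in range(0, len(self.events), batch_size)
        ]
        self.events = []
        return batches


def load_batches(
    staging: StagingArea,
    destination: list[dict[str, Any]],
    batch_size: int = 2,
) -> int:
    """Write every staged batch to the destination; return the number of batches."""
    batches = staging.drain_batches(batch_size)
    for batch in batches:
        # One bulk write per batch (a list stands in for the destination table).
        destination.extend(batch)
    return len(batches)
```

For example, staging five events and calling load_batches with batch_size=2 writes them in three batches and leaves the staging area empty. Batching in this way trades a little latency for fewer, larger writes to the Destination.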

Hevo supports Databricks on the AWS, Azure, and GCP platforms. You can connect your Databricks warehouse hosted on any of these platforms to Hevo using one of the following methods:


Destination Considerations

None.


Limitations

  • Hevo currently does not support Databricks as a Destination in the US-GCP region.



Revision History

Refer to the following table for the list of key updates made to this page:

Date | Release | Description of Change
Aug-10-2023 | NA | - Added a prerequisite about adding Hevo IP addresses to an access list.
 | | - Added the subsection, Allow connections from Hevo IP addresses to the Databricks workspace, for the steps to create an IP access list.
Apr-25-2023 | 2.12 | Updated section, Connect Using the Databricks Partner Connect (Recommended Method), to add information that you must specify all fields to create a Pipeline.
Nov-23-2022 | 2.02 | - Added section, Connect Using the Databricks Partner Connect, to mention the Databricks Partner Connect integration.
 | | - Updated screenshots in the page to reflect the latest Databricks UI.
Oct-17-2022 | NA | Updated section, Limitations, to add a limitation regarding Hevo not supporting Databricks on Google Cloud.
Jan-03-2022 | 1.79 | New document.
