Integrating Hevo with Airflow

Last updated on Feb 20, 2026

Edge Pipeline is now available for Public Review. You can explore and evaluate its features and share your feedback.

Hevo integrates with Apache Airflow through the Hevo Airflow Provider. This integration provides dedicated tasks that start Pipeline runs and monitor jobs through Hevo’s external APIs.

These tasks are organized into Directed Acyclic Graphs (DAGs), which define the order, dependencies, and timing of each step in your data flow. By incorporating Hevo tasks into a DAG, you can align data ingestion in your Pipelines with your broader automated workflows.
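For example, the following minimal DAG (assuming Airflow 2.x) defines two placeholder tasks and the order in which they run; the DAG ID and task IDs are illustrative:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

# A minimal DAG with two placeholder tasks executed in order.
# The dag_id and task_ids are illustrative.
with DAG(
    dag_id="hevo_example_dag",
    start_date=datetime(2026, 2, 20),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = EmptyOperator(task_id="ingest")
    transform = EmptyOperator(task_id="transform")

    ingest >> transform  # transform runs only after ingest completes
```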

How the Integration Works

Hevo Airflow Provider and Airflow work together as follows:

  • A Hevo Pipeline moves and transforms data from a Source to the Destination.

  • External APIs are used to trigger Pipeline runs and retrieve details of past and active jobs in the Pipelines.

  • Airflow DAGs orchestrate the flow by:

    • Using Hevo-specific tasks, such as operators and sensors, to trigger the Pipeline jobs and wait for their completion.

    • Chaining Hevo tasks with other processes, such as dbt transformations.

Airflow handles scheduling and monitoring, while Hevo manages the data flow. You can design Edge Pipelines so that Airflow can effectively orchestrate them. Read Hevo Airflow Provider for more information on its components.
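As a rough illustration of the API-driven flow described above, the sketch below triggers a Pipeline run and retrieves job details over HTTP. The base URL, endpoint paths, and authentication scheme are placeholders, not Hevo’s actual API; refer to Hevo’s API documentation for the real interface:

```python
import requests

# Illustrative only: the base URL, endpoint paths, and auth scheme below
# are placeholders, not Hevo's documented API. Consult Hevo's API
# reference for the actual endpoints.
BASE_URL = "https://<your-hevo-region>.example.com/api"
AUTH = ("<access-key>", "<secret-key>")
PIPELINE_ID = "<pipeline-id>"

# Trigger a Pipeline run (hypothetical endpoint).
resp = requests.post(f"{BASE_URL}/pipelines/{PIPELINE_ID}/run", auth=AUTH)
resp.raise_for_status()

# Retrieve details of past and active jobs (hypothetical endpoint).
jobs = requests.get(f"{BASE_URL}/pipelines/{PIPELINE_ID}/jobs", auth=AUTH).json()
print(jobs)
```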


Designing Pipelines for External Orchestration

You can design Pipelines for external orchestration based on whether you prefer Hevo or Airflow to control when the Pipeline runs. Refer to the following sections for guidelines on designing these Pipelines.

Pipelines with Fixed Schedules

Create a Pipeline that runs on a fixed schedule when you want:

  • Hevo to determine the data replication frequency, such as every 30 minutes or every 2 hours.

  • The orchestrator to wait for the latest run to complete before starting downstream tasks.

Designing the Pipeline

  1. Configure the Edge Pipeline with the sync frequency set to Scheduled Run.

  2. In Airflow, create a DAG that uses the Hevo Sensor to wait for the latest Pipeline run to complete (see the sketch after these steps).

  3. Once the sensor succeeds, run your downstream tasks, such as dbt transformations on the Destination data.
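A minimal sketch of such a DAG, assuming the Hevo Airflow Provider exposes a sensor class named HevoPipelineSensor; the class name, import path, connection ID, and parameters here are illustrative, so check the provider documentation for the actual interface:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical import; the actual module path and class name may differ.
from hevo_provider.sensors import HevoPipelineSensor

with DAG(
    dag_id="hevo_scheduled_pipeline_dag",
    start_date=datetime(2026, 2, 20),
    schedule="@hourly",  # align with the Pipeline's sync frequency
    catchup=False,
) as dag:
    # Wait for the latest scheduled Pipeline run to complete.
    wait_for_sync = HevoPipelineSensor(
        task_id="wait_for_hevo_sync",
        hevo_conn_id="hevo_default",   # illustrative connection ID
        pipeline_id="<pipeline-id>",   # placeholder
        poke_interval=60,              # check job status every 60 seconds
    )

    # Run downstream transformations once new data has landed.
    run_dbt = BashOperator(
        task_id="run_dbt_models",
        bash_command="dbt run --project-dir /path/to/dbt/project",
    )

    wait_for_sync >> run_dbt
```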

This design is ideal when:

  • Pipelines are already created using Hevo’s native scheduling functionality.

  • The orchestrator does not need to modify the Pipeline’s sync frequency or trigger Pipeline runs.

  • Downstream tasks need to run only when new data is available at the Destination.

Pipelines with No Schedules

Create a Pipeline with no schedule when you want the orchestrator to:

  • Trigger the Pipeline run using Hevo APIs.

  • Poll the Pipeline job status until it completes.

  • Decide when to retry a run or start the next run.

Designing the No-Schedule Pipeline

  1. Configure the Edge Pipeline with the sync frequency set to Sync On Demand. The Pipeline does not run automatically; it runs only when triggered manually or programmatically.

  2. In Airflow, create a DAG that uses the Hevo Operator to trigger the Pipeline run and the Hevo Sensor to poll the job status until it completes, and then run your downstream tasks (see the sketch after these steps).
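A minimal sketch of this design, again with hypothetical operator and sensor names (HevoRunPipelineOperator, HevoPipelineSensor) and illustrative parameters:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical imports; the actual module paths and class names may differ.
from hevo_provider.operators import HevoRunPipelineOperator
from hevo_provider.sensors import HevoPipelineSensor

with DAG(
    dag_id="hevo_on_demand_pipeline_dag",
    start_date=datetime(2026, 2, 20),
    schedule=None,  # Airflow (or an upstream event) decides when to run
    catchup=False,
) as dag:
    # Trigger the on-demand Pipeline run via Hevo's external APIs.
    trigger_sync = HevoRunPipelineOperator(
        task_id="trigger_hevo_sync",
        hevo_conn_id="hevo_default",   # illustrative connection ID
        pipeline_id="<pipeline-id>",   # placeholder
    )

    # Poll the job status until the triggered run completes.
    wait_for_sync = HevoPipelineSensor(
        task_id="wait_for_hevo_sync",
        hevo_conn_id="hevo_default",
        pipeline_id="<pipeline-id>",
        poke_interval=60,
        retries=2,  # the orchestrator owns retries in this design
    )

    run_dbt = BashOperator(
        task_id="run_dbt_models",
        bash_command="dbt run --project-dir /path/to/dbt/project",
    )

    trigger_sync >> wait_for_sync >> run_dbt
```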

This design is ideal when:

  • The Pipeline must run only after specific upstream events, such as the completion of a task in another system.

  • The workflow manager must own all schedules, dependencies, retries, and SLAs across various systems.

  • The Pipelines represent one stage in a larger multi-layer data flow.


Revision History

Refer to the following table for the list of key updates made to this page:

Date         Release   Description of Change
Feb-20-2026  NA        New document.
