Hevo Operator
HevoPipelineOperator is Hevo’s implementation of the Airflow Operator. It defines the Pipeline sync task in a Directed Acyclic Graph (DAG).
Parameters
The HevoPipelineOperator accepts the following parameters:
Required
| Parameter | Type | Default | Description |
|---|---|---|---|
| pipeline_id (Required) Note: Supports Jinja templating. | INT | NA | The unique identifier of the Hevo Pipeline to be synced. Note: The Pipeline must be in the Initialized state. |
Optional
| Parameter | Type | Default | Description |
|---|---|---|---|
| action | PipelineAction | SYNC_NOW | The operation to trigger for the specified Pipeline. Allowable values: SYNC_NOW: Trigger the incremental sync for the Pipeline. Note: The Pipeline must be in the Initialized state. RESYNC: Restart the historical load for all the active objects in the specified Pipeline. |
| job_type | JobType or STR | INCREMENTAL or TRUNCATE_AND_LOAD | Identify the job to track after initiating a sync. Allowable values: INCREMENTAL: A job that syncs new and updated data. This is the default value for the SYNC_NOW action. HISTORICAL: A job that syncs all existing data from the Source. TRUNCATE_AND_LOAD: A job that runs when a Pipeline is resynced or is resumed after a pause. This is the default value for the RESYNC action. |
| connection_id | STR | hevo_airflow_conn_id | The ID of the connection created in Airflow for Hevo API credentials and connection details. |
| poll_interval | INT | 15 | The time (in seconds) between status checks while waiting for the Pipeline job to complete. |
| retry_limit | INT | 10 | The maximum number of attempts after initiating the sync to identify the active job. |
| deferrable | BOOL | True. Note: It is recommended to set this parameter to True in production environments. | A flag to determine whether the task should initiate the sync, release the worker slot, and defer the monitoring to the Airflow triggerer. Note: The Airflow Triggerer service must be running. |
| wait_for_completion | BOOL | True | A flag to determine whether the Operator should wait for the Pipeline job to complete. Allowable values: True: Wait for the job to complete. False: Do not wait for the job to complete. Capture the job ID and transfer it via the cross-communication mechanism (XCom) in Airflow to a HevoSensor. |
| accept_completed_with_failures | BOOL | False | A flag to accept partial failures for the job. Allowable values: True: Treat the Pipeline job’s Completed with Failures status as success. False: Fail the task when the status of the Pipeline job is Completed with Failures. |
| ensure_new_job | BOOL | True | A flag to prevent the operator from triggering a new job when previous jobs are still running. Allowable values: True: Fail the task if a job is already in progress in the Pipeline. False: Trigger a new job even when an active job is detected in the Pipeline. |
| drop_and_load | BOOL | False | A flag to determine whether the Drop and Load operation should be run for the specified Pipeline during the RESYNC action. Allowable values: True: Drop existing data from the Destination tables for all active objects in the Pipeline and then load the ingested data into them. False: Evolve the schema and load data ingested from the Source for all active objects in the Pipeline into the existing Destination tables. |
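The way the `job_type` default depends on `action` can be sketched in plain Python. This is an illustration only, not the operator's actual code: the two enums below are stand-ins defined locally, and `resolve_job_type` is a hypothetical helper name. Only the default mapping (INCREMENTAL for SYNC_NOW, TRUNCATE_AND_LOAD for RESYNC) comes from the table.

```python
from enum import Enum


class PipelineAction(str, Enum):
    """Local stand-in for the operator's action enum (illustrative only)."""
    SYNC_NOW = "SYNC_NOW"
    RESYNC = "RESYNC"


class JobType(str, Enum):
    """Local stand-in for the operator's job type enum (illustrative only)."""
    INCREMENTAL = "INCREMENTAL"
    HISTORICAL = "HISTORICAL"
    TRUNCATE_AND_LOAD = "TRUNCATE_AND_LOAD"


# Default job type per action, as documented in the parameter table.
DEFAULT_JOB_TYPE = {
    PipelineAction.SYNC_NOW: JobType.INCREMENTAL,
    PipelineAction.RESYNC: JobType.TRUNCATE_AND_LOAD,
}


def resolve_job_type(action=PipelineAction.SYNC_NOW, job_type=None):
    """Hypothetical helper: return the job type to track for a given action."""
    if job_type is None:
        # No explicit job type: fall back to the documented default.
        return DEFAULT_JOB_TYPE[PipelineAction(action)]
    # The parameter is documented as accepting a JobType or a string.
    return JobType(job_type)
```

Under these assumptions, omitting `job_type` for a RESYNC resolves to TRUNCATE_AND_LOAD, while passing `"HISTORICAL"` explicitly overrides the default.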
The following table lists the methods that the HevoPipelineOperator provides:
| Method | Description |
|---|---|
| execute | Runs the sync operation for the specified Pipeline after validating its status. |
| execute_complete | Completes task execution when the Pipeline job reaches a final state, such as Completed, Completed with Failures, or Failed. |
| _wait_synchronously | Waits for the Pipeline job to complete, blocking the worker slot. |
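A blocking wait of the kind `_wait_synchronously` describes might look roughly like the sketch below. It is not Hevo's implementation. The status strings, the `get_status` callable, and the injectable `sleep` argument are all assumptions made for illustration. Only the `poll_interval` and `accept_completed_with_failures` semantics follow the parameter descriptions on this page.

```python
import time
from typing import Callable

# Assumed string forms of the final states named in the methods table.
FINAL_STATES = {"COMPLETED", "COMPLETED_WITH_FAILURES", "FAILED"}


def wait_synchronously(
    get_status: Callable[[], str],
    poll_interval: float = 15,
    accept_completed_with_failures: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a final state; raise if it did not succeed."""
    while True:
        status = get_status()
        if status in FINAL_STATES:
            break
        # Sleeping here blocks the caller, which is why a synchronous wait
        # keeps the worker slot occupied for the whole job duration.
        sleep(poll_interval)

    if status == "COMPLETED":
        return status
    if status == "COMPLETED_WITH_FAILURES" and accept_completed_with_failures:
        return status
    raise RuntimeError(f"Pipeline job ended with status {status}")
```

Passing a fake `get_status` and a no-op `sleep` makes the loop easy to exercise without a live Pipeline. A deferrable task avoids the blocking sleep by handing the same check to the Airflow triggerer.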
Example
The following example creates three HevoPipelineOperator tasks: trigger_pipeline_sync, resync_pipeline, and trigger_only.
- The trigger_pipeline_sync task runs the SYNC_NOW action for the specified Pipeline in the Trigger and Asynchronous Wait mode, as indicated by the deferrable=True and wait_for_completion=True parameters.
- The resync_pipeline task runs the RESYNC action for the specified Pipeline in the Trigger and Asynchronous Wait mode, as indicated by the deferrable=True and wait_for_completion=True parameters.
- The trigger_only task runs in the Trigger and Continue mode, as indicated by the deferrable=False and wait_for_completion=False parameters.
```python
from airflow import DAG
from airflow.hevo.models.job import JobType
from airflow.hevo.models.pipeline import PipelineAction
from airflow.hevo.operators import HevoPipelineOperator

# Deferrable operator with SYNC_NOW (default)
trigger_sync = HevoPipelineOperator(
    task_id="trigger_pipeline_sync",
    pipeline_id=123,
    action=PipelineAction.SYNC_NOW,  # Default - can be omitted
    job_type=JobType.INCREMENTAL,  # Default for SYNC_NOW - can be omitted
    deferrable=True,
    wait_for_completion=True,
    poll_interval=10,
    accept_completed_with_failures=False,
    ensure_new_job=True,  # Default: True - prevents duplicate jobs
    connection_id="hevo_production",
)

# Full historical resync
resync_pipeline = HevoPipelineOperator(
    task_id="resync_pipeline",
    pipeline_id=123,
    action=PipelineAction.RESYNC,  # Full historical reload
    deferrable=True,
    wait_for_completion=True,
    poll_interval=30,  # Less frequent polling for long jobs
    accept_completed_with_failures=True,
)

# Trigger and continue
trigger_only = HevoPipelineOperator(
    task_id="trigger_only",
    pipeline_id=456,
    deferrable=False,
    wait_for_completion=False,  # Returns job_id via XCom
)
```
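The three tasks above differ mainly in how `deferrable` and `wait_for_completion` combine. The sketch below maps the two flags to a mode name. The function `execution_mode` is hypothetical. The labels "Trigger and Asynchronous Wait" and "Trigger and Continue" come from this page, but "Trigger and Synchronous Wait" is an assumed label for the blocking case. The sketch also assumes that `deferrable` has no effect when `wait_for_completion` is False.

```python
def execution_mode(deferrable: bool = True, wait_for_completion: bool = True) -> str:
    """Hypothetical helper: name the mode implied by the two flags."""
    if not wait_for_completion:
        # Assumption: without waiting, the task returns right after the
        # trigger, so deferrable makes no difference.
        return "Trigger and Continue"
    if deferrable:
        # Monitoring is handed to the Airflow triggerer; the worker slot is freed.
        return "Trigger and Asynchronous Wait"
    # Assumed label: the task polls in-process and holds the worker slot.
    return "Trigger and Synchronous Wait"
```

With the documented defaults (both flags True), this resolves to the Trigger and Asynchronous Wait mode used by the first two tasks in the example.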
Revision History
Refer to the following table for the list of key updates made to this page:
| Date | Release | Description of Change |
|---|---|---|
| Feb-20-2026 | NA | New document. |