Ingestion Modes and Query Modes for Database Sources

Last updated on Jun 01, 2023

Once all the existing, or historical data in your Source is ingested, any new and changed data thereafter is ingested as incremental data. In case of RDBMS Sources like MySQL and PostgreSQL, you can define how Hevo fetches this data, using ingestion modes and query modes.

Ingestion Mode

Ingestion mode, or Pipeline mode defines how the incremental data in your Source is read. It is applicable for database Sources such as MySQL, MS SQL Server, Oracle, and PostgreSQL.

Hevo supports three ingestion modes:

  • Log-based

  • Table

  • Custom SQL

Note: Historical data is always fetched using queries.

Each ingestion mode offers different settings and configurations, such as the type of data it ingests or the query modes, which you can use to control the type and amount of data to be ingested.

Refer to the following table for more information on each ingestion mode:

Pipeline Mode Description
Log-based Data is read from the logs maintained by the Source for each transaction, such as, addition, deletion, or update of records. This mode is preferable when you want to:
- Replicate large volumes of incremental data in near real-time.
- Track any records deleted in your Source.
This mode applies only to RDBMS Sources.
Table Data is ingested by running SQL queries on your tables. This mode captures incremental data using modified or updated timestamp values, XMIN for PostgreSQL, and Change Tracking for SQL Server.
Read Query Modes for more information about the different query modes.
Deletes are not replicated to the Destination.
Custom SQL Data is ingested by running a custom SQL query provided by you on the database. With this mode, you can replicate database views or selective data sets retrieved through the SQL query.
Deletes are not replicated to the Destination.

Read Events Usage for Billing and Best Practices for Creating Database Pipelines to select a suitable ingestion mode.

Query Modes

Query modes are applicable for log-based and table-based Pipelines created with database Sources, such as MySQL, PostgreSQL, Oracle, and MS SQL. These define how Hevo queries the data from the Source tables, for example, based on an incrementing column or using a timestamp value in the Source table. In case of log-based Pipelines, the query mode applies only to the historical load ingestion as the incremental data is fetched using database logs. Read Factors Affecting Event Usage - Query Modes to understand how the query mode impacts the consumption of your Events quota.

Hevo identifies the most suitable query mode based on the schema of each Source table during Pipeline creation. For example, for PostgreSQL Sources, Hevo recommends the XMIN query mode, as that is the default mechanism for change data capture in PostgreSQL.

You can change the pre-selected query mode or modify it after Pipeline creation using the Edit Config action. The ingestion for the object restarts when you change its query mode. Please note that the entire data is ingested again from the beginning. The historical data is not billed during re-ingestion.

Read Query Modes and Events Quota Consumption to select a suitable query mode.

Revision History

Refer to the following table for the list of key updates made to this page:

Date Release Description of Change
Mar-24-2023 NA Enhanced the content in the Query Modes section for clarity.
Feb-10-2023 NA Revised the content for clarity and currency.
Sep-21-2022 NA New document.

Tell us what went wrong