Handling of Deletes

In the case of Pipelines, while Hevo tracks the records deleted in the Source, it does not delete that data in the Destination. Instead, it updates the metadata column, __hevo__marked_deleted to True to identify deleted records. In the case of Activate, Hevo does not track delete operations performed on the Warehouse data.

Tracking Deletes

Hevo tracks deleted Events through the metadata column, __hevo__marked_deleted, which is of the boolean data type. Default value: False.

This column is created by Hevo in the Destination tables and can contain one of the following values:

  • False: If the ingested Event was not deleted in the Source.

  • True: If the ingested Event was deleted in the Source.

  • Null: If the metadata column is added to an existing table. All records with a null value in this column are considered not deleted.

You can add this column in your SQL query to ignore or include the deleted data from/in your analysis. You can also delete the data from your Destination table based on the value of this column.

Note:

  • The __hevo__marked_deleted column was visible in the Schema Mapper until before September 3, 2019. It is not visible in the Schema Mapper any longer but is added to the Destination tables automatically.

  • Hevo tracks deleted data in a Stripe Source through the metadata columns is_row_deleted and deleted_timestamp. Read Data Replication in Stripe.

Read Additions to the Destination Schema.


Handling Deletes in Destinations

In data warehouses and databases, deleted Events are handled during the deduplication process when primary keys are defined for the Destination tables. Read Loading Data in a Database and Loading Data to a Data Warehouse for details on the deduplication process.

In the case when no primary keys are defined, the deleted Events are not deduplicated; the append-only approach is taken, and the ingested data is appended at the end in the Destination tables. However, to identify a deleted Event, Hevo updates the metadata columns in the Event as follows:

  • Sets the metadata column, __hevo__marked_deleted to True.

  • Updates the __hevo__modified_source_at to the timestamp of when the Event was deleted in the Source.


Understanding Deletes in Sources

In Sources, a delete could mean one of the following:

  • A Schema Change: A field is dropped from the Event type resulting in a schema change. A schema change does not affect the Pipeline execution or the existing data in the dropped field. However, in subsequent Pipeline runs, null values are loaded to the dropped field in the Destination.

  • A Record Deleted in an RDBMS Source: For Pipelines created using the Custom SQL and Table ingestion modes, data is fetched as per the selected query mode using SQL queries. As it is not possible to identify delete operations from SQL queries, Hevo does not handle deleted Events for these ingestion modes or any of the query modes.

    In Pipelines created using the logical replication mode, Hevo uses the database logs to replicate data. As a result, Hevo can track all operations, such as insert, update, or delete, that take place in the database and replicate them accordingly. Read Handling Deletes in Log-based Pipelines.

  • A Record Deleted in a SaaS Source: SaaS Sources, such as Salesforce move the deleted data to a recycle bin, while some, such as Stripe and Shopify capture information about deleted records and objects as an event in their events log. Hevo replicates the deleted data to the Destination tables and uses the metadata columns, __hevo__marked_deleted or is_row_deleted and deleted_timestamp to track this information. Read the respective SaaS Source documentation for information on whether Hevo replicates deleted data for the Source.


Handling Deletes in Activate

Hevo does not track delete operations on the data in an Activate Warehouse. However, any changes to the schema in the Warehouse that affect the Activation SQL query fields cause the Activation to fail.

Activate allows deleting records in a Target object through the Delete synchronization behavior for Targets such as Intercom and Zendesk. While the identified object records are deleted in the Target object, the Warehouse data remains unaffected. Read Synchronization Behaviors.



Revision History

Refer to the following table for the list of key updates made to this page:

Date Release Description of Change
Apr-11-2022 NA - Added information about deletes in SaaS Sources to the Understanding Deletes in Sources section.
- Changed the reference link in the Tracking Deletes section.
Mar-21-2022 NA New document.
Last updated on 11 Apr 2022