Destination FAQs

How do I enable or disable the deduplication of records in my Destination tables?

Use the Append Rows on Update option of a Destination table to specify whether the ingested Events must be appended directly as new rows or checked against existing rows for duplicates. You can configure this setting for each table.

Note: This feature is available only for the Amazon Redshift, Google BigQuery, and Snowflake data warehouse Destinations. For RDBMS Destinations such as Aurora MySQL, MySQL, Postgres, and SQL Server, deduplication is always performed.
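
Conceptually, deduplicated loading behaves like an upsert keyed on the table's primary key, whereas appending behaves like a plain insert. The following SQL is only an illustrative sketch, not Hevo's actual load logic; the orders table, its columns, and the staged_events staging table are hypothetical:

-- Append Rows on Update disabled: incoming Events are merged on the primary key
MERGE INTO orders AS t
USING staged_events AS s
    ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET amount = s.amount, status = s.status
WHEN NOT MATCHED THEN
    INSERT (id, amount, status) VALUES (s.id, s.amount, s.status);

-- Append Rows on Update enabled: every incoming Event lands as a new row
INSERT INTO orders (id, amount, status)
SELECT id, amount, status
FROM staged_events;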

In the Destination Detailed View page:

  1. Click the More icon next to the table name in the Destination Tables List.
  2. Update the Append rows on update option, as required:

    Option setting    Description
    Enabled           Events are appended as new rows without deduplication.
    Disabled          Events are de-duplicated.

    Modify Append rows on update setting for a table

  3. Click OK, GOT IT in the confirmation dialog to apply this setting.

Note: If you disable this feature after having previously enabled it, uniqueness is ensured only for the records loaded subsequently in the case of Google BigQuery and Snowflake. Therefore, both the old and new versions of the same record may exist in these Destinations. In the case of Amazon Redshift, however, uniqueness is achieved for the entire data once the feature is disabled.


How do I resolve duplicate records in the Destination table?

Duplicate records may occur in the Destination table in two scenarios:

  • When there is no primary key in the Destination table to carry out the deduplication of records.
  • When the Append Rows on Update setting is enabled for the table.

You can either set up the primary key and re-run the ingestion for that object from the Pipeline Overview tab of the Pipeline Detailed View, or disable the Append Rows on Update setting for the table.

Note: The changes are applied only to the data loaded subsequently.
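
If duplicate rows have already been loaded, you can also identify and clean them up directly in the Destination warehouse. The SQL below is only a sketch: the orders table, its primary key id, the amount and status columns, and the __hevo__loaded_at metadata column are assumptions that you must verify against your own table, and the exact syntax varies across warehouses.

-- Find primary key values that occur more than once
SELECT id, COUNT(*) AS copies
FROM orders
GROUP BY id
HAVING COUNT(*) > 1;

-- Rebuild the table, keeping only the most recently loaded row for each id
CREATE TABLE orders_deduped AS
SELECT id, amount, status, __hevo__loaded_at
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY __hevo__loaded_at DESC) AS rn
    FROM orders
) AS ranked
WHERE rn = 1;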


Can I have the same Source and Destination in the Pipeline?

Yes, you can create a Pipeline with the same Source and Destination instance. However, you cannot move data between the same database via the Pipeline.


How do I filter deleted Events from the Destination?

Events deleted at the Source have the value of the boolean column __hevo__marked_deleted set to True in the Destination. You can filter out these Events by creating a Model that uses the __hevo__marked_deleted column. To do this, use the following query in your Model:

SELECT *
FROM table_name
WHERE __hevo__marked_deleted = false

where table_name is the Destination table containing all Source Events. For more information, read Creating a Model.

After you run the Model, the Destination table contains only those Events that exist in the Source table.

Note: This method works only for database Sources whose Pipelines are in Logical Replication mode. For non-database Sources and database Pipelines in other modes (Table and Custom SQL), you must manually delete the Events from the Destination table.


How can I change from a service account to a user account in my BigQuery Pipeline?

You cannot change from a service account to a user account or vice-versa once you have created the Pipeline with BigQuery as the Destination. This is because BigQuery Destinations configured with service and user accounts are treated as two different Destinations. Therefore, you must create a different Pipeline and authenticate Hevo through the user account.


How can I filter specific fields before loading data to the Destination?

You can filter specific fields of an Event before the Destination table is created for the corresponding Event Type, by using any one of the following: Python-based Transformations, Drag and Drop Transformations, or the Schema Mapper.

Note: Event Types from the Source are mapped to Tables in your Destination, while fields are mapped to the table columns.

To filter columns using Python-based Transformations:

  1. Create the Pipeline with Auto Mapping disabled.
  2. Click Transformations to access the Python-based Transformation interface.

    Python based Transformation interface

  3. Write the following code in the CODE console of the Transformation:

     from io.hevo.api import Event

     def transform(event):
         properties = event.getProperties()

         # List of fields to drop from the Event; replace these sample names with your own
         del_columns = ["item", "qty"]

         for column in del_columns:
             if column in properties:
                 del properties[column]

         return event
    
    
  4. Click GET SAMPLE and then click TEST to test the transformation.

  5. Click DEPLOY.

  6. Once the Transformation is deployed, enable Auto Mapping for the Event Type from the Schema Mapper. Auto Mapping automatically maps Event Types and fields from your Source to the corresponding tables and columns in the Destination, excluding the filtered columns.

Read more at Python Code-Based Transformations.
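
If it is easier to list the fields you want to keep than those you want to drop, the same approach can be inverted. This is a minimal variation of the snippet above that uses only the Event API calls already shown there; the field names in keep_columns are illustrative, and you should retain the primary key fields so that deduplication continues to work.

from io.hevo.api import Event

def transform(event):
    properties = event.getProperties()

    # Illustrative allow-list: keep only these fields and drop everything else
    keep_columns = ["id", "amount", "status"]

    for column in list(properties.keys()):
        if column not in keep_columns:
            del properties[column]

    return event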

To filter columns using Drag and Drop Transformations:

  1. Create the Pipeline with Auto Mapping disabled.

  2. Click Transformations to access the Python-based Transformation interface.

  3. In the Python-based Transformation interface, click ENABLE DRAG-AND-DROP INTERFACE.

    Drag-and-Drop Interface

  4. Drag the Drop Fields Transformation block to the canvas.

    Select Drop Fields

  5. Specify the Event Type from which you want to drop the fields. Alternatively, skip this filter to apply the transformation to all Event Types.

  6. Select the Field filters to specify the fields to be dropped.

  7. Click DEPLOY.

  8. Once the Transformation is deployed, enable Auto Mapping for the Event Type from the Schema Mapper. Auto Mapping automatically maps Event Types and fields from your Source to the corresponding tables and columns in the Destination, excluding the filtered columns.

Read more at Drag and Drop Transformations.

To filter columns using Schema Mapper:

  1. Once the Pipeline is created, select the required Event Type in the Schema Mapper.

  2. Click CREATE TABLE & MAP.

    Create table and map option

  3. Deselect the columns that you want to exclude from the Destination table.

    Deselecting the columns to filter

  4. Specify the Destination Table Name and click CREATE TABLE & MAP.

  5. Once the table is created, enable Auto Mapping for the Event Type from the Schema Mapper. Auto Mapping automatically maps Event Types and fields from your Source to the corresponding tables and columns in the Destination, excluding the filtered columns.

Read more at Schema Mapper.

The Destination table is created with the remaining columns, and the data is replicated into this table as per the Pipeline schedule.
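
To confirm that the excluded fields were not created in the Destination, you can inspect the table's columns in the warehouse. This is a minimal sketch using the standard information schema; the table name is illustrative, and the exact view name and schema or dataset qualification differ across warehouses (for example, Google BigQuery uses dataset.INFORMATION_SCHEMA.COLUMNS).

-- List the columns of the mapped Destination table
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'orders'
ORDER BY ordinal_position;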



Revision History

Refer to the following table for the list of key updates made to this page:

Date          Release    Description of Change
Sep-20-2021   NA         Added the FAQ, How can I filter specific fields before loading data to the Destination?
Sep-09-2021   NA         Added the FAQ, How can I change from a service account to a user account in my BigQuery Pipeline?
Aug-09-2021   NA         Added the FAQ, How do I filter deleted Events from the Destination?
Mar-09-2021   NA         Added the FAQ, How do I resolve duplicate records in the Destination table?
Last updated on 11 Oct 2021