Release Version 2.20

Last updated on Apr 15, 2024

The content on this site may have changed or moved since you last viewed it. As a result, some of your bookmarks may have become obsolete. We therefore recommend accessing the latest content on the Hevo Docs website.

This release note also includes the fixes provided in all the minor releases since 2.19.

For the complete list of features available for early adoption before they are made generally available to all customers, read our Early Access page.

For the list of features and integrations we are working on next, read our Upcoming Features page!

In this Release


Early Access Features

Data Ingestion

  • XMIN Ingestion Mode for PostgreSQL Sources

    • Introduced a new ingestion mode, XMIN, which uses the system-generated XMIN column in PostgreSQL to capture new and changed data. The XMIN mode is especially useful when logical replication is not possible.

      Once this ingestion mode is enabled for you, the XMIN option is displayed in the Source configuration UI of all PostgreSQL Source types. A minimal query sketch appears at the end of this entry.

      XMIN Ingestion Mode

      Read XMIN.

      Request Early Access

      Refer to the Early Access page for the complete list of features available for early access.
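
      The following is a minimal, illustrative sketch of an XMIN-based incremental query, assuming psycopg2 and a hypothetical table and DSN; it is not Hevo's actual implementation:

          import psycopg2

          def fetch_changes(conn, last_xmin: int):
              """Fetch rows whose xmin (transaction ID) is newer than the last-polled value."""
              with conn.cursor() as cur:
                  # xmin is a system column of type xid; cast xid -> text -> bigint
                  # so it can be compared numerically.
                  cur.execute(
                      "SELECT xmin::text::bigint AS row_xmin, * FROM public.orders "
                      "WHERE xmin::text::bigint > %s ORDER BY row_xmin",
                      (last_xmin,),
                  )
                  return cur.fetchall()

          conn = psycopg2.connect("dbname=shop user=replicator")  # hypothetical DSN
          rows = fetch_changes(conn, last_xmin=735421)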

Destinations

  • Amazon S3 as a Destination

    • Integrated Amazon S3 as a file storage-based Destination for creating Pipelines. Amazon S3 is a cloud-based object storage service that can be used as a repository for backups and artifacts, for content distribution, and for hosting static websites.

      Once this Destination type is enabled for your team, you can create an Amazon S3 Destination and load data to your S3 bucket using Hevo Pipelines. You can configure Hevo to connect to your Amazon S3 bucket using IAM role-based credentials or access credentials; a minimal connection sketch appears at the end of this entry.

      Amazon S3 Destination

      Read Amazon S3 Destination.

      Request Early Access

      Refer to the Early Access page for the complete list of features available for early access.
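
      The following is a minimal sketch of loading a file to an S3 bucket with access credentials, assuming boto3 and hypothetical bucket and key names; Hevo performs this connection for you once the Destination is configured:

          import boto3

          # Access-credentials option; alternatively, Hevo can connect by
          # assuming an IAM role.
          s3 = boto3.client(
              "s3",
              aws_access_key_id="AKIA...",        # placeholder credentials
              aws_secret_access_key="...",
          )

          # Upload one staged file to the configured bucket and prefix.
          s3.upload_file(
              "events_0001.jsonl.gz",             # local file
              "my-hevo-destination-bucket",       # hypothetical bucket
              "hevo/events/events_0001.jsonl.gz", # object key
          )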

Pipelines

  • Pipeline and Object-level Visibility into Data Replication (Job Monitoring)

    • Introduced the Jobs view, which provides visibility into the various components of the data replication stages in a single place. Each complete run of the Pipeline is considered a Job. You can view the jobs for all Pipelines created by your team or for a single Pipeline. Details such as the following help you identify data mismatches, latencies, and Event consumption spikes in each job:

      • The duration of each replication task.

      • The data volume at ingestion and load.

      • The number of Events being replicated at any stage in the Pipeline run (Job).

      • Errors logged in each Pipeline run.


        Read Pipeline Jobs.

        Request Early Access

        Once this feature is enabled for you, the All Jobs view is displayed in the UI Navigation Bar, and the Jobs tab in the Tools Bar of all your Pipelines.

        Refer to the Early Access page for the complete list of features available for early access.


New and Changed Features

Pipelines

  • Custom Fields for Unique Incrementing Append Only Query Mode (Added in Release 2.19.2)

Sources

  • Migration to v18.0 of the Facebook Marketing API (Added in Release 2.19.3)

    • Effective Release 2.19.3, the Facebook Ads, Facebook Pages, and Instagram Business integrations use v18.0 of the Marketing API to fetch your data. Existing Pipelines created with these Sources require reauthorization. In addition, for the Instagram Business Source, the new API version impacts data ingestion for the following objects:

      Object changes:

      • media_comments_insights: Destination tables video_insights_lifetime, carousel_album_insights_lifetime, and image_insights_lifetime replaced by video_insights_lifetime_v2, carousel_album_insights_lifetime_v2, and image_insights_lifetime_v2.

      • story: Destination table story_insights_lifetime replaced by story_insights_lifetime_v2.

      • user_insights_lifetime: Source object schema changed and replaced by the new objects user_age_insights_lifetime, user_gender_insights_lifetime, and user_country_insights_lifetime. Corresponding Destination tables are created as user_age_insights_lifetime_v2, user_gender_insights_lifetime_v2, and user_country_insights_lifetime_v2.

      Note: Hevo ingests the new objects by default in the existing Pipelines. You can skip them from the Pipeline Detailed View, if required.
  • Schema Version Selection in Stripe

    • Provided the option to ingest data without expanding nested objects. You can do this by selecting Schema V2 during Source configuration.

      This option is available for all Pipelines created after Release 2.19. To enable it for an existing Pipeline, you must recreate the Pipeline.

      Configure your Stripe Source

      Read Stripe.

Destinations

  • Removed Support for Hevo Managed BigQuery as a Destination

    • Starting Release 2.20, Hevo has stopped supporting Hevo Managed BigQuery as a Destination type. As a result, you cannot configure new Hevo Managed BigQuery Destinations. Any existing Pipelines will continue to be supported.

User Experience

  • Removed Support for Streaming Inserts from Google BigQuery Destinations

    • Starting Release 2.20, Hevo has stopped supporting the streaming inserts feature in Google BigQuery Destinations. Any existing Destination configured with this feature enabled can continue to be used to configure new and existing Pipelines.

Fixes and Improvements

Performance

  • Improved Job Scheduling to Reduce Pipeline Latency (Added in Release 2.19.3)

    • Revamped the task executor functionality to schedule ingestion and loading tasks efficiently, thereby reducing the time taken to identify the tasks to be run. As the scheduling latency is reduced, Pipelines, especially those processing large amounts of data, see an increased throughput. This is an improvement over the earlier behavior where a latency of 5 minutes or more was seen when the task executor was required to schedule a large number of tasks.

      This change is currently implemented only for the AU (Australia) region to monitor the performance impact. Based on the results, it will be gradually implemented in other regions.

  • Optimized On-Demand Credit Events Update API (Added in Release 2.19.1)

    • Removed requests to the On-Demand Credit Events Update API from the Pipeline and Source configuration update flows, as these updates do not impact On-Demand Credit Events.

Pipelines

  • Handling of Open Sink Files Before Loading to S3

    • Fixed an issue whereby the file uploader task uploaded open sink files to the S3 bucket for staging. Sink files are required for copying data to the Destination. If the reference to these files is kept open, Hevo continues trying to write data to them; because the files had already been uploaded, any data written afterward was lost. Hevo now checks that sink files are closed before uploading them to S3, as in the sketch below.
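
      A simplified sketch of the guard, with hypothetical names (not Hevo's actual internals):

          def upload_sink_file(sink_file, uploader):
              # Close the file first so no writer can append to it after the
              # upload; data written to an open handle would otherwise be lost.
              if not sink_file.closed:
                  sink_file.close()
              uploader.upload(sink_file.name)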

Sources

  • Handling of Decimal Values in PostgreSQL Source (Added in Release 2.19.2)

    • Fixed an issue in the PostgreSQL integration to correctly handle decimal values with more than nine decimal digits by rounding them off to nine decimal places. This fix resolves Pipeline errors when ingesting data from a Source table whose definition does not specify a scale (the number of digits to the right of the decimal point) for a decimal column. The sketch below illustrates the rounding behavior.
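
      For illustration, the rounding can be expressed with Python's decimal module (an assumed rendering for clarity, not Hevo's actual code; the rounding mode is also an assumption):

          from decimal import Decimal, ROUND_HALF_UP

          def round_to_scale_9(value: Decimal) -> Decimal:
              # A NUMERIC column with no declared scale can carry arbitrary
              # precision; rounding to nine decimal places makes the value fit
              # a fixed-scale Destination type.
              return value.quantize(Decimal("0.000000001"), rounding=ROUND_HALF_UP)

          print(round_to_scale_9(Decimal("3.14159265358979")))  # 3.141592654
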
  • Handling of Hierarchical Updates from Multiple Flows for Stripe Objects (Added in Release 2.19.2)

    • Fixed an issue in Stripe whereby nested Source objects were not being expanded, leading to incorrect values in the Destination fields and, therefore, to data mismatch issues. Now, Hevo obtains the field IDs from the nested object and makes an API call to ingest the fields. If any field refers to an unsupported API version, Hevo does not ingest it and instead loads its ID into a separate table called invalid_records in the Destination. Customers can manually update these fields if required. A conceptual sketch of this flow appears at the end of this entry.

      This fix applies to new and existing Pipelines.
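
      A conceptual sketch of this flow, using hypothetical helper names (the Stripe API specifics are Hevo internals):

          def expand_nested_object(field_ids, fetch_field, supported_versions):
              expanded, invalid = [], []
              for field_id in field_ids:
                  record = fetch_field(field_id)  # API call to ingest the field
                  if record["api_version"] in supported_versions:
                      expanded.append(record)
                  else:
                      # IDs of fields on unsupported API versions are loaded
                      # into the invalid_records table in the Destination.
                      invalid.append(field_id)
              return expanded, invalid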

  • Handling of Ingestion Optimization Issues in PostgreSQL Sources (Added in Release 2.19.1)

    • Fixed an issue whereby the logic implemented to optimize data ingestion in long-running queries was unable to identify the updated data from the write-ahead logs correctly. This issue was seen when the records were updated within a short span of being inserted into the Source object, which led to a mismatch between the Source and Destination data.

      The fix applies to new and existing Pipelines. If you encounter this issue, contact Hevo Support to enable the fix for your Pipelines.

  • Handling of Objects in SQL Server Sources (Added in Release 2.19.1)

    • Fixed an issue whereby Hevo was displaying temporary tables and views on the Select Objects page during Pipeline configuration. Selecting these caused the Pipelines to get stuck.

      This fix applies to Pipelines created after Release 2.19.1 with the SQL Server Source types.

      Read SQL Server.

  • Handling of Potential Data Mismatch Issues in Salesforce

    • Hevo now maintains internal logs for each batch of Events ingested from your Salesforce account. This helps debug any potential data mismatch issues that may occur between the Source and Destination data.
  • Handling of Selection Issues during Facebook Ads Configuration (Added in Release 2.19.3)

    • Fixed a bug whereby, if you selected any Dynamic creative asset(s) breakdown during Source configuration, you could not select any of the fields corresponding to the reports.
  • Handling of XMIN Wraparound Issues in PostgreSQL Source

    • Fixed an issue whereby Pipelines ingesting data using the XMIN column were missing data or getting stuck due to XMIN wraparound. This issue occurs when the transaction ID exceeds the XMIN column’s upper limit of 4,294,967,295 and the XMIN value resets, or “wraps around”, to 1. As part of the fix, when Hevo identifies that a wraparound event has occurred for a transaction, it queries for records with an XMIN value greater than the last-polled one or less than the current XMIN value, as sketched at the end of this entry.

      This fix applies to new and existing Pipelines.
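
      The following sketch illustrates a wraparound-aware filter, assuming psycopg2 and a hypothetical table; the predicate mirrors the fix described above:

          import psycopg2

          def fetch_since(conn, last_xmin: int, current_xmin: int):
              with conn.cursor() as cur:
                  if current_xmin >= last_xmin:
                      # Normal case: no wraparound since the last poll.
                      cur.execute(
                          "SELECT * FROM public.orders "
                          "WHERE xmin::text::bigint > %s",
                          (last_xmin,),
                      )
                  else:
                      # Wraparound: the counter reset to 1, so fetch rows written
                      # after the last poll (high xmin) or after the reset (low xmin).
                      cur.execute(
                          "SELECT * FROM public.orders "
                          "WHERE xmin::text::bigint > %s OR xmin::text::bigint < %s",
                          (last_xmin, current_xmin),
                      )
                  return cur.fetchall()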

  • Improved Ingestion Mechanism in MySQL (Added in Release 2.19.3)

    • Enhanced the integration by improving the way Hevo ingests data from your MySQL database. Hevo now reads data from the start of a transaction in every run and skips any Events that were already ingested in the previous run. This helps Hevo handle data in large transactions (greater than 4 GB in size) and eliminates data duplication issues in the Destination.

      This change applies to Pipelines created after Release 2.19.3 with the MySQL Source types.

  • Optimizing Query Execution in Jira Cloud (Added in Release 2.19.1)

    • Fixed an issue whereby the query executed by Hevo for fetching data from the Issues object was unable to capture the changes in the Source data since the last ingestion. As a result, the offset maintained by Hevo to get the next batch of records became incorrect, leading to a data mismatch between the Source and the Destination data.

      After the fix, the query fetches data in batches grouped by the updated timestamp in milliseconds (based on Unix epoch time) instead of minutes. Further, the offset is now maintained as the ID of the last record fetched along with the updated timestamp in milliseconds, whereas earlier only the updated timestamp in minutes was used. A conceptual sketch of this offset scheme appears at the end of this entry.

      This fix applies only to new Pipelines created after Release 2.19.1. If you observe a mismatch between your Source and Destination data due to an incorrect offset, contact Hevo Support to enable the fix for your team.
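
      A conceptual sketch of the offset handling, with hypothetical helper names (the actual Jira Cloud query is a Hevo internal):

          def iterate_issues(fetch_batch, offset):
              """Yield issues in (updated_ms, id) order, advancing the offset."""
              updated_ms, last_id = offset
              while True:
                  # fetch_batch returns records updated at or after the offset,
                  # ordered by updated timestamp (ms) and then record ID.
                  batch = fetch_batch(updated_ms, last_id)
                  if not batch:
                      return
                  yield from batch
                  # Millisecond precision plus the last record ID prevents both
                  # re-reading and skipping records that share a timestamp.
                  updated_ms, last_id = batch[-1]["updated_ms"], batch[-1]["id"]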

User Experience

  • Enhanced Object Selection Flow for PostgreSQL and Oracle Sources (Added in Release 2.19.2)

    • For log-based Pipeline creation, the objects on the Select Objects page are now deselected by default to prevent the ingestion of unnecessary objects. If the Select Objects page does not load properly and users opt to skip object selection, all objects are skipped for ingestion, and the Pipeline is created in the Active state. Previously, all objects were selected by default, and if users skipped object selection, the Pipeline was created with all objects included for ingestion.

      This fix applies to Pipelines created with any variant of PostgreSQL and Oracle after Release 2.19.2. Contact Hevo Support to enable the fix for your Pipelines.


Documentation Updates

The following pages have been created, enhanced, or removed in Release 2.20:

Data Ingestion

Data Loading

Destinations

Getting Started

Introduction

Pipelines

Sources
