Loading Events in Batches

Last updated on May 30, 2023

Each write to a data warehouse involves scanning the target tables to deduplicate Events, and these scans incur costs for users. Major cloud-based data warehouses, such as Amazon Redshift and Google BigQuery, recommend loading Events through files in batches. Batches provide much better performance at a much lower cost than direct, individual writes to the tables.

Advantages of loading Events in batches

Batches allow Hevo to load millions of Events into the warehouse without consuming a lot of its resources.
Loading in batches is faster at scale than direct inserts.
Deduplication needs to be performed fewer times for batches than for individual records.
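
The deduplication point can be illustrated with a small, self-contained sketch. It is not Hevo's implementation: `ToyTable`, `chunked`, the batch size of 250, and the assumption that each write triggers exactly one deduplication scan are all hypothetical, chosen only to show why fewer, larger writes mean fewer scans.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, dict]  # (primary key, payload); a hypothetical shape


class ToyTable:
    """In-memory stand-in for a warehouse table that counts dedup scans."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}
        self.scans = 0

    def write(self, events: Iterable[Event]) -> None:
        # Assumption: one deduplication scan per write call,
        # no matter how many Events the call carries.
        self.scans += 1
        for key, payload in events:
            self.rows[key] = payload  # a later Event with the same key wins


def chunked(events: List[Event], size: int) -> Iterable[List[Event]]:
    """Yield consecutive slices of at most `size` Events."""
    for start in range(0, len(events), size):
        yield events[start:start + size]


# 1000 Events over 800 distinct keys, so 200 of them are duplicates.
events = [(f"id-{i % 800}", {"seq": i}) for i in range(1000)]

individual = ToyTable()
for event in events:
    individual.write([event])  # one write, and one scan, per Event

batched = ToyTable()
for batch in chunked(events, 250):
    batched.write(batch)  # one write, and one scan, per batch

print(individual.scans, batched.scans)
```

Both tables end up with the same deduplicated rows; only the number of scans differs, which is the cost that batching reduces in this toy model.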

Disadvantages of loading Events in batches

The batching process introduces some delay in loading the data, usually between 5 and 15 minutes. This means that once an Event is ingested by the Pipeline, it should be visible in the Destination within 5 to 15 minutes, provided it is mapped and does not encounter any failure.
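
As a rough aid for reasoning about this delay, the sketch below computes the window in which an Event would be expected to appear in the Destination. The function names and the idea of flagging an Event as overdue are assumptions made for illustration and are not part of Hevo; only the 5 and 15 minute bounds come from the text above.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Typical batching delay stated on this page.
MIN_DELAY = timedelta(minutes=5)
MAX_DELAY = timedelta(minutes=15)


def expected_visibility_window(ingested_at: datetime) -> Tuple[datetime, datetime]:
    """Return the (earliest, latest) time an Event is expected to be visible."""
    return ingested_at + MIN_DELAY, ingested_at + MAX_DELAY


def is_overdue(ingested_at: datetime, now: datetime) -> bool:
    """True if the Event is still missing after the usual maximum delay."""
    return now > ingested_at + MAX_DELAY


ingested = datetime(2023, 5, 30, 12, 0)
earliest, latest = expected_visibility_window(ingested)
print(earliest.time(), latest.time())
```

A check like `is_overdue` only signals that an Event is later than the usual window; it says nothing about the cause, such as an unmapped Event or a load failure.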

If you have stricter SLAs for data latency, contact Hevo Support.

Revision History

Refer to the following table for the list of key updates made to this page:

Date	Release	Description of Change
Mar-10-2023	NA	Added Azure Synapse Analytics to the list of Destinations in the page overview.
