Batching events while loading in Warehouse Destinations

Major cloud warehouses like Amazon Redshift, Google BigQuery, and Snowflake recommend loading data through files in batches rather than performing inserts directly into the tables. Batched file loading provides significantly better performance at a much lower cost than direct inserts.
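
To illustrate this approach, below is a minimal Python sketch of batched loading into Amazon Redshift: events are buffered in memory, staged to Amazon S3 as a newline-delimited JSON file, and loaded with a single COPY statement. The bucket, table, IAM role, and flush thresholds are hypothetical placeholders; this is a sketch of the general technique, not Hevo's actual implementation.

```python
import json
import time

import boto3
import psycopg2

# Hypothetical configuration -- replace with your own values.
S3_BUCKET = "my-staging-bucket"
IAM_ROLE = "arn:aws:iam::123456789012:role/RedshiftCopyRole"
BATCH_SIZE = 10_000          # flush after this many events...
FLUSH_INTERVAL_SECS = 300    # ...or after 5 minutes, whichever comes first


class BatchLoader:
    """Buffers events and loads them into Redshift in batches via S3 + COPY."""

    def __init__(self, conn):
        self.conn = conn
        self.s3 = boto3.client("s3")
        self.buffer = []
        self.last_flush = time.monotonic()

    def add(self, event: dict):
        self.buffer.append(event)
        if (len(self.buffer) >= BATCH_SIZE
                or time.monotonic() - self.last_flush >= FLUSH_INTERVAL_SECS):
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # Stage the batch in S3 as a newline-delimited JSON file.
        key = f"staging/events-{int(time.time())}.json"
        body = "\n".join(json.dumps(e) for e in self.buffer)
        self.s3.put_object(Bucket=S3_BUCKET, Key=key, Body=body.encode())

        # A single COPY statement loads the whole file -- far cheaper than
        # issuing one INSERT per event.
        with self.conn.cursor() as cur:
            cur.execute(
                f"COPY events FROM 's3://{S3_BUCKET}/{key}' "
                f"IAM_ROLE '{IAM_ROLE}' FORMAT AS JSON 'auto';"
            )
        self.conn.commit()
        self.buffer.clear()
        self.last_flush = time.monotonic()


# Usage (assumes valid Redshift credentials and an existing `events` table):
# conn = psycopg2.connect(host="...", dbname="...", user="...",
#                         password="...", port=5439)
# loader = BatchLoader(conn)
# loader.add({"id": 1, "name": "example"})
```

The trade-off this sketch makes explicit is the same one described below: an event added to the buffer is not visible in the warehouse until the next flush.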

Thus, Hevo loads data in batches into all warehouse-type Destinations:

  • Amazon Redshift
  • Google BigQuery
  • Snowflake
  • S3
  • Hevo Data Lake

Pros:

  • Hevo can load millions of events into the warehouse without consuming excessive resource bandwidth
  • At scale, batched loading is faster than performing direct inserts

Cons:

The batching process understandably introduces some delay in loading the data. The delay typically ranges between 5 and 15 minutes.

This means that once an event is ingested by a Hevo Pipeline, and provided it is mapped and does not encounter any other failure, it should be visible in the Destination within this time range.

If you have stricter SLAs for data latency, reach out to Hevo Support over chat, and we can guide you on whether and how your Pipeline can be configured for lower latency.