Resolving Event Failures

How a failed Event is resolved in Hevo depends on the type of failure, the configuration of the Pipeline, and the underlying failure reason. Based on these factors, failures can be classified as:

  • Failures that you must resolve and replay manually. For example, if the Transformations code is faulty, you must fix the code, click DONE next to the failure message, and then click the vertical ellipsis icon and select Replay.

  • Failures that you must resolve, but which are auto-replayed by Hevo. For example, as soon as you correct the schema mapping error for an Event, Hevo immediately queues it up for auto-replay.
    For some of these errors, either you or Hevo may replay the Event. For example, if an Event fails due to insufficient disk space, Hevo automatically replays the Event every three hours, on the assumption that you have fixed the issue. Alternatively, you can increase the disk space allocation yourself and replay the Event immediately, without waiting for Hevo.

  • Transient failures that are automatically resolved by Hevo. For example, Hevo may park some Events as Failed Events to reduce the data transfer load, or if some internal thresholds are reached. Similarly, an Event may fail and an error code may be generated for it, but if Hevo finds Auto Mapping enabled, it automatically fixes the issue, and the error is dismissed. These failures are not displayed in the Pipeline Activity page.

  • Unanticipated failures, which are individually investigated and resolved by Hevo.

Note: For a failed Event Type, you may choose to resolve and replay only some of the failed Events, such as the critical ones. The resolved Events are processed, and the remaining Events are listed again as failed Events. You can also permanently discard failed Events.
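
The classification above can be summarized in a few lines of code. This is only an illustrative sketch: the enum members and helper functions are hypothetical names chosen for this example and are not part of Hevo. It encodes only what the list states, namely which categories need a fix from you and which also need a manual replay.

```python
from enum import Enum


class FailureCategory(Enum):
    """Hypothetical labels for the four failure categories listed above."""
    RESOLVE_AND_REPLAY_MANUALLY = 1    # e.g. faulty Transformations code
    RESOLVE_MANUALLY_AUTO_REPLAY = 2   # e.g. schema mapping error
    TRANSIENT = 3                      # resolved automatically by Hevo
    UNANTICIPATED = 4                  # investigated and resolved by Hevo


def needs_user_fix(category: FailureCategory) -> bool:
    """True when you must correct the underlying issue yourself."""
    return category in {
        FailureCategory.RESOLVE_AND_REPLAY_MANUALLY,
        FailureCategory.RESOLVE_MANUALLY_AUTO_REPLAY,
    }


def needs_manual_replay(category: FailureCategory) -> bool:
    """True when you must also trigger the replay after fixing the issue."""
    return category is FailureCategory.RESOLVE_AND_REPLAY_MANUALLY
```

For instance, a schema mapping error falls under `RESOLVE_MANUALLY_AUTO_REPLAY`: `needs_user_fix` is true, while `needs_manual_replay` is false because Hevo queues the Event for replay once the mapping is corrected.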

Manually Resolving Event Failures

You can view the list of failed Event Types, the failure reason, and the number of Events failing for that reason in the Pipeline Activity page. To resolve an Event failure, you must rectify the issue, and confirm that you have taken the suggested action. Hevo automatically queues the corrected Event to be replayed.

To resolve an Event failure:

  • Click on the summary count of failed Events for a job to view the failure reason. For example, in the image below, clicking on 2.05k Events Failed shows that 93 Events have failed due to missing schema mapping.

    Break-up of total failed events count under different failure reasons

    • To resolve the failure:

      1. Click on the action button and make the required changes. For example, in the image above, click MAP SCHEMA to update the schema mapping. When finished, click DONE.

      2. Click the vertical ellipsis icon, and then click Replay.

         Choose Replay or Skip option

      Note: When replayed, Events are fed back into the Transformations stage of the Pipeline. Events that were created through the Transformations code are instead fed back into the Schema Mapper stage.

    • To discard failed Events for a specific Event Type and failure reason:

      1. Click the vertical ellipsis icon in the Event Type row, and then click Skip. A warning message is displayed.

         Warning on Skipping Failed Event

      2. Click YES, GO AHEAD to purge the current set of failed Events.

      Note: If the data is ingested via log-based jobs, through a webhook, or from a DynamoDB Source, it may not be possible to re-ingest the data after you skip the failed Events, as the Source system itself may have already purged it.
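
The replay note in the steps above describes a simple routing rule: a replayed Event re-enters the Pipeline at the Transformations stage, unless the Event was itself created by Transformations code, in which case it re-enters at the Schema Mapper stage. A minimal sketch of that rule follows. The `Stage` enum and the `replay_entry_stage` function are hypothetical names used only for illustration and do not correspond to any Hevo API.

```python
from enum import Enum


class Stage(Enum):
    """Pipeline stages named in the replay note (hypothetical enum)."""
    TRANSFORMATIONS = "Transformations"
    SCHEMA_MAPPER = "Schema Mapper"


def replay_entry_stage(created_by_transformation: bool) -> Stage:
    """Return the stage at which a replayed Event re-enters the Pipeline.

    Events created through Transformations code skip the Transformations
    stage on replay and go to the Schema Mapper stage instead.
    """
    if created_by_transformation:
        return Stage.SCHEMA_MAPPER
    return Stage.TRANSFORMATIONS
```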

Resolving Event Failures through Transformations

You can apply transformations on failed Events to enable them to be loaded successfully to the Destination. Read Transformations for steps to do this. The Transformations page also allows you to specifically test these transformations on sample failed Events.
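
As an illustration of the kind of transformation that can rescue failed Events, the sketch below assumes a hypothetical failure in which a field named `amount` arrives as a string such as "1,200.50" while the Destination expects a number. The `Event` class here is a minimal stand-in written so that the example runs on its own, and the `transform(event)` entry point, the accessor names, and the field name are all assumptions made for this example. Refer to the Transformations documentation for the actual interface.

```python
class Event:
    """Hypothetical stand-in for the Event object handed to a transformation."""

    def __init__(self, event_name, properties):
        self._event_name = event_name
        self._properties = properties

    def getEventName(self):
        return self._event_name

    def getProperties(self):
        return self._properties


def transform(event):
    """Coerce the assumed 'amount' field to a number so the Event can load."""
    properties = event.getProperties()
    raw = properties.get("amount")
    if isinstance(raw, str):
        try:
            # Strip thousands separators before converting to a float.
            properties["amount"] = float(raw.replace(",", ""))
        except ValueError:
            # The value cannot be parsed as a number; store a null instead.
            properties["amount"] = None
    return event
```

Testing a transformation like this against a few sample failed Events before replaying them helps confirm that it handles the values that caused the failure.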

Auto-replaying Failed Events

Hevo auto-replays failed Events as frequently as every five minutes. However, the total time taken for an Event to be processed may also depend on other factors, such as the time taken by the job to complete.
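
To make the timing concrete, the sketch below computes the earliest moment at which an auto-replay could occur after a given attempt. The two intervals are the ones mentioned on this page (five minutes as the shortest auto-replay interval, and three hours for failures caused by insufficient disk space). The constant and function names are hypothetical, and the sketch does not attempt to model how Hevo actually schedules replays.

```python
from datetime import datetime, timedelta

# Intervals taken from this page; the names are illustrative only.
MIN_AUTO_REPLAY_INTERVAL = timedelta(minutes=5)
DISK_SPACE_REPLAY_INTERVAL = timedelta(hours=3)


def earliest_next_replay(
    last_attempt: datetime,
    interval: timedelta = MIN_AUTO_REPLAY_INTERVAL,
) -> datetime:
    """Return the earliest time an auto-replay could follow last_attempt."""
    return last_attempt + interval
```

The actual processing time can be longer than this lower bound, since it also depends on factors such as how long the job takes to complete.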

Revision History

Refer to the following table for the list of key updates made to this page:

Date          Release   Description of Change
Mar-23-2021   1.59      Added the section, Resolving Event Failures through Transformations.
Last updated on 19 Nov 2021